James Ketr 2025-04-27 00:49:33 -07:00
parent f812eca165
commit 33d9c1d28a
3 changed files with 15 additions and 2 deletions


@ -18,9 +18,22 @@ This system was built to run on commodity hardware, for example the Intel Arc B5
Before you spend too much time learning how to customize Backstory, you may want to see it in action with your own information. Fine-tuning the LLM with your data can take a while, so you might want to see what the system can do using only retrieval-augmented generation (RAG).
Backstory works by generating a set of facts about you. Those facts can be exposed to the LLM via RAG, or baked into the LLM by fine-tuning. In either scenario, Backstory needs to know your relationship with a given fact.
The `./docs` directory has been seeded with an AI-generated persona. That directory is only used during development; actual content should be put into the `./docs-prod` directory.
Launching with the defaults, you can ask things like `Who is Eliza Morgan?`
If you want to seed your own data:
1. `docker compose down backstory`
2. Remove everything from docs/: `rm -rf docs/*`
3. Put your generic resume in `docs/resume/`, named `generic` with one of the extensions `.pdf`, `.md`, `.txt`, or `.docx`
4. Remove everything from chromadb/: `rm -rf chromadb/*`
5. `docker compose up backstory -d`
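The steps above could also be scripted. The following is a minimal sketch, not part of Backstory: the function names `clear_dir` and `seed` are hypothetical, and it assumes the script runs from the repository root with `docker compose` available on the path.

```python
import shutil
import subprocess
from pathlib import Path


def clear_dir(directory: Path) -> None:
    """Delete the visible contents of `directory`, keeping the directory itself.

    Mirrors `rm -rf directory/*`: entries starting with "." are left alone,
    just as the shell glob would skip them.
    """
    if not directory.is_dir():
        return
    for entry in directory.iterdir():
        if entry.name.startswith("."):
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()


def seed(resume: Path, root: Path = Path(".")) -> Path:
    """Replace the seeded persona with `resume` and reset the vector store."""
    # Step 1: stop the service before touching its data.
    subprocess.run(["docker", "compose", "down", "backstory"], check=True)
    # Step 2: remove the seeded persona.
    clear_dir(root / "docs")
    # Step 3: copy the resume to docs/resume/generic.<ext>.
    target = root / "docs" / "resume" / f"generic{resume.suffix}"
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy(resume, target)
    # Step 4: drop the existing embeddings so they are rebuilt.
    clear_dir(root / "chromadb")
    # Step 5: start the service again in the background.
    subprocess.run(["docker", "compose", "up", "backstory", "-d"], check=True)
    return target
```

Calling `seed(Path("my-resume.md"))` would run all five steps in order; `check=True` stops the script if either `docker compose` command fails.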
**WIP**
Backstory works by generating a set of facts about you. Those facts can be exposed to the LLM via RAG, or baked into the LLM by fine-tuning. In either scenario, Backstory needs to know your relationship with a given fact.
WIP notes: Right now, Backstory uses only RAG. I'm working on the PEFT+QLoRA code, so take this section as aspirational (patches welcome).
To facilitate this, Backstory expects the documents it reads to be marked with information that highlights your role in relation to the document. That information is stored either within each document as [Front Matter (YAML)](https://jekyllrb.com/docs/front-matter/) or as a YAML sidecar file (a file with the same name as the content, plus the extension `.yml`).
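As an illustration of the two metadata locations, here is a minimal sketch using only the Python standard library. It is not Backstory's actual loader. The function names are hypothetical, the sidecar naming (`generic.md` paired with `generic.md.yml`) is one reading of "same name plus `.yml`", and preferring front matter over the sidecar is an assumption. The YAML is returned as raw text and would still need a YAML parser.

```python
from pathlib import Path
from typing import Optional, Tuple


def sidecar_path(doc: Path) -> Path:
    """Return the assumed sidecar path: the content file name plus `.yml`."""
    return doc.with_name(doc.name + ".yml")


def split_front_matter(text: str) -> Tuple[Optional[str], str]:
    """Split `text` into (front matter YAML, body); YAML is None if absent."""
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return None, text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "".join(lines[1:i]), "".join(lines[i + 1:])
    return None, text  # no closing delimiter: treat it all as content


def load_metadata(doc: Path) -> Optional[str]:
    """Return raw YAML from front matter, else from the sidecar, else None."""
    try:
        meta, _ = split_front_matter(doc.read_text(encoding="utf-8"))
    except UnicodeDecodeError:
        meta = None  # binary formats such as PDF cannot carry front matter
    if meta is not None:
        return meta
    sidecar = sidecar_path(doc)
    return sidecar.read_text(encoding="utf-8") if sidecar.is_file() else None
```

Binary formats such as `.pdf` or `.docx` cannot carry front matter, so in this sketch they can only be annotated through the sidecar file.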


@ -11,7 +11,7 @@ max_context = 2048*8*2
doc_dir = "/opt/backstory/docs/"
session_dir = "/opt/backstory/sessions"
static_content = "/opt/backstory/frontend/deployed"
resume_doc = "/opt/backstory/docs/resume/generic.txt"
resume_doc = "/opt/backstory/docs/resume/generic.md"
# Only used for testing; backstory-prod will not use this
key_path = "/opt/backstory/keys/key.pem"
cert_path = "/opt/backstory/keys/cert.pem"