Updated docs

2025-04-27 00:11:09 -07:00
commit 948da26444
parent b8067d05a9
1 changed file with 68 additions and 8 deletions
--- a/README.md
+++ b/README.md
@@ -2,8 +2,8 @@

 Backstory is an AI Resume agent that provides context into a diverse career narrative. Backstory will take a collection of documents about a person and provide:

-* Through the use of several custom Language Processing Modules (LPM), develop a comprehensive set of test and validation data based on the input documents. While manual review of content should be performed to ensure accuracy, several LLM techniques are employed in the LPM in order to isolate and remove hallucinations and inaccuracies in the test and validation data.
-* Utilizing quantized low-rank adaption (QLoRA) and parameter effecient tine tuning (PEFT,) provide a hyper parameter tuned and customized LLM for use in chat and content creation scenarios with expert knowledge about the individual. 
+* WIP: Through the use of several custom Language Processing Modules (LPM), develop a comprehensive set of test and validation data based on the input documents. While manual review of content should be performed to ensure accuracy, several LLM techniques are employed in the LPM in order to isolate and remove hallucinations and inaccuracies in the test and validation data.
+* WIP: Utilizing quantized low-rank adaptation (QLoRA) and parameter-efficient fine-tuning (PEFT), provide a hyperparameter-tuned and customized LLM for use in chat and content creation scenarios with expert knowledge about the individual.
 * Post-training, utilize additional RAG content to further enhance the information domain used in conversations and content generation.
 * An integrated document publishing workflow that will transform a "Job Description" into a customized "Resume" for the person the LLM has been trained on.
 * "Fact Check" the resulting resume against the RAG content directly provided by the user in order to remove hallucinations.
@@ -111,21 +111,81 @@ docker compose exec -it ollama /fetch-models.sh

 The persisted volume mounts (`./cache` and `./ollama`) can grow quite large with models, GPU kernel caching, etc. During the development of this project, the cache directory has grown to consume ~250G of disk space.

-Inside the cache you will see directories like:
+Inside the `cache` you will see directories like:

 | Directory | Size | What's in it? |
-|:----------|:-----:--------------|
+|:----------|:-----|:--------------|
 | datasets  | 23G  | If you download any HF datasets, they will be here |
-| hub       | 310G | All of the HF models will show up here. |
+| hub       | 310G | All of the HF models will show up here. To list them, run `docker exec backstory shell "huggingface-cli scan-cache"`. |
 | libsycl_cache | 2.9G | Used by... libsycl. It caches pre-compiled things here. |
 | modules   | ~1M | Not sure what created this. It has some Microsoft code, so maybe from markitdown? |
 | neo_compiler_cache | 1.1G | If you are on an Intel GPU, this is where JIT compiled GPU kernels go. If you launch a model and it seems to stall out, `watch ls -alt cache/neo_compiler_cache` to see if Intel's compute runtime (NEO) is writing here. |

-And in ollama:
+I haven't kept up with pruning old models I'm not using. Sample output of running the `huggingface-cli` command:
+
+```
+$ docker exec backstory shell "huggingface-cli scan-cache -vvv"
+REPO ID                                              REPO TYPE REVISION                                 SIZE ON DISK NB FILES LAST_MODIFIED REFS       LOCAL PATH                                                                                                                        
+---------------------------------------------------- --------- ---------------------------------------- ------------ -------- ------------- ---------- --------------------------------------------------------------------------------------------------------------------------------- 
+Matthijs/cmu-arctic-xvectors                         dataset   36e87b347a6a70f0420445b02ec40c55556f9ed7        21.3M        1 5 weeks ago              /root/.cache/hub/datasets--Matthijs--cmu-arctic-xvectors/snapshots/36e87b347a6a70f0420445b02ec40c55556f9ed7                       
+Matthijs/cmu-arctic-xvectors                         dataset   5c1297a9eb6c91714ea77c0d4ac5aca9b6a952e5         2.4K        2 5 weeks ago   main       /root/.cache/hub/datasets--Matthijs--cmu-arctic-xvectors/snapshots/5c1297a9eb6c91714ea77c0d4ac5aca9b6a952e5                       
+McAuley-Lab/Amazon-Reviews-2023                      dataset   2b6d039ed471f2ba5fd2acb718bf33b0a7e5598e        25.2G       10 3 weeks ago   main       /root/.cache/hub/datasets--McAuley-Lab--Amazon-Reviews-2023/snapshots/2b6d039ed471f2ba5fd2acb718bf33b0a7e5598e                    
+yahma/alpaca-cleaned                                 dataset   12567cabf869d7c92e573c7c783905fc160e9639        44.3M        2 2 months ago  main       /root/.cache/hub/datasets--yahma--alpaca-cleaned/snapshots/12567cabf869d7c92e573c7c783905fc160e9639                               
+IDEA-Research/grounding-dino-tiny                    model     a2bb814dd30d776dcf7e30523b00659f4f141c71       690.3M        8 2 days ago    main       /root/.cache/hub/models--IDEA-Research--grounding-dino-tiny/snapshots/a2bb814dd30d776dcf7e30523b00659f4f141c71                    
+Intel/neural-chat-7b-v3-3                            model     7506dfc5fb325a8a8e0c4f9a6a001671833e5b8e        14.5G       10 3 months ago  main       /root/.cache/hub/models--Intel--neural-chat-7b-v3-3/snapshots/7506dfc5fb325a8a8e0c4f9a6a001671833e5b8e                            
+Qwen/CodeQwen1.5-7B-Chat                             model     7b0cc3380fe815e6f08fe2f80c03e05a8b1883d8        14.5G       10 4 weeks ago   main       /root/.cache/hub/models--Qwen--CodeQwen1.5-7B-Chat/snapshots/7b0cc3380fe815e6f08fe2f80c03e05a8b1883d8                             
+TheBloke/neural-chat-7B-v3-2-AWQ                     model     f3c5e4160e0faecf91ca396558527ba13f1efb72         2.3M        6 2 months ago  main       /root/.cache/hub/models--TheBloke--neural-chat-7B-v3-2-AWQ/snapshots/f3c5e4160e0faecf91ca396558527ba13f1efb72                     
+TheBloke/neural-chat-7B-v3-2-GGUF                    model     97de3dbd877a4b022eda57b292d0efba0187ed79         7.5G        3 2 months ago  main       /root/.cache/hub/models--TheBloke--neural-chat-7B-v3-2-GGUF/snapshots/97de3dbd877a4b022eda57b292d0efba0187ed79                    
+black-forest-labs/FLUX.1-dev                         model     0ef5fff789c832c5c7f4e127f94c8b54bbcced44        57.9G       29 6 weeks ago   main       /root/.cache/hub/models--black-forest-labs--FLUX.1-dev/snapshots/0ef5fff789c832c5c7f4e127f94c8b54bbcced44                         
+black-forest-labs/FLUX.1-schnell                     model     741f7c3ce8b383c54771c7003378a50191e9efe9        33.7G       23 6 weeks ago   main       /root/.cache/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9                     
+deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B            model     ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562         3.6G        5 2 months ago  main       /root/.cache/hub/models--deepseek-ai--DeepSeek-R1-Distill-Qwen-1.5B/snapshots/ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562            
+deepseek-ai/DeepSeek-R1-Distill-Qwen-7B              model     916b56a44061fd5cd7d6a8fb632557ed4f724f60        15.2G        7 2 months ago  main       /root/.cache/hub/models--deepseek-ai--DeepSeek-R1-Distill-Qwen-7B/snapshots/916b56a44061fd5cd7d6a8fb632557ed4f724f60              
+intel/neural-chat-7b-v3                              model     7f6ebc113310e0d2ecc92ae94daeddba5493704d         2.3M        7 2 months ago  main       /root/.cache/hub/models--intel--neural-chat-7b-v3/snapshots/7f6ebc113310e0d2ecc92ae94daeddba5493704d                              
+intel/neural-chat-7b-v3-3                            model     7506dfc5fb325a8a8e0c4f9a6a001671833e5b8e         2.3M        7 2 months ago  main       /root/.cache/hub/models--intel--neural-chat-7b-v3-3/snapshots/7506dfc5fb325a8a8e0c4f9a6a001671833e5b8e                            
+llmware/intel-neural-chat-7b-v3-2-ov                 model     7a0a312108b4b9c37c739eb83b592c30c9965eb0         2.3M        5 2 months ago  main       /root/.cache/hub/models--llmware--intel-neural-chat-7b-v3-2-ov/snapshots/7a0a312108b4b9c37c739eb83b592c30c9965eb0                 
+meta-llama/Llama-3.2-3B                              model     13afe5124825b4f3751f836b40dafda64c1ed062         9.1M        3 3 weeks ago   main       /root/.cache/hub/models--meta-llama--Llama-3.2-3B/snapshots/13afe5124825b4f3751f836b40dafda64c1ed062                              
+meta-llama/Llama-3.2-3B-Instruct                     model     0cb88a4f764b7a12671c53f0838cd831a0843b95         9.1M        3 5 weeks ago   main       /root/.cache/hub/models--meta-llama--Llama-3.2-3B-Instruct/snapshots/0cb88a4f764b7a12671c53f0838cd831a0843b95                     
+microsoft/Florence-2-base                            model     ceaf371f01ef66192264811b390bccad475a4f02       467.1M        9 2 days ago    main       /root/.cache/hub/models--microsoft--Florence-2-base/snapshots/ceaf371f01ef66192264811b390bccad475a4f02                            
+microsoft/florence-2-base                            model     ceaf371f01ef66192264811b390bccad475a4f02         2.5M        7 2 days ago    main       /root/.cache/hub/models--microsoft--florence-2-base/snapshots/ceaf371f01ef66192264811b390bccad475a4f02                            
+microsoft/speecht5_hifigan                           model     6f01b211b404df2e0a0a20ca79628a757bb35854        50.6M        1 5 weeks ago   refs/pr/1  /root/.cache/hub/models--microsoft--speecht5_hifigan/snapshots/6f01b211b404df2e0a0a20ca79628a757bb35854                           
+microsoft/speecht5_hifigan                           model     bb6f429406e86a9992357a972c0698b22043307d        50.7M        2 5 weeks ago   main       /root/.cache/hub/models--microsoft--speecht5_hifigan/snapshots/bb6f429406e86a9992357a972c0698b22043307d                           
+microsoft/speecht5_tts                               model     30fcde30f19b87502b8435427b5f5068e401d5f6       585.7M        7 5 weeks ago   main       /root/.cache/hub/models--microsoft--speecht5_tts/snapshots/30fcde30f19b87502b8435427b5f5068e401d5f6                               
+microsoft/speecht5_tts                               model     a01d4f293234515125d07f68be3c36d739ccac93       585.4M        1 5 weeks ago   refs/pr/28 /root/.cache/hub/models--microsoft--speecht5_tts/snapshots/a01d4f293234515125d07f68be3c36d739ccac93                               
+mistralai/Mistral-Small-3.1-24B-Instruct-2503        model     247c7a102f360e2ab181caf6aa7e8144316fd488        96.1G       25 5 weeks ago   main       /root/.cache/hub/models--mistralai--Mistral-Small-3.1-24B-Instruct-2503/snapshots/247c7a102f360e2ab181caf6aa7e8144316fd488        
+openlm-research/open_llama_3b_v2                     model     4293833c8795656cdacfae811f713ada0e7a2726         6.9G        1 2 months ago  refs/pr/16 /root/.cache/hub/models--openlm-research--open_llama_3b_v2/snapshots/4293833c8795656cdacfae811f713ada0e7a2726                     
+openlm-research/open_llama_3b_v2                     model     bce5d60d3b0c68318862270ec4e794d83308d80a         6.9G        6 2 months ago  main       /root/.cache/hub/models--openlm-research--open_llama_3b_v2/snapshots/bce5d60d3b0c68318862270ec4e794d83308d80a                     
+openlm-research/open_llama_7b_v2                     model     e5961def23172a2384543940e773ab676033c963        13.5G       10 3 months ago  main       /root/.cache/hub/models--openlm-research--open_llama_7b_v2/snapshots/e5961def23172a2384543940e773ab676033c963                     
+runwayml/stable-diffusion-v1-5                       model     451f4fe16113bff5a5d2269ed5ad43b0592e9a14         5.5G       15 6 weeks ago   main       /root/.cache/hub/models--runwayml--stable-diffusion-v1-5/snapshots/451f4fe16113bff5a5d2269ed5ad43b0592e9a14                       
+sentence-transformers/all-MiniLM-L6-v2               model     c9745ed1d9f207416be6d2e6f8de32d1f16199bf        91.6M       11 2 months ago  main       /root/.cache/hub/models--sentence-transformers--all-MiniLM-L6-v2/snapshots/c9745ed1d9f207416be6d2e6f8de32d1f16199bf               
+stabilityai/stable-diffusion-xl-base-1.0             model     462165984030d82259a11f4367a4eed129e94a7b         7.1G       19 1 day ago     main       /root/.cache/hub/models--stabilityai--stable-diffusion-xl-base-1.0/snapshots/462165984030d82259a11f4367a4eed129e94a7b             
+unsloth/Mistral-Small-24B-Base-2501-unsloth-bnb-4bit model     4e277e563e75dc642a9947b0a5e42b16440c9546        15.7G       12 5 weeks ago   main       /root/.cache/hub/models--unsloth--Mistral-Small-24B-Base-2501-unsloth-bnb-4bit/snapshots/4e277e563e75dc642a9947b0a5e42b16440c9546 
+
+Done in 0.0s. Scanned 29 repo(s) for a total of 326.8G.
+```
+
+And inside `ollama`:

 | Directory | Size | What's in it? |
-|:----------|:-----:--------------|
-| models    | 32G  | All models downloaded via `ollama pull ...`. |
+|:----------|:-----|:--------------|
+| models    | 32G  | All models downloaded via `ollama pull ...`. To list them, run `docker exec ollama ollama list`. |
+
+Sample output of running `ollama list`:
+
+```
+$ docker exec ollama ollama list
+ggml_sycl_init: found 1 SYCL devices:
+NAME                        ID              SIZE      MODIFIED     
+mxbai-embed-large:latest    468836162de7    669 MB    15 hours ago    
+qwen2.5:3b                  357c53fb659c    1.9 GB    10 days ago     
+mistral:7b                  f974a74358d6    4.1 GB    2 weeks ago     
+qwen2.5:7b                  845dbda0ea48    4.7 GB    3 weeks ago     
+llama3.2:latest             a80c4f17acd5    2.0 GB    3 weeks ago     
+dolphin-phi:latest          c5761fc77240    1.6 GB    6 weeks ago     
+llama3.2-vision:latest      085a1fdae525    7.9 GB    6 weeks ago     
+llava:latest                8dd30f6b0cb1    4.7 GB    6 weeks ago     
+deepseek-r1:1.5b            a42b25d8c10a    1.1 GB    7 weeks ago     
+deepseek-r1:7b              0a8c26691023    4.7 GB    7 weeks ago  
+```

 ## Running