FASOLA: Free, Autonomous, Self-Orchestrating Local Agent.
- If you would like to play around with autonomous GPT agents
- If you don't like the idea that an agent can empty your API balance
- If you have a laptop with 16 GB of RAM
- If you don't mind things running slowly, as long as they're free and have no budget limits
- If you're not a fan of fancy Python, but more of a JS guy.
Quantized local GPT models are dumb as sh@t. When I first ran GPT4All on my computer, after all those YouTubers screaming in excitement, I was very disappointed. There is no comparison to OpenAI's GPT-3.5 - it is not even a competition. Writing poems? Really? We all need AGI! And the closest thing we have is autonomous agents like AutoGPT and BabyAGI, based on the work of the LangChain project.
I want me some of that, but I don't want to pay for an API. I'm just an average Joe with a laptop, but goddamn, I want my computer to have the potential to rule the Earth!
So I thought: can I use elaborate prompts and some algorithmic control to make dumb quantized local models less dumb? That's what FASOLA is all about.
Right now it can use a search tool to search the internet and give answers. Even at this first step the war is not over: we are still finding tricks to keep the model on track.
A framework with a clear tool structure, human-readable templates, and template variants for different levels of model stupidity. Automatic model fetching from Hugging Face. The human community will optimize each tool's best model, prompt, and runtime (time till end result), making a collection of relatively stupid and slow models perform somewhat usefully. So each tool may run its own model and prompt logic to perform only a single task.
A set of tools will be dynamically linked by an orchestrator. A special core tool - a tool selector using semantic similarity - will help the orchestrator select the best tool for each subtask of the main goal.
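Since the tool selector works on semantic similarity, its core can be sketched in a few lines. This is an illustrative sketch, not FASOLA code: it assumes tool descriptions have already been embedded into plain number arrays (e.g. by the all-MiniLM-L6-v2 subtool), and it picks the tool whose embedding has the highest cosine similarity to the subtask embedding.

```javascript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Pick the tool whose description embedding is closest to the subtask.
function selectTool(taskEmbedding, tools) {
  let best = null, bestScore = -Infinity;
  for (const tool of tools) {
    const score = cosineSimilarity(taskEmbedding, tool.embedding);
    if (score > bestScore) { bestScore = score; best = tool; }
  }
  return best;
}

// Example with toy 3-dimensional embeddings (real ones are 384-dim):
const toyTools = [
  { name: 'search', embedding: [1, 0, 0] },
  { name: 'variables', embedding: [0, 1, 0] },
];
console.log(selectTool([0.9, 0.1, 0], toyTools).name); // 'search'
```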
A Variables tool will help the orchestrator store results in memory, so data doesn't have to be run through the prompt each time.
In the end I want FASOLA not only to work at the level of BabyAGI, but also to focus on creating the simplest tools in JavaScript to solve tasks and populate the tool database.
P.S. We deliberately do not use Pinecone or store data in tokenized form or in an embedding-based database. Our main goal is not 'fast' or 'optimal', but rather human-readable and understandable.
- Download the all-MiniLM-L6-v2 model in ONNX format: take the file model.onnx from https://huggingface.co/philschmid/all-MiniLM-L6-v2-optimum-embeddings and place it into ./models/sentence-transformers/all-MiniLM-L6-v2/default/ along with the *.json files (the repo can be cloned via 'git lfs'; its size is OK). This model is used as a core subtool to create embeddings and search.
- Download the file ggml-vic13b-q5_1.bin from https://huggingface.co/eachadea/ggml-vicuna-13b-1.1 (this can be done using the download-model.py or download-model.js script, because the whole repository is huge!). This is the orchestrator model - the one responsible for the main logic. You can use any model in the GGML format used by llama.cpp. Note that quantized models are not good, so the initial tool prompts will probably need to be adjusted per model.
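The key trick a per-file downloader like download-model.js can rely on: Hugging Face serves individual repository files at a stable `resolve/<revision>/<filename>` URL, so one model file can be fetched without cloning the huge repo. The helper below is a hypothetical illustration of that URL pattern, not the actual script.

```javascript
// Hypothetical helper: build the direct download URL for a single file
// in a Hugging Face repository, avoiding a full 'git lfs' clone.
function hfFileUrl(repo, filename, revision = 'main') {
  return `https://huggingface.co/${repo}/resolve/${revision}/${filename}`;
}

console.log(hfFileUrl('eachadea/ggml-vicuna-13b-1.1', 'ggml-vic13b-q5_1.bin'));
// https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/resolve/main/ggml-vic13b-q5_1.bin
```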
- Clone the llama.cpp repo into the ./llama.cpp folder and build it: https://github.com/ggerganov/llama.cpp. This is the GGML model executor.
- Get a Google Custom Search API key, create a .env file, and fill in the variables:
GOOGLE_SEARCH_API_KEY=
GOOGLE_SEARCH_CX=
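For illustration, here is a hedged sketch of how a search tool might call the Google Custom Search JSON API using those two variables. `buildSearchUrl` is a hypothetical helper, not existing FASOLA code.

```javascript
// Hypothetical helper: assemble a Google Custom Search JSON API request
// URL from the API key, the search engine id (cx), and the query.
function buildSearchUrl(apiKey, cx, query) {
  const params = new URLSearchParams({ key: apiKey, cx, q: query });
  return `https://www.googleapis.com/customsearch/v1?${params}`;
}

// Usage sketch (Node 18+ has a global fetch):
// const url = buildSearchUrl(process.env.GOOGLE_SEARCH_API_KEY,
//                            process.env.GOOGLE_SEARCH_CX, 'llama.cpp');
// const { items } = await (await fetch(url)).json();
```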
A Search tool factory which keeps dependencies between queries -> text-query iterator (possibly with caching).
Anti-fading: after two observations, repeat the prompt together with those two observations.
Human-impersonation breaker: break and repeat the init prompt with another example, if the tool offers one.
Too-long-without-a-formatted-response breaker: a word-count threshold to restart with another example.
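The word-count breaker can be sketched as a tiny predicate. The function name and the default threshold of 120 words are assumptions for illustration, not tuned values.

```javascript
// Hypothetical sketch: once the model has produced more than a
// threshold of words without emitting a parseable, formatted response,
// the runtime should restart the tool with another example.
function shouldRestart(textSinceLastParse, wordThreshold = 120) {
  const words = textSinceLastParse.trim().split(/\s+/).filter(Boolean);
  return words.length > wordThreshold;
}
```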
Interface for tools with a description and usage examples (tags, autoreload, semantic search for tools).
A tool factory that returns the most appropriate tool for a query.
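To make the interface idea concrete, here is one possible shape of a tool entry. The field names are illustrative assumptions, not a fixed FASOLA API.

```javascript
// Hypothetical tool entry: a description for semantic search, tags,
// usage examples the prompts can quote, and a run() function.
const searchTool = {
  name: 'search',
  description: 'Search the internet and return short text snippets.',
  tags: ['web', 'lookup'],
  examples: [
    { input: 'current weather in Paris', output: '...' },
  ],
  async run(query) {
    // A real implementation would call the search API; stubbed here.
    return `results for: ${query}`;
  },
};
```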
Template standard.
- An array of several examples
- Short tool reminder
- Tool description
- Task splitter
- Tool selector
- Variables tool: memory storage
- Variables listing, prioritizing, garbage collection
- Variables manipulation: concat, filter
"I need to use tool X, give result to tool Y, store result as Z"
for fetching table data without passing it through the prompt
export to CSV
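A minimal sketch of the Variables tool described above, assuming a plain string-keyed in-memory store; all function names are illustrative. The point is that "store result as Z" plus concat/filter/CSV export keeps large data out of the prompt entirely.

```javascript
// Hypothetical Variables tool: a string-keyed in-memory store so the
// model can refer to data by name instead of re-reading it in a prompt.
const variables = new Map();

function setVar(name, value) { variables.set(name, value); }
function getVar(name) { return variables.get(name); }

// Simple manipulation commands the orchestrator could invoke by name:
function concatVars(target, a, b) {
  setVar(target, [].concat(getVar(a), getVar(b)));
}
function filterVar(target, source, predicate) {
  setVar(target, getVar(source).filter(predicate));
}

// CSV export of tabular data (array of rows) without ever passing
// the rows through the prompt:
function toCsv(name) {
  return getVar(name).map(row => row.join(',')).join('\n');
}
```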
Creation, ESLint, execution, testing, adding to the toolbox.
Tools pipe executor.
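The tools pipe executor can be sketched as a simple async fold: run each tool in order and feed its result into the next ("use tool X, give result to tool Y"). This is an illustrative sketch, not FASOLA code.

```javascript
// Hypothetical pipe executor: each tool exposes run(input) and the
// output of one tool becomes the input of the next.
async function runPipe(tools, input) {
  let value = input;
  for (const tool of tools) {
    value = await tool.run(value);
  }
  return value;
}

// Example with two toy tools:
const upper = { run: async s => s.toUpperCase() };
const exclaim = { run: async s => s + '!' };
// runPipe([upper, exclaim], 'hello') resolves to 'HELLO!'
```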
You/me/apples validation test, JavaScript creation test, tool usage test.
not a priority
Also: JS geeks are who we are after.
- https://github.com/nomic-ai/gpt4all
- https://github.com/hwchase17/langchain
- https://github.com/ggerganov/llama.cpp
- https://huggingface.co/eachadea/ggml-vicuna-13b-1.1
- https://huggingface.co/philschmid/all-MiniLM-L6-v2-optimum-embeddings
- https://huggingface.co/Xenova/transformers.js
- https://huggingface.co/TheBloke