This example deploys a recursive query decomposition example for chat Q&A.
Query decomposition can perform RAG when the agent needs to access information from several different documents (also referred to as chunks) or to perform some computation on the answers. This example uses a custom LangChain agent that recursively breaks down the questions into subquestions. The agent then attempts to answer the subquestions.
The agent has access to two tools:
- search: to perform standard RAG on a subquestion.
- math: to pose a math question to the LLM.
The agent continues to break down the question into subquestions until it has the answers that it needs to form the final answer.
Model | Embedding | Framework | Vector Database | File Types |
---|---|---|---|---|
meta/llama3-70b-instruct | nvidia/nv-embedqa-e5-v5 | LangChain | Milvus | TXT, PDF, MD |
Complete the common prerequisites.
-
Export your NVIDIA API key as an environment variable:
export NVIDIA_API_KEY="nvapi-<...>"
-
Start the containers:
cd RAG/examples/advanced_rag/query_decomposition_rag/ docker compose up -d --build
Example Output
✔ Network nvidia-rag Created ✔ Container rag-playground Started ✔ Container milvus-minio Started ✔ Container chain-server Started ✔ Container milvus-etcd Started ✔ Container milvus-standalone Started
-
Confirm the containers are running:
docker ps --format "table {{.ID}}\t{{.Names}}\t{{.Status}}"
Example Output
CONTAINER ID NAMES STATUS 39a8524829da rag-playground Up 2 minutes bfbd0193dbd2 chain-server Up 2 minutes ec02ff3cc58b milvus-standalone Up 3 minutes 6969cf5b4342 milvus-minio Up 3 minutes (healthy) 57a068d62fbb milvus-etcd Up 3 minutes (healthy)
-
Open a web browser and access http://localhost:8090 to use the RAG Playground.
Refer to Using the Sample Web Application for information about uploading documents and using the web interface.
- Vector Database Customizations
- Stop the containers by running
docker compose down
. - Use the RAG Application: Query Decomposition Agent Helm chart to deploy this example in Kubernetes.