# RAG Lyon Data Science

Simple RAG (Retrieval Augmented Generation) using Vertex AI Generative AI (PaLM 2 models) and the Qdrant vector database, presented at the Lyon Data Science meetup.

## Installation

- The project was developed and tested using Python 3.10 on macOS:

  ```bash
  python3.10 -m venv ./venv
  source ./venv/bin/activate
  pip install -r requirements.txt
  ```

- Start a local Qdrant instance:

  ```bash
  docker pull qdrant/qdrant
  docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant
  ```
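You can confirm the instance is reachable with a quick check from Python. This is a minimal sketch, assuming the `qdrant-client` package is installed (e.g. via `requirements.txt`):

```python
# Minimal connectivity check for the local Qdrant instance started above.
from qdrant_client import QdrantClient

client = QdrantClient(host="localhost", port=6333)  # HTTP port from the docker run command
print(client.get_collections())  # should print an (initially empty) list of collections
```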

## Usage

This repository contains two scripts:

- `rag_indexing.py` indexes the content of `data/knowledge_base.json` (a list of question/answer pairs from the WikiQA dataset) into a Qdrant vector database (a sketch of this flow appears after this list):
  - Questions are embedded using the Vertex AI embedding model (Gecko).
  - The Qdrant URL can be configured in `constant.py` (default: localhost).
  - You can define the ports to use (HTTP and gRPC) when creating the Qdrant client (defaults are 6333 for HTTP and 6334 for gRPC).
- `rag_inference.py` performs the following steps (see the second sketch below):
  1. Ask the user for a question.
  2. Embed the question using the same model as during indexing.
  3. Retrieve the semantically nearest questions from the database.
  4. Build a context for the user's question, using the answers to the questions retrieved in the previous step.
  5. Build a prompt and ask an LLM (here, Bison) for the answer.
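To make the indexing flow concrete, here is a minimal sketch using the Vertex AI Python SDK and `qdrant-client`. The JSON field names (`question`, `answer`), the collection name `wikiqa`, and the GCP project placeholder are assumptions for illustration, not the repository's actual identifiers:

```python
# Sketch of the indexing flow (rag_indexing.py). Field names and the
# collection name are assumptions for illustration.
import json

import vertexai
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from vertexai.language_models import TextEmbeddingModel

vertexai.init(project="your-gcp-project", location="us-central1")  # assumed GCP setup

client = QdrantClient(host="localhost", port=6333)
model = TextEmbeddingModel.from_pretrained("textembedding-gecko")  # Gecko embeddings

with open("data/knowledge_base.json") as f:
    knowledge_base = json.load(f)  # assumed: [{"question": ..., "answer": ...}, ...]

# Gecko embeddings are 768-dimensional; use cosine distance for semantic search.
client.recreate_collection(
    collection_name="wikiqa",
    vectors_config=VectorParams(size=768, distance=Distance.COSINE),
)

points = []
for i, entry in enumerate(knowledge_base):
    embedding = model.get_embeddings([entry["question"]])[0].values
    points.append(PointStruct(id=i, vector=embedding, payload=entry))

client.upsert(collection_name="wikiqa", points=points)
```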
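And a matching sketch of the five inference steps, under the same assumptions; the prompt template is illustrative only:

```python
# Sketch of the inference flow (rag_inference.py). Collection name, payload
# fields, and the prompt template are assumptions for illustration.
from qdrant_client import QdrantClient
from vertexai.language_models import TextEmbeddingModel, TextGenerationModel

client = QdrantClient(host="localhost", port=6333)
embedding_model = TextEmbeddingModel.from_pretrained("textembedding-gecko")
llm = TextGenerationModel.from_pretrained("text-bison")  # Bison

# Step 1: ask the user for a question.
question = input("Your question: ")

# Step 2: embed the question with the same model as during indexing.
query_vector = embedding_model.get_embeddings([question])[0].values

# Step 3: retrieve the semantically nearest questions.
hits = client.search(collection_name="wikiqa", query_vector=query_vector, limit=3)

# Step 4: build a context from the answers of the retrieved questions.
context = "\n".join(hit.payload["answer"] for hit in hits)

# Step 5: build a prompt and ask the LLM for the answer.
prompt = (
    "Answer the question using only the context below.\n"
    f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)
print(llm.predict(prompt).text)
```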

Note that the scripts can easily be adapted to use another vector database or LLM (GPT, for example).

Overall process: see the RAG architecture diagram in the repository.