Generative AI - Document Retrieval and Question Answering with LLMs (applying LLMs to domain-specific data)

Overview

With LLMs, we can integrate domain-specific data to answer questions. This is especially useful for data that was unavailable to the model during its initial training, such as a company's internal documentation or knowledge base.

Implementation code: TODO

This architecture is called Retrieval Augmented Generation (also known as Generative Question Answering). Below we will go through implementing this architecture using LLMs and a vector database.

Here are some use cases:

  • It reduces the time we need to spend interacting with documents
  • We no longer need to search for answers in search results ourselves
  • The LLM takes care of finding the most relevant documents and using them to generate the answer directly from those documents

Preparing

  • LangChain (or another chaining framework)
  • Documents
  • Vector Database

LangChain

LangChain is used for tasks like data loading, document chunking, vector stores, and interaction with text and embedding models. More detail on text embedding models is available in the LangChain documentation.

Documents

In this example we use the full Google Cloud Vertex AI documentation, loaded from https://cloud.google.com/vertex-ai/sitemap.xml. Any type of document is possible.

Combining LLMs and Vector Databases for Enhanced Question Answering Systems

Let's examine how we can combine the strengths of LLMs and Vector Databases to create a powerful document retrieval and question answering system.

Source: https://medium.com/google-cloud/generative-ai-document-retrieval-and-question-answering-with-llms-2b0fb80ae76d

Steps involved

  • Getting the documents and preprocessing them

Loading data

We use LangChain's sitemap document loader to load the documents and split them into smaller chunks. This reduces the size of the text that is sent to the LLM and has two main advantages:

  • The embedding created for each document chunk more accurately represents the information relevant to our question.
  • During retrieval, we receive smaller chunks and keep the LLM context small. This leads to lower latency and lower costs.

After this process, we have a folder containing the document chunks (a minimal loading sketch follows the list below). Those chunks are used in the next steps to:

  • Create the document embeddings
  • Answer our question
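Here is a minimal sketch of the loading and chunking step, assuming LangChain's SitemapLoader and RecursiveCharacterTextSplitter; the chunk size and overlap are illustrative values, not taken from the source article:

from langchain.document_loaders.sitemap import SitemapLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load every page listed in the Vertex AI sitemap
loader = SitemapLoader(web_path="https://cloud.google.com/vertex-ai/sitemap.xml")
documents = loader.load()

# Split the pages into smaller, overlapping chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = text_splitter.split_documents(documents)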

Document Embedding

The embedding step transforms our documents into a vector representation called an embedding. We can use an embedding model to encode each document chunk into a high-dimensional vector. This process captures the semantic information of the document in the form of a vector. Here is an example (VertexAIEmbeddings is used as an illustrative choice; any LangChain embedding class can be substituted):

from langchain.embeddings import VertexAIEmbeddings

# Illustrative embedding model; swap in any LangChain embedding class
embeddings = VertexAIEmbeddings()

text = "Vertex AI is a managed machine learning platform."  # example chunk text
doc_result = embeddings.embed_documents([text])

After this process, we have a vector representation for each document chunk.

Storing in Vector Database

A vector database enables efficient similarity search among these vectors, helping us retrieve the most relevant documents for a given query.
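As a minimal sketch, the chunk embeddings can be stored in a FAISS index through LangChain. FAISS is an assumption here, chosen as a simple local stand-in; any vector store supported by LangChain can be used instead:

from langchain.vectorstores import FAISS

# Build a vector index from the document chunks and the embedding model
db = FAISS.from_documents(chunks, embeddings)

# Optionally persist the index to disk for later reuse
db.save_local("vertex_ai_docs_index")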

Query Processing

When a query comes in, the same embedding model is used to transform the question into a vector.
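For example, continuing the sketch above (the question text is hypothetical):

query = "How do I create a custom training job in Vertex AI?"  # hypothetical question
query_vector = embeddings.embed_query(query)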

Document Retrieval

Based on the query vector from the previous step, we search the vector database for the document vectors that are most semantically similar to the question vector. The documents associated with these vectors are the most relevant to our query and will be used as context for our LLM.

The vector database only contains the embeddings and an identifier, not the actual text. LangChain is used to map the vector results back to the actual documents.
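A minimal retrieval sketch, again assuming the FAISS store from above (the LangChain vector store handles both the similarity search and the mapping back to the stored documents):

# Search with the precomputed query vector ...
relevant_docs = db.similarity_search_by_vector(query_vector, k=4)

# ... or let the vector store embed the query text itself
relevant_docs = db.similarity_search(query, k=4)

for doc in relevant_docs:
    print(doc.metadata.get("source"), doc.page_content[:100])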

Answer Generation

The retrieved documents are fed as context into the LLM, which generates an answer based on the information provided. The important part is the prompt structure.
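A minimal sketch of this step, assuming LangChain's RetrievalQA chain with a Vertex AI text model; the exact model and prompt structure in the source article may differ:

from langchain.llms import VertexAI
from langchain.chains import RetrievalQA

llm = VertexAI(temperature=0)  # Vertex AI text model via LangChain

# "stuff" places the retrieved chunks directly into the prompt as context
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=db.as_retriever(search_kwargs={"k": 4}),
)

answer = qa.run("How do I create a custom training job in Vertex AI?")
print(answer)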

Tuning vs. Indexing

We can either fine-tune the LLM or have it use an external index that is queried at runtime. Indexing has some advantages over tuning:

  • New documents can be added to the index without retraining the model (in real time)
  • We circumvent the model's context size limitations
  • Access to documents can be restricted at any time
  • It is cheaper
  • It is explainable, since answers can be traced back to the retrieved documents
  • It can be combined with prompt engineering

Conclusion

This article showcases how document retrieval and question-answering systems can be revolutionized by harnessing the power of LLMs combined with vector databases.

Credit

https://medium.com/google-cloud/generative-ai-document-retrieval-and-question-answering-with-llms-2b0fb80ae76d