
Add experimental question rephrase and chunks rerank #1518

Open · wants to merge 4 commits into main
Conversation

@fkesheh (Contributor) commented Mar 4, 2024

Question rephraser: rewrites the user's question before the embedding lookup, improving what is retrieved. It also enables multi-turn conversation, since it rephrases the question using the surrounding context and the prompt. Very useful for assistants.
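To make the idea concrete, here is a minimal sketch of how the rephrase prompt could be assembled from recent chat history. The names (`buildRephrasePrompt`, `historyWindow`) are illustrative assumptions, not this PR's actual API:

```typescript
// Hypothetical sketch: build a prompt asking a model to rewrite the latest
// user question as a standalone question, using recent history as context.

interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

function buildRephrasePrompt(
  history: ChatMessage[],
  question: string,
  historyWindow: number = 4 // how many past messages the rephraser may see
): string {
  const context = history
    .slice(-historyWindow)
    .map(m => `${m.role}: ${m.content}`)
    .join("\n");
  return [
    "Rephrase the follow-up question as a standalone question,",
    "using the conversation below for context.",
    "",
    `Conversation:\n${context}`,
    "",
    `Follow-up question: ${question}`,
    "Standalone question:"
  ].join("\n");
}
```

The resulting prompt would be sent to a small model, and the rephrased question (not the original) is then embedded for retrieval.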

[Demo video attachment: Untitled.video.-.Made.with.Clipchamp.3.mp4]

@fkesheh fkesheh marked this pull request as ready for review March 4, 2024 13:15
@mckaywrigley (Owner) commented:

Love the idea. Going to dig into this a bit more on Thursday.

@fkesheh (Contributor, Author) commented Mar 5, 2024

> Love the idea. Going to dig into this a bit more on Thursday.

Right, then I will also try to push the reranker today.

@fkesheh fkesheh changed the title Add experimental question rephraser and update retrieval route and chat helpers Add experimental question rephrase and chunks rerank Mar 11, 2024
@fkesheh (Contributor, Author) commented Mar 11, 2024

Added the re-ranker in this PR as well. The re-ranker takes all the retrieved chunks, evaluates each one in light of the question, and returns only the top chunks to the LLM. It's based on this post: https://medium.com/@foadmk/enhancing-data-retrieval-with-vector-databases-and-gpt-3-5-reranking-c58ec6061bde
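The selection step after the LLM has scored the chunks can be sketched like this. The score shape and `topK` parameter are assumptions for illustration, not the PR's actual implementation:

```typescript
// Hypothetical sketch: given chunks with per-chunk relevance scores
// (as an LLM judge might return them), keep only the highest-scoring
// chunks to pass to the answering model.

interface ScoredChunk {
  text: string;
  score: number; // relevance to the question, e.g. 0-10 from the LLM judge
}

function rerankChunks(chunks: ScoredChunk[], topK: number): string[] {
  return [...chunks]
    .sort((a, b) => b.score - a.score) // highest relevance first
    .slice(0, topK)
    .map(c => c.text);
}
```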

@fkesheh (Contributor, Author) commented Mar 14, 2024

Hello, indeed, this functionality is activated only when there are files present in the chat.

The purpose of the rephraser is to modify your query by considering the surrounding conversation. The parameter you mentioned adjusts how much of the conversation history it is allowed to view. This is particularly handy for follow-up questions where the context might be minimal, such as when asking for further details on a specific point. The rephraser is also useful when the initial question lacks sufficient context. You can find more details about its application here: https://twitter.com/FKesheh84/status/1767184356009710029?t=68zSSXNMdQV-0ty5c5Z66w&s=19. Essentially, the rephraser aims to generate text that improves the retrieval of relevant information from the database.

The reranker, on the other hand, operates toward the end of the process. It assesses the information chunks pulled from the database and selects the most relevant ones for the query, somewhat similar to Cohere's reranking. Because it needs to process a large amount of text, the reranker uses nearly the entire context window (16k tokens), which incurs costs: expensive with GPT-4, but manageable with GPT-3.5. For a detailed explanation of the reranker, refer to this article: https://medium.com/@foadmk/enhancing-data-retrieval-with-vector-databases-and-gpt-3-5-reranking-c58ec6061bde.

After reranking, the selected chunks are passed back to the user-selected model (e.g. GPT-4) to generate the final answer.
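The order of the steps described above can be sketched as a small pipeline. Each stage is injected as a function so the control flow is visible without real LLM or database calls; every name here is illustrative, not the PR's code:

```typescript
// Hypothetical sketch of the described flow:
// rephrase -> retrieve -> rerank -> answer.

type Step = (input: string) => string;

function ragPipeline(
  question: string,
  rephrase: Step,                                    // small model rewrites the question
  retrieve: (q: string) => string[],                 // embedding search over file chunks
  rerank: (q: string, chunks: string[]) => string[], // LLM keeps the top chunks
  answer: (q: string, chunks: string[]) => string    // user-selected model answers
): string {
  const standalone = rephrase(question);             // standalone, context-aware question
  const chunks = retrieve(standalone);               // retrieval uses the rephrased text
  const best = rerank(standalone, chunks);           // prune to the most relevant chunks
  return answer(standalone, best);                   // final generation step
}
```

In the real implementation the stages would be async API calls, but the wiring is the same.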

These techniques significantly enhance Retrieval-Augmented Generation (RAG), making it effective and reliable for answering straightforward questions. However, the approach is not flawless; a more robust alternative for production environments might be a ReAct agent.

@fkesheh (Contributor, Author) commented Mar 14, 2024

This is the sequence diagram:
[Sequence diagram image]

I will double-check whether there is any issue and leave a comment here.

@spammenotinoz (Contributor) commented:

Thank you, this is working really well!

@ivanfioravanti (Contributor) commented:

Amazing job @fkesheh

@ndroo commented Jun 2, 2024

This is certainly a scope-creep suggestion, but is there any chance something like this could be refactored to customize the responding model? I.e., if the user wants to ask about an image but the model isn't 4o, can we send the request to 4o instead, so the user gets a response that isn't just "I can't do that..."?

5 participants