Chat against a specific document in the knowledge base #4049

Open
idiotscholarman opened this issue May 20, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@idiotscholarman

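Many of the documents uploaded to the knowledge base cover similar topics, so the K text chunks returned by retrieval may come from different documents, which makes the answers inaccurate. I'd like to be able to send a query together with a document name through the API, so that the conversation is restricted to that one document in the knowledge base. That way there's no need to upload, chunk, and embed the file again, which improves both speed and accuracy. Which part of the code would need to change to support this?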

@idiotscholarman idiotscholarman added the bug Something isn't working label May 20, 2024
@liudichen
Contributor

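It sounds like you could just chat with the LLM directly for this. Keep in mind, though, that the document may be longer than the model's context limit.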

@idiotscholarman
Author

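> It sounds like you could just chat with the LLM directly for this. Keep in mind, though, that the document may be longer than the model's context limit.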

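Chatting with the model directly doesn't work. Even if the question includes the document name, the name is treated as part of the query and goes straight into vector matching. The document name also doesn't appear in every chunk of that document, so the results are just as inaccurate. As others have suggested, the fix is to add a metadata field to every chunk of the same document at chunking and embedding time. Then, before running vector matching on the user's query, do a keyword match on the document name to filter down to the chunks that came from that document, and only run the vector search within those chunks.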
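To make the idea concrete, here is a minimal sketch of that two-step lookup in plain Python. Everything in it is an assumption for illustration: the `Chunk` class, the `source` metadata key, the `search_in_document` helper, and the hand-written 2-D embeddings are hypothetical and are not taken from this project's code. A real setup would use the project's own embedding model and vector store.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """One text chunk plus its embedding and per-document metadata (hypothetical layout)."""
    text: str
    embedding: list
    metadata: dict = field(default_factory=dict)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_in_document(chunks, query_embedding, doc_name, k=3):
    """Filter chunks by document name first, then rank only those by similarity."""
    # Step 1: keyword match on the metadata field written at chunking time.
    # The "source" key is an assumed name for that field.
    wanted = doc_name.lower()
    candidates = [c for c in chunks if wanted in c.metadata.get("source", "").lower()]
    # Step 2: vector search restricted to the chunks of the matched document.
    ranked = sorted(
        candidates,
        key=lambda c: cosine(query_embedding, c.embedding),
        reverse=True,
    )
    return ranked[:k]


# Toy data: two documents on a similar topic, with made-up 2-D embeddings.
chunks = [
    Chunk("Refund within 30 days.", [1.0, 0.0], {"source": "policy_2023.pdf"}),
    Chunk("Refund within 14 days.", [0.9, 0.1], {"source": "policy_2024.pdf"}),
    Chunk("Shipping takes 5 days.", [0.1, 0.9], {"source": "policy_2024.pdf"}),
]
query_embedding = [1.0, 0.0]  # stands in for the embedded user query

results = search_in_document(chunks, query_embedding, "policy_2024", k=1)
print(results[0].text)  # prints "Refund within 14 days."
```

In this toy data the chunk from `policy_2023.pdf` is the closest vector to the query, so an unfiltered top-1 search would return it. Filtering on the document name first is what keeps the answer inside the requested document.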
