feat(actions): enable streaming in custom actions #735
base: develop
0156df1 to 2f3f862
Hello @drazvan @mikeolubode, I found a neat solution without altering the library. So, I'm just requesting my example be pulled. What do you think of the idea of using a local streaming handler that filters out and handles stream-stopping chunks ( |
2f3f862 to bfce7d8
Update: the duplicate chunks that result from streaming within an action and then also returning its result must still be handled.
Thanks for digging into this @niels-garve! (using Colang 2.0 syntax, as it might not be possible easily with Colang 1)
Thanks for your prompt reply, @drazvan ! I pushed another approach: what if we leverage the possibility that I had to alter the library code, though; removing the fallback “I'm not sure what to say.” But I also see a chance of reworking this, as an English default reply blocks multi-language support. What do you think? I like your Colang 2.0 approach, too. Could the "fact-checking" approach work for Colang 1.0?
(I'll gladly squash the commits in the end; I just wanted to keep the history while discussing.)
🚨 Updates in the discussions below
Fixes #646
Problem description
First of all, thanks for NeMo-Guardrails!
Consider two consecutive actions: the first is a custom RAG, and the second analyzes the answer to render a disclaimer in case the answer is not grounded in the knowledge base. It is like fact-checking, but with streaming enabled. The bot should answer and then finish with something like: "I learn something new every day, so my answers may not always be perfect."
Using streaming currently leads to two errors:
streaming_finished_event is set, which in turn is caused by an empty chunk ("") that is passed to on_llm_new_token. The existing if statement checks for empty chunks, but only when they occur at the beginning. In our case, it happens at the end. I extended the check so that "" is never processed.
_process function by adding an early return in case chunk == self.completion
How to test
I've added an example under examples/configs/rag/custom_rag_streaming which you can test like so:
Please also follow the README.md I've included.
I'm happy to hear your feedback!
@drazvan @mikeolubode