This is cool, thanks for starting down this path with PyTorch. I came here from the HN discussion. I have been focusing on Llama 2, since it seems to me to be a bit of a tipping point and it's time to dig into local LLMs.
So let's say I want to limit the output of Llama to valid Python code. The Python grammar no longer seems to be EBNF but rather PEG. For my purposes I would actually be very happy if the LLM output were constrained to a valid Python AST. Either of these requires a different parser, and thus separate code to handle each case.
Have you thought about how to handle this potential proliferation of parser formats? I am probably going to try to copy your approach and extend it to handle Python AST.
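To make the "valid Python AST" target concrete, here is a minimal sketch of the acceptance check using the standard library's `ast.parse`. The helper name `is_valid_python` is made up for illustration. Note the limitation: this only judges complete source text, so on its own it can filter finished generations but cannot tell you which next tokens are still allowed mid-generation.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as a complete Python module.

    Hypothetical helper: a post-hoc filter, not a decoding-time constraint.
    """
    try:
        ast.parse(source)
    except (SyntaxError, ValueError):
        # SyntaxError: not valid Python; ValueError: e.g. null bytes on older versions.
        return False
    return True


print(is_valid_python("x = 1\n"))
print(is_valid_python("def f(:\n"))
```

Constraining generation token by token would additionally need a parser that can answer "can this prefix still be completed?", which `ast.parse` does not expose.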
Hey, sorry, I'm chronically bad at GitHub notifications.
I think that's a really cool idea. My knowledge of the theoretical CS side of parsing is fuzzy at best, so I can't think through what an implementation would look like for PEG as opposed to EBNF/CFG, but I would be thrilled to see that work.
The actual EBNF parsing code here was ported directly from the equivalent feature in llama.cpp with essentially no changes. I think it would be really excellent if we could support a richer format or a handful of them (this dialect of EBNF is quite rudimentary). I don't currently have any plans to work on this myself.
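One possible way to keep a handful of grammar formats from multiplying the sampling code is to hide each parser behind a single "is this still a valid prefix?" question. The sketch below is an assumption about how that could look, not how this repo or llama.cpp is structured: `PrefixValidator`, `BalancedParens`, and `allowed_tokens` are all hypothetical names, and the toy validator stands in for a real EBNF or PEG backend.

```python
from typing import Protocol, Sequence


class PrefixValidator(Protocol):
    """Hypothetical interface each grammar backend (EBNF, PEG, ...) would implement."""

    def accepts_prefix(self, text: str) -> bool:
        """Return True if `text` can still be extended to a valid string."""
        ...


class BalancedParens:
    """Toy backend: strings of '(' and ')' in which no ')' is unmatched."""

    def accepts_prefix(self, text: str) -> bool:
        depth = 0
        for ch in text:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth < 0:
                    return False  # closed more than was opened
            else:
                return False  # character outside the toy language
        return True


def allowed_tokens(
    validator: PrefixValidator, prefix: str, vocab: Sequence[str]
) -> list[str]:
    """Keep only the tokens that leave the output a valid prefix."""
    return [tok for tok in vocab if validator.accepts_prefix(prefix + tok)]


vocab = ["(", ")", "()", "))", "a"]
print(allowed_tokens(BalancedParens(), "(", vocab))
```

In a real sampler the surviving tokens would be used to mask the logits before sampling. Re-validating the whole prefix for every candidate token is quadratic, so a practical backend would keep incremental parser state instead.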