different parsers #2

Open
gminorcoles opened this issue Jul 22, 2023 · 1 comment

@gminorcoles

This is cool, thanks for starting down this path with PyTorch. I came here from the HN discussion. I have been focusing on Llama 2, since it seems to me to be a bit of a tipping point, and it's time to dig into local LLMs.

So let's say I want to limit the output of Llama to valid Python code. The Python grammar no longer seems to be EBNF but rather PEG. For my purposes, I would actually be very happy if the LLM output were constrained to a valid Python AST. Either of these requires a different parser, and thus separate code to handle each case.
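
As a rough illustration of what "valid Python" means operationally, here is a minimal sketch that uses the standard library's `ast.parse` to accept or reject finished candidate outputs. It filters after generation rather than constraining tokens during decoding, so it is not the grammar-masking approach this project takes. The helper names `is_valid_python` and `first_valid` are made up for this example.

```python
import ast
from typing import Iterable, Optional


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses into a Python AST."""
    try:
        ast.parse(source)
    except (SyntaxError, ValueError):
        # SyntaxError: not valid Python; ValueError: e.g. null bytes on older versions.
        return False
    return True


def first_valid(candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate that parses, or None if none do.

    `candidates` stands in for several sampled model outputs (an assumption
    for this sketch; no model is called here).
    """
    for text in candidates:
        if is_valid_python(text):
            return text
    return None


if __name__ == "__main__":
    samples = ["def f(:\n    pass\n", "def f():\n    return 1\n"]
    print(first_valid(samples))
```

Rejecting whole outputs like this is wasteful compared with masking invalid tokens at each step, which is why a grammar (or AST) aware decoder would still be the more interesting target.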

Have you thought about how to handle this potential proliferation of parser formats? I am probably going to try to copy your approach and extend it to handle the Python AST.

@burke
Member

burke commented Aug 15, 2023

Hey, sorry, I'm chronically bad at GitHub notifications.

I think that's a really cool idea. My knowledge of the theoretical CS side of parsing is fuzzy at best, so I can't think through what an implementation would look like for PEG as opposed to EBNF/CFG, but I would be thrilled to see that work.

The actual EBNF parsing code here was ported directly from the equivalent feature in llama.cpp with essentially no changes. I think it would be really excellent if we could support a richer format, or a handful of them (this dialect of EBNF is quite rudimentary). I don't currently have any plans to work on this myself.
