Feature Request: Custom Rules #365
Comments
I'd like to see an example of some of the syntax you're referring to first.
Unfortunately, I can't give concrete examples, as the project isn't open source (yet). Without giving too much away, consider a heredoc-like syntax where A) the inner language is not expressible as a regular grammar, and B) the heredoc terminator is only accepted if it appears at certain points within the sub-language; otherwise, it is consumed as part of the sub-language, and there must be another terminator elsewhere. As a result, once the heredoc starts, custom logic has to figure out where it ends and then validate that what's in between is even allowable; if it is not, lexing (not parsing) terminates. If it weren't for point (B), a
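To make the shape of the problem concrete, here is a minimal sketch in Go. It is not the real (unpublished) syntax: the sub-language is a stand-in made of nested parentheses, and the function name `findHeredocEnd` is hypothetical. The terminator only counts at nesting depth 0 (point B), nesting is not regular (point A), and invalid body text stops the lex with an error.

```go
package main

import (
	"fmt"
	"strings"
)

// findHeredocEnd scans the text that follows a heredoc opener and returns the
// offset at which the terminator starts. The sub-language is an assumed
// stand-in (nested parentheses), not the syntax discussed in this issue.
func findHeredocEnd(input, terminator string) (int, error) {
	depth := 0
	for i := 0; i < len(input); i++ {
		switch {
		case input[i] == '(':
			depth++
		case input[i] == ')':
			if depth == 0 {
				// The body is not valid in the sub-language: lexing stops.
				return 0, fmt.Errorf("offset %d: unbalanced ')'", i)
			}
			depth--
		case depth == 0 && strings.HasPrefix(input[i:], terminator):
			// The terminator is only accepted outside any parentheses.
			return i, nil
		}
		// At depth > 0 the terminator text is ordinary body content.
	}
	return 0, fmt.Errorf("unterminated heredoc: no %q at depth 0", terminator)
}

func main() {
	// The first END sits inside parentheses, so only the second one ends the body.
	end, err := findHeredocEnd("(a END b) c END tail", "END")
	fmt.Println(end, err)
}
```

A single regular expression cannot express this, because deciding whether a given `END` counts requires tracking arbitrarily deep nesting.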
I'm not necessarily opposed to the stateful lexer being extensible, but I won't accept a backward-incompatible change. From briefly looking at your code, I would suggest looking at extending That said, without any concrete examples/tests showing use-cases, I won't accept it either.
Of course. Like I mentioned, this was a quick "What if?", and any actual PR I would submit would be backward compatible, with documentation, test coverage, and no regressions. Given that none of the methods of Action are exported, I'm not sure I follow your suggestion, and even looking at the unexported method, I don't see a simple way to have it generate additional tokens, but perhaps I misunderstand. Are you suggesting exporting Action's method and modifying it to optionally return tokens as well as modify the state?
I'm proposing you extend the private
Ah, and rules must also be serialisable to JSON.
Before you do anything, you should extract a representative example (obfuscated if necessary) and include it in this issue. And perhaps an example of how you would use this proposed new functionality to lex it.
On the serialization note, is that just for diagnostic purposes, or does it need to be able to round-trip? My goal is to be able to inject an arbitrary function to execute, like in the linked fork, which obviously wouldn't be able to round-trip without having some sort of global lookup table to register functions to on initialization.
It needs to be able to round-trip, but I think for this case it could just return an error.
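One way the "just return an error" option could look is sketched below. The `Rule` type, its fields, and `encodeRules` are hypothetical stand-ins written for this example, not participle's actual types; only `encoding/json`'s `Marshaler` interface is real. Pattern-based rules serialise normally, while a rule carrying a function refuses to.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Rule is an assumed, simplified rule: either a regex pattern or a custom function.
type Rule struct {
	Name    string
	Pattern string
	Custom  func(input string) int // hypothetical hook; cannot be written to JSON
}

// ErrNotSerialisable is reported for rules backed by a custom function.
var ErrNotSerialisable = errors.New("rule uses a custom function and cannot be serialised")

// MarshalJSON implements json.Marshaler. Function-backed rules return an error
// instead of silently dropping the function.
func (r Rule) MarshalJSON() ([]byte, error) {
	if r.Custom != nil {
		return nil, fmt.Errorf("rule %q: %w", r.Name, ErrNotSerialisable)
	}
	return json.Marshal(struct {
		Name    string `json:"name"`
		Pattern string `json:"pattern"`
	}{r.Name, r.Pattern})
}

// encodeRules serialises a rule set, failing if any rule is function-backed.
func encodeRules(rules []Rule) (string, error) {
	b, err := json.Marshal(rules)
	return string(b), err
}

func main() {
	out, err := encodeRules([]Rule{{Name: "Ident", Pattern: "[a-z]+"}})
	fmt.Println(out, err)

	_, err = encodeRules([]Rule{{Name: "Body", Custom: func(string) int { return 0 }}})
	fmt.Println(err)
}
```

The alternative mentioned above, a global lookup table of registered functions, would allow a true round-trip but requires every function to be registered under a stable name before deserialising.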
Background: I am working on an experimental language where, for the most part, simple regular expression rules are capable of lexing the source, but after certain tokens, the next token requires complex logic to identify before returning to the regular expression rules.
Problem: As far as I can tell, this is impossible without defining my own lexer.Definition and lexer.Lexer instances from scratch; the existing functionality in lexer.StatefulDefinition cannot be reused short of copy-pasting it.
Proposed Solution: Extend StatefulDefinition's Rule (or make StatefulDefinition a subset of a more comprehensive API) to accept anything that is sufficiently "regex-like": something that can take the parent state's name and captured groups, as well as the input data, and then either terminate the lex, report no match, or report the end point of the next token and the action to take.
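A rough sketch of what such a "regex-like" contract might look like in Go follows. Every name here (`Matcher`, `Match`, `NewRegexMatcher`, `FuncMatcher`, `untilDelimiter`, the string-valued `Action`) is an assumption made for illustration; none of it is participle's existing API or the code in the fork.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Match describes a successful match (hypothetical type).
type Match struct {
	End    int    // exclusive end offset of the token within input
	Action string // assumed state action, e.g. "" (none) or "pop"
}

// Matcher is the assumed "regex-like" contract: ok=false means no match,
// and a non-nil error terminates the lex.
type Matcher interface {
	Match(parent string, groups []string, input string) (m Match, ok bool, err error)
}

// RegexMatcher adapts an ordinary regular expression to Matcher.
type RegexMatcher struct{ re *regexp.Regexp }

func NewRegexMatcher(pattern string) RegexMatcher {
	return RegexMatcher{regexp.MustCompile(pattern)}
}

func (r RegexMatcher) Match(_ string, _ []string, input string) (Match, bool, error) {
	loc := r.re.FindStringIndex(input)
	if loc == nil || loc[0] != 0 {
		return Match{}, false, nil // only matches at the current position count
	}
	return Match{End: loc[1]}, true, nil
}

// FuncMatcher adapts an arbitrary function to Matcher.
type FuncMatcher func(parent string, groups []string, input string) (Match, bool, error)

func (f FuncMatcher) Match(parent string, groups []string, input string) (Match, bool, error) {
	return f(parent, groups, input)
}

// untilDelimiter consumes input up to the delimiter captured by the parent rule.
var untilDelimiter = FuncMatcher(func(parent string, groups []string, input string) (Match, bool, error) {
	if len(groups) < 2 {
		return Match{}, false, nil // parent captured no delimiter: no match
	}
	i := strings.Index(input, groups[1])
	if i < 0 {
		return Match{}, false, fmt.Errorf("state %s: unterminated body, expected %q", parent, groups[1])
	}
	return Match{End: i, Action: "pop"}, true, nil
})

func main() {
	m, ok, err := NewRegexMatcher(`[a-z]+`).Match("Root", nil, "abc 123")
	fmt.Println(m.End, ok, err)

	m, ok, err = untilDelimiter.Match("Body", []string{"<<EOT", "EOT"}, "hello EOT rest")
	fmt.Println(m.End, m.Action, ok, err)
}
```

Under this shape, existing regex rules become one implementation of the interface, which is what would let the change stay backward compatible.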
I have a very rudimentary proof-of-concept here, which is not backwards-compatible, breaks all of the tests, and isn't particularly well-written, but nonetheless works.
Would you be interested in working together to implement this in a way consistent with the current API, or would you prefer I maintain my own fork?