Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Indentation-sensitive grammar forgets consumed dedents #465

Open
simonwo opened this issue Dec 31, 2023 · 1 comment
Open

Indentation-sensitive grammar forgets consumed dedents #465

simonwo opened this issue Dec 31, 2023 · 1 comment

Comments

@simonwo
Copy link

simonwo commented Dec 31, 2023

Hi, I'm using Ohm v17.1.0 to build a whitespace-sensitive grammar using the experimental indentation support. I am struggling to get the following MVE grammar to correctly parse the example:

Blocks <: IndentationSensitive {
    Object = Block<ObjectHead, ObjectBody>
    ObjectHead = "key:"
    ObjectBody = ObjectHead "value"
        
    Block<Head, Body> = Head newline indent BlockBody<Head, Body> dedent
    BlockBody<Head, Body> = Block<Head, Body> BlockBody<Head, Body> -- block
                          | Body newline BlockBody<Head, Body>  -- stmt                          
                          | space* -- zero

    newline = "\n" "\r"?    
    space := " " | "\t"
}

And the example:

key:
    key:
        key: value
        key: value
    key: value

The parse fails at line 5 col 5 with Expected a dedent, a space, or "key:", which is weird because it should be able to consume a dedent, or key: if it has just consumed one. After banging my head against the desk for a while, I've stepped through the parsing code and whilst it does consume the dedent, when it returns to the same position to continue the parse it seems to still be expecting one. AFAICS the result of attempting to parse at this position is memoized when it should not be, because the memoized version does not take into account the consumed dedent.

Specifically, if I comment out this line which resets memoization for a match state that has it enabled

if (state.doNotMemoize) {
state.doNotMemoize = false;
} else if (isHeadOfLeftRecursion) {

...the parse works as expected.

However, I'm not really sure why this line is here so not sure if I've found a bug or have just two-wrongs-make-a-righted myself.

I'd be happy to work on and submit a patch for this, but I could do with a positive confirmation that this is a problem, and pointers to any other areas that I might need to change as result. No doubt it's all more complicated than I understand at the moment! Thanks!

@simonwo
Copy link
Author

simonwo commented Mar 5, 2024

FYI this patch solves the issue for me:
01-keep-memoization-disabled.patch

And this one provides some more debugging information:
02-output-indents-in-trace.patch

ianb added a commit to ianb/ohm that referenced this issue Jun 3, 2024
See ohmjs#465 (comment)

This changes memoization some so it can consume dedents
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant