-
Notifications
You must be signed in to change notification settings - Fork 221
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(fmt): An attempt at aesthetic items into PL #4639
base: main
Are you sure you want to change the base?
Conversation
This is an attempt to get around the complications of managing lexer + parser output, which PRQL#4397 has hit in a few incarnations by just adding comments ('aesthetics') to PL. This very very nearly works -- with chumsky we can create a function that wraps anything that might have a comment, implement a trait on the AST items that contain it, and away we go (though it did require a lot of debugging in the end). This would then be really easy to write back out. I think there's literally a single case where it doesn't work -- where a comment doesn't come directly before or directly after an AST item -- in the final trailing comma of a tuple or array. So tests fail at the moment. Next we need to consider: - Can we workaround that one case? We don't actually care about whether there's a trailing comma, so we could likely hack around it... - Are there actually other cases of this model failing? I know this approach -- of putting aesthetic items into AST -- is not generally favored, and it's really rare that there's even a single case of something not working.
One thing we could do land from this is doc-comments — which must be attached to an item, and we may want to push through the AST. It's much less important to the project than getting |
Oh, this is an interesting idea: instead of discarding all comments and new lines, we keep them in the first AST, so they can be re-incorporated into codegen. It's a shame that it doesn't work for all the cases. It would be a beautiful solution for formatting comments. Could we parse trailing comma as an aesthetic too? re doc comments: yes, they are very similar. I think they should be allowed only on statements, so that's even simpler to parse. |
Yeah, I added some more thoughts at #4397 (comment). It's possible to do it this way, but not as elegant as I first thought — I think it would require lots of backtracking and custom parsers (i.e. not just |
By adding comments (named "aesthetics" here, and includes linewraps) to PL, this is an attempt to get around the complications of combining lexer + parser output in
prqlc fmt
, which #4397 has hit in a few incarnations.This very very nearly works — with chumsky we can create a function that wraps anything that might have a trailing or following comment, implement a trait on the AST items that contain it — and away we go. (though it did require lots of debugging in the end...). The AST would then be really easy to write back out.
This requires comments to lead or follow tokens that are part of an AST item. I think there's literally a single case where it doesn't work, which is when a comment follows the final trailing comma of a tuple or array. So apart from that case, a comment always leads or follows a token that's part of an AST item.
...so tests fail at the moment, on that case.
Next we need to consider: