Normalization of line-endings in template literals #90
Comments
I notice the tree doesn't have But like I said, that's instinct. I'm curious to hear whether this is the intention. |
should parse to ESTree output like
So does that mean there's no way to literally reconstruct the exact source from the AST (if it included I know things like whitespace (and comments) are typically not kept in the AST, but I also have the parallel hopes that the AST will be extended to support concrete syntax, with the express goal of being able to keep absolutely everything. If the actual code points aren't going to be kept in the AST, this issue seems like another wrinkle that would need to be considered for concrete syntax preservation. |
I have that same hope, but doing it right means representing program source in a way that is agnostic of AST node type—no special treatment for nodes that happen to represent Literal input elements. So for this issue in particular, the AST (i.e., In other words, |
There is - we still have ranges for that. In any case, I believe that according to spec both "raw" and "cooked" representations should be processed at least in sense of |
The spec calls for special processing (normalization) on line-endings in template literals:
http://people.mozilla.org/~jorendorff/es6-draft.html#sec-static-semantics-tv-s-and-trv-s
See specifically the note at the end of that section, which says:
So, I have several questions relating to how (if at all?) the AST spec deals with this:
Does the parser do all this normalization before creating the AST, or does the AST need to preserve the actual information in the code so it's handled post-AST (like in interpretation/code-gen/etc)?
If the parser handles the normalization (changing occurrences of U+000D and U+000DU+000A to U+000A) before producing the tree, then should it do that for both the node value and the raw?
My instinct would say that raw should preserve the original U+000D or U+000DU+000A sequences (pre-normalization). However, the spec says that the template literal's raw value is post-normalization, so perhaps the parser/AST should also normalize its raw? Will it be confusing if the AST raw property and the template literal raw property don't match?
But that would mean that you couldn't completely faithfully recreate a JS file that had such line-endings mixed into its template literals. That seems like a bad thing.
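To make the trade-off concrete, here is a rough sketch of the "normalize raw, recover the original through ranges" option. The helper name normalizeLineEndings, the node shape, and the range convention are assumptions for illustration only, not something the AST spec defines here.

```javascript
// Hypothetical helper: turn CR and CRLF into LF, as the spec note
// describes for both TV and TRV.
function normalizeLineEndings(text) {
  return text.replace(/\r\n?/g, "\n");
}

// Source text with a real CR LF inside the template literal
// (actual code units, not an escape sequence).
const source = "`a\r\nb`";

// Assumed range convention: the text between the backticks.
const range = [1, source.length - 1];
const originalText = source.slice(range[0], range[1]);

// Assumed node shape, for illustration only.
const element = {
  type: "TemplateElement",
  value: {
    raw: normalizeLineEndings(originalText),
    // No escape sequences in this input, so cooked equals raw.
    cooked: normalizeLineEndings(originalText),
  },
  range,
};

// The node carries the post-normalization value...
console.log(JSON.stringify(element.value.raw)); // "a\nb"

// ...while the exact source is still recoverable from the range.
console.log(JSON.stringify(source.slice(...element.range))); // "a\r\nb"
```

With this approach the AST raw matches the template literal's raw, and exact reconstruction depends on keeping the original source text around alongside the tree.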
The spec says that an actual \r or \r\n escape sequence in the string is not normalized, only the U+000D / U+000DU+000A values themselves. However, the human-readable representation of the AST (which is often JSON stringification) would represent a U+000D value from the code as \r. So how would you tell the difference? Would a \r actually show up as \\r instead?
+@allenwb @RReverser
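For reference, a quick check of how JSON stringification treats the two cases. This only shows what JSON.stringify does with plain strings; how a particular parser fills in raw is a separate question, and the variable names are just for illustration.

```javascript
// One code unit: an actual U+000D, as it would appear if a raw value
// kept a carriage return taken straight from the source.
const realCR = "\r";

// Two code units: a backslash followed by "r", which is what the raw
// text of a \r escape sequence looks like.
const escapeText = "\\r";

// String.raw confirms that the raw text of an escape keeps the backslash.
const rawOfEscape = String.raw`\r`;
console.log(rawOfEscape === escapeText); // true
console.log(rawOfEscape.length);         // 2

const realCRJson = JSON.stringify(realCR);
const escapeTextJson = JSON.stringify(escapeText);

console.log(realCRJson);     // "\r"
console.log(escapeTextJson); // "\\r"

// Parsing the JSON back yields the original, distinct strings.
console.log(JSON.parse(realCRJson) === realCR);         // true
console.log(JSON.parse(escapeTextJson) === escapeText); // true
```

So in JSON output the two stay distinguishable: the escape sequence's raw text is written with a doubled backslash, while a real carriage return is written with a single one.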