Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convenience API for dealing with variable scopes #13

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

adams85
Copy link
Owner

@adams85 adams85 commented May 28, 2024

This PR is kind of an experiment to find out if exposing the variable scopes tracked by the parser for checking variable redeclarations makes sense in the first place, and if so, how it can be technically achieved and what the public API of such a feature should look like.

My goal with this is to provide a starting point to help us understand the problem better, explore our options so we can discuss them and shape the implementation and API accordingly.

The main requirement regarding the implementation is that it must not degrade the performance of the primary consumer (Jint execution engine) significantly. This applies to both execution time and memory allocation + the memory footprint of the resulting AST. Big changes to the parser's logic are not allowed either, as it should stay a close translation of acornjs.

Beyond that, the main goal, of course, is to come up with the most convenient API possible for the consumers. However, I don't intend to implement and maintain an "all batteries included" solution but a general one that provides access to the scope information (and maybe some convenience features in the Extras package) for consumers who want to deal with variable scopes. That is, a limited feature set is preferred, which leaves the possibility open for consumers to work with variable scopes in a way that is optimized to their specific use cases (I mean primarily the freedom to choose the data structure to store the scope tree.)

All ideas and feedback are welcome!

@adams85 adams85 marked this pull request as draft May 28, 2024 23:02
@adams85 adams85 force-pushed the experiment/expose-scopes branch 2 times, most recently from 7cf389a to c50dfbe Compare May 29, 2024 11:22
@adams85 adams85 force-pushed the experiment/expose-scopes branch from c50dfbe to 84e3a67 Compare June 1, 2024 18:11
@adams85 adams85 changed the base branch from master to fix-pr June 1, 2024 18:29
@adams85 adams85 changed the base branch from fix-pr to master June 1, 2024 18:29
@adams85 adams85 force-pushed the experiment/expose-scopes branch 2 times, most recently from 4e7921c to 5e1eacb Compare June 1, 2024 18:44
@adams85
Copy link
Owner Author

adams85 commented Jun 1, 2024

As a next step, I plan to do some dogfooding with my ES module bundler. That, however, uses the core package only and will implement a custom scope tree data structure. In other words, this won't do any dogfooding to ScopeInfo in the Extras package.

So it would be great if you can provide feedback on how satisfied you are with the API of ScopeInfo, what additional features you need or would include, etc.

@Divulcan
Copy link

Divulcan commented Jun 1, 2024

So it would be great if you can provide feedback on how satisfied you are with the API of ScopeInfo, what additional features you need or would include, etc.

The main limitation is probably not being able to link any given node to a scope, however, with the current API I can make extension methods that cover everything that I need.
Example of what the end-goal is:
chrome_Zln5hJtBdW

Basically, handling cases where nodes are part of an outer scope and not the closest one (test expressions in control flow ops, EHs, etc).

If you think this might be a good fit for the extras package, I can make a proposal with better defined concepts once I have a clear vision of the extension features I plan on implementing.

@adams85 adams85 force-pushed the experiment/expose-scopes branch from 5e1eacb to 5b842c3 Compare June 2, 2024 09:26
@adams85
Copy link
Owner Author

adams85 commented Jun 2, 2024

The main limitation is probably not being able to link any given node to a scope, however, with the current API I can make extension methods that cover everything that I need.

If you need to link any given node to a scope, you can do parserOptions.RecordScopeInfoInUserData().RecordParentNodeInUserData(), which will store the parent nodes in Node.UserData (or ScopeInfo.UserData when Node.UserData is used for storing a scope info object). Then, you can walk up the node tree until you find a node whose UserData is a ScopeInfo.

I know it's not the most streamlined dev. experience, however it seems we can't really do any better if we want to stick to the performance-oriented design guidelines outlined in the PR description.

If you think this might be a good fit for the extras package, I can make a proposal with better defined concepts once I have a clear vision of the extension features I plan on implementing.

Sure, now that the basics are more or less figured out, we can move on to polishing DX. My only request is to avoid adding features too specific to your use case. I'd like to keep this lib general, that is, only include features that most of the use cases are expected to benefit from.

@adams85 adams85 force-pushed the experiment/expose-scopes branch from e0d83c9 to b8ca906 Compare June 15, 2024 21:21
@adams85
Copy link
Owner Author

adams85 commented Jun 16, 2024

Public API's taking shape nicely, plus I did some tweaks and could even improve the max. recursion depth. Seems like we're also ok performance-wise:

Acornima.Benchmark.FileParsingBenchmark

Diff Method FileName Mean Error Allocated
Old AcornimaParse bundle 328.320 ms 5.1226 ms 75.52 MB
New 327.107 ms (0%) 4.6519 ms 75.52 MB (0%)
Old AcornimaParse jquery-1.9.1 8.097 ms 0.0784 ms 3.19 MB
New 7.999 ms (-1%) 0.0719 ms 3.19 MB (0%)
Old AcornimaParse yui-3.12.0 6.323 ms 0.0123 ms 2.55 MB
New 6.267 ms (-1%) 0.0251 ms 2.55 MB (0%)

@adams85
Copy link
Owner Author

adams85 commented Jun 17, 2024

Uh-oh... we have a problem.

Let's consider we have a simple function declaration with default parameters:

function f(a = y) { var y = 2 }

For this script, acorn reports the following scope info:

Program (f)
 └─FunctionDeclaration (a, y)
    ├─Identifier
    ├─AssignmentPattern
    │  ├─Identifier
    │  └─Identifier
    └─BlockStatement
       └─VariableDeclaration
          └─VariableDeclarator
             ├─Identifier
             └─Literal

The problem here is that variable y should not be visible from the parameter list of the function. If you call function f, you get "ReferenceError: y is not defined".

This happens because acorn doesn't create separate scopes for the function parameter list and the function body but merges the declarations in the two into one scope. This makes the feature in its current form pretty useless for consumers (e.g. for my ES module bundler). It would only be useful if it produced this:

Program (f)
 └─FunctionDeclaration (a)
    ├─Identifier
    ├─AssignmentPattern
    │  ├─Identifier
    │  └─Identifier
    └─BlockStatement (y) - or (a, y) would probably be acceptable here too
       └─VariableDeclaration
          └─VariableDeclarator
             ├─Identifier
             └─Literal

I'll take a look to find out whether we could tweak the parser logic to output the correct scope info without major changes and a significant performance penalty. But there is a chance that this whole effort goes down the drain in the end...

@adams85 adams85 force-pushed the experiment/expose-scopes branch from 671bc9c to 0be0672 Compare June 19, 2024 23:05
@adams85
Copy link
Owner Author

adams85 commented Jun 20, 2024

Good news, looks like the PR is saved as I could figure out an unobtrusive way to tackle the issues described in my previous comment.

OnNode still reports a single scope for function and catch clause nodes, but the parser now stores two numbers (VarParamCount, LexicalParamCount) into Scope. These numbers tell how many variables are parameters in the corresponding spans actually.

Thanks to this extra info, it became possible to build a scope tree with separate scopes for function/catch clause parameter lists and function/catch clause bodies. For function f(a = y) { var y = 2 }, the following tree is produced now:

Program* [f]
 └─FunctionDeclaration [a, f]
    ├─Identifier
    ├─AssignmentPattern
    │  ├─Identifier
    │  └─Identifier
    └─BlockStatement* [y]
       └─VariableDeclaration
          └─VariableDeclarator
             ├─Identifier
             └─Literal

I also made some further improvements:

The name of a function declaration or function expressions is available in the function's scope. E.g. an expression like (function f(g = f) { return g })() returns the function itself. So I altered the ScopeInfo tree builder logic to include the function name in the function's scope:

Program* []
 └─ExpressionStatement
    └─CallExpression
       └─FunctionExpression [f, g]
          ├─Identifier
          ├─AssignmentPattern
          │  ├─Identifier
          │  └─Identifier
          └─BlockStatement* []
             └─ReturnStatement
                └─Identifier

Similarly, the name of a class declaration or named class expressions is available in the class's scope. (Which means that my statement "Classes don't create variable scopes as you can't directly declare variables in the body of a class, just properties and methods." wasn't entirely true. The class name is a special variable in the class's scope.) E.g. an expression like (new class C extends function() { this.x = C.X } { static X = 1 }).x returns 1.

Program* []
 └─ExpressionStatement
    └─MemberExpression
       ├─NewExpression
       │  └─ClassExpression [C]
       │     ├─Identifier
       │     ├─FunctionExpression []
       │     │  └─BlockStatement* []
       │     │     └─ExpressionStatement
       │     │        └─AssignmentExpression
       │     │           ├─MemberExpression
       │     │           │  ├─ThisExpression
       │     │           │  └─Identifier
       │     │           └─MemberExpression
       │     │              ├─Identifier
       │     │              └─Identifier
       │     └─ClassBody
       │        └─PropertyDefinition
       │           ├─Identifier
       │           └─Literal
       └─Identifier

This was problematic though because the parser didn't create scopes for classes to do variable redeclaration detection as they're not needed for that purpose. So, I decided to bite the bullet and changed the parser to create scopes for classes too. (Hopefully won't affect perf. much but I'll need to do some benchmarks w.r.t. this.)

After making these changes, I also updated my bundler project to use the scope info from the parser to build its scope tree. I could make it work, what's more, this simplified the related logic pretty much. 🎉

So the feature seems to turn out useful in the end. So I plan to include the core functionality (the changes to the parser and OnNode) in the master branch, then rebase this PR so it will be about the extras part (ScopeInfo) later on.

@adams85 adams85 force-pushed the experiment/expose-scopes branch from e1004c6 to 6910e38 Compare June 22, 2024 18:11
@adams85
Copy link
Owner Author

adams85 commented Jun 22, 2024

The core functionality (exposing the raw scope info via ParserOptions.OnNode) has been merged into master and shipped with v1.1.0.

So from now on, this PR is about the convenience features living in the Extras package (RecordScopeInfoInUserData, ScopeInfo, etc.), which should provide some handy APIs for consumers to deal with variable scopes. Stuff like what was proposed here.

The starting point is already there, including the logic that does the heavy-lifting. I'm aware of one thing that's still missing from it: in non-strict mode function declarations are also hoisted. Probably I'll implement this (will be tricky though), but apart from this, I don't have plans with this PR for now. I'd leave it up to the community to shape the implementation to their needs. So, @Divulcan, if you're still interested, please go for it.

@adams85 adams85 changed the title Expose variable scopes tracked by the parser to consumers Convenience API for dealing with variable scopes Jun 22, 2024
@adams85 adams85 linked an issue Jun 22, 2024 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Is it possible to resolve members while visiting nodes?
2 participants