Convenience API for dealing with variable scopes #13

adams85 · 2024-05-28T23:02:31Z

This PR is kind of an experiment to find out if exposing the variable scopes tracked by the parser for checking variable redeclarations makes sense in the first place, and if so, how it can be technically achieved and what the public API of such a feature should look like.

My goal with this is to provide a starting point to help us understand the problem better, explore our options so we can discuss them and shape the implementation and API accordingly.

The main requirement regarding the implementation is that it must not degrade the performance of the primary consumer (Jint execution engine) significantly. This applies to both execution time and memory allocation + the memory footprint of the resulting AST. Big changes to the parser's logic are not allowed either, as it should stay a close translation of acornjs.

Beyond that, the main goal, of course, is to come up with the most convenient API possible for the consumers. However, I don't intend to implement and maintain an "all batteries included" solution but a general one that provides access to the scope information (and maybe some convenience features in the Extras package) for consumers who want to deal with variable scopes. That is, a limited feature set is preferred, which leaves the possibility open for consumers to work with variable scopes in a way that is optimized to their specific use cases (I mean primarily the freedom to choose the data structure to store the scope tree.)

All ideas and feedback are welcome!

src/Acornima.Extras/ScopeInfo.cs

adams85 · 2024-06-01T19:53:07Z

As a next step, I plan to do some dogfooding with my ES module bundler. That, however, uses the core package only and will implement a custom scope tree data structure. In other words, this won't do any dogfooding to ScopeInfo in the Extras package.

So it would be great if you can provide feedback on how satisfied you are with the API of ScopeInfo, what additional features you need or would include, etc.

Divulcan · 2024-06-01T22:34:45Z

So it would be great if you can provide feedback on how satisfied you are with the API of ScopeInfo, what additional features you need or would include, etc.

The main limitation is probably not being able to link any given node to a scope, however, with the current API I can make extension methods that cover everything that I need.
Example of what the end-goal is:

Basically, handling cases where nodes are part of an outer scope and not the closest one (test expressions in control flow ops, EHs, etc).

If you think this might be a good fit for the extras package, I can make a proposal with better defined concepts once I have a clear vision of the extension features I plan on implementing.

adams85 · 2024-06-02T09:54:33Z

The main limitation is probably not being able to link any given node to a scope, however, with the current API I can make extension methods that cover everything that I need.

If you need to link any given node to a scope, you can do parserOptions.RecordScopeInfoInUserData().RecordParentNodeInUserData(), which will store the parent nodes in Node.UserData (or ScopeInfo.UserData when Node.UserData is used for storing a scope info object). Then, you can walk up the node tree until you find a node whose UserData is a ScopeInfo.

I know it's not the most streamlined dev. experience, however it seems we can't really do any better if we want to stick to the performance-oriented design guidelines outlined in the PR description.

If you think this might be a good fit for the extras package, I can make a proposal with better defined concepts once I have a clear vision of the extension features I plan on implementing.

Sure, now that the basics are more or less figured out, we can move on to polishing DX. My only request is to avoid adding features too specific to your use case. I'd like to keep this lib general, that is, only include features that most of the use cases are expected to benefit from.

adams85 · 2024-06-16T20:48:41Z

Public API's taking shape nicely, plus I did some tweaks and could even improve the max. recursion depth. Seems like we're also ok performance-wise:

Acornima.Benchmark.FileParsingBenchmark

Diff	Method	FileName	Mean	Error	Allocated
Old	AcornimaParse	bundle	328.320 ms	5.1226 ms	75.52 MB
New			327.107 ms (0%)	4.6519 ms	75.52 MB (0%)
Old	AcornimaParse	jquery-1.9.1	8.097 ms	0.0784 ms	3.19 MB
New			7.999 ms (-1%)	0.0719 ms	3.19 MB (0%)
Old	AcornimaParse	yui-3.12.0	6.323 ms	0.0123 ms	2.55 MB
New			6.267 ms (-1%)	0.0251 ms	2.55 MB (0%)

adams85 · 2024-06-17T20:19:17Z

Uh-oh... we have a problem.

Let's consider we have a simple function declaration with default parameters:

function f(a = y) { var y = 2 }

For this script, acorn reports the following scope info:

Program (f)
 └─FunctionDeclaration (a, y)
    ├─Identifier
    ├─AssignmentPattern
    │  ├─Identifier
    │  └─Identifier
    └─BlockStatement
       └─VariableDeclaration
          └─VariableDeclarator
             ├─Identifier
             └─Literal

The problem here is that variable y should not be visible from the parameter list of the function. If you call function f, you get "ReferenceError: y is not defined".

This happens because acorn doesn't create separate scopes for the function parameter list and the function body but merges the declarations in the two into one scope. This makes the feature in its current form pretty useless for consumers (e.g. for my ES module bundler). It would only be useful if it produced this:

Program (f)
 └─FunctionDeclaration (a)
    ├─Identifier
    ├─AssignmentPattern
    │  ├─Identifier
    │  └─Identifier
    └─BlockStatement (y) - or (a, y) would probably be acceptable here too
       └─VariableDeclaration
          └─VariableDeclarator
             ├─Identifier
             └─Literal

I'll take a look to find out whether we could tweak the parser logic to output the correct scope info without major changes and a significant performance penalty. But there is a chance that this whole effort goes down the drain in the end...

adams85 · 2024-06-20T00:08:46Z

Good news, looks like the PR is saved as I could figure out an unobtrusive way to tackle the issues described in my previous comment.

OnNode still reports a single scope for function and catch clause nodes, but the parser now stores two numbers (VarParamCount, LexicalParamCount) into Scope. These numbers tell how many variables are parameters in the corresponding spans actually.

Thanks to this extra info, it became possible to build a scope tree with separate scopes for function/catch clause parameter lists and function/catch clause bodies. For function f(a = y) { var y = 2 }, the following tree is produced now:

Program* [f]
 └─FunctionDeclaration [a, f]
    ├─Identifier
    ├─AssignmentPattern
    │  ├─Identifier
    │  └─Identifier
    └─BlockStatement* [y]
       └─VariableDeclaration
          └─VariableDeclarator
             ├─Identifier
             └─Literal

I also made some further improvements:

The name of a function declaration or function expressions is available in the function's scope. E.g. an expression like (function f(g = f) { return g })() returns the function itself. So I altered the ScopeInfo tree builder logic to include the function name in the function's scope:

Program* []
 └─ExpressionStatement
    └─CallExpression
       └─FunctionExpression [f, g]
          ├─Identifier
          ├─AssignmentPattern
          │  ├─Identifier
          │  └─Identifier
          └─BlockStatement* []
             └─ReturnStatement
                └─Identifier

Similarly, the name of a class declaration or named class expressions is available in the class's scope. (Which means that my statement "Classes don't create variable scopes as you can't directly declare variables in the body of a class, just properties and methods." wasn't entirely true. The class name is a special variable in the class's scope.) E.g. an expression like (new class C extends function() { this.x = C.X } { static X = 1 }).x returns 1.

Program* []
 └─ExpressionStatement
    └─MemberExpression
       ├─NewExpression
       │  └─ClassExpression [C]
       │     ├─Identifier
       │     ├─FunctionExpression []
       │     │  └─BlockStatement* []
       │     │     └─ExpressionStatement
       │     │        └─AssignmentExpression
       │     │           ├─MemberExpression
       │     │           │  ├─ThisExpression
       │     │           │  └─Identifier
       │     │           └─MemberExpression
       │     │              ├─Identifier
       │     │              └─Identifier
       │     └─ClassBody
       │        └─PropertyDefinition
       │           ├─Identifier
       │           └─Literal
       └─Identifier

This was problematic though because the parser didn't create scopes for classes to do variable redeclaration detection as they're not needed for that purpose. So, I decided to bite the bullet and changed the parser to create scopes for classes too. (Hopefully won't affect perf. much but I'll need to do some benchmarks w.r.t. this.)

After making these changes, I also updated my bundler project to use the scope info from the parser to build its scope tree. I could make it work, what's more, this simplified the related logic pretty much. 🎉

So the feature seems to turn out useful in the end. So I plan to include the core functionality (the changes to the parser and OnNode) in the master branch, then rebase this PR so it will be about the extras part (ScopeInfo) later on.

…dealing with variable scopes

adams85 · 2024-06-22T21:48:25Z

The core functionality (exposing the raw scope info via ParserOptions.OnNode) has been merged into master and shipped with v1.1.0.

So from now on, this PR is about the convenience features living in the Extras package (RecordScopeInfoInUserData, ScopeInfo, etc.), which should provide some handy APIs for consumers to deal with variable scopes. Stuff like what was proposed here.

The starting point is already there, including the logic that does the heavy-lifting. I'm aware of one thing that's still missing from it: in non-strict mode function declarations are also hoisted. Probably I'll implement this (will be tricky though), but apart from this, I don't have plans with this PR for now. I'd leave it up to the community to shape the implementation to their needs. So, @Divulcan, if you're still interested, please go for it.

adams85 marked this pull request as draft May 28, 2024 23:02

adams85 mentioned this pull request May 28, 2024

Is it possible to resolve members while visiting nodes? #12

Open

adams85 force-pushed the experiment/expose-scopes branch 2 times, most recently from 7cf389a to c50dfbe Compare May 29, 2024 11:22

Divulcan reviewed May 30, 2024

View reviewed changes

src/Acornima.Extras/ScopeInfo.cs Outdated Show resolved Hide resolved

Divulcan reviewed May 31, 2024

View reviewed changes

src/Acornima.Extras/ScopeInfo.cs Outdated Show resolved Hide resolved

Divulcan reviewed May 31, 2024

View reviewed changes

src/Acornima.Extras/ScopeInfo.cs Outdated Show resolved Hide resolved

adams85 force-pushed the experiment/expose-scopes branch from c50dfbe to 84e3a67 Compare June 1, 2024 18:11

adams85 changed the base branch from master to fix-pr June 1, 2024 18:29

adams85 changed the base branch from fix-pr to master June 1, 2024 18:29

adams85 force-pushed the experiment/expose-scopes branch 2 times, most recently from 4e7921c to 5e1eacb Compare June 1, 2024 18:44

adams85 force-pushed the experiment/expose-scopes branch from 5e1eacb to 5b842c3 Compare June 2, 2024 09:26

adams85 force-pushed the experiment/expose-scopes branch from e0d83c9 to b8ca906 Compare June 15, 2024 21:21

adams85 force-pushed the experiment/expose-scopes branch from 671bc9c to 0be0672 Compare June 19, 2024 23:05

adams85 force-pushed the experiment/expose-scopes branch from 0be0672 to e1004c6 Compare June 22, 2024 15:55

adams85 force-pushed the master branch from 9be8ed4 to 3eee49e Compare June 22, 2024 18:07

Introduce convenience API (RecordScopeInfoInUserData, ScopeInfo) for …

6910e38

…dealing with variable scopes

adams85 force-pushed the experiment/expose-scopes branch from e1004c6 to 6910e38 Compare June 22, 2024 18:11

adams85 changed the title ~~Expose variable scopes tracked by the parser to consumers~~ Convenience API for dealing with variable scopes Jun 22, 2024

adams85 linked an issue Jun 22, 2024 that may be closed by this pull request

Is it possible to resolve members while visiting nodes? #12

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Convenience API for dealing with variable scopes #13

Convenience API for dealing with variable scopes #13

adams85 commented May 28, 2024 •

edited

Loading

adams85 commented Jun 1, 2024

Divulcan commented Jun 1, 2024

adams85 commented Jun 2, 2024

adams85 commented Jun 16, 2024

adams85 commented Jun 17, 2024 •

edited

Loading

adams85 commented Jun 20, 2024

adams85 commented Jun 22, 2024 •

edited

Loading

Convenience API for dealing with variable scopes #13

Are you sure you want to change the base?

Convenience API for dealing with variable scopes #13

Conversation

adams85 commented May 28, 2024 • edited Loading

adams85 commented Jun 1, 2024

Divulcan commented Jun 1, 2024

adams85 commented Jun 2, 2024

adams85 commented Jun 16, 2024

Acornima.Benchmark.FileParsingBenchmark

adams85 commented Jun 17, 2024 • edited Loading

adams85 commented Jun 20, 2024

adams85 commented Jun 22, 2024 • edited Loading

adams85 commented May 28, 2024 •

edited

Loading

adams85 commented Jun 17, 2024 •

edited

Loading

adams85 commented Jun 22, 2024 •

edited

Loading