
issue with list comprehension in python 3.12 #93

Open
mutricyl opened this issue Nov 23, 2023 · 5 comments

@mutricyl

Hello,

I am running pyan3 (1.2.0) through Sphinx 7.2.6.

Nice graphs are generated when using Python 3.10, but the script fails when using Python 3.12.

Exception occurred:
  File "C:\Users\saf928524\.conda\envs\py312\Lib\site-packages\pyan\anutils.py", line 255, in __enter__
    raise ValueError("Unknown scope '%s'" % (inner_ns))
ValueError: Unknown scope 'Model_buildup.Chain_Unchain.listcomp'

The list comprehension is indeed located in the Model_buildup.Chain_Unchain module:

        list_sim = [(os.path.join(root, name), root,name) for root, dirs, files in os.walk(main_dir)
                    for name in files
                    if name.endswith(".dat") or name.endswith(".yml") or name.endswith(".sim")]

Any clue why the same pyan version with the same Sphinx version handles list comprehensions differently between Python 3.10 and 3.12?

@Technologicat
Owner

What I can say off the top of my head is that pyan has not been updated since Python 3.6, so if it has worked at all on 3.7+, that's pure luck.

One thing is that the AST representation tends to change in almost every Python minor version, which is often enough to make it frustrating to keep your side projects updated. For a concrete example, pyan doesn't know about the := operator that was added in 3.8, so it will just ignore those.

The other thing is, scope analysis in Python has always been a headache (see e.g. the "Full Monty" paper by Politz et al., 2013; or if you prefer code, unpythonic's scope analyzer, which is much newer than what we have in pyan). Specifically, the behavior you observed could happen if some internal implementation detail of the list comprehension mechanism has changed in Python 3.12.
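
As a small illustration of the kind of corner case that makes this hard (plain Python 3.8+ behavior, nothing pyan-specific): an assignment expression inside a comprehension binds its target in the enclosing scope, while the iteration variable stays isolated.

    doubled = [y := x * 2 for x in range(3)]
    print(doubled)   # [0, 2, 4]
    print(y)         # 4 -- the walrus target (PEP 572) leaks into the enclosing scope
    # print(x)       # would raise NameError: the iteration variable does not leak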

Other things may also have broken. Yet others were either never properly implemented in pyan, or rely on brittle hacks, e.g. the analysis of tuple values. Python is a rather big language.

It pains me to say, but I haven't needed pyan myself in many years. Therefore, in the foreseeable future, I unfortunately don't have the resources to keep pyan alive, nor to fix its many design issues.

Although I write my own research codes in Python, in practice that means that 90% of the time, I'm writing new code to solve some new problem. As we all know, this is wildly different from most kinds of software engineering.

The other 10% of the time, I'm looking at either my own old code, or at something modular enough where a local manual analysis is enough. Numerics tends to have very simple linear control flow, with highly nontrivial algorithms, which again is the polar opposite of most other kinds of software engineering.

Of my metaprogramming projects, I'm semi-actively using some parts of unpythonic, and I have a minor interest in keeping the mcpyrate macro expander alive (as it goes much further than earlier designs, and no one else seems to have run off with the mantle). But that's basically all I can afford. Even for those, the most recent opportunity I had to work on them was almost two years ago. So even those projects are stuck on Python 3.10 at the moment.

Since there is a continuing community interest in pyan, I'd like to get other developers on board. I'd gladly hand the project over. I tried doing so once, but it seems the developers I was able to attract to the project at that time are as busy with other projects as I am. Maybe the way forward would be to make an official announcement about this and put it in README.md. pyan would be much better off if it was developed by the community.

P.S. To any interested developer reading this:

My current opinion is that the design of pyan is confused at best. At the time when I took Edmund Horner's original Python 2 experiment and wrote this Python 3 version, I wasn't experienced enough in metaprogramming to have known what kinds of things a static analysis tool should attempt and what it should not. As a result, there is a crude attempt at some dynamic analysis mixed in.

If I were writing the tool now, I'd only do what can be done purely statically (simple is better than complex). This would remove some features, but it would also make it much easier to communicate which kinds of things the tool understands. Basically, if you have to run the code in your head, that's dynamic analysis, and a static analyzer shouldn't even attempt to do that. A static analyzer should only look at the definitions, and at their placement in the source code. Contrast also the concepts of scope and dynamic extent.

Also specifically, the last seen value mechanism is a Hail Mary that goes against the very philosophy of Python, and should be removed at the first possible opportunity. Analyze the structure explicitly, or skip it. Refuse the temptation to guess.

@mutricyl
Author

Thanks for your long response @Technologicat. Unfortunately I do not qualify as a metaprogrammer and I do not have enough free time to take over.

I am stealing some time from my boss to make some progress on this issue:

symtable.symtable(code, filename, compile_type="exec").get_children()

Python 3.12:

    0: <Function SymbolTable for Chain in C:\Users\safXXXXX\Documents\scripts_pythons\Model_buildup\Chain_Unchain.py>
    1: <Function SymbolTable for UnChain in C:\Users\safXXXXX\Documents\scripts_pythons\Model_buildup\Chain_Unchain.py>
    len(): 2

Python 3.10:

    0: <Function SymbolTable for Chain in C:\Users\safXXXXX\Documents\scripts_pythons\Model_buildup\Chain_Unchain.py>
    1: <Function SymbolTable for UnChain in C:\Users\safXXXXX\Documents\scripts_pythons\Model_buildup\Chain_Unchain.py>
    2: <Function SymbolTable for listcomp in C:\Users\safXXXXX\Documents\scripts_pythons\Model_buildup\Chain_Unchain.py>
    len(): 3

The issue is coming from a change in the behavior of symtable, which is called here:

process(self.module_name, symtable.symtable(code, filename, compile_type="exec"))
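
A minimal, self-contained way to see the same difference (the code string below is made up for the demo, not taken from Chain_Unchain.py):

    import sys
    import symtable

    code = 'hits = [name for name in files if name.endswith(".dat")]\n'

    top = symtable.symtable(code, "<demo>", compile_type="exec")
    print(sys.version_info[:2], [t.get_name() for t in top.get_children()])
    # Python 3.10 prints: (3, 10) ['listcomp']  -- the comprehension gets its own child table
    # Python 3.12 prints: (3, 12) []            -- PEP 709: the comprehension is inlined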

@Technologicat
Owner

Thanks for the analysis!

That does sound like some internals of list comprehensions have changed in 3.12.

I suppose it's time to check the release notes. Indeed:

---8<---8<---8<---

PEP 709: Comprehension inlining

Dictionary, list, and set comprehensions are now inlined, rather than creating a new single-use function object for each execution of the comprehension. This speeds up execution of a comprehension by up to two times. See PEP 709 for further details.

Comprehension iteration variables remain isolated and don’t overwrite a variable of the same name in the outer scope, nor are they visible after the comprehension. Inlining does result in a few visible behavior changes:

There is no longer a separate frame for the comprehension in tracebacks, and tracing/profiling no longer shows the comprehension as a function call.

The symtable module will no longer produce child symbol tables for each comprehension; instead, the comprehension’s locals will be included in the parent function’s symbol table.

Calling [locals()](https://docs.python.org/3/library/functions.html#locals) inside a comprehension now includes variables from outside the comprehension, and no longer includes the synthetic .0 variable for the comprehension “argument”.

A comprehension iterating directly over `locals()` (e.g. `[k for k in locals()]`) may see `“RuntimeError: dictionary changed size during iteration”` when run under tracing (e.g. code coverage measurement). This is the same behavior already seen in e.g. `for k in locals():`. To avoid the error, first create a list of keys to iterate over: `keys = list(locals()); [k for k in keys]`.

(Contributed by Carl Meyer and Vladimir Matveev in PEP 709.)

---8<---8<---8<---

My personal opinion is, that's a hack that breaks the language's scoping contract. The performance increase comes at the cost of not having an explicit scope reported for variables that anyway behave as if they had such a scope. This just made the Python scoping rules an even bigger headache than they used to be.
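
To make that concrete: the iteration variable still behaves as if it had its own scope, even though symtable no longer reports one.

    x = "outer"
    squares = [x * x for x in range(3)]
    print(squares)  # [0, 1, 4]
    print(x)        # 'outer' -- the comprehension's x stays isolated, before and after PEP 709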

If you want to have a try at fixing the analyzer, some ideas:

  • The other important part is in anutils.ExecuteInInnerScope.
    • We need correct scoping to be able to resolve references correctly. Now symtable no longer produces scopes that describe the scoping of comprehension variables.
    • The comprehension scopes made the call graph look cluttered, so pyan has always removed them. However, they are important for analysis, e.g. not to confuse a variable x at the top level of a function with a comprehension variable x inside the same function.
  • We could detect from sys.version_info whether we're running on 3.12 or later (a version-gated sketch follows this list).
  • In anutils.ExecuteInInnerScope, why did I require inner_ns to be present in analyzer.scopes? The code is either just explicitly checking that the pre-3.12 scoping contract is fulfilled (to enforce fail-fast), or there's some technical reason some other part of the analyzer actually needs the scopes to be there. If it's just a contract check - well, that contract changed in 3.12, so we could simply remove the check.
  • Another avenue worth investigating, would it be possible to manually add comprehension scopes to analyzer.scopes (somewhere near the line you suggested), to emulate pre-3.12 behavior of symtable? The tricky part is knowing which variables to put in that scope. Maybe there is something useful in the symtable API, maybe not.
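
A rough sketch of the version-gate idea (the names and structure below are paraphrased for illustration, not pyan's actual ExecuteInInnerScope code):

    import sys

    # PEP 709 landed in 3.12: list/dict/set comprehensions no longer get their
    # own symtable scope, so a missing "...listcomp" scope is expected there.
    COMPREHENSIONS_INLINED = sys.version_info >= (3, 12)

    def lookup_inner_scope(analyzer, inner_ns):
        """Paraphrased contract check, not pyan's actual code."""
        if inner_ns not in analyzer.scopes:
            if COMPREHENSIONS_INLINED:
                # Expected on 3.12+; fall back to analyzing in the parent scope.
                return None
            raise ValueError("Unknown scope '%s'" % (inner_ns,))
        return analyzer.scopes[inner_ns]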

@Technologicat
Owner

Ah, one more thing: if you want to try the latter option, I can help with capturing the comprehension variable names from the relevant AST node.

That's something that requires knowledge of the Python AST format, but needs only a few lines of code. It shouldn't take much longer to implement that particular thing than it takes to write these comments here. :)

(Worst case is, we'll need an AST walker that runs over the whole source file, keeping track of scopes. But we already have that in unpythonic - so I could just quickly port the existing code, essentially removing the need to use symtable for this.)
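
For illustration, a helper along those lines might look like this (a sketch using only the standard ast module, not code from pyan or unpythonic):

    import ast

    def comprehension_variable_names(node):
        """Collect the names bound by the `for` targets of a comprehension node
        (ast.ListComp, ast.SetComp, ast.DictComp, or ast.GeneratorExp)."""
        names = []
        for clause in node.generators:              # each `for ... in ...` clause
            for target in ast.walk(clause.target):  # target may be a Name or a Tuple of Names
                if isinstance(target, ast.Name):
                    names.append(target.id)
        return names

    tree = ast.parse("[x + y for x in range(3) for y in range(3) if x != y]", mode="eval")
    print(comprehension_variable_names(tree.body))  # ['x', 'y']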

@fruch

fruch commented Jan 10, 2024

Hi @Technologicat,

I've just run into this issue, and in my case it was solved by just not doing the comprehension logic at all, since it's not in scope for what I need.

In my use case I'm subclassing the Analyzer class, so this did the trick for me:

    def analyze_comprehension(self, *args, **kwargs):
        pass

I'm using it in
https://github.com/fruch/pytest-diff-selector
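
For context, the override presumably hooks in roughly like this (assuming the class being subclassed is pyan.analyzer.CallGraphVisitor; the actual wiring in pytest-diff-selector may differ):

    from pyan.analyzer import CallGraphVisitor

    class NoComprehensionVisitor(CallGraphVisitor):
        def analyze_comprehension(self, *args, **kwargs):
            # Skip comprehension scope handling entirely; on 3.12+ (PEP 709)
            # symtable no longer reports these scopes anyway.
            pass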

fruch added a commit to fruch/pytest-diff-selector that referenced this issue Jan 10, 2024
PEP-0709 has removed comprehension scopes
and we should ignore them as well

Ref: https://peps.python.org/pep-0709/
Ref: Technologicat/pyan#93