Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check that generator expressions are safe #1951

Open
sobolevn opened this issue Mar 17, 2021 · 6 comments
Open

Check that generator expressions are safe #1951

sobolevn opened this issue Mar 17, 2021 · 6 comments
Assignees
Labels
rule request Adding a new rule

Comments

@sobolevn
Copy link
Member

Rule request

Thesis

We can create generator expression that produce a different result from what we expect.

>>> a = 1
>>> b = (a * i for i in range(5))
>>> a = 2
>>> print(list(b))
[0, 2, 4, 6, 8]

Link: https://twitter.com/asmeurer/status/1371952741728194561

Reasoning

Comprehensions like this should be safe. We should not be able to reassign values that comprehensions rely on.

@sobolevn sobolevn added the rule request Adding a new rule label Mar 17, 2021
@GibranHL0
Copy link
Contributor

Hey 👋 , I'm new to the project and I'd like to contribute.
Could you assign me this issue?

@sobolevn
Copy link
Member Author

Thanks, @GibranHL0!

@GibranHL0
Copy link
Contributor

Great. I'll make the proposal as soon as possible.

@GibranHL0
Copy link
Contributor

Hey @sobolevn, I got my approach to the issue. 👋
According to the PEP 289, the code uses a generator expression, which after a single usage, the memory is free. Then, we should allow the assignment of the involved variables after the generator usage, as they no longer affect it.

>>> a = 1
>>> b = (a * i for i in range(5))
>>> print(list(b))
[0, 1, 2, 3, 4]
>>> a = 2

The code works fine!

So, my solution would be:

  1. Identify a generator expression by the elements of an open and closing parenthesis "()", and the word for and in.
  2. Identify if the generator expression is instantiated or if it's assigned to another variable. If the expression is used on the same line, there is no need to check anything else. Otherwise ...
  3. Identify the variables that affect the behavior of the generator expression, which are before the for word.
>>> ( a * i for i in range(5))
>>> var_before_for = ["a", "i"]
  1. Identify the variable that is created in the generator, which is between the for and in.
>>> ( a * i for i in range(5))
>>> var_for_in = ["i"]
  1. Get the list of the variables that affect the behavior according to the list of variables before the for minus the variable between the for and in
>>> var_affect = ["a"]
  1. Check the code after the generator expression, if there is an assignment of one of the variables stored in the previously created list before the generator expression is used, we raise a violation.

I'd use a tokenize visitor and violation to implement it.

What do you think? 🤔

@sobolevn
Copy link
Member Author

Looks great! But, I think that ASTVisitor is easier to work with in this case.

@GibranHL0
Copy link
Contributor

Sure, I was checking, and I think is a better idea to use the ASTVisitor instead of the TokenVisitor.

I'll upload the PR as soon as possible. ✌️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rule request Adding a new rule
Projects
None yet
Development

No branches or pull requests

2 participants