Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 1951 Check that generator expressions are safe #2072

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

GibranHL0
Copy link
Contributor

I have made things!

Checklist

  • I have double checked that there are no unrelated changes in this pull request (old patches, accidental config files, etc)
  • I have created at least one test case for the changes I have made
  • I have updated the documentation for the changes I have made
  • I have added my changes to the CHANGELOG.md

Related issues

🙏 Please, if you or your company is finding wemake-python-styleguide valuable, help us sustain the project by sponsoring it transparently on https://opencollective.com/wemake-python-styleguide. As a thank you, your profile/company logo will be added to our main README which receives hundreds of unique visitors per day.

Copy link
Member

@sobolevn sobolevn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Thanks! 💪

It needs a bit of polishing and I would be happy to merge it.

CHANGELOG.md Outdated
@@ -109,6 +109,7 @@ Semantic versioning in our case means:
- Forbids inconsistent structuring of multiline comprehensions
- Forbids to use unpythonic getters and setters such as `get_attribute` or `set_attribute`
- Now `credits`, `license`, and `copyright` builtins are free to shadow
- Forbids the reassignment of variables that affect the generator expression
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be released under 0.16.0, not 0.15.0

# Correct
a = 1
b = (a * i for i in range(5))
c = (sum(b))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
c = (sum(b))
c = sum(b)

generator expression or implement it before reassigning.

Example:
# Correct
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please use more meaningful names here? We try to write our correct examples as close to the real-world code as possible.

unsafe_function = """
first_value = 1
expression = (first_value * index for index in range(5))
first_value = 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about changing index? Let's test that as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that the index cannot be changed as the variable is load within the generator expression. Even though, I'll include a test

a = 2
c = (sum(c))

.. versionadded:: 0.15.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.16.0

assignments = [
sub_node
for sub_node in ast.walk(node)
if isinstance(sub_node, ast.Assign)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You only check for Assign, what about AugAssign like += 1 or assigning expression like (x := 1)? We need to support them as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, it needs to have AugAssign support, but it'll add too much complexity with the current Visitor.

In the case of the assigning expression, there is a Violation that prevents the usage of (x := 1) , WPS332 Found Walrus operator

])

if self.affecting_variables:
self._check_gen_usage(node, assignments)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are missing self.generic_visit(node) call at the bottom. It is required.


def _get_inside_variables(self, node: ast.GeneratorExp) -> None:
sub_node = node.elt
self.inside_variables = [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


return gen_positions

def _get_id_lineno(self, node: ast.Name):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing return type

def _get_id_lineno(self, node: ast.Name):
return (node.id, node.lineno)

def _check_gen_usage(self, node: ast.AST, assignments: List[ast.Assign]):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing return type

@GibranHL0
Copy link
Contributor Author

Sure, I'll be working on them today, and I hope later today or tomorrow early commit the requested changes. 🤓

a = 1
b = (a * i for i in range(5))
c = (sum(b))
suffix = 'ish'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great one! 👍

@GibranHL0
Copy link
Contributor Author

I uploaded a new version of the UnsafeGeneratorExpressionVisitor that fixes the issues with complexity and style. But I'd like to add a new visitor that adds support for the AugAssign. I'll upload it in less than a day! 👍

@sobolevn
Copy link
Member

But I'd like to add a new visitor that adds support for the AugAssign

This sounds like not the best idea.
The better way: move logic to logic/ and use a single visitor with all the types you need to call it.

@GibranHL0
Copy link
Contributor Author

I figured out a new way to support AugAssign within the visitor and not creating a logic method outside. I also included new tests.(/7-7)/

Copy link
Member

@sobolevn sobolevn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great progress! 👍

values = [1, 2, 3, 4, 5]
exponent = 5
power_expr = (value * exponent for value in values)
exponent += 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about changing values here? Like values.append(1)
Should we allow it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should not because it affects the expected behavior of the expression. Can I create a new rule request to prevent it? 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that it is "expected" because some people might be getting an error here. Let me give you an example:

>>> values = [1, 2, 3]
>>> listcomp = [a * 2 for a in values]
>>> values.append(0)
>>> values, listcomp
([1, 2, 3, 0], [2, 4, 6])

Here's how it works with lists: no unexpected problems.
Now, the same with gen:

>>> values = [1, 2, 3]
>>> gen = (a * 2 for a in values)
>>> values.append(0)
>>> values, list(gen)
([1, 2, 3, 0], [2, 4, 6, 0])

Now, we can see that changing values changes how gen works. It is not consistent with list comprehensions. If a persons just refactored list comprehension to be a generator expression with a code like this - there would be an error.

And we can catch it!

I recommend to try to visit ast.Name with usage / reassining context. It might be easier that our current appoach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that we should prevent it!

But, with the current visitor, it would be difficult to implement without making it complex because the ast.Name of values have ctx ast.Load in comparisons, functions or creating other list comprehensions and gens that not affect the gen.

>>> values = [1, 2, 3, 4, 5]
>>> gen = (a * 2 for a in values)
>>> if 2 in values:
>>>    addition = sum(values)
>>> values, list(gen)
([1, 2, 3], [2, 4, 6]

Then we should check that the ast.Attribute append of gen is not being used within assignment and usage. That would need at least 2 new methods visit_Attribute and _check_append but the current visitor is at maximum allowed method capacity.

Also, I believe that a new violation would make the error easier to understand as the current violation only explains the variables that affect a generator expression. The UnsafeAppendUsageViolation would point out the forbidden usage of append within assignment and call for lists that affect a generator expression.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the current implementation work for:

self.x = 1
gen = (self.x * 2 for a in values)
self.x = 2
...

?

That would need at least 2 new methods visit_Attribute and _check_append but the current visitor is at maximum allowed method capacity.

Don't worry about it for now, let's concentrate on making this work for all general and edge cases first, then we can refactor the working solution into a simplier version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure then, I'll be working on append support. 📝

On the other hand, the current implementation doesn't work with self variables. So, I'll update it as well with new tests 🤠

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants