Feat: Support detecting dependencies of projects being scanned #49

Open
kkirsche opened this issue Oct 12, 2022 · 3 comments

Comments

@kkirsche

Good morning,

This issue is to add support for detecting dependencies of the project(s) being scanned by MyPy.

Use Case

The use case of this feature is to understand the impact of a scan better when evaluating the results in typeshed pull requests.

Behavior

The recommended behavior is for mypy_primer to accept an optional argument, either positional or flag-based, that takes one or more package names. Each name identifies a package being evaluated, such as types-requests. Because typeshed packages are published under the pattern types-{package}, the name can be used to determine which package a change modified.

With this change implemented and a package provided, while mypy_primer is scanning individual packages, it will evaluate whether or not the package being scanned uses that dependency, providing the end user with a percentage of projects scanned that use this dependency. If mypy_primer supports a verbose run mode, this will instead provide a list of scanned packages with each package's individual status.
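
As a rough illustration of the summary and verbose outputs described above, here is a minimal sketch. The helper names `runtime_name` and `summarize_usage` are hypothetical and are not part of mypy_primer, and the output wording is only an assumption. Note that the distribution name recovered from types-{package} does not always match the import name (types-PyYAML is imported as `yaml`, for example).

```python
from typing import Mapping


def runtime_name(stub_package: str) -> str:
    """Hypothetical helper: strip the typeshed 'types-' prefix, if present."""
    return stub_package.removeprefix("types-")


def summarize_usage(
    uses_dependency: Mapping[str, bool], package: str, verbose: bool = False
) -> str:
    """Hypothetical helper: report how many scanned projects use `package`.

    `uses_dependency` maps a project name to whether it uses the package.
    """
    if verbose:
        # Verbose mode: one line per scanned project with its individual status.
        return "\n".join(
            f"{name}: {'uses' if used else 'does not use'} {package}"
            for name, used in sorted(uses_dependency.items())
        )
    total = len(uses_dependency)
    if total == 0:
        return f"No projects scanned for {package}"
    count = sum(uses_dependency.values())
    return f"{count}/{total} projects ({count / total:.0%}) use {package}"
```

Calling `summarize_usage(results, "types-requests")` gives the one-line percentage; passing `verbose=True` gives the per-project listing instead.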

Enhancements

This behavior can be enhanced, at the cost of additional complexity, by evaluating the package using a coverage-focused approach, determining if the changed APIs in a pull request are used within the package rather than simply looking for the dependencies.

Approaches

There seem to be a few different approaches we could take for this, depending on the longer-term intent of a feature like this. I've listed the four that immediately come to mind.

  1. modulefinder (not recommended)
    • https://docs.python.org/3/library/modulefinder.html
    • modulefinder analyzes an individual script and locates the modules it imports, without running it. This could be used to scan a project's files one at a time and evaluate which dependencies each one uses. Internally, modulefinder resolves imports through its own import_hook method.
  2. metadata via pip
    • Re-implement a minimal version of pip's search_packages_info to retrieve the requires field of the project's metadata.
  3. metadata via filesystem
    • Depending on how mypy_primer is working with the projects, it may instead make sense to read metadata from the project's configuration files (such as pyproject.toml, setup.py, setup.cfg, etc.)
  4. AST Evaluation
    • If a coverage-based solution is desired, the approach begins to become more complicated and may make sense to be a separate tool or a plugin for a separate tool, such as flake8. A low-level approach would be to scan the source code of a project and evaluate its AST to determine which packages are being imported and how the package is being used.

There may well be more approaches; I'd be interested in any feedback you have about which approach you feel makes the most sense.

Who Will Do This?

I'm happy to attempt to provide this, though there will be some delays as I am currently assisting my family with something offline. This is why I haven't been able to be as involved in typeshed as I would like following my discussion with @AlexWaygood.

Thank you for your time.

@JelleZijlstra
Collaborator

Just to clarify, the desired end result is that when I make a typeshed PR that touches the requests stubs, the mypy-primer output will say something like "15 packages checked by mypy-primer use requests". Is that right? If so, that seems like a useful enhancement.

The standard API for retrieving the dependencies of a package is importlib.metadata: https://docs.python.org/3.10/library/importlib.metadata.html#distribution-requirements
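
A minimal sketch of that API, for reference. `importlib.metadata.requires` returns the requirement strings of an installed distribution, so this only helps when the project in question is actually installed in the current environment. The wrapper name `installed_requirements` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, requires


def installed_requirements(distribution: str) -> list[str]:
    """Hypothetical wrapper: requirement strings of an installed distribution.

    Returns an empty list if the distribution is not installed or declares
    no requirements (requires() returns None in the latter case).
    """
    try:
        return list(requires(distribution) or [])
    except PackageNotFoundError:
        return []
```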

@kkirsche
Author

> Just to clarify, the desired end result is that when I make a typeshed PR that touches the requests stubs, the mypy-primer output will say something like "15 packages checked by mypy-primer use requests". Is that right? If so, that seems like a useful enhancement.

Correct. The intent is to ensure that reviewers have the additional context about whether the output of mypy_primer is applicable, a small signal, or a strong test case for a change.

> The standard API for retrieving the dependencies of a package is importlib.metadata: https://docs.python.org/3.10/library/importlib.metadata.html#distribution-requirements

Thanks for the correction / additional detail(s). I've had some PRs rejected for other tools for using importlib so I must have just skipped over it due to compatibility or other historic reasons that don't apply here.

@hauntsaninja
Owner

Might be a little fiddly... mypy_primer avoids installing most of the projects it checks. This keeps things relatively fast and mostly avoids all the various build-related issues that would otherwise arise. This means options 1 and 2 wouldn't work. I'd probably go with 4. You may also want to use project.source_paths to get exactly the set of paths that a mypy invocation would look at.
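
A minimal sketch of option 4, working from source files only so nothing needs to be installed. The function names are hypothetical, and `paths` merely stands in for whatever project.source_paths yields; its exact type is an assumption here. The sketch only detects absolute imports by top-level module name and does not attempt the coverage-style analysis of which APIs are used.

```python
import ast
from pathlib import Path
from typing import Iterable


def imported_top_level_modules(source: str) -> set[str]:
    """Return the top-level names of all absolute imports in `source`."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as `from . import x`.
            modules.add(node.module.split(".")[0])
    return modules


def project_uses(module: str, paths: Iterable[Path]) -> bool:
    """Hypothetical helper: does any .py file under `paths` import `module`?"""
    for path in paths:
        files = [path] if path.is_file() else sorted(path.rglob("*.py"))
        for file in files:
            try:
                source = file.read_text(encoding="utf-8")
                if module in imported_top_level_modules(source):
                    return True
            except (SyntaxError, ValueError, OSError):
                # Skip files that cannot be read, decoded, or parsed.
                continue
    return False
```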

I'd also recommend not doing this as part of the core mypy_primer logic. Instead we could make this its own command, like mypy_primer --coverage, mypy_primer --measure-project-runtimes, etc. This could simply spit out: X projects use types-xyz or whatever. The advantage of a separate command is it would interact better with the sharding we do in CI and the use of mypy_primer for mypy CI (as opposed to typeshed).
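
To make the separate-command idea concrete, here is a small standalone argparse sketch. The flag name `--dependency-usage` is purely hypothetical, and this is not how mypy_primer actually wires up its options; it only illustrates a mode that accepts one or more package names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Standalone sketch; the flag below is hypothetical, not a real option."""
    parser = argparse.ArgumentParser(prog="mypy_primer_sketch")
    parser.add_argument(
        "--dependency-usage",
        nargs="+",
        metavar="PACKAGE",
        default=None,
        help="report how many projects use each given package, then exit",
    )
    return parser
```

With this, `build_parser().parse_args(["--dependency-usage", "types-requests"])` yields a namespace whose `dependency_usage` attribute is a list of the given names, and the attribute is `None` when the flag is absent.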

Out of curiosity, are there PRs where you feel like having this would have resulted in some different outcome? Also one more thing in this space... It could be useful to run something mypy_primer-like on the tests for the untyped project. This is probably best done by using mypy_primer like a library.
