Repeated __import__ is much slower than on CPython #4951

Closed
devdanzin opened this issue May 16, 2024 · 8 comments

Comments

@devdanzin
Contributor

On PyPy, repeatedly calling __import__ (that is, for a module that is already in sys.modules) is much slower than on CPython:

pypy win32 3.10.14 (75b3de9d9035, Apr 21 2024, 13:13:38)
[PyPy 7.3.16 with MSC v.1929 64 bit (AMD64)]
Time taken: 38.180 seconds
cpython win32 3.12.3 (tags/v3.12.3:f6650f9, Apr  9 2024, 14:05:25) [MSC v.1938 64 bit (AMD64)]
Time taken: 0.072 seconds

With the following test script:

import sys
from time import time

def test():
    n_loops = 200000
    for _ in range(n_loops):
        __import__('email')

print(f"{sys.implementation.name} {sys.platform} {sys.version}")
start = time()
test()
print(f"Time taken: {time() - start:.3f} seconds")

This seems to be the root cause of #1423.

@cfbolz
Member

cfbolz commented May 17, 2024

There is a fast path for importing, written in RPython: the function import_name_fast_path in pypy/module/imp/importing.py. We use it in the IMPORT_NAME bytecode but not in __import__.
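To illustrate the general idea of such a fast path, here is a minimal Python sketch. It is not PyPy's RPython code: the helper name import_with_fast_path is hypothetical, and it only handles plain top-level module names. It returns the module straight from sys.modules when it is already there, and falls back to the full import machinery otherwise.

```python
import importlib
import sys


def import_with_fast_path(name):
    """Hypothetical helper, for illustration only (not PyPy's implementation)."""
    # Fast path: a plain top-level name that is already cached in sys.modules.
    if "." not in name:
        module = sys.modules.get(name)
        if module is not None:
            return module
    # Slow path: defer to the full import machinery.
    return importlib.import_module(name)


import email

# The cached module is returned without going through importlib.
print(import_with_fast_path("email") is email)
```

The real import machinery handles more cases (relative imports, fromlist, modules that are still initializing), so this only shows the shape of the shortcut.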

@devdanzin
Contributor Author

When testing on Linux (WSL), PyPy is only 5x-10x slower than CPython. On Windows, it's 500x slower, even with antivirus disabled. I'm trying to figure out why.

@cfbolz
Member

cfbolz commented May 17, 2024

@devdanzin But even on Linux, __import__("email") is massively slower than import email, which is what I want to fix.
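A small script along these lines could be used to compare the two forms side by side. It is a sketch, not something posted in this thread: the function names are made up for illustration, and the loop count simply mirrors the test script above.

```python
import timeit


def via_builtin(n_loops):
    # Calls the __import__ builtin directly.
    for _ in range(n_loops):
        __import__("email")


def via_statement(n_loops):
    # The import statement is compiled to the IMPORT_NAME bytecode.
    for _ in range(n_loops):
        import email


def compare(n_loops=200_000):
    """Return the seconds taken by each variant for n_loops iterations."""
    __import__("email")  # warm up so both loops find the module in sys.modules
    return {
        func.__name__: timeit.timeit(lambda: func(n_loops), number=1)
        for func in (via_builtin, via_statement)
    }


for name, seconds in compare().items():
    print(f"{name}: {seconds:.3f} seconds")
```

Running it under both interpreters shows whether the gap lies in __import__ specifically or in importing in general.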

@cfbolz
Member

cfbolz commented May 17, 2024

I have now written a quick prototype on the py3.9-__import__-fast-path branch, but I probably won't get back to it in the next couple of days.

@cfbolz
Member

cfbolz commented May 21, 2024

I've merged the fast path. Let's check tomorrow's nightly to see whether the situation has improved.

@devdanzin
Contributor Author

The Linux nightly at pypy-c-jit-183898-9064d3c9091e-linux64.tar.bz2 is fixed:

pypy linux 3.9.19 (9064d3c9091e, May 22 2024, 01:30:20)
[PyPy 7.3.17-alpha0 with GCC 10.2.1 20210130 (Red Hat 10.2.1-11)]
Time taken: 0.005 seconds

Compared to:

pypy linux 3.9.19 (a2113ea87262, Apr 21 2024, 05:40:24)
[PyPy 7.3.16 with GCC 10.2.1 20210130 (Red Hat 10.2.1-11)]
Time taken: 0.543 seconds

The time taken for repeated __import__ when the module isn't in sys.modules is slightly longer than when it is. A repeated import statement is a bit faster still.

@cfbolz
Member

cfbolz commented May 22, 2024

Wonderful, thanks for checking, @devdanzin! If the module is not in sys.modules, we use the full importlib machinery, so it's to be expected that that case is slower.
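One way to exercise both paths is sketched below. How the uncached case was actually measured in this thread is not stated, so the approach here (removing the module from sys.modules before each call) is an assumption, and the function names are hypothetical. The loop count is kept small because every uncached iteration re-executes the module.

```python
import sys
from time import perf_counter


def time_cached(name="email", n_loops=2_000):
    """Time __import__ when the module is already in sys.modules."""
    __import__(name)  # make sure the module is cached first
    start = perf_counter()
    for _ in range(n_loops):
        __import__(name)
    return perf_counter() - start


def time_uncached(name="email", n_loops=2_000):
    """Time __import__ when the module is missing from sys.modules."""
    start = perf_counter()
    for _ in range(n_loops):
        # Assumption: dropping the cache entry forces the full import machinery.
        # Only do this with modules that have no import-time side effects.
        sys.modules.pop(name, None)
        __import__(name)
    return perf_counter() - start


print(f"cached:   {time_cached():.3f} seconds")
print(f"uncached: {time_uncached():.3f} seconds")
```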

@devdanzin
Contributor Author

As predicted, the performance improvement on Windows is impressive:

pypy win32 3.10.14 (75b3de9d9035, Apr 21 2024, 13:13:38)
[PyPy 7.3.16 with MSC v.1929 64 bit (AMD64)]
Time taken: 38.357 seconds
pypy win32 3.9.19 (b93b48d4429e, May 24 2024, 01:35:22)
[PyPy 7.3.17-alpha0 with MSC v.1929 64 bit (AMD64)]
Time taken: 0.033 seconds

2 participants