Explore no-GIL support (free threading) #1038
Comments
Let me know if there's anything I can help with. FYI, it's not yet feasible to experiment with the 3.13
Hi. At the moment I can even build a free-threading binary wheel with pip-24.1b1 (msvc_runtime-14.38.33135-cp313-cp313t-win_amd64.whl), but mypyc then fails when I try to accelerate a pure Python file, with the basic error
PEP 703 (Making the Global Interpreter Lock Optional in CPython) was accepted in Oct 2023. Mypyc will require various changes to support CPython builds that don't use the GIL. We should experiment with the no-GIL build as soon as feasible once it's functional in the CPython main branch, since this is one of the biggest changes to the CPython runtime ever. Many existing and potential mypyc users would also likely want to use it.
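For anyone experimenting, it helps to confirm which kind of interpreter is actually running. The following is a minimal sketch, assuming CPython 3.13 or later for the runtime check; the helper names `is_free_threaded_build` and `is_gil_active` are made up for illustration and are not part of mypyc.

```python
import sys
import sysconfig


def is_free_threaded_build() -> bool:
    """Return True if this interpreter was built as a free-threaded build."""
    # Py_GIL_DISABLED is 1 in free-threaded builds and 0 or None otherwise.
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def is_gil_active() -> bool:
    """Return True if the GIL is currently enabled at runtime."""
    # sys._is_gil_enabled() only exists on Python 3.13+; older versions
    # always run with the GIL, so treat a missing function as "enabled".
    check = getattr(sys, "_is_gil_enabled", None)
    return True if check is None else bool(check())


if __name__ == "__main__":
    print(f"free-threaded build: {is_free_threaded_build()}")
    print(f"GIL active: {is_gil_active()}")
```

A free-threaded build can still re-enable the GIL at runtime (for example when an extension module does not declare support), which is why the two checks are kept separate.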
Implementation
I haven't tried to think through all the implications carefully (experimentation will also be required), but here are some things that already seem likely:
There are probably other changes.
Reference counting will become more expensive, and built-in containers, including `list`, `dict` and `set`, would also now use fine-grained locks to protect most operations. These changes will happen behind the scenes if we use the C API. Direct access to struct fields of (at least non-immutable) built-in objects is perhaps unsafe, unless we are careful to always take the necessary locks.
Performance impact
Obviously, not having the GIL should enable better performance in many multi-threaded workloads. This is the main benefit.
Sequential performance is expected to be slower due to the extra synchronization and other changes. The PEP suggests around 7%-8% overhead when running mostly interpreted workloads. For compiled workloads the impact may be bigger, since compilation may not reduce the number of slower operations as much as it reduces other overhead.
Here's a contrived example which highlights the above issue. Let's assume that all the overhead would be from reference counting, and compilation would speed up overall performance by 5x with the GIL. Also, let's assume that compiled code needs the exact same reference count manipulations as interpreted code. Now a 7% overhead for interpreted code could result in a 50% overhead in compiled code, since reference counting accounts for a much larger fraction of time spent in compiled code.
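The arithmetic behind this contrived example can be sketched as follows. This is only a back-of-the-envelope model under the stated assumptions (all extra cost comes from reference counting, and compiled code performs the same reference count operations); the function name `compiled_overhead` is made up for illustration.

```python
def compiled_overhead(interpreted_overhead: float, speedup: float) -> float:
    """Relative slowdown of compiled code under the contrived assumptions.

    interpreted_overhead: fractional slowdown of interpreted code without
        the GIL (0.07 means 7%).
    speedup: how much faster compiled code is than interpreted code with
        the GIL (5.0 means 5x).
    """
    interpreted_time = 1.0                         # normalized baseline
    compiled_time = interpreted_time / speedup     # 0.2 for a 5x speedup
    # Same reference count operations, so the same absolute extra cost.
    extra_time = interpreted_time * interpreted_overhead
    return extra_time / compiled_time


if __name__ == "__main__":
    for overhead in (0.07, 0.08, 0.10):
        result = compiled_overhead(overhead, speedup=5.0)
        print(f"{overhead:.0%} interpreted -> {result:.0%} compiled")
```

Under exactly these assumptions the relative overhead is simply the interpreted overhead multiplied by the speedup: 7% becomes about 35%, and a 10% interpreted overhead would give 50%. The precise figure matters less than the amplification effect.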
Multi-threaded code that uses packed arrays or numeric arrays could see very big benefits, as these could probably be accessed from multiple threads without fine-grained synchronization. Code that spends a lot of time in single-threaded C extensions (that don't use Python containers) could also benefit a lot.
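As a rough illustration of the packed-array case, here is a sketch in plain Python where each thread works on its own disjoint slice of an `array.array`, so no two threads touch the same element. The helper names `scale_chunk` and `scale_in_parallel` are hypothetical, and whether compiled code could skip per-element synchronization here is an assumption, not something mypyc does today.

```python
from array import array
from concurrent.futures import ThreadPoolExecutor


def scale_chunk(values: array, start: int, stop: int, factor: float) -> None:
    """Multiply values[start:stop] by factor in place."""
    for i in range(start, stop):
        values[i] *= factor


def scale_in_parallel(values: array, factor: float, workers: int = 4) -> None:
    """Split the array into disjoint index ranges, one task per range."""
    n = len(values)
    step = max(1, (n + workers - 1) // workers)  # ceiling division, never 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(scale_chunk, values, start, min(start + step, n), factor)
            for start in range(0, n, step)
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker


if __name__ == "__main__":
    data = array("d", range(8))
    scale_in_parallel(data, 2.0)
    print(list(data))
```

With the GIL this runs correctly but gains nothing from the extra threads; on a free-threaded build the chunks could in principle proceed in parallel.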
Since mypy is single-threaded and uses lots of heap-allocated objects and built-in collections, it could experience a fairly high overhead.
Open issues / brainstorming
[...] `i64` or `float` wouldn't require synchronization.
[...] `a = x.y.z` maybe we can still borrow `x.y` if it's a final attribute.
[...] `self.x = self.y + 's'`, maybe we'd take a single lock around `self` instead of locking separately for `self.y` and `self.x`.
[...] `gc.get_objects()` in other threads, which seems acceptable.)
Useful links
Tasks