-
Notifications
You must be signed in to change notification settings - Fork 2.3k
[Idea][free-threaded][gil_scoped_acquire]: Better Safe Than Sorry (add compat mutex) #5943
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
In free-threaded Python (Py_GIL_DISABLED), the GIL no longer provides mutual exclusion. Existing code that assumes gil_scoped_acquire provides mutual exclusion would have data races. This adds a global compatibility mutex that restores the safety guarantee: - Acquired when gil_scoped_acquire is constructed (if not already held) - Released when gil_scoped_release is constructed (if held) - Ownership is tracked per-thread to handle the main thread case KNOWN LIMITATION: The global mutex conflicts with per-interpreter GILs (Py_MOD_PER_INTERPRETER_GIL_SUPPORTED). The "Per-Subinterpreter GIL" test deadlocks with this change. Future work could use per-interpreter mutexes. This is a proof-of-concept for discussion.
|
What do you guys think about the general idea? |
|
I would suggest Even on free-threaded Python, attaching a thread state can block in order to support stop-the-world garbage collection. That makes acquiring the compat mutex after attaching the thread state deadlock-prone: imagine that thread A has an attached thread state and is blocking to try to acquire the compat mutex, while thread B holds the compat mutex and is waiting for everyone else to detach their thread states so it can perform a GC run. Changing the order so we acquire the compat mutex before attaching the thread state would improve the situation, but I'm not convinced it will totally eliminate the deadlock possibility. We could also consider using It is also worth considering whether we want to enforce mutual exclusion against all Python code (in which case we kind of have to reach for the stop-the-world API and its associated significant performance penalty) or only against other uses of py::gil_scoped_acquire (in which case the compat-mutex approach is feasible). |
|
I like this. While pybind11 modules can enable/require the GIL to exist (by not specifying But I think it needs an opt-out mechanism ... some way to say "this code needs the GIL but does not need mutual exclusion". Currently (without this change) that is what Options for that approach:
|
|
Curious if @colesbury has some ideas/comments here on how to handle this. |
|
I think we had a similar discussion in the context of Cython. I don't think we should add global locks to I think this only addresses a very narrow class of bugs. Exclusivity bugs in C++ code involving Second, if we truly want the exclusive semantics of the GIL, then we need the GIL. We already have a mechanism for that by omitting Finally, changing the default behavior of |
Interesting, I was guessing it's very common.
OK, I'll close this PR (I will not have time to work on something like Thanks for the explanation! |
Description
This is a proof-of-concept for discussion, not a final implementation.
The Problem
In traditional Python,
gil_scoped_acquiremeans:In free-threaded Python (
Py_GIL_DISABLED),gil_scoped_acquiremeans:Code written assuming the first meaning is silently broken under free-threading. This includes pybind11 internals and user code.
With pybind11 as-is, free-threading turns every unaudited
gil_scoped_acquireinto a potential data race. Given pybind11's scale of adoption, it's not a question of if these races will bite users, but how many and how badly.I'd much rather hand users a speed limit they can see and choose to remove, than landmines they discover in production.
The Proposal: Better Safe Than Sorry
scoped_ensure_thread_state— just thread state, no lockingThis POC adds a global compatibility mutex that restores the mutual exclusion guarantee for free-threaded builds:
This makes existing code safe by default under free-threading, at the cost of serializing
gil_scoped_acquiresections.Trade-offs
The performance cost is the price of safety, but users who need maximum free-threading performance can migrate to the new helper.
Known Limitation of this POC
The global compat mutex conflicts with per-interpreter GILs (
Py_MOD_PER_INTERPRETER_GIL_SUPPORTED), therefore thePer-Subinterpreter GILtest is skipped under free-threading.Future work could use per-interpreter mutexes instead of a global one.
Questions for Discussion
scoped_ensure_thread_state?