Skip to content

Surround the GIL with a ReentrantLock on the Julia side. #637

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -7,3 +7,7 @@ dist/
.CondaPkg/
/jltest.*
uv.lock

# pixi environments
.pixi
*.egg-info
109 changes: 94 additions & 15 deletions src/GIL/GIL.jl
Original file line number Diff line number Diff line change
Expand Up @@ -9,44 +9,99 @@ module GIL

using ..C: C

# Ensure that only one Julia task tries to acquire the Python GIL.
# Avoid the potential issue that a task could miscompute whether
# it actually has the GIL simply because a different task that ran
# on the same thread that once had the GIL.
# https://github.com/JuliaPy/PythonCall.jl/issues/627
const _jl_gil_lock = ReentrantLock()

"""
hasgil()

Returns `true` if the current thread has the GIL or `false` otherwise.
"""
hasgil() = C.PyGILState_Check() == Cint(1)

"""
lock(f)
lock(f; exclusive=true)

Lock the GIL, compute `f()`, unlock the GIL, then return the result of `f()`.

Use this to run Python code from threads that do not currently hold the GIL, such as new
threads. Since the main Julia thread holds the GIL by default, you will need to
[`unlock`](@ref) the GIL before using this function.

Setting the `exclusive` keyword to `false` allows more than one Julia `Task`
to attempt to acquire the GIL. This may be useful if one Julia `Task` calls
code which releases the GIL, in which case another Julia task could acquire
it. However, this is probably not a sound approach.

See [`@lock`](@ref) for the macro form.
"""
function lock(f)
state = C.PyGILState_Ensure()
function lock(f; exclusive=true)
exclusive && Base.lock(_jl_gil_lock)
try
f()
state = C.PyGILState_Ensure()
try
f()
finally
C.PyGILState_Release(state)
end
finally
C.PyGILState_Release(state)
exclusive && Base.unlock(_jl_gil_lock)
end
end

"""
@lock expr
@lock [exclusive=true] expr

Lock the GIL, compute `expr`, unlock the GIL, then return the result of `expr`.

Use this to run Python code from threads that do not currently hold the GIL, such as new
threads. Since the main Julia thread holds the GIL by default, you will need to
[`@unlock`](@ref) the GIL before using this function.

Setting the `exclusive` parameter to `false` allows more than one Julia `Task`
to attempt to acquire the GIL. This may be useful if one Julia `Task` calls
code which releases the GIL, in which case another Julia task could acquire
it. However, this is probably not a sound approach.

The macro equivalent of [`lock`](@ref).
"""
macro lock(expr)
quote
state = C.PyGILState_Ensure()
Base.lock(_jl_gil_lock)
try
$(esc(expr))
state = C.PyGILState_Ensure()
try
$(esc(expr))
finally
C.PyGILState_Release(state)
end
finally
C.PyGILState_Release(state)
Base.unlock(_jl_gil_lock)
end
end
end

macro lock(parameter, expr)
parameter.head == :(=) &&
parameter.args[1] == :exclusive ||
throw(ArgumentError("The only accepted parameter to @lock is `exclusive`."))

do_lock = esc(parameter.args[2])
quote
$do_lock && Base.lock(_jl_gil_lock)
try
state = C.PyGILState_Ensure()
try
$(esc(expr))
finally
C.PyGILState_Release(state)
end
finally
$do_lock && Base.unlock(_jl_gil_lock)
end
end
end
Expand All @@ -63,11 +118,17 @@ Python code. That other thread can be a Julia thread, which must lock the GIL us
See [`@unlock`](@ref) for the macro form.
"""
function unlock(f)
state = C.PyEval_SaveThread()
_locked = Base.islocked(_jl_gil_lock)
_locked && Base.unlock(_jl_gil_lock)
try
f()
state = C.PyEval_SaveThread()
try
f()
finally
C.PyEval_RestoreThread(state)
end
finally
C.PyEval_RestoreThread(state)
_locked && Base.lock(_jl_gil_lock)
end
end

Expand All @@ -84,13 +145,31 @@ The macro equivalent of [`unlock`](@ref).
"""
macro unlock(expr)
quote
state = C.PyEval_SaveThread()
_locked = Base.islocked(_jl_gil_lock)
_locked && Base.unlock(_jl_gil_lock)
try
$(esc(expr))
state = C.PyEval_SaveThread()
try
$(esc(expr))
finally
C.PyEval_RestoreThread(state)
end
finally
C.PyEval_RestoreThread(state)
_locked && Base.lock(_jl_gil_lock)
end
end
end

#=
# Disable this for now since holding this lock will disable finalizers
# If the main thread already has the GIL, we should lock _jl_gil_lock.
function __init__()
if hasgil()
Base.lock(_jl_gil_lock)
end
end
Comment on lines +166 to +170
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Locking on __init__ is probably a bad idea even though we are technically holding the GIL since this will disable finalizers.

Rather we should consider invoking lock while executing other calls.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disabling this apparently allowed the python tests to succeed.

Copy link

@vtjnash vtjnash Jul 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After sleeping on this, I propose you make the following changes to make this fix work:

  • Replace this ReentrantLock with a OncePerThread{@NamedTuple{stack::Vector{Task}, set::IdSet{Task}, condvar::Condition}} lockowners. The point being mostly to convert the state on ReentrantLock into something similar but protected by the GIL to avoid concurrency, finalizer, and multi-threading issues.
  • Perhaps, after locking the GIL on this thread, check that the current Task is in(lockowners().set, current_task) for safety (forcing the user to make explicit GIL usage on each Task with a call to lock on each Task)?
  • During the calls to lock (after getting GIL), push the current Task to the lockowners stack and add it to the set (if using that)
  • On unlock, block (on condvar) until the current_task is at the top of the stack (notify condvar after every unlock) to enforce that GIL unlock calls are correctly sequenced
  • Where necessary (anywhere internally that is holding the GIL to update a Julia data structure related to python), add calls to enable/disable finalizers to prevent memory corruption from concurrent mutation of the data structures (separate PR)
  • Whenever a Task interacts with python, force current_task().sticky = true to prevent it migrating to a thread that is not holding the GIL.
  • Adding __init__ is dangerous since that doesn't run on the main Task in some user applications even though it always does so during your CI tests. You might just want to add some special exemptions for when current_task() == root_task(0)?

Does that all sound logical and reasonable? My proposal already went through a few drafts as I tried to make it seem straightforward enough to explain and use, so it might not be the only way to deal with some of these constraints, and I'm probably forgetting at least some important details.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OncePerThread is to be introduced in Julia 1.12 which has yet to be released at the time of this writing. Is there a lesser version of this that could be implemented prior to Julia 1.12, preferably to Julia 1.10 LTS?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vtjnash Should Condition here be Base.Condition (Base.GenericCondition{Base.AlwaysLockedST}) or Threads.Condition (Base.GenericCondition{ReentrantLock}) ?

I originally thought you meant Threads.Condition, but now I'm thinking Base.Condition may be correct since in theory all of the Tasks waiting to unlock should already be on the same thread.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Base.Condition, since this is locked to thread-local state

=#

include("GlobalInterpreterLock.jl")

end
120 changes: 120 additions & 0 deletions src/GIL/GlobalInterpreterLock.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,120 @@
struct TaskState
task::Task
sticky::Bool # original stickiness of the task
state::C.PyGILState_STATE
end

struct TaskStack
stack::Vector{TaskState}
count::IdDict{Task,Int}
condvar::Threads.Condition
function TaskStack()
return new(TaskState[], IdDict{Task,Int}(), Threads.Condition())
end
end
function Base.last(task_stack::TaskStack)::Task
return last(task_stack.stack).task
end
function Base.push!(task_stack::TaskStack, task::Task)
original_sticky = task.sticky
# The task should not migrate threads while acquiring or holding the GIL
task.sticky = true
gil_state = C.PyGILState_Ensure()

# Save the stickiness and state for when we release
state = TaskState(task, original_sticky, gil_state)
push!(task_stack.stack, state)

# Increment the count for this task
count = get(task_stack.count, task, 0)
task_stack.count[task] = count + 1

return task_stack
end
function Base.pop!(task_stack::TaskStack)::Task
state = pop!(task_stack.stack)
task = state.task
sticky = state.sticky
gil_state = state.state

# Decrement the count for this task
count = task_stack.count[task] - 1
if count == 0
# If 0, remove it from the key set
pop!(task_stack.count, task)
else
task_stack[task] = count
end

C.PyGILState_Release(gil_state)

# Restore sticky state after releasing the GIL
task.sticky = sticky

Base.lock(task_stack.condvar) do
notify(task_stack.condvar)
end

return task
end
Base.isempty(task_stack::TaskStack) = isempty(task_stack.stack)

if !isdefined(Base, :OncePerThread)

# OncePerThread is implemented in full in Julia 1.12
# This implementation is meant for compatibility with Julia 1.10 and 1.11
# and only supports a static number of threads. Use Julia 1.12 for dynamic
# thread usage.
mutable struct OncePerThread{T,F} <: Function
@atomic xs::Vector{T} # values
@atomic ss::Vector{UInt8} # states: 0=initial, 1=hasrun, 2=error, 3==concurrent
const initializer::F
function OncePerThread{T,F}(initializer::F) where {T,F}
nt = Threads.maxthreadid()
return new{T,F}(Vector{T}(undef, nt), zeros(UInt8, nt), initializer)
end
end
OncePerThread{T}(initializer::Type{U}) where {T, U} = OncePerThread{T,Type{U}}(initializer)
(once::OncePerThread{T,F})() where {T,F} = once[Threads.threadid()]
function Base.getindex(once::OncePerThread, tid::Integer)
tid = Threads.threadid()
ss = @atomic :acquire once.ss
xs = @atomic :monotonic once.xs
if checkbounds(Bool, xs, tid)
if ss[tid] == 0
xs[tid] = once.initializer()
ss[tid] = 1
end
return xs[tid]
else
throw(ErrorException("Thread id $tid is out of bounds as initially allocated. Use Julia 1.12 for dynamic thread usage."))
end
end
Copy link
Contributor

@MilesCranmer MilesCranmer Jul 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make this per-task rather than per-thread?

I use something like this in SymbolicRegression.jl:

struct PerTaskCache{T,F<:Function}
    constructor::F
    PerTaskCache{T}(constructor::F) where {T,F} = new{T,F}(constructor)
    PerTaskCache{T}() where {T} = PerTaskCache{T}(() -> T())
end
function Base.getindex(cache::PerTaskCache{T}) where {T}
    tls = Base.task_local_storage()
    haskey(tls, cache) && return tls[cache]::T
    value = cache.constructor()::T
    tls[cache] = value
    return value
end
function Base.setindex!(cache::PerTaskCache{T}, value::T) where {T}
    tls = Base.task_local_storage()
    tls[cache] = value
    return value
end

Then I can use it on each task, without worrying about locking or dynamic threading:

const CachedPrep = PerTaskCache{Dict{UInt,Any}}()

function foo()
    cache = CachedPrep[]
    #= do stuff with cache::Dict{UInt,Any} =#
end

Copy link
Member Author

@mkitti mkitti Jul 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Python GIL knows nothing about Julia's Tasks. It only knows about threads.

The goal of the new lock that Jameson proposed is to lock the GIL on a thread and make Tasks sticky to that thread so they do not migrate.

Several Tasks running on the same thread could technically hold the GIL together. Thus we keep a stack of Tasks to ensure one Task owns the GIL at a time and they are properly unlocked in the right order.

It is essentially a thread and GIL aware ReentrantLock.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that holding the GIL together is sort of difficult to arrange to do entirely correctly, since python also allows callees to unlock it, even though that is unsound in the presence of other any other locks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks; makes sense!


end

struct GlobalInterpreterLock <: Base.AbstractLock
lock_owners::OncePerThread{TaskStack}
function GlobalInterpreterLock()
return new(OncePerThread{TaskStack}(TaskStack))
end
end
function Base.lock(gil::GlobalInterpreterLock)
push!(gil.lock_owners(), current_task())
return nothing
end
function Base.unlock(gil::GlobalInterpreterLock)
lock_owner::TaskStack = gil.lock_owners()
while last(lock_owner) != current_task()
wait(lock_owner.condvar)
end
task = pop!(lock_owner)
@assert task == current_task()
return nothing
end
function Base.islocked(gil::GlobalInterpreterLock)
# TODO: handle Julia 1.10 and 1.11 case when have not allocated up to maxthreadid
return any(!isempty(gil.lock_owners[thread_index]) for thread_index in 1:Threads.maxthreadid())
end

const _GIL = GlobalInterpreterLock()
10 changes: 10 additions & 0 deletions test/GC.jl
Original file line number Diff line number Diff line change
@@ -1,4 +1,7 @@
using TestItemRunner

@testitem "GC.gc()" begin
using PythonCall
let
pyobjs = map(pylist, 1:100)
PythonCall.GIL.@unlock Threads.@threads for obj in pyobjs
Expand All @@ -13,6 +16,7 @@
end

@testitem "GC.GCHook" begin
using PythonCall
let
pyobjs = map(pylist, 1:100)
PythonCall.GIL.@unlock Threads.@threads for obj in pyobjs
Expand All @@ -23,5 +27,11 @@ end
VERSION >= v"1.10.0-" &&
@test !isempty(PythonCall.GC.QUEUE.items)
GC.gc()

# Unlock and relocking the ReentrantLock allows this test to pass
# if _jl_gil_lock is locked on init
# Base.unlock(PythonCall.GIL._jl_gil_lock)
# Base.lock(PythonCall.GIL._jl_gil_lock)

@test isempty(PythonCall.GC.QUEUE.items)
end
21 changes: 19 additions & 2 deletions test/GIL.jl
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@
function threaded_sleep()
PythonCall.GIL.unlock() do
Threads.@threads for i = 1:2
PythonCall.GIL.lock() do
PythonCall.GIL.lock(exclusive=false) do
pyimport("time").sleep(1)
end
end
Expand All @@ -18,14 +18,22 @@
if Threads.nthreads() ≥ 2
@test 0.9 < t.time < 1.2
end

@test PythonCall.GIL.hasgil()
PythonCall.GIL.unlock() do
@test !Base.islocked(PythonCall.GIL._jl_gil_lock)
PythonCall.GIL.lock() do
@test Base.islocked(PythonCall.GIL._jl_gil_lock)
end
end
end

@testitem "@unlock and @lock" begin
# This calls Python's time.sleep(1) twice concurrently. Since sleep() unlocks the
# GIL, these can happen in parallel if Julia has at least 2 threads.
function threaded_sleep()
PythonCall.GIL.@unlock Threads.@threads for i = 1:2
PythonCall.GIL.@lock pyimport("time").sleep(1)
PythonCall.GIL.@lock exclusive=false pyimport("time").sleep(1)
end
end
# one run to ensure it's compiled
Expand All @@ -36,4 +44,13 @@ end
if Threads.nthreads() ≥ 2
@test 0.9 < t.time < 1.2
end

@test PythonCall.GIL.hasgil()
PythonCall.GIL.@unlock begin
@test !Base.islocked(PythonCall.GIL._jl_gil_lock)
PythonCall.GIL.@lock begin
@test Base.islocked(PythonCall.GIL._jl_gil_lock)
end
end

end