Commit daa159f

gh-135871: Reload lock internal state while spinning in PyMutex_LockTimed (gh-146064)
Add atomic loads in the slow path of PyMutex to increase the number of lock acquisitions per second that threads can make on a shared mutex.
1 parent 821581a commit daa159f

2 files changed: +16 -0 lines changed

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Improve multithreaded scaling of PyMutex in low-contention scenarios by reloading the lock's internal state, without slowing down high-contention scenarios.

Python/lock.c

Lines changed: 15 additions & 0 deletions
@@ -27,8 +27,10 @@ static const PyTime_t TIME_TO_BE_FAIR_NS = 1000*1000;
 // enabled.
 #if Py_GIL_DISABLED
 static const int MAX_SPIN_COUNT = 40;
+static const int RELOAD_SPIN_MASK = 3;
 #else
 static const int MAX_SPIN_COUNT = 0;
+static const int RELOAD_SPIN_MASK = 1;
 #endif
 
 struct mutex_entry {
@@ -79,6 +81,16 @@ _PyMutex_LockTimed(PyMutex *m, PyTime_t timeout, _PyLockFlags flags)
     };
 
     Py_ssize_t spin_count = 0;
+#ifdef Py_GIL_DISABLED
+    // Using thread-id as a way of reducing contention further in the reload below.
+    // It adds a pseudo-random starting offset to the recurrence, so that threads
+    // are less likely to try and run compare-exchange at the same time.
+    // The lower bits of platform thread ids are likely to not be random,
+    // hence the right shift.
+    const Py_ssize_t tid = (Py_ssize_t)(_Py_ThreadId() >> 12);
+#else
+    const Py_ssize_t tid = 0;
+#endif
     for (;;) {
         if ((v & _Py_LOCKED) == 0) {
             // The lock is unlocked. Try to grab it.
@@ -92,6 +104,9 @@ _PyMutex_LockTimed(PyMutex *m, PyTime_t timeout, _PyLockFlags flags)
             // Spin for a bit.
             _Py_yield();
             spin_count++;
+            if (((spin_count + tid) & RELOAD_SPIN_MASK) == 0) {
+                v = _Py_atomic_load_uint8_relaxed(&m->_bits);
+            }
             continue;
         }
 
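
The reload condition `((spin_count + tid) & RELOAD_SPIN_MASK) == 0` can be explored in isolation. The following is a small sketch, assuming `spin_count` runs from 1 up to `MAX_SPIN_COUNT` as the increment before the check suggests. The helpers `demo_tid_offset`, `count_reloads`, and `first_reload` are hypothetical names introduced here, and the thread id values used with them are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for (Py_ssize_t)(_Py_ThreadId() >> 12):
   drop the low 12 bits of a thread id before using it as an offset. */
static intptr_t
demo_tid_offset(uintptr_t thread_id)
{
    return (intptr_t)(thread_id >> 12);
}

/* Count how many spin iterations in 1..max_spin reload the lock state.
   mask must have the form 2^k - 1, as both RELOAD_SPIN_MASK values do. */
static int
count_reloads(intptr_t offset, int mask, int max_spin)
{
    assert(mask >= 0 && ((mask + 1) & mask) == 0);
    int reloads = 0;
    for (int spin_count = 1; spin_count <= max_spin; spin_count++) {
        if (((spin_count + offset) & mask) == 0) {
            reloads++;
        }
    }
    return reloads;
}

/* Return the first spin iteration (1-based) that reloads, or 0 if none. */
static int
first_reload(intptr_t offset, int mask, int max_spin)
{
    assert(mask >= 0 && ((mask + 1) & mask) == 0);
    for (int spin_count = 1; spin_count <= max_spin; spin_count++) {
        if (((spin_count + offset) & mask) == 0) {
            return spin_count;
        }
    }
    return 0;
}
```

With a mask of 3 and 40 spins, every offset yields 10 reloads, so the offset changes only when a thread reloads, not how often. Offsets 0, 1, 2, and 3 first reload on iterations 4, 3, 2, and 1 respectively, which is the staggering the code comment describes. Thread ids that differ only in their low 12 bits map to the same offset.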