Merged
[3.14] gh-139103: fix free-threading dataclass.__init__ perf issue (gh-141596)

The dataclasses `__init__` function is generated dynamically by a call to `exec()` and so doesn't have deferred reference counting enabled. Enable deferred reference counting on functions when assigned as an attribute to type objects to avoid reference count contention when creating dataclass instances.
(cherry picked from commit ce79154)

Co-authored-by: Edward Xu <[email protected]>
LindaSummer authored and colesbury committed Nov 19, 2025
commit 3eb5459d46fc4fcab99687091dbb75661477038b
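
The commit message above relies on the fact that a dataclass's `__init__` is an ordinary Python function created when the decorator runs and then stored on the class. The following is a minimal sketch for inspecting that; the `Point` class is a hypothetical example, and `sys._is_gil_enabled()` is only looked up if the running interpreter provides it.

```python
import sys
import types
from dataclasses import dataclass


@dataclass
class Point:  # hypothetical example class, not part of the commit
    x: int
    y: int


# The generated __init__ lives in the class's own namespace.
init = Point.__dict__["__init__"]
is_plain_function = isinstance(init, types.FunctionType)

# sys._is_gil_enabled() is not available on older interpreters,
# so fall back to assuming the GIL is enabled when it is missing.
gil_check = getattr(sys, "_is_gil_enabled", None)
gil_enabled = gil_check() if gil_check is not None else True

print("generated __init__ is a plain function:", is_plain_function)
print("GIL enabled:", gil_enabled)
```

The contention described in the commit message would only be expected when the second line reports that the GIL is disabled, that is, on a free-threaded build.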
@@ -0,0 +1 @@
Improve multithreaded scaling of dataclasses on the free-threaded build.
12 changes: 12 additions & 0 deletions Objects/typeobject.c
@@ -6181,6 +6181,18 @@ type_setattro(PyObject *self, PyObject *name, PyObject *value)
    assert(!_PyType_HasFeature(metatype, Py_TPFLAGS_INLINE_VALUES));
    assert(!_PyType_HasFeature(metatype, Py_TPFLAGS_MANAGED_DICT));

#ifdef Py_GIL_DISABLED
    // gh-139103: Enable deferred refcounting for functions assigned
    // to type objects. This is important for `dataclass.__init__`,
    // which is generated dynamically.
    if (value != NULL &&
        PyFunction_Check(value) &&
        !_PyObject_HasDeferredRefcount(value))
    {
        PyUnstable_Object_EnableDeferredRefcount(value);
    }
#endif

    PyObject *old_value = NULL;
    PyObject *descr = _PyType_LookupRef(metatype, name);
    if (descr != NULL) {
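
The hunk above hooks attribute assignment on type objects, which in CPython is the code behind the default `type.__setattr__`. As a rough Python-level illustration of which assignments take that path, the sketch below uses a hypothetical `RecordingMeta` metaclass (an assumption for demonstration only, not part of the change) to log every name set on a class after it has been created.

```python
from dataclasses import dataclass

recorded = []  # names assigned on the class after creation


class RecordingMeta(type):
    """Hypothetical metaclass that logs attribute assignments on its classes."""

    def __setattr__(cls, name, value):
        recorded.append(name)
        # Delegate to the default type.__setattr__ implementation.
        super().__setattr__(name, value)


@dataclass
class Point(metaclass=RecordingMeta):
    x: int
    y: int


def describe(self):
    return f"Point({self.x}, {self.y})"


# Assigning a function to a class attribute also goes through __setattr__.
Point.describe = describe

print(recorded)
```

Names defined directly in the class body (here `x` and `y` annotations) do not appear in the log, while the methods that `@dataclass` attaches afterwards, such as `__init__`, do. This matches the comment in the hunk: the generated `__init__` reaches the type through attribute assignment.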
12 changes: 12 additions & 0 deletions Tools/ftscalingbench/ftscalingbench.py
@@ -27,6 +27,7 @@
import sys
import threading
import time
from dataclasses import dataclass

# The iterations in individual benchmarks are scaled by this factor.
WORK_SCALE = 100
@@ -189,6 +190,17 @@ def thread_local_read():
        _ = tmp.x


@dataclass
class MyDataClass:
    x: int
    y: int
    z: int

@register_benchmark
def instantiate_dataclass():
    for _ in range(1000 * WORK_SCALE):
        obj = MyDataClass(x=1, y=2, z=3)

def bench_one_thread(func):
    t0 = time.perf_counter_ns()
    func()
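
For readers who want to try the effect outside of `ftscalingbench.py`, here is a minimal standalone sketch of the same idea: time dataclass instantiation on one thread and on several. The names `Point`, `instantiate`, and `timed_run`, and the iteration counts, are assumptions chosen for illustration and do not mirror the benchmark harness.

```python
import threading
import time
from dataclasses import dataclass


@dataclass
class Point:  # hypothetical stand-in for MyDataClass in the benchmark
    x: int
    y: int
    z: int


def instantiate(n):
    # Each call runs the dynamically generated __init__ once.
    for _ in range(n):
        Point(x=1, y=2, z=3)


def timed_run(num_threads, n):
    """Run instantiate(n) on num_threads threads; return elapsed nanoseconds."""
    threads = [
        threading.Thread(target=instantiate, args=(n,))
        for _ in range(num_threads)
    ]
    t0 = time.perf_counter_ns()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter_ns() - t0


single = timed_run(1, 20_000)
multi = timed_run(4, 20_000)
print(f"1 thread:  {single / 1e6:.2f} ms")
print(f"4 threads: {multi / 1e6:.2f} ms for 4x the total work")
```

On a build with the GIL, the four-thread run is expected to take roughly four times as long because the threads cannot execute bytecode in parallel. On a free-threaded build, how close the two timings are gives a rough indication of how well instantiation scales.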