-
-
Notifications
You must be signed in to change notification settings - Fork 34.8k
gh-151646: Fix data race between gc.get_stats() and GC in free-threading builds #151766
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,99 @@ | ||
| # Reproduction for the gc.get_stats() data race under free-threading. | ||
| # | ||
| # CPython issue gh-151646: in free-threading builds (``--disable-gil``) the | ||
| # concurrent collector and ``gc.get_stats()`` touch the same per-generation | ||
| # statistics struct (``gcstate->generation_stats``) without any synchronisation: | ||
| # | ||
| # * READER -- ``gc_get_stats_impl()`` in ``Modules/gcmodule.c`` copies the | ||
| # ``struct gc_generation_stats`` for each generation (``collections``, | ||
| # ``collected``, ``uncollectable``, ``candidates``, ``duration``) with a | ||
| # plain struct assignment and no lock. | ||
| # | ||
| # * WRITER -- at the end of every collection cycle ``gc_collect_main()`` in | ||
| # ``Python/gc_free_threading.c`` mutates that very struct in place | ||
| # (``stats->collections++``, ``stats->collected += m``, ...), again with no | ||
| # lock. | ||
| # | ||
| # When one thread is collecting while several others read the stats, the | ||
| # unsynchronised read/write to the same memory is a data race. The values are | ||
| # only statistics, so the race is benign at the Python level (no crash), which | ||
| # is exactly what lets this script double as a regression test: once the fix | ||
| # adds proper synchronisation, ThreadSanitizer should stay quiet while the | ||
| # script keeps exiting cleanly. | ||
| # | ||
| # Running this script under a free-threading CPython build compiled with | ||
| # ThreadSanitizer (``./configure --disable-gil --with-thread-sanitizer``) is | ||
| # expected to print a ``WARNING: ThreadSanitizer: data race`` report whose stack | ||
| # traces point at ``gc_get_stats_impl`` (gcmodule.c) and ``gc_collect_main`` | ||
| # (gc_free_threading.c). | ||
| # | ||
| # Standalone usage: | ||
| # | ||
| # ./python Lib/test/test_free_threading/test_gc_get_stats_race.py | ||
| # | ||
| # or as part of the test suite (only meaningful under a TSAN build): | ||
| # | ||
| # ./python -m test test_free_threading.test_gc_get_stats_race | ||
|
|
||
| import gc | ||
| import threading | ||
| import unittest | ||
|
|
||
| from test.support import threading_helper | ||
|
|
||
|
|
||
| # One thread hammers gc.collect() (the writer side); enough reader threads run | ||
| # gc.get_stats() concurrently to make the race easy to observe. Readers run | ||
| # until the writer is done so the test duration is controlled by the collection | ||
| # count rather than by reader loop counts. | ||
| NUM_READERS = 4 | ||
| ITERATIONS = 50 | ||
|
|
||
|
|
||
| def _stress_get_stats_race(num_readers=NUM_READERS, iterations=ITERATIONS): | ||
| """Race gc.collect() against gc.get_stats().""" | ||
|
|
||
| # Synchronise the start so the writer and readers overlap for as long as | ||
| # possible, maximising the chance of the read and write landing on the | ||
| # statistics struct at the same time. | ||
| done = threading.Event() | ||
|
|
||
| def collector(): | ||
| try: | ||
| for _ in range(iterations): | ||
| # Writer: each full collection updates gcstate->generation_stats. | ||
| gc.collect() | ||
| finally: | ||
| done.set() | ||
|
|
||
| def reader(): | ||
| while not done.is_set(): | ||
| # Reader: copies the per-generation stats structs with no lock. | ||
| gc.get_stats() | ||
|
|
||
| threading_helper.run_concurrently([collector] + [reader] * num_readers) | ||
|
|
||
|
|
||
| @threading_helper.requires_working_threading() | ||
| class TestGCGetStatsRace(unittest.TestCase): | ||
| def test_get_stats_collect_race(self): | ||
| _stress_get_stats_race() | ||
|
|
||
| # The race is benign at the Python level: gc.get_stats() must still | ||
| # return well-formed data and the interpreter must not crash. | ||
| stats = gc.get_stats() | ||
| self.assertEqual(len(stats), 3) | ||
| for generation in stats: | ||
| self.assertIn("collections", generation) | ||
| self.assertIn("collected", generation) | ||
| self.assertIn("uncollectable", generation) | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| # Standalone reproduction: run the race and exit cleanly so the script can | ||
| # be reused as a regression check once the fix lands. | ||
| print(f"Racing 1 gc.collect() thread against " | ||
| f"{NUM_READERS} gc.get_stats() thread(s), {ITERATIONS} collections...") | ||
| _stress_get_stats_race() | ||
| print("Done (no Python-level crash). " | ||
| "Run under a free-threading + TSAN build to observe the data race.") |
3 changes: 3 additions & 0 deletions
3
Misc/NEWS.d/next/Core_and_Builtins/2026-06-19-12-00-00.gh-issue-151646.Gc5tAt.rst
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| Fix a data race in free-threading builds between :func:`gc.get_stats` and a | ||
| concurrent garbage collection cycle. Access to the per-generation statistics | ||
| is now serialized with a mutex so the reader observes a consistent snapshot. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems the guard macro
Py_GIL_DISABLEDis missing here and below the unlock.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole gc_free_threading.c file is wrapped in
#ifdef Py_GIL_DISABLEDso I don't think that's needed.