feat(context): all-native OTEL context write API — Phase 1 (PROF-15271)#631
Draft
rkennke wants to merge 8 commits into
Draft
feat(context): all-native OTEL context write API — Phase 1 (PROF-15271)#631rkennke wants to merge 8 commits into
rkennke wants to merge 8 commits into
Conversation
Captures the UAF problem, options A-E, the JMH benchmark campaign and its governing principle, the combined-API design, the consumer audit, and the expand -> migrate -> contract migration plan. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add native setTraceContext0/clearTraceContext0 (combined per-scope activate/deactivate) and setContextValue0/clearContextValue0 (single attribute), plus attrs_data compaction and LRS helpers, in javaApi.cpp, with matching private native declarations on JavaProfiler. Each resolves the current carrier's OtelThreadContextRecord via ProfiledThread::current(). A JNI native frame pins a mounted virtual thread to its carrier for the duration of the call, so the resolved record is guaranteed live and cannot migrate mid-write -- there is no cached per-thread DirectByteBuffer to dangle. Writes follow the same detach -> mutate -> attach valid-flag protocol as ThreadContext, keeping the async-signal-handler sampler (ContextApi::get) coherent. Native layer only; the public Java API and value cache built on top follow in a later commit. The DirectByteBuffer path is unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-15271) Add the public all-native context API on JavaProfiler (provisional names, pending dd-trace-java coordination): setTraceContext/clearTraceContext (combined per-scope activate/deactivate) and setContextValue/clearContextValue (single attribute), each backed by the native primitives. Relocate the value->(encoding,utf8) memoization out of the per-thread ThreadContext into a new process-wide ContextValueCache. Because the encoding is a process-global immutable ID (registerConstant0), a shared lock-free cache is correct: immutable entries published via a single AtomicReferenceArray slot store (no torn reads), a miss re-registers idempotently, and collisions self-heal. This avoids duplicating the table across every carrier/virtual thread and needs no CarrierThreadLocal. registerConstant0 is made package-private for reuse (relocates in phase 3). Coexists with the DirectByteBuffer path (both write the same native record); the DBB path is unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…5271) Mark the DBB write/read surface `@Deprecated` pointing at the all-native API: JavaProfiler.setContext/clearContext/setContextAttribute/clearContextAttribute/ setContextAttributesByIdAndBytes/copyTags/getThreadContext, and the DBB-path helper classes ThreadContext, ScopeStack, and ContextSetter. Non-breaking (annotations only) -- signals the migration for dd-trace-java (phase 2) ahead of removal (phase 3). Java has no -Werror, so the deprecation notes are informational; the DBB path still functions unchanged during coexistence. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…PROF-15271) setTraceContext0/clearTraceContext0 now write the fixed LRS entry header (key_index=0, length=16) in addition to its hex value, instead of relying on the ThreadContext constructor to have initialized it. This makes the all-native path correct on a record that only saw the ProfiledThread zero-init -- i.e. when no DirectByteBuffer/ThreadContext was ever created (the phase-2 pure-native case), where attrs_data[1] would otherwise be 0 and mis-frame the LRS entry. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AllNativeContextTest exercises setTraceContext/clearTraceContext/setContextValue/ clearContextValue: round-trip readback, equivalence with the DirectByteBuffer path, clear semantics, span-transition attribute reset, single-attribute set/clear, attrs_data overflow, rejected (null/oversized) values, DBB<->native coexistence on one thread, and coherence of native writes from many mounted virtual threads (the migration-safe path). Read-back uses the ThreadContext DBB read methods as an oracle -- a view over the same native record the all-native path writes. All 9 pass on JDK 21 (debug); the existing context suite (Otel/Carrier/Tag/ProcessContext) is unregressed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ROF-15271) Point the design note appendix at branch context-storage-benchmark-experiment (pushed to origin), which preserves the full JMH campaign + native prototypes for reproducibility, rather than describing scaffolding that lives only on that branch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…le (PROF-15271) ContextCombinedBenchmark compares the production per-scope activate+deactivate cycle: the combined native calls (setTraceContext + clearTraceContext) vs the deprecated DirectByteBuffer sequence as dd-trace-java actually drives it (profiler.setContext + 2x setContextAttribute + clear...), on platform + mounted virtual threads, both storage modes. Measured (JDK 21, debug, carrier): native ~136 ns vs DBB ~189 ns -- native ~28% faster, mode-independent, and a win on vthreads too (~147 vs ~193 ns). The DBB baseline here is the faithful production cost: each of its 6 calls pays a currentContext()/ CarrierThreadLocal lookup, which the 2-call combined native cycle avoids. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Contributor
CI Test ResultsRun: #28657970875 | Commit:
Status Overview
Legend: ✅ passed | ❌ failed | ⚪ skipped | 🚫 cancelled Failed Testsglibc-aarch64/debug / 8-j9Job: View logs No detailed failure information available. Check the job logs. glibc-amd64/debug / 11-j9Job: View logs No detailed failure information available. Check the job logs. Summary: Total: 32 | Passed: 30 | Failed: 2 Updated: 2026-07-03 11:53:20 UTC |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Phase 1 of eliminating the virtual-thread use-after-free in OTEL trace-context storage
(PROF-15271), by moving context writes to an all-native path that resolves the current
carrier's OTEP record inside the JNI call — where the mounted virtual thread is pinned to its
carrier by the native frame, so the write is race-free by construction with no cached
DirectByteBufferto dangle.This PR is purely additive: it adds the new API alongside the existing
DirectByteBuffer(DBB) path (now
@Deprecated) and changes no behavior. The crash fix itself lands in Phase 2,when dd-trace-java switches to the new API; until then carrier-scoping (#625 /
94686da) remainsthe mitigation.
Full rationale, options considered (A–E), the JMH campaign, the consumer audit, and the
expand → migrate → contract plan are in the design note:
doc/plans/2026-07-02-all-native-context-storage-design.md.What's in this PR
javaApi.cpp):setTraceContext0/clearTraceContext0(combinedper-scope activate/deactivate) and
setContextValue0/clearContextValue0(single attribute),using the same
detach → mutate → attachvalid-flag protocol asThreadContextso theasync-signal-handler sampler (
ContextApi::get) stays coherent. The native path self-initializesthe LRS
attrs_dataentry (no dependency on theThreadContextctor).JavaProfiler,ContextValueCache): provisional namessetTraceContext/clearTraceContext/setContextValue/clearContextValue. Thevalue → (encoding, utf8)memoization moves out of the per-threadThreadContextinto aprocess-wide lock-free cache (safe because the encoding is a process-global immutable ID).
ThreadContext/ScopeStack/ContextSetterhelper classes are marked
@Deprecated(non-breaking) pointing at the new API.AllNativeContextTest, 9): round-trip, DBB-equivalence, clear semantics,span-transition attribute reset, single-attr set/clear, overflow, rejected values,
DBB↔native coexistence, and coherence of native writes from 512 mounted virtual threads.
ContextCombinedBenchmark).Results
Tag / ProcessContext) is unregressed.
~136 ns vs DBB ~189 ns — ~28% faster (~24% thread mode), a win on virtual threads too, and
mode-independent. The DBB baseline is the faithful production cost (each of its 6 calls pays a
currentContext()/CarrierThreadLocallookup the 2-call native cycle avoids).Migration (expand → migrate → contract)
DatadogProfilingIntegrationto the combined API. Crashfix + perf win land here.
CarrierThreadLocal/OtelContextStoragesubsystem. Breaking, gated on all-consumers-migrated.
Reviewer notes
team so Phase 2 is a clean swap.
detach/attach+ release-fence protocol vsthe sampler's acquire load of
valid(ContextApi::get). Verify on aarch64 (trivial on x86).context-storage-benchmark-experiment.else is test/benchmark. Open item: confirm
ddprof-libstandalone Maven publication (externalconsumers) before the Phase-3 removal.
🤖 Generated with Claude Code