
feat(aws_durable_execution): persist trace context across suspend/resume #17773

Open
joeyzhao2018 wants to merge 46 commits into main from joey/cross-invocation-tracecontext-propagation
@joeyzhao2018 joeyzhao2018 commented Apr 30, 2026

Description

This adds Datadog trace-context checkpointing for AWS Durable Execution workflows so traces can continue across Lambda invocations when a workflow suspends and later resumes.

When a durable handler raises SuspendExecution, the integration now appends an async synthetic _datadog_{N} STEP containing the current propagation headers. On replay, datadog-lambda-python can read the latest checkpoint and reactivate the trace context before the workflow continues; that part is implemented in datadog-lambda-python PR #818.
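As a rough illustration of the replay-side lookup described above, the sketch below picks the highest-numbered checkpoint out of a set of operations. It is not the code from either PR: the `operations` mapping (operation name to JSON payload string) and the helper name `latest_checkpoint_headers` are assumptions made for the example, and the real SDK operation objects will look different.

```python
import json
import re
from typing import Dict, Optional

# Assumption: checkpoint steps are named "_datadog_<N>" and their payload is a
# JSON object of propagation headers. The mapping below is a stand-in for the
# SDK's operation store, not its real shape.
_CHECKPOINT_NAME = re.compile(r"^_datadog_(\d+)$")


def latest_checkpoint_headers(operations: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Return the headers stored in the highest-numbered checkpoint, if any."""
    best_n = -1
    best_payload = None
    for name, payload in operations.items():
        match = _CHECKPOINT_NAME.match(name)
        # Compare N numerically so "_datadog_10" beats "_datadog_9".
        if match and int(match.group(1)) > best_n:
            best_n = int(match.group(1))
            best_payload = payload
    if best_payload is None:
        return None
    try:
        return json.loads(best_payload)
    except ValueError:
        # A corrupt payload is treated the same as "no checkpoint".
        return None
```

Comparing the suffix as an integer matters here: a plain string sort would rank `_datadog_9` above `_datadog_10`.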

The checkpoint writer also:

  • anchors the saved parent ID to the first durable execute span, then reuses that parent ID for the durable execute spans of all following invocations (so all Lambda invocations are siblings)
  • skips redundant checkpoint writes when only volatile per-span fields change
  • allocates checkpoint names with a thread-safe per-state counter so concurrent/map/parallel saves do not collide
  • writes checkpoints only on suspend, not when workflows complete or fail terminally
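The thread-safe name allocation in the list above could look roughly like the following. This is a minimal sketch, assuming a hypothetical `CheckpointCounter` object held once per execution state; the class name and the `next_index` seed are illustrative and not taken from the PR.

```python
import threading


class CheckpointCounter:
    """Hypothetical per-state counter that hands out unique checkpoint names."""

    def __init__(self, next_index: int = 0) -> None:
        # next_index can be seeded past any checkpoints that already exist.
        self._lock = threading.Lock()
        self._next_index = next_index

    def allocate_name(self) -> str:
        """Reserve the next unused name; safe to call from several threads."""
        with self._lock:
            index = self._next_index
            self._next_index += 1
        return f"_datadog_{index}"
```

Holding the lock only around the read-and-increment keeps the critical section small, while still guaranteeing that two concurrent branches never receive the same `_datadog_{N}` name.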

Testing

  • Added targeted unit coverage in tests/contrib/aws_durable_execution_sdk_python/test_trace_checkpoint.py for header stabilization, parent-id anchoring/reuse, traceparent rewriting, replay diff suppression, checkpoint numbering, concurrent allocation, and no-op failure paths.
  • Tested against a real Lambda durable function.
[Screenshots: 2026-05-08 at 2:53:02 PM and 2026-05-08 at 2:55:26 PM]

Risks

Low to medium. This only writes Datadog checkpoint metadata on the SuspendExecution path, and failures in the checkpoint writer are swallowed so workflow execution should not be affected.
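The "suspend only, never fail the workflow" behavior can be sketched as a small wrapper. Everything here is illustrative: `SuspendExecution` is redefined locally as a stand-in for the SDK's exception so the snippet runs on its own, and `run_with_trace_checkpoint` and `save_checkpoint` are hypothetical names.

```python
import logging
from typing import Callable, TypeVar

log = logging.getLogger(__name__)
T = TypeVar("T")


class SuspendExecution(Exception):
    """Local stand-in for the SDK's suspend signal (assumption for this sketch)."""


def run_with_trace_checkpoint(
    handler: Callable[[], T], save_checkpoint: Callable[[], None]
) -> T:
    """Run handler; on suspend, try to save a checkpoint, then re-raise."""
    try:
        return handler()
    except SuspendExecution:
        try:
            save_checkpoint()
        except Exception:
            # A failed checkpoint write is logged and dropped so it cannot
            # change workflow behavior.
            log.debug("failed to save trace checkpoint", exc_info=True)
        raise  # the original SuspendExecution still propagates
```

Normal completion and terminal failures bypass `save_checkpoint` entirely, because only the `SuspendExecution` branch calls it.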

The main behavioral risks are:

  • a user-defined step name colliding with the reserved _datadog_* prefix
  • unexpected SDK changes around state.operations, create_checkpoint, or operation parent IDs

Note

To avoid creating too many checkpoints, the dd=p: part is excluded when comparing the trace context.

Redundant writes are suppressed by diffing the new headers against the stored payload of the highest-N existing _datadog_* operation. Only x-datadog-parent-id and the dd=p: entry of tracestate are stripped before comparison, so changes to sampling priority, decision-maker, origin, or propagation tags still trigger a fresh save.
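A minimal sketch of that comparison is shown below, assuming the headers are plain string dictionaries and that the Datadog member of tracestate has the form `dd=key:value;key:value`. The function names are hypothetical and the traceparent header is left untouched, since the description only lists the two stripped fields.

```python
from typing import Dict, Optional

VOLATILE_HEADER = "x-datadog-parent-id"


def strip_dd_parent(tracestate: str) -> str:
    """Drop the p:<span-id> sub-entry from the dd= member of a tracestate value."""
    members = []
    for member in tracestate.split(","):
        member = member.strip()
        if member.startswith("dd="):
            entries = [
                entry
                for entry in member[3:].split(";")
                if entry and not entry.startswith("p:")
            ]
            member = "dd=" + ";".join(entries)
        if member:
            members.append(member)
    return ",".join(members)


def comparable(headers: Dict[str, str]) -> Dict[str, str]:
    """Copy headers with the per-span fields removed, for diffing only."""
    stable = {k: v for k, v in headers.items() if k.lower() != VOLATILE_HEADER}
    if "tracestate" in stable:
        stable["tracestate"] = strip_dd_parent(stable["tracestate"])
    return stable


def needs_new_checkpoint(
    new_headers: Dict[str, str], stored_headers: Optional[Dict[str, str]]
) -> bool:
    """True when no checkpoint exists yet or a non-volatile field changed."""
    if stored_headers is None:
        return True
    return comparable(new_headers) != comparable(stored_headers)
```

With this shape, a change that touches only the span ID produces equal `comparable` dictionaries and no write, while a changed sampling priority (`s:` inside `dd=`, or `x-datadog-sampling-priority`) still yields a difference.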

pr-commenter Bot commented Apr 30, 2026

Performance SLOs

Comparing candidate joey/cross-invocation-tracecontext-propagation (db67ae1) with baseline joey/apm-ai-toolkit/aws-durable-execution-sdk-python (92f0e53)

📈 Performance Regressions (2 suites)
📈 iastaspects - 118/118

✅ add_aspect

Time: ✅ 104.846µs (SLO: <130.000µs 📉 -19.3%) vs baseline: +4.3%

Memory: ✅ 43.988MB (SLO: <46.000MB -4.4%) vs baseline: +4.8%


✅ add_inplace_aspect

Time: ✅ 101.578µs (SLO: <130.000µs 📉 -21.9%) vs baseline: -0.5%

Memory: ✅ 43.921MB (SLO: <46.000MB -4.5%) vs baseline: +5.0%


✅ add_inplace_noaspect

Time: ✅ 28.128µs (SLO: <40.000µs 📉 -29.7%) vs baseline: -2.0%

Memory: ✅ 43.872MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%


✅ add_noaspect

Time: ✅ 49.152µs (SLO: <70.000µs 📉 -29.8%) vs baseline: +0.3%

Memory: ✅ 43.857MB (SLO: <46.000MB -4.7%) vs baseline: +4.6%


✅ bytearray_aspect

Time: ✅ 265.249µs (SLO: <400.000µs 📉 -33.7%) vs baseline: -0.2%

Memory: ✅ 43.845MB (SLO: <46.000MB -4.7%) vs baseline: +4.8%


✅ bytearray_extend_aspect

Time: ✅ 654.784µs (SLO: <800.000µs 📉 -18.2%) vs baseline: +0.2%

Memory: ✅ 43.870MB (SLO: <46.000MB -4.6%) vs baseline: +4.5%


✅ bytearray_extend_noaspect

Time: ✅ 273.422µs (SLO: <400.000µs 📉 -31.6%) vs baseline: +0.7%

Memory: ✅ 43.967MB (SLO: <46.000MB -4.4%) vs baseline: +5.1%


✅ bytearray_noaspect

Time: ✅ 148.822µs (SLO: <300.000µs 📉 -50.4%) vs baseline: +0.9%

Memory: ✅ 43.975MB (SLO: <46.000MB -4.4%) vs baseline: +5.5%


✅ bytes_aspect

Time: ✅ 231.096µs (SLO: <300.000µs 📉 -23.0%) vs baseline: +0.2%

Memory: ✅ 43.962MB (SLO: <46.000MB -4.4%) vs baseline: +5.2%


✅ bytes_noaspect

Time: ✅ 139.112µs (SLO: <200.000µs 📉 -30.4%) vs baseline: +0.4%

Memory: ✅ 43.791MB (SLO: <46.000MB -4.8%) vs baseline: +4.6%


✅ bytesio_aspect

Time: ✅ 3.824ms (SLO: <5.000ms 📉 -23.5%) vs baseline: +0.4%

Memory: ✅ 43.870MB (SLO: <46.000MB -4.6%) vs baseline: +4.6%


✅ bytesio_noaspect

Time: ✅ 322.401µs (SLO: <420.000µs 📉 -23.2%) vs baseline: +0.4%

Memory: ✅ 43.852MB (SLO: <46.000MB -4.7%) vs baseline: +4.6%


✅ capitalize_aspect

Time: ✅ 89.959µs (SLO: <300.000µs 📉 -70.0%) vs baseline: +0.8%

Memory: ✅ 43.999MB (SLO: <46.000MB -4.3%) vs baseline: +5.0%


✅ capitalize_noaspect

Time: ✅ 274.466µs (SLO: <300.000µs -8.5%) vs baseline: +8.2%

Memory: ✅ 43.934MB (SLO: <46.000MB -4.5%) vs baseline: +5.0%


✅ casefold_aspect

Time: ✅ 89.525µs (SLO: <500.000µs 📉 -82.1%) vs baseline: ~same

Memory: ✅ 43.930MB (SLO: <46.000MB -4.5%) vs baseline: +4.9%


✅ casefold_noaspect

Time: ✅ 312.184µs (SLO: <500.000µs 📉 -37.6%) vs baseline: +1.1%

Memory: ✅ 43.962MB (SLO: <46.000MB -4.4%) vs baseline: +4.8%


✅ decode_aspect

Time: ✅ 87.271µs (SLO: <100.000µs 📉 -12.7%) vs baseline: ~same

Memory: ✅ 43.909MB (SLO: <46.000MB -4.5%) vs baseline: +5.0%


✅ decode_noaspect

Time: ✅ 155.960µs (SLO: <210.000µs 📉 -25.7%) vs baseline: -1.0%

Memory: ✅ 43.868MB (SLO: <46.000MB -4.6%) vs baseline: +4.7%


✅ encode_aspect

Time: ✅ 84.877µs (SLO: <200.000µs 📉 -57.6%) vs baseline: +0.6%

Memory: ✅ 43.902MB (SLO: <46.000MB -4.6%) vs baseline: +4.8%


✅ encode_noaspect

Time: ✅ 145.266µs (SLO: <200.000µs 📉 -27.4%) vs baseline: +0.1%

Memory: ✅ 43.876MB (SLO: <46.000MB -4.6%) vs baseline: +4.7%


✅ format_aspect

Time: ✅ 14.649ms (SLO: <19.200ms 📉 -23.7%) vs baseline: +0.6%

Memory: ✅ 43.933MB (SLO: <46.000MB -4.5%) vs baseline: +4.9%


✅ format_map_aspect

Time: ✅ 16.368ms (SLO: <21.500ms 📉 -23.9%) vs baseline: ~same

Memory: ✅ 43.951MB (SLO: <46.000MB -4.5%) vs baseline: +5.1%


✅ format_map_noaspect

Time: ✅ 361.466µs (SLO: <500.000µs 📉 -27.7%) vs baseline: ~same

Memory: ✅ 43.947MB (SLO: <46.000MB -4.5%) vs baseline: +4.9%


✅ format_noaspect

Time: ✅ 310.122µs (SLO: <500.000µs 📉 -38.0%) vs baseline: ~same

Memory: ✅ 43.907MB (SLO: <46.000MB -4.6%) vs baseline: +5.0%


✅ index_aspect

Time: ✅ 127.993µs (SLO: <300.000µs 📉 -57.3%) vs baseline: +5.0%

Memory: ✅ 43.998MB (SLO: <46.000MB -4.4%) vs baseline: +5.3%


✅ index_noaspect

Time: ✅ 41.123µs (SLO: <300.000µs 📉 -86.3%) vs baseline: +0.6%

Memory: ✅ 43.919MB (SLO: <46.000MB -4.5%) vs baseline: +5.0%


✅ join_aspect

Time: ✅ 214.200µs (SLO: <300.000µs 📉 -28.6%) vs baseline: -0.3%

Memory: ✅ 43.880MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%


✅ join_noaspect

Time: ✅ 141.669µs (SLO: <300.000µs 📉 -52.8%) vs baseline: -0.9%

Memory: ✅ 43.927MB (SLO: <46.000MB -4.5%) vs baseline: +5.1%


✅ ljust_aspect

Time: ✅ 587.609µs (SLO: <700.000µs 📉 -16.1%) vs baseline: 📈 +16.7%

Memory: ✅ 43.919MB (SLO: <46.000MB -4.5%) vs baseline: +5.0%


✅ ljust_noaspect

Time: ✅ 260.659µs (SLO: <300.000µs 📉 -13.1%) vs baseline: +0.2%

Memory: ✅ 43.877MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%


✅ lower_aspect

Time: ✅ 309.436µs (SLO: <500.000µs 📉 -38.1%) vs baseline: -1.3%

Memory: ✅ 43.974MB (SLO: <46.000MB -4.4%) vs baseline: +5.2%


✅ lower_noaspect

Time: ✅ 240.155µs (SLO: <300.000µs 📉 -19.9%) vs baseline: +0.5%

Memory: ✅ 43.887MB (SLO: <46.000MB -4.6%) vs baseline: +5.0%


✅ lstrip_aspect

Time: ✅ 0.279ms (SLO: <3.000ms 📉 -90.7%) vs baseline: +1.0%

Memory: ✅ 43.857MB (SLO: <46.000MB -4.7%) vs baseline: +4.7%


✅ lstrip_noaspect

Time: ✅ 0.179ms (SLO: <3.000ms 📉 -94.0%) vs baseline: +1.1%

Memory: ✅ 43.947MB (SLO: <46.000MB -4.5%) vs baseline: +5.0%


✅ modulo_aspect

Time: ✅ 14.239ms (SLO: <18.750ms 📉 -24.1%) vs baseline: -0.1%

Memory: ✅ 43.953MB (SLO: <46.000MB -4.4%) vs baseline: +4.9%


✅ modulo_aspect_for_bytearray_bytearray

Time: ✅ 14.755ms (SLO: <19.350ms 📉 -23.7%) vs baseline: +0.2%

Memory: ✅ 43.943MB (SLO: <46.000MB -4.5%) vs baseline: +4.6%


✅ modulo_aspect_for_bytes

Time: ✅ 14.393ms (SLO: <18.900ms 📉 -23.8%) vs baseline: -0.3%

Memory: ✅ 43.906MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%


✅ modulo_aspect_for_bytes_bytearray

Time: ✅ 14.544ms (SLO: <19.150ms 📉 -24.1%) vs baseline: -0.2%

Memory: ✅ 44.052MB (SLO: <46.000MB -4.2%) vs baseline: +5.1%


✅ modulo_noaspect

Time: ✅ 0.366ms (SLO: <3.000ms 📉 -87.8%) vs baseline: +1.2%

Memory: ✅ 43.877MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%


✅ replace_aspect

Time: ✅ 18.330ms (SLO: <24.000ms 📉 -23.6%) vs baseline: +0.2%

Memory: ✅ 43.940MB (SLO: <46.000MB -4.5%) vs baseline: +5.0%


✅ replace_noaspect

Time: ✅ 288.694µs (SLO: <400.000µs 📉 -27.8%) vs baseline: +0.4%

Memory: ✅ 43.943MB (SLO: <46.000MB -4.5%) vs baseline: +4.9%


✅ repr_aspect

Time: ✅ 329.026µs (SLO: <420.000µs 📉 -21.7%) vs baseline: +1.0%

Memory: ✅ 43.837MB (SLO: <46.000MB -4.7%) vs baseline: +4.9%


✅ repr_noaspect

Time: ✅ 46.298µs (SLO: <90.000µs 📉 -48.6%) vs baseline: -1.2%

Memory: ✅ 44.006MB (SLO: <46.000MB -4.3%) vs baseline: +5.4%


✅ rstrip_aspect

Time: ✅ 390.488µs (SLO: <500.000µs 📉 -21.9%) vs baseline: +1.3%

Memory: ✅ 43.966MB (SLO: <46.000MB -4.4%) vs baseline: +5.2%


✅ rstrip_noaspect

Time: ✅ 184.620µs (SLO: <300.000µs 📉 -38.5%) vs baseline: -0.9%

Memory: ✅ 43.830MB (SLO: <46.000MB -4.7%) vs baseline: +4.3%


✅ slice_aspect

Time: ✅ 180.609µs (SLO: <300.000µs 📉 -39.8%) vs baseline: -1.2%

Memory: ✅ 43.929MB (SLO: <46.000MB -4.5%) vs baseline: +5.0%


✅ slice_noaspect

Time: ✅ 54.646µs (SLO: <90.000µs 📉 -39.3%) vs baseline: +1.6%

Memory: ✅ 43.916MB (SLO: <46.000MB -4.5%) vs baseline: +4.7%


✅ stringio_aspect

Time: ✅ 4.598ms (SLO: <5.000ms -8.0%) vs baseline: 📈 +18.0%

Memory: ✅ 43.994MB (SLO: <46.000MB -4.4%) vs baseline: +5.0%


✅ stringio_noaspect

Time: ✅ 359.247µs (SLO: <500.000µs 📉 -28.2%) vs baseline: +0.4%

Memory: ✅ 43.924MB (SLO: <46.000MB -4.5%) vs baseline: +4.9%


✅ strip_aspect

Time: ✅ 277.018µs (SLO: <350.000µs 📉 -20.9%) vs baseline: +1.4%

Memory: ✅ 43.956MB (SLO: <46.000MB -4.4%) vs baseline: +5.1%


✅ strip_noaspect

Time: ✅ 177.441µs (SLO: <240.000µs 📉 -26.1%) vs baseline: -0.6%

Memory: ✅ 43.920MB (SLO: <46.000MB -4.5%) vs baseline: +4.7%


✅ swapcase_aspect

Time: ✅ 348.183µs (SLO: <500.000µs 📉 -30.4%) vs baseline: +0.8%

Memory: ✅ 43.929MB (SLO: <46.000MB -4.5%) vs baseline: +4.8%


✅ swapcase_noaspect

Time: ✅ 272.516µs (SLO: <400.000µs 📉 -31.9%) vs baseline: -0.7%

Memory: ✅ 44.000MB (SLO: <46.000MB -4.3%) vs baseline: +5.2%


✅ title_aspect

Time: ✅ 333.461µs (SLO: <500.000µs 📉 -33.3%) vs baseline: -1.1%

Memory: ✅ 43.927MB (SLO: <46.000MB -4.5%) vs baseline: +5.1%


✅ title_noaspect

Time: ✅ 269.269µs (SLO: <400.000µs 📉 -32.7%) vs baseline: +3.4%

Memory: ✅ 43.952MB (SLO: <46.000MB -4.5%) vs baseline: +4.9%


✅ translate_aspect

Time: ✅ 513.050µs (SLO: <700.000µs 📉 -26.7%) vs baseline: +0.1%

Memory: ✅ 43.900MB (SLO: <46.000MB -4.6%) vs baseline: +4.8%


✅ translate_noaspect

Time: ✅ 431.340µs (SLO: <500.000µs 📉 -13.7%) vs baseline: -0.6%

Memory: ✅ 43.829MB (SLO: <46.000MB -4.7%) vs baseline: +4.9%


✅ upper_aspect

Time: ✅ 311.331µs (SLO: <500.000µs 📉 -37.7%) vs baseline: +0.3%

Memory: ✅ 43.931MB (SLO: <46.000MB -4.5%) vs baseline: +4.7%


✅ upper_noaspect

Time: ✅ 239.055µs (SLO: <400.000µs 📉 -40.2%) vs baseline: +0.7%

Memory: ✅ 43.882MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%


📈 iastaspectsospath - 24/24

✅ ospathbasename_aspect

Time: ✅ 522.536µs (SLO: <700.000µs 📉 -25.4%) vs baseline: 📈 +23.2%

Memory: ✅ 43.855MB (SLO: <46.000MB -4.7%) vs baseline: +4.8%


✅ ospathbasename_noaspect

Time: ✅ 429.429µs (SLO: <700.000µs 📉 -38.7%) vs baseline: +0.9%

Memory: ✅ 43.866MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%


✅ ospathjoin_aspect

Time: ✅ 630.061µs (SLO: <700.000µs -10.0%) vs baseline: +0.2%

Memory: ✅ 43.949MB (SLO: <46.000MB -4.5%) vs baseline: +4.9%


✅ ospathjoin_noaspect

Time: ✅ 635.077µs (SLO: <700.000µs -9.3%) vs baseline: -0.3%

Memory: ✅ 43.847MB (SLO: <46.000MB -4.7%) vs baseline: +4.7%


✅ ospathnormcase_aspect

Time: ✅ 350.203µs (SLO: <700.000µs 📉 -50.0%) vs baseline: -0.5%

Memory: ✅ 43.880MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%


✅ ospathnormcase_noaspect

Time: ✅ 356.197µs (SLO: <700.000µs 📉 -49.1%) vs baseline: -0.6%

Memory: ✅ 43.967MB (SLO: <46.000MB -4.4%) vs baseline: +5.0%


✅ ospathsplit_aspect

Time: ✅ 480.753µs (SLO: <700.000µs 📉 -31.3%) vs baseline: -0.3%

Memory: ✅ 43.859MB (SLO: <46.000MB -4.7%) vs baseline: +4.6%


✅ ospathsplit_noaspect

Time: ✅ 492.526µs (SLO: <700.000µs 📉 -29.6%) vs baseline: +0.1%

Memory: ✅ 43.782MB (SLO: <46.000MB -4.8%) vs baseline: +4.5%


✅ ospathsplitdrive_aspect

Time: ✅ 369.938µs (SLO: <700.000µs 📉 -47.2%) vs baseline: -0.3%

Memory: ✅ 43.899MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%


✅ ospathsplitdrive_noaspect

Time: ✅ 73.230µs (SLO: <700.000µs 📉 -89.5%) vs baseline: +0.3%

Memory: ✅ 43.955MB (SLO: <46.000MB -4.4%) vs baseline: +5.0%


✅ ospathsplitext_aspect

Time: ✅ 458.394µs (SLO: <700.000µs 📉 -34.5%) vs baseline: -0.3%

Memory: ✅ 43.915MB (SLO: <46.000MB -4.5%) vs baseline: +4.9%


✅ ospathsplitext_noaspect

Time: ✅ 464.313µs (SLO: <700.000µs 📉 -33.7%) vs baseline: +0.3%

Memory: ✅ 43.907MB (SLO: <46.000MB -4.5%) vs baseline: +5.1%

🟡 Near SLO Breach (7 suites)
🟡 djangosimple - 28/28

✅ appsec

Time: ✅ 19.631ms (SLO: <22.300ms 📉 -12.0%) vs baseline: ~same

Memory: ✅ 71.612MB (SLO: <73.500MB -2.6%) vs baseline: +5.0%


✅ exception-replay-enabled

Time: ✅ 1.371ms (SLO: <1.450ms -5.5%) vs baseline: ~same

Memory: ✅ 69.837MB (SLO: <71.500MB -2.3%) vs baseline: +4.9%


✅ iast

Time: ✅ 19.644ms (SLO: <22.250ms 📉 -11.7%) vs baseline: -0.8%

Memory: ✅ 71.631MB (SLO: <75.000MB -4.5%) vs baseline: +5.2%


✅ profiler

Time: ✅ 15.201ms (SLO: <16.550ms -8.2%) vs baseline: ~same

Memory: ✅ 60.391MB (SLO: <61.000MB 🟡 -1.0%) vs baseline: +4.9%


✅ resource-renaming

Time: ✅ 19.558ms (SLO: <21.750ms 📉 -10.1%) vs baseline: +0.3%

Memory: ✅ 71.602MB (SLO: <73.500MB -2.6%) vs baseline: +4.9%


✅ span-code-origin

Time: ✅ 19.986ms (SLO: <28.200ms 📉 -29.1%) vs baseline: +0.2%

Memory: ✅ 71.576MB (SLO: <75.000MB -4.6%) vs baseline: +4.8%


✅ tracer

Time: ✅ 19.616ms (SLO: <21.750ms -9.8%) vs baseline: -0.6%

Memory: ✅ 71.329MB (SLO: <75.000MB -4.9%) vs baseline: +4.4%


✅ tracer-and-profiler

Time: ✅ 21.035ms (SLO: <23.500ms 📉 -10.5%) vs baseline: +0.6%

Memory: ✅ 73.482MB (SLO: <75.000MB -2.0%) vs baseline: +4.7%


✅ tracer-dont-create-db-spans

Time: ✅ 19.694ms (SLO: <21.500ms -8.4%) vs baseline: ~same

Memory: ✅ 71.513MB (SLO: <75.000MB -4.6%) vs baseline: +4.8%


✅ tracer-minimal

Time: ✅ 17.793ms (SLO: <18.500ms -3.8%) vs baseline: -1.2%

Memory: ✅ 71.523MB (SLO: <75.000MB -4.6%) vs baseline: +4.9%


✅ tracer-no-caches

Time: ✅ 18.806ms (SLO: <19.650ms -4.3%) vs baseline: ~same

Memory: ✅ 71.457MB (SLO: <75.000MB -4.7%) vs baseline: +4.9%


✅ tracer-no-databases

Time: ✅ 20.550ms (SLO: <21.100ms -2.6%) vs baseline: -0.8%

Memory: ✅ 71.643MB (SLO: <75.000MB -4.5%) vs baseline: +5.2%


✅ tracer-no-middleware

Time: ✅ 19.350ms (SLO: <21.500ms -10.0%) vs baseline: -0.3%

Memory: ✅ 71.596MB (SLO: <75.000MB -4.5%) vs baseline: +5.2%


✅ tracer-no-templates

Time: ✅ 19.548ms (SLO: <22.000ms 📉 -11.1%) vs baseline: +1.0%

Memory: ✅ 71.722MB (SLO: <73.500MB -2.4%) vs baseline: +5.2%


🟡 iastpropagation - 8/8

✅ no-propagation

Time: ✅ 48.726µs (SLO: <60.000µs 📉 -18.8%) vs baseline: -0.5%

Memory: ✅ 40.855MB (SLO: <42.000MB -2.7%) vs baseline: +3.9%


✅ propagation_enabled

Time: ✅ 138.970µs (SLO: <190.000µs 📉 -26.9%) vs baseline: +0.3%

Memory: ✅ 41.229MB (SLO: <42.000MB 🟡 -1.8%) vs baseline: +5.5%


✅ propagation_enabled_100

Time: ✅ 1.572ms (SLO: <2.300ms 📉 -31.7%) vs baseline: +0.4%

Memory: ✅ 41.012MB (SLO: <42.000MB -2.4%) vs baseline: +4.8%


✅ propagation_enabled_1000

Time: ✅ 29.349ms (SLO: <34.550ms 📉 -15.1%) vs baseline: +0.3%

Memory: ✅ 40.835MB (SLO: <42.000MB -2.8%) vs baseline: +4.5%


🟡 otelspan - 22/22

✅ add-event

Time: ✅ 41.627ms (SLO: <47.150ms 📉 -11.7%) vs baseline: ~same

Memory: ✅ 41.472MB (SLO: <47.000MB 📉 -11.8%) vs baseline: +4.3%


✅ add-metrics

Time: ✅ 234.535ms (SLO: <344.800ms 📉 -32.0%) vs baseline: ~same

Memory: ✅ 45.622MB (SLO: <47.500MB -4.0%) vs baseline: +5.1%


✅ add-tags

Time: ✅ 264.274ms (SLO: <330.000ms 📉 -19.9%) vs baseline: -0.5%

Memory: ✅ 45.536MB (SLO: <47.500MB -4.1%) vs baseline: +4.7%


✅ get-context

Time: ✅ 80.953ms (SLO: <92.350ms 📉 -12.3%) vs baseline: ~same

Memory: ✅ 41.288MB (SLO: <46.500MB 📉 -11.2%) vs baseline: +5.3%


✅ is-recording

Time: ✅ 37.977ms (SLO: <44.500ms 📉 -14.7%) vs baseline: +0.4%

Memory: ✅ 41.117MB (SLO: <47.500MB 📉 -13.4%) vs baseline: +5.2%


✅ record-exception

Time: ✅ 62.680ms (SLO: <67.650ms -7.3%) vs baseline: -0.1%

Memory: ✅ 41.855MB (SLO: <47.000MB 📉 -10.9%) vs baseline: +5.1%


✅ set-status

Time: ✅ 43.722ms (SLO: <50.400ms 📉 -13.2%) vs baseline: +0.6%

Memory: ✅ 40.958MB (SLO: <47.000MB 📉 -12.9%) vs baseline: +4.9%


✅ start

Time: ✅ 38.863ms (SLO: <44.500ms 📉 -12.7%) vs baseline: +4.3%

Memory: ✅ 40.987MB (SLO: <47.000MB 📉 -12.8%) vs baseline: +4.9%


✅ start-finish

Time: ✅ 89.959ms (SLO: <92.000ms -2.2%) vs baseline: +0.6%

Memory: ✅ 38.830MB (SLO: <46.500MB 📉 -16.5%) vs baseline: +4.9%


✅ start-finish-telemetry

Time: ✅ 91.263ms (SLO: <93.000ms 🟡 -1.9%) vs baseline: -0.4%

Memory: ✅ 38.692MB (SLO: <46.500MB 📉 -16.8%) vs baseline: +4.7%


✅ update-name

Time: ✅ 39.189ms (SLO: <45.150ms 📉 -13.2%) vs baseline: +0.5%

Memory: ✅ 41.038MB (SLO: <47.000MB 📉 -12.7%) vs baseline: +5.1%


🟡 packagesupdateimporteddependencies - 24/24 (1 unstable)

✅ import_many

Time: ✅ 169.027µs (SLO: <170.000µs 🟡 -0.6%) vs baseline: +0.4%

Memory: ✅ 41.459MB (SLO: <46.000MB -9.9%) vs baseline: +5.2%


✅ import_many_cached

Time: ✅ 131.168µs (SLO: <170.000µs 📉 -22.8%) vs baseline: -1.1%

Memory: ✅ 41.347MB (SLO: <46.000MB 📉 -10.1%) vs baseline: +5.0%


✅ import_many_stdlib

Time: ✅ 1.257ms (SLO: <1.750ms 📉 -28.2%) vs baseline: +0.6%

Memory: ✅ 41.222MB (SLO: <46.000MB 📉 -10.4%) vs baseline: +4.9%


⚠️ import_many_stdlib_cached

Time: ⚠️ 0.631ms (SLO: <1.100ms 📉 -42.6%) vs baseline: +1.0%

Memory: ✅ 41.165MB (SLO: <46.000MB 📉 -10.5%) vs baseline: +4.9%


✅ import_many_unknown

Time: ✅ 898.752µs (SLO: <960.000µs -6.4%) vs baseline: +1.0%

Memory: ✅ 41.177MB (SLO: <46.000MB 📉 -10.5%) vs baseline: +4.5%


✅ import_many_unknown_cached

Time: ✅ 853.210µs (SLO: <870.000µs 🟡 -1.9%) vs baseline: +0.4%

Memory: ✅ 41.463MB (SLO: <46.000MB -9.9%) vs baseline: +6.0%


✅ import_one

Time: ✅ 22.913µs (SLO: <30.000µs 📉 -23.6%) vs baseline: ~same

Memory: ✅ 41.380MB (SLO: <46.000MB 📉 -10.0%) vs baseline: +4.9%


✅ import_one_cache

Time: ✅ 8.976µs (SLO: <10.000µs 📉 -10.2%) vs baseline: +0.2%

Memory: ✅ 41.097MB (SLO: <46.000MB 📉 -10.7%) vs baseline: +4.4%


✅ import_one_stdlib

Time: ✅ 21.831µs (SLO: <30.000µs 📉 -27.2%) vs baseline: -0.6%

Memory: ✅ 41.186MB (SLO: <46.000MB 📉 -10.5%) vs baseline: +4.7%


✅ import_one_stdlib_cache

Time: ✅ 8.918µs (SLO: <10.000µs 📉 -10.8%) vs baseline: -0.2%

Memory: ✅ 41.251MB (SLO: <46.000MB 📉 -10.3%) vs baseline: +4.9%


✅ import_one_unknown

Time: ✅ 49.780µs (SLO: <51.000µs -2.4%) vs baseline: -0.3%

Memory: ✅ 40.951MB (SLO: <46.000MB 📉 -11.0%) vs baseline: +5.3%


✅ import_one_unknown_cache

Time: ✅ 8.995µs (SLO: <10.000µs 📉 -10.0%) vs baseline: +0.2%

Memory: ✅ 41.200MB (SLO: <43.000MB -4.2%) vs baseline: +4.4%


🟡 recursivecomputation - 8/8

✅ deep

Time: ✅ 311.529ms (SLO: <320.950ms -2.9%) vs baseline: -0.5%

Memory: ✅ 37.316MB (SLO: <38.750MB -3.7%) vs baseline: +4.8%


✅ deep-profiled

Time: ✅ 335.468ms (SLO: <359.150ms -6.6%) vs baseline: +0.4%

Memory: ✅ 43.765MB (SLO: <46.000MB -4.9%) vs baseline: +5.0%


✅ medium

Time: ✅ 7.345ms (SLO: <7.450ms 🟡 -1.4%) vs baseline: ~same

Memory: ✅ 36.608MB (SLO: <38.000MB -3.7%) vs baseline: +6.2%


✅ shallow

Time: ✅ 1.036ms (SLO: <1.050ms 🟡 -1.3%) vs baseline: +2.3%

Memory: ✅ 36.294MB (SLO: <38.000MB -4.5%) vs baseline: +5.3%


🟡 span - 26/26

✅ add-event

Time: ✅ 20.466ms (SLO: <22.500ms -9.0%) vs baseline: +0.2%

Memory: ✅ 38.555MB (SLO: <53.000MB 📉 -27.3%) vs baseline: +4.6%


✅ add-metrics

Time: ✅ 89.976ms (SLO: <93.500ms -3.8%) vs baseline: -0.4%

Memory: ✅ 42.743MB (SLO: <53.000MB 📉 -19.4%) vs baseline: +4.4%


✅ add-tags

Time: ✅ 136.225ms (SLO: <155.000ms 📉 -12.1%) vs baseline: +0.4%

Memory: ✅ 42.578MB (SLO: <53.000MB 📉 -19.7%) vs baseline: +4.5%


✅ get-context

Time: ✅ 17.974ms (SLO: <20.500ms 📉 -12.3%) vs baseline: ~same

Memory: ✅ 38.181MB (SLO: <53.000MB 📉 -28.0%) vs baseline: +5.0%


✅ is-recording

Time: ✅ 18.047ms (SLO: <20.500ms 📉 -12.0%) vs baseline: -0.3%

Memory: ✅ 38.083MB (SLO: <53.000MB 📉 -28.1%) vs baseline: +4.4%


✅ record-exception

Time: ✅ 41.941ms (SLO: <42.000ms 🟡 -0.1%) vs baseline: ~same

Memory: ✅ 38.909MB (SLO: <53.000MB 📉 -26.6%) vs baseline: +4.8%


✅ set-status

Time: ✅ 19.585ms (SLO: <22.000ms 📉 -11.0%) vs baseline: -0.1%

Memory: ✅ 38.083MB (SLO: <53.000MB 📉 -28.1%) vs baseline: +4.7%


✅ start

Time: ✅ 18.993ms (SLO: <20.500ms -7.3%) vs baseline: +6.1%

Memory: ✅ 38.122MB (SLO: <53.000MB 📉 -28.1%) vs baseline: +4.9%


✅ start-finish

Time: ✅ 57.907ms (SLO: <58.500ms 🟡 -1.0%) vs baseline: ~same

Memory: ✅ 36.156MB (SLO: <38.000MB -4.9%) vs baseline: +4.7%


✅ start-finish-telemetry

Time: ✅ 59.156ms (SLO: <60.000ms 🟡 -1.4%) vs baseline: -0.2%

Memory: ✅ 36.255MB (SLO: <38.000MB -4.6%) vs baseline: +5.2%


✅ start-finish-traceid128

Time: ✅ 60.152ms (SLO: <62.000ms -3.0%) vs baseline: -0.4%

Memory: ✅ 36.176MB (SLO: <38.000MB -4.8%) vs baseline: +4.7%


✅ start-traceid128

Time: ✅ 17.900ms (SLO: <22.500ms 📉 -20.4%) vs baseline: +0.6%

Memory: ✅ 38.162MB (SLO: <53.000MB 📉 -28.0%) vs baseline: +4.8%


✅ update-name

Time: ✅ 18.505ms (SLO: <22.000ms 📉 -15.9%) vs baseline: -0.5%

Memory: ✅ 38.179MB (SLO: <53.000MB 📉 -28.0%) vs baseline: +4.9%


🟡 tracer - 6/6

✅ large

Time: ✅ 32.883ms (SLO: <33.950ms -3.1%) vs baseline: ~same

Memory: ✅ 37.297MB (SLO: <39.250MB -5.0%) vs baseline: +4.6%


✅ medium

Time: ✅ 3.183ms (SLO: <3.500ms -9.0%) vs baseline: -0.2%

Memory: ✅ 36.255MB (SLO: <38.750MB -6.4%) vs baseline: +5.2%


✅ small

Time: ✅ 386.510µs (SLO: <390.000µs 🟡 -0.9%) vs baseline: +3.7%

Memory: ✅ 36.196MB (SLO: <38.750MB -6.6%) vs baseline: +5.3%

⚠️ Unstable Tests (1 suite)
⚠️ coreapiscenario - 10/10 (1 unstable)

⚠️ context_with_data_listeners

Time: ⚠️ 13.644µs (SLO: <20.000µs 📉 -31.8%) vs baseline: +0.2%

Memory: ✅ 36.215MB (SLO: <38.000MB -4.7%) vs baseline: +4.7%


✅ context_with_data_no_listeners

Time: ✅ 3.628µs (SLO: <10.000µs 📉 -63.7%) vs baseline: +1.3%

Memory: ✅ 36.333MB (SLO: <38.000MB -4.4%) vs baseline: +5.5%


✅ get_item_exists

Time: ✅ 0.586µs (SLO: <10.000µs 📉 -94.1%) vs baseline: -0.1%

Memory: ✅ 36.156MB (SLO: <38.000MB -4.9%) vs baseline: +4.6%


✅ get_item_missing

Time: ✅ 0.643µs (SLO: <10.000µs 📉 -93.6%) vs baseline: +0.8%

Memory: ✅ 36.196MB (SLO: <38.000MB -4.7%) vs baseline: +5.1%


✅ set_item

Time: ✅ 24.422µs (SLO: <30.000µs 📉 -18.6%) vs baseline: ~same

Memory: ✅ 36.294MB (SLO: <38.000MB -4.5%) vs baseline: +4.9%

✅ All Tests Passing (17 suites)
codeprovenancefork - 2/2

✅ fork-10

Time: ✅ 2.195s (SLO: <2.400s -8.5%) vs baseline: +2.4%

Memory: ✅ 17.360MB (SLO: <20.000MB 📉 -13.2%) vs baseline: +4.9%


errortrackingdjangosimple - 6/6

✅ errortracking-enabled-all

Time: ✅ 16.247ms (SLO: <19.850ms 📉 -18.1%) vs baseline: ~same

Memory: ✅ 71.349MB (SLO: <75.000MB -4.9%) vs baseline: +4.8%


✅ errortracking-enabled-user

Time: ✅ 16.205ms (SLO: <19.400ms 📉 -16.5%) vs baseline: -0.8%

Memory: ✅ 71.310MB (SLO: <75.000MB -4.9%) vs baseline: +4.7%


✅ tracer-enabled

Time: ✅ 17.382ms (SLO: <19.450ms 📉 -10.6%) vs baseline: -0.2%

Memory: ✅ 71.388MB (SLO: <75.000MB -4.8%) vs baseline: +4.9%


errortrackingflasksqli - 6/6

✅ errortracking-enabled-all

Time: ✅ 2.117ms (SLO: <2.300ms -7.9%) vs baseline: -0.1%

Memory: ✅ 58.412MB (SLO: <60.000MB -2.6%) vs baseline: +5.1%


✅ errortracking-enabled-user

Time: ✅ 2.127ms (SLO: <2.250ms -5.5%) vs baseline: +0.5%

Memory: ✅ 58.432MB (SLO: <60.000MB -2.6%) vs baseline: +4.9%


✅ tracer-enabled

Time: ✅ 2.114ms (SLO: <2.300ms -8.1%) vs baseline: ~same

Memory: ✅ 58.373MB (SLO: <60.000MB -2.7%) vs baseline: +5.0%


flasksimple - 16/16

✅ appsec-get

Time: ✅ 3.393ms (SLO: <4.750ms 📉 -28.6%) vs baseline: ~same

Memory: ✅ 58.608MB (SLO: <66.500MB 📉 -11.9%) vs baseline: +4.9%


✅ appsec-post

Time: ✅ 2.897ms (SLO: <6.750ms 📉 -57.1%) vs baseline: ~same

Memory: ✅ 58.667MB (SLO: <66.500MB 📉 -11.8%) vs baseline: +4.8%


✅ appsec-telemetry

Time: ✅ 3.419ms (SLO: <4.750ms 📉 -28.0%) vs baseline: +1.0%

Memory: ✅ 58.642MB (SLO: <66.500MB 📉 -11.8%) vs baseline: +5.0%


✅ debugger

Time: ✅ 1.885ms (SLO: <2.000ms -5.7%) vs baseline: +0.3%

Memory: ✅ 49.422MB (SLO: <51.500MB -4.0%) vs baseline: +5.1%


✅ iast-get

Time: ✅ 1.875ms (SLO: <2.000ms -6.3%) vs baseline: +0.4%

Memory: ✅ 46.060MB (SLO: <49.000MB -6.0%) vs baseline: +4.9%


✅ profiler

Time: ✅ 1.915ms (SLO: <2.100ms -8.8%) vs baseline: ~same

Memory: ✅ 52.055MB (SLO: <53.500MB -2.7%) vs baseline: +4.3%


✅ resource-renaming

Time: ✅ 3.383ms (SLO: <3.650ms -7.3%) vs baseline: +0.4%

Memory: ✅ 58.622MB (SLO: <60.000MB -2.3%) vs baseline: +4.9%


✅ tracer

Time: ✅ 3.377ms (SLO: <3.650ms -7.5%) vs baseline: -0.2%

Memory: ✅ 58.624MB (SLO: <60.000MB -2.3%) vs baseline: +4.8%


flasksqli - 6/6

✅ appsec-enabled

Time: ✅ 2.106ms (SLO: <4.200ms 📉 -49.9%) vs baseline: -0.4%

Memory: ✅ 58.707MB (SLO: <66.000MB 📉 -11.0%) vs baseline: +4.9%


✅ iast-enabled

Time: ✅ 2.117ms (SLO: <2.800ms 📉 -24.4%) vs baseline: +0.1%

Memory: ✅ 58.511MB (SLO: <62.500MB -6.4%) vs baseline: +4.6%


✅ tracer-enabled

Time: ✅ 2.105ms (SLO: <2.250ms -6.5%) vs baseline: -0.3%

Memory: ✅ 58.570MB (SLO: <60.000MB -2.4%) vs baseline: +4.9%


forktime - 4/4

✅ baseline

Time: ✅ 1.940ms (SLO: <3.000ms 📉 -35.3%) vs baseline: +3.8%

Memory: ✅ 29.334MB (SLO: <33.000MB 📉 -11.1%) vs baseline: +5.3%


✅ configured

Time: ✅ 9.390ms (SLO: <17.000ms 📉 -44.8%) vs baseline: +0.3%

Memory: ✅ 58.529MB (SLO: <60.000MB -2.5%) vs baseline: +5.0%


httppropagationextract - 60/60

✅ all_styles_all_headers

Time: ✅ 78.983µs (SLO: <100.000µs 📉 -21.0%) vs baseline: +5.7%

Memory: ✅ 36.569MB (SLO: <38.000MB -3.8%) vs baseline: +5.3%


✅ b3_headers

Time: ✅ 12.590µs (SLO: <20.000µs 📉 -37.1%) vs baseline: +0.4%

Memory: ✅ 36.707MB (SLO: <38.000MB -3.4%) vs baseline: +5.0%


✅ b3_single_headers

Time: ✅ 11.546µs (SLO: <20.000µs 📉 -42.3%) vs baseline: +0.1%

Memory: ✅ 36.766MB (SLO: <38.000MB -3.2%) vs baseline: +4.6%


✅ datadog_tracecontext_tracestate_not_propagated_on_trace_id_no_match

Time: ✅ 58.290µs (SLO: <80.000µs 📉 -27.1%) vs baseline: ~same

Memory: ✅ 36.353MB (SLO: <38.000MB -4.3%) vs baseline: +4.3%


✅ datadog_tracecontext_tracestate_propagated_on_trace_id_match

Time: ✅ 62.117µs (SLO: <80.000µs 📉 -22.4%) vs baseline: +0.5%

Memory: ✅ 36.726MB (SLO: <38.000MB -3.4%) vs baseline: +4.7%


✅ empty_headers

Time: ✅ 1.296µs (SLO: <10.000µs 📉 -87.0%) vs baseline: -0.4%

Memory: ✅ 36.392MB (SLO: <38.000MB -4.2%) vs baseline: +4.3%


✅ full_t_id_datadog_headers

Time: ✅ 20.892µs (SLO: <30.000µs 📉 -30.4%) vs baseline: +0.4%

Memory: ✅ 36.648MB (SLO: <38.000MB -3.6%) vs baseline: +4.6%


✅ invalid_priority_header

Time: ✅ 5.845µs (SLO: <10.000µs 📉 -41.5%) vs baseline: ~same

Memory: ✅ 36.431MB (SLO: <38.000MB -4.1%) vs baseline: +4.5%


✅ invalid_span_id_header

Time: ✅ 5.855µs (SLO: <10.000µs 📉 -41.5%) vs baseline: ~same

Memory: ✅ 36.608MB (SLO: <38.000MB -3.7%) vs baseline: +4.1%


✅ invalid_tags_header

Time: ✅ 5.885µs (SLO: <10.000µs 📉 -41.1%) vs baseline: +0.6%

Memory: ✅ 36.628MB (SLO: <38.000MB -3.6%) vs baseline: +4.5%


✅ invalid_trace_id_header

Time: ✅ 5.909µs (SLO: <10.000µs 📉 -40.9%) vs baseline: +0.5%

Memory: ✅ 36.333MB (SLO: <38.000MB -4.4%) vs baseline: +4.3%


✅ large_header_no_matches

Time: ✅ 26.869µs (SLO: <30.000µs 📉 -10.4%) vs baseline: +0.1%

Memory: ✅ 36.333MB (SLO: <38.000MB -4.4%) vs baseline: +4.3%


✅ large_valid_headers_all

Time: ✅ 28.040µs (SLO: <40.000µs 📉 -29.9%) vs baseline: +0.3%

Memory: ✅ 36.313MB (SLO: <38.000MB -4.4%) vs baseline: +4.0%


✅ medium_header_no_matches

Time: ✅ 9.140µs (SLO: <20.000µs 📉 -54.3%) vs baseline: -0.5%

Memory: ✅ 36.569MB (SLO: <38.000MB -3.8%) vs baseline: +5.1%


✅ medium_valid_headers_all

Time: ✅ 10.605µs (SLO: <20.000µs 📉 -47.0%) vs baseline: +0.4%

Memory: ✅ 36.589MB (SLO: <38.000MB -3.7%) vs baseline: +4.8%


✅ none_propagation_style

Time: ✅ 1.380µs (SLO: <10.000µs 📉 -86.2%) vs baseline: -0.7%

Memory: ✅ 36.530MB (SLO: <38.000MB -3.9%) vs baseline: +4.7%


✅ tracecontext_headers

Time: ✅ 31.515µs (SLO: <40.000µs 📉 -21.2%) vs baseline: +0.6%

Memory: ✅ 36.667MB (SLO: <38.000MB -3.5%) vs baseline: +4.7%


✅ valid_headers_all

Time: ✅ 5.873µs (SLO: <10.000µs 📉 -41.3%) vs baseline: +0.3%

Memory: ✅ 36.392MB (SLO: <38.000MB -4.2%) vs baseline: +4.3%


✅ valid_headers_basic

Time: ✅ 5.462µs (SLO: <10.000µs 📉 -45.4%) vs baseline: +0.2%

Memory: ✅ 36.412MB (SLO: <38.000MB -4.2%) vs baseline: +4.4%


✅ wsgi_empty_headers

Time: ✅ 1.295µs (SLO: <10.000µs 📉 -87.1%) vs baseline: +0.5%

Memory: ✅ 36.569MB (SLO: <38.000MB -3.8%) vs baseline: +4.9%


✅ wsgi_invalid_priority_header

Time: ✅ 5.880µs (SLO: <10.000µs 📉 -41.2%) vs baseline: -0.5%

Memory: ✅ 36.707MB (SLO: <38.000MB -3.4%) vs baseline: +4.6%


✅ wsgi_invalid_span_id_header

Time: ✅ 1.309µs (SLO: <10.000µs 📉 -86.9%) vs baseline: +1.0%

Memory: ✅ 36.333MB (SLO: <38.000MB -4.4%) vs baseline: +3.8%


✅ wsgi_invalid_tags_header

Time: ✅ 5.900µs (SLO: <10.000µs 📉 -41.0%) vs baseline: ~same

Memory: ✅ 36.667MB (SLO: <38.000MB -3.5%) vs baseline: +4.8%


✅ wsgi_invalid_trace_id_header

Time: ✅ 5.934µs (SLO: <10.000µs 📉 -40.7%) vs baseline: +0.2%

Memory: ✅ 36.490MB (SLO: <38.000MB -4.0%) vs baseline: +4.6%


✅ wsgi_large_header_no_matches

Time: ✅ 28.068µs (SLO: <40.000µs 📉 -29.8%) vs baseline: -0.2%

Memory: ✅ 36.766MB (SLO: <38.000MB -3.2%) vs baseline: +4.6%


✅ wsgi_large_valid_headers_all

Time: ✅ 29.169µs (SLO: <40.000µs 📉 -27.1%) vs baseline: +0.4%

Memory: ✅ 36.667MB (SLO: <38.000MB -3.5%) vs baseline: +4.8%


✅ wsgi_medium_header_no_matches

Time: ✅ 9.470µs (SLO: <20.000µs 📉 -52.6%) vs baseline: +0.4%

Memory: ✅ 36.628MB (SLO: <38.000MB -3.6%) vs baseline: +4.3%


✅ wsgi_medium_valid_headers_all

Time: ✅ 10.973µs (SLO: <20.000µs 📉 -45.1%) vs baseline: +1.0%

Memory: ✅ 36.726MB (SLO: <38.000MB -3.4%) vs baseline: +4.7%


✅ wsgi_valid_headers_all

Time: ✅ 5.896µs (SLO: <10.000µs 📉 -41.0%) vs baseline: -0.2%

Memory: ✅ 36.549MB (SLO: <38.000MB -3.8%) vs baseline: +4.2%


✅ wsgi_valid_headers_basic

Time: ✅ 5.438µs (SLO: <10.000µs 📉 -45.6%) vs baseline: -0.8%

Memory: ✅ 36.431MB (SLO: <38.000MB -4.1%) vs baseline: +4.9%


httppropagationinject - 16/16

✅ ids_only

Time: ✅ 20.556µs (SLO: <30.000µs 📉 -31.5%) vs baseline: +4.3%

Memory: ✅ 36.313MB (SLO: <38.000MB -4.4%) vs baseline: +4.7%


✅ with_all

Time: ✅ 26.101µs (SLO: <40.000µs 📉 -34.7%) vs baseline: ~same

Memory: ✅ 36.490MB (SLO: <38.000MB -4.0%) vs baseline: +5.3%


✅ with_dd_origin

Time: ✅ 23.094µs (SLO: <30.000µs 📉 -23.0%) vs baseline: +0.3%

Memory: ✅ 36.412MB (SLO: <38.000MB -4.2%) vs baseline: +4.5%


✅ with_priority_and_origin

Time: ✅ 22.519µs (SLO: <40.000µs 📉 -43.7%) vs baseline: ~same

Memory: ✅ 36.451MB (SLO: <38.000MB -4.1%) vs baseline: +5.3%


✅ with_sampling_priority

Time: ✅ 19.574µs (SLO: <30.000µs 📉 -34.8%) vs baseline: -0.4%

Memory: ✅ 36.431MB (SLO: <38.000MB -4.1%) vs baseline: +4.7%


✅ with_tags

Time: ✅ 24.084µs (SLO: <40.000µs 📉 -39.8%) vs baseline: +0.2%

Memory: ✅ 36.451MB (SLO: <38.000MB -4.1%) vs baseline: +4.4%


✅ with_tags_invalid

Time: ✅ 25.473µs (SLO: <40.000µs 📉 -36.3%) vs baseline: +0.3%

Memory: ✅ 36.392MB (SLO: <38.000MB -4.2%) vs baseline: +5.1%


✅ with_tags_max_size

Time: ✅ 24.740µs (SLO: <40.000µs 📉 -38.1%) vs baseline: +0.7%

Memory: ✅ 36.451MB (SLO: <38.000MB -4.1%) vs baseline: +5.3%


iastaspectssplit - 12/12

✅ rsplit_aspect

Time: ✅ 153.536µs (SLO: <250.000µs 📉 -38.6%) vs baseline: +0.9%

Memory: ✅ 43.865MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%


✅ rsplit_noaspect

Time: ✅ 157.622µs (SLO: <250.000µs 📉 -37.0%) vs baseline: +0.5%

Memory: ✅ 43.891MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%


✅ split_aspect

Time: ✅ 146.334µs (SLO: <250.000µs 📉 -41.5%) vs baseline: ~same

Memory: ✅ 43.898MB (SLO: <46.000MB -4.6%) vs baseline: +5.2%


✅ split_noaspect

Time: ✅ 154.026µs (SLO: <250.000µs 📉 -38.4%) vs baseline: +2.3%

Memory: ✅ 43.884MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%


✅ splitlines_aspect

Time: ✅ 146.038µs (SLO: <250.000µs 📉 -41.6%) vs baseline: +1.0%

Memory: ✅ 43.707MB (SLO: <46.000MB -5.0%) vs baseline: +4.4%


✅ splitlines_noaspect

Time: ✅ 149.952µs (SLO: <250.000µs 📉 -40.0%) vs baseline: +0.4%

Memory: ✅ 43.918MB (SLO: <46.000MB -4.5%) vs baseline: +5.0%


otelsdkspan - 24/24

✅ add-event

Time: ✅ 40.527ms (SLO: <42.000ms -3.5%) vs baseline: +0.2%

Memory: ✅ 39.086MB (SLO: <40.750MB -4.1%) vs baseline: +5.2%


✅ add-link

Time: ✅ 36.320ms (SLO: <38.550ms -5.8%) vs baseline: -0.2%

Memory: ✅ 39.046MB (SLO: <40.750MB -4.2%) vs baseline: +4.6%


✅ add-metrics

Time: ✅ 219.815ms (SLO: <232.000ms -5.3%) vs baseline: +0.1%

Memory: ✅ 39.046MB (SLO: <40.750MB -4.2%) vs baseline: +4.9%


✅ add-tags

Time: ✅ 209.989ms (SLO: <221.600ms -5.2%) vs baseline: -0.7%

Memory: ✅ 39.086MB (SLO: <40.750MB -4.1%) vs baseline: +4.8%


✅ get-context

Time: ✅ 29.171ms (SLO: <31.300ms -6.8%) vs baseline: ~same

Memory: ✅ 39.125MB (SLO: <40.750MB -4.0%) vs baseline: +5.0%


✅ is-recording

Time: ✅ 29.190ms (SLO: <31.000ms -5.8%) vs baseline: +0.2%

Memory: ✅ 39.007MB (SLO: <40.750MB -4.3%) vs baseline: +4.7%


✅ record-exception

Time: ✅ 63.131ms (SLO: <65.850ms -4.1%) vs baseline: -0.1%

Memory: ✅ 38.948MB (SLO: <40.750MB -4.4%) vs baseline: +4.5%


✅ set-status

Time: ✅ 32.027ms (SLO: <34.150ms -6.2%) vs baseline: +0.6%

Memory: ✅ 39.125MB (SLO: <40.750MB -4.0%) vs baseline: +5.3%


✅ start

Time: ✅ 29.414ms (SLO: <30.150ms -2.4%) vs baseline: +1.7%

Memory: ✅ 39.027MB (SLO: <40.750MB -4.2%) vs baseline: +4.7%


✅ start-finish

Time: ✅ 33.977ms (SLO: <35.350ms -3.9%) vs baseline: +0.3%

Memory: ✅ 39.184MB (SLO: <40.750MB -3.8%) vs baseline: +5.3%


✅ start-finish-telemetry

Time: ✅ 34.085ms (SLO: <35.450ms -3.8%) vs baseline: +0.5%

Memory: ✅ 39.105MB (SLO: <40.750MB -4.0%) vs baseline: +4.8%


✅ update-name

Time: ✅ 30.938ms (SLO: <33.400ms -7.4%) vs baseline: -0.3%

Memory: ✅ 39.007MB (SLO: <40.750MB -4.3%) vs baseline: +4.9%


packagespackageforrootmodulemapping - 4/4

✅ cache_off

Time: ✅ 344.920ms (SLO: <354.300ms -2.6%) vs baseline: +0.2%

Memory: ✅ 42.275MB (SLO: <46.000MB -8.1%) vs baseline: +5.3%


✅ cache_on

Time: ✅ 0.393µs (SLO: <10.000µs 📉 -96.1%) vs baseline: +1.4%

Memory: ✅ 39.783MB (SLO: <46.000MB 📉 -13.5%) vs baseline: +4.5%


rand - 2/2

✅ rand128bits

Time: ✅ 0.184µs (SLO: <21.000µs 📉 -99.1%) vs baseline: -0.3%


✅ rand64bits

Time: ✅ 0.125µs (SLO: <15.000µs 📉 -99.2%) vs baseline: +0.4%


ratelimiter - 12/12

✅ defaults

Time: ✅ 2.360µs (SLO: <10.000µs 📉 -76.4%) vs baseline: +0.4%

Memory: ✅ 36.589MB (SLO: <38.000MB -3.7%) vs baseline: +4.6%


✅ high_rate_limit

Time: ✅ 2.420µs (SLO: <10.000µs 📉 -75.8%) vs baseline: +0.4%

Memory: ✅ 36.549MB (SLO: <38.000MB -3.8%) vs baseline: +4.9%


✅ long_window

Time: ✅ 2.344µs (SLO: <10.000µs 📉 -76.6%) vs baseline: -0.4%

Memory: ✅ 36.648MB (SLO: <38.000MB -3.6%) vs baseline: +4.6%


✅ low_rate_limit

Time: ✅ 2.349µs (SLO: <10.000µs 📉 -76.5%) vs baseline: -0.5%

Memory: ✅ 36.667MB (SLO: <38.000MB -3.5%) vs baseline: +5.2%


✅ no_rate_limit

Time: ✅ 0.823µs (SLO: <10.000µs 📉 -91.8%) vs baseline: -0.8%

Memory: ✅ 36.648MB (SLO: <38.000MB -3.6%) vs baseline: +5.1%


✅ short_window

Time: ✅ 2.487µs (SLO: <10.000µs 📉 -75.1%) vs baseline: ~same

Memory: ✅ 36.648MB (SLO: <38.000MB -3.6%) vs baseline: +4.4%


samplingrules - 8/8

✅ average_match

Time: ✅ 145.327µs (SLO: <200.000µs 📉 -27.3%) vs baseline: -0.9%

Memory: ✅ 36.215MB (SLO: <38.000MB -4.7%) vs baseline: +4.7%


✅ high_match

Time: ✅ 165.632µs (SLO: <200.000µs 📉 -17.2%) vs baseline: -1.1%

Memory: ✅ 36.235MB (SLO: <38.000MB -4.6%) vs baseline: +4.7%


✅ low_match

Time: ✅ 121.736µs (SLO: <130.000µs -6.4%) vs baseline: -0.9%

Memory: ✅ 619.340MB (SLO: <780.000MB 📉 -20.6%) vs baseline: +4.9%


✅ very_low_match

Time: ✅ 2.655ms (SLO: <4.000ms 📉 -33.6%) vs baseline: ~same

Memory: ✅ 73.685MB (SLO: <85.000MB 📉 -13.3%) vs baseline: +5.0%


sethttpmeta - 32/32

✅ all-disabled

Time: ✅ 10.691µs (SLO: <20.000µs 📉 -46.5%) vs baseline: +0.4%

Memory: ✅ 37.395MB (SLO: <38.750MB -3.5%) vs baseline: +4.6%


✅ all-enabled

Time: ✅ 40.081µs (SLO: <50.000µs 📉 -19.8%) vs baseline: +1.5%

Memory: ✅ 37.493MB (SLO: <38.750MB -3.2%) vs baseline: +4.9%


✅ collectipvariant_exists

Time: ✅ 40.205µs (SLO: <50.000µs 📉 -19.6%) vs baseline: -0.2%

Memory: ✅ 37.572MB (SLO: <38.750MB -3.0%) vs baseline: +5.4%


✅ no-collectipvariant

Time: ✅ 39.478µs (SLO: <50.000µs 📉 -21.0%) vs baseline: ~same

Memory: ✅ 37.415MB (SLO: <38.750MB -3.4%) vs baseline: +5.6%


✅ no-useragentvariant

Time: ✅ 38.181µs (SLO: <50.000µs 📉 -23.6%) vs baseline: -0.4%

Memory: ✅ 37.513MB (SLO: <38.750MB -3.2%) vs baseline: +4.7%


✅ obfuscation-no-query

Time: ✅ 39.935µs (SLO: <50.000µs 📉 -20.1%) vs baseline: -0.1%

Memory: ✅ 37.336MB (SLO: <38.750MB -3.6%) vs baseline: +5.2%


✅ obfuscation-regular-case-explicit-query

Time: ✅ 75.477µs (SLO: <90.000µs 📉 -16.1%) vs baseline: ~same

Memory: ✅ 37.473MB (SLO: <38.750MB -3.3%) vs baseline: +4.9%


✅ obfuscation-regular-case-implicit-query

Time: ✅ 76.066µs (SLO: <90.000µs 📉 -15.5%) vs baseline: -0.1%

Memory: ✅ 37.532MB (SLO: <38.750MB -3.1%) vs baseline: +4.9%


✅ obfuscation-send-querystring-disabled

Time: ✅ 155.088µs (SLO: <170.000µs -8.8%) vs baseline: +0.5%

Memory: ✅ 37.415MB (SLO: <38.750MB -3.4%) vs baseline: +4.7%


✅ obfuscation-worst-case-explicit-query

Time: ✅ 148.850µs (SLO: <160.000µs -7.0%) vs baseline: +0.1%

Memory: ✅ 37.473MB (SLO: <38.750MB -3.3%) vs baseline: +4.8%


✅ obfuscation-worst-case-implicit-query

Time: ✅ 154.833µs (SLO: <170.000µs -8.9%) vs baseline: -0.1%

Memory: ✅ 37.473MB (SLO: <38.750MB -3.3%) vs baseline: +4.7%


✅ useragentvariant_exists_1

Time: ✅ 38.976µs (SLO: <50.000µs 📉 -22.0%) vs baseline: ~same

Memory: ✅ 37.179MB (SLO: <38.750MB -4.1%) vs baseline: +4.1%


✅ useragentvariant_exists_2

Time: ✅ 39.935µs (SLO: <50.000µs 📉 -20.1%) vs baseline: -0.5%

Memory: ✅ 37.218MB (SLO: <38.750MB -4.0%) vs baseline: +3.8%


✅ useragentvariant_exists_3

Time: ✅ 39.271µs (SLO: <50.000µs 📉 -21.5%) vs baseline: -0.7%

Memory: ✅ 37.336MB (SLO: <38.750MB -3.6%) vs baseline: +4.2%


✅ useragentvariant_not_exists_1

Time: ✅ 39.017µs (SLO: <50.000µs 📉 -22.0%) vs baseline: +0.5%

Memory: ✅ 37.552MB (SLO: <38.750MB -3.1%) vs baseline: +5.1%


✅ useragentvariant_not_exists_2

Time: ✅ 38.845µs (SLO: <50.000µs 📉 -22.3%) vs baseline: -0.3%

Memory: ✅ 37.513MB (SLO: <38.750MB -3.2%) vs baseline: +4.9%


telemetryaddmetric - 30/30

✅ 1-count-metric-1-times

Time: ✅ 2.338µs (SLO: <20.000µs 📉 -88.3%) vs baseline: +8.2%

Memory: ✅ 36.648MB (SLO: <38.000MB -3.6%) vs baseline: +5.3%


✅ 1-count-metrics-100-times

Time: ✅ 156.874µs (SLO: <220.000µs 📉 -28.7%) vs baseline: +1.6%

Memory: ✅ 36.648MB (SLO: <38.000MB -3.6%) vs baseline: +5.3%


✅ 1-distribution-metric-1-times

Time: ✅ 2.423µs (SLO: <20.000µs 📉 -87.9%) vs baseline: -1.0%

Memory: ✅ 36.707MB (SLO: <38.000MB -3.4%) vs baseline: +5.2%


✅ 1-distribution-metrics-100-times

Time: ✅ 169.146µs (SLO: <230.000µs 📉 -26.5%) vs baseline: +0.8%

Memory: ✅ 36.569MB (SLO: <38.000MB -3.8%) vs baseline: +4.9%


✅ 1-gauge-metric-1-times

Time: ✅ 2.322µs (SLO: <20.000µs 📉 -88.4%) vs baseline: -1.7%

Memory: ✅ 36.648MB (SLO: <38.000MB -3.6%) vs baseline: +5.2%


✅ 1-gauge-metrics-100-times

Time: ✅ 170.674µs (SLO: <190.000µs 📉 -10.2%) vs baseline: +0.4%

Memory: ✅ 36.608MB (SLO: <38.000MB -3.7%) vs baseline: +4.9%


✅ 1-rate-metric-1-times

Time: ✅ 2.279µs (SLO: <20.000µs 📉 -88.6%) vs baseline: +0.7%

Memory: ✅ 36.471MB (SLO: <38.000MB -4.0%) vs baseline: +4.7%


✅ 1-rate-metrics-100-times

Time: ✅ 170.737µs (SLO: <250.000µs 📉 -31.7%) vs baseline: -0.2%

Memory: ✅ 36.569MB (SLO: <38.000MB -3.8%) vs baseline: +5.1%


✅ 100-count-metrics-100-times

Time: ✅ 15.894ms (SLO: <22.000ms 📉 -27.8%) vs baseline: +1.2%

Memory: ✅ 36.215MB (SLO: <38.000MB -4.7%) vs baseline: +5.1%


✅ 100-distribution-metrics-100-times

Time: ✅ 1.775ms (SLO: <2.550ms 📉 -30.4%) vs baseline: +0.2%

Memory: ✅ 36.490MB (SLO: <38.000MB -4.0%) vs baseline: +4.5%


✅ 100-gauge-metrics-100-times

Time: ✅ 1.736ms (SLO: <1.850ms -6.2%) vs baseline: -1.0%

Memory: ✅ 36.215MB (SLO: <38.000MB -4.7%) vs baseline: +5.1%


✅ 100-rate-metrics-100-times

Time: ✅ 1.755ms (SLO: <2.550ms 📉 -31.2%) vs baseline: -0.2%

Memory: ✅ 36.196MB (SLO: <38.000MB -4.7%) vs baseline: +5.0%


✅ flush-1-metric

Time: ✅ 3.579µs (SLO: <20.000µs 📉 -82.1%) vs baseline: ~same

Memory: ✅ 36.530MB (SLO: <38.000MB -3.9%) vs baseline: +4.9%


✅ flush-100-metrics

Time: ✅ 172.538µs (SLO: <250.000µs 📉 -31.0%) vs baseline: ~same

Memory: ✅ 36.589MB (SLO: <38.000MB -3.7%) vs baseline: +5.0%


✅ flush-1000-metrics

Time: ✅ 2.196ms (SLO: <2.500ms 📉 -12.2%) vs baseline: +0.4%

Memory: ✅ 37.336MB (SLO: <38.750MB -3.6%) vs baseline: +4.9%


telemetrydependencies - 4/4

✅ first-50-deps-sca-off

Time: ✅ 2.211ms (SLO: <3.500ms 📉 -36.8%) vs baseline: +0.4%

Memory: ✅ 40.678MB (SLO: <46.000MB 📉 -11.6%) vs baseline: +5.1%


✅ first-50-deps-sca-on

Time: ✅ 2.251ms (SLO: <3.750ms 📉 -40.0%) vs baseline: ~same

Memory: ✅ 40.619MB (SLO: <46.000MB 📉 -11.7%) vs baseline: +5.0%

ℹ️ Scenarios Missing SLO Configuration (46 scenarios)

The following scenarios exist in candidate data but have no SLO thresholds configured:

  • coreapiscenario-core_dispatch_listeners
  • coreapiscenario-core_dispatch_no_listeners
  • coreapiscenario-core_dispatch_with_results_listeners
  • coreapiscenario-core_dispatch_with_results_no_listeners
  • djangosimple-baseline
  • errortrackingdjangosimple-baseline
  • errortrackingflasksqli-baseline
  • flasksimple-baseline
  • flasksqli-baseline
  • iast_aspects-re_expand_aspect
  • iast_aspects-re_expand_noaspect
  • iast_aspects-re_findall_aspect
  • iast_aspects-re_findall_noaspect
  • iast_aspects-re_finditer_aspect
  • iast_aspects-re_finditer_noaspect
  • iast_aspects-re_fullmatch_aspect
  • iast_aspects-re_fullmatch_noaspect
  • iast_aspects-re_group_aspect
  • iast_aspects-re_group_noaspect
  • iast_aspects-re_groups_aspect
  • iast_aspects-re_groups_noaspect
  • iast_aspects-re_match_aspect
  • iast_aspects-re_match_noaspect
  • iast_aspects-re_search_aspect
  • iast_aspects-re_search_noaspect
  • iast_aspects-re_sub_aspect
  • iast_aspects-re_sub_noaspect
  • iast_aspects-re_subn_aspect
  • iast_aspects-re_subn_noaspect
  • sethttpmeta-obfuscation-disabled
  • startup-baseline
  • startup-baseline_django
  • startup-baseline_flask
  • startup-ddtrace_run
  • startup-ddtrace_run_appsec
  • startup-ddtrace_run_profiling
  • startup-ddtrace_run_runtime_metrics
  • startup-ddtrace_run_send_span
  • startup-ddtrace_run_telemetry_disabled
  • startup-ddtrace_run_telemetry_enabled
  • startup-import_ddtrace
  • startup-import_ddtrace_auto
  • startup-import_ddtrace_auto_django
  • startup-import_ddtrace_auto_flask
  • startup-import_ddtrace_django
  • startup-import_ddtrace_flask

@pablomartinezbernardo pablomartinezbernardo force-pushed the joey/apm-ai-toolkit/aws-durable-execution-sdk-python branch from de57fad to f873a4f Compare May 5, 2026 09:14

cit-pr-commenter-54b7da Bot commented May 6, 2026

Codeowners resolved as

ddtrace/contrib/internal/aws_durable_execution_sdk_python/__init__.py   @DataDog/apm-core-python @DataDog/apm-idm-python
releasenotes/notes/feat-aws-durable-execution-trace-context-propagation-028c99dac901b8ba.yaml  @DataDog/apm-python

@joeyzhao2018 joeyzhao2018 force-pushed the joey/cross-invocation-tracecontext-propagation branch from 3eddaeb to f33d7fa Compare May 6, 2026 19:04
@pablomartinezbernardo pablomartinezbernardo force-pushed the joey/apm-ai-toolkit/aws-durable-execution-sdk-python branch from 6c14d36 to ca81f78 Compare May 7, 2026 09:11
@joeyzhao2018 joeyzhao2018 force-pushed the joey/cross-invocation-tracecontext-propagation branch from f33d7fa to 9a25ef3 Compare May 7, 2026 20:04
Contributor

datadog-datadog-prod-us1 Bot commented May 11, 2026

⚠️ Warnings

🚦 9 Pipeline jobs failed

DataDog/apm-reliability/dd-trace-py | prechecks   View in Datadog   GitLab

🔧 Fix in code (Fix with Cursor). 10 type errors found in trace_checkpoint.py: Missing type parameters for generic type 'dict' and missing type annotations for function arguments.

DataDog/apm-reliability/dd-trace-py | build linux: [amd64, cp315-cp315, v113741238-d2b8243-manylinux2014_x86_64]   View in Datadog   GitLab

🔄 Retry job. This looks flaky and may succeed on retry. Dependency download failed for 'wrapt==2.2.1'.

DataDog/apm-reliability/dd-trace-py | build linux: [arm64, cp315-cp315, v113741357-d2b8243-manylinux2014_aarch64]   View in Datadog   GitLab

🔄 Retry job. This looks flaky and may succeed on retry. Dependency download failed for package 'wrapt==2.2.1'.

View all 9 failed jobs.

ℹ️ Info

No other issues found (see more)

🧪 All tests passed
❄️ No new flaky tests detected

🔗 Commit SHA: 3e59a47

@pablomartinezbernardo pablomartinezbernardo force-pushed the joey/apm-ai-toolkit/aws-durable-execution-sdk-python branch from e0f7a28 to fed5e48 Compare May 12, 2026 08:41
Base automatically changed from joey/apm-ai-toolkit/aws-durable-execution-sdk-python to main May 12, 2026 10:04
pablomartinezbernardo and others added 16 commits May 12, 2026 19:14
Introduce ``ddtrace/contrib/internal/aws_durable_execution_sdk_python/trace_checkpoint.py``
which appends a synthetic ``_datadog_{N}`` STEP operation to the durable
execution log on every ``SuspendExecution``. The payload is a JSON dict of
the propagation headers for the active trace, with the per-span volatile
fields (``x-datadog-parent-id``, ``traceparent``'s parent segment) rewritten
to point at the durable-execution root span — either the grandparent of the
current ``aws.durable.execute`` span (first invocation) or the parent id
already stored in the latest prior checkpoint (replays).
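The parent-id rewrite described above can be sketched roughly as follows. This is an illustration only, not the code in `trace_checkpoint.py`: `resolve_anchor` and `anchor_parent_id` are hypothetical names, and the sketch assumes the usual header shapes (a decimal `x-datadog-parent-id`, and a W3C `traceparent` of the form `version-traceid-parentid-flags` with a 16-hex-digit parent segment).

```python
from typing import Dict, Optional


def resolve_anchor(stored_parent_id: Optional[int], first_invocation_anchor: int) -> int:
    """Pick the parent id to persist (hypothetical helper).

    On replays the id stored in the latest prior checkpoint wins; on the first
    invocation there is no stored id, so the freshly computed anchor is used.
    """
    return stored_parent_id if stored_parent_id is not None else first_invocation_anchor


def anchor_parent_id(headers: Dict[str, str], parent_id: int) -> Dict[str, str]:
    """Return a copy of `headers` whose per-span parent fields point at `parent_id`."""
    out = dict(headers)
    if "x-datadog-parent-id" in out:
        # Datadog-style headers carry the parent span id as a decimal string.
        out["x-datadog-parent-id"] = str(parent_id)
    parts = out.get("traceparent", "").split("-")
    if len(parts) == 4:
        # W3C traceparent: the third segment is the parent id, 16 lowercase hex digits.
        parts[2] = format(parent_id, "016x")
        out["traceparent"] = "-".join(parts)
    return out
```

Returning a copy rather than mutating the input keeps the live propagation headers untouched, so only the persisted payload is anchored.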

Diffing the new headers against the stored payload of the highest-N existing
``_datadog_*`` operation suppresses redundant writes; only ``x-datadog-parent-id``
and the ``dd=p:`` entry of ``tracestate`` are stripped before comparison so
sampling priority, decision-maker, origin, and propagation tags still trigger
a fresh save when they change.
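A minimal sketch of that comparison, under stated assumptions: `stable_view` and `needs_new_checkpoint` are hypothetical names, and the `tracestate` layout is assumed to be comma-separated vendor members with the Datadog member written as `dd=` followed by `;`-separated `key:value` entries. The actual implementation may differ.

```python
from typing import Dict, Optional


def stable_view(headers: Dict[str, str]) -> Dict[str, str]:
    """Drop the two volatile per-span fields so replays compare equal."""
    out = {k: v for k, v in headers.items() if k != "x-datadog-parent-id"}
    tracestate = out.get("tracestate")
    if tracestate:
        members = []
        for member in tracestate.split(","):
            member = member.strip()
            key, sep, value = member.partition("=")
            if key == "dd" and sep:
                # Remove only the `p:` (parent id) entry; keep sampling, origin, tags.
                kept = [entry for entry in value.split(";") if not entry.startswith("p:")]
                member = "dd=" + ";".join(kept)
            members.append(member)
        out["tracestate"] = ",".join(members)
    return out


def needs_new_checkpoint(new: Dict[str, str], stored: Optional[Dict[str, str]]) -> bool:
    """True when nothing is stored yet or a non-volatile field changed."""
    return stored is None or stable_view(new) != stable_view(stored)
```

Because only those two fields are stripped, a change in sampling priority or origin still compares as different and triggers a write.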

Wired into ``_traced_durable_execution`` so a single checkpoint is written per
invocation, only on the suspend path (workflows that return or fail
terminally would never read the checkpoint).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@joeyzhao2018 joeyzhao2018 force-pushed the joey/cross-invocation-tracecontext-propagation branch from 24afe33 to d283ef5 Compare May 12, 2026 23:17
@joeyzhao2018 joeyzhao2018 marked this pull request as ready for review May 12, 2026 23:57
@joeyzhao2018 joeyzhao2018 requested review from a team as code owners May 12, 2026 23:57
try:
    setattr(state, _STATE_NEXT_N_ATTR, n + 1)
except Exception:
    log.debug("Could not advance checkpoint counter", exc_info=True)
Contributor

I'm seeing a bunch of debug logs on fatal exceptions. Should they be debug, considering that level will not show by default? I understand the fear of spamming warn lines, but I think I would find it even more confusing for a feature to fail completely with no error/warning log showing at all.

This comment is on this specific line, but it applies to a bunch of the debug logs.

Contributor Author

@joeyzhao2018 joeyzhao2018 May 15, 2026

Allow me to push back a bit on this one. There are a few main reasons I simply cannot ignore.

  1. After all, we're not customer code. I don't want WARN/ERROR lines from us landing in their application logs and paging their on-call for things they can't act on. The unwritten contract for a library running in someone else's runtime is to stay quiet unless something the customer can act on is broken.
  2. Graceful degradation, not feature loss. If these paths fail the worst outcome is a Datadog observability feature degrades — a checkpoint doesn't get pre-marked, a tag doesn't get set. Customer workflow correctness is untouched. WARN/ERROR for "an observability library lost some observability" is louder than the impact warrants.
  3. The "no signal at all" concern has a better mitigation than log level: telemetry. If we want Datadog to see when an integration degrades, that's what integration-error telemetry is for — visible to us, not piped into customer logs.
    • But we haven't turned on instrumentation telemetry for serverless yet.
      • That said, we are working on it, and @emmettbutler actually has a few PRs for that. We also need it to see how customers use the integration, so I added them here.

Contributor

To add to this, can we make the debug logs a bit more actionable? From the message alone it's difficult to debug and dig into root causes. We should include some info about the state and, in this case, why we could not advance (this could be captured in exc_info, but that might not be the case everywhere).

Contributor Author

Updated with some details to help with debugging.
In general, we are trying to keep this split:

  • the metric tags of integration telemetry carry the dimensions we want to aggregate on
  • the logs carry the runtime values needed to reproduce locally

Comment thread ddtrace/contrib/internal/aws_durable_execution_sdk_python/patch.py Outdated
Comment thread ddtrace/contrib/internal/aws_durable_execution_sdk_python/patch.py
Comment thread ddtrace/contrib/internal/aws_durable_execution_sdk_python/trace_checkpoint.py Outdated
Contributor

@mabdinur mabdinur left a comment

Left some nits; the only blocking comment is about using native threads. From a tracing standpoint the direction looks good. Thanks for driving this.

Comment thread ddtrace/contrib/internal/aws_durable_execution_sdk_python/patch.py Outdated
Comment thread ddtrace/contrib/internal/aws_durable_execution_sdk_python/patch.py Outdated
*which* code path failed without us having to fish through customer
logs. Keep the set of values bounded — it becomes a metric tag.
"""
try:
Contributor

Is try/except needed? Do we expect this to fail?

Contributor Author

_record_integration_error is called from inside except blocks whose entire job is to keep telemetry failures from breaking the workflow — including as the last statement of the outermost catch-all in maybe_save_trace_context_checkpoint. If telemetry_writer.add_count_metric ever raises (shutdown, post-fork, etc.), that exception propagates out and breaks the user's durable execution. The try/except is what makes the function's "must never raise" contract real.
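The "must never raise" contract can be sketched as below. This is not the PR's `_record_integration_error`: the reporter is injected as a plain callable (`emit_count`, a hypothetical parameter) rather than calling the telemetry writer directly, and the metric name is made up for the example.

```python
import logging
from typing import Callable

log = logging.getLogger("telemetry_sketch")  # hypothetical logger name


def record_integration_error(emit_count: Callable[[str, str], None], error_type: str) -> None:
    """Report a degraded code path; swallow any failure from the reporter itself."""
    try:
        # `error_type` ends up as a metric tag, so callers should pass a bounded set of values.
        emit_count("integration_errors", error_type)
    except Exception:
        # The reporter can fail (for example during shutdown); that must not reach
        # the caller, which is itself running inside an except block.
        log.debug("Failed to record integration error: error_type=%s", error_type, exc_info=True)
```

With this shape, a reporter that raises still leaves the caller's own `except` block able to finish normally.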

pass


_CHECKPOINT_NAME_PREFIX = "_datadog_"
Contributor

Do we have to deal with size limits for checkpoint names or headers? Would using _dd be more efficient?

Contributor Author

I think we can potentially benefit from a bit of explicitness here. One reason is that these operation names will be viewed directly by customers, not just visible to them. They will certainly look at those checkpoints. I can even picture customers looking up trace IDs from these checkpoints and jumping directly to our traces. Another reason is that _dd has a higher chance of collision. I once looked into a support case where the customer's services also had a ton of stuff that started with dd*.

As for the size limits, I put more details in my design doc. But in short, I think our stuff's size is negligible.


import hashlib
import json
import threading
Contributor

We should avoid using the threading library directly; this can cause stability issues in forked environments. We should use this module instead: https://github.com/DataDog/dd-trace-py/blob/main/ddtrace/internal/threads.py

Contributor Author

Fixed — swapped to ddtrace.internal.threads.Lock.

Quick follow-up: is this something most Python devs would catch on instinct, or is it pretty specific to ddtrace's constraints (gevent monkey-patching, forked workers)? If it's the latter, would it be worth a lint rule or CI grep that flags import threading / threading.Lock in new code under ddtrace/? Let me know and I can open a PR for that.
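For context, the lock in question guards checkpoint-name allocation. A self-contained sketch of that pattern is below. It uses the standard library's `threading.Lock` purely so the example runs standalone; per the fix above, the PR itself uses `ddtrace.internal.threads.Lock`. `State`, `next_n`, and `allocate_checkpoint_name` are hypothetical names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_CHECKPOINT_NAME_PREFIX = "_datadog_"

# Stand-in only: stdlib lock so the sketch has no ddtrace dependency.
_alloc_lock = threading.Lock()


class State:
    """Hypothetical stand-in for the durable execution state object."""

    def __init__(self) -> None:
        self.next_n = 0


def allocate_checkpoint_name(state: State) -> str:
    """Reserve the next `_datadog_{N}` name; read and increment happen under one lock."""
    with _alloc_lock:
        n = state.next_n
        state.next_n = n + 1
    return f"{_CHECKPOINT_NAME_PREFIX}{n}"


if __name__ == "__main__":
    state = State()
    with ThreadPoolExecutor(max_workers=8) as pool:
        names = list(pool.map(lambda _: allocate_checkpoint_name(state), range(200)))
    # Every concurrent caller should have received a distinct name.
    print(len(set(names)) == len(names))
```

Holding the lock across both the read and the increment is what keeps concurrent/map/parallel saves from handing out the same `N` twice.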

Comment thread ddtrace/contrib/internal/aws_durable_execution_sdk_python/trace_checkpoint.py Outdated
Comment thread ddtrace/contrib/internal/aws_durable_execution_sdk_python/trace_checkpoint.py Outdated