getsentry · inventarSarah · Jun 24, 2026 · Jun 24, 2026
diff --git a/docs/platforms/python/tracing/instrumentation/automatic-instrumentation.mdx b/docs/platforms/python/tracing/instrumentation/automatic-instrumentation.mdx
@@ -17,7 +17,7 @@ supported:
 description: "Learn what instrumentation automatically captures transactions."
 ---
 
-Many integrations for popular frameworks automatically capture transactions. If you already have any of the following frameworks set up for Sentry error reporting, you will start to see traces immediately:
+Many integrations for popular frameworks automatically capture transactions (or service spans in <PlatformLink to="/tracing/new-spans/">stream mode</PlatformLink>). If you already have any of the following frameworks set up for Sentry error reporting, you will start to see traces immediately:
 
 - All WSGI-based web frameworks (Django, Flask, Pyramid, Falcon, Bottle)
 - Celery
@@ -26,11 +26,13 @@ Many integrations for popular frameworks automatically capture transactions. If
 
 See the full [list of available integrations](/platforms/python/integrations/).
 
-Spans are instrumented for the following operations within a transaction:
+Spans are instrumented for the following operations within a transaction or service span:
 
 - Database queries that use SQLAlchemy or the Django ORM
 - HTTP requests made with HTTPX, requests, the stdlib, AIOHTTP, or pyreqwest
 - Spawned subprocesses
 - Redis operations
 
-Spans are only created within an existing transaction. If you're not using any of the supported frameworks, you'll need to <PlatformLink to="/tracing/instrumentation/custom-instrumentation/">create transactions manually</PlatformLink>.
+In transaction mode, spans are only created within an existing transaction. If you're not using any of the supported frameworks, you'll need to <PlatformLink to="/tracing/instrumentation/custom-instrumentation/">create transactions manually</PlatformLink>.
+
+Stream mode removes this limitation. Because there are no transactions, any span started without a parent is automatically promoted to a service span (the equivalent of a transaction). You can also force any span to become a service span by removing its parent when you start it. See <PlatformLink to="/tracing/instrumentation/custom-instrumentation/">Custom Instrumentation</PlatformLink> to learn more.
diff --git a/...orms/python/tracing/instrumentation/custom-instrumentation/ai-agents-module.mdx b/...orms/python/tracing/instrumentation/custom-instrumentation/ai-agents-module.mdx
@@ -10,6 +10,13 @@ With <Link to="/ai/monitoring/agents/dashboards/">Sentry AI Agent Monitoring</Li
 
 As a prerequisite to setting up AI Agent Monitoring with Python, you'll need to first <PlatformLink to="/tracing/">set up tracing</PlatformLink>. Once this is done, the Python SDK will automatically instrument AI agents created with supported libraries. If that doesn't fit your use case, you can use custom instrumentation described below.
 
+<Alert>
+
+This page covers both transaction mode (default) and stream mode. See <PlatformLink to="/tracing/new-spans/">New Spans</PlatformLink> to learn more.
+
+</Alert>
+
+
 ## Automatic Instrumentation
 
 The Python SDK supports automatic instrumentation for some AI libraries. We recommend adding their integrations to your Sentry configuration to automatically capture spans for AI agents.
@@ -27,7 +34,8 @@ The Python SDK supports automatic instrumentation for some AI libraries. We reco
 
 For your AI agents data to show up in the [AI Agents Dashboards](https://sentry.io/orgredirect/organizations/:orgslug/dashboards/?filter=onlyPrebuilt&query=agents&sort=mostPopular), at least one of the AI spans needs to be created and have well-defined names and data attributes. See below.
 
-The [@sentry_sdk.trace()](/platforms/python/tracing/instrumentation/custom-instrumentation/#span-templates) decorator can also be used to create these spans.
+In transaction mode, you can also create these spans with the [@sentry_sdk.trace()](/platforms/python/tracing/instrumentation/custom-instrumentation/#span-templates) decorator by setting its `template` parameter.
+In stream mode, these spans need to be created directly with `sentry_sdk.traces.start_span()`, as shown below.
 
 ## Span Hierarchy
 
@@ -55,7 +63,7 @@ This span represents a request to an LLM model or service that generates a respo
 
 #### Example AI Request span
 
-```python
+```python {tabTitle:Transaction Mode (Default)}
 import json
 import sentry_sdk
 
@@ -78,11 +86,41 @@ with sentry_sdk.start_span(op="gen_ai.chat", name="chat o3-mini") as span:
     span.set_data("gen_ai.usage.output_tokens", result.usage.completion_tokens)
 ```
 
+```python {tabTitle:Stream Mode}
+import json
+import sentry_sdk
+
+messages = [{"role": "user", "parts": [{"type": "text", "content": "Tell me a joke"}]}]
+
+with sentry_sdk.traces.start_span(
+    name="chat o3-mini",
+    attributes={"sentry.op": "gen_ai.chat"},
+) as span:
+    span.set_attributes({
+        "gen_ai.operation.name": "chat",
+        "gen_ai.request.model": "o3-mini",
+        "gen_ai.provider.name": "openai",
+        "gen_ai.input.messages": json.dumps(messages),
+    })
+
+    result = client.chat.completions.create(model="o3-mini", messages=messages)
+
+    span.set_attributes({
+        "gen_ai.response.model": result.model,
+        "gen_ai.output.messages": json.dumps([
+            {"role": "assistant", "parts": [{"type": "text", "content": result.choices[0].message.content}]}
+        ]),
+        "gen_ai.response.finish_reasons": json.dumps([result.choices[0].finish_reason]),
+        "gen_ai.usage.input_tokens": result.usage.prompt_tokens,
+        "gen_ai.usage.output_tokens": result.usage.completion_tokens,
+    })
+```
+
 #### Thinking / reasoning messages
 
 Models with extended thinking (such as Anthropic's `thinking` blocks, Gemini's `thought`, or DeepSeek's `reasoning_content`) produce internal reasoning that isn't part of the user-visible reply. Represent this content as a `reasoning` part inside the assistant message, alongside the user-facing `text` part. Sentry surfaces reasoning parts separately and filters them out of the user-facing <Link to="/ai/monitoring/conversations/">Conversations</Link> view, so don't fold thinking into a `text` part.
 
-```python
+```python {tabTitle:Transaction Mode (Default)}
 import json
 import sentry_sdk
 
@@ -111,6 +149,45 @@ with sentry_sdk.start_span(op="gen_ai.chat", name="chat o3-mini") as span:
     span.set_data("gen_ai.usage.output_tokens.reasoning", result.usage.completion_tokens_details.reasoning_tokens)
 ```
 
+```python {tabTitle:Stream Mode}
+import json
+import sentry_sdk
+
+messages = [{"role": "user", "parts": [{"type": "text", "content": "What is 6 * 7?"}]}]
+
+with sentry_sdk.traces.start_span(
+    name="chat o3-mini",
+    attributes={"sentry.op": "gen_ai.chat"},
+) as span:
+    span.set_attributes({
+        "gen_ai.operation.name": "chat",
+        "gen_ai.request.model": "o3-mini",
+        "gen_ai.provider.name": "openai",
+        "gen_ai.input.messages": json.dumps(messages),
+    })
+
+    result = client.chat.completions.create(model="o3-mini", messages=messages)
+
+    span.set_attributes({
+        "gen_ai.response.model": result.model,
+        "gen_ai.output.messages": json.dumps([
+            {
+                "role": "assistant",
+                "parts": [
+                    {"type": "reasoning", "content": "6 times 7 is 42."},
+                    {"type": "text", "content": "The answer is 42."},
+                ],
+            }
+        ]),
+        "gen_ai.usage.output_tokens": result.usage.completion_tokens,
+    })
+    # Record reasoning tokens as a subset of output tokens (assumes the provider reports them)
+    span.set_attribute(
+        "gen_ai.usage.output_tokens.reasoning",
+        result.usage.completion_tokens_details.reasoning_tokens,
+    )
+```
+
 When previous thinking is fed back into a multi-turn request, include the same `reasoning` parts in the assistant messages within `gen_ai.input.messages`.
 
 ### Invoke Agent Span
@@ -129,7 +206,7 @@ For a complete guide on naming agents across all supported frameworks, see [Nami
 
 #### Example of an Invoke Agent Span:
 
-```python
+```python {tabTitle:Transaction Mode (Default)}
 import json
 import sentry_sdk
 
@@ -147,6 +224,31 @@ with sentry_sdk.start_span(op="gen_ai.invoke_agent", name="invoke_agent Weather
     span.set_data("gen_ai.usage.output_tokens", final_output.usage.output_tokens)
 ```
 
+```python {tabTitle:Stream Mode}
+import json
+import sentry_sdk
+
+with sentry_sdk.traces.start_span(
+    name="invoke_agent Weather Agent",
+    attributes={"sentry.op": "gen_ai.invoke_agent"},
+) as span:
+    span.set_attributes({
+        "gen_ai.operation.name": "invoke_agent",
+        "gen_ai.request.model": "o3-mini",
+        "gen_ai.agent.name": "Weather Agent",
+    })
+
+    final_output = my_agent.run()
+
+    span.set_attributes({
+        "gen_ai.output.messages": json.dumps([
+            {"role": "assistant", "parts": [{"type": "text", "content": str(final_output)}]}
+        ]),
+        "gen_ai.usage.input_tokens": final_output.usage.input_tokens,
+        "gen_ai.usage.output_tokens": final_output.usage.output_tokens,
+    })
+```
+
 ### Execute Tool Span
 
 This span represents the execution of a tool or function that was requested by an AI model, including the input arguments and resulting output.
@@ -157,7 +259,7 @@ This span represents the execution of a tool or function that was requested by a
 
 #### Example Execute Tool span
 
-```python
+```python {tabTitle:Transaction Mode (Default)}
 import json
 import sentry_sdk
 
@@ -171,6 +273,25 @@ with sentry_sdk.start_span(op="gen_ai.execute_tool", name="execute_tool get_weat
     span.set_data("gen_ai.tool.call.result", json.dumps(result))
 ```
 
+```python {tabTitle:Stream Mode}
+import json
+import sentry_sdk
+
+with sentry_sdk.traces.start_span(
+    name="execute_tool get_weather",
+    attributes={"sentry.op": "gen_ai.execute_tool"},
+) as span:
+    span.set_attributes({
+        "gen_ai.operation.name": "execute_tool",
+        "gen_ai.tool.name": "get_weather",
+        "gen_ai.tool.call.arguments": json.dumps({"location": "Paris"}),
+    })
+
+    result = get_weather(location="Paris")
+
+    span.set_attribute("gen_ai.tool.call.result", json.dumps(result))
+```
+
 ### Handoff Span
 
 This span marks the transition of control from one agent to another, typically when the current agent determines another agent is better suited to handle the task.
@@ -181,7 +302,7 @@ This span marks the transition of control from one agent to another, typically w
 
 #### Example of a Handoff Span
 
-```python
+```python {tabTitle:Transaction Mode (Default)}
 import sentry_sdk
 
 with sentry_sdk.start_span(op="gen_ai.handoff", name="handoff from Weather Agent to Travel Agent"):
@@ -192,10 +313,28 @@ with sentry_sdk.start_span(op="gen_ai.invoke_agent", name="invoke_agent Travel A
     pass
 ```
 
+```python {tabTitle:Stream Mode}
+import sentry_sdk
+
+with sentry_sdk.traces.start_span(
+    name="handoff from Weather Agent to Travel Agent",
+    attributes={"sentry.op": "gen_ai.handoff"},
+):
+    pass  # Handoff span just marks the transition
+
+with sentry_sdk.traces.start_span(
+    name="invoke_agent Travel Agent",
+    attributes={"sentry.op": "gen_ai.invoke_agent"},
+):
+    # Run the target agent here
+    pass
+```
+
 ## Tracking Conversations
 
 <Alert>
-  Tracking Conversations has **beta** stability. Configuration options and behavior may change.
+  Tracking Conversations has **beta** stability. Configuration options and
+  behavior may change.
 </Alert>
 
 For AI applications that involve multi-turn conversations, you can use `sentry_sdk.ai.set_conversation_id()` to associate all AI spans from the same conversation. This enables you to track and analyze complete <Link to="/ai/monitoring/conversations/">conversation</Link> flows within Sentry.
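
The message shape used in the stream-mode examples above (a `reasoning` part next to a `text` part, reused in the next turn's `gen_ai.input.messages`) can be factored into small helpers. The following is a minimal sketch that uses only the standard library and does not call the Sentry SDK; `user_message` and `assistant_message` are hypothetical helper names introduced here for illustration.

```python
import json


def user_message(text):
    """Build a user message in the parts format used in the examples above."""
    return {"role": "user", "parts": [{"type": "text", "content": text}]}


def assistant_message(text, reasoning=None):
    """Build an assistant message (hypothetical helper, not an SDK API)."""
    parts = []
    if reasoning is not None:
        # Keep thinking in its own `reasoning` part instead of folding it into `text`
        parts.append({"type": "reasoning", "content": reasoning})
    parts.append({"type": "text", "content": text})
    return {"role": "assistant", "parts": parts}


first_turn = [user_message("What is 6 * 7?")]
reply = assistant_message("The answer is 42.", reasoning="6 times 7 is 42.")

# Serialized value for `gen_ai.output.messages` on the first chat span
output_attr = json.dumps([reply])

# On the next turn, the earlier assistant message (including its reasoning
# part) is included in the value for `gen_ai.input.messages`
second_turn = first_turn + [reply, user_message("And 6 * 8?")]
input_attr = json.dumps(second_turn)
```

Both strings could then be set on the span in the same way the examples set `gen_ai.output.messages` and `gen_ai.input.messages`.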

diff --git a/...atforms/python/tracing/instrumentation/custom-instrumentation/caches-module.mdx b/...atforms/python/tracing/instrumentation/custom-instrumentation/caches-module.mdx
@@ -3,15 +3,22 @@ title: Instrument Caches
 sidebar_order: 1000
 description: "Learn how to manually instrument your code to use Sentry's Caches module. "
 ---
+
 A cache can be used to speed up data retrieval, thereby improving application performance. Because instead of getting data from a potentially slow data layer, your application will be getting data from memory (in a best case scenario). Caching can speed up read-heavy workloads for applications like Q&A portals, gaming, media sharing, and social networking.
 
+<Alert>
+
+This page covers both transaction mode (default) and stream mode. See <PlatformLink to="/tracing/new-spans/">New Spans</PlatformLink> to learn more.
+
+</Alert>
+
 Sentry offers a [cache-monitoring dashboard](https://sentry.io/orgredirect/organizations/:orgslug/dashboards/) that can be auto-instrumented for popular Python caching setups (like <PlatformLink to="/integrations/django/">Django</PlatformLink> and <PlatformLink to="/integrations/redis/">Redis</PlatformLink>).
 
 If you're using a custom caching solution that doesn't have auto instrumentation, you can manually instrument it and use Sentry to get a look into how your caching solution is performing by following the setup instructions below.
 
 To make it possible for Sentry to give you an overview of your cache performance, you'll need to create two spans - one indicating that something is being put into the cache, and a second one indicating that something is being fetched from the cache.
 
-Make sure that there's a transaction running when you create the spans. If you're using a web framework those transactions will be created for you automatically. See <PlatformLink to="/tracing/">Tracing</PlatformLink> for more information.
+In transaction mode, make sure that there's a transaction running when you create the spans. If you're using a web framework, those transactions will be created for you automatically. See <PlatformLink to="/tracing/">Tracing</PlatformLink> for more information.
 
 For detailed information about which data can be set, see the [Cache Module Developer Specification](https://develop.sentry.dev/sdk/performance/modules/caches/).
 
@@ -21,16 +28,16 @@ If you're using anything other than <PlatformLink to="/integrations/django/">Dja
 
 ### Add Span When Putting Data Into the Cache
 
-If the cache you’re using isn’t supported by auto instrumentation mentioned above, you can use the custom instrumentation instructions below to emit cache spans:
+If the cache you're using isn't supported by the auto instrumentation mentioned above, you can use the custom instrumentation instructions below to emit cache spans:
 
 1. Set the cache value with whatever cache library you happen to be using.
-2. Wrap the part of your application that uses the cached value with  `with sentry_sdk.start_span(...)`
+2. Wrap the part of your application that sets the cache value with `with sentry_sdk.start_span(...)` (transaction mode) or `with sentry_sdk.traces.start_span(...)` (stream mode).
 3. Set `op` to `cache.put`.
 4. Set `cache.item_size` to an integer representing the size of the cached item.
 
 (The steps described above are documented in the snippet.)
 
-```python
+```python {tabTitle:Transaction Mode (Default)}
 import my_caching
 import sentry_sdk
 
@@ -52,20 +59,43 @@ with sentry_sdk.start_span(op="cache.put") as span:
     span.set_data("cache.item_size", len(value))  # Warning: if value is very big this could use lots of memory
 ```
 
+```python {tabTitle:Stream Mode}
+import my_caching
+import sentry_sdk
+
+key = "myCacheKey123"
+value = "The value I want to cache."
+
+with sentry_sdk.traces.start_span(attributes={"sentry.op": "cache.put"}) as span:
+    # Set a key in your caching using your custom caching solution
+    my_caching.set(key, value)
+
+    # Describe the cache server you are accessing
+    span.set_attributes({
+        "network.peer.address": "cache.example.com/supercache",
+        "network.peer.port": 9000,
+
+        # Add the key(s) you want to set
+        "cache.key": [key],
+
+        # Add the size of the value you stored in the cache
+        "cache.item_size": len(value),  # Warning: if value is very big this could use lots of memory
+    })
+```
 
 ### Add Span When Retrieving Data From the Cache
 
-If the cache you’re using isn’t supported by auto instrumentation mentioned above, you can use the custom instrumentation instructions below to emit cache spans:
+If the cache you're using isn't supported by the auto instrumentation mentioned above, you can use the custom instrumentation instructions below to emit cache spans:
 
 1. Fetch the cached value from whatever cache library you happen to be using.
-2. Wrap the part of your application that uses the cached value with  `with sentry_sdk.start_span(...)`
+2. Wrap the part of your application that uses the cached value with `with sentry_sdk.start_span(...)` (transaction mode) or `with sentry_sdk.traces.start_span(...)` (stream mode).
 3. Set `op` to `cache.get`.
 4. Set `cache.hit` to a boolean value representing whether the value was successfully fetched from the cache or not.
 5. Set `cache.item_size` to an integer representing the size of the cached item.
 
 (The steps described above are documented in the snippet.)
 
-```python
+```python {tabTitle:Transaction Mode (Default)}
 import my_caching
 import sentry_sdk
 
@@ -94,4 +124,35 @@ with sentry_sdk.start_span(op="cache.get") as span:
         span.set_data("cache.hit", False)
 ```
 
+```python {tabTitle:Stream Mode}
+import my_caching
+import sentry_sdk
+
+key = "myCacheKey123"
+value = None
+
+with sentry_sdk.traces.start_span(attributes={"sentry.op": "cache.get"}) as span:
+    # Get a key from your caching solution
+    value = my_caching.get(key)
+
+    # Describe the cache server you are accessing
+    span.set_attributes({
+        "network.peer.address": "cache.example.com/supercache",
+        "network.peer.port": 9000,
+
+        # Add the key(s) you just retrieved from the cache
+        "cache.key": [key],
+    })
+
+    if value is not None:
+        # If you retrieved a value, the cache was hit
+        span.set_attribute("cache.hit", True)
+
+        # Optionally also add the size of the value you retrieved
+        span.set_attribute("cache.item_size", len(value))
+    else:
+        # If you could not retrieve a value, it was a miss
+        span.set_attribute("cache.hit", False)
+```
+
 You should now have the right spans in place. Head over to the [Cache dashboard](https://sentry.io/orgredirect/organizations/:orgslug/dashboards/) to see how your cache is performing.
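
The hit-or-miss logic from the stream-mode `cache.get` snippet above can also be collected into a single dictionary before it is attached to the span. The following is a minimal sketch that uses only the standard library; `cache_get_attributes` is a hypothetical helper name, not part of the Sentry SDK, and the address and port are the placeholder values from the snippet.

```python
def cache_get_attributes(key, value, address, port):
    """Build the attributes for a cache.get span (hypothetical helper, not an SDK API)."""
    attributes = {
        # Describe the cache server being accessed
        "network.peer.address": address,
        "network.peer.port": port,
        # The key(s) that were looked up
        "cache.key": [key],
        # A value of None is treated as a miss, as in the snippet above
        "cache.hit": value is not None,
    }
    if value is not None:
        # Only record a size when something was actually retrieved
        attributes["cache.item_size"] = len(value)
    return attributes


hit = cache_get_attributes("myCacheKey123", "cached text", "cache.example.com/supercache", 9000)
miss = cache_get_attributes("myCacheKey123", None, "cache.example.com/supercache", 9000)
```

In stream mode, the resulting dictionary could be passed to `span.set_attributes(...)` inside the `with` block shown in the snippet, which keeps the hit and miss branches in one place.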