10_TRACING

Tracing - Comprehensive Deep Dive

Overview

Tracing in the OpenAI Agents SDK records what happens during an agent run. Think of it as a "flight recorder" or "audit log": it captures the run from the initial input to the final output, including model calls, tool executions, handoffs, and guardrail checks. That record is what you use to debug, monitor, and understand agent behavior.

Core Concepts

What is Tracing?

Tracing is the systematic recording of events that occur during an agent run. It captures:

Timing - When each operation occurred
Duration - How long each operation took
Inputs/Outputs - What data was passed in and returned (omitted when sensitive data is excluded)
Relationships - How operations relate to each other
Metadata - Additional context about the run
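As a rough illustration of the list above, the sketch below records timing, duration, parent/child relationships, and metadata for nested operations using only the standard library. The names `record_span`, `events`, and `_stack` are hypothetical and are not part of the Agents SDK; this is a toy recorder, not how the SDK is implemented.

```python
import time
from contextlib import contextmanager

events = []   # flat log of finished operations (hypothetical store)
_stack = []   # names of operations currently open, innermost last

@contextmanager
def record_span(name, **metadata):
    """Record timing, duration, parent, and metadata for one operation."""
    parent = _stack[-1] if _stack else None
    _stack.append(name)
    started = time.perf_counter()
    try:
        yield
    finally:
        _stack.pop()
        events.append({
            "name": name,
            "parent": parent,                           # relationship
            "started": started,                         # timing
            "duration": time.perf_counter() - started,  # duration
            "metadata": metadata,                       # extra context
        })

with record_span("agent_run", user_id="user-456"):
    with record_span("model_call"):
        pass
    with record_span("tool_call"):
        pass

for event in events:
    print(event["name"], "<-", event["parent"])
```

Operations are logged when they finish, so the inner calls appear before the enclosing `agent_run` entry.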

Why Tracing Matters

Trace Structure

Trace Hierarchy

Plain text

Trace (entire run)
├── Span (operation)
│   ├── Span (sub-operation)
│   │   └── Span (sub-sub-operation)
│   └── Span (another sub-operation)
└── Span (another operation)
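The hierarchy above can be rebuilt from a flat list of spans as long as each span knows its parent. The sketch below assumes hypothetical `(span_id, parent_id, name)` records with made-up values and renders them as the same kind of tree.

```python
from collections import defaultdict

# (span_id, parent_id, name) records; the values are made up for illustration.
records = [
    ("s1", None, "agent"),
    ("s2", "s1", "generation"),
    ("s3", "s1", "function"),
    ("s4", "s3", "custom"),
    ("s5", None, "handoff"),
]

def render_tree(records, root_label="Trace"):
    """Return the records as an indented tree, children grouped under parents."""
    children = defaultdict(list)
    names = {}
    for span_id, parent_id, name in records:
        children[parent_id].append(span_id)
        names[span_id] = name

    lines = [root_label]

    def walk(parent_id, prefix):
        kids = children.get(parent_id, [])
        for index, span_id in enumerate(kids):
            last = index == len(kids) - 1
            lines.append(prefix + ("└── " if last else "├── ") + names[span_id])
            walk(span_id, prefix + ("    " if last else "│   "))

    walk(None, "")
    return "\n".join(lines)

print(render_tree(records))
```

Spans whose `parent_id` is `None` hang directly off the trace; every other span is nested under its parent.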

Trace Class

Python

@dataclass
class Trace:
    id: str
    """Unique identifier for the trace."""
    
    workflow_name: str
    """Name of the workflow being traced."""
    
    group_id: str | None
    """Group ID for linking related traces."""
    
    metadata: dict[str, Any]
    """Additional metadata about the trace."""
    
    spans: list[Span]
    """All spans in this trace."""
    
    state: TraceState
    """State of the trace (in_progress, completed, etc.)."""

Span Class

Python

@dataclass
class Span:
    id: str
    """Unique identifier for the span."""
    
    parent_id: str | None
    """ID of the parent span (if any)."""
    
    name: str
    """Name of the operation."""
    
    start_time: datetime
    """When the span started."""
    
    end_time: datetime | None
    """When the span ended (None if in progress)."""
    
    data: SpanData
    """Data associated with the span."""
    
    error: SpanError | None
    """Error if the span failed."""
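Because `end_time` is `None` while a span is in progress, any duration calculation has to handle that case. The sketch below uses `MiniSpan`, a trimmed, hypothetical stand-in for the fields listed above, and a helper named `duration_seconds` that is likewise an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class MiniSpan:
    """Trimmed, hypothetical stand-in for the Span fields listed above."""
    id: str
    parent_id: Optional[str]
    name: str
    start_time: datetime
    end_time: Optional[datetime] = None

def duration_seconds(span: MiniSpan) -> Optional[float]:
    """Return elapsed seconds, or None while the span is still in progress."""
    if span.end_time is None:
        return None
    return (span.end_time - span.start_time).total_seconds()

start = datetime(2024, 1, 1, 12, 0, 0)
finished = MiniSpan("s1", None, "agent", start, start + timedelta(seconds=2.5))
running = MiniSpan("s2", "s1", "generation", start + timedelta(seconds=1))

print(duration_seconds(finished))  # 2.5
print(duration_seconds(running))   # None
```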

Tracing Configuration

RunConfig Tracing Settings

Python

from agents import RunConfig

config = RunConfig(
    tracing_disabled=False,  # False (the default) leaves tracing on
    trace_include_sensitive_data=True,  # Record model and tool inputs/outputs in spans
    workflow_name="My workflow",
    trace_id="custom-trace-id",  # Placeholder; the SDK docs specify the form trace_<32_alphanumeric>
    group_id="conversation-123",
    trace_metadata={"user_id": "user-456"},
)
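The `group_id` setting is what ties several traces to one conversation. As a minimal sketch of how that link might be used downstream, the code below groups exported trace records by `group_id`. The record layout and the `group_traces` helper are hypothetical, not an SDK format.

```python
from collections import defaultdict

# Hypothetical exported trace records; only the fields used here are shown.
traces = [
    {"trace_id": "t1", "group_id": "conversation-123", "workflow_name": "My workflow"},
    {"trace_id": "t2", "group_id": "conversation-123", "workflow_name": "My workflow"},
    {"trace_id": "t3", "group_id": None, "workflow_name": "One-off job"},
]

def group_traces(traces):
    """Map each group_id to the trace IDs that share it; skip ungrouped traces."""
    groups = defaultdict(list)
    for record in traces:
        if record["group_id"] is not None:
            groups[record["group_id"]].append(record["trace_id"])
    return dict(groups)

print(group_traces(traces))  # {'conversation-123': ['t1', 't2']}
```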

Global Tracing Settings

Python

from agents import set_tracing_disabled, set_tracing_export_api_key

# Disable tracing globally
set_tracing_disabled(True)

# Set API key for trace export
set_tracing_export_api_key("your-api-key")

Trace Processors

TracingProcessor Interface

Python

from agents import TracingProcessor, Trace

class MyProcessor(TracingProcessor):
    async def process_trace(self, trace: Trace) -> None:
        """Process a completed trace."""
        # Send to your observability platform
        await send_to_platform(trace)
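When more than one processor is registered, a failure in one should not stop the others from receiving the trace. The sketch below shows one way to fan a trace out with `asyncio.gather`. It follows the `process_trace` shape used in this note; `CollectingProcessor`, `FailingProcessor`, and `dispatch` are hypothetical stand-ins, not SDK classes.

```python
import asyncio

class CollectingProcessor:
    """Hypothetical processor that just remembers the traces it receives."""
    def __init__(self):
        self.seen = []

    async def process_trace(self, trace):
        self.seen.append(trace)

class FailingProcessor:
    """Hypothetical processor whose backend is down."""
    async def process_trace(self, trace):
        raise ConnectionError("backend unavailable")

async def dispatch(trace, processors):
    """Send one trace to every processor; return the exceptions that occurred."""
    results = await asyncio.gather(
        *(p.process_trace(trace) for p in processors),
        return_exceptions=True,  # one failure must not block the others
    )
    return [r for r in results if isinstance(r, Exception)]

collector = CollectingProcessor()
errors = asyncio.run(dispatch({"id": "t1"}, [FailingProcessor(), collector]))
print(len(collector.seen), [type(e).__name__ for e in errors])
```

The collector still receives the trace even though the other processor raised, and the caller gets the exception back to log or retry.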

Manual Tracing

Creating Traces Manually

Python

from agents import custom_span, trace

with trace(workflow_name="custom_workflow") as trace_obj:
    # Manual operations
    with custom_span(name="step1"):
        do_step1()
    
    with custom_span(name="step2"):
        do_step2()
    
# Access the trace
print(trace_obj.id)

Span Data Types

AgentSpanData

Python

@dataclass
class AgentSpanData:
    agent_name: str
    """Name of the agent."""
    
    instructions: str | None
    """Agent instructions."""
    
    input_items: list[TResponseInputItem]
    """Input items."""
    
    output_items: list[RunItem]
    """Output items."""

GenerationSpanData

Python

@dataclass
class GenerationSpanData:
    model_name: str
    """Name of the model."""
    
    system_instructions: str | None
    """System instructions."""
    
    input_items: list[TResponseInputItem]
    """Input items."""
    
    output_items: list[RunItem]
    """Output items."""
    
    usage: Usage
    """Token usage."""

FunctionSpanData

Python

@dataclass
class FunctionSpanData:
    function_name: str
    """Name of the function."""
    
    arguments: dict[str, Any]
    """Function arguments."""
    
    result: Any
    """Function result."""

GuardrailSpanData

Python

@dataclass
class GuardrailSpanData:
    guardrail_name: str
    """Name of the guardrail."""
    
    guardrail_type: str
    """Type of guardrail."""
    
    input_data: Any
    """Input data."""
    
    output_data: Any
    """Output data."""
    
    tripwire_triggered: bool
    """Whether tripwire was triggered."""

Trace Errors

Manual Error Recording

Python

from agents import SpanError, custom_span

with custom_span(name="my_operation") as span:
    try:
        do_operation()
    except Exception as e:
        span.error = SpanError(
            message=str(e),
            data={"exception_type": type(e).__name__},
        )
        raise
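The record-then-re-raise pattern above can be wrapped in a reusable context manager so that every call site does not repeat the `try`/`except`. The sketch below uses a plain dictionary as a stand-in for a span; `recording_errors` and `span_record` are hypothetical names.

```python
from contextlib import contextmanager

@contextmanager
def recording_errors(record: dict):
    """Store exception details on `record`, then re-raise the exception."""
    try:
        yield record
    except Exception as exc:
        record["error"] = {
            "message": str(exc),
            "data": {"exception_type": type(exc).__name__},
        }
        raise

span_record = {"name": "my_operation", "error": None}
try:
    with recording_errors(span_record):
        raise ValueError("bad input")
except ValueError:
    pass  # the caller still sees the original exception

print(span_record["error"])
```

Re-raising matters: the error is attached for later inspection, but control flow in the calling code is unchanged.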

Trace Export

Custom Export

Python

from agents import Trace, add_trace_processor

class CustomExporter:
    async def export(self, trace: Trace) -> None:
        """Export trace to custom destination."""
        await send_to_custom_system(trace)

add_trace_processor(CustomExporter())

Trace Analysis

Analyzing Trace Duration

Python

def calculate_trace_duration(trace: Trace) -> float:
    """Calculate total trace duration."""
    if not trace.spans:
        return 0.0
    
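    start = min(span.start_time for span in trace.spans)
    # Ignore spans that are still in progress (end_time is None);
    # calling max() on an empty sequence would raise ValueError.
    end_times = [span.end_time for span in trace.spans if span.end_time]
    if not end_times:
        return 0.0
    
    return (max(end_times) - start).total_seconds()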
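Note that this measures wall-clock time from the earliest start to the latest end, which is not the same as adding up span durations when spans nest or overlap. A small worked example with made-up timestamps shows the difference.

```python
from datetime import datetime, timedelta

t0 = datetime(2024, 1, 1, 12, 0, 0)
# (start, end) pairs for three spans that overlap one another.
intervals = [
    (t0, t0 + timedelta(seconds=4)),
    (t0 + timedelta(seconds=1), t0 + timedelta(seconds=3)),
    (t0 + timedelta(seconds=2), t0 + timedelta(seconds=6)),
]

# Wall-clock duration: latest end minus earliest start.
wall_clock = (max(end for _, end in intervals)
              - min(start for start, _ in intervals)).total_seconds()
# Summed duration: counts overlapping time more than once.
summed = sum((end - start).total_seconds() for start, end in intervals)

print(wall_clock)  # 6.0
print(summed)      # 10.0
```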

Analyzing Token Usage

Python

def aggregate_token_usage(trace: Trace) -> Usage:
    """Aggregate token usage from all generation spans."""
    total = Usage(request_tokens=0, response_tokens=0, total_tokens=0)
    
    for span in trace.spans:
        if isinstance(span.data, GenerationSpanData):
            total += span.data.usage
    
    return total

Analyzing Tool Usage

Python

def count_tool_calls(trace: Trace) -> dict[str, int]:
    """Count tool calls by name."""
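    counts: dict[str, int] = {}
    
    for span in trace.spans:
        if span.name.startswith("tool_"):
            # Strip only the leading prefix; replace() would also remove
            # "tool_" from the middle of a name.
            tool_name = span.name.removeprefix("tool_")
            counts[tool_name] = counts.get(tool_name, 0) + 1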
    
    return counts

Analyzing Handoff Patterns

Python

def analyze_handoffs(trace: Trace) -> list[dict]:
    """Analyze handoff patterns."""
    handoffs = []
    
    for span in trace.spans:
        if isinstance(span.data, HandoffSpanData):
            handoffs.append({
                "from": span.data.from_agent,
                "to": span.data.to_agent,
                "timestamp": span.start_time,
            })
    
    return handoffs

Tracing Best Practices

4. Use Appropriate Granularity

Python

# Good - appropriate granularity
with agent_span(name="agent_execution"):
    with generation_span(name="llm_call"):
        ...
    with tool_span(name="tool_execution"):
        ...

# Avoid - too granular
with custom_span(name="variable_assignment"):
    x = 1  # Too detailed

Common Tracing Patterns

2. Performance Monitoring

Python

class PerformanceProcessor(TracingProcessor):
    async def process_trace(self, trace: Trace):
        duration = calculate_trace_duration(trace)
        if duration > 10.0:
            await alert_slow_trace(trace)

3. Error Tracking

Python

class ErrorTrackingProcessor(TracingProcessor):
    async def process_trace(self, trace: Trace):
        errors = [span.error for span in trace.spans if span.error]
        if errors:
            await report_errors(trace, errors)

5. Cost Tracking

Python

def estimate_cost(trace: Trace) -> float:
    """Estimate cost from token usage."""
    usage = aggregate_token_usage(trace)
    
    # Example GPT-4 pricing, in USD per token
    input_cost = usage.request_tokens * 0.00003
    output_cost = usage.response_tokens * 0.00006
    
    return input_cost + output_cost
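To make the arithmetic concrete, here is a worked example using the same illustrative per-token prices. The `PRICES` table and `cost_usd` helper are hypothetical names, and the token counts are made up.

```python
# Illustrative per-token prices in USD, matching the example figures above.
PRICES = {"input": 0.00003, "output": 0.00006}

def cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost in USD from token counts and the price table."""
    return input_tokens * PRICES["input"] + output_tokens * PRICES["output"]

# 1,000 input tokens and 500 output tokens:
#   1000 * 0.00003 = 0.03
#    500 * 0.00006 = 0.03
print(round(cost_usd(1000, 500), 4))  # 0.06
```

Keeping prices in a table rather than inline makes it easier to swap in figures for a different model.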

Tracing Security

Sensitive Data Protection

Python

# Option 1: Exclude sensitive data
config = RunConfig(
    trace_include_sensitive_data=False,
)

# Option 2: Sanitize in processor
class SanitizingProcessor(TracingProcessor):
    async def process_trace(self, trace: Trace):
        sanitized = sanitize_trace(trace)
        await send_to_platform(sanitized)
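The `sanitize_trace` helper is not defined in this note. One possible shape for it, sketched here on plain dictionaries and lists, is a recursive redaction pass. The `SENSITIVE_KEYS` set and the `redact` function are assumptions for illustration; a real trace object would first need converting to such a structure.

```python
SENSITIVE_KEYS = {"api_key", "password", "email"}  # assumed list; adjust as needed

def redact(value):
    """Return a copy of value with sensitive dict entries replaced."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value

payload = {
    "user_id": "user-456",
    "email": "someone@example.com",
    "tool_calls": [{"name": "lookup", "arguments": {"password": "hunter2"}}],
}
clean = redact(payload)
print(clean)
```

The function builds a new structure instead of mutating its input, so the original payload stays available to code that is allowed to see it.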

Access Control

Python

class AccessControlProcessor(TracingProcessor):
    async def process_trace(self, trace: Trace):
        user_id = trace.metadata.get("user_id")
        if user_has_access(user_id):
            await store_trace(trace)
        else:
            await log_denied_access(user_id)

Tracing Performance

Python

import random

# Option 1: Disable tracing in production
config = RunConfig(tracing_disabled=True)

# Option 2: Sample traces
class SamplingProcessor(TracingProcessor):
    async def process_trace(self, trace: Trace):
        if random.random() < 0.1:  # 10% sample
            await send_to_platform(trace)

# Option 3: Batch exports
class BatchProcessor(TracingProcessor):
    def __init__(self):
        self.batch = []
    
    async def process_trace(self, trace: Trace):
        self.batch.append(trace)
        if len(self.batch) >= 100:
            await export_batch(self.batch)
            self.batch = []
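Random sampling gives a different answer each time it is asked about the same trace. If several components must agree on whether a trace is kept, a deterministic alternative is to hash the trace ID. The sketch below is one such approach; `should_sample` is a hypothetical helper and the IDs are made up.

```python
import hashlib

def should_sample(trace_id: str, rate: float) -> bool:
    """Decide deterministically whether to keep a trace, given a rate in [0, 1]."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return bucket < rate

ids = [f"trace-{n}" for n in range(10_000)]
kept = sum(should_sample(trace_id, 0.1) for trace_id in ids)
print(kept)  # roughly 10% of the IDs, and the same set on every run
```

Because the decision depends only on the ID and the rate, every processor that applies the same rule keeps the same traces.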

Python

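from agents import Runner

# `agent` and `user_input` are assumed to be defined elsewhere; the name
# `input` is avoided because it shadows the Python builtin.
result = await Runner.run(agent, user_input)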

# Inspect the trace
print(f"Trace ID: {result.trace.id}")
print(f"Workflow: {result.trace.workflow_name}")
print(f"Spans: {len(result.trace.spans)}")

# Find slow spans (skip spans that have not finished yet)
for span in result.trace.spans:
    if span.end_time is None:
        continue
    duration = (span.end_time - span.start_time).total_seconds()
    if duration > 1.0:
        print(f"Slow span: {span.name} ({duration:.2f}s)")

Python

def visualize_trace(trace: Trace) -> str:
    """Create a text visualization of the trace."""
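    by_id = {span.id: span for span in trace.spans}

    def depth(span) -> int:
        # Span defines parent_id but no depth field, so count ancestors.
        level = 0
        while span.parent_id in by_id:
            span, level = by_id[span.parent_id], level + 1
        return level

    lines = [f"Trace: {trace.id}"]
    for span in trace.spans:
        # end_time is None while a span is still in progress.
        elapsed = (span.end_time - span.start_time) if span.end_time is not None else None
        timing = "in progress" if elapsed is None else f"{elapsed.total_seconds()}s"
        lines.append(f"{'  ' * depth(span)}{span.name} ({timing})")

    return "\n".join(lines)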

10_TRACING

Tracing - Comprehensive Deep Dive

Overview

Core Concepts

What is Tracing?

Why Tracing Matters

Trace Structure

Trace Hierarchy

Trace Class

Span Class

Span Types

Agent Span

Generation Span

Function Span

Guardrail Span

Handoff Span

Custom Span

Tool Span

MCP Tools Span

Tracing Configuration

RunConfig Tracing Settings

Global Tracing Settings

Environment Variables

Trace Processors

TracingProcessor Interface

Adding Trace Processors

Multiple Processors

Built-in Processors

Console Processor

File Processor

OpenAI Processor

Manual Tracing

Creating Traces Manually

Getting Current Trace

Getting Current Span

Span Data Types

AgentSpanData

GenerationSpanData

FunctionSpanData

GuardrailSpanData

Trace Errors

SpanError

Error Handling in Spans

Manual Error Recording

Trace Export

OpenAI Trace Export

Custom Export

Trace Analysis

Analyzing Trace Duration

Analyzing Token Usage

Analyzing Tool Usage

Analyzing Handoff Patterns

Tracing Best Practices

1. Use Descriptive Span Names

2. Include Relevant Metadata

3. Handle Sensitive Data

4. Use Appropriate Granularity

5. Clean Up Old Traces

Common Tracing Patterns

1. Distributed Tracing

2. Performance Monitoring

3. Error Tracking

4. Compliance Logging

5. Cost Tracking

Tracing and Streaming

Streaming Traces

Real-time Span Updates

Tracing and Sessions

Session-Aware Tracing

Cross-Session Tracing

Tracing Security

Sensitive Data Protection

Access Control

Tracing Performance