Tracing - Comprehensive Deep Dive

Overview

Tracing in the OpenAI Agents SDK provides comprehensive observability into agent runs. Think of tracing as a "flight recorder" or "audit log" that captures everything that happens during an agent execution - from the initial input to the final output, including all model calls, tool executions, handoffs, and more. This is essential for debugging, monitoring, and understanding agent behavior.

Core Concepts

What is Tracing?

Tracing is the systematic recording of events that occur during an agent run. It captures:

  • Timing - When each operation occurred
  • Duration - How long each operation took
  • Inputs/Outputs - What data was passed (unless sensitive)
  • Relationships - How operations relate to each other
  • Metadata - Additional context about the run

Why Tracing Matters

  • Debugging - Understand what went wrong in a failed run
  • Performance - Identify bottlenecks and slow operations
  • Observability - Monitor agent behavior in production
  • Analysis - Analyze patterns and optimize workflows
  • Compliance - Maintain audit trails for regulatory requirements
  • Testing - Verify agent behavior matches expectations

Trace Structure

    Trace Hierarchy

    Traces are organized hierarchically:

    Plain text
    Trace (entire run)
    ├── Span (operation)
    │   ├── Span (sub-operation)
    │   │   └── Span (sub-sub-operation)
    │   └── Span (another sub-operation)
    └── Span (another operation)

    Analogy: Think of it like a tree where the Trace is the trunk and Spans are branches.
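
To make the hierarchy concrete, here is a minimal sketch that rebuilds the tree from parent links and prints it with indentation. SimpleSpan and render_tree are illustrative stand-ins invented for this example, not SDK classes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SimpleSpan:
    """Illustrative stand-in for a span; not the SDK's Span class."""
    id: str
    name: str
    parent_id: Optional[str] = None


def render_tree(spans):
    """Return one indented line per span, with children listed under their parent."""
    children = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)

    lines = []

    def walk(parent_id, depth):
        for span in children.get(parent_id, []):
            lines.append("  " * depth + span.name)
            walk(span.id, depth + 1)

    walk(None, 0)  # top-level spans have no parent
    return lines


spans = [
    SimpleSpan("s1", "agent"),
    SimpleSpan("s2", "generation", parent_id="s1"),
    SimpleSpan("s3", "function", parent_id="s1"),
    SimpleSpan("s4", "handoff"),
]
tree_lines = render_tree(spans)
print("\n".join(tree_lines))
```

Spans with no parent hang directly off the trace; every other span is nested under the span whose ID matches its parent ID.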

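    Trace Class

    The dataclasses in this section are a simplified model used throughout these notes. The SDK's own classes differ in detail; for example, a trace's identifier is exposed as trace_id and a span's as span_id.
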
    Python
    @dataclass
    class Trace:
        id: str
        """Unique identifier for the trace."""
        
        workflow_name: str
        """Name of the workflow being traced."""
        
        group_id: str | None
        """Group ID for linking related traces."""
        
        metadata: dict[str, Any]
        """Additional metadata about the trace."""
        
        spans: list[Span]
        """All spans in this trace."""
        
        state: TraceState
        """State of the trace (in_progress, completed, etc.)."""
    Span Class
    Python
    @dataclass
    class Span:
        id: str
        """Unique identifier for the span."""
        
        parent_id: str | None
        """ID of the parent span (if any)."""
        
        name: str
        """Name of the operation."""
        
        start_time: datetime
        """When the span started."""
        
        end_time: datetime | None
        """When the span ended (None if in progress)."""
        
        data: SpanData
        """Data associated with the span."""
        
        error: SpanError | None
        """Error if the span failed."""

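Because end_time is None while a span is still running, duration code has to handle both cases. The sketch below assumes a stand-in TimedSpan class (hypothetical, holding only the timing fields) and measures unfinished spans up to a supplied reference time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class TimedSpan:
    """Illustrative stand-in holding only the timing fields."""
    name: str
    start_time: datetime
    end_time: Optional[datetime] = None


def span_duration(span, now=None):
    """Seconds elapsed; an unfinished span is measured up to `now`."""
    end = span.end_time or now or datetime.now(timezone.utc)
    return (end - span.start_time).total_seconds()


start = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
finished = TimedSpan("llm_call", start, start + timedelta(seconds=2.5))
running = TimedSpan("tool_call", start)  # no end_time yet

finished_seconds = span_duration(finished)
running_seconds = span_duration(running, now=start + timedelta(seconds=4))
print(finished_seconds, running_seconds)
```

Using timezone-aware datetimes throughout avoids errors from mixing naive and aware values.
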
    Span Types

    Agent Span

    Represents an agent's execution:

    Python
    from agents import agent_span
    
    # The span starts on entry and is finished when the block exits
    with agent_span(name="my_agent") as span:
        # Agent execution
        pass

    AgentSpanData includes:

    • Agent name
    • Agent instructions
    • Input items
    • Output items
    • Tool calls
    • Handoffs
    Generation Span

    Represents a model generation:

    Python
    from agents import generation_span
    
    # generation_span is described by the model it calls, not by a span name
    with generation_span(model="gpt-4o") as span:
        response = await model.get_response(...)

    GenerationSpanData includes:

    • Model name
    • System instructions
    • Input items
    • Output items
    • Token usage
    • Tool calls
    Function Span

    Represents a function execution:

    Python
    from agents import function_span, FunctionSpanData
    
    with function_span(name="my_function") as span:
        result = my_function()

    FunctionSpanData includes:

    • Function name
    • Arguments
    • Return value
    • Duration
    Guardrail Span

    Represents a guardrail execution:

    Python
    from agents import guardrail_span, GuardrailSpanData
    
    with guardrail_span(name="input_guardrail") as span:
        result = await guardrail.run(...)

    GuardrailSpanData includes:

    • Guardrail name
    • Guardrail type (input/output/tool)
    • Input data
    • Output data
    • Tripwire triggered
    Handoff Span

    Represents a handoff between agents:

    Python
    from agents import handoff_span
    
    # handoff_span is described by the two agent names rather than a span name
    with handoff_span(from_agent="triage", to_agent="specialist") as span:
        await handoff.on_invoke_handoff(...)

    HandoffSpanData includes:

    • From agent name
    • To agent name
    • Handoff arguments
    • Handoff result
    Custom Span

    Represents a custom operation:

    Python
    from agents import custom_span, CustomSpanData
    
    with custom_span(name="my_operation", data={"key": "value"}) as span:
        # Custom operation
        pass
    Tool Span

    Represents a tool execution. Function tools are recorded with function_span:

    Python
    from agents import function_span
    
    with function_span(name="my_tool") as span:
        result = await tool.execute(...)
    MCP Tools Span

    Represents MCP tool operations:

    Python
    from agents import mcp_tools_span
    
    # The span records which MCP server was asked for its tool list
    with mcp_tools_span(server="my_mcp_server") as span:
        tools = await mcp_server.list_tools()

    Tracing Configuration

    RunConfig Tracing Settings

    Configure tracing for a run:

    Python
    from agents import RunConfig
    
    config = RunConfig(
        tracing_disabled=False,  # Tracing is on (this is the default)
        trace_include_sensitive_data=True,  # Record model and tool inputs/outputs
        workflow_name="My workflow",
        trace_id="trace_0123456789abcdef0123456789abcdef",  # Format: trace_<32 alphanumeric chars>
        group_id="conversation-123",
        trace_metadata={"user_id": "user-456"},
    )
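
Custom trace IDs are easiest to get right when they are generated and checked in one place. The sketch below assumes the trace_<32 alphanumeric characters> format described in the SDK documentation; new_trace_id and is_valid_trace_id are hypothetical helpers written for this note.

```python
import re
import uuid

# Assumed format: the literal prefix "trace_" followed by 32 alphanumeric characters.
TRACE_ID_PATTERN = re.compile(r"trace_[A-Za-z0-9]{32}")


def new_trace_id():
    """Build an ID from a random UUID rendered as 32 hex characters."""
    return f"trace_{uuid.uuid4().hex}"


def is_valid_trace_id(value):
    """True only if the whole string matches the assumed format."""
    return TRACE_ID_PATTERN.fullmatch(value) is not None


example_id = new_trace_id()
print(example_id, is_valid_trace_id(example_id))
print(is_valid_trace_id("custom-trace-id"))
```

Validating before a run starts surfaces a malformed ID early instead of at export time.
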
    Global Tracing Settings

    Set global tracing defaults:

    Python
    from agents import set_tracing_disabled, set_tracing_export_api_key
    
    # Disable tracing globally
    set_tracing_disabled(True)
    
    # Set API key for trace export
    set_tracing_export_api_key("your-api-key")
    Environment Variables

    Configure tracing via environment:

    Bash
    # Disable tracing
    export OPENAI_AGENTS_DISABLE_TRACING=1
    
    # Include sensitive data in traces
    export OPENAI_AGENTS_TRACE_INCLUDE_SENSITIVE_DATA=true

    Trace Processors

    TracingProcessor Interface

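    Custom trace processors receive callbacks as traces and spans start and end:

    Python
    from typing import Any
    from agents import Span, Trace, TracingProcessor
    
    class MyProcessor(TracingProcessor):
        # Callbacks are synchronous, so keep them fast and non-blocking
        def on_trace_start(self, trace: Trace) -> None: ...
        def on_span_start(self, span: Span[Any]) -> None: ...
        def on_span_end(self, span: Span[Any]) -> None: ...
        def shutdown(self) -> None: ...
        def force_flush(self) -> None: ...
    
        def on_trace_end(self, trace: Trace) -> None:
            """Process a completed trace."""
            send_to_platform(trace)  # Send to your observability platform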
    Adding Trace Processors

    Add processors to handle traces:

    Python
    from agents import add_trace_processor
    
    processor = MyProcessor()
    add_trace_processor(processor)
    Multiple Processors

    You can have multiple processors:

    Python
    add_trace_processor(ConsoleProcessor())
    add_trace_processor(DatabaseProcessor())
    add_trace_processor(AlertingProcessor())
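
Conceptually, registering several processors means each finished trace is fanned out to all of them. The sketch below illustrates that idea only; it is not the SDK's dispatch code, and RecordingProcessor, FailingProcessor, and dispatch_trace_end are hypothetical names with a made-up handle method.

```python
class RecordingProcessor:
    """Hypothetical processor that remembers the trace IDs it was given."""

    def __init__(self, label):
        self.label = label
        self.seen = []

    def handle(self, trace_id):
        self.seen.append(trace_id)


class FailingProcessor:
    """Hypothetical processor whose backend is down."""

    def handle(self, trace_id):
        raise RuntimeError("exporter unavailable")


def dispatch_trace_end(processors, trace_id):
    """Call every processor in registration order; collect failures instead of stopping."""
    failures = []
    for processor in processors:
        try:
            processor.handle(trace_id)
        except Exception as exc:
            failures.append(f"{type(processor).__name__}: {exc}")
    return failures


console = RecordingProcessor("console")
database = RecordingProcessor("database")
failures = dispatch_trace_end([console, FailingProcessor(), database], "trace_abc")
print(console.seen, database.seen, failures)
```

Isolating each call means one broken destination does not stop the others from receiving the trace.
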

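    Built-in Processors

    The SDK ships a batching processor and span exporters in agents.tracing.processors; you combine them rather than importing one class per destination.

    Console Processor

    Print traces and spans to the console:

    Python
    from agents import add_trace_processor
    from agents.tracing.processors import BatchTraceProcessor, ConsoleSpanExporter
    
    add_trace_processor(BatchTraceProcessor(ConsoleSpanExporter()))
    File Processor

    Saving traces to files needs a processor of your own. FileProcessor below is a hypothetical class you would write, not an SDK import:

    Python
    add_trace_processor(FileProcessor(directory="/traces"))
    OpenAI Processor

    Send traces to OpenAI. The SDK registers this exporter by default, so build one yourself only to customize it:

    Python
    from agents import set_trace_processors
    from agents.tracing.processors import BackendSpanExporter, BatchTraceProcessor
    
    # Replace the default processors instead of adding a duplicate exporter
    set_trace_processors([BatchTraceProcessor(BackendSpanExporter(api_key="your-key"))])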

    Manual Tracing

    Creating Traces Manually

    Create traces for custom operations:

    Python
    from agents import trace, custom_span
    
    with trace(workflow_name="custom_workflow") as trace_obj:
        # Manual operations
        with custom_span(name="step1"):
            do_step1()
        
        with custom_span(name="step2"):
            do_step2()
        
    # Access the trace ID after the block
    print(trace_obj.trace_id)
    Getting Current Trace

    Access the current trace context:

    Python
    from agents import get_current_trace
    
    current = get_current_trace()  # None when no trace is active
    if current is not None:
        print(f"Current trace ID: {current.trace_id}")
    Getting Current Span

    Access the current span context:

    Python
    from agents import get_current_span
    
    span = get_current_span()  # None when no span is active
    if span is not None:
        print(f"Current span: {span.span_id}")
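
One common way to implement "current span" lookup in Python is the standard contextvars module, which keeps a separate value per thread and per asyncio task. The sketch below shows the technique in isolation; demo_span and current_span_name are hypothetical names, and this is not presented as the SDK's implementation.

```python
import contextvars
from contextlib import contextmanager

# Holds the name of the innermost active span for the current context.
_current_span = contextvars.ContextVar("current_span", default=None)


@contextmanager
def demo_span(name):
    """Mark `name` as current for the duration of the block, then restore."""
    token = _current_span.set(name)
    try:
        yield name
    finally:
        _current_span.reset(token)  # restores whatever was current before


def current_span_name():
    return _current_span.get()


observed = [current_span_name()]          # outside any span
with demo_span("outer"):
    observed.append(current_span_name())
    with demo_span("inner"):
        observed.append(current_span_name())
    observed.append(current_span_name())  # inner exited, outer restored
observed.append(current_span_name())
print(observed)
```

Resetting with the token, rather than setting the value back to None, is what makes nesting work.
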

    Span Data Types

    AgentSpanData
    Python
    @dataclass
    class AgentSpanData:
        agent_name: str
        """Name of the agent."""
        
        instructions: str | None
        """Agent instructions."""
        
        input_items: list[TResponseInputItem]
        """Input items."""
        
        output_items: list[RunItem]
        """Output items."""
    GenerationSpanData
    Python
    @dataclass
    class GenerationSpanData:
        model_name: str
        """Name of the model."""
        
        system_instructions: str | None
        """System instructions."""
        
        input_items: list[TResponseInputItem]
        """Input items."""
        
        output_items: list[RunItem]
        """Output items."""
        
        usage: Usage
        """Token usage."""
    FunctionSpanData
    Python
    @dataclass
    class FunctionSpanData:
        function_name: str
        """Name of the function."""
        
        arguments: dict[str, Any]
        """Function arguments."""
        
        result: Any
        """Function result."""
    GuardrailSpanData
    Python
    @dataclass
    class GuardrailSpanData:
        guardrail_name: str
        """Name of the guardrail."""
        
        guardrail_type: str
        """Type of guardrail."""
        
        input_data: Any
        """Input data."""
        
        output_data: Any
        """Output data."""
        
        tripwire_triggered: bool
        """Whether tripwire was triggered."""

    Trace Errors

    SpanError

    Errors in spans are captured:

    Python
    @dataclass
    class SpanError:
        message: str
        """Error message."""
        
        data: dict[str, Any]
        """Additional error data."""
    Error Handling in Spans

    Errors are automatically captured:

    Python
    with custom_span(name="risky_operation") as span:
        try:
            risky_operation()
        except Exception as e:
            # Error is automatically captured in span
            raise
    Manual Error Recording

    Manually record errors:

    Python
    from agents import SpanError, custom_span
    
    with custom_span(name="my_operation") as span:
        try:
            do_operation()
        except Exception as e:
            # Attach the error to the span, then let the exception propagate
            span.set_error(SpanError(
                message=str(e),
                data={"exception_type": type(e).__name__},
            ))
            raise
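
The record-then-re-raise pattern can be wrapped in a context manager so call sites stay short. This is a self-contained sketch with stand-in types: DemoSpan and recording_span are hypothetical and do not use the SDK.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class DemoSpan:
    """Illustrative stand-in for a span with an optional error payload."""
    name: str
    error: Optional[dict[str, Any]] = None


@contextmanager
def recording_span(name, sink):
    """Yield a span, record any exception on it, and always append it to `sink`."""
    span = DemoSpan(name)
    try:
        yield span
    except Exception as exc:
        span.error = {
            "message": str(exc),
            "data": {"exception_type": type(exc).__name__},
        }
        raise  # the caller still sees the original exception
    finally:
        sink.append(span)


finished = []
try:
    with recording_span("risky_operation", finished):
        raise ValueError("bad input")
except ValueError:
    pass

print(finished[0].error)
```

The finally clause guarantees the span is recorded whether the block succeeds or fails.
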

    Trace Export

    OpenAI Trace Export

    Export traces to OpenAI:

    Python
    from agents import Runner, set_tracing_export_api_key
    
    set_tracing_export_api_key("your-openai-api-key")
    
    # Traces are automatically sent to OpenAI
    result = await Runner.run(agent, input)
    Custom Export

    Implement custom export:

    Python
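    from agents import add_trace_processor
    from agents.tracing.processor_interface import TracingExporter
    from agents.tracing.processors import BatchTraceProcessor
    
    class CustomExporter(TracingExporter):
        def export(self, items) -> None:
            """Export a batch of finished traces and spans to a custom destination."""
            send_to_custom_system([item.export() for item in items])
    
    # Exporters are wrapped in a processor before registration
    add_trace_processor(BatchTraceProcessor(CustomExporter()))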

    Trace Analysis

    Analyzing Trace Duration

    Calculate total duration:

    Python
    def calculate_trace_duration(trace: Trace) -> float:
        """Calculate wall-clock duration from the first span start to the last span end."""
        end_times = [span.end_time for span in trace.spans if span.end_time]
        if not end_times:
            return 0.0  # No spans, or none have finished yet
        
        start = min(span.start_time for span in trace.spans)
        return (max(end_times) - start).total_seconds()
    Analyzing Token Usage

    Aggregate token usage across spans:

    Python
    def aggregate_token_usage(trace: Trace) -> Usage:
        """Aggregate token usage from all generation spans."""
        total = Usage()  # All counters start at zero
        
        for span in trace.spans:
            if isinstance(span.data, GenerationSpanData):
                total.add(span.data.usage)  # Usage.add accumulates in place
        
        return total
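
As a worked example of the aggregation, the sketch below sums per-call usage records held as plain dictionaries. The records and the key names input_tokens and output_tokens are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical per-call usage records, one per generation span.
generation_usages = [
    {"input_tokens": 1200, "output_tokens": 300},
    {"input_tokens": 800, "output_tokens": 150},
    {"input_tokens": 500, "output_tokens": 50},
]


def sum_usage(usages):
    """Add up each counter across calls and derive the combined total."""
    total = Counter()
    for usage in usages:
        total.update(usage)  # Counter.update adds values key by key
    total["total_tokens"] = total["input_tokens"] + total["output_tokens"]
    return dict(total)


totals = sum_usage(generation_usages)
print(totals)
```
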
    Analyzing Tool Usage

    Count tool executions:

    Python
    def count_tool_calls(trace: Trace) -> dict[str, int]:
        """Count tool calls by name."""
        counts: dict[str, int] = {}
        
        for span in trace.spans:
            if span.name.startswith("tool_"):
                # Strip only the leading prefix, not every "tool_" in the name
                tool_name = span.name.removeprefix("tool_")
                counts[tool_name] = counts.get(tool_name, 0) + 1
        
        return counts
    Analyzing Handoff Patterns

    Track handoff patterns:

    Python
    def analyze_handoffs(trace: Trace) -> list[dict]:
        """Analyze handoff patterns."""
        handoffs = []
        
        for span in trace.spans:
            if isinstance(span.data, HandoffSpanData):
                handoffs.append({
                    "from": span.data.from_agent,
                    "to": span.data.to_agent,
                    "timestamp": span.start_time,
                })
        
        return handoffs
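
Once handoffs are collected as from/to records, two quick questions are how often each transition occurs and whether any pair of agents hands work back and forth. The sketch below runs on hypothetical sample records shaped like the output above, minus the timestamp.

```python
from collections import Counter

# Hypothetical handoff records.
handoffs = [
    {"from": "triage", "to": "billing"},
    {"from": "billing", "to": "triage"},
    {"from": "triage", "to": "billing"},
    {"from": "triage", "to": "support"},
]


def transition_counts(records):
    """Count how many times each (from, to) transition occurred."""
    return Counter((r["from"], r["to"]) for r in records)


def find_ping_pong(records):
    """Return agent pairs that hand off to each other in both directions."""
    edges = {(r["from"], r["to"]) for r in records}
    return sorted({tuple(sorted(e)) for e in edges if (e[1], e[0]) in edges})


counts = transition_counts(handoffs)
loops = find_ping_pong(handoffs)
print(counts.most_common(1), loops)
```

A pair that appears in the second result is a candidate for a routing loop worth inspecting in the trace.
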

    Tracing Best Practices

    1. Use Descriptive Span Names

    Use clear, descriptive span names:

    Python
    # Good
    with custom_span(name="summarize_support_ticket"):
        ...
    
    # Avoid
    with custom_span(name="step"):
        ...  # Too vague
    2. Include Relevant Metadata

    Add helpful metadata:

    Python
    # Good
    with custom_span(
        name="file_processing",
        data={"file_path": "/workspace/data.txt", "file_size": 1024},
    ):
        ...
    
    # Avoid
    with custom_span(name="file_processing"):
        ...  # No context
    3. Handle Sensitive Data

    Be careful with sensitive data:

    Python
    # Good - exclude sensitive data
    config = RunConfig(
        trace_include_sensitive_data=False,
    )
    
    # Avoid - leak sensitive data
    config = RunConfig(
        trace_include_sensitive_data=True,
    )
    4. Use Appropriate Granularity

    Choose the right level of detail:

    Python
    # Good - appropriate granularity
    with agent_span(name="agent_execution"):
        with generation_span(model="gpt-4o"):
            ...
        with function_span(name="tool_execution"):
            ...
    
    # Avoid - too granular
    with custom_span(name="variable_assignment"):
        x = 1  # Too detailed
    5. Clean Up Old Traces

    Implement trace cleanup:

    Python
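    from datetime import datetime, timedelta, timezone
    
    async def cleanup_old_traces(days: int = 30):
        """Clean up traces older than N days."""
        cutoff = datetime.now(timezone.utc) - timedelta(days=days)
        await trace_store.delete_before(cutoff)  # trace_store: your own storage layer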
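
The cleanup call above depends on a storage layer you supply. As a minimal sketch of what delete_before could do, here is a hypothetical in-memory store; InMemoryTraceStore is invented for this example and a real store would sit on a database or object storage.

```python
from datetime import datetime, timedelta, timezone


class InMemoryTraceStore:
    """Hypothetical store mapping trace IDs to their completion time."""

    def __init__(self):
        self._finished_at = {}

    def add(self, trace_id, finished_at):
        self._finished_at[trace_id] = finished_at

    def delete_before(self, cutoff):
        """Remove traces finished strictly before `cutoff`; return how many."""
        stale = [tid for tid, ts in self._finished_at.items() if ts < cutoff]
        for tid in stale:
            del self._finished_at[tid]
        return len(stale)

    def ids(self):
        return sorted(self._finished_at)


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
store = InMemoryTraceStore()
store.add("trace_old", now - timedelta(days=45))
store.add("trace_edge", now - timedelta(days=30))
store.add("trace_new", now - timedelta(days=1))

removed = store.delete_before(now - timedelta(days=30))
print(removed, store.ids())
```

The comparison is strict, so a trace finished exactly at the cutoff is kept.
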

    Common Tracing Patterns

    1. Distributed Tracing

    Link traces across services:

    Python
    config = RunConfig(
        group_id="request-123",  # Link related traces
        trace_metadata={"service": "agent-service"},
    )
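
On the analysis side, a shared group ID lets you reassemble everything that belonged to one request. The sketch below groups hypothetical exported trace records by that field; the record shape is an assumption for illustration.

```python
from collections import defaultdict

# Hypothetical exported trace records from two services.
trace_records = [
    {"trace_id": "trace_a", "group_id": "request-123", "service": "gateway"},
    {"trace_id": "trace_b", "group_id": "request-123", "service": "agent-service"},
    {"trace_id": "trace_c", "group_id": "request-456", "service": "agent-service"},
    {"trace_id": "trace_d", "group_id": None, "service": "batch-job"},
]


def group_traces(records):
    """Map each group ID to the trace IDs that share it, preserving input order."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["group_id"]].append(record["trace_id"])
    return dict(grouped)


by_group = group_traces(trace_records)
print(by_group["request-123"])
```

Traces without a group ID collect under None, which makes unlinked runs easy to spot.
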
    2. Performance Monitoring

    Track performance metrics:

    Python
    class PerformanceProcessor(TracingProcessor):  # Other callbacks omitted
        def on_trace_end(self, trace: Trace) -> None:
            duration = calculate_trace_duration(trace)
            if duration > 10.0:
                alert_slow_trace(trace)
    3. Error Tracking

    Track errors in traces:

    Python
    class ErrorTrackingProcessor(TracingProcessor):  # Other callbacks omitted
        def on_trace_end(self, trace: Trace) -> None:
            errors = [span.error for span in trace.spans if span.error]
            if errors:
                report_errors(trace, errors)
    4. Compliance Logging

    Maintain compliance logs:

    Python
    class ComplianceProcessor(TracingProcessor):  # Other callbacks omitted
        def on_trace_end(self, trace: Trace) -> None:
            # Log for compliance
            compliance_log.store(trace)
    5. Cost Tracking

    Estimate costs from traces:

    Python
    def estimate_cost(trace: Trace) -> float:
        """Estimate cost from token usage."""
        usage = aggregate_token_usage(trace)
        
        # GPT-4 pricing (example), in USD per token
        input_cost = usage.input_tokens * 0.00003
        output_cost = usage.output_tokens * 0.00006
        
        return input_cost + output_cost
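
Per-token rates are easier to maintain when stored per million tokens, which is how providers usually quote them. The sketch below works one example through; the model names and prices are placeholders, not real rates, and estimate_cost_usd is a hypothetical helper.

```python
# Placeholder prices in USD per one million tokens; substitute current rates.
PRICES_PER_MILLION = {
    "example-large": {"input": 30.00, "output": 60.00},
    "example-small": {"input": 0.50, "output": 1.50},
}


def estimate_cost_usd(model, input_tokens, output_tokens):
    """Cost = tokens x price per million / 1,000,000, summed over both directions."""
    price = PRICES_PER_MILLION[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


# 2,500 input tokens at $30/M is $0.075; 500 output tokens at $60/M is $0.030.
cost = estimate_cost_usd("example-large", input_tokens=2_500, output_tokens=500)
print(f"${cost:.4f}")
```

A rate of $30 per million tokens is the same as the 0.00003 per-token figure used above.
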

    Tracing and Streaming

    Streaming Traces

    Traces are built incrementally during streaming:

    Python
    result = Runner.run_streamed(agent, input)
    async for event in result.stream_events():
        # Trace is being built in real-time
        pass
    
    # Access final trace
    trace = result.trace
    Real-time Span Updates

    Spans are updated as operations complete:

    Python
    with generation_span(model="gpt-4o") as span:
        async for chunk in model.stream_response(...):
            # Span is still in progress
            pass
    # Span is now complete (it finishes when the with-block exits)

    Tracing and Sessions

    Session-Aware Tracing

    Traces can include session information:

    Python
    config = RunConfig(
        trace_metadata={
            "conversation_id": "conv-123",
            "session_type": "SQLiteSession",
        },
    )
    Cross-Session Tracing

    Link traces across sessions:

    Python
    config = RunConfig(
        group_id="user-123",  # Link all user's traces
    )

    Tracing Security

    Sensitive Data Protection

    Protect sensitive data in traces:

    Python
    # Option 1: Exclude sensitive data
    config = RunConfig(
        trace_include_sensitive_data=False,
    )
    
    # Option 2: Sanitize in processor
    class SanitizingProcessor(TracingProcessor):  # Other callbacks omitted
        def on_trace_end(self, trace: Trace) -> None:
            sanitized = sanitize_trace(trace)
            send_to_platform(sanitized)
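
The sanitizing step itself can be a small recursive redaction over the exported data. This sketch assumes traces have already been converted to plain dictionaries and lists; SENSITIVE_KEYS and sanitize are hypothetical, and the key list should be adapted to your own data.

```python
# Assumed set of field names to redact; extend for your own payloads.
SENSITIVE_KEYS = {"api_key", "password", "email", "authorization"}


def sanitize(value):
    """Return a copy with values under sensitive keys replaced by a placeholder."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if str(key).lower() in SENSITIVE_KEYS else sanitize(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [sanitize(item) for item in value]
    return value  # scalars pass through unchanged


raw = {
    "user_id": "user-456",
    "email": "a@example.com",
    "tool_calls": [
        {"name": "lookup", "arguments": {"api_key": "secret-value", "query": "order 7"}}
    ],
}
clean = sanitize(raw)
print(clean)
```

The function builds new containers rather than mutating its input, so the original record stays intact for local debugging.
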
    Access Control

    Control who can access traces:

    Python
    class AccessControlProcessor(TracingProcessor):  # Other callbacks omitted
        def on_trace_end(self, trace: Trace) -> None:
            user_id = trace.metadata.get("user_id")
            if user_has_access(user_id):
                store_trace(trace)
            else:
                log_denied_access(user_id)

    Tracing Performance

    Performance Impact

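    Tracing is designed to add little overhead to a run:

    Python
    # Spans use lightweight data structures
    # The default processor queues items and exports them in batches on a background thread
    # Custom processor callbacks run inline, so keep them fast and non-blocking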
    Optimizing Tracing

    Optimize for performance:

    Python
    # Option 1: Disable tracing in production
    config = RunConfig(tracing_disabled=True)
    
    # Option 2: Sample traces
    import random
    
    class SamplingProcessor(TracingProcessor):  # Other callbacks omitted
        def on_trace_end(self, trace: Trace) -> None:
            if random.random() < 0.1:  # 10% sample
                send_to_platform(trace)
    
    # Option 3: Batch exports
    class BatchProcessor(TracingProcessor):  # Other callbacks omitted
        def __init__(self):
            self.batch = []
        
        def on_trace_end(self, trace: Trace) -> None:
            self.batch.append(trace)
            if len(self.batch) >= 100:
                export_batch(self.batch)
                self.batch = []
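
Random sampling makes a different keep-or-drop decision every time it runs. An alternative is to hash the trace ID, so every service reaches the same decision for the same trace. The sketch below shows that technique with a hypothetical should_sample helper.

```python
import hashlib


def should_sample(trace_id, rate):
    """Deterministically keep roughly `rate` of traces based on a hash of the ID."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform value in [0, 1)
    return bucket < rate


ids = [f"trace_{i:05d}" for i in range(10_000)]
kept = [tid for tid in ids if should_sample(tid, 0.1)]
print(f"kept {len(kept)} of {len(ids)} traces")
```

Because the decision depends only on the ID, re-running the filter or applying it in another process keeps exactly the same traces.
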

    Tracing Debugging

    Debugging with Traces

    Use traces to debug issues:

    Python
    result = await Runner.run(agent, input)
    
    # Inspect the trace
    print(f"Trace ID: {result.trace.id}")
    print(f"Workflow: {result.trace.workflow_name}")
    print(f"Spans: {len(result.trace.spans)}")
    
    # Find slow spans (skip spans that have not finished)
    for span in result.trace.spans:
        if span.end_time is None:
            continue
        duration = (span.end_time - span.start_time).total_seconds()
        if duration > 1.0:
            print(f"Slow span: {span.name} ({duration:.2f}s)")
    Visualizing Traces

    Create visual representations:

    Python
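    def visualize_trace(trace: Trace) -> str:
        """Create a text visualization of the trace."""
        lines = [f"Trace: {trace.id}"]
        parents = {span.id: span.parent_id for span in trace.spans}
        
        for span in trace.spans:
            # Depth = number of ancestors reached by following parent_id links
            depth, parent = 0, span.parent_id
            while parent is not None:
                depth, parent = depth + 1, parents.get(parent)
            if span.end_time is None:
                timing = "in progress"
            else:
                timing = f"{(span.end_time - span.start_time).total_seconds():.2f}s"
            lines.append(f"{'  ' * depth}{span.name} ({timing})")
        
        return "\n".join(lines)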

    Summary

    Tracing provides comprehensive observability. Key takeaways:

    1. Traces capture entire agent runs
    2. Spans represent individual operations
    3. Span hierarchy shows operation relationships
    4. Agent spans track agent execution
    5. Generation spans track model calls
    6. Function spans track function execution
    7. Guardrail spans track guardrail execution
    8. Handoff spans track agent delegation
    9. Custom spans track custom operations
    10. Trace processors handle completed traces
    11. Configuration controls tracing behavior
    12. Sensitive data can be excluded
    13. Manual tracing covers custom operations
    14. The current trace and span are accessible from context
    15. Span data provides operation details
    16. Span errors capture failures
    17. Traces can be exported to external systems
    18. Trace analysis enables insights
    19. Performance monitoring via traces
    20. Compliance via audit trails

    Tracing is essential for building observable, debuggable, and monitorable agent systems.