InkdownInkdown
Start writing

OpenAI Agents Python

21 filesยท0 subfolders

Shared Workspace

OpenAI Agents Python
00_OVERVIEW.md

04_ITEMS_SYSTEM

Shared from "OpenAI Agents Python" on Inkdown

Items System - Comprehensive Deep Dive

Overview

The Items system is the fundamental data structure that represents everything that happens during an agent run. Every message, tool call, handoff, and model response is represented as an "Item". Think of Items as the "ledger" or "transaction log" of an agent run - they record every event in a structured, serializable format.

Core Concepts

What is an Item?

An Item is a structured representation of an event that occurs during an agent run. Items are:

  • Typed - Each item has a specific type (message, tool call, handoff, etc.)
  • Serializable - Can be converted to/from JSON
  • Traceable - Each item can be traced back to the agent that created it
  • Convertible - Can be converted to input items for the model
Why Items Matter
  1. Audit Trail - Complete record of what happened during a run
01_AGENT_SYSTEM.md
02_RUNNER_SYSTEM.md
03_TOOL_SYSTEM.md
04_ITEMS_SYSTEM.md
05_GUARDRAILS.md
06_HANDOFFS.md
07_MEMORY_SESSIONS.md
08_MODEL_PROVIDERS.md
09_SANDBOX_SYSTEM.md
10_TRACING.md
11_RUN_STATE.md
12_CONTEXT.md
13_LIFECYCLE_HOOKS.md
14_CONFIGURATION.md
15_ERROR_HANDLING.md
16_STREAMING.md
17_EXTENSIONS.md
18_MCP_INTEGRATION.md
19_BEST_PRACTICES.md
20_ARCHITECTURE_PATTERNS.md
  • Session Persistence - Items are saved to sessions for conversation history
  • Debugging - Inspect items to understand agent behavior
  • Replay - Items can be replayed to reproduce runs
  • Analysis - Analyze patterns in agent behavior
  • Tracing - Items are the basis for trace spans
  • Item Types

    MessageOutputItem

    Represents a message from the LLM:

    Python
    from agents import MessageOutputItem
    
    item = MessageOutputItem(
        agent=my_agent,
        raw_item=ResponseOutputMessage(
            id="msg_123",
            content=[{"type": "text", "text": "Hello!"}],
            role="assistant",
        ),
    )

    When it's created:

    • When the LLM generates a message response
    • After tool execution when the LLM responds
    • When an agent produces final output

    Structure:

    • agent - The agent that generated this message
    • raw_item - The raw OpenAI ResponseOutputMessage object
    • type - Always "message_output_item"
    ToolCallItem

    Represents a request to call a tool:

    Python
    from agents import ToolCallItem
    
    item = ToolCallItem(
        agent=my_agent,
        raw_item=ResponseFunctionToolCall(
            id="call_123",
            name="my_tool",
            arguments='{"param": "value"}',
        ),
    )

    When it's created:

    • When the LLM decides to call a tool
    • For each tool call in a multi-tool invocation

    Structure:

    • agent - The agent that made the tool call
    • raw_item - The raw tool call object
    • type - The specific tool type (function, computer, shell, etc.)
    ToolCallOutputItem

    Represents the result of a tool execution:

    Python
    from agents import ToolCallOutputItem
    
    item = ToolCallOutputItem(
        agent=my_agent,
        raw_item=ResponseFunctionCallOutputItem(
            call_id="call_123",
            output="Tool result",
        ),
    )

    When it's created:

    • After a tool is executed
    • When a tool error occurs
    • When a tool is rejected (approval denied)

    Structure:

    • agent - The agent that called the tool
    • raw_item - The raw output item
    • output - The tool's output (as string)
    HandoffCallItem

    Represents a handoff to another agent:

    Python
    from agents import HandoffCallItem
    
    item = HandoffCallItem(
        agent=my_agent,
        raw_item=ResponseFunctionToolCall(
            id="call_123",
            name="transfer_to_specialist",
            arguments='{}',
        ),
    )

    When it's created:

    • When the LLM calls a handoff tool
    • When an agent delegates to another agent

    Structure:

    • agent - The agent making the handoff
    • raw_item - The handoff tool call
    • target_agent - The agent being handed off to
    HandoffOutputItem

    Represents the result of a handoff:

    Python
    from agents import HandoffOutputItem
    
    item = HandoffOutputItem(
        agent=my_agent,
        raw_item=ResponseFunctionCallOutputItem(
            call_id="call_123",
            output="Handoff completed",
        ),
    )

    When it's created:

    • After a handoff completes
    • When the target agent produces output
    ReasoningItem

    Represents model reasoning (for reasoning models like GPT-5):

    Python
    from agents import ReasoningItem
    
    item = ReasoningItem(
        agent=my_agent,
        raw_item=ResponseReasoningItem(
            id="reasoning_123",
            summary="The model's reasoning summary",
        ),
    )

    When it's created:

    • When a reasoning model produces reasoning content
    • Before the final response

    Structure:

    • agent - The agent that produced reasoning
    • raw_item - The raw reasoning item
    • summary - Summary of the reasoning
    ToolApprovalItem

    Represents a tool approval request (human-in-the-loop):

    Python
    from agents import ToolApprovalItem
    
    item = ToolApprovalItem(
        agent=my_agent,
        tool_name="sensitive_tool",
        tool_arguments='{"param": "value"}',
        call_id="call_123",
    )

    When it's created:

    • When a tool requires approval
    • When execution pauses for human review

    Structure:

    • agent - The agent that called the tool
    • tool_name - Name of the tool
    • tool_arguments - Arguments passed to the tool
    • call_id - Unique call ID
    MCPApprovalRequestItem

    Represents an MCP tool approval request:

    Python
    from agents import MCPApprovalRequestItem
    
    item = MCPApprovalRequestItem(
        agent=my_agent,
        raw_item=McpApprovalRequest(
            tool_name="mcp_tool",
            arguments='{}',
        ),
    )

    When it's created:

    • When an MCP tool requires approval
    • Before MCP tool execution
    MCPApprovalResponseItem

    Represents the response to an MCP approval request:

    Python
    from agents import MCPApprovalResponseItem
    
    item = MCPApprovalResponseItem(
        agent=my_agent,
        raw_item=McpApprovalResponse(
            tool_name="mcp_tool",
            approved=True,
        ),
    )

    When it's created:

    • After human approves/rejects MCP tool
    • Before actual MCP tool execution
    ToolSearchCallItem

    Represents a tool search request (Responses API):

    Python
    from agents import ToolSearchCallItem
    
    item = ToolSearchCallItem(
        agent=my_agent,
        raw_item=ResponseToolSearchCall(
            id="search_123",
            query="search query",
        ),
    )

    When it's created:

    • When the model uses tool search
    • To find relevant tools
    ToolSearchOutputItem

    Represents tool search results:

    Python
    from agents import ToolSearchOutputItem
    
    item = ToolSearchOutputItem(
        agent=my_agent,
        raw_item=ResponseToolSearchOutputItem(
            id="search_123",
            results=[...],
        ),
    )

    When it's created:

    • After tool search completes
    • With search results
    CompactionItem

    Represents conversation compaction (for long conversations):

    Python
    from agents import CompactionItem
    
    item = CompactionItem(
        agent=my_agent,
        raw_item=ResponseOutputMessage(
            id="compact_123",
            content=[{"type": "text", "text": "Summary of conversation"}],
        ),
    )

    When it's created:

    • When conversation history is compacted
    • To reduce token usage while preserving context

    Item Base Class

    RunItemBase

    All items inherit from RunItemBase:

    Python
    @dataclass
    class RunItemBase(Generic[T], abc.ABC):
        agent: Agent[Any]
        """The agent whose run caused this item to be generated."""
        
        raw_item: T
        """The raw Responses item from the run."""
        
        _agent_ref: weakref.ReferenceType[Agent[Any]] | None
        """Weak reference to the agent for memory management."""

    Key Features:

    1. Agent Reference - Every item knows which agent created it
    2. Weak References - Uses weak references to avoid memory leaks
    3. Type Generic - Generic over the raw item type
    4. Convertible - Can convert to input items
    Memory Management

    Items use weak references to agents to prevent memory leaks:

    Python
    item = MessageOutputItem(agent=agent, raw_item=...)
    
    # Release the strong reference
    item.release_agent()
    
    # Agent can now be garbage collected
    # Item still has weak reference for debugging

    Why weak references?

    • Long-running sessions with many items
    • Prevents agents from being kept alive by old items
    • Still allows debugging with agent information

    Item Conversion

    to_input_item()

    Every item can be converted to an input item for the model:

    Python
    item = MessageOutputItem(agent=agent, raw_item=message)
    input_item = item.to_input_item()
    # Returns: {"type": "assistant", "content": [...]}

    Why conversion is needed:

    • Items are output format (what came out)
    • Input items are input format (what goes in)
    • Conversion enables replay and session persistence
    • Different formats for output vs input

    Conversion Rules:

    1. Dict items - Returned as-is (already input format)
    2. Pydantic items - Converted using model_dump(exclude_unset=True)
    3. Custom items - Must implement to_input_item()
    run_items_to_input_items()

    Convert multiple items to input items:

    Python
    from agents import run_items_to_input_items
    
    items = [item1, item2, item3]
    input_items = run_items_to_input_items(items)

    Used for:

    • Session persistence
    • Replay functionality
    • Handoff history preparation

    Item Helpers

    ItemHelpers Class

    Utility class for working with items:

    Python
    from agents import ItemHelpers
    
    # Get all tool call items
    tool_calls = ItemHelpers.get_tool_calls(items)
    
    # Get all message items
    messages = ItemHelpers.get_messages(items)
    
    # Get text content from messages
    text = ItemHelpers.get_text_content(message_item)
    
    # Convert to input list
    input_list = ItemHelpers.to_input_list(items)

    Key Methods:

    • get_tool_calls(items) - Extract tool call items
    • get_messages(items) - Extract message items
    • get_text_content(item) - Get text from a message
    • to_input_list(items) - Convert items to input list
    • get_function_call_outputs(items) - Get tool outputs

    Model Response

    ModelResponse Class

    Represents a complete model response:

    Python
    @dataclass
    class ModelResponse:
        response: Response
        """The raw OpenAI Response object."""
        
        request_id: str | None
        """The request ID for this response."""
        
        usage: Usage
        """Token usage for this response."""
        
        agent: Agent[Any]
        """The agent that made the request."""

    When it's created:

    • After each model call
    • Contains all output items from the call
    • Includes usage information

    Structure:

    • response - Full OpenAI Response object
    • request_id - Request identifier
    • usage - Token usage data
    • agent - Agent that made the request

    Item Lifecycle

    Creation Flow
    Python
    # 1. Model generates response
    response = await model.get_response(...)
    
    # 2. Response is processed
    for output_item in response.output:
        # 3. Create appropriate item type
        if isinstance(output_item, ResponseOutputMessage):
            item = MessageOutputItem(agent=agent, raw_item=output_item)
        elif isinstance(output_item, ResponseFunctionToolCall):
            item = ToolCallItem(agent=agent, raw_item=output_item)
        # ... other types
        
        # 4. Add to run state
        state._generated_items.append(item)
    Persistence Flow
    Python
    # 1. Run completes
    result = await Runner.run(agent, input)
    
    # 2. Items are saved to session
    if session:
        await session.save_items(
            conversation_id=conversation_id,
            items=result.new_items,
        )
    
    # 3. Items can be loaded later
    loaded_items = await session.load_items(conversation_id)
    Replay Flow
    Python
    # 1. Load items from session
    items = await session.load_items(conversation_id)
    
    # 2. Convert to input items
    input_items = run_items_to_input_items(items)
    
    # 3. Run with loaded items
    result = await Runner.run(agent, input_items)

    Item Serialization

    JSON Serialization

    Items can be serialized to JSON:

    Python
    import json
    
    item = MessageOutputItem(agent=agent, raw_item=message)
    json_str = json.dumps(item.to_input_item())

    Serialization considerations:

    • Raw items are Pydantic models with built-in serialization
    • Agent references are not serialized (use weak references)
    • Context is not serialized (user data)
    • Only the raw item data is serialized
    RunState Serialization

    Items are part of RunState serialization:

    Python
    state = result.to_state()
    json_str = state.to_json()
    
    # Later
    restored_state = RunState.from_json(json_str)

    What gets serialized:

    • All generated items
    • Agent identities (name, handoff description)
    • Tool call metadata
    • Usage information
    • Guardrail results

    Item Filtering

    Filtering by Type
    Python
    from agents import ItemHelpers
    
    # Get only tool calls
    tool_calls = [i for i in items if isinstance(i, ToolCallItem)]
    
    # Get only messages
    messages = [i for i in items if isinstance(i, MessageOutputItem)]
    Filtering by Agent
    Python
    # Get items from specific agent
    agent_items = [i for i in items if i.agent.name == "specialist"]
    Filtering by Tool Name
    Python
    # Get calls to specific tool
    tool_calls = [
        i for i in items 
        if isinstance(i, ToolCallItem) 
        and i.tool_name == "search"
    ]

    Item Analysis

    Analyzing Tool Usage
    Python
    from collections import Counter
    
    # Count tool usage
    tool_names = [i.tool_name for i in items if isinstance(i, ToolCallItem)]
    counts = Counter(tool_names)
    print(counts)
    # Output: Counter({'search': 5, 'calculate': 3, 'write_file': 2})
    Analyzing Turn Structure
    Python
    # Group items by turn
    turns = []
    current_turn = []
    for item in items:
        if isinstance(item, MessageOutputItem):
            if current_turn:
                turns.append(current_turn)
            current_turn = [item]
        else:
            current_turn.append(item)
    if current_turn:
        turns.append(current_turn)
    
    # Each turn is a list of items
    for i, turn in enumerate(turns):
        print(f"Turn {i}: {len(turn)} items")
    Analyzing Agent Handoffs
    Python
    # Track handoffs
    handoffs = []
    for item in items:
        if isinstance(item, HandoffCallItem):
            handoffs.append({
                'from': item.agent.name,
                'to': item.target_agent.name,
            })
    
    print(handoffs)
    # Output: [{'from': 'generalist', 'to': 'specialist'}, ...]

    Item and Session Integration

    Session Items

    Sessions store items as conversation history:

    Python
    from agents import SQLiteSession
    
    session = SQLiteSession(db_path="conversations.db")
    
    # Save items
    await session.save_items(
        conversation_id="conv_123",
        items=run_result.new_items,
    )
    
    # Load items
    items = await session.load_items(
        conversation_id="conv_123",
        max_items=50,  # Limit items loaded
    )
    Compaction Items

    For long conversations, items are compacted:

    Python
    # Old items are replaced with a summary
    compaction_item = CompactionItem(
        agent=agent,
        raw_item=ResponseOutputMessage(
            content=[{"type": "text", "text": "Summary of previous conversation"}],
        ),
    )
    
    # Replace old items with compaction item
    session.replace_old_items(compaction_item)

    Item and Tracing

    Trace Spans from Items

    Each item can create a trace span:

    Python
    from agents import generation_span, tool_span
    
    # Create spans from items
    for item in items:
        if isinstance(item, MessageOutputItem):
            with generation_span(name="message_generation"):
                # Trace the message generation
                pass
        elif isinstance(item, ToolCallItem):
            with tool_span(name=item.tool_name):
                # Trace the tool execution
                pass
    Item Metadata in Traces

    Items include metadata in traces:

    Python
    span_data = {
        "agent_name": item.agent.name,
        "item_type": item.type,
        "timestamp": item.timestamp,
        # ... other metadata
    }

    Best Practices

    1. Use Item Helpers

    Use the provided helper functions:

    Python
    # Good
    text = ItemHelpers.get_text_content(message_item)
    
    # Avoid
    text = message_item.raw_item.content[0].text  # Fragile
    2. Check Item Types

    Always check item types before accessing:

    Python
    # Good
    if isinstance(item, ToolCallItem):
        tool_name = item.tool_name
    
    # Avoid
    tool_name = item.tool_name  # Might not exist
    3. Release Agent References

    Release agent references when done:

    Python
    # Good
    for item in items:
        item.release_agent()
    
    # Avoid
    # Keeping strong references prevents garbage collection
    4. Convert to Input Items

    Use proper conversion for replay:

    Python
    # Good
    input_items = run_items_to_input_items(items)
    
    # Avoid
    input_items = [item.raw_item for item in items]  # Might not work
    5. Filter Appropriately

    Filter items based on your needs:

    Python
    # Good - specific filter
    tool_calls = ItemHelpers.get_tool_calls(items)
    
    # Avoid - too broad
    all_items = items

    Common Patterns

    1. Extract Tool Results
    Python
    def get_tool_results(items: list[RunItem]) -> dict[str, str]:
        """Extract tool call results."""
        results = {}
        for item in items:
            if isinstance(item, ToolCallOutputItem):
                results[item.call_id] = item.output
        return results
    2. Count Agent Turns
    Python
    def count_agent_turns(items: list[RunItem], agent_name: str) -> int:
        """Count turns by a specific agent."""
        return sum(
            1 for item in items 
            if isinstance(item, MessageOutputItem) 
            and item.agent.name == agent_name
        )
    3. Find Handoff Points
    Python
    def find_handoffs(items: list[RunItem]) -> list[tuple[str, str]]:
        """Find all handoff points."""
        handoffs = []
        for item in items:
            if isinstance(item, HandoffCallItem):
                handoffs.append((item.agent.name, item.target_agent.name))
        return handoffs
    4. Calculate Token Usage
    Python
    def calculate_total_usage(items: list[RunItem]) -> Usage:
        """Calculate total token usage."""
        total = Usage(request_tokens=0, response_tokens=0, total_tokens=0)
        for item in items:
            if hasattr(item, 'usage'):
                total += item.usage
        return total

    Summary

    The Items system is the foundation of run representation. Key takeaways:

    1. Items represent all events in an agent run
    2. Item types include messages, tool calls, handoffs, reasoning, etc.
    3. RunItemBase is the base class for all items
    4. Weak references prevent memory leaks
    5. Conversion enables replay and persistence
    6. ItemHelpers provide utility functions
    7. ModelResponse wraps a complete model response
    8. Serialization enables state persistence
    9. Filtering allows item analysis
    10. Sessions store items as conversation history
    11. Compaction reduces token usage for long conversations
    12. Tracing uses items for observability
    13. Analysis enables pattern detection
    14. Replay enables reproducing runs
    15. Debugging is easier with item inspection
    16. Agent references track which agent created each item
    17. Tool metadata tracks tool usage
    18. Handoff tracking follows agent delegation
    19. Usage tracking monitors token consumption
    20. Type safety ensures correct item handling

    Understanding Items is essential for debugging, persistence, and analysis of agent runs.