InkdownInkdown
Start writing

Study

59 filesยท8 subfolders

Shared Workspace

Study
core

11_RUN_STATE

Shared from "Study" on Inkdown

Run State - Comprehensive Deep Dive

Overview

RunState is the serialization mechanism that enables pausing, resuming, and inspecting agent runs. Think of RunState as a "save point" or "checkpoint" in a video game - it captures the complete state of an agent run at any moment, allowing you to pause execution, save that state, and resume it later. This is essential for human-in-the-loop workflows, long-running tasks, and debugging.

Core Concepts

What is RunState?

RunState is a complete snapshot of an agent run that includes:

  • Current agent - Which agent is currently active
  • Conversation history - All messages and tool calls so far
  • Model responses - History of model responses
  • Tool use tracker - Which tools have been used
  • Guardrail results - Results of guardrail checks
  • Session state - Session-related information
  • Sandbox state - Sandbox execution state (if applicable)
programming-language-concepts.md
zero-language-explanation.md
DB
01-introduction.md
02-relational-databases.md
03-database-design.md
04-indexing.md
05-transactions-acid.md
06-nosql-databases.md
07-query-optimization.md
08-replication-ha.md
09-sharding-partitioning.md
10-caching-strategies.md
11-cap-theorem.md
12-connection-pooling.md
13-backup-recovery.md
14-monitoring.md
15-database-selection.md
README.md
JS
Event loop
Merlin Backend
01-Orchestration.md
02-DeepResearch.md
03-Search.md
04-Scraping.md
05-Streaming.md
06-MultiProviderLLM.md
07-MemoryAndContext.md
08-ErrorHandling.md
09-RateLimiting.md
10-TaskQueue.md
11-SecurityAndAuth.md
Orchestration-2nd-draft
OpenAI Agents Python
00_OVERVIEW.md
01_AGENT_SYSTEM.md
02_RUNNER_SYSTEM.md
03_TOOL_SYSTEM.md
04_ITEMS_SYSTEM.md
05_GUARDRAILS.md
06_HANDOFFS.md
07_MEMORY_SESSIONS.md
08_MODEL_PROVIDERS.md
09_SANDBOX_SYSTEM.md
10_TRACING.md
11_RUN_STATE.md
12_CONTEXT.md
13_LIFECYCLE_HOOKS.md
14_CONFIGURATION.md
15_ERROR_HANDLING.md
16_STREAMING.md
17_EXTENSIONS.md
18_MCP_INTEGRATION.md
19_BEST_PRACTICES.md
20_ARCHITECTURE_PATTERNS.md
opencode-study
context-handling
core
Python
Alembic
Basics
sqlalchemy - fastapi
SQLAlchemy overview
tweets
system_design_for_agentic_apps.md
  • Interruptions - Pending human approvals
  • Error details - Any errors that occurred
  • Context - User-provided context object
  • Trace state - Tracing information
  • Why RunState Matters
    1. Human-in-the-Loop - Pause for human approval and resume
    2. Long-Running Tasks - Handle tasks that span hours or days
    3. Debugging - Inspect state at any point
    4. Reproducibility - Replay exact execution paths
    5. Error Recovery - Resume from failures
    6. State Inspection - Understand what the agent is doing

    RunState Structure

    RunState Class
    Python
    @dataclass
    class RunState(Generic[TContext]):
        agent: Agent[TContext]
        """The current agent."""
        
        run_id: str
        """Unique identifier for the run."""
        
        group_id: str | None
        """Group ID for related runs."""
        
        conversation_id: str | None
        """Conversation ID for server-managed conversations."""
        
        current_input_items: list[TResponseInputItem]
        """Input items for the current turn."""
        
        all_items: list[RunItem]
        """All items generated so far."""
        
        model_responses: list[ModelResponse]
        """History of model responses."""
        
        tool_use_tracker: ToolUseTracker
        """Tracks tool usage."""
        
        trace: Trace | None
        """Trace information."""
        
        sandbox_runtime: dict | None
        """Sandbox runtime state."""
        
        approvals: Approvals
        """Tool approval state."""
        
        input_guardrail_results: list[InputGuardrailResult]
        """Input guardrail results."""
        
        output_guardrail_results: list[OutputGuardrailResult]
        """Output guardrail results."""
        
        tool_input_guardrail_results: list[ToolInputGuardrailResult]
        """Tool input guardrail results."""
        
        tool_output_guardrail_results: list[ToolOutputGuardrailResult]
        """Tool output guardrail results."""
        
        error_details: RunErrorDetails | None
        """Error details if an error occurred."""
        
        context: TContext | None
        """User-provided context."""
        
        context_serializer: Callable | None
        """Serializer for context."""
        
        context_deserializer: Callable | None
        """Deserializer for context."""
        
        run_config: RunConfig
        """Run configuration."""
        
        reasoning_item_id_policy: Literal["preserve", "omit"] | None
        """Policy for reasoning item IDs."""
        
        _current_agent: Agent[TContext] | None
        """The agent that generated this state."""
        
        _generated_items: list[RunItem]
        """Items generated in this run."""
        
        _session_items: list[RunItem]
        """Items from session."""
        
        _model_responses: list[ModelResponse]
        """Model responses."""
        
        _input_guardrail_results: list[InputGuardrailResult]
        """Input guardrail results."""
        
        _output_guardrail_results: list[OutputGuardrailResult]
        """Output guardrail results."""
        
        _tool_input_guardrail_results: list[ToolInputGuardrailResult]
        """Tool input guardrail results."""
        
        _tool_output_guardrail_results: list[ToolOutputGuardrailResult]
        """Tool output guardrail results."""
        
        _last_processed_response: ProcessedResponse | None
        """Last processed model response."""
        
        _current_turn: int
        """Current turn number."""
        
        _current_turn_persisted_item_count: int
        """Number of items persisted in current turn."""
        
        _conversation_id: str | None
        """Conversation ID."""
        
        _previous_response_id: str | None
        """Previous response ID."""
        
        _auto_previous_response_id: bool
        """Auto previous response ID flag."""
        
        _generated_prompt_cache_key: str | None
        """Generated prompt cache key."""
        
        _reasoning_item_id_policy: Literal["preserve", "omit"] | None
        """Reasoning item ID policy."""
        
        _current_step: NextStepInterruption | None
        """Current step (for interruptions)."""
        
        _trace_state: TraceState | None
        """Trace state."""
        
        _sandbox: dict | None
        """Sandbox state."""
        
        _schema_version: str
        """Schema version of this state."""

    Schema Versioning

    CURRENT_SCHEMA_VERSION

    The SDK uses schema versioning to handle changes in RunState format:

    Python
    CURRENT_SCHEMA_VERSION = "1.9"
    SCHEMA_VERSION_SUMMARIES

    Each schema version has a summary:

    Python
    SCHEMA_VERSION_SUMMARIES = {
        "1.0": "Initial schema version",
        "1.1": "Added sandbox runtime state",
        "1.2": "Added tool input/output guardrail results",
        "1.3": "Added reasoning item ID policy",
        "1.4": "Added trace state",
        "1.5": "Added sandbox session state",
        "1.6": "Added prompt cache key resolver",
        "1.7": "Added generated prompt cache key",
        "1.8": "Added current step for interruptions",
        "1.9": "Added sandbox resume state",
    }
    Schema Compatibility

    The SDK checks schema compatibility:

    Python
    state = RunState.from_json(json_str)
    
    if state._schema_version > CURRENT_SCHEMA_VERSION:
        raise RunStateVersionError(
            f"State version {state._schema_version} is newer than SDK version {CURRENT_SCHEMA_VERSION}"
        )

    RunState Serialization

    to_json()

    Serialize RunState to JSON:

    Python
    result = await Runner.run(agent, input)
    state = result.to_state()
    
    json_str = state.to_json()

    What gets serialized:

    • Agent metadata (name, instructions, etc.)
    • All run items
    • Model responses
    • Tool use tracker
    • Guardrail results
    • Trace state
    • Sandbox state (if applicable)
    • Context (if serializer provided)
    • Run configuration
    from_json()

    Deserialize RunState from JSON:

    Python
    state = RunState.from_json(json_str)

    What gets deserialized:

    • All serialized fields
    • Agent is reconstructed from metadata
    • Context is deserialized (if deserializer provided)
    Context Serialization

    Custom context requires serializers:

    Python
    from dataclasses import dataclass
    import json
    
    @dataclass
    class MyContext:
        user_id: str
        data: dict
    
    def serialize_context(context: MyContext) -> str:
        """Serialize context to JSON."""
        return json.dumps({
            "user_id": context.user_id,
            "data": context.data,
        })
    
    def deserialize_context(data: str) -> MyContext:
        """Deserialize context from JSON."""
        parsed = json.loads(data)
        return MyContext(
            user_id=parsed["user_id"],
            data=parsed["data"],
        )
    
    context = MyContext(user_id="123", data={"key": "value"})
    
    result = await Runner.run(
        agent,
        input,
        context=context,
        context_serializer=serialize_context,
        context_deserializer=deserialize_context,
    )
    
    state = result.to_state()
    json_str = state.to_json()  # Context is serialized
    
    # Later
    restored_state = RunState.from_json(json_str)
    # Context is deserialized automatically

    RunState and Human-in-the-Loop

    Pausing for Approval

    RunState enables approval workflows:

    Python
    @function_tool(needs_approval=True)
    def sensitive_operation() -> str:
        """Operation requiring approval."""
        return "Done"
    
    result = await Runner.run(agent, "Perform sensitive operation")
    
    # Check for interruptions
    if result.interruptions:
        # Save state
        state = result.to_state()
        
        # Human reviews
        for interruption in result.interruptions:
            print(f"Tool: {interruption.tool_name}")
            print(f"Args: {interruption.tool_arguments}")
            
            if should_approve(interruption):
                state.context.approve_tool(interruption)
            else:
                state.context.reject_tool(interruption)
        
        # Resume
        result = await Runner.run(agent, state)
    Interruption Handling

    RunState captures interruptions:

    Python
    @dataclass
    class NextStepInterruption:
        interruptions: list[ToolApprovalItem]
        """Pending tool approvals."""

    RunState and Resumption

    Resuming from State

    Resume a run from a saved state:

    Python
    # First run - pauses
    result1 = await Runner.run(agent, input)
    state = result1.to_state()
    
    # ... time passes, human intervention ...
    
    # Resume
    result2 = await Runner.run(agent, state)
    Resumption Behavior

    When resuming:

    • The current agent is restored
    • Conversation history is preserved
    • Tool use tracker is restored
    • Pending approvals are handled
    • Session state is maintained
    • Sandbox state is restored (if applicable)
    Resumption with New Input

    You can provide new input when resuming:

    Python
    # Resume with new input
    result = await Runner.run(agent, state, input="New question")

    RunState and Sessions

    Session-Aware RunState

    RunState includes session information:

    Python
    session = SQLiteSession(db_path="conversations.db")
    
    result = await Runner.run(
        agent,
        input,
        session=session,
        conversation_id="conv-123",
    )
    
    state = result.to_state()
    # state._conversation_id = "conv-123"
    # state._session_items contains session items
    Session Resumption

    Resume with session context:

    Python
    # First run with session
    result1 = await Runner.run(
        agent,
        input,
        session=session,
        conversation_id="conv-123",
    )
    
    state = result1.to_state()
    
    # Resume - session is automatically used
    result2 = await Runner.run(agent, state)

    RunState and Sandbox

    Sandbox State in RunState

    Sandbox state is included:

    Python
    config = SandboxRunConfig(client=UnixLocalSandboxClient())
    
    result = await Runner.run(
        agent,
        input,
        run_config=config,
    )
    
    state = result.to_state()
    # state._sandbox contains sandbox state
    Sandbox Resumption

    Resume with sandbox state:

    Python
    # First run with sandbox
    result1 = await Runner.run(agent, input, run_config=config)
    state = result1.to_state()
    
    # Resume - sandbox is restored
    result2 = await Runner.run(agent, state)

    RunState and Errors

    Error Details

    RunState captures error details:

    Python
    try:
        result = await Runner.run(agent, input)
    except MaxTurnsExceeded as e:
        state = e.run_state
        # state.error_details contains error information
    Error Recovery

    Recover from errors using RunState:

    Python
    try:
        result = await Runner.run(agent, input)
    except MaxTurnsExceeded as e:
        state = e.run_state
        
        # Increase max_turns
        state.run_config.max_turns = 20
        
        # Resume
        result = await Runner.run(agent, state)

    RunState Inspection

    Inspecting Conversation History
    Python
    state = result.to_state()
    
    # View all items
    for item in state.all_items:
        print(f"{type(item).__name__}: {item}")
    
    # View only messages
    messages = [i for i in state.all_items if isinstance(i, MessageOutputItem)]
    
    # View tool calls
    tool_calls = [i for i in state.all_items if isinstance(i, ToolCallItem)]
    Inspecting Tool Usage
    Python
    state = result.to_state()
    
    # Tool use tracker
    tracker = state.tool_use_tracker
    print(f"Tools used: {tracker.tool_counts}")
    print(f"Total calls: {tracker.total_calls}")
    Inspecting Guardrail Results
    Python
    state = result.to_state()
    
    # Input guardrails
    for result in state.input_guardrail_results:
        print(f"Guardrail: {result.guardrail.get_name()}")
        print(f"Triggered: {result.output.tripwire_triggered}")
    
    # Output guardrails
    for result in state.output_guardrail_results:
        print(f"Guardrail: {result.guardrail.get_name()}")
        print(f"Triggered: {result.output.tripwire_triggered}")
    Inspecting Model Responses
    Python
    state = result.to_state()
    
    for response in state.model_responses:
        print(f"Model: {response.agent.name}")
        print(f"Usage: {response.usage}")
        print(f"Items: {len(response.response.output)}")

    RunState and Tracing

    Trace State

    RunState includes trace state:

    Python
    state = result.to_state()
    
    # Trace state
    trace_state = state._trace_state
    print(f"Trace ID: {trace_state.trace_id}")
    print(f"Spans: {len(trace_state.spans)}")
    Resuming with Trace

    Trace is continued when resuming:

    Python
    # First run
    result1 = await Runner.run(agent, input)
    state = result1.to_state()
    
    # Resume - trace continues
    result2 = await Runner.run(agent, state)
    # result2.trace is a continuation of result1.trace

    RunState Best Practices

    1. Serialize Context Properly

    Always provide serializers for custom context:

    Python
    # Good
    result = await Runner.run(
        agent,
        input,
        context=my_context,
        context_serializer=serialize_context,
        context_deserializer=deserialize_context,
    )
    
    # Avoid - context won't serialize
    result = await Runner.run(
        agent,
        input,
        context=my_context,
    )
    2. Handle Schema Versions

    Be aware of schema version changes:

    Python
    # Good - check version
    state = RunState.from_json(json_str)
    if state._schema_version > CURRENT_SCHEMA_VERSION:
        # Handle newer version
        pass
    
    # Avoid - assume compatibility
    state = RunState.from_json(json_str)
    # Might fail if version is incompatible
    3. Clean Up Old States

    Implement cleanup for old states:

    Python
    async def cleanup_old_states(days: int = 30):
        """Clean up states older than N days."""
        cutoff = datetime.now() - timedelta(days=days)
        await state_store.delete_before(cutoff)
    4. Validate State Before Resumption

    Validate state before resuming:

    Python
    def validate_state(state: RunState) -> bool:
        """Validate state before resumption."""
        # Check schema version
        if state._schema_version > CURRENT_SCHEMA_VERSION:
            return False
        
        # Check required fields
        if not state.agent:
            return False
        
        # Check context deserializer
        if state.context and not state.context_deserializer:
            return False
        
        return True
    
    # Use before resumption
    if validate_state(state):
        result = await Runner.run(agent, state)
    5. Encrypt Sensitive States

    Encrypt sensitive state data:

    Python
    def encrypt_state(state: RunState, key: str) -> str:
        """Encrypt state to JSON."""
        json_str = state.to_json()
        encrypted = encrypt(json_str, key)
        return encrypted
    
    def decrypt_state(encrypted: str, key: str) -> RunState:
        """Decrypt state from JSON."""
        json_str = decrypt(encrypted, key)
        return RunState.from_json(json_str)

    Common RunState Patterns

    1. Approval Workflow
    Python
    result = await Runner.run(agent, input)
    
    if result.interruptions:
        state = result.to_state()
        
        # UI for approval
        approved = get_user_approval(result.interruptions)
        
        if approved:
            state.context.approve_tool(result.interruptions[0])
        else:
            state.context.reject_tool(result.interruptions[0])
        
        result = await Runner.run(agent, state)
    2. Long-Running Task
    Python
    # Start task
    result = await Runner.run(agent, "Start long task")
    state = result.to_state()
    
    # Save state
    await save_state_to_db(task_id, state.to_json())
    
    # ... time passes ...
    
    # Resume task
    json_str = await load_state_from_db(task_id)
    state = RunState.from_json(json_str)
    result = await Runner.run(agent, state)
    3. Error Recovery
    Python
    try:
        result = await Runner.run(agent, input, max_turns=10)
    except MaxTurnsExceeded as e:
        state = e.run_state
        
        # Adjust configuration
        state.run_config.max_turns = 20
        
        # Resume
        result = await Runner.run(agent, state)
    4. Debugging
    Python
    result = await Runner.run(agent, input)
    state = result.to_state()
    
    # Inspect state
    print(f"Agent: {state.agent.name}")
    print(f"Turn: {state._current_turn}")
    print(f"Items: {len(state.all_items)}")
    
    # Find what went wrong
    for item in state.all_items:
        if isinstance(item, ToolCallOutputItem):
            if "error" in item.output.lower():
                print(f"Error in tool: {item}")
    5. State Sharing

    Share state across processes:

    Python
    # Process 1
    result = await Runner.run(agent, input)
    state = result.to_state()
    await redis.set(f"state:{run_id}", state.to_json())
    
    # Process 2
    json_str = await redis.get(f"state:{run_id}")
    state = RunState.from_json(json_str)
    result = await Runner.run(agent, state)

    RunState and Version Migration

    Migrating Between Versions

    When the schema changes, implement migration:

    Python
    def migrate_state(state: RunState, target_version: str) -> RunState:
        """Migrate state to target version."""
        current = state._schema_version
        
        while current != target_version:
            # Apply migration for current -> current+1
            state = apply_migration(state, current, str(int(current) + 1))
            current = state._schema_version
        
        return state
    Backward Compatibility

    Maintain backward compatibility:

    Python
    def load_state_with_migration(json_str: str) -> RunState:
        """Load state with automatic migration."""
        state = RunState.from_json(json_str)
        
        # Migrate to current version if needed
        if state._schema_version < CURRENT_SCHEMA_VERSION:
            state = migrate_state(state, CURRENT_SCHEMA_VERSION)
        
        return state

    RunState Performance

    State Size

    RunState can be large for long conversations:

    Python
    # Estimate size
    state = result.to_state()
    json_str = state.to_json()
    size_kb = len(json_str) / 1024
    print(f"State size: {size_kb:.2f} KB")
    Optimizing State Size

    Reduce state size:

    Python
    # Option 1: Limit history
    state.all_items = state.all_items[-100:]
    
    # Option 2: Compact items
    state.all_items = compact_items(state.all_items)
    
    # Option 3: Exclude trace
    state.trace = None
    State Compression

    Compress state for storage:

    Python
    import gzip
    
    def compress_state(state: RunState) -> bytes:
        """Compress state."""
        json_str = state.to_json()
        return gzip.compress(json_str.encode())
    
    def decompress_state(data: bytes) -> RunState:
        """Decompress state."""
        json_str = gzip.decompress(data).decode()
        return RunState.from_json(json_str)

    Summary

    RunState enables pausing and resuming agent runs. Key takeaways:

    1. RunState is a complete snapshot of an agent run
    2. Serialization enables state persistence
    3. Schema versioning handles format changes
    4. Context serialization requires custom serializers
    5. Human-in-the-loop via approval workflows
    6. Resumption continues from saved state
    7. Session integration maintains conversation context
    8. Sandbox state preserves execution environment
    9. Error details capture failure information
    10. Inspection enables debugging
    11. Tool use tracker monitors tool usage
    12. Guardrail results capture validation outcomes
    13. Model responses track LLM interactions
    14. Trace state preserves tracing information
    15. Validation before resumption
    16. Encryption for sensitive data
    17. Cleanup prevents storage bloat
    18. Migration handles schema changes
    19. Compression reduces storage size
    20. Performance considerations for large states

    RunState is essential for building robust, resumable, and debuggable agent workflows.