InkdownInkdown
Start writing

OpenAI Agents Python

21 files·0 subfolders

Shared Workspace

OpenAI Agents Python
00_OVERVIEW.md
01_AGENT_SYSTEM.md
02_RUNNER_SYSTEM.md
03_TOOL_SYSTEM.md
04_ITEMS_SYSTEM.md
05_GUARDRAILS.md
06_HANDOFFS.md
07_MEMORY_SESSIONS.md
08_MODEL_PROVIDERS.md
09_SANDBOX_SYSTEM.md
10_TRACING.md
11_RUN_STATE.md
12_CONTEXT.md
13_LIFECYCLE_HOOKS.md
14_CONFIGURATION.md
15_ERROR_HANDLING.md
16_STREAMING.md
17_EXTENSIONS.md
18_MCP_INTEGRATION.md
19_BEST_PRACTICES.md
20_ARCHITECTURE_PATTERNS.md

00_OVERVIEW

Shared from "OpenAI Agents Python" on Inkdown

OpenAI Agents Python SDK - Comprehensive Codebase Study

Introduction

This is a comprehensive study of the OpenAI Agents Python SDK codebase. The SDK is a lightweight yet powerful framework for building multi-agent workflows that is provider-agnostic, supporting the OpenAI Responses and Chat Completions APIs, as well as 100+ other LLMs.

Project Structure

Plain text
openai-agents-python/
├── src/agents/          # Core library implementation
├── tests/               # Test suite
├── examples/            # Sample projects showing SDK usage
├── docs/                # MkDocs documentation source
├── mkdocs.yml           # Documentation site configuration
├── pyproject.toml       # Python dependencies and tool configuration
└── Makefile             # Common developer commands

Core Architecture

The SDK is built around several key concepts:

  1. Agents - LLMs configured with instructions, tools, guardrails, and handoffs
  2. Runner - The execution engine that runs agents
  3. Tools - Functions that agents can call to perform actions
  4. Items - The unit of work in an agent run (messages, tool calls, etc.)
  5. Guardrails - Safety checks for input and output validation
  6. Handoffs - Mechanism for delegating to other agents
  7. Sessions - Automatic conversation history management
  8. Tracing - Built-in tracking of agent runs
  9. Sandbox - Isolated execution environments for agents

Key Design Principles

1. Provider Agnostic

The SDK supports multiple model providers through a unified interface. You can use OpenAI's Responses API, Chat Completions API, or integrate with 100+ other LLMs through the MultiProvider system.

2. Type Safety

The entire SDK is written with Python type hints and uses Pydantic for data validation. This ensures type safety and provides excellent IDE support.

3. Async-First

The SDK is designed with async/await as the primary execution model. This allows for efficient concurrent operations, especially when dealing with multiple tools or handoffs.

4. Streaming Support

All operations support streaming, allowing you to get real-time updates as the agent processes requests.

5. Human-in-the-Loop

Built-in support for pausing agent runs for human approval, inspection, or intervention through the RunState system.

6. Extensibility

The SDK is designed to be extended through:

  • Custom tools
  • Custom model providers
  • Custom guardrails
  • Lifecycle hooks
  • Extensions (MCP, memory backends, etc.)

Module Organization

Core Modules (src/agents/)
  • agent.py - Agent and AgentBase classes
  • run.py - Runner and execution orchestration
  • tool.py - Tool system and built-in tools
  • items.py - Run items (messages, tool calls, etc.)
  • guardrail.py - Input and output guardrails
  • handoffs/ - Agent delegation system
  • memory/ - Session and conversation persistence
  • models/ - Model provider implementations
  • sandbox/ - Isolated execution environments
  • tracing/ - Observability and debugging
  • run_internal/ - Internal runtime implementation
  • extensions/ - Additional functionality (MCP, memory, etc.)

Data Flow

Basic Agent Execution Flow
  1. Input - User provides input (string or list of items)
  2. Context Setup - RunContextWrapper is created with user context
  3. Guardrails - Input guardrails run (if configured)
  4. Model Call - Agent calls LLM with instructions and tools
  5. Tool Execution - If model calls tools, they are executed
  6. Output Guardrails - Output guardrails run (if configured)
  7. Result - Final output is returned
  8. Session Update - Conversation history is saved (if session is configured)
Streaming Flow

The streaming flow follows the same path but yields events as they happen:

  • Model response chunks
  • Tool call events
  • Tool output events
  • Final result

Key Abstractions

Agent

An agent is the main building block. It represents an AI assistant with:

  • Instructions (system prompt)
  • Tools it can use
  • Guardrails for safety
  • Handoffs to other agents
  • Output schema for structured responses
Runner

The Runner is responsible for executing agents. It handles:

  • Turn management (multi-turn conversations)
  • Tool execution
  • Handoff delegation
  • Session persistence
  • Error handling
  • Streaming
Tool

Tools are functions that agents can call. They can be:

  • Function tools (Python functions)
  • Hosted tools (OpenAI tools like file search, web search)
  • MCP tools (from Model Context Protocol servers)
  • Agent tools (other agents exposed as tools)
  • Shell tools (command execution)
  • Computer tools (computer interaction)
Item

Items represent units of work in an agent run:

  • MessageOutputItem - Messages from the LLM
  • ToolCallItem - Tool calls made by the LLM
  • ToolCallOutputItem - Results from tool execution
  • HandoffCallItem - Handoff to another agent
  • ReasoningItem - Model reasoning (for reasoning models)
RunState

RunState captures the complete state of an agent run, enabling:

  • Pause and resume
  • Human-in-the-loop workflows
  • State inspection
  • Debugging

Configuration Hierarchy

Configuration flows from most specific to most general:

  1. Agent-level - Settings on the Agent instance
  2. RunConfig-level - Settings for a specific run
  3. Global defaults - Default settings for the SDK

This allows fine-grained control while maintaining sensible defaults.

Error Handling

The SDK uses a structured error handling approach:

  1. UserError - Errors due to user input or configuration
  2. ModelBehaviorError - Errors from unexpected model behavior
  3. AgentsException - Base exception for SDK errors
  4. ToolTimeoutError - Tool execution timeout
  5. MaxTurnsExceeded - Agent exceeded maximum turns
  6. Guardrail Tripwires - Guardrail violations

Testing Strategy

The test suite uses:

  • pytest for test execution
  • inline-snapshot for snapshot tests
  • pytest-asyncio for async test support
  • coverage.py for coverage tracking

Tests are organized by functionality:

  • test_agent_*.py - Agent-specific tests
  • test_run_*.py - Runner and execution tests
  • test_tool_*.py - Tool system tests
  • test_guardrails.py - Guardrail tests
  • test_handoff_*.py - Handoff tests
  • Integration tests in examples/

Documentation

Documentation is built with MkDocs and Material theme:

  • Source in docs/
  • Built output in site/
  • API reference auto-generated from docstrings
  • Multi-language support (ja, ko, zh)

Development Workflow

  1. Setup - uv sync to install dependencies
  2. Development - Make changes with uv run python
  3. Testing - make tests to run test suite
  4. Linting - make lint and make format
  5. Type checking - make typecheck
  6. Documentation - make build-docs
  7. Coverage - make coverage

Performance Considerations

  • Async execution - Tools and guardrails run concurrently when possible
  • Connection pooling - Reuse HTTP connections for model calls
  • Prompt caching - Support for prompt cache keys
  • Session compaction - Intelligent conversation history management
  • Tool use tracking - Prevent infinite loops

Security Considerations

  • Input validation - Guardrails for input sanitization
  • Tool approval - Human approval for sensitive tools
  • Sandbox isolation - Isolated execution for file system access
  • API key management - Secure handling of API keys
  • Tracing controls - Options to exclude sensitive data from traces

Next Steps

This overview provides a high-level understanding of the SDK. The following documents dive deep into each component:

  1. Agent System - Deep dive into agents
  2. Runner System - Execution engine
  3. Tool System - Tools and tool execution
  4. Items System - Run items and events
  5. Guardrails - Input/output validation
  6. Handoffs - Agent delegation
  7. Memory & Sessions - Conversation persistence
  8. Model Providers - LLM integration
  9. Sandbox System - Isolated execution
  10. Tracing - Observability
  11. Run State - State management
  12. Context - Execution context
  13. Lifecycle Hooks - Event callbacks
  14. Configuration - Settings and config
  15. Error Handling - Error management
  16. Streaming - Real-time updates
  17. Extensions - Extending the SDK
  18. MCP Integration - Model Context Protocol
  19. Best Practices - Usage patterns
  20. Architecture Patterns - Design patterns