InkdownInkdown
Start writing

Study

59 filesยท8 subfolders

Shared Workspace

Study
core

07_MEMORY_SESSIONS

Shared from "Study" on Inkdown

Memory & Sessions - Comprehensive Deep Dive

Overview

Memory and Sessions in the OpenAI Agents SDK enable conversation persistence across agent runs. Think of sessions as "conversation memory" - they allow agents to remember previous interactions, maintain context over time, and provide continuity in multi-turn conversations. This is essential for building chatbots, assistants, and any application where conversation history matters.

Core Concepts

What is a Session?

A session is a persistent store of conversation history between a user and an agent (or multiple agents). It:

  • Stores all messages, tool calls, and outputs from a conversation
  • Retrieves relevant history when resuming a conversation
  • Manages conversation length and token usage
  • Compacts long conversations to preserve context while reducing tokens
Why Sessions Matter
programming-language-concepts.md
zero-language-explanation.md
DB
01-introduction.md
02-relational-databases.md
03-database-design.md
04-indexing.md
05-transactions-acid.md
06-nosql-databases.md
07-query-optimization.md
08-replication-ha.md
09-sharding-partitioning.md
10-caching-strategies.md
11-cap-theorem.md
12-connection-pooling.md
13-backup-recovery.md
14-monitoring.md
15-database-selection.md
README.md
JS
Event loop
Merlin Backend
01-Orchestration.md
02-DeepResearch.md
03-Search.md
04-Scraping.md
05-Streaming.md
06-MultiProviderLLM.md
07-MemoryAndContext.md
08-ErrorHandling.md
09-RateLimiting.md
10-TaskQueue.md
11-SecurityAndAuth.md
Orchestration-2nd-draft
OpenAI Agents Python
00_OVERVIEW.md
01_AGENT_SYSTEM.md
02_RUNNER_SYSTEM.md
03_TOOL_SYSTEM.md
04_ITEMS_SYSTEM.md
05_GUARDRAILS.md
06_HANDOFFS.md
07_MEMORY_SESSIONS.md
08_MODEL_PROVIDERS.md
09_SANDBOX_SYSTEM.md
10_TRACING.md
11_RUN_STATE.md
12_CONTEXT.md
13_LIFECYCLE_HOOKS.md
14_CONFIGURATION.md
15_ERROR_HANDLING.md
16_STREAMING.md
17_EXTENSIONS.md
18_MCP_INTEGRATION.md
19_BEST_PRACTICES.md
20_ARCHITECTURE_PATTERNS.md
opencode-study
context-handling
core
Python
Alembic
Basics
sqlalchemy - fastapi
SQLAlchemy overview
tweets
system_design_for_agentic_apps.md
  • Context Continuity - Agents remember previous interactions
  • Multi-Turn Conversations - Enable back-and-forth dialogue
  • Token Efficiency - Intelligent history management
  • Conversation Analytics - Track conversation patterns
  • Resume Capability - Pause and resume conversations
  • Multi-Agent History - Track which agent said what
  • Session Types

    The SDK supports several session implementations:

    1. SQLiteSession - Local SQLite database storage
    2. OpenAIConversationsSession - OpenAI's server-managed conversations
    3. OpenAIResponsesCompactionSession - OpenAI Responses API with compaction
    4. Custom Sessions - Implement your own session backend

    Session Interface

    Session ABC

    All sessions implement the SessionABC base class:

    Python
    class SessionABC(abc.ABC):
        @abc.abstractmethod
        async def save_items(
            self,
            conversation_id: str,
            items: list[TResponseInputItem],
        ) -> None:
            """Save items to the session."""
            pass
        
        @abc.abstractmethod
        async def load_items(
            self,
            conversation_id: str,
            *,
            max_items: int | None = None,
        ) -> list[TResponseInputItem]:
            """Load items from the session."""
            pass

    Key Methods:

    • save_items() - Store conversation items
    • load_items() - Retrieve conversation items
    Session Settings

    Configure session behavior:

    Python
    @dataclass
    class SessionSettings:
        max_items: int | None = None
        """Maximum number of items to retrieve from session."""
        
        compaction_enabled: bool = False
        """Whether to enable conversation compaction."""
        
        compaction_threshold: int = 100
        """Number of items before compaction triggers."""

    SQLite Session

    Basic Usage
    Python
    from agents import Agent, SQLiteSession, Runner
    
    # Create a session
    session = SQLiteSession(db_path="conversations.db")
    
    agent = Agent(
        name="assistant",
        instructions="You are a helpful assistant",
    )
    
    # First message
    result1 = await Runner.run(
        agent,
        "Hello, I'm John",
        session=session,
        conversation_id="user-123",
    )
    
    # Second message (with history)
    result2 = await Runner.run(
        agent,
        "What's my name?",
        session=session,
        conversation_id="user-123",
    )
    # Agent remembers: "Your name is John"
    Session Initialization
    Python
    from agents import SQLiteSession
    
    # Create with custom path
    session = SQLiteSession(db_path="/path/to/database.db")
    
    # Create with in-memory database (for testing)
    session = SQLiteSession(db_path=":memory:")
    Conversation IDs

    Conversation IDs identify unique conversations:

    Python
    # Each user/conversation gets a unique ID
    await Runner.run(agent, input, session=session, conversation_id="conv-123")
    await Runner.run(agent, input, session=session, conversation_id="conv-456")

    Best practices:

    • Use user IDs for per-user conversations
    • Use thread IDs for conversation threads
    • Use UUIDs for unique identifiers
    • Avoid hardcoded IDs in production
    Session Limits

    Control how much history is loaded:

    Python
    from agents import SessionSettings
    
    settings = SessionSettings(max_items=50)
    
    result = await Runner.run(
        agent,
        input,
        session=session,
        session_settings=settings,
    )

    Why limit items:

    • Reduce token usage
    • Improve response time
    • Focus on recent context
    • Manage memory

    OpenAI Conversations Session

    Server-Managed Conversations

    OpenAI provides server-managed conversation storage:

    Python
    from agents import Agent, OpenAIConversationsSession, Runner
    
    session = OpenAIConversationsSession()
    
    agent = Agent(
        name="assistant",
        instructions="You are a helpful assistant",
    )
    
    result = await Runner.run(
        agent,
        "Hello",
        session=session,
        conversation_id="conv-123",
    )

    Benefits:

    • No local storage needed
    • Automatic prompt caching
    • Better performance
    • Managed by OpenAI

    How it works:

    • OpenAI stores conversation history on their servers
    • Only deltas (new items) are sent
    • The server maintains full context
    • Session persistence is handled automatically
    When to Use

    Use OpenAIConversationsSession when:

    • You want to reduce local storage
    • You're using OpenAI's Responses API
    • You want automatic prompt caching
    • You don't need custom session logic

    Use SQLiteSession when:

    • You need local control over data
    • You want custom session logic
    • You're not using OpenAI exclusively
    • You need to analyze conversation data

    OpenAI Responses Compaction Session

    Intelligent Compaction

    The Responses API supports intelligent conversation compaction:

    Python
    from agents import Agent, OpenAIResponsesCompactionSession, Runner
    
    session = OpenAIResponsesCompactionSession()
    
    agent = Agent(
        name="assistant",
        instructions="You are a helpful assistant",
    )
    
    result = await Runner.run(
        agent,
        "Hello",
        session=session,
        conversation_id="conv-123",
    )

    How compaction works:

    • Older messages are summarized
    • Key information is preserved
    • Token usage is reduced
    • Context is maintained
    Compaction Configuration
    Python
    from agents import OpenAIResponsesCompactionArgs
    
    args = OpenAIResponsesCompactionArgs(
        target_items=50,  # Target number of items after compaction
        min_summary_tokens=100,  # Minimum tokens for summaries
    )
    
    session = OpenAIResponsesCompactionSession(compaction_args=args)

    Session Input Callback

    Custom Input Handling

    Control how new input is combined with session history:

    Python
    from agents import SessionInputCallback
    
    def custom_input_callback(
        history: list[TResponseInputItem],
        new_input: str | list[TResponseInputItem],
    ) -> list[TResponseInputItem]:
        """Custom logic for combining history and new input."""
        # Only keep last 10 items from history
        recent_history = history[-10:]
        
        # Add new input
        if isinstance(new_input, str):
            recent_history.append({"type": "user", "content": new_input})
        else:
            recent_history.extend(new_input)
        
        return recent_history
    
    result = await Runner.run(
        agent,
        input,
        session=session,
        session_input_callback=custom_input_callback,
    )

    Use cases:

    • Custom history filtering
    • Special input processing
    • Context window management
    • Custom formatting

    Session and Agent Lifecycle

    Session Persistence Across Runs
    Python
    # Run 1
    result1 = await Runner.run(
        agent,
        "Hello",
        session=session,
        conversation_id="conv-123",
    )
    
    # Run 2 - history is automatically loaded
    result2 = await Runner.run(
        agent,
        "What did I say before?",
        session=session,
        conversation_id="conv-123",
    )
    # Agent can reference previous messages
    Session with Handoffs

    Sessions track which agent generated each item:

    Python
    triage_agent = Agent(name="triage", ...)
    specialist = Agent(name="specialist", handoffs=[handoff(specialist)])
    
    result = await Runner.run(
        triage_agent,
        input,
        session=session,
        conversation_id="conv-123",
    )
    
    # Session includes items from both agents
    items = await session.load_items("conv-123")
    for item in items:
        print(f"Agent: {item.agent.name}")
    Session with Multiple Conversations
    Python
    # User A's conversation
    await Runner.run(agent, input, session=session, conversation_id="user-a")
    
    # User B's conversation (separate history)
    await Runner.run(agent, input, session=session, conversation_id="user-b")
    
    # Each conversation has independent history

    Session Implementation Details

    Item Storage

    Sessions store items as input items:

    Python
    # Items are converted to input format before storage
    input_items = run_items_to_input_items(run_result.new_items)
    
    await session.save_items(
        conversation_id="conv-123",
        items=input_items,
    )
    Item Retrieval

    Sessions retrieve items with optional limits:

    Python
    # Retrieve all items
    all_items = await session.load_items("conv-123")
    
    # Retrieve last 50 items
    recent_items = await session.load_items("conv-123", max_items=50)
    Session and RunState

    RunState includes session information:

    Python
    state = result.to_state()
    # state._session_items contains items from the session
    # state._conversation_id contains the conversation ID

    Custom Session Implementation

    Creating a Custom Session

    Implement the SessionABC interface:

    Python
    from agents import SessionABC, TResponseInputItem
    from typing import Optional
    
    class CustomSession(SessionABC):
        def __init__(self, backend: Any):
            self.backend = backend
        
        async def save_items(
            self,
            conversation_id: str,
            items: list[TResponseInputItem],
        ) -> None:
            """Save items to custom backend."""
            await self.backend.save(conversation_id, items)
        
        async def load_items(
            self,
            conversation_id: str,
            *,
            max_items: Optional[int] = None,
        ) -> list[TResponseInputItem]:
            """Load items from custom backend."""
            items = await self.backend.load(conversation_id)
            if max_items:
                return items[-max_items:]
            return items
    Using Custom Session
    Python
    backend = MyCustomBackend()
    session = CustomSession(backend)
    
    result = await Runner.run(
        agent,
        input,
        session=session,
        conversation_id="conv-123",
    )

    Use cases:

    • Redis/MongoDB storage
    • Cloud storage (S3, etc.)
    • Custom encryption
    • Distributed systems

    Session and Tracing

    Session Metadata in Traces

    Sessions include metadata in traces:

    Python
    result = await Runner.run(
        agent,
        input,
        session=session,
        conversation_id="conv-123",
    )
    
    # Trace includes session information
    print(result.trace.metadata)
    # {"conversation_id": "conv-123", "session_type": "SQLiteSession"}
    Session Analytics

    Analyze session data:

    Python
    items = await session.load_items("conv-123")
    
    # Count messages
    message_count = len([i for i in items if i["type"] == "user"])
    
    # Estimate token usage
    token_estimate = estimate_tokens(items)
    
    # Analyze patterns
    patterns = analyze_conversation_patterns(items)

    Session Best Practices

    1. Use Unique Conversation IDs

    Always use unique identifiers:

    Python
    # Good - unique per user/conversation
    conversation_id = f"user-{user_id}-conv-{uuid.uuid4()}"
    
    # Avoid - hardcoded or predictable
    conversation_id = "conversation-1"  # Will collide
    2. Set Appropriate Limits

    Configure session limits based on your needs:

    Python
    # Good - reasonable limits
    settings = SessionSettings(max_items=100)
    
    # Avoid - too high (token waste)
    settings = SessionSettings(max_items=10000)
    
    # Avoid - too low (loss of context)
    settings = SessionSettings(max_items=5)
    3. Handle Session Errors

    Handle session failures gracefully:

    Python
    # Good
    try:
        items = await session.load_items(conversation_id)
    except SessionError as e:
        logger.error(f"Session error: {e}")
        items = []  # Start fresh
    
    # Avoid - crashes on session error
    items = await session.load_items(conversation_id)  # Might crash
    4. Clean Up Old Sessions

    Implement cleanup for old sessions:

    Python
    # Clean up sessions older than 30 days
    async def cleanup_old_sessions():
        cutoff_date = datetime.now() - timedelta(days=30)
        old_sessions = await find_sessions_before(cutoff_date)
        for session_id in old_sessions:
            await session.delete(session_id)
    5. Monitor Session Performance

    Track session performance:

    Python
    import time
    
    start = time.time()
    items = await session.load_items(conversation_id)
    duration = time.time() - start
    
    if duration > 1.0:  # More than 1 second
        logger.warning(f"Slow session load: {duration}s for {conversation_id}")

    Session Patterns

    1. Per-User Sessions

    Each user gets their own conversation history:

    Python
    async def handle_user_message(user_id: str, message: str):
        conversation_id = f"user-{user_id}"
        result = await Runner.run(
            agent,
            message,
            session=session,
            conversation_id=conversation_id,
        )
        return result.final_output
    2. Thread-Based Sessions

    Each conversation thread gets its own history:

    Python
    async def handle_thread_message(thread_id: str, message: str):
        conversation_id = f"thread-{thread_id}"
        result = await Runner.run(
            agent,
            message,
            session=session,
            conversation_id=conversation_id,
        )
        return result.final_output
    3. Session with Context Window Management

    Manage context window with sessions:

    Python
    def calculate_max_items(context_window: int) -> int:
        """Calculate max items based on context window."""
        avg_tokens_per_item = 100
        return context_window // avg_tokens_per_item
    
    settings = SessionSettings(
        max_items=calculate_max_items(128000),  # 128k context window
    )
    4. Session with Encryption

    Encrypt sensitive session data:

    Python
    class EncryptedSession(SessionABC):
        def __init__(self, backend: Any, encryption_key: str):
            self.backend = backend
            self.cipher = Fernet(encryption_key)
        
        async def save_items(self, conversation_id: str, items: list):
            encrypted = self.cipher.encrypt(json.dumps(items).encode())
            await self.backend.save(conversation_id, encrypted)
        
        async def load_items(self, conversation_id: str, *, max_items=None):
            encrypted = await self.backend.load(conversation_id)
            decrypted = json.loads(self.cipher.decrypt(encrypted))
            if max_items:
                return decrypted[-max_items:]
            return decrypted
    5. Session with Analytics

    Track conversation analytics:

    Python
    class AnalyticsSession(SessionABC):
        def __init__(self, backend: Any, analytics: Analytics):
            self.backend = backend
            self.analytics = analytics
        
        async def save_items(self, conversation_id: str, items: list):
            await self.backend.save(conversation_id, items)
            
            # Track analytics
            await self.analytics.track(
                conversation_id=conversation_id,
                item_count=len(items),
                timestamp=datetime.now(),
            )

    Session and Server-Managed Conversations

    Compatibility

    Server-managed conversations have limitations:

    Python
    # Server-managed conversations don't support:
    # - Custom input filters
    # - Handoff input filters
    # - Session compaction (handled by server)
    
    # They do support:
    # - Automatic history management
    # - Prompt caching
    # - Better performance
    When to Avoid Server-Managed

    Avoid when:

    • You need custom session logic
    • You need to analyze conversation data locally
    • You're not using OpenAI exclusively
    • You need fine-grained control

    Session and Memory Rollouts

    Sandbox Memory Rollouts

    For sandbox agents, sessions can include memory rollouts:

    Python
    from agents import SandboxAgent, SandboxRunConfig
    
    agent = SandboxAgent(
        name="sandbox_agent",
        instructions="Work in the sandbox",
    )
    
    config = SandboxRunConfig(
        client=UnixLocalSandboxClient(),
        memory_rollout_id="rollout-123",  # Memory rollout ID
    )
    
    result = await Runner.run(
        agent,
        input,
        run_config=config,
        session=session,
    )

    Session Error Handling

    Session Errors

    Handle session-related errors:

    Python
    from agents import SessionError
    
    try:
        items = await session.load_items(conversation_id)
    except SessionError as e:
        logger.error(f"Session error: {e}")
        # Fallback to no history
        items = []
    Recovery Strategies

    Implement recovery strategies:

    Python
    async def load_with_fallback(conversation_id: str):
        """Load session with fallback."""
        try:
            return await session.load_items(conversation_id)
        except SessionError:
            # Try to recover from backup
            try:
                return await backup_session.load_items(conversation_id)
            except SessionError:
                # Start fresh
                return []

    Session Performance

    Optimization Techniques

    Optimize session performance:

    Python
    # 1. Use connection pooling
    session = SQLiteSession(
        db_path="conversations.db",
        pool_size=10,
    )
    
    # 2. Batch operations
    await session.save_batch([
        (conv_id_1, items_1),
        (conv_id_2, items_2),
    ])
    
    # 3. Use indexes
    session.create_index("conversation_id")
    
    # 4. Cache frequently accessed sessions
    cache = LRUCache(maxsize=100)
    Monitoring Session Performance

    Track session metrics:

    Python
    class MonitoredSession(SessionABC):
        def __init__(self, wrapped_session: SessionABC):
            self.wrapped = wrapped_session
            self.metrics = {}
        
        async def load_items(self, conversation_id: str, *, max_items=None):
            start = time.time()
            items = await self.wrapped.load_items(conversation_id, max_items=max_items)
            duration = time.time() - start
            
            self.metrics[conversation_id] = {
                "load_time": duration,
                "item_count": len(items),
            }
            
            return items

    Summary

    Sessions enable conversation persistence. Key takeaways:

    1. Sessions store conversation history
    2. SQLiteSession provides local SQLite storage
    3. OpenAIConversationsSession uses server-managed storage
    4. Conversation IDs identify unique conversations
    5. Session limits control history size
    6. Compaction reduces token usage for long conversations
    7. Input callbacks customize history combination
    8. Custom sessions can implement custom backends
    9. SessionABC defines the session interface
    10. Item storage uses input item format
    11. Item retrieval supports optional limits
    12. RunState includes session information
    13. Handoffs are tracked across agents
    14. Multiple conversations can coexist
    15. Tracing includes session metadata
    16. Analytics can track conversation patterns
    17. Encryption can protect sensitive data
    18. Performance can be optimized
    19. Error handling should be graceful
    20. Cleanup prevents data bloat

    Sessions are essential for building conversational agents that remember context across interactions.