Memory and Sessions in the OpenAI Agents SDK enable conversation persistence across agent runs. Think of sessions as "conversation memory" - they allow agents to remember previous interactions, maintain context over time, and provide continuity in multi-turn conversations. This is essential for building chatbots, assistants, and any application where conversation history matters.
Core Concepts
What is a Session?
A session is a persistent store of conversation history between a user and an agent (or multiple agents). It:
Stores all messages, tool calls, and outputs from a conversation
Retrieves relevant history when resuming a conversation
Manages conversation length and token usage
Compacts long conversations to preserve context while reducing tokens
OpenAIResponsesCompactionSession - OpenAI Responses API with compaction
Custom Sessions - Implement your own session backend
Session Interface
Session ABC
All sessions implement the SessionABC base class:
Python
classSessionABC(abc.ABC):
@abc.abstractmethodasyncdefsave_items(
self,
conversation_id: str,
items: list[TResponseInputItem],
) -> None:
"""Save items to the session."""pass @abc.abstractmethodasyncdefload_items(
self,
conversation_id: str,
*,
max_items: int | None = None,
) -> list[TResponseInputItem]:
"""Load items from the session."""pass
Key Methods:
save_items() - Store conversation items
load_items() - Retrieve conversation items
Session Settings
Configure session behavior:
Python
@dataclassclassSessionSettings:
max_items: int | None = None"""Maximum number of items to retrieve from session."""
compaction_enabled: bool = False"""Whether to enable conversation compaction."""
compaction_threshold: int = 100"""Number of items before compaction triggers."""
SQLite Session
Basic Usage
Python
from agents import Agent, SQLiteSession, Runner
# Create a session
session = SQLiteSession(db_path="conversations.db")
agent = Agent(
name="assistant",
instructions="You are a helpful assistant",
)
# First message
result1 = await Runner.run(
agent,
"Hello, I'm John",
session=session,
conversation_id="user-123",
)
# Second message (with history)
result2 = await Runner.run(
agent,
"What's my name?",
session=session,
conversation_id="user-123",
)
# Agent remembers: "Your name is John"
Session Initialization
Python
from agents import SQLiteSession
# Create with custom path
session = SQLiteSession(db_path="/path/to/database.db")
# Create with in-memory database (for testing)
session = SQLiteSession(db_path=":memory:")
Conversation IDs
Conversation IDs identify unique conversations:
Python
# Each user/conversation gets a unique IDawait Runner.run(agent, input, session=session, conversation_id="conv-123")
await Runner.run(agent, input, session=session, conversation_id="conv-456")
Best practices:
Use user IDs for per-user conversations
Use thread IDs for conversation threads
Use UUIDs for unique identifiers
Avoid hardcoded IDs in production
Session Limits
Control how much history is loaded:
Python
from agents import SessionSettings
settings = SessionSettings(max_items=50)
result = await Runner.run(
agent,
input,
session=session,
session_settings=settings,
)
from agents import Agent, OpenAIConversationsSession, Runner
session = OpenAIConversationsSession()
agent = Agent(
name="assistant",
instructions="You are a helpful assistant",
)
result = await Runner.run(
agent,
"Hello",
session=session,
conversation_id="conv-123",
)
Benefits:
No local storage needed
Automatic prompt caching
Better performance
Managed by OpenAI
How it works:
OpenAI stores conversation history on their servers
Only deltas (new items) are sent
The server maintains full context
Session persistence is handled automatically
When to Use
Use OpenAIConversationsSession when:
You want to reduce local storage
You're using OpenAI's Responses API
You want automatic prompt caching
You don't need custom session logic
Use SQLiteSession when:
You need local control over data
You want custom session logic
You're not using OpenAI exclusively
You need to analyze conversation data
OpenAI Responses Compaction Session
Intelligent Compaction
The Responses API supports intelligent conversation compaction:
Python
from agents import Agent, OpenAIResponsesCompactionSession, Runner
session = OpenAIResponsesCompactionSession()
agent = Agent(
name="assistant",
instructions="You are a helpful assistant",
)
result = await Runner.run(
agent,
"Hello",
session=session,
conversation_id="conv-123",
)
How compaction works:
Older messages are summarized
Key information is preserved
Token usage is reduced
Context is maintained
Compaction Configuration
Python
from agents import OpenAIResponsesCompactionArgs
args = OpenAIResponsesCompactionArgs(
target_items=50, # Target number of items after compaction
min_summary_tokens=100, # Minimum tokens for summaries
)
session = OpenAIResponsesCompactionSession(compaction_args=args)
Session Input Callback
Custom Input Handling
Control how new input is combined with session history:
Python
from agents import SessionInputCallback
defcustom_input_callback(
history: list[TResponseInputItem],
new_input: str | list[TResponseInputItem],
) -> list[TResponseInputItem]:
"""Custom logic for combining history and new input."""# Only keep last 10 items from history
recent_history = history[-10:]
# Add new inputifisinstance(new_input, str):
recent_history.append({"type": "user", "content": new_input})
else:
recent_history.extend(new_input)
return recent_history
result = await Runner.run(
agent,
input,
session=session,
session_input_callback=custom_input_callback,
)
Use cases:
Custom history filtering
Special input processing
Context window management
Custom formatting
Session and Agent Lifecycle
Session Persistence Across Runs
Python
# Run 1
result1 = await Runner.run(
agent,
"Hello",
session=session,
conversation_id="conv-123",
)
# Run 2 - history is automatically loaded
result2 = await Runner.run(
agent,
"What did I say before?",
session=session,
conversation_id="conv-123",
)
# Agent can reference previous messages
Session with Handoffs
Sessions track which agent generated each item:
Python
triage_agent = Agent(name="triage", ...)
specialist = Agent(name="specialist", handoffs=[handoff(specialist)])
result = await Runner.run(
triage_agent,
input,
session=session,
conversation_id="conv-123",
)
# Session includes items from both agents
items = await session.load_items("conv-123")
for item in items:
print(f"Agent: {item.agent.name}")
Session with Multiple Conversations
Python
# User A's conversationawait Runner.run(agent, input, session=session, conversation_id="user-a")
# User B's conversation (separate history)await Runner.run(agent, input, session=session, conversation_id="user-b")
# Each conversation has independent history
Session Implementation Details
Item Storage
Sessions store items as input items:
Python
# Items are converted to input format before storage
input_items = run_items_to_input_items(run_result.new_items)
await session.save_items(
conversation_id="conv-123",
items=input_items,
)
Item Retrieval
Sessions retrieve items with optional limits:
Python
# Retrieve all items
all_items = await session.load_items("conv-123")
# Retrieve last 50 items
recent_items = await session.load_items("conv-123", max_items=50)
Session and RunState
RunState includes session information:
Python
state = result.to_state()
# state._session_items contains items from the session# state._conversation_id contains the conversation ID
result = await Runner.run(
agent,
input,
session=session,
conversation_id="conv-123",
)
# Trace includes session informationprint(result.trace.metadata)
# {"conversation_id": "conv-123", "session_type": "SQLiteSession"}
Session Analytics
Analyze session data:
Python
items = await session.load_items("conv-123")
# Count messages
message_count = len([i for i in items if i["type"] == "user"])
# Estimate token usage
token_estimate = estimate_tokens(items)
# Analyze patterns
patterns = analyze_conversation_patterns(items)
Session Best Practices
1. Use Unique Conversation IDs
Always use unique identifiers:
Python
# Good - unique per user/conversation
conversation_id = f"user-{user_id}-conv-{uuid.uuid4()}"# Avoid - hardcoded or predictable
conversation_id = "conversation-1"# Will collide
2. Set Appropriate Limits
Configure session limits based on your needs:
Python
# Good - reasonable limits
settings = SessionSettings(max_items=100)
# Avoid - too high (token waste)
settings = SessionSettings(max_items=10000)
# Avoid - too low (loss of context)
settings = SessionSettings(max_items=5)
# Clean up sessions older than 30 daysasyncdefcleanup_old_sessions():
cutoff_date = datetime.now() - timedelta(days=30)
old_sessions = await find_sessions_before(cutoff_date)
for session_id in old_sessions:
await session.delete(session_id)
5. Monitor Session Performance
Track session performance:
Python
import time
start = time.time()
items = await session.load_items(conversation_id)
duration = time.time() - start
if duration > 1.0: # More than 1 second
logger.warning(f"Slow session load: {duration}s for {conversation_id}")
# Server-managed conversations don't support:# - Custom input filters# - Handoff input filters# - Session compaction (handled by server)# They do support:# - Automatic history management# - Prompt caching# - Better performance
When to Avoid Server-Managed
Avoid when:
You need custom session logic
You need to analyze conversation data locally
You're not using OpenAI exclusively
You need fine-grained control
Session and Memory Rollouts
Sandbox Memory Rollouts
For sandbox agents, sessions can include memory rollouts:
Python
from agents import SandboxAgent, SandboxRunConfig
agent = SandboxAgent(
name="sandbox_agent",
instructions="Work in the sandbox",
)
config = SandboxRunConfig(
client=UnixLocalSandboxClient(),
memory_rollout_id="rollout-123", # Memory rollout ID
)
result = await Runner.run(
agent,
input,
run_config=config,
session=session,
)
Session Error Handling
Session Errors
Handle session-related errors:
Python
from agents import SessionError
try:
items = await session.load_items(conversation_id)
except SessionError as e:
logger.error(f"Session error: {e}")
# Fallback to no history
items = []
Recovery Strategies
Implement recovery strategies:
Python
asyncdefload_with_fallback(conversation_id: str):
"""Load session with fallback."""try:
returnawait session.load_items(conversation_id)
except SessionError:
# Try to recover from backuptry:
returnawait backup_session.load_items(conversation_id)
except SessionError:
# Start freshreturn []