Model Providers are the abstraction layer that connects the OpenAI Agents SDK to various Large Language Model (LLM) APIs. Think of Model Providers as "translators" or "adapters" - they translate the SDK's standardized requests into the specific format required by different LLM providers (OpenAI, Anthropic, Google, etc.).
Core Concepts
What is a Model Provider?
A Model Provider is responsible for:
Resolving model names to concrete Model instances
Managing model connections and resources
Providing a consistent interface across different LLM APIs
Handling provider-specific features and quirks
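The responsibilities above can be pictured with a small, standard-library-only sketch. `ToyModel`, `ToyProvider`, and `InMemoryProvider` are hypothetical names invented for illustration; they only mimic the shape of a provider (resolve a name, release resources) and are not the SDK's classes.

```python
import abc
import asyncio
from typing import Optional


class ToyModel:
    """Stand-in for a model object; it only remembers its name."""

    def __init__(self, name: str) -> None:
        self.name = name


class ToyProvider(abc.ABC):
    """Hypothetical base class with the two duties of a provider."""

    @abc.abstractmethod
    def get_model(self, model_name: Optional[str]) -> ToyModel:
        """Resolve a model name to a concrete model object."""

    async def aclose(self) -> None:
        """Release resources; the default has nothing to release."""
        return None


class InMemoryProvider(ToyProvider):
    """Hypothetical provider that falls back to a default model name."""

    def __init__(self, default_name: str = "toy-default") -> None:
        self.default_name = default_name
        self.closed = False

    def get_model(self, model_name: Optional[str]) -> ToyModel:
        # A missing name resolves to the provider's default.
        return ToyModel(model_name or self.default_name)

    async def aclose(self) -> None:
        # A real provider would close HTTP clients or sockets here.
        self.closed = True


provider = InMemoryProvider()
named = provider.get_model("toy-large")
fallback = provider.get_model(None)
asyncio.run(provider.aclose())
```

Agent code written against `ToyProvider` would not need to change if a different subclass were swapped in, which is the point of the abstraction.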
Why Model Providers Matter
Provider Agnostic - Switch between LLM providers without changing agent code
The ModelProvider interface defines how models are resolved:
Python
class ModelProvider(abc.ABC):
    """The base interface for a model provider."""

    @abc.abstractmethod
    def get_model(self, model_name: str | None) -> Model:
        """Get a model by name."""
        pass

    async def aclose(self) -> None:
        """Release any resources held by the provider."""
        return None
OpenAI Provider
OpenAIProvider
The default provider for OpenAI models:
Python
from agents import Agent, OpenAIProvider, RunConfig, Runner

provider = OpenAIProvider()
agent = Agent(
    name="assistant",
    instructions="You are a helpful assistant",
    model="gpt-4o",
)
# The provider is supplied through RunConfig, not as a direct Runner.run argument
result = await Runner.run(
    agent,
    "Hello",
    run_config=RunConfig(model_provider=provider),
)
OpenAI Chat Completions Model
Uses the OpenAI Chat Completions API:
Python
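from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel, Runner

# The model wraps an AsyncOpenAI client
model = OpenAIChatCompletionsModel(model="gpt-4o", openai_client=AsyncOpenAI())
agent = Agent(
    name="assistant",
    instructions="You are a helpful assistant",
    model=model,  # The model is set on the agent, not on Runner.run
)
result = await Runner.run(
    agent,
    "Hello",
)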
When to use:
You want to use the Chat Completions API
You need compatibility with older OpenAI integrations
You want more control over the API
OpenAI Responses Model
Uses the newer OpenAI Responses API:
Python
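from openai import AsyncOpenAI
from agents import Agent, OpenAIResponsesModel, Runner

# The model wraps an AsyncOpenAI client
model = OpenAIResponsesModel(model="gpt-4o", openai_client=AsyncOpenAI())
agent = Agent(
    name="assistant",
    instructions="You are a helpful assistant",
    model=model,  # The model is set on the agent, not on Runner.run
)
result = await Runner.run(
    agent,
    "Hello",
)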
When to use:
You want the latest OpenAI features
You need better tool support
You want server-managed conversations
You need prompt caching
OpenAI Responses WebSocket Model
Uses WebSocket transport for Responses API:
Python
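from openai import AsyncOpenAI
from agents import Agent, OpenAIResponsesWSModel, Runner

# Assumes the same constructor arguments as OpenAIResponsesModel
model = OpenAIResponsesWSModel(model="gpt-4o", openai_client=AsyncOpenAI())
agent = Agent(
    name="assistant",
    instructions="You are a helpful assistant",
    model=model,  # The model is set on the agent, not on Runner.run
)
result = await Runner.run(
    agent,
    "Hello",
)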
When to use:
You want real-time streaming
You need lower latency
You're building real-time applications
Default Model Selection
The SDK falls back to a default model when an agent does not specify one:
Python
from agents.models.default_models import get_default_model
# Default is currently gpt-4.1; the OPENAI_DEFAULT_MODEL environment variable overrides it
print(get_default_model())  # e.g. "gpt-4.1"
GPT-5 Special Handling
GPT-5 models are reasoning models, so reasoning options are passed through ModelSettings:
Python
from openai.types.shared import Reasoning
from agents import Agent, ModelSettings

agent = Agent(
    name="assistant",
    instructions="You are a helpful assistant",
    model="gpt-5",
    model_settings=ModelSettings(
        # Reasoning effort is passed as a Reasoning object
        reasoning=Reasoning(effort="high"),
    ),
)
The SDK automatically adjusts settings when you specify a GPT-5 model.
MultiProvider
Using Multiple Providers
MultiProvider chooses a provider from the prefix of the model name rather than trying providers in sequence:
Python
from agents import Agent, RunConfig, Runner
from agents.models.multi_provider import MultiProvider

provider = MultiProvider()
agent = Agent(
    name="assistant",
    instructions="You are a helpful assistant",
    model="gpt-4o",  # No prefix: resolved by the OpenAI provider
)
result = await Runner.run(
    agent,
    "Hello",
    run_config=RunConfig(model_provider=provider),
)
Default Prefix Mapping
By default, the prefix decides which provider resolves the name:
Python
# "gpt-4o" or "openai/gpt-4o"        -> OpenAI provider
# "litellm/anthropic/claude-3-opus"  -> LiteLLM provider (needs the optional litellm extra)
Model Name Resolution
Additional prefixes can be registered through a MultiProviderMap:
Python
from agents.models.multi_provider import MultiProvider, MultiProviderMap

provider_map = MultiProviderMap()
# CustomProvider is your own ModelProvider subclass (see Custom Model Providers)
provider_map.add_provider("custom", CustomProvider(api_key="your-api-key"))
provider = MultiProvider(provider_map=provider_map)
# The "custom" prefix routes this name to CustomProvider
model = provider.get_model("custom/my-model")
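One common way to resolve names across several providers is to route on a prefix in the model name. The sketch below is a toy illustration of that idea using only the standard library; the registry contents and the `split_model_name` and `route` helpers are hypothetical and are not the SDK's implementation.

```python
from typing import Dict, Tuple

# Hypothetical registry: prefix -> label of the provider that would handle it.
PROVIDERS_BY_PREFIX: Dict[str, str] = {
    "openai": "OpenAI-style provider",
    "litellm": "LiteLLM-style provider",
}


def split_model_name(model_name: str, default_prefix: str = "openai") -> Tuple[str, str]:
    """Split "prefix/name" into (prefix, name); unprefixed names get the default prefix."""
    if "/" not in model_name:
        return default_prefix, model_name
    # Split only on the first slash so nested names such as "a/b/c" keep "b/c" intact.
    prefix, rest = model_name.split("/", 1)
    return prefix, rest


def route(model_name: str) -> Tuple[str, str]:
    """Return (provider label, model name handed to that provider)."""
    prefix, rest = split_model_name(model_name)
    if prefix not in PROVIDERS_BY_PREFIX:
        raise ValueError(f"no provider registered for prefix {prefix!r}")
    return PROVIDERS_BY_PREFIX[prefix], rest
```

With this scheme a model name alone is enough to pick a provider, so agents can mix providers without holding a reference to any of them.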
Custom Model Providers
Creating a Custom Provider
Implement the ModelProvider interface:
Python
from agents import Agent, Model, ModelProvider, RunConfig, Runner

# CustomClient and CustomModel are placeholders for your own client and Model subclass
class CustomProvider(ModelProvider):
    def __init__(self, api_key: str):
        self.api_key = api_key
        self.client = CustomClient(api_key)

    def get_model(self, model_name: str | None) -> Model:
        """Get a model instance."""
        return CustomModel(
            model_name or "default-model",
            self.client,
        )

    async def aclose(self) -> None:
        """Release resources."""
        await self.client.close()

provider = CustomProvider(api_key="your-api-key")
agent = Agent(
    name="assistant",
    instructions="You are a helpful assistant",
    model="custom-model",  # Resolved by the provider passed in RunConfig
)
result = await Runner.run(
    agent,
    "Hello",
    run_config=RunConfig(model_provider=provider),
)
Model Settings
ModelSettings Class
Configure model-specific parameters:
Python
from agents import ModelSettings
settings = ModelSettings(
    temperature=0.7,        # 0.0 - 2.0, higher = more creative
    top_p=0.9,              # 0.0 - 1.0, nucleus sampling
    max_tokens=1000,        # Maximum tokens in response
    presence_penalty=0.0,   # -2.0 - 2.0
    frequency_penalty=0.0,  # -2.0 - 2.0
)
Temperature
Controls randomness:
Python
# Low temperature - more deterministic
settings = ModelSettings(temperature=0.1)
# High temperature - more creative
settings = ModelSettings(temperature=1.5)
Use cases:
Low (0.0-0.3): Factual responses, code generation
Medium (0.4-0.7): General conversation
High (0.8-1.5): Creative writing, brainstorming
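To see why low temperatures read as more deterministic, it can help to look at how temperature rescales token scores before sampling. The following is a minimal sketch of temperature-scaled softmax; the three logits are made-up numbers, and a real model samples over a far larger vocabulary.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
sharp = softmax_with_temperature(logits, 0.1)  # nearly all mass on the top token
flat = softmax_with_temperature(logits, 1.5)   # mass spread across the tokens
```

At 0.1 the top token takes almost all of the probability, while at 1.5 the alternatives keep a meaningful share, which is where the extra variety comes from.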
Max Tokens
Limit response length:
Python
# Short responses
settings = ModelSettings(max_tokens=100)
# Long responses
settings = ModelSettings(max_tokens=4000)
Penalties
Control repetition:
Python
# Presence penalty - encourage new topics
settings = ModelSettings(presence_penalty=0.5)
# Frequency penalty - discourage repetition
settings = ModelSettings(frequency_penalty=0.5)
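The two penalties are commonly described as adjustments to token scores: the frequency penalty grows with each repetition of a token, while the presence penalty is a flat deduction once a token has appeared at all. The sketch below applies that rule to made-up logits; `apply_penalties` is a hypothetical helper, not an SDK function.

```python
from collections import Counter
from typing import Dict, List


def apply_penalties(
    logits: Dict[str, float],
    generated: List[str],
    presence_penalty: float,
    frequency_penalty: float,
) -> Dict[str, float]:
    """Lower the score of tokens that already appear in the generated text."""
    counts = Counter(generated)
    adjusted = {}
    for token, logit in logits.items():
        count = counts[token]
        adjusted[token] = (
            logit
            - frequency_penalty * count                       # scales with repetitions
            - presence_penalty * (1.0 if count > 0 else 0.0)  # flat, once seen
        )
    return adjusted


logits = {"cat": 2.0, "dog": 2.0, "fish": 2.0}  # made-up scores
generated = ["cat", "cat", "dog"]
adjusted = apply_penalties(logits, generated, presence_penalty=0.5, frequency_penalty=0.5)
```

Here "cat" (seen twice) is penalized more than "dog" (seen once), and "fish" (unseen) is untouched.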
Model Tracing
ModelTracing Enum
Controls tracing behavior:
Python
from openai.types.shared import Reasoning
from agents import ModelSettings, ModelTracing

# Tracing disabled
tracing = ModelTracing.DISABLED
# Tracing enabled with data
tracing = ModelTracing.ENABLED
# Tracing enabled without sensitive data
tracing = ModelTracing.ENABLED_WITHOUT_DATA

# Provider-specific setting: reasoning effort for GPT-5 models
settings = ModelSettings(
    reasoning=Reasoning(effort="high"),
)
Anthropic Features (via LiteLLM)
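Using Anthropic through LiteLLM (requires the optional extra: pip install "openai-agents[litellm]"):
Python
from agents import Agent
from agents.extensions.models.litellm_model import LitellmModel

model = LitellmModel(model="anthropic/claude-3-opus")
agent = Agent(
    name="assistant",
    instructions="You are a helpful assistant",
    model=model,
)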
Google Features (via LiteLLM)
Using Google through LiteLLM:
Python
from agents import Agent
from agents.extensions.models.litellm_model import LitellmModel

model = LitellmModel(model="gemini/gemini-pro")
agent = Agent(
    name="assistant",
    instructions="You are a helpful assistant",
    model=model,
)
Model Selection Strategies
Per-Agent Model Selection
Different agents can use different models:
Python
# Fast model for simple tasks
quick_agent = Agent(
    name="quick",
    instructions="Quick responses",
    model="gpt-4o-mini",
)
# Smart model for complex tasks
smart_agent = Agent(
    name="smart",
    instructions="Complex reasoning",
    model="gpt-4o",
)
Run-Level Model Override
Override model for a specific run:
Python
from agents import RunConfig, Runner

config = RunConfig(
    model="gpt-4o",  # Overrides the model of every agent in this run
)
result = await Runner.run(
    agent,
    "Hello",
    run_config=config,
)
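The precedence described in this section (a run-level model overrides the agent's model, which in turn overrides the default) can be written down as a tiny resolution function. This is a sketch of the ordering only; `resolve_model` is a hypothetical helper and does not reproduce the SDK's internal logic.

```python
from typing import Optional

DEFAULT_MODEL = "gpt-4.1"  # the default named earlier in this document


def resolve_model(
    run_model: Optional[str],
    agent_model: Optional[str],
    default_model: str = DEFAULT_MODEL,
) -> str:
    """Pick the most specific model that was actually configured."""
    if run_model is not None:
        return run_model      # run-level override wins
    if agent_model is not None:
        return agent_model    # otherwise the agent's own model
    return default_model      # otherwise the default


overridden = resolve_model(run_model="gpt-4o", agent_model="gpt-4o-mini")
per_agent = resolve_model(run_model=None, agent_model="gpt-4o-mini")
defaulted = resolve_model(run_model=None, agent_model=None)
```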
Dynamic Model Selection
Select model based on context:
Python
from agents import RunContextWrapper

def select_model(context: RunContextWrapper) -> str:
    """Select model based on context."""
    if context.context.complexity == "high":
        return "gpt-4o"
    return "gpt-4o-mini"

# Apply via model settings or custom provider
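The selection logic can be exercised without the SDK by substituting small stand-in types. In the sketch below, `TaskContext` and `ContextWrapper` are hypothetical stand-ins (the wrapper only imitates the `.context` attribute used above), and the `complexity` field is an assumed application-defined value.

```python
from dataclasses import dataclass


@dataclass
class TaskContext:
    """Hypothetical application context; `complexity` is an assumed field."""
    complexity: str = "low"


@dataclass
class ContextWrapper:
    """Stand-in wrapper that exposes the application object as `.context`."""
    context: TaskContext


def select_model(wrapper: ContextWrapper) -> str:
    """Route high-complexity work to the larger model, everything else to the smaller one."""
    if wrapper.context.complexity == "high":
        return "gpt-4o"
    return "gpt-4o-mini"


hard = select_model(ContextWrapper(TaskContext(complexity="high")))
easy = select_model(ContextWrapper(TaskContext()))
```

The returned name could then be used when constructing the agent for that request.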
Model Configuration
Global Default Configuration
Set global defaults:
Python
from agents import set_default_openai_key, set_default_openai_api
set_default_openai_key("your-api-key")
set_default_openai_api("responses") # Use Responses API by default
Agent-Level Configuration
Configure model on agent:
Python
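from agents import Agent, ModelSettings

agent = Agent(
    name="assistant",
    instructions="You are a helpful assistant",
    model="gpt-4o",
    model_settings=ModelSettings(temperature=0.7),
)

# Pick a model at random per agent, for example to compare models side by side
import random

# Non-OpenAI models use the litellm/ prefix so they are routed to LiteLLM
model = random.choice(["gpt-4o", "litellm/anthropic/claude-3-opus"])
agent = Agent(
    name="test",
    instructions="Test agent",
    model=model,
)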
3. Cost Optimization
Use cheaper models when possible:
Python
# Use cheap model for classification
classifier = Agent(
    name="classifier",
    instructions="Classify the input",
    model="gpt-4o-mini",
)
# Use expensive model only for generation
generator = Agent(
    name="generator",
    instructions="Generate content",
    model="gpt-4o",
)
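A rough back-of-the-envelope calculation shows why this split can pay off. The prices, token counts, and model labels below are placeholders chosen for easy arithmetic, not real pricing or usage data; substitute your provider's current rates.

```python
# Placeholder prices in dollars per million tokens; not real price data.
PRICE_PER_MILLION_TOKENS = {"small-model": 1.0, "large-model": 10.0}


def estimate_cost(model: str, tokens: int) -> float:
    """Cost in dollars for a number of tokens at the placeholder rate."""
    return tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS[model]


requests = 1_000
classify_tokens = 500 * requests  # every request is classified
generate_tokens = 2_000 * 200     # assume only 200 requests need generation

# Tiered: classify on the small model, generate on the large one.
tiered = estimate_cost("small-model", classify_tokens) + estimate_cost(
    "large-model", generate_tokens
)
# Baseline: run everything on the large model.
all_large = estimate_cost("large-model", classify_tokens + generate_tokens)
```

Under these assumed numbers the tiered setup costs half as much as running everything on the large model; the actual ratio depends entirely on real prices and traffic.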
4. Model-Specific Prompts
Adjust prompts per model:
Python
def get_instructions_for_model(model: str) -> str:
    if "gpt-4" in model:
        return "You are GPT-4, be thorough."
    elif "claude" in model:
        return "You are Claude, be helpful."
    return "You are a helpful assistant."

agent = Agent(
    name="adaptive",
    instructions=get_instructions_for_model("gpt-4o"),
)
5. Provider Redundancy
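MultiProvider routes by model-name prefix and does not fail over on its own, so availability across providers has to be handled around the run:
Python
# primary_agent and fallback_agent are hypothetical agents backed by different providers
try:
    result = await Runner.run(primary_agent, "Hello")
except Exception:
    # If the primary provider is down, retry with the fallback agent
    result = await Runner.run(fallback_agent, "Hello")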
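If failover has to be handled in application code, one possible shape is a helper that tries a list of attempts in order and returns the first success. This is a standard-library sketch; `run_with_fallback`, `flaky_primary`, and `healthy_backup` are hypothetical names, and each attempt stands in for a call that would run an agent against a particular provider.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def run_with_fallback(attempts: Sequence[Callable[[], Awaitable[str]]]) -> str:
    """Await each attempt in order and return the first result that succeeds."""
    errors: List[Exception] = []
    for attempt in attempts:
        try:
            return await attempt()
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(exc)
    raise RuntimeError(f"all {len(errors)} attempts failed")


async def flaky_primary() -> str:
    # Simulates an outage at the first provider.
    raise ConnectionError("primary provider unavailable")


async def healthy_backup() -> str:
    return "answer from backup"


result = asyncio.run(run_with_fallback([flaky_primary, healthy_backup]))
```

Catching every exception is a simplification; in practice only transient failures (timeouts, rate limits, connection errors) are usually worth retrying elsewhere.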
Model and Tracing
Model-Level Tracing
Models can emit traces:
Python
from agents import ModelTracing
await model.get_response(
...,
tracing=ModelTracing.ENABLED,
)
Sensitive Data Handling
Control what's traced:
Python
# Include all data
tracing = ModelTracing.ENABLED
# Exclude sensitive data
tracing = ModelTracing.ENABLED_WITHOUT_DATA
# Disable tracing
tracing = ModelTracing.DISABLED
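The effect of the three modes can be illustrated with a toy span builder. `TracingMode` below is a local stand-in that mirrors the three values listed above, and `build_span` is a hypothetical helper; neither is the SDK's tracing code.

```python
import enum
from typing import Any, Dict, Optional


class TracingMode(enum.Enum):
    """Local stand-in mirroring the three tracing modes."""
    DISABLED = 0
    ENABLED = 1
    ENABLED_WITHOUT_DATA = 2


def build_span(mode: TracingMode, model: str, prompt: str, output: str) -> Optional[Dict[str, Any]]:
    """Return the record a tracer might store for one model call, or None if disabled."""
    if mode is TracingMode.DISABLED:
        return None
    span: Dict[str, Any] = {"model": model}  # metadata is always kept
    if mode is TracingMode.ENABLED:
        # Only the full mode records potentially sensitive inputs and outputs.
        span["input"] = prompt
        span["output"] = output
    return span


full = build_span(TracingMode.ENABLED, "gpt-4o", "secret question", "secret answer")
redacted = build_span(TracingMode.ENABLED_WITHOUT_DATA, "gpt-4o", "secret question", "secret answer")
nothing = build_span(TracingMode.DISABLED, "gpt-4o", "secret question", "secret answer")
```

The redacted mode still shows that a call happened and which model served it, which is often enough for latency and error monitoring.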
Summary
Model Providers enable flexible LLM integration. Key takeaways:
Model Providers abstract LLM API differences
Model interface defines the contract for all models
ModelProvider interface defines model resolution
OpenAIProvider is the default provider
OpenAIChatCompletionsModel uses the Chat Completions API
OpenAIResponsesModel uses the newer Responses API
OpenAIResponsesWSModel uses WebSocket transport
MultiProvider enables using multiple providers
Custom providers can integrate any LLM API
ModelSettings configures model parameters
Temperature controls randomness
Max tokens limits response length
Penalties control repetition
ModelTracing controls observability
Retry advice provides error recovery hints
Provider-specific features like server-managed conversations
LiteLLM enables using 100+ LLM providers
Per-agent model selection for different tasks
Run-level overrides for specific runs
Configuration priority determines which settings apply
Model Providers are essential for building flexible, cost-effective, and resilient agent systems.