Guardrails - Comprehensive Deep Dive

Overview

Guardrails are safety checks that validate input and output at various points in an agent run. Think of guardrails as "security checkpoints" or "quality gates" that ensure the agent is operating within safe and acceptable boundaries. They can prevent harmful content, validate data formats, enforce business rules, and provide custom validation logic.

Core Concepts

What are Guardrails?

Guardrails are functions that run at specific points during agent execution to:

  • Validate - Check that data meets certain criteria
  • Filter - Remove or modify inappropriate content
  • Block - Prevent execution when safety thresholds are crossed
  • Transform - Modify content to meet requirements

Types of Guardrails

  1. Input Guardrails - Run before the agent processes input
  2. Output Guardrails - Run after the agent produces output
  3. Tool Input Guardrails - Run before tool execution
  4. Tool Output Guardrails - Run after tool execution

    Guardrail Anatomy

    Every guardrail function returns a GuardrailFunctionOutput:

    Python
    @dataclass
    class GuardrailFunctionOutput:
        output_info: Any
        """Information about the guardrail's check."""
        
        tripwire_triggered: bool
        """Whether the guardrail was triggered (blocked execution)."""

    Tripwire Concept:

    • When tripwire_triggered = False - Execution continues normally
    • When tripwire_triggered = True - Execution is halted with an exception
    • The output_info can contain details about what was checked
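
The tripwire mechanics above can be sketched without the SDK. In this minimal, library-free illustration, `GuardrailOutput`, `TripwireTriggered`, `run_guardrails`, and `no_politics` are hypothetical stand-ins for the SDK's types and runner logic, not its actual implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class GuardrailOutput:
    """Stand-in for the result a guardrail function returns."""
    output_info: Any
    tripwire_triggered: bool


class TripwireTriggered(Exception):
    """Stand-in for the exception raised when a tripwire fires."""

    def __init__(self, output: GuardrailOutput):
        super().__init__(str(output.output_info))
        self.output = output


def run_guardrails(
    guardrails: list[Callable[[str], GuardrailOutput]], text: str
) -> list[GuardrailOutput]:
    """Run each check in order; halt on the first triggered tripwire."""
    results = []
    for check in guardrails:
        output = check(text)
        if output.tripwire_triggered:
            raise TripwireTriggered(output)
        results.append(output)
    return results


def no_politics(text: str) -> GuardrailOutput:
    """Example check: trip when the text mentions politics."""
    flagged = "politics" in text.lower()
    info = "Political content detected" if flagged else "Input is appropriate"
    return GuardrailOutput(output_info=info, tripwire_triggered=flagged)
```

A caller would wrap `run_guardrails` in `try`/`except TripwireTriggered` and read `exc.output.output_info` to learn why the run was halted.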

    Input Guardrails

    Purpose

    Input guardrails validate or filter input before it reaches the LLM. They run:

    • Only on the first turn of a run
    • Only for the starting agent (not for handoffs)
    • Optionally in parallel with the agent (default) or before the agent starts
    Basic Input Guardrail
    Python
    from agents import (
        Agent,
        GuardrailFunctionOutput,
        RunContextWrapper,
        TResponseInputItem,
        input_guardrail,
    )
    
    @input_guardrail
    def check_off_topic(
        context: RunContextWrapper,
        agent: Agent,
        input: str | list[TResponseInputItem],
    ) -> GuardrailFunctionOutput:
        """Check if input is off-topic."""
        # Only plain-string input is inspected; structured input passes through
        if isinstance(input, str) and "politics" in input.lower():
            return GuardrailFunctionOutput(
                output_info="Political content detected",
                tripwire_triggered=True,
            )
        
        return GuardrailFunctionOutput(
            output_info="Input is appropriate",
            tripwire_triggered=False,
        )
    
    agent = Agent(
        name="safe_agent",
        instructions="Help with safe topics only",
        input_guardrails=[check_off_topic],
    )
    Input Guardrail with Structured Input
    Python
    from agents import (
        Agent,
        GuardrailFunctionOutput,
        RunContextWrapper,
        TResponseInputItem,
        input_guardrail,
    )
    
    @input_guardrail
    def validate_structured_input(
        context: RunContextWrapper,
        agent: Agent,
        input: str | list[TResponseInputItem],
    ) -> GuardrailFunctionOutput:
        """Validate structured input items."""
        # Plain-string input has no items to inspect
        items = [] if isinstance(input, str) else input
        for item in items:
            if item.get("type") == "image_url":
                # Check image URL is from the allowed domain; the trailing slash
                # rejects look-alike hosts such as allowed-domain.com.evil.example
                url = item.get("image_url", {}).get("url", "")
                if not url.startswith("https://allowed-domain.com/"):
                    return GuardrailFunctionOutput(
                        output_info="Image from disallowed domain",
                        tripwire_triggered=True,
                    )
        
        return GuardrailFunctionOutput(
            output_info="All inputs valid",
            tripwire_triggered=False,
        )
    Parallel vs Sequential Execution

    An input guardrail can run in parallel with the agent (default) or run to completion before the agent starts:

    Python
    # Parallel (default) - the guardrail runs concurrently with the agent
    @input_guardrail(run_in_parallel=True)
    def guardrail1(context, agent, input):
        ...
    
    @input_guardrail(run_in_parallel=True)
    def guardrail2(context, agent, input):
        ...
    
    # Sequential (blocking) - the guardrail finishes before the agent starts
    @input_guardrail(run_in_parallel=False)
    def guardrail3(context, agent, input):
        ...

    When to use parallel:

    • Latency matters and the check is expected to pass most of the time
    • It is acceptable for the agent to begin work before a tripwire fires

    When to use sequential:

    • The agent must not start (no tokens spent, no tools called) unless the check passes
    • The check is cheap compared with the agent run
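
The difference between running a guardrail alongside the agent and running it before the agent starts can be sketched with plain `asyncio`. The functions `guardrail`, `agent_call`, and `run` below are hypothetical placeholders that only record the order of events; they are not SDK calls.

```python
import asyncio


async def guardrail(log: list[str]) -> bool:
    """Pretend check that takes a moment; returns whether the tripwire fired."""
    log.append("guardrail:start")
    await asyncio.sleep(0.01)
    log.append("guardrail:done")
    return False


async def agent_call(log: list[str]) -> str:
    """Pretend model call."""
    log.append("agent:start")
    await asyncio.sleep(0.02)
    log.append("agent:done")
    return "answer"


async def run(parallel: bool) -> tuple[list[str], str]:
    log: list[str] = []
    if parallel:
        # Both coroutines are scheduled together, so the agent starts
        # while the guardrail is still waiting on its check.
        tripped, answer = await asyncio.gather(guardrail(log), agent_call(log))
    else:
        # The guardrail must finish before the agent is started at all.
        tripped = await guardrail(log)
        answer = await agent_call(log)
    if tripped:
        raise RuntimeError("tripwire triggered")
    return log, answer
```

In the parallel case the log shows `agent:start` before `guardrail:done`; in the blocking case the order is reversed. The sketch only illustrates ordering: it does not cancel the in-flight agent call when a tripwire fires.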
    Async Input Guardrails

    Guardrails can be async:

    Python
    @input_guardrail
    async def check_moderation_api(
        context: RunContextWrapper,
        agent: Agent,
        input: str | list[TResponseInputItem],
    ) -> GuardrailFunctionOutput:
        """Check input against external moderation API."""
        if isinstance(input, str):
            # Call external API (moderation_api is a placeholder client)
            result = await moderation_api.check(input)
            if result.is_flagged:
                return GuardrailFunctionOutput(
                    output_info=f"Flagged by moderation: {result.reason}",
                    tripwire_triggered=True,
                )
        
        return GuardrailFunctionOutput(
            output_info="Passed moderation",
            tripwire_triggered=False,
        )
    Input Guardrail Parameters
    Python
    @input_guardrail(
        name="content_safety",  # Custom name for tracing
        run_in_parallel=False,  # Sequential execution
    )
    def my_guardrail(context, agent, input):
        ...

    Parameters:

    • name - Custom name (defaults to function name)
    • run_in_parallel - Whether to run in parallel (default: True)

    Output Guardrails

    Purpose

    Output guardrails validate the agent's final output before it's returned. They run:

    • After the agent produces a final output
    • Only for the agent that produces the final output (the last agent in the run)
    • Before the result is returned to the user
    Basic Output Guardrail
    Python
    from typing import Any
    
    from agents import Agent, GuardrailFunctionOutput, RunContextWrapper, output_guardrail
    
    @output_guardrail
    def check_output_length(
        context: RunContextWrapper,
        agent: Agent,
        agent_output: Any,
    ) -> GuardrailFunctionOutput:
        """Ensure output is not too long."""
        # Non-string outputs are not length-checked
        if isinstance(agent_output, str) and len(agent_output) > 1000:
            return GuardrailFunctionOutput(
                output_info="Output too long (max 1000 chars)",
                tripwire_triggered=True,
            )
        
        return GuardrailFunctionOutput(
            output_info="Output length acceptable",
            tripwire_triggered=False,
        )
    
    agent = Agent(
        name="concise_agent",
        instructions="Be concise in your responses",
        output_guardrails=[check_output_length],
    )
    Structured Output Validation
    Python
    from agents import Agent, GuardrailFunctionOutput, RunContextWrapper, output_guardrail
    from pydantic import BaseModel
    
    class Summary(BaseModel):
        title: str
        points: list[str]
    
    @output_guardrail
    def validate_summary_structure(
        context: RunContextWrapper,
        agent: Agent,
        agent_output: Summary,
    ) -> GuardrailFunctionOutput:
        """Validate summary structure."""
        # An empty title passes type validation, so check it explicitly
        if not agent_output.title:
            return GuardrailFunctionOutput(
                output_info="Summary missing title",
                tripwire_triggered=True,
            )
        
        if len(agent_output.points) < 3:
            return GuardrailFunctionOutput(
                output_info="Summary needs at least 3 points",
                tripwire_triggered=True,
            )
        
        return GuardrailFunctionOutput(
            output_info="Summary structure valid",
            tripwire_triggered=False,
        )
    
    agent = Agent(
        name="summarizer",
        instructions="Summarize the input",
        output_type=Summary,
        output_guardrails=[validate_summary_structure],
    )
    Content Filtering
    Python
    @output_guardrail
    def filter_profanity(
        context: RunContextWrapper,
        agent: Agent,
        agent_output: str,
    ) -> GuardrailFunctionOutput:
        """Block output that contains profanity."""
        profanity_list = ["badword1", "badword2"]
        
        if isinstance(agent_output, str):
            lowered = agent_output.lower()
            for word in profanity_list:
                # Substring match: also flags words that merely contain an entry
                if word in lowered:
                    return GuardrailFunctionOutput(
                        output_info=f"Profanity detected: {word}",
                        tripwire_triggered=True,
                    )
        
        return GuardrailFunctionOutput(
            output_info="No profanity detected",
            tripwire_triggered=False,
        )
    Business Rule Validation
    Python
    @output_guardrail
    def validate_business_rules(
        context: RunContextWrapper,
        agent: Agent,
        agent_output: dict,
    ) -> GuardrailFunctionOutput:
        """Validate business rules."""
        # Check required fields
        required_fields = ["customer_id", "amount", "currency"]
        for field in required_fields:
            if field not in agent_output:
                return GuardrailFunctionOutput(
                    output_info=f"Missing required field: {field}",
                    tripwire_triggered=True,
                )
        
        # Validate amount is positive
        if agent_output["amount"] <= 0:
            return GuardrailFunctionOutput(
                output_info="Amount must be positive",
                tripwire_triggered=True,
            )
        
        return GuardrailFunctionOutput(
            output_info="Business rules validated",
            tripwire_triggered=False,
        )

    Tool Input Guardrails

    Purpose

    Tool input guardrails validate arguments before a tool is executed. They run:

    • After the model calls a tool
    • Before the tool is actually executed
    • For each tool call individually
    Basic Tool Input Guardrail
    Python
    import json
    
    from agents import RunContextWrapper, function_tool, tool_input_guardrail, ToolInputGuardrailFunctionOutput
    
    @tool_input_guardrail
    def validate_search_args(
        context: RunContextWrapper,
        tool_name: str,
        args_json: str,
    ) -> ToolInputGuardrailFunctionOutput:
        """Validate search tool arguments."""
        args = json.loads(args_json)
        
        # Check query length
        query = args.get("query", "")
        if len(query) < 3:
            return ToolInputGuardrailFunctionOutput(
                output_info="Query too short (min 3 chars)",
                tripwire_triggered=True,
            )
        
        # Check limit is reasonable
        limit = args.get("limit", 10)
        if limit > 100:
            return ToolInputGuardrailFunctionOutput(
                output_info="Limit too high (max 100)",
                tripwire_triggered=True,
            )
        
        return ToolInputGuardrailFunctionOutput(
            output_info="Arguments valid",
            tripwire_triggered=False,
        )
    
    @function_tool(tool_input_guardrails=[validate_search_args])
    def search(query: str, limit: int = 10) -> str:
        """Search function."""
        return f"Results for {query}"
    Security Validation
    Python
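    import json
    import os
    
    @tool_input_guardrail
    def validate_file_access(
        context: RunContextWrapper,
        tool_name: str,
        args_json: str,
    ) -> ToolInputGuardrailFunctionOutput:
        """Validate file access is allowed."""
        args = json.loads(args_json)
        # Resolve "..", relative segments, and symlinks before comparing,
        # so paths like "/tmp/../etc/passwd" cannot slip past the check
        file_path = os.path.realpath(args.get("path", ""))
        
        # Prevent access to sensitive directories
        sensitive_dirs = ["/etc", "/var", "/sys", "/proc"]
        for sensitive_dir in sensitive_dirs:
            # Match the directory itself or anything beneath it, but not
            # look-alike siblings such as "/etcetera"
            if file_path == sensitive_dir or file_path.startswith(sensitive_dir + os.sep):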
                return ToolInputGuardrailFunctionOutput(
                    output_info=f"Access to {sensitive_dir} not allowed",
                    tripwire_triggered=True,
                )
        
        return ToolInputGuardrailFunctionOutput(
            output_info="File access allowed",
            tripwire_triggered=False,
        )
    
    @function_tool(tool_input_guardrails=[validate_file_access])
    def read_file(path: str) -> str:
        """Read a file."""
        with open(path, 'r') as f:
            return f.read()
    Schema Validation
    Python
    from pydantic import BaseModel, ValidationError
    
    class SearchArgs(BaseModel):
        query: str
        limit: int = 10
        filters: list[str] = []
    
    @tool_input_guardrail
    def validate_schema(
        context: RunContextWrapper,
        tool_name: str,
        args_json: str,
    ) -> ToolInputGuardrailFunctionOutput:
        """Validate arguments against schema."""
        try:
            SearchArgs.model_validate_json(args_json)
            return ToolInputGuardrailFunctionOutput(
                output_info="Schema valid",
                tripwire_triggered=False,
            )
        except ValidationError as e:
            return ToolInputGuardrailFunctionOutput(
                output_info=f"Schema validation failed: {e}",
                tripwire_triggered=True,
            )

    Tool Output Guardrails

    Purpose

    Tool output guardrails validate the results after tool execution. They run:

    • After the tool completes
    • Before the result is sent back to the model
    • For each tool call individually
    Basic Tool Output Guardrail
    Python
    import re
    
    from agents import RunContextWrapper, function_tool, tool_output_guardrail, ToolOutputGuardrailFunctionOutput
    
    @tool_output_guardrail
    def filter_sensitive_data(
        context: RunContextWrapper,
        tool_name: str,
        output: str,
    ) -> ToolOutputGuardrailFunctionOutput:
        """Filter sensitive data from tool output."""
        sensitive_patterns = [
            r"\d{3}-\d{2}-\d{4}",  # SSN pattern
            r"\d{16}",  # Credit card pattern
        ]
        
        for pattern in sensitive_patterns:
            if re.search(pattern, output):
                return ToolOutputGuardrailFunctionOutput(
                    output_info="Sensitive data detected in output",
                    tripwire_triggered=True,
                )
        
        return ToolOutputGuardrailFunctionOutput(
            output_info="No sensitive data",
            tripwire_triggered=False,
        )
    
    @function_tool(tool_output_guardrails=[filter_sensitive_data])
    def get_user_info(user_id: str) -> str:
        """Get user information."""
        return f"User {user_id}: SSN: 123-45-6789"
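
As a standalone illustration of the pattern matching above, the sketch below scans a string with the same two regular expressions using only the standard library. `find_sensitive` and `redact` are hypothetical helper names, and redaction is shown as one possible alternative to blocking the output; neither is part of the SDK.

```python
import re

# Same patterns as the guardrail above, keyed by a readable label
SENSITIVE_PATTERNS = {
    "ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "credit_card": re.compile(r"\d{16}"),
}


def find_sensitive(text: str) -> list[str]:
    """Return the labels of all patterns that match somewhere in text."""
    return [name for name, pattern in SENSITIVE_PATTERNS.items() if pattern.search(text)]


def redact(text: str) -> str:
    """Replace every match with a placeholder instead of blocking the output."""
    for name, pattern in SENSITIVE_PATTERNS.items():
        text = pattern.sub(f"[REDACTED {name.upper()}]", text)
    return text
```

A guardrail could use `find_sensitive` to decide whether to trip and include the matched labels in `output_info`, which is more useful in traces than a generic message. These simple patterns will also match unrelated digit runs, so treat them as a starting point.
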
    Output Sanitization
    Python
    import re
    
    import requests
    
    @tool_output_guardrail
    def sanitize_output(
        context: RunContextWrapper,
        tool_name: str,
        output: str,
    ) -> ToolOutputGuardrailFunctionOutput:
        """Sanitize tool output."""
        # Remove script blocks first, while the tags that delimit them still exist
        sanitized = re.sub(r'<script.*?>.*?</script>', '', output, flags=re.DOTALL | re.IGNORECASE)
        
        # Then strip any remaining HTML tags
        sanitized = re.sub(r'<[^>]+>', '', sanitized)
        
        return ToolOutputGuardrailFunctionOutput(
            output_info=sanitized,  # Return sanitized output
            tripwire_triggered=False,
        )
    
    @function_tool(tool_output_guardrails=[sanitize_output])
    def fetch_webpage(url: str) -> str:
        """Fetch a webpage."""
        return requests.get(url).text
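
When stripping markup with regular expressions, the order of the substitutions matters: once every tag has been removed, a script block can no longer be recognised and its body survives as plain text. A minimal standard-library sketch, where `strip_markup` is a hypothetical helper name:

```python
import re

SCRIPT_BLOCK = re.compile(r"<script.*?>.*?</script>", re.DOTALL | re.IGNORECASE)
ANY_TAG = re.compile(r"<[^>]+>")


def strip_markup(html: str) -> str:
    """Drop script blocks (including their bodies), then any remaining tags."""
    without_scripts = SCRIPT_BLOCK.sub("", html)
    return ANY_TAG.sub("", without_scripts)
```

Regular expressions are not a robust HTML sanitizer; for untrusted pages a real HTML parser is the safer choice. The sketch only shows why script removal has to come first.
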
    Format Validation
    Python
    import json
    
    @tool_output_guardrail
    def validate_json_output(
        context: RunContextWrapper,
        tool_name: str,
        output: str,
    ) -> ToolOutputGuardrailFunctionOutput:
        """Validate output is valid JSON."""
        try:
            json.loads(output)
            return ToolOutputGuardrailFunctionOutput(
                output_info="Valid JSON",
                tripwire_triggered=False,
            )
        except json.JSONDecodeError:
            return ToolOutputGuardrailFunctionOutput(
                output_info="Output is not valid JSON",
                tripwire_triggered=True,
            )
    
    @function_tool(tool_output_guardrails=[validate_json_output])
    def query_database(query: str) -> str:
        """Query database and return JSON."""
        return json.dumps(db.execute(query))

    Guardrail Execution Flow

    Input Guardrail Flow
    Python
    # 1. User provides input
    input = "Hello, world!"
    
    # 2. Input guardrails run (first turn, starting agent only)
    if is_first_turn and is_starting_agent:
        guardrail_results = []
        
        # Run guardrails (parallel or sequential)
        for guardrail in input_guardrails:
            result = await guardrail.run(context, agent, input)
            guardrail_results.append(result)
            
            # Check if tripwire triggered
            if result.output.tripwire_triggered:
                raise InputGuardrailTripwireTriggered(result)
    
    # 3. If no tripwires, continue with agent execution
    response = await model.get_response(...)
    Output Guardrail Flow
    Python
    # 1. Agent produces output
    output = "Final answer"
    
    # 2. Output guardrails run
    guardrail_results = []
    for guardrail in output_guardrails:
        result = await guardrail.run(context, agent, output)
        guardrail_results.append(result)
        
        # Check if tripwire triggered
        if result.output.tripwire_triggered:
            raise OutputGuardrailTripwireTriggered(result)
    
    # 3. If no tripwires, return result
    return RunResult(final_output=output, ...)
    Tool Guardrail Flow
    Python
    # 1. Model calls tool
    tool_call = ResponseFunctionToolCall(
        name="my_tool",
        arguments='{"param": "value"}',
    )
    
    # 2. Run tool input guardrails
    for guardrail in tool.input_guardrails:
        result = await guardrail.run(context, tool_name, arguments)
        if result.output.tripwire_triggered:
            handle_guardrail_tripwire(result)
            return  # Skip tool execution
    
    # 3. Execute tool
    try:
        output = await tool.execute(arguments, context)
    except Exception as e:
        output = handle_error(e)
    
    # 4. Run tool output guardrails
    for guardrail in tool.output_guardrails:
        result = await guardrail.run(context, tool_name, output)
        if result.output.tripwire_triggered:
            output = result.output.output_info  # Use guardrail output
            # or raise exception
    
    # 5. Return output to model
    return output
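
The tool guardrail flow above can be exercised end to end with a small, library-free sketch. Everything here is hypothetical scaffolding rather than SDK code: `call_tool_with_guardrails` plays the role of the runner, and each check returns `None` to allow or a message to block.

```python
import json
from typing import Callable, Optional

# A check returns None to allow, or a rejection message to block.
Check = Callable[[str], Optional[str]]


def call_tool_with_guardrails(
    tool: Callable[..., str],
    args_json: str,
    input_checks: list[Check],
    output_checks: list[Check],
) -> str:
    """Validate arguments, run the tool, then validate its result."""
    for check in input_checks:
        message = check(args_json)
        if message is not None:
            return message  # tool is skipped; the model sees the message

    output = tool(**json.loads(args_json))

    for check in output_checks:
        message = check(output)
        if message is not None:
            return message  # tool output is replaced by the message
    return output


def query_min_length(args_json: str) -> Optional[str]:
    """Input check: reject queries shorter than three characters."""
    query = json.loads(args_json).get("query", "")
    return "Query too short (min 3 chars)" if len(query) < 3 else None


def no_ssn_label(output: str) -> Optional[str]:
    """Output check: reject results that mention an SSN."""
    return "Sensitive data detected in output" if "SSN" in output else None


def search(query: str, limit: int = 10) -> str:
    return f"Results for {query}"
```

Calling `call_tool_with_guardrails(search, '{"query": "ab"}', [query_min_length], [no_ssn_label])` returns the rejection message without ever invoking `search`, which mirrors step 2 of the flow.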

    Guardrail Exceptions

    InputGuardrailTripwireTriggered

    Raised when an input guardrail tripwire is triggered:

    Python
    from agents import InputGuardrailTripwireTriggered
    
    try:
        result = await Runner.run(agent, input)
    except InputGuardrailTripwireTriggered as e:
        print(f"Input blocked: {e.guardrail_result.output.output_info}")
        # Handle the blocked input
    OutputGuardrailTripwireTriggered

    Raised when an output guardrail tripwire is triggered:

    Python
    from agents import OutputGuardrailTripwireTriggered
    
    try:
        result = await Runner.run(agent, input)
    except OutputGuardrailTripwireTriggered as e:
        print(f"Output blocked: {e.guardrail_result.output.output_info}")
        # Handle the blocked output
    ToolInputGuardrailTripwireTriggered

    Raised when a tool input guardrail tripwire is triggered:

    Python
    from agents import ToolInputGuardrailTripwireTriggered
    
    # This is handled internally by the SDK
    # The tool is skipped and an error message is sent to the model
    ToolOutputGuardrailTripwireTriggered

    Raised when a tool output guardrail tripwire is triggered:

    Python
    from agents import ToolOutputGuardrailTripwireTriggered
    
    # This is handled internally by the SDK
    # The output is replaced with the guardrail's output_info

    Guardrail Configuration

    Agent-Level Guardrails

    Guardrails can be set on individual agents:

    Python
    agent = Agent(
        name="guarded_agent",
        instructions="Agent with guardrails",
        input_guardrails=[guardrail1, guardrail2],
        output_guardrails=[guardrail3],
    )
    Run-Level Guardrails

    Guardrails can be set for an entire run:

    Python
    from agents import RunConfig
    
    config = RunConfig(
        input_guardrails=[global_input_guardrail],
        output_guardrails=[global_output_guardrail],
    )
    
    result = await Runner.run(agent, input, run_config=config)
    Guardrail Priority

    Guardrails are applied in this priority:

    1. Tool-level guardrails (for tool input/output)
    2. Agent-level guardrails
    3. Run-level guardrails

    All guardrails at the same level run (unless configured otherwise).

    Guardrail Best Practices

    1. Clear Tripwire Conditions

    Make it clear when a guardrail should trigger:

    Python
    # Good
    @input_guardrail
    def check_profanity(context, agent, input):
        if has_profanity(input):
            return GuardrailFunctionOutput(
                output_info="Profanity detected",
                tripwire_triggered=True,
            )
        return GuardrailFunctionOutput(
            output_info="No profanity",
            tripwire_triggered=False,
        )
    
    # Avoid - unclear logic
    @input_guardrail
    def vague_check(context, agent, input):
        # Complex, unclear logic
        if some_complex_condition(input):
            return GuardrailFunctionOutput(
                output_info="Something wrong",
                tripwire_triggered=True,
            )
    2. Descriptive Output Info

    Provide helpful information in output_info:

    Python
    # Good
    return GuardrailFunctionOutput(
        output_info="Query too short: minimum 3 characters, got 2",
        tripwire_triggered=True,
    )
    
    # Avoid
    return GuardrailFunctionOutput(
        output_info="Error",
        tripwire_triggered=True,
    )
    3. Use Parallel When Possible

    For checks that do not need to block the agent from starting, use parallel execution:

    Python
    # Good - independent checks
    @input_guardrail(run_in_parallel=True)
    def check_length(context, agent, input):
        ...
    
    @input_guardrail(run_in_parallel=True)
    def check_content(context, agent, input):
        ...
    
    # Avoid - sequential when not needed
    @input_guardrail(run_in_parallel=False)
    def check_length(context, agent, input):
        ...
    4. Handle Errors Gracefully

    Guardrails should handle their own errors:

    Python
    # Good
    @input_guardrail
    def safe_check(context, agent, input):
        try:
            result = external_api.check(input)
            return GuardrailFunctionOutput(
                output_info=str(result),
                tripwire_triggered=result.is_flagged,
            )
        except Exception as e:
            # Log error but don't block
            logger.error(f"Guardrail error: {e}")
            return GuardrailFunctionOutput(
                output_info="Guardrail error, allowing input",
                tripwire_triggered=False,
            )
    
    # Avoid
    @input_guardrail
    def unsafe_check(context, agent, input):
        # If this fails, the whole run fails
        return external_api.check(input)
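
Whether a broken guardrail should let input through (fail open, as in the "Good" example above) or block it (fail closed) is a policy decision. The sketch below makes that choice explicit with a small decorator. It is library-free: `GuardrailOutput` is a local stand-in for the SDK's result type, and `on_error`, `flaky_check`, `lenient`, and `strict` are hypothetical names.

```python
import logging
from dataclasses import dataclass
from functools import wraps
from typing import Any, Callable

logger = logging.getLogger(__name__)


@dataclass
class GuardrailOutput:
    """Local stand-in for the SDK's guardrail result type."""
    output_info: Any
    tripwire_triggered: bool


def on_error(fail_closed: bool) -> Callable:
    """Decorator factory: turn exceptions inside a check into a result.

    fail_closed=True blocks when the check itself breaks;
    fail_closed=False lets the input through.
    """
    def decorate(check: Callable[[str], GuardrailOutput]):
        @wraps(check)
        def wrapper(text: str) -> GuardrailOutput:
            try:
                return check(text)
            except Exception as exc:
                logger.error("Guardrail error: %s", exc)
                return GuardrailOutput(
                    output_info=f"Guardrail error: {exc}",
                    tripwire_triggered=fail_closed,
                )
        return wrapper
    return decorate


def flaky_check(text: str) -> GuardrailOutput:
    """Simulates a check whose backing service is down."""
    raise TimeoutError("moderation API unreachable")


lenient = on_error(fail_closed=False)(flaky_check)
strict = on_error(fail_closed=True)(flaky_check)
```

Failing open favours availability; failing closed is usually the safer default for checks that guard against harmful or sensitive content.
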
    5. Keep Guardrails Focused

    Each guardrail should have a single responsibility:

    Python
    # Good - single responsibility
    @input_guardrail
    def check_length(context, agent, input):
        """Only checks length."""
        ...
    
    @input_guardrail
    def check_profanity(context, agent, input):
        """Only checks profanity."""
        ...
    
    # Avoid - multiple responsibilities
    @input_guardrail
    def check_everything(context, agent, input):
        """Checks length, profanity, format, etc."""
        ...

    Common Guardrail Patterns

    1. Content Moderation
    Python
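    def _moderate(content) -> GuardrailFunctionOutput:
        """Moderate content for safety (shared by both guardrails below)."""
        categories = moderation_api.check(content)
        
        if categories["hate_speech"]:
            return GuardrailFunctionOutput(
                output_info="Hate speech detected",
                tripwire_triggered=True,
            )
        
        if categories["violence"]:
            return GuardrailFunctionOutput(
                output_info="Violence detected",
                tripwire_triggered=True,
            )
        
        return GuardrailFunctionOutput(
            output_info="Content safe",
            tripwire_triggered=False,
        )
    
    # The decorators return guardrail objects, not functions, so they cannot be
    # stacked on one function; wrap the shared check once per guardrail type.
    @input_guardrail
    def moderate_input(context, agent, input):
        return _moderate(input)
    
    @output_guardrail
    def moderate_output(context, agent, output):
        return _moderate(output)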
    2. PII Detection
    Python
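    def _detect_pii(content) -> GuardrailFunctionOutput:
        """Detect personally identifiable information (shared check)."""
        pii_types = pii_detector.detect(content)
        
        if pii_types:
            return GuardrailFunctionOutput(
                output_info=f"PII detected: {', '.join(pii_types)}",
                tripwire_triggered=True,
            )
        
        return GuardrailFunctionOutput(
            output_info="No PII detected",
            tripwire_triggered=False,
        )
    
    # One thin wrapper per guardrail type; the decorators cannot be stacked
    @input_guardrail
    def detect_pii_input(context, agent, input):
        return _detect_pii(input)
    
    @output_guardrail
    def detect_pii_output(context, agent, output):
        return _detect_pii(output)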
    3. Format Validation
    Python
    @output_guardrail
    def validate_format(context, agent, output):
        """Validate output format."""
        if not is_valid_json(output):
            return GuardrailFunctionOutput(
                output_info="Output must be valid JSON",
                tripwire_triggered=True,
            )
        
        return GuardrailFunctionOutput(
            output_info="Format valid",
            tripwire_triggered=False,
        )
    4. Business Rule Enforcement
    Python
    @output_guardrail
    def enforce_business_rules(context, agent, output):
        """Enforce business rules."""
        if output["amount"] > context.context.max_amount:
            return GuardrailFunctionOutput(
                output_info=f"Amount exceeds maximum: {context.context.max_amount}",
                tripwire_triggered=True,
            )
        
        return GuardrailFunctionOutput(
            output_info="Business rules satisfied",
            tripwire_triggered=False,
        )
    5. Resource Limits
    Python
    @tool_input_guardrail
    def check_resource_limits(context, tool_name, args_json):
        """Check resource limits."""
        args = json.loads(args_json)
        
        if args.get("limit", 0) > 1000:
            return ToolInputGuardrailFunctionOutput(
                output_info="Limit exceeds maximum (1000)",
                tripwire_triggered=True,
            )
        
        return ToolInputGuardrailFunctionOutput(
            output_info="Within limits",
            tripwire_triggered=False,
        )

    Guardrail and Tracing

    Guardrail Spans

    Guardrails create trace spans for observability:

    Python
    from agents import guardrail_span
    
    @input_guardrail
    def my_guardrail(context, agent, input):
        with guardrail_span(name="my_guardrail"):
            # Guardrail logic
            return GuardrailFunctionOutput(...)
    Guardrail Metadata

    Guardrail results are included in traces:

    Python
    result = await Runner.run(agent, input)
    # Trace includes guardrail results
    for guardrail_result in result.input_guardrail_results:
        print(f"Guardrail: {guardrail_result.guardrail.get_name()}")
        print(f"Triggered: {guardrail_result.output.tripwire_triggered}")
        print(f"Info: {guardrail_result.output.output_info}")

    Summary

    Guardrails provide safety and validation for agent runs. Key takeaways:

    1. Input guardrails validate input before agent processing
    2. Output guardrails validate output after agent processing
    3. Tool input guardrails validate tool arguments
    4. Tool output guardrails validate tool results
    5. Tripwires halt execution when triggered
    6. Parallel execution lets a guardrail run alongside the agent, reducing latency
    7. Sequential execution makes the agent wait until the guardrail has passed
    8. GuardrailFunctionOutput is the standard return type
    9. Exceptions indicate guardrail violations
    10. Agent-level guardrails apply to specific agents
    11. Run-level guardrails apply to entire runs
    12. Priority determines which guardrails apply
    13. Async guardrails support external API calls
    14. Structured validation for complex data
    15. Content filtering for safety
    16. Business rules for domain validation
    17. Resource limits prevent abuse
    18. Error handling should be graceful
    19. Tracing includes guardrail results
    20. Single responsibility keeps guardrails focused

    Guardrails are essential for building safe, reliable, and compliant agent systems.