Guardrails implementation #504

gbayomi · 2025-08-09T06:18:50Z

🛡️ Overview

This PR introduces a comprehensive guardrails system for Openlayer's tracing functionality, enabling automatic content filtering and protection for AI/LLM applications.

✨ Key Features

🏗️ Flexible Architecture

Base guardrail system with extensible BaseGuardrail abstract class
Multiple action types: ALLOW, BLOCK, MODIFY with configurable strategies
Block strategies: Graceful handling (return empty/error) vs exceptions
Rich metadata for monitoring, filtering, and analysis
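
A minimal sketch of how these pieces could fit together, for orientation only. The names GuardrailAction, BlockStrategy, GuardrailResult and BaseGuardrail come from this PR's commit messages; every field, method name (check_input, check_output), enum value other than "return_error_message", and the MaxLengthGuardrail example are assumptions, not the PR's actual code.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Optional


class GuardrailAction(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    MODIFY = "modify"


class BlockStrategy(Enum):
    # Strategy names follow the commit message; the string values are assumed.
    RAISE_EXCEPTION = "raise_exception"
    RETURN_EMPTY = "return_empty"
    RETURN_ERROR_MESSAGE = "return_error_message"
    SKIP_FUNCTION = "skip_function"


@dataclass
class GuardrailResult:
    """Structured outcome of one guardrail check (field names are hypothetical)."""
    action: GuardrailAction
    modified_data: Optional[Any] = None
    reason: Optional[str] = None
    block_strategy: Optional[BlockStrategy] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


class BaseGuardrail(ABC):
    """Subclasses decide what to do with inputs; outputs are allowed by default."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def check_input(self, inputs: Dict[str, Any]) -> GuardrailResult:
        ...

    def check_output(self, output: Any, inputs: Dict[str, Any]) -> GuardrailResult:
        return GuardrailResult(action=GuardrailAction.ALLOW)


class MaxLengthGuardrail(BaseGuardrail):
    """Hypothetical custom guardrail: block any string input longer than max_chars."""

    def __init__(self, max_chars: int) -> None:
        super().__init__(name="max_length")
        self.max_chars = max_chars

    def check_input(self, inputs: Dict[str, Any]) -> GuardrailResult:
        too_long = [
            key for key, value in inputs.items()
            if isinstance(value, str) and len(value) > self.max_chars
        ]
        if too_long:
            return GuardrailResult(
                action=GuardrailAction.BLOCK,
                reason=f"Inputs exceed {self.max_chars} characters: {', '.join(too_long)}",
                block_strategy=BlockStrategy.RETURN_ERROR_MESSAGE,
            )
        return GuardrailResult(action=GuardrailAction.ALLOW)
```

Keeping the result as a dataclass rather than a bare tuple makes it straightforward to serialize the same object into trace metadata.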

🔒 PII Protection

PIIGuardrail implementation using Microsoft Presidio
Configurable entities: Block SSNs, redact phone numbers, etc.
Confidence thresholds for fine-tuned detection
Multiple handling strategies for different use cases
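
To illustrate the block/redact/threshold policy without requiring Presidio, the sketch below swaps in a small regex detector as a stand-in. It is not the PIIGuardrail from this PR: the PATTERNS table, the fixed confidence scores, and the apply_pii_policy function are all hypothetical. Only the entity names (US_SSN, PHONE_NUMBER, EMAIL_ADDRESS) match Presidio's naming.

```python
import re
from typing import Dict, List, Set, Tuple

# Stand-in detector: each entity has a regex and a fixed confidence score.
# Presidio would return a score per match instead of a constant.
PATTERNS: Dict[str, Tuple[re.Pattern, float]] = {
    "US_SSN": (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), 0.85),
    "PHONE_NUMBER": (re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"), 0.75),
    "EMAIL_ADDRESS": (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"), 0.9),
}


def apply_pii_policy(
    text: str,
    block_entities: Set[str],
    redact_entities: Set[str],
    threshold: float = 0.7,
) -> Tuple[str, str, List[str]]:
    """Return (action, resulting_text, entities_acted_on).

    Detections below the threshold are ignored. Any detected entity in
    block_entities blocks the whole text; otherwise entities in
    redact_entities are replaced with a <ENTITY_NAME> placeholder.
    """
    detected = [
        name for name, (pattern, score) in PATTERNS.items()
        if score >= threshold and pattern.search(text)
    ]
    blocked = sorted(set(detected) & block_entities)
    if blocked:
        return "block", "", blocked

    redacted: List[str] = []
    for name in detected:
        if name in redact_entities:
            text = PATTERNS[name][0].sub(f"<{name}>", text)
            redacted.append(name)
    if redacted:
        return "modify", text, redacted
    return "allow", text, []
```

For example, with block_entities={"US_SSN"} and redact_entities={"PHONE_NUMBER"}, a text containing a phone number is rewritten, while a text containing an SSN is blocked outright.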

🎯 Comprehensive Integration

@trace() Decorator Support

@tracer.trace(guardrails=[pii_guardrail])
def process_user_input(user_data: str) -> str:
    return f"Processed: {user_data}"

Helper Functions Support

# Global configuration - applies to ALL helper functions
tracer.configure(guardrails=[pii_guardrail])
traced_client = tracer.trace_openai(openai.OpenAI())

Per-Call Overrides

tracer.add_chat_completion_step_to_trace(
    guardrails=[custom_guardrail],  # Override global settings
    inputs={"prompt": "sensitive data"},
    output="protected response"
)

📊 Metadata & Analytics

Each trace step includes comprehensive guardrail metadata:

{
  "guardrails": {
    "input_pii_protection": {
      "action": "redacted",
      "reason": "Redacted PII entities: PHONE_NUMBER",
      "block_strategy": "return_error_message"
    }
  },
  "has_guardrails": true,
  "guardrail_blocked": false,
  "guardrail_modified": true,
  "guardrail_allowed": false
}
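
As a rough sketch, the top-level flags could be derived from the per-guardrail entries like this. The summarize_guardrails helper is hypothetical, and apart from "redacted" (shown in the sample above) the action strings "blocked", "modified" and "allowed" are assumptions.

```python
from typing import Any, Dict


def summarize_guardrails(results: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Build step metadata from a mapping of guardrail name -> result dict."""
    actions = {result.get("action") for result in results.values()}
    return {
        "guardrails": results,
        "has_guardrails": bool(results),
        "guardrail_blocked": "blocked" in actions,
        # Redaction is treated as a kind of modification.
        "guardrail_modified": bool(actions & {"redacted", "modified"}),
        # Only true when every guardrail that ran allowed the data through.
        "guardrail_allowed": bool(results) and actions <= {"allowed"},
    }
```

Flat boolean flags like these are easier to filter on in a dashboard than the nested per-guardrail entries.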

🔧 Usage Examples

@trace() Decorator Example

See: examples/tracing/trace_decorator_with_guardrails.py

Function-level PII protection
Multiple guardrails with different strategies
Custom guardrail implementations
Role-based conditional protection

trace_openai() Helper Example

See: examples/tracing/trace_openai_with_guardrails.py

Global guardrails for all LLM calls
RAG pipeline protection
Application-specific configurations
Multi-model setups with different protection levels

⚡ Backward Compatibility

100% backward compatible - existing code works unchanged
Optional guardrails - only applied when explicitly configured
Graceful degradation - missing dependencies don't break functionality
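
One common way to get this kind of graceful degradation is to probe for optional packages before enabling a guardrail. The sketch below is an assumption about the approach, not the PR's code; is_installed and available_guardrails are hypothetical helpers.

```python
import importlib.util
import logging
from typing import Dict, List

logger = logging.getLogger("guardrails_sketch")


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def available_guardrails(requirements: Dict[str, List[str]]) -> List[str]:
    """Return the guardrail names whose required modules are all present.

    Guardrails with missing dependencies are skipped with a warning
    instead of raising, so tracing keeps working without them.
    """
    enabled: List[str] = []
    for name, modules in requirements.items():
        missing = [module for module in modules if not is_installed(module)]
        if missing:
            logger.warning("Skipping guardrail %r: missing %s", name, ", ".join(missing))
            continue
        enabled.append(name)
    return enabled
```

A call such as available_guardrails({"pii": ["presidio_analyzer", "presidio_anonymizer"]}) returns ["pii"] only when both Presidio packages are installed, and an empty list otherwise.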

🧪 Testing

Comprehensive test suite included:

Unit tests for all guardrail components
Integration tests for decorator and helper functions
Block strategy validation
Metadata structure verification

📦 Dependencies

Optional (for PII guardrails):

pip install presidio-analyzer presidio-anonymizer

Future extensibility ready for:

LLM-Guard integration
Custom content filters
Third-party security tools

🎯 Use Cases

PII Protection: Automatic detection and redaction of sensitive data
Content Filtering: Block inappropriate or harmful content
Compliance: Meet regulatory requirements (GDPR, HIPAA, etc.)
Security: Prevent data leaks in AI applications
Monitoring: Track and analyze content filtering actions

Impact: This system provides enterprise-grade content protection for AI applications with minimal integration overhead and maximum flexibility.

gustavocidornelas

Overall, the functionality is interesting and could be a good addition to the SDK. Below are my comments:

The examples under /examples/tracing don't make a lot of sense because:
- They do not follow the convention we use (organizing examples under providers).
- The file names indicate they are "drafts" and not informative examples for users, e.g., simple_guardrails_test.py, improved_guardrails_test.py, etc.

Overall, I'd completely remove these examples, as they are bloating the PR. Examples should probably go into a guardrails folder inside examples/tracing. I also believe they should be Jupyter notebooks. From what's described in the README.md inside /guardrails, a single example is enough. Maybe two: one showing something simple and another showing a custom guardrail.

It doesn't make sense to have a README.md inside src/openlayer/lib/guardrails. Its content should probably go into a documentation page.

The changes in tracer.py conflict with what we have in main. I'd rebase and fix the conflicts first. Eyeballing the code, though, I can see that the changes are pretty intrusive. There should be a less intrusive way to incorporate guardrails with tracing, as it's just a matter of adding a processing step to the inputs/outputs of traced functions, and the base guardrail has a good interface for that. These big changes are hard to review (GitHub is not even rendering the diffs because there are so many), and since the code in tracer.py is part of many users' production code, reviewing it carefully is important.
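
One way to read the "processing step on inputs/outputs" idea is a thin wrapper that runs guardrails before and after the wrapped function, leaving the function body and the tracer untouched. The sketch below is a standalone illustration under that assumption. It is not Openlayer's tracer.py, and with_guardrails, run_guardrails, GuardrailBlocked and mask_digits are all hypothetical names.

```python
import functools
from typing import Any, Callable, List, Tuple

# In this sketch a guardrail is any callable: data -> (action, new_data),
# where action is one of "allow", "modify", or "block".
Guardrail = Callable[[Any], Tuple[str, Any]]


class GuardrailBlocked(Exception):
    """Raised when a guardrail blocks the data."""


def run_guardrails(data: Any, guardrails: List[Guardrail]) -> Any:
    """Apply guardrails in order, threading modified data through the chain."""
    for guardrail in guardrails:
        action, new_data = guardrail(data)
        if action == "block":
            raise GuardrailBlocked(getattr(guardrail, "__name__", "guardrail"))
        if action == "modify":
            data = new_data
    return data


def with_guardrails(guardrails: List[Guardrail]) -> Callable:
    """Decorator that filters every argument and the return value."""
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            args = tuple(run_guardrails(arg, guardrails) for arg in args)
            kwargs = {key: run_guardrails(value, guardrails) for key, value in kwargs.items()}
            return run_guardrails(func(*args, **kwargs), guardrails)
        return wrapper
    return decorator


def mask_digits(data: Any) -> Tuple[str, Any]:
    """Toy guardrail: replace every digit in a string with '#'."""
    if isinstance(data, str) and any(char.isdigit() for char in data):
        return "modify", "".join("#" if char.isdigit() else char for char in data)
    return "allow", data


@with_guardrails([mask_digits])
def process_user_input(user_data: str) -> str:
    return f"Processed: {user_data}"
```

Because the wrapper only touches arguments and return values, it can be stacked under an existing tracing decorator without changing the decorator's internals.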

- Add base guardrail architecture with GuardrailAction, BlockStrategy enums
- Implement GuardrailResult dataclass for structured guardrail responses
- Create BaseGuardrail abstract class for extensible guardrail implementations
- Add PIIGuardrail using Microsoft Presidio for PII detection and redaction
- Support multiple block strategies: raise exception, return empty, return error message, skip function
- Include GuardrailRegistry for managing guardrail instances
- Add comprehensive error handling and logging

This foundational system enables flexible content filtering and protection for AI/LLM applications with configurable actions and strategies.

@trace

- Add guardrails parameter to @trace() and @trace_async() decorators
- Enhance create_step() context manager to support guardrails
- Update add_chat_completion_step_to_trace() for helper function integration
- Add global guardrails configuration via tracer.configure()
- Implement input and output guardrail processing with graceful error handling
- Add comprehensive guardrail metadata to trace steps including action flags
- Support multiple block strategies: graceful handling vs exceptions
- Enable per-call guardrail overrides and global configuration
- Maintain 100% backward compatibility with existing trace decorators

This integration enables automatic protection for both @trace() decorated functions and LLM helper functions like trace_openai() with rich metadata for monitoring and analysis.

@trace

- Add trace_decorator_with_guardrails.py demonstrating @trace() with guardrails
  * Basic PII protection with different block strategies
  * Multiple guardrails with layered protection
  * Custom guardrail implementations
  * Role-based conditional guardrails
- Add trace_openai_with_guardrails.py demonstrating helper function integration
  * Global guardrails configuration for all LLM calls
  * RAG pipeline protection with automatic guardrail application
  * Application-specific guardrail configurations
  * Multi-model setups with different protection levels
  * Monitoring and analytics with comprehensive metadata

These examples provide clear guidance for implementing guardrails in production AI applications with both decorator and helper function patterns.

- Add comprehensive test suite for guardrails functionality
- Include tests for different block strategies and error handling
- Add helper functions integration tests
- Provide additional usage examples and demonstrations

These tests ensure the guardrails system works correctly across all integration points and usage patterns.

@trace

- Remove complex guardrail logic from create_step and add_chat_completion_step_to_trace
- Simplify @trace() decorator integration to be much less intrusive
- Remove global guardrails configuration (not needed for basic functionality)
- Clean up examples to focus on core @trace() decorator usage
- Remove unnecessary test files and complex examples
- Keep only essential guardrails functionality in tracer.py

This makes the integration much cleaner and easier to review while maintaining the core guardrails functionality for @trace() decorated functions.

gbayomi requested review from whoseoyster, gustavocidornelas and viniciusdsmello August 9, 2025 16:10

gustavocidornelas requested changes Aug 13, 2025

gustavocidornelas force-pushed the feature/guardrails-integration branch from c090f16 to f595c2e August 14, 2025 13:42

Gabriel Bayomi Tinoco Kalejaiye and others added 7 commits August 14, 2025 11:01

chore: cleanup unnecessary files

be95cde

chore: update tracer implementation

f349cc5

gustavocidornelas force-pushed the feature/guardrails-integration branch from dac9bd4 to f349cc5 August 14, 2025 14:12

gustavocidornelas added 2 commits August 14, 2025 13:03

fix: PII redaction and trace function calls

0bb11aa

feat: introduce guardrail step type

ab40f62
