A FastAPI application that provides comprehensive content validation and transformation endpoints using various guardrail technologies including Presidio, Guardrails AI, and local evaluation models.
The application follows a modular architecture with separate modules for different functionalities:
- main.py: FastAPI application with route definitions
- guardrail/: Directory containing all guardrail implementations
  - pii_redaction_presidio.py: PII detection and redaction using Presidio
  - pii_detection_guardrails_ai.py: PII detection using Guardrails AI
  - nsfw_filtering_local_eval.py: NSFW content filtering using local Unitary toxic classification model
  - drug_mention_guardrails_ai.py: Drug mention detection using Guardrails AI
  - web_sanitization_guardrails_ai.py: Web content sanitization using Guardrails AI
- entities.py: Pydantic models for request/response validation
The Guardrail Server currently exposes five main endpoints for validation:
- POST /pii-redaction: Validates and optionally transforms incoming OpenAI chat completion requests before they are processed. Uses Presidio to detect and redact Personally Identifiable Information (PII) from messages.
  - null: Guardrails passed, no transformation needed for input.
  - ChatCompletionCreateParams: Content was transformed, returns the modified request with PII redacted.
  - HTTP 400/500: Guardrails failed with error details for input.
- POST /nsfw-filtering: Validates and optionally transforms outgoing OpenAI chat completion responses to filter out NSFW content. Uses the Unitary toxic classification model to detect toxic, sexually explicit, and obscene content.
  - null: Guardrails passed, no transformation needed for output.
  - HTTP 400/500: Guardrails failed with error details for output.
- POST /drug-mention: Validates outgoing OpenAI chat completion responses to detect and reject responses that mention drugs. Uses Guardrails AI to detect drug-related content.
  - null: Guardrails passed, no drug mentions detected in output.
  - HTTP 400/500: Guardrails failed with error details for output.
- POST /web-sanitization: Validates incoming OpenAI chat completion requests before they are processed. Uses Guardrails AI to detect and reject requests that contain malicious content.
  - null: Guardrails passed, no malicious content detected in input.
  - HTTP 400/500: Guardrails failed with error details for input.
- POST /pii-detection: Validates incoming OpenAI chat completion requests to detect the presence of Personally Identifiable Information (PII) using Guardrails AI. Does not redact, only detects and reports PII.
  - null: Guardrails passed, no PII detected in input.
  - HTTP 400/500: Guardrails failed with error details for input.
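The three outcomes listed for /pii-redaction (null, a transformed request, or an HTTP error) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's code: `GuardrailError` is a hypothetical stand-in for FastAPI's `HTTPException`, and a simple email regex stands in for Presidio so that the block runs with the standard library alone.

```python
import copy
import re
from typing import Optional

# Hypothetical stand-in for Presidio: matches simple email addresses only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


class GuardrailError(Exception):
    """Stand-in for an HTTP error response carrying a status code and details."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def redact_input(request_body: dict) -> Optional[dict]:
    """Return None if nothing changed, else a redacted copy of the request."""
    messages = request_body.get("messages")
    if not isinstance(messages, list):
        raise GuardrailError(400, "requestBody.messages must be a list")

    redacted = copy.deepcopy(request_body)  # never mutate the caller's payload
    changed = False
    for message in redacted["messages"]:
        content = message.get("content")
        if isinstance(content, str):
            new_content = EMAIL_RE.sub("<EMAIL_ADDRESS>", content)
            if new_content != content:
                message["content"] = new_content
                changed = True
    return redacted if changed else None
```

Returning `None` serializes to JSON `null`, which matches the "guardrails passed" case above; the redacted copy corresponds to the transformed request.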
docker build --build-arg GUARDRAILS_TOKEN="<GUARDRAILS_AI_TOKEN>" -t custom-guardrails-template:latest .
Note: The requestBody is accessible within the endpoint and can be used if needed for custom processing.
Attributes:
- requestBody: (CompletionCreateParams) The input payload sent to the guardrail server.
- config: (dict) Configuration options for the guardrail server.
- context: (RequestContext) Contextual information such as user and metadata.
Attributes:
- requestBody: (CompletionCreateParams) The input payload originally sent to the model.
- responseBody: (ChatCompletion) The model's output to be checked by the guardrail server.
- config: (dict) Configuration options for the guardrail server.
- context: (RequestContext) Contextual information such as user and metadata.
Attributes:
- user: (Subject) Information about the user, team, or virtual account making the request.
- metadata: (dict[str, str]) Additional metadata relevant to the request.
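The attribute lists above map naturally onto typed models. The sketch below uses standard-library dataclasses as a stand-in for the Pydantic models in entities.py, so it runs without extra dependencies. The class names `InputGuardrailRequest` and `OutputGuardrailRequest`, the helper `parse_context`, and the choice of which `Subject` fields are optional are assumptions made for illustration; the field names come from the attribute lists and example payloads in this README.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Subject:
    # Field names follow the example payloads; optionality is an assumption.
    subjectId: str
    subjectType: str
    subjectSlug: Optional[str] = None
    subjectDisplayName: Optional[str] = None


@dataclass
class RequestContext:
    user: Subject
    metadata: dict[str, str] = field(default_factory=dict)


@dataclass
class InputGuardrailRequest:  # hypothetical name
    requestBody: dict[str, Any]
    context: RequestContext
    config: dict[str, Any] = field(default_factory=dict)


@dataclass
class OutputGuardrailRequest:  # hypothetical name
    requestBody: dict[str, Any]
    responseBody: dict[str, Any]
    context: RequestContext
    config: dict[str, Any] = field(default_factory=dict)


def parse_context(raw: dict[str, Any]) -> RequestContext:
    """Build a RequestContext from the JSON-decoded "context" object."""
    return RequestContext(
        user=Subject(**raw["user"]),
        metadata=dict(raw.get("metadata", {})),
    )
```

With Pydantic, the same shapes would be declared as `BaseModel` subclasses, and FastAPI would then validate incoming payloads against them automatically.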
The config field is a dictionary that stores arbitrary request configuration. It holds the options set when you create a custom guardrail integration. They are passed to the guardrail server as is, so you can use them in your guardrail logic.
For more information about the config options, refer to the Truefoundry documentation.
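Because config arrives as an untyped dictionary, guardrail logic usually needs to read options defensively. The helper below is a small sketch of that idea; `read_flag` is a hypothetical name, and `transform_input` is taken from the example payloads in this README rather than from a documented list of supported options.

```python
from typing import Any


def read_flag(config: dict[str, Any], name: str, default: bool = False) -> bool:
    """Return a boolean option from config, falling back to a default."""
    value = config.get(name, default)
    if not isinstance(value, bool):
        raise ValueError(
            f"config option {name!r} must be a boolean, got {type(value).__name__}"
        )
    return value


# Example: decide whether an input guardrail may rewrite the request.
may_transform = read_flag({"transform_input": True}, "transform_input")
```

Rejecting non-boolean values early keeps a mistyped option such as "true" (a string) from silently enabling or disabling a guardrail.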
- Install dependencies:
  pip install -r requirements.txt
- Run the server:
  python main.py
  Or using uvicorn directly:
  uvicorn main:app --host 0.0.0.0 --port 8000 --reload
The server will start on http://localhost:8000
To deploy this guardrail server to Truefoundry, please refer to the official documentation: Getting Started with Deployment.
You can fork this repository and deploy it directly from your GitHub account using the Truefoundry platform. The documentation provides detailed instructions on connecting your GitHub repo and configuring the deployment.
For the latest and most accurate deployment steps, always consult the Truefoundry docs linked above.
Health check endpoint that returns server status.
PII redaction endpoint for validating and potentially transforming incoming OpenAI chat completion requests.
NSFW filtering endpoint for validating and potentially transforming outgoing OpenAI chat completion responses to filter inappropriate content.
Drug mention detection endpoint for rejecting responses that mention drugs.
Web content sanitization endpoint for validating and potentially transforming incoming OpenAI chat completion requests to remove malicious content.
PII detection endpoint for detecting Personally Identifiable Information in incoming requests using Guardrails AI.
Request Body:
{
  "requestBody": {
    "messages": [
      {
        "role": "user",
        "content": "Hello, how are you?"
      }
    ],
    "model": "gpt-3.5-turbo",
    "temperature": 0.7
  },
  "config": {
    "check_content": true,
    "transform_input": false
  },
  "context": {
    "user": {
      "subjectId": "123",
      "subjectType": "user",
      "subjectSlug": "[email protected]",
      "subjectDisplayName": "John Doe"
    },
    "metadata": {
      "ip_address": "192.168.1.1",
      "session_id": "abc123"
    }
  }
}
Output processing endpoint for validating and potentially transforming OpenAI chat completion responses.
Request Body:
{
  "requestBody": {
    "messages": [
      {
        "role": "user",
        "content": "Hello"
      }
    ],
    "model": "gpt-3.5-turbo"
  },
  "responseBody": {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1677652288,
    "model": "gpt-3.5-turbo",
    "choices": [
      {
        "index": 0,
        "message": {
          "role": "assistant",
          "content": "Hello! How can I assist you today?"
        },
        "finish_reason": "stop"
      }
    ],
    "usage": {
      "prompt_tokens": 1,
      "completion_tokens": 10,
      "total_tokens": 11
    }
  },
  "config": {
    "check_sensitive_data": true,
    "transform_output": false,
    "filter_by_context": true
  },
  "context": {
    "user": {
      "subjectId": "123",
      "subjectType": "user",
      "subjectSlug": "[email protected]",
      "subjectDisplayName": "John Doe"
    },
    "metadata": {
      "ip_address": "192.168.1.1",
      "session_id": "abc123"
    }
  }
}
curl -X POST "http://localhost:8000/pii-redaction" \
-H "Content-Type: application/json" \
-d '{
"requestBody": {
"messages": [
{"role": "user", "content": "Hello world"}
],
"model": "gpt-3.5-turbo"
},
"config": {"check_content": true},
"context": {
"user": {
"subjectId": "123",
"subjectType": "user",
"subjectSlug": "[email protected]",
"subjectDisplayName": "John Doe"
},
"metadata": {
"ip_address": "192.168.1.1",
"session_id": "abc123"
}
}
}'
curl -X POST "http://localhost:8000/pii-redaction" \
-H "Content-Type: application/json" \
-d '{
"requestBody": {
"messages": [
{"role": "user", "content": "Hello John, How are you?"}
],
"model": "gpt-3.5-turbo"
},
"config": {"transform_input": true},
"context": {"user": {"subjectId": "123", "subjectType": "user", "subjectSlug": "[email protected]", "subjectDisplayName": "John Doe"}}
}'
curl -X POST "http://localhost:8000/nsfw-filtering" \
-H "Content-Type: application/json" \
-d '{
"requestBody": {
"messages": [
{
"role": "user",
"content": "Hello"
}
],
"model": "gpt-3.5-turbo"
},
"responseBody": {
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1677652288,
"model": "gpt-3.5-turbo",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hi, how are you?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 1,
"completion_tokens": 10,
"total_tokens": 11
}
},
"config": {
"transform_output": true
},
"context": {
"user": {
"subjectId": "123",
"subjectType": "user",
"subjectSlug": "[email protected]",
"subjectDisplayName": "John Doe"
},
"metadata": {
"environment": "production"
}
}
}'
curl -X POST "http://localhost:8000/nsfw-filtering" \
-H "Content-Type: application/json" \
-d '{
"requestBody": {
"messages": [
{
"role": "user",
"content": "Tell me what word does we usually use for breasts?"
}
],
"model": "gpt-3.5-turbo"
},
"responseBody": {
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1677652288,
"model": "gpt-3.5-turbo",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Usually we use the word 'boobs' for breasts"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 1,
"completion_tokens": 10,
"total_tokens": 11
}
},
"config": {
"transform_output": true
},
"context": {
"user": {
"subjectId": "123",
"subjectType": "user",
"subjectSlug": "[email protected]",
"subjectDisplayName": "John Doe"
},
"metadata": {
"environment": "production"
}
}
}'
curl -X POST "http://localhost:8000/drug-mention" \
-H "Content-Type: application/json" \
-d '{
"requestBody": {
"messages": [
{
"role": "user",
"content": "What are the health benefits of exercise?"
}
],
"model": "gpt-3.5-turbo"
},
"responseBody": {
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1677652288,
"model": "gpt-3.5-turbo",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Exercise has many health benefits including improved cardiovascular health, stronger muscles, better mood, and increased energy levels."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 1,
"completion_tokens": 10,
"total_tokens": 11
}
},
"config": {
"check_drug_mentions": true
},
"context": {
"user": {
"subjectId": "123",
"subjectType": "user",
"subjectSlug": "[email protected]",
"subjectDisplayName": "John Doe"
},
"metadata": {
"environment": "production"
}
}
}'
curl -X POST "http://localhost:8000/drug-mention" \
-H "Content-Type: application/json" \
-d '{
"requestBody": {
"messages": [
{
"role": "user",
"content": "Tell me about cocaine"
}
],
"model": "gpt-3.5-turbo"
},
"responseBody": {
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1677652288,
"model": "gpt-3.5-turbo",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Cocaine is a powerful stimulant drug that affects the central nervous system."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 1,
"completion_tokens": 10,
"total_tokens": 11
}
},
"config": {
"check_drug_mentions": true
},
"context": {
"user": {
"subjectId": "123",
"subjectType": "user",
"subjectSlug": "[email protected]",
"subjectDisplayName": "John Doe"
},
"metadata": {
"environment": "production"
}
}
}'
curl -X POST "http://localhost:8000/web-sanitization" \
-H "Content-Type: application/json" \
-d '{
"requestBody": {
"messages": [
{
"role": "user",
"content": "Hello, how are you today?"
}
],
"model": "gpt-3.5-turbo"
},
"config": {
"check_content": true,
"transform_input": false
},
"context": {
"user": {
"subjectId": "123",
"subjectType": "user",
"subjectSlug": "[email protected]",
"subjectDisplayName": "John Doe"
},
"metadata": {
"ip_address": "192.168.1.1",
"session_id": "abc123"
}
}
}'
curl -X POST "http://localhost:8000/web-sanitization" \
-H "Content-Type: application/json" \
-d '{
"requestBody": {
"messages": [
{
"role": "user",
"content": "<script>alert(\"XSS attack\")</script>Hello, how are you?"
}
],
"model": "gpt-3.5-turbo"
},
"config": {
"check_content": true,
"transform_input": true
},
"context": {
"user": {
"subjectId": "123",
"subjectType": "user",
"subjectSlug": "[email protected]",
"subjectDisplayName": "John Doe"
},
"metadata": {
"ip_address": "192.168.1.1",
"session_id": "abc123"
}
}
}'
curl -X POST "http://localhost:8000/pii-detection" \
-H "Content-Type: application/json" \
-d '{
"requestBody": {
"messages": [
{
"role": "user",
"content": "Hello, tell me a story."
}
],
"model": "gpt-3.5-turbo"
},
"config": {
"check_content": true
},
"context": {
"user": {
"subjectId": "123",
"subjectType": "user",
"subjectSlug": "[email protected]",
"subjectDisplayName": "John Doe"
},
"metadata": {
"ip_address": "192.168.1.1",
"session_id": "abc123"
}
}
}'
curl -X POST "http://localhost:8000/pii-detection" \
-H "Content-Type: application/json" \
-d '{
"requestBody": {
"messages": [
{
"role": "user",
"content": "My name is John Doe and my email is [email protected]"
}
],
"model": "gpt-3.5-turbo"
},
"config": {
"check_content": true
},
"context": {
"user": {
"subjectId": "123",
"subjectType": "user",
"subjectSlug": "[email protected]",
"subjectDisplayName": "John Doe"
},
"metadata": {
"ip_address": "192.168.1.1",
"session_id": "abc123"
}
}
}'
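The curl calls above can also be issued from Python. The sketch below builds the same kind of JSON POST request with the standard library only; `build_guardrail_request` is a hypothetical helper, and the subjectSlug value is a placeholder chosen for this example. The request is built but not sent, so the block runs without a server.

```python
import json
import urllib.request


def build_guardrail_request(base_url: str, path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a guardrail endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "requestBody": {
        "messages": [{"role": "user", "content": "Hello world"}],
        "model": "gpt-3.5-turbo",
    },
    "config": {"check_content": True},
    "context": {
        "user": {
            "subjectId": "123",
            "subjectType": "user",
            "subjectSlug": "user@example.com",  # placeholder value
            "subjectDisplayName": "John Doe",
        },
        "metadata": {"session_id": "abc123"},
    },
}

request = build_guardrail_request("http://localhost:8000", "/pii-redaction", payload)

# To send it against a running server (urlopen raises HTTPError on 400/500):
# with urllib.request.urlopen(request) as response:
#     print(response.status, json.loads(response.read()))
```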
The PII redaction endpoint uses Presidio to detect and remove Personally Identifiable Information (PII) from incoming messages. This ensures that sensitive information is anonymized before further processing. Link to the library: Presidio
The NSFW filtering endpoint can be used to validate and optionally transform the response from the LLM before returning it to the client. If the output is transformed (e.g., content is modified or formatted), the endpoint will return the modified response body. The NSFW filtering uses the Unitary toxic classification model with configurable thresholds for toxicity, sexual content, and obscenity detection. Link to the model: Unitary Toxic Classification Model
The drug mention detection endpoint uses Guardrails AI to detect and reject responses that mention drugs. Link to the library: Guardrails AI. The validator is available in the Guardrails Hub.
The web sanitization endpoint uses Guardrails AI to detect and reject requests that contain malicious content. Link to the library: Guardrails AI. The validator is available in the Guardrails Hub.
The PII detection endpoint uses Guardrails AI to identify the presence of Personally Identifiable Information (PII) in incoming messages. Unlike the Presidio-based redaction endpoint, this endpoint only detects and reports PII without modifying the content. Link to the library: Guardrails AI. The validator is available in the Guardrails Hub.
The modular architecture makes it easy to customize the guardrail logic:
- PII Redaction: Modify guardrail/pii_redaction_presidio.py to customize PII detection and redaction rules
- PII Detection (Guardrails AI): Modify guardrail/pii_detection_guardrails_ai.py to customize PII detection using Guardrails AI
- NSFW Filtering (Local): Modify guardrail/nsfw_filtering_local_eval.py to customize content filtering thresholds and rules
- NSFW Filtering (Guardrails AI): Modify guardrail/nsfw_filtering_guardrails_ai.py to customize NSFW filtering using Guardrails AI
- Drug Mention Detection: Modify guardrail/drug_mention_guardrails_ai.py to customize drug mention detection rules
- Request/Response Models: Modify entities.py to add new fields or validation rules
Replace the example guardrail logic in the respective files with your own implementation. The NSFW filtering uses the Unitary toxic classification model with configurable thresholds for toxicity, sexual content, and obscenity detection.
- Thresholds: 0.2 for toxicity, sexual_explicit, and obscene content
- Model: Unitary unbiased-toxic-roberta
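The threshold rule above can be sketched as a comparison over per-label scores. This is an illustration only: it assumes the classifier output has already been turned into a dictionary of label to score, and that a label is flagged when its score is strictly greater than the threshold. `flagged_labels` is a hypothetical helper; the label names and the 0.2 value come from the list above.

```python
THRESHOLD = 0.2
CHECKED_LABELS = ("toxicity", "sexual_explicit", "obscene")


def flagged_labels(scores: dict[str, float], threshold: float = THRESHOLD) -> list[str]:
    """Return the checked labels whose score exceeds the threshold."""
    return [
        label
        for label in CHECKED_LABELS
        if scores.get(label, 0.0) > threshold  # missing labels count as 0.0
    ]


# Labels outside CHECKED_LABELS (here "insult") are ignored by this check.
example = flagged_labels(
    {"toxicity": 0.05, "sexual_explicit": 0.61, "obscene": 0.2, "insult": 0.9}
)
```

A response would then be rejected or transformed whenever the returned list is non-empty; lowering the threshold makes the filter stricter.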
All available guardrail implementations are already exposed as endpoints in the current version. To add new guardrail functionality:
- Create a new guardrail implementation file in the guardrail/ directory
- Follow the existing pattern for input or output validation
- Add the route to main.py using app.add_api_route()
- Update this README with the new endpoint documentation
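The steps above can be sketched with a small output guardrail. Everything named here is hypothetical and chosen for illustration: the banned-phrase rule, the `GuardrailViolation` exception (a stand-in for raising FastAPI's `HTTPException` with status 400), and the route path in the closing comment. The function body uses only the standard library, so it can be tested without the server.

```python
from typing import Optional

BANNED_PHRASES = ("internal use only",)  # hypothetical rule


class GuardrailViolation(Exception):
    """Stand-in for fastapi.HTTPException(status_code=400, detail=...)."""


def banned_phrase_guardrail(response_body: dict) -> Optional[dict]:
    """Return None when the response passes; raise when a banned phrase appears."""
    for choice in response_body.get("choices", []):
        content = (choice.get("message") or {}).get("content") or ""
        lowered = content.lower()  # case-insensitive match
        for phrase in BANNED_PHRASES:
            if phrase in lowered:
                raise GuardrailViolation(f"banned phrase found: {phrase!r}")
    return None


# Registration in main.py would then look roughly like this, where the handler
# wraps banned_phrase_guardrail and accepts the request model from entities.py:
# app.add_api_route("/banned-phrase", banned_phrase_endpoint, methods=["POST"])
```

Keeping the check as a plain function, separate from the route handler, follows the modular layout described earlier and makes the rule easy to unit test.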