# Guardrails

> Validate AI outputs and workflow data with automated checks for safety, accuracy, and compliance.

<img src="https://mintcdn.com/langdock-34/cWyoB3RsITQmAUnM/images/workflows/nodes/guardrails.jpg?fit=max&auto=format&n=cWyoB3RsITQmAUnM&q=85&s=520fa51abe1db4b50374f0d205473656" alt="Guardrails" width="1920" height="903" data-path="images/workflows/nodes/guardrails.jpg" />

## Overview

The Guardrails node validates content using AI-powered checks to ensure safety, accuracy, and compliance. Each guardrail uses an LLM as a judge to evaluate your input against specific criteria. If the judge's confidence that a check has been violated exceeds the threshold you set, the node fails and the workflow stops.

<Info>
  **Best for**: Content moderation, PII detection, hallucination checks, jailbreak prevention, and custom validation rules.
</Info>

## How It Works

1. Provide the input content to validate (typically the output of a previous node)
2. Enable the guardrail checks you need
3. Set a confidence threshold between 0 and 1 for each check
4. Choose the AI model that performs the evaluation
5. If the confidence of any check exceeds its threshold, the node fails and flags the issue
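
The pass/fail rule in the last step can be sketched in a few lines of Python. This is a minimal illustration, not Langdock's implementation: the `CheckResult` type, the `evaluate` function, and the sample scores are all hypothetical, and in the real node the confidence values come from the judge model.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Outcome of one guardrail check (hypothetical structure)."""
    name: str
    confidence: float  # judge's confidence that a violation is present, 0-1
    threshold: float   # configured confidence threshold, 0-1

    @property
    def failed(self) -> bool:
        # A check fails when its confidence exceeds its threshold.
        return self.confidence > self.threshold


def evaluate(results: list[CheckResult]) -> tuple[bool, list[str]]:
    """Return (passed, names of failed checks). One failed check fails the node."""
    failed = [r.name for r in results if r.failed]
    return (not failed, failed)


# Illustrative scores only; real values are produced by the selected model.
results = [
    CheckResult("pii", confidence=0.85, threshold=0.8),
    CheckResult("moderation", confidence=0.10, threshold=0.6),
]
passed, failed_checks = evaluate(results)
print(passed, failed_checks)  # False ['pii']
```

Here the PII check alone is enough to fail the node, even though the moderation check passes.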

## Configuration

### Input

The content you want to validate. Supports Manual, Auto, and Prompt AI modes.

**Examples** (each line references the output of a different upstream node):

```handlebars theme={null}
{{agent.output.structured.response}}
{{trigger.output.user_message}}
{{http_request.output.content}}
```
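
To make the dotted variable paths concrete, the sketch below resolves `{{node.output.field}}` placeholders against a nested dictionary. It only illustrates how such a path addresses a value and is not Langdock's template engine. The `resolve` function and the `context` data are hypothetical.

```python
import re
from typing import Any

# Matches {{ dotted.path }} placeholders such as {{trigger.output.user_message}}.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def resolve(template: str, context: dict[str, Any]) -> str:
    """Replace each {{a.b.c}} placeholder with the value at that path in context."""
    def lookup(match: re.Match) -> str:
        value: Any = context
        for key in match.group(1).split("."):
            value = value[key]  # raises KeyError if the path does not exist
        return str(value)

    return PLACEHOLDER.sub(lookup, template)


# Hypothetical outputs of upstream nodes, keyed by node name.
context = {
    "trigger": {"output": {"user_message": "Where is my order?"}},
    "agent": {"output": {"structured": {"response": "It ships tomorrow."}}},
}

print(resolve("{{agent.output.structured.response}}", context))  # It ships tomorrow.
```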

### Model Selection

Choose the AI model used to evaluate all enabled guardrails. More capable models provide more accurate detection but cost more.

## Available Guardrails

### Personally Identifiable Information (PII)

Detects personal information such as names, email addresses, phone numbers, postal addresses, Social Security numbers, and credit card numbers.

**When to use:**

* Before storing user-generated content
* When sharing data externally
* Compliance requirements (GDPR, HIPAA)
* Customer service workflows

**Configuration:**

* **Confidence Threshold**: 0.7 (recommended)
* A higher threshold requires more confidence before content is flagged, so fewer detections fail the check

**Example:**

```text theme={null}
Input: {{agent.output.structured.customer_response}}
Threshold: 0.8
Result: Fails if PII detected with >80% confidence
```

***

### Moderation

Checks for inappropriate, harmful, or offensive content such as hate speech, violence, adult content, and harassment.

**When to use:**

* User-generated content platforms
* Public-facing communications
* Community moderation
* Customer-facing outputs

**Configuration:**

* **Confidence Threshold**: 0.6 (recommended)
* Adjust based on your content policies

***

### Jailbreak Detection

Identifies attempts to bypass AI safety controls or manipulate the AI into unintended behaviors.

**When to use:**

* Processing user prompts before sending to AI
* Public AI interfaces
* Workflows with user-provided instructions
* Security-sensitive applications

**Configuration:**

* **Confidence Threshold**: 0.7 (recommended)
* Use a higher threshold to reduce false positives

**Example:**

```text theme={null}
Input: {{trigger.output.user_prompt}}
Threshold: 0.75
Flags: Attempts to "ignore previous instructions" or similar
```

***

### Hallucination Detection

Detects when AI-generated content contains false or unverifiable information.

**When to use:**

* Fact-based content generation
* Customer support responses
* Financial or medical information
* Any workflow where accuracy is critical

**Configuration:**

* **Confidence Threshold**: 0.6 (recommended)
* Requires reference data for comparison

**Example:**

```text theme={null}
Input: {{agent.output.structured.generated_summary}}
Reference: {{http_request.output.original_data}}
Threshold: 0.7
Checks: Does summary accurately reflect source data?
```
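
The comparison against reference data can be pictured as a judge prompt plus a parsed verdict. The following is a sketch under stated assumptions: the prompt wording, the JSON reply format, and the function names `build_judge_prompt` and `parse_verdict` are hypothetical and do not describe how Langdock prompts the model internally.

```python
import json


def build_judge_prompt(generated: str, reference: str) -> str:
    """Assemble a grounding-check prompt (hypothetical wording)."""
    return (
        "You are a strict fact checker.\n"
        "Compare the SUMMARY against the REFERENCE.\n"
        'Reply with JSON: {"confidence": <0-1 that the summary contains '
        "claims not supported by the reference>}\n\n"
        f"REFERENCE:\n{reference}\n\nSUMMARY:\n{generated}\n"
    )


def parse_verdict(raw_reply: str, threshold: float) -> bool:
    """Return True if the check fails, i.e. confidence exceeds the threshold."""
    confidence = float(json.loads(raw_reply)["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return confidence > threshold


prompt = build_judge_prompt(
    generated="Revenue grew 40% in Q3.",
    reference="Q3 revenue grew 4% year over year.",
)
# A real workflow would send `prompt` to the selected model; this reply is made up.
sample_reply = '{"confidence": 0.92}'
print(parse_verdict(sample_reply, threshold=0.7))  # True
```

Without the reference text in the prompt, the judge has nothing to compare the summary against, which is why this guardrail requires reference data.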

***

### Custom Evaluation

Define your own validation criteria using natural language instructions.

**When to use:**

* Domain-specific validation
* Brand voice compliance
* Custom business rules
* Specialized content requirements

**Configuration:**

* **Evaluation Criteria**: Describe what to check for
* **Confidence Threshold**: Set based on strictness needed

**Example:**

```text theme={null}
Criteria: "Check if this response maintains our brand voice:
- Professional but friendly tone
- No jargon or technical terms
- Addresses customer by name
- Offers clear next steps"

Input: {{agent.output.structured.email_response}}
Threshold: 0.8
```

## Setting Confidence Thresholds

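A check fails when the judge model's confidence that a violation is present exceeds the threshold. Lower thresholds therefore flag more content, and higher thresholds flag only clear-cut cases:

| Threshold     | Behavior          | Use When                                                              |
| ------------- | ----------------- | --------------------------------------------------------------------- |
| **0.3-0.5**   | Sensitive         | Missing a violation is costly and you can review false positives      |
| **0.6-0.7**   | Balanced          | Most use cases, good accuracy                                         |
| **0.8-0.9**   | Conservative      | False positives are disruptive; flag only high-confidence detections  |
| **Above 0.9** | Very conservative | Only flag very obvious violations                                     |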

<Tip>
  Start with **0.7** as a balanced default. Raise it if you see too many false positives, and lower it if real violations slip through.
</Tip>
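
One way to adjust a threshold systematically is to score a small hand-labeled test set and count errors at several candidate thresholds. The sketch below assumes you have collected confidence scores for samples you labeled yourself. The `sweep` function and the sample data are hypothetical.

```python
def sweep(samples: list[tuple[float, bool]], thresholds: list[float]) -> list[dict]:
    """Count false positives and missed detections at each threshold.

    samples: (confidence, is_real_violation) pairs from a hand-labeled test set.
    """
    rows = []
    for t in thresholds:
        # Flagged (confidence above threshold) but not actually a violation.
        false_pos = sum(1 for conf, bad in samples if conf > t and not bad)
        # Not flagged although it is a real violation.
        missed = sum(1 for conf, bad in samples if conf <= t and bad)
        rows.append({"threshold": t, "false_positives": false_pos, "missed": missed})
    return rows


# Made-up labeled samples for illustration.
samples = [
    (0.95, True), (0.82, True), (0.65, True),
    (0.72, False), (0.40, False), (0.10, False),
]
for row in sweep(samples, [0.5, 0.7, 0.9]):
    print(row)
```

With this sample data, raising the threshold from 0.5 to 0.9 removes the one false positive but increases missed violations from zero to two.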

## Example Workflows

### Content Moderation Pipeline

```text theme={null}
Trigger: Form submission (user comment)
→ Guardrails:
  ✅ PII Detection (threshold: 0.8)
  ✅ Moderation (threshold: 0.6)
  Input: {{trigger.output.comment}}
→ [On Success] → Post comment publicly
→ [On Failure] → Send to manual review queue
```

### AI Response Validation

```text theme={null}
Agent: Generate customer response
→ Guardrails:
  ✅ Hallucination (threshold: 0.7)
  ✅ Custom: "Professional and helpful tone"
  Input: {{agent.output.structured.response}}
→ [On Success] → Send email to customer
→ [On Failure] → Regenerate with different prompt
```
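
The regenerate-on-failure path above is effectively a bounded retry loop. A minimal sketch, assuming hypothetical stand-ins `fake_agent` and `fake_guardrails` for the Agent and Guardrails nodes:

```python
from typing import Callable, Optional


def generate_until_valid(
    generate: Callable[[int], str],
    passes_guardrails: Callable[[str], bool],
    max_attempts: int = 3,
) -> Optional[str]:
    """Call generate(attempt) until the result passes, or give up after max_attempts."""
    for attempt in range(1, max_attempts + 1):
        draft = generate(attempt)
        if passes_guardrails(draft):
            return draft
    return None  # caller should escalate, for example to manual review


# Stand-ins for the Agent and Guardrails nodes (hypothetical behavior).
def fake_agent(attempt: int) -> str:
    return "Guaranteed 200% returns!" if attempt == 1 else "Thanks for reaching out."


def fake_guardrails(text: str) -> bool:
    return "Guaranteed" not in text


print(generate_until_valid(fake_agent, fake_guardrails))  # Thanks for reaching out.
```

Capping the number of attempts keeps a persistently failing prompt from looping forever and gives you a clear point at which to hand off to a person.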

### Multi-Check Validation

```text theme={null}
Agent: Generate article summary
→ Guardrails:
  ✅ PII Detection (threshold: 0.8)
  ✅ Hallucination (threshold: 0.7)
  ✅ Custom: "No promotional language" (threshold: 0.75)
  Input: {{agent.output.structured.summary}}
→ [On Success] → Publish to website
→ [On Failure] → Return to editor for revision
```

## Handling Failures

When a guardrail check fails, the workflow stops at the Guardrails node. You can configure error handling to route to alternative paths, send notifications, or trigger fallback actions.
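
Conceptually, error handling turns a hard stop into a branch. The sketch below models a failed check as an exception and routes it to a review path. The `GuardrailViolation` class, the function names, and the scores are hypothetical and only mirror the behavior described here.

```python
class GuardrailViolation(Exception):
    """Raised when a check fails (hypothetical; stands in for a node failure)."""

    def __init__(self, check: str, confidence: float):
        super().__init__(f"{check} flagged with confidence {confidence:.2f}")
        self.check = check
        self.confidence = confidence


def run_guardrails(scores: dict[str, float], thresholds: dict[str, float]) -> None:
    """Raise on the first check whose confidence exceeds its threshold."""
    for check, threshold in thresholds.items():
        confidence = scores.get(check, 0.0)
        if confidence > threshold:
            raise GuardrailViolation(check, confidence)


def handle_comment(scores: dict[str, float]) -> str:
    """Route to a success or failure path instead of letting the run just stop."""
    try:
        run_guardrails(scores, {"pii": 0.8, "moderation": 0.6})
    except GuardrailViolation as violation:
        # Failure path: log, notify, or queue for manual review.
        return f"manual_review ({violation})"
    return "published"


print(handle_comment({"pii": 0.05, "moderation": 0.10}))  # published
print(handle_comment({"pii": 0.05, "moderation": 0.91}))
```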

## When to Use Each Guardrail

<AccordionGroup>
  <Accordion title="PII Detection">
    Use PII detection for:

    * Public content that shouldn't contain personal information
    * Data being sent to third parties or external systems
    * Compliance-sensitive workflows (GDPR, HIPAA, etc.)
    * Preventing accidental exposure of sensitive user data
  </Accordion>

  <Accordion title="Moderation">
    Use moderation for:

    * User-generated content that needs review
    * Public-facing outputs and communications
    * Community platforms and forums
    * Filtering inappropriate or harmful content
  </Accordion>

  <Accordion title="Jailbreak Detection">
    Use jailbreak detection for:

    * User-provided prompts or instructions to AI
    * Public AI interfaces accessible to external users
    * Security-critical applications where prompt manipulation is a risk
    * Protecting against attempts to bypass system constraints
  </Accordion>

  <Accordion title="Hallucination Detection">
    Use hallucination detection for:

    * Fact-based content generation requiring accuracy
    * Customer support responses with specific information
    * Financial or medical information where accuracy is critical
    * Any content where false information could cause harm
  </Accordion>

  <Accordion title="Custom Evaluation">
    Use custom evaluation for:

    * Brand compliance and tone of voice guidelines
    * Domain-specific rules and industry standards
    * Quality standards unique to your organization
    * Business-specific requirements not covered by other guardrails
  </Accordion>
</AccordionGroup>

## Best Practices

<AccordionGroup>
  <Accordion title="Enable Multiple Checks">
    Use multiple guardrails together for comprehensive validation. PII + Moderation is a common combination.
  </Accordion>

  <Accordion title="Start with Balanced Thresholds">
    Begin with 0.7 and adjust based on results. A threshold that is too low produces false positives, and one that is too high misses real issues.
  </Accordion>

  <Accordion title="Always Handle Failures">
    Don't just let the workflow fail. Add error paths that notify teams, log violations, or trigger alternative actions.
  </Accordion>

  <Accordion title="Test with Edge Cases">
    Test guardrails with borderline content to calibrate thresholds correctly.
  </Accordion>

  <Accordion title="Use Appropriate Models">
    More capable models (for example, GPT-4) provide better detection but cost more. Balance accuracy needs with budget.
  </Accordion>

  <Accordion title="Document Custom Evaluations">
    Write clear, specific criteria for custom evaluations so the AI understands exactly what to check.
  </Accordion>
</AccordionGroup>

## Next Steps

<CardGroup cols={2}>
  <Card title="Agent Node" icon="brain" href="/en/using-langdock/workflows/nodes/agent-node">
    Validate AI-generated content
  </Card>

  <Card title="Condition Node" icon="code-branch" href="/en/using-langdock/workflows/nodes/condition-node">
    Route based on validation results
  </Card>

  <Card title="Human in the Loop" icon="user-check" href="/en/using-langdock/workflows/fundamentals/human-in-the-loop">
    Add manual review for sensitive content
  </Card>

  <Card title="Getting Started" icon="rocket" href="/en/using-langdock/workflows/getting-started">
    Build your first workflow with validation
  </Card>
</CardGroup>
