npm - ai-sdk-guardrails - Versions diffs - 4.0.0 → 5.0.0 - Mend

ai-sdk-guardrails 4.0.0 → 5.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (2)

package/README.md +558 -730
package/package.json +26 -22

package/README.md CHANGED

@@ -1,895 +1,723 @@
 # AI SDK Guardrails
-Middleware for the Vercel AI SDK that adds safety, quality control, and cost management to your AI applications by intercepting prompts and responses.
+**Input and output validation for the Vercel AI SDK**
-Block harmful inputs, filter low-quality outputs, and gain observability, all in just a few lines of code.
+Add safety checks and quality controls to your AI applications. Guard against prompt injection, prevent sensitive data leaks, and improve output reliability - all while keeping your existing AI SDK code unchanged.
+**Now includes MCP (Model Context Protocol) security guardrails** to help protect against attacks when using AI tools.
+[![npm version](https://img.shields.io/npm/v/ai-sdk-guardrails.svg?logo=npm&label=npm)](https://www.npmjs.com/package/ai-sdk-guardrails)
+[![downloads](https://img.shields.io/npm/dw/ai-sdk-guardrails.svg?label=downloads)](https://www.npmjs.com/package/ai-sdk-guardrails)
+[![bundle size](https://img.shields.io/bundlephobia/minzip/ai-sdk-guardrails.svg?label=minzipped)](https://bundlephobia.com/package/ai-sdk-guardrails)
+[![license](https://img.shields.io/npm/l/ai-sdk-guardrails.svg?label=license)](./LICENSE)
+![types](https://img.shields.io/badge/TypeScript-Ready-3178C6?logo=typescript&logoColor=white)
 ![Guardrails Demo](./media/guardrail-example.gif)
-## ⚡ TL;DR
+## Why this matters
-Quickly add input and output validation to any AI SDK-compatible model.
+- **MCP**: Protect against prompt injection and data exfiltration when using MCP tools
+- **Agents**: Run more reliable and secure agentic workflows
+- **Tool security**: Protect against data exfiltration when using MCP tools
+- **Save costs**: Block unnecessary requests before they hit your model
+- **Improve safety**: Detect PII, block harmful content, prevent prompt injection
+- **Better quality**: Enforce minimum response lengths, validate structure, auto-retry on failures
+- **Easy integration**: Works as middleware with any AI SDK model
-```typescript
-import { openai } from '@ai-sdk/openai';
-import { generateText } from 'ai';
-import {
-  wrapWithGuardrails,
-  defineInputGuardrail,
-  defineOutputGuardrail,
-} from 'ai-sdk-guardrails';
+## Common use cases
-// 1. Define your guardrails
-const inputGuard = defineInputGuardrail({
-  name: 'length-check',
-  execute: async ({ prompt }) =>
-    prompt.length > 100
-      ? { tripwireTriggered: true, message: 'Input too long' }
-      : { tripwireTriggered: false },
-});
+- Content moderation and safety filters
+- PII detection for compliance
+- Output quality requirements (length, format)
+- Prompt injection prevention
+- Tool usage validation
+- Auto-retry on low-quality responses
-const outputGuard = defineOutputGuardrail({
-  name: 'quality-check',
-  execute: async ({ result }) =>
-    result.text.length < 10
-      ? { tripwireTriggered: true, message: 'Response too short' }
-      : { tripwireTriggered: false },
-});
+## Secure AI in Under 60 Seconds
-// 2. Wrap your model
-const guardedModel = wrapWithGuardrails(openai('gpt-4o'), {
-  inputGuardrails: [inputGuard],
-  outputGuardrails: [outputGuard],
-});
+**Step 1:** Install (10 seconds)
-// 3. Use it! Guardrails will run automatically.
-const { text } = await generateText({
-  model: guardedModel,
-  prompt: 'A prompt that is definitely not too long.',
-});
+```bash
+npm install ai-sdk-guardrails
 ```
-## How It Works
-### Without Guardrails (Inefficient, Poor Quality)
+**Step 2:** Import (15 seconds)
-```mermaid
-flowchart LR
-    A[User Input<br/>'hello'] --> B[AI Model] --> C[Response<br/>⚠️ Wastes resources<br/>😞 Often useless]
+```ts
+import { withGuardrails, piiDetector } from 'ai-sdk-guardrails';
 ```
-### With Input Guardrails (Save Resources)
+**Step 3:** Wrap your model (30 seconds)
-```mermaid
-flowchart LR
-    A[User Input<br/>'hello'] --> B[Input Guardrails] --> C[❌ STOPPED<br/>✅ No API call made]
+```ts
+const safeModel = withGuardrails(yourModel, {
+  inputGuardrails: [piiDetector()],
+});
 ```
-### With Output Guardrails (Ensure Quality)
+**Result:** Prompts are now checked for PII before they reach the model. Add `promptInjectionDetector()` and output guardrails (see the TL;DR below) to also cover prompt injection and output validation. No architecture changes required.
-```mermaid
-flowchart LR
-    A[AI Response<br/>'Here's my SSN: 123-45-6789'] --> B[Output Guardrails] --> C[❌ BLOCKED<br/>🛡️ Privacy protected]
-```
+## TL;DR
-### Complete Protection
+Copy/paste minimal setup:
-```mermaid
-flowchart LR
-    A[User Input] --> B[Input Guardrails] --> C[AI Model] --> D[Output Guardrails] --> E[Clean Response]
+```ts
+import { generateText } from 'ai';
+import { openai } from '@ai-sdk/openai';
+import {
+  withGuardrails,
+  piiDetector,
+  promptInjectionDetector,
+  minLengthRequirement,
+  mcpSecurityGuardrail,
+} from 'ai-sdk-guardrails';
+const model = withGuardrails(openai('gpt-4o'), {
+  inputGuardrails: [piiDetector(), promptInjectionDetector()],
+  outputGuardrails: [
+    minLengthRequirement(160),
+    mcpSecurityGuardrail({
+      maxContentSize: 51200, // 50KB limit
+      injectionThreshold: 0.7, // Configurable sensitivity
+      allowedDomains: ['api.company.com'], // Domain allowlist
+    }),
+  ],
+});
+const { text } = await generateText({
+  model,
+  prompt: 'Write a friendly intro email.',
+});
 ```
-That's it! Input guardrails optimize resource usage by stopping inefficient requests. Output guardrails ensure quality by filtering responses.
+See runnable examples: [examples/README.md](./examples/README.md)
+## Quickstart (30 seconds)
-## 📦 Installation
+Install with your provider (OpenAI shown):
 ```bash
-npm install ai-sdk-guardrails
+pnpm add ai-sdk-guardrails ai @ai-sdk/openai
+# or: npm i ai-sdk-guardrails ai @ai-sdk/openai
+# or: yarn add ai-sdk-guardrails ai @ai-sdk/openai
+```
-# or
+Wrap your model and keep using `generateText` as usual:
-yarn add ai-sdk-guardrails
+```ts
+import { generateText } from 'ai';
+import { openai } from '@ai-sdk/openai';
+import { withGuardrails, piiDetector } from 'ai-sdk-guardrails';
-# or
+const model = withGuardrails(openai('gpt-4o'), {
+  inputGuardrails: [piiDetector()],
+});
-pnpm add ai-sdk-guardrails
+const { text } = await generateText({
+  model,
+  prompt: 'Write a friendly intro email.',
+});
 ```
-## 🔄 Migration Guide
+## Contents
-For breaking changes from v3 to v4 (including the new analytics-rich callbacks), see [v3-v4-MIGRATION.md](./v3-v4-MIGRATION.md).
+- Overview
+- Concepts
+- Installation
+- Usage
+  - Define a guardrail
+  - Built-in helpers
+- Streaming
+- Auto Retry (utility and middleware)
+- Error Handling
+- API
+- Examples
+- Compatibility
+- Architecture
+- Contributing
-## 🚀 Quick Start
+## API Overview
-Add smart validation to your AI applications in just 3 steps:
+### Primary Functions
-### 1. Prevent Unnecessary AI Calls
+- **`withGuardrails(model, config)`** - Main API for wrapping language models with guardrails
+- **`createGuardrails(config)`** - Factory to create reusable guardrail configurations
+- **`withAgentGuardrails(agentSettings, config)`** - Wrap AI SDK Agents with guardrails
-```typescript
-import { generateText } from 'ai';
-import { openai } from '@ai-sdk/openai';
-import {
-  wrapWithInputGuardrails,
-  defineInputGuardrail,
-} from 'ai-sdk-guardrails';
-import { extractTextContent } from 'ai-sdk-guardrails/guardrails/input';
+### Migration from v3.x
-// Block inefficient requests before calling the AI model
-const lengthGuard = defineInputGuardrail({
-  name: 'blocked-keywords',
-  execute: async (context) => {
-    const { prompt } = extractTextContent(context);
-    const blockedWords = ['spam', 'test', 'hello'];
-    const foundWord = blockedWords.find((word) =>
-      prompt.toLowerCase().includes(word.toLowerCase()),
-    );
-    if (foundWord) {
-      return {
-        tripwireTriggered: true,
-        message: `Blocked keyword detected: ${foundWord}`,
-        severity: 'medium',
-      };
-    }
-    return { tripwireTriggered: false };
-  },
-});
+- `wrapWithGuardrails` → `withGuardrails` (alias available, deprecated)
+- `wrapAgentWithGuardrails` → `withAgentGuardrails` (alias available, deprecated)
+- Error classes: `InputBlockedError` → `GuardrailsInputError`, `OutputBlockedError` → `GuardrailsOutputError`
-const optimizedModel = wrapWithInputGuardrails(openai('gpt-4'), {
-  inputGuardrails: [lengthGuard],
-});
+```ts
+// Before (v3.x - still works but deprecated)
+import { wrapWithGuardrails, InputBlockedError } from 'ai-sdk-guardrails';
+const model = wrapWithGuardrails(openai('gpt-4o'), { ... });
+// After (v4.x - recommended)
+import { withGuardrails, GuardrailsInputError } from 'ai-sdk-guardrails';
+const model = withGuardrails(openai('gpt-4o'), { ... });
+// Factory pattern (new in v4.x)
+import { createGuardrails } from 'ai-sdk-guardrails';
+const guards = createGuardrails({ ... });
+const model = guards(openai('gpt-4o'));
+```
-// This would normally waste an API call for a useless response
-try {
-  const result = await generateText({
-    model: optimizedModel,
-    prompt: 'hello', // ❌ Blocked - prevents unnecessary API call
-  });
-} catch (error) {
-  console.log('Blocked request, saved money!');
-}
+## Concepts
-// This generates valuable content
-const goodResult = await generateText({
-  model: optimizedModel,
-  prompt: 'Write a product description for our new software', // ✅ This creates value
-});
-```
+- Input guardrails: Validate or block prompts to save cost and enforce rules before the call.
+- Output guardrails: Check results for quality and safety. Block, replace, or retry as needed.
+- Middleware: Guardrails wrap any model via AI SDK middleware. Your app code stays the same.
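
The three concepts above can be sketched without the library. This is a minimal, synchronous illustration, not the package's implementation: `sketchWithGuardrails`, `maxPromptLength`, and the `Model` type are hypothetical names, and only the `{ tripwireTriggered, message }` result shape is taken from the README. The real middleware is asynchronous and offers more options than throwing.

```ts
// Simplified stand-ins for illustration; every name here is hypothetical.
type GuardrailResult = { tripwireTriggered: boolean; message?: string };
type Guardrail = (text: string) => GuardrailResult;
type Model = (prompt: string) => string;

function sketchWithGuardrails(
  model: Model,
  inputGuardrails: Guardrail[],
  outputGuardrails: Guardrail[],
): Model {
  // Run each check in order and throw on the first one that trips.
  const runChecks = (stage: string, checks: Guardrail[], text: string): void => {
    for (const check of checks) {
      const result = check(text);
      if (result.tripwireTriggered) {
        throw new Error(`${stage} blocked: ${result.message ?? 'no reason given'}`);
      }
    }
  };

  return (prompt) => {
    runChecks('Input', inputGuardrails, prompt); // may throw before the model runs
    const text = model(prompt);
    runChecks('Output', outputGuardrails, text);
    return text;
  };
}

// Example input guardrail: reject prompts above a character limit.
const maxPromptLength = (limit: number): Guardrail => (text) =>
  text.length > limit
    ? { tripwireTriggered: true, message: `prompt longer than ${limit} characters` }
    : { tripwireTriggered: false };
```

Because the input checks run before `model(prompt)`, a blocked prompt never reaches the model, which is where the cost saving comes from.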
+## Installation
+See Quickstart for installation commands. Add providers you use as needed (e.g., `@ai-sdk/openai`, `@ai-sdk/mistral`).
-### 2. Ensure Quality Output
+## Usage
-```typescript
+### Create custom guardrails
+```ts
+import { openai } from '@ai-sdk/openai';
 import {
-  wrapWithOutputGuardrails,
+  defineInputGuardrail,
   defineOutputGuardrail,
+  withGuardrails,
 } from 'ai-sdk-guardrails';
+import { extractTextContent } from 'ai-sdk-guardrails/guardrails/input';
 import { extractContent } from 'ai-sdk-guardrails/guardrails/output';
-const qualityGuard = defineOutputGuardrail({
-  name: 'sensitive-info-detector',
-  execute: async (context) => {
-    const { text } = extractContent(context.result);
-    // Simple sensitive info patterns
-    const sensitivePatterns = [
-      /\b\d{3}-\d{2}-\d{4}\b/, // SSN
-      /\b[\w\.-]+@[\w\.-]+\.\w+\b/, // Email
-      /\b\d{3}-\d{3}-\d{4}\b/, // Phone
-    ];
-    const foundPattern = sensitivePatterns.find((pattern) =>
-      pattern.test(text),
-    );
-    if (foundPattern) {
-      return {
-        tripwireTriggered: true,
-        message: 'Sensitive information detected in response',
-        severity: 'high',
-      };
-    }
-    return { tripwireTriggered: false };
+const businessHours = defineInputGuardrail({
+  name: 'business-hours',
+  execute: async (params) => {
+    const hr = new Date().getHours();
+    return hr >= 9 && hr <= 17
+      ? { tripwireTriggered: false }
+      : { tripwireTriggered: true, message: 'Outside business hours' };
   },
 });
-const qualityModel = wrapWithOutputGuardrails(openai('gpt-4'), {
-  outputGuardrails: [qualityGuard],
-  onOutputBlocked: (executionSummary) => {
-    console.log(
-      'Prevented sensitive data leak:',
-      executionSummary.blockedResults[0]?.message,
-    );
-    // Access comprehensive analytics (New in v4.0.0)
-    console.log(
-      `Blocked ${executionSummary.stats.blocked} of ${executionSummary.guardrailsExecuted} guardrails`,
-    );
+const minQuality = defineOutputGuardrail({
+  name: 'min-quality',
+  execute: async ({ result }) => {
+    const { text } = extractContent(result);
+    return text.length >= 80
+      ? { tripwireTriggered: false }
+      : { tripwireTriggered: true, message: 'Response too short' };
   },
 });
-const result = await generateText({
-  model: qualityModel,
-  prompt: 'Create a user profile example',
+const model = withGuardrails(openai('gpt-4o'), {
+  inputGuardrails: [businessHours],
+  outputGuardrails: [minQuality],
 });
-// Automatically blocks responses containing emails, phone numbers, or SSNs
 ```
-### 3. Custom Business Logic
-```typescript
-const businessHoursGuard = defineInputGuardrail({
-  name: 'business-hours-only',
-  execute: async () => {
-    const hour = new Date().getUTCHours();
-    // Only allow requests between 9 AM and 5 PM UTC
-    if (hour < 9 || hour > 17) {
-      return {
-        tripwireTriggered: true,
-        message:
-          'Requests are only permitted during business hours (9:00-17:00 UTC).',
-        severity: 'low',
-      };
-    }
-    return { tripwireTriggered: false };
-  },
-});
+### Built-in helpers
-const smartEducationModel = wrapWithInputGuardrails(openai('gpt-4'), {
-  inputGuardrails: [businessHoursGuard],
+```ts
+import { openai } from '@ai-sdk/openai';
+import {
+  withGuardrails,
+  piiDetector,
+  blockedKeywords,
+  contentLengthLimit,
+  promptInjectionDetector,
+  sensitiveDataFilter,
+  minLengthRequirement,
+  confidenceThreshold,
+  mcpSecurityGuardrail,
+  mcpResponseSanitizer,
+} from 'ai-sdk-guardrails';
+const model = withGuardrails(openai('gpt-4o'), {
+  inputGuardrails: [
+    piiDetector(),
+    promptInjectionDetector({ threshold: 0.7 }),
+    blockedKeywords(['test', 'spam']),
+    contentLengthLimit(4000),
+  ],
+  outputGuardrails: [
+    mcpSecurityGuardrail({
+      detectExfiltration: true,
+      scanEncodedContent: true,
+      allowedDomains: ['trusted-api.com'],
+    }),
+    mcpResponseSanitizer(),
+    sensitiveDataFilter(),
+    minLengthRequirement(160),
+    confidenceThreshold(0.6),
+  ],
 });
 ```
-### 4. Type-Safe Metadata (TypeScript)
+## Streaming
-The library automatically infers metadata types from your guardrail definitions - no manual type annotations needed!
+Works out of the box. By default, guardrails run after the stream ends (buffer mode). For early blocking, enable progressive mode.
-```typescript
-// Define metadata interface for your guardrail
-interface PIIMetadata extends Record<string, unknown> {
-  detectedTypes: Array<{ type: string; description: string }>;
-  count: number;
-}
+```ts
+import { streamText } from 'ai';
+import { openai } from '@ai-sdk/openai';
+import { withGuardrails, minLengthRequirement } from 'ai-sdk-guardrails';
-// Create guardrail with typed metadata
-const piiDetectionGuardrail = defineInputGuardrail({
-  name: 'pii-detection',
-  execute: async (context) => {
-    const { prompt } = extractTextContent(context);
-    const patterns = [
-      {
-        name: 'SSN',
-        regex: /\b\d{3}-\d{2}-\d{4}\b/,
-        description: 'Social Security Number',
-      },
-      {
-        name: 'Email',
-        regex: /\b[\w\.-]+@[\w\.-]+\.\w+\b/,
-        description: 'Email address',
-      },
-    ];
-    const detected = patterns.filter((p) => p.regex.test(prompt));
-    if (detected.length > 0) {
-      // TypeScript knows this metadata matches PIIMetadata
-      const metadata: PIIMetadata = {
-        detectedTypes: detected.map((p) => ({
-          type: p.name,
-          description: p.description,
-        })),
-        count: detected.length,
-      };
-      return {
-        tripwireTriggered: true,
-        message: `PII detected: ${detected.map((p) => p.name).join(', ')}`,
-        severity: 'high',
-        metadata, // Type is automatically inferred!
-      };
-    }
-    return { tripwireTriggered: false };
-  },
+const model = withGuardrails(openai('gpt-4o'), {
+  outputGuardrails: [minLengthRequirement(120)],
+  // Evaluate as tokens arrive; stop or replace early when blocked
+  streamMode: 'progressive',
+  replaceOnBlocked: true,
 });
-// Use the guardrail - types flow through automatically!
-const protectedModel = wrapWithInputGuardrails(model, [piiDetectionGuardrail], {
-  onInputBlocked: (summary) => {
-    // TypeScript knows the metadata type - no casting needed!
-    const metadata = summary.blockedResults[0]?.metadata;
-    if (metadata?.detectedTypes) {
-      // Full type safety and autocomplete for metadata.detectedTypes
-      for (const type of metadata.detectedTypes) {
-        console.log(`Detected: ${type.type} - ${type.description}`);
-      }
-    }
-  },
+const { textStream } = await streamText({
+  model,
+  prompt: 'Tell me a short story about a robot.',
 });
-```
-**That's it!** Your AI application now optimizes resource usage, ensures quality, prevents inappropriate responses, and provides full type safety automatically.
-## ✨ Features
-- 🛡️ **Input & Output Guardrails**: Enforce custom safety, compliance, and quality policies on both prompts and LLM responses.
-- 💰 **Cost Control**: Block invalid or wasteful prompts before they are sent to your LLM provider, saving you money.
-- 🎯 **Quality Improvement**: Automatically filter, flag, or retry low-quality or irrelevant model outputs.
-- 🔒 **Security Protection**: Built-in defenses against prompt injection, jailbreak attempts, PII leakage, secret exposure, and tool call validation.
-- 🏛️ **Compliance & Governance**: Enforce regulatory guidelines and business rules for enterprise applications with jurisdiction-specific compliance.
-- 🔄 **Streaming Support**: Works seamlessly with both streaming (streamText) and standard (generateText) API responses with real-time content monitoring.
-- 📊 **Observability Hooks**: Built-in callbacks (onInputBlocked, onOutputBlocked, etc.) for logging and monitoring with comprehensive execution analytics.
-- ⚙️ **Configurable Execution**: Run guardrails in parallel or sequentially and set custom timeouts.
-- 🚀 **AI SDK Native**: Designed from the ground up to integrate cleanly with AI SDK middleware patterns.
-- 🧠 **AI-Powered Verification**: LLM-as-judge capabilities for hallucination detection and quality assessment.
-- 🌍 **Global Compliance**: Support for multiple jurisdictions (US, EU, UK, CA, AU, JP, CN, IN) with region-specific policies.
-- 📝 **Content Protection**: Copyright and IP protection with originality scoring and verbatim passage detection.
-- 🔐 **Data Integrity**: Comprehensive table validation, SQL code safety, and schema enforcement.
-- 🌐 **Network Security**: Domain allowlisting, URL sanitization, and external access controls.
-- 🔒 **Privacy & Memory**: PII redaction, memory minimization, and secure logging practices.
-- 🛡️ **Safety & Escalation**: Toxicity de-escalation, human review workflows, and streaming early termination.
-## 📚 API Overview
-| Function                     | Description                                                                   |
-| ---------------------------- | ----------------------------------------------------------------------------- |
-| `defineInputGuardrail()`     | Creates a guardrail to validate, inspect, or block prompts.                   |
-| `defineOutputGuardrail()`    | Creates a guardrail to validate, filter, or re-route LLM outputs.             |
-| `wrapWithGuardrails()`       | ⭐ **Recommended** - The easiest way to add both input and output guardrails. |
-| `wrapWithInputGuardrails()`  | Attaches input-only guardrails to a model.                                    |
-| `wrapWithOutputGuardrails()` | Attaches output-only guardrails to a model.                                   |
-| `isGuardrailsError()`, etc.  | Error handling utilities and structured error types.                          |
-## 🧠 Design Philosophy
-- ✅ **Helper-First**: Simple, chainable utility functions provide a great developer experience for fast adoption.
-- 🧩 **Composable**: Multiple guardrails can be chained together and will run in your specified order (or in parallel).
-- 🧾 **Type-Safe**: Full TypeScript support with automatic type inference for guardrail metadata - no manual type annotations needed!
-- 🧪 **Sensible Defaults**: Get started quickly with zero-config default behaviors that can be easily overridden.
-## Architecture Overview
-The library leverages the Vercel AI SDK's middleware architecture to provide composable guardrails that integrate seamlessly with your existing AI applications:
-```mermaid
-graph TB
-    subgraph "Your Application"
-        App[Your App Code]
-        Config[Guardrail Configuration]
-    end
-    subgraph "AI SDK Guardrails Middleware"
-        InputMW[Input Guardrails Middleware]
-        OutputMW[Output Guardrails Middleware]
-        subgraph "Input Guardrails Layer"
-            Length[Length Validation]
-            Spam[Spam Detection]
-            PII[PII Detection]
-            Business[Business Rules]
-            Custom1[Custom Guards]
-        end
-        subgraph "Output Guardrails Layer"
-            Quality[Quality Assurance]
-            Sensitive[Sensitive Info Filter]
-            Professional[Professional Tone]
-            Factual[Factual Validation]
-            Custom2[Custom Guards]
-        end
-    end
-    subgraph "AI SDK Core"
-        Wrapper[wrapLanguageModel]
-        Generator[generateText/Object/Stream]
-    end
-    subgraph "External Services"
-        AI[AI Model Provider]
-        Log[Logging & Telemetry]
-    end
-    App --> Config
-    Config --> InputMW
-    InputMW --> Length
-    InputMW --> Spam
-    InputMW --> PII
-    InputMW --> Business
-    InputMW --> Custom1
-    InputMW -->|Valid Request| Wrapper
-    InputMW -->|Blocked Request| Log
-    Wrapper --> Generator
-    Generator --> AI
-    AI --> OutputMW
-    OutputMW --> Quality
-    OutputMW --> Sensitive
-    OutputMW --> Professional
-    OutputMW --> Factual
-    OutputMW --> Custom2
-    OutputMW -->|Clean Response| App
-    OutputMW -->|Quality Issues| Log
-    style InputMW fill:#e1f5fe
-    style OutputMW fill:#f3e5f5
-    style AI fill:#fff3e0
-    style App fill:#e8f5e8
+for await (const delta of textStream) process.stdout.write(delta);
 ```
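
The difference between buffer mode and progressive mode can be sketched as follows. This is an illustration under simplifying assumptions, not the library's code: it uses synchronous iterables instead of real async streams, and `progressiveSketch`, `bufferedSketch`, and `BlockCheck` are hypothetical names.

```ts
// A check returns true when the text seen so far should be blocked.
type BlockCheck = (textSoFar: string) => boolean;

// Progressive: evaluate after every chunk and stop as soon as the check trips.
function* progressiveSketch(
  chunks: Iterable<string>,
  isBlocked: BlockCheck,
  placeholder: string,
): Generator<string> {
  let seen = '';
  for (const chunk of chunks) {
    seen += chunk;
    if (isBlocked(seen)) {
      yield placeholder; // replace the offending chunk and end the stream
      return;
    }
    yield chunk;
  }
}

// Buffer: collect the whole response first, then evaluate once.
function bufferedSketch(
  chunks: Iterable<string>,
  isBlocked: BlockCheck,
  placeholder: string,
): string {
  const full = Array.from(chunks).join('');
  return isBlocked(full) ? placeholder : full;
}
```

In this sketch the progressive variant stops early but has already emitted the chunks that passed, while the buffered variant emits nothing until the full text has been checked.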
-## 🍳 Recipes & Use Cases
+## Auto Retry
-Guardrails can enforce any custom logic. Here are a few common patterns.
+Choose what fits your flow:
-### Rate Limiting
+- Standalone utility: Use `retry()` to wrap any generation function with your own validator and backoff.
+- Middleware option: Add `retry` to output guardrails so retries run automatically when a check fails.
-Pass a userId in the metadata of your generateText call to enforce per-user rate limits.
+### Utility
-```typescript
-const rateLimitGuard = defineInputGuardrail({
-  name: 'user-rate-limit',
-  execute: async ({ metadata }) => {
-    const userId = metadata?.userId ?? 'anonymous';
-    const allowed = await checkRateLimit(userId); // Your rate-limiting logic
+```ts
+import { retry } from 'ai-sdk-guardrails';
+import { generateText } from 'ai';
+import { openai } from '@ai-sdk/openai';
-    return allowed
-      ? { tripwireTriggered: false }
-      : {
-          tripwireTriggered: true,
-          message: `Rate limit exceeded for user: ${userId}`,
-        };
-  },
+const result = await retry({
+  generate: (params) => generateText({ model: openai('gpt-4o'), ...params }),
+  params: { prompt: 'Explain backpropagation in depth.' },
+  validate: (r) => ({
+    blocked: (r.text ?? '').length < 500,
+    message: 'Response too short',
+  }),
+  buildRetryParams: ({ lastParams }) => ({
+    ...lastParams,
+    maxOutputTokens: Math.max(800, (lastParams.maxOutputTokens ?? 400) + 300),
+  }),
+  maxRetries: 2,
 });
 ```
-### LLM-as-Judge for Quality Scoring
+### Middleware
-Use a cheaper, faster model to "judge" the output of a more powerful one.
+```ts
+import { generateText } from 'ai';
+import { openai } from '@ai-sdk/openai';
+import { wrapWithOutputGuardrails, defineOutputGuardrail } from 'ai-sdk-guardrails';
+import { extractContent } from 'ai-sdk-guardrails/guardrails/output';
-```typescript
-const qualityJudge = defineOutputGuardrail({
-  name: 'llm-quality-judge',
+const minLengthGuardrail = defineOutputGuardrail<{ minChars: number }>({
+  name: 'min-output-length',
   execute: async ({ result }) => {
-    // Use a cheap model to score the primary model's output
-    const judgement = await generateText({
-      model: openai('gpt-3.5-turbo'),
-      prompt: `Is the following response helpful and safe? Answer YES or NO. \n\nResponse: "${result.text}"`,
-    });
-    const isSafe = judgement.text.includes('YES');
-    return isSafe
-      ? { tripwireTriggered: false }
-      : {
+    const { text } = extractContent(result);
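+    // NOTE: as written, minChars always exceeds text.length, so this guardrail always trips; use a fixed threshold in real code
+    const minChars = text.length + 1;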
+    return text.length < minChars
+      ? {
           tripwireTriggered: true,
-          message: `Output failed LLM-as-judge quality check.`,
-          metadata: { originalText: result.text },
-        };
+          severity: 'medium',
+          message: `Answer too short: ${text.length} < ${minChars}`,
+          metadata: { minChars },
+        }
+      : { tripwireTriggered: false };
   },
 });
-```
-### Advanced Input Validation
-```typescript
-import { extractTextContent } from 'ai-sdk-guardrails/guardrails/input';
-const comprehensiveInputGuard = defineInputGuardrail({
-  name: 'comprehensive-input-validation',
-  execute: async (context) => {
-    const { prompt } = extractTextContent(context);
-    // Length validation
-    if (prompt.length < 10) {
-      return {
-        tripwireTriggered: true,
-        message: 'Input too short - likely to produce low-value response',
-        severity: 'medium',
-        suggestion: 'Please provide more detailed input for better results',
-      };
-    }
-    if (prompt.length > 4000) {
-      return {
-        tripwireTriggered: true,
-        message: 'Input too long - may exceed token limits',
-        severity: 'high',
-        suggestion: 'Break your request into smaller, focused parts',
-      };
-    }
-    // Content quality checks
-    const spamPatterns = [
-      /^(.)\1{10,}$/, // Repeated characters
-      /^(test|hello|hi|hey)$/i, // Common spam words
-    ];
-    const foundSpam = spamPatterns.find((pattern) => pattern.test(prompt));
-    if (foundSpam) {
-      return {
-        tripwireTriggered: true,
-        message: 'Low-quality input detected',
-        severity: 'high',
-      };
-    }
-    return { tripwireTriggered: false };
+const guarded = wrapWithOutputGuardrails(
+  openai('gpt-4o'),
+  [minLengthGuardrail],
+  {
+    replaceOnBlocked: false,
+    retry: {
+      maxRetries: 1,
+      buildRetryParams: ({ summary, lastParams }) => ({
+        ...lastParams,
+        maxOutputTokens: Math.max(
+          800,
+          (lastParams.maxOutputTokens ?? 400) + 300,
+        ),
+        prompt: [
+          ...(Array.isArray(lastParams.prompt) ? lastParams.prompt : []),
+          {
+            role: 'user' as const,
+            content: [
+              {
+                type: 'text' as const,
+                text: `Note: The previous answer ${summary.blockedResults[0]?.message}. Provide a comprehensive, detailed answer with examples.`,
+              },
+            ],
+          },
+        ],
+      }),
+    },
   },
+);
+const { text } = await generateText({
+  model: guarded,
+  prompt: 'Explain the significance of the Turing Test in AI history.',
 });
 ```
-### Professional Output Quality Control
+Tip: Use backoff helpers if you need delays between retries: `exponentialBackoff`, `linearBackoff`, `fixedBackoff`, `jitteredExponentialBackoff`, or `backoffPresets`.
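
For readers unfamiliar with the idea behind these helpers, here is a minimal sketch of exponential backoff with jitter. It does not reproduce the library's helpers or their signatures; `exponentialDelayMs` and `withFullJitter` are hypothetical names, and the default base and cap are arbitrary.

```ts
// Delay doubles with each attempt (0-based) and is capped at maxMs.
function exponentialDelayMs(attempt: number, baseMs = 250, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// "Full jitter": pick a random delay between 0 and the computed delay,
// so that many clients retrying at once do not all hit the API together.
function withFullJitter(delayMs: number, random: () => number = Math.random): number {
  return Math.floor(random() * delayMs);
}
```

Passing the random source as a parameter keeps the jitter testable with a fixed value.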
-```typescript
-import { extractContent } from 'ai-sdk-guardrails/guardrails/output';
+## Error Handling
-const professionalQualityGuard = defineOutputGuardrail({
-  name: 'professional-quality-control',
-  execute: async (context) => {
-    const { text } = extractContent(context.result);
-    const qualityIssues = [];
-    // Check for unprofessional language
-    const unprofessionalTerms = ['lol', 'wtf', 'omg', 'ur', 'u r'];
-    const hasUnprofessional = unprofessionalTerms.some((term) =>
-      text.toLowerCase().includes(term),
-    );
-    if (hasUnprofessional) {
-      qualityIssues.push('Contains unprofessional language');
-    }
-    // Check for placeholder text
-    const placeholders = ['[insert', '[add', '[your', 'TODO:', 'FIXME:'];
-    const hasPlaceholders = placeholders.some((placeholder) =>
-      text.includes(placeholder),
-    );
-    if (hasPlaceholders) {
-      qualityIssues.push('Contains placeholder text - incomplete response');
-    }
-    // Check for excessive repetition
-    const sentences = text.split(/[.!?]+/).filter((s) => s.trim());
-    const uniqueSentences = new Set(
-      sentences.map((s) => s.trim().toLowerCase()),
-    );
-    const repetitionRatio = uniqueSentences.size / sentences.length;
-    if (sentences.length > 3 && repetitionRatio < 0.6) {
-      qualityIssues.push('Excessive repetition detected');
-    }
-    if (qualityIssues.length > 0) {
-      return {
-        tripwireTriggered: true,
-        message: `Quality issues found: ${qualityIssues.join(', ')}`,
-        severity: 'medium',
-        suggestion: 'Request a more professional, complete response',
-        metadata: {
-          issues: qualityIssues,
-          quality_score: repetitionRatio,
-        },
-      };
-    }
+Set `throwOnBlocked: true` to throw structured errors you can catch and turn into friendly messages.
-    return { tripwireTriggered: false };
-  },
-});
-```
+```ts
+import { generateText } from 'ai';
+import { isGuardrailsError } from 'ai-sdk-guardrails';
-## 🔄 Streaming Support
+try {
+  const { text } = await generateText({ model, prompt: '...' });
+} catch (err) {
+  if (isGuardrailsError(err)) {
+    console.error('Guardrail blocked:', err.message);
+    // err.results gives you details per guardrail
+  } else {
+    console.error('Unexpected error:', err);
+  }
+}
+```
-Guardrails work with streams out-of-the-box. By default, output guardrails run after the complete response has been streamed (buffer mode).
+## Reusable Guardrails Factory
-```typescript
-import { streamText } from 'ai';
+Use `createGuardrails()` to create reusable guardrail configurations that can be applied to multiple models:
-const guardedModel = wrapWithGuardrails(openai('gpt-4o'), {
-  outputGuardrails: [qualityJudge],
+```ts
+import { openai } from '@ai-sdk/openai';
+import { anthropic } from '@ai-sdk/anthropic';
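+import { createGuardrails, piiDetector } from 'ai-sdk-guardrails';
+// contentFilter, qualityCheck, minLength, and maxLength are assumed to be guardrail factories you define elsewhere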
+// Create reusable guardrails configuration
+const productionGuards = createGuardrails({
+  inputGuardrails: [piiDetector(), contentFilter()],
+  outputGuardrails: [qualityCheck(), minLength(100)],
+  throwOnBlocked: true,
 });
-const { textStream } = await streamText({
-  model: guardedModel,
-  prompt: 'Tell me a short story about a robot.',
-});
+// Apply to multiple models
+const gpt4 = productionGuards(openai('gpt-4o'));
+const claude = productionGuards(anthropic('claude-3-sonnet'));
-// Stream the response to the client
-for await (const delta of textStream) {
-  process.stdout.write(delta);
-}
+// Compose multiple guardrail sets
+const strictLimits = createGuardrails({ inputGuardrails: [maxLength(500)] });
+const piiProtection = createGuardrails({ inputGuardrails: [piiDetector()] });
-// The qualityJudge guardrail will run after the stream is complete.
+// Chain them together
+const model = piiProtection(strictLimits(openai('gpt-4o')));
 ```
-### Progressive Streaming (opt-in)
+## MCP Security Guardrails
-For early blocking, enable progressive evaluation:
+**Production-Ready**: Protect against prompt injection and data exfiltration attacks when using Model Context Protocol (MCP) tools. Based on research into the ["lethal trifecta" vulnerability](https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/) that has affected major AI platforms.
-```ts
-const guardedModel = wrapWithGuardrails(openai('gpt-4o'), {
-  outputGuardrails: [qualityJudge],
-  // Evaluate on the fly and stop early when blocked
-  streamMode: 'progressive',
-  // Replace blocked output with a placeholder (default: true)
-  replaceOnBlocked: true,
-});
-```
+### The Problem
-In progressive mode, guardrails evaluate text as it arrives. If blocked:
+AI agents with MCP tools can be vulnerable when they have:
-- with `throwOnBlocked: true`, the stream errors.
-- with `replaceOnBlocked: true`, a placeholder message is streamed and the stream ends.
-- otherwise, the original chunks continue (with a callback via `onOutputBlocked`).
+1. **Access to private data** (through tools)
+2. **Exposure to untrusted content** (from tool responses)
+3. **The ability to communicate externally** (make web requests)
-Note: Progressive mode runs guardrails more frequently and may increase overhead for long streams.
+Malicious tool responses can contain hidden instructions that trick the AI into exfiltrating sensitive data.
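One way to reason about this risk is to audit a tool set for the combination of all three properties. The sketch below is an illustration under assumptions: the `ToolCapabilities` flags and `hasLethalTrifecta` are hypothetical and would have to be assigned by hand, since the library is not known to expose such metadata.

```typescript
// Hypothetical capability flags for illustration; not part of ai-sdk-guardrails.
interface ToolCapabilities {
  name: string;
  readsPrivateData: boolean;
  returnsUntrustedContent: boolean;
  canSendExternally: boolean;
}

// True only when the tool set as a whole covers all three risky capabilities.
function hasLethalTrifecta(tools: ToolCapabilities[]): boolean {
  return (
    tools.some((t) => t.readsPrivateData) &&
    tools.some((t) => t.returnsUntrustedContent) &&
    tools.some((t) => t.canSendExternally)
  );
}

const agentTools: ToolCapabilities[] = [
  { name: 'database_query', readsPrivateData: true, returnsUntrustedContent: false, canSendExternally: false },
  { name: 'web_fetch', readsPrivateData: false, returnsUntrustedContent: true, canSendExternally: true },
];

const atRisk = hasLethalTrifecta(agentTools);
```

Note that no single tool needs all three properties; the combination across tools is what matters.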
-### Configuration Highlights
+### Production-Ready Solution
-- `replaceOnBlocked` (output): defaults to `true` for safer behavior.
-- `executionOptions.logLevel`: defaults to `'warn'` (respects `'none' | 'error' | 'warn' | 'info' | 'debug'`).
-- `onInputBlocked` / `onOutputBlocked`: receive a `GuardrailExecutionSummary` with analytics.
+Every option is configurable, and the defaults are meant to be usable as-is:
-### Cancellation Support
+```ts
+import { openai } from '@ai-sdk/openai';
+import {
+  withGuardrails,
+  promptInjectionDetector,
+  mcpSecurityGuardrail,
+  mcpResponseSanitizer,
+  toolEgressPolicy,
+} from 'ai-sdk-guardrails';
-Guardrails can receive an `AbortSignal` and should abort work on timeout or caller-initiated cancel:
+// Conservative production setup (high security)
+const secureModel = withGuardrails(openai('gpt-4o'), {
+  inputGuardrails: [
+    promptInjectionDetector({ threshold: 0.6, includeExamples: true }),
+  ],
+  outputGuardrails: [
+    mcpSecurityGuardrail({
+      injectionThreshold: 0.5, // Lower = more sensitive
+      maxSuspiciousUrls: 0, // Zero tolerance
+      maxContentSize: 25600, // 25KB limit for performance
+      minEncodedLength: 15, // Detect shorter encoded attacks
+      encodedInjectionThreshold: 0.2, // Combined encoded + injection threshold
+      highRiskThreshold: 0.3, // High-risk cascade blocking
+      authorityThreshold: 0.5, // Authority manipulation detection
+      allowedDomains: ['api.company.com', 'trusted-partner.com'],
+      customSuspiciousDomains: ['evil.com', 'malicious.org'],
+      blockCascadingCalls: true,
+      scanEncodedContent: true,
+      detectExfiltration: true,
+    }),
+    mcpResponseSanitizer(), // Sanitize malicious content instead of blocking the response
+    toolEgressPolicy({
+      allowedHosts: ['api.company.com', 'trusted-partner.com'],
+      blockedHosts: ['webhook.site', 'requestcatcher.com', 'ngrok.io'],
+      scanForUrls: true,
+    }),
+  ],
+});
+```
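To make the `allowedHosts` / `blockedHosts` idea concrete, here is a minimal sketch of a host-based egress check. It is not the implementation of `toolEgressPolicy`; `EgressRules`, `hostMatches`, and `findDisallowedUrls` are hypothetical names, and treating subdomains as matches is an assumption of this example.

```typescript
// Hypothetical helper for illustration; not the library's toolEgressPolicy implementation.
interface EgressRules {
  allowedHosts: string[];
  blockedHosts: string[];
}

// True when `host` equals `rule` or is a subdomain of it (an assumption of this sketch).
function hostMatches(host: string, rule: string): boolean {
  return host === rule || host.endsWith(`.${rule}`);
}

// Returns every URL in `text` whose host is blocked or missing from the allowlist.
function findDisallowedUrls(text: string, rules: EgressRules): string[] {
  const urls = text.match(/https?:\/\/[^\s"'<>)]+/g) ?? [];
  return urls.filter((raw) => {
    let host: string;
    try {
      host = new URL(raw).hostname.toLowerCase();
    } catch {
      return true; // unparseable URLs are treated as disallowed
    }
    if (rules.blockedHosts.some((r) => hostMatches(host, r))) return true;
    return !rules.allowedHosts.some((r) => hostMatches(host, r));
  });
}

const rules: EgressRules = {
  allowedHosts: ['api.company.com', 'trusted-partner.com'],
  blockedHosts: ['webhook.site', 'requestcatcher.com', 'ngrok.io'],
};

const flagged = findDisallowedUrls(
  'See https://api.company.com/v1/users and https://abc123.ngrok.io/leak?d=secret',
  rules,
);
```

The block list is checked before the allowlist, so a host that appears on both is still rejected.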
+### Environment & Role-Based Configuration
 ```ts
-const guard = defineInputGuardrail({
-  name: 'long-check',
-  async execute(context, { signal }) {
-    await doWork({ signal }); // Pass signal to your async ops
-    return { tripwireTriggered: false };
-  },
-});
+// Different security profiles for different environments
+function getSecurityConfig(env: 'production' | 'staging' | 'development') {
+  const configs = {
+    production: {
+      injectionThreshold: 0.5, // High security
+      maxContentSize: 25600, // 25KB limit
+      authorityThreshold: 0.5, // Very sensitive
+    },
+    staging: {
+      injectionThreshold: 0.7, // Balanced security
+      maxContentSize: 51200, // 50KB default
+      authorityThreshold: 0.7, // Standard sensitivity
+    },
+    development: {
+      injectionThreshold: 0.8, // Lower security, better performance
+      maxContentSize: 102400, // 100KB for testing
+      authorityThreshold: 0.8, // Less restrictive
+    },
+  };
+  return configs[env];
+}
-// Timeouts are enforced by guardrail execution; if it times out, you'll get a GuardrailTimeoutError.
+const productionModel = withGuardrails(openai('gpt-4o'), {
+  outputGuardrails: [mcpSecurityGuardrail(getSecurityConfig('production'))],
+});
 ```
-## 🛠️ Error Handling
+### Attack Vectors Prevented
-When `throwOnBlocked: true` (the default), you can catch structured errors to handle blocks gracefully.
+✅ **Direct prompt injection** - "System: ignore all previous instructions"
+✅ **Tool response poisoning** - Malicious content in MCP tool responses
+✅ **Data exfiltration** - URLs constructed to steal sensitive data
+✅ **Encoded attacks** - Base64/hex hidden malicious instructions
+✅ **Cascading exploits** - Tool responses triggering additional dangerous calls
+✅ **Context poisoning** - Attempts to modify AI behavior mid-conversation
-```typescript
-import { generateText } from 'ai';
-import { isGuardrailsError } from 'ai-sdk-guardrails';
+### Secure MCP Agent Example
-try {
-  const result = await generateText({
-    model: guardedModel,
-    prompt: 'A prompt that might be blocked...',
-  });
-} catch (error) {
-  if (isGuardrailsError(error)) {
-    // Error was thrown by one of our guardrails
-    console.error('Guardrail check failed:', error.message);
-    console.error('Triggered Guards:', error.results);
-  } else {
-    // Some other error occurred
-    console.error('An unexpected error occurred:', error);
-  }
-}
+```ts
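+import { openai } from '@ai-sdk/openai';
+import {
+  withAgentGuardrails,
+  promptInjectionDetector,
+  mcpSecurityGuardrail,
+  mcpResponseSanitizer,
+  toolEgressPolicy,
+} from 'ai-sdk-guardrails';
+// file_search, api_call, and database_query are assumed to be tools defined elsewhere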
+const secureAgent = withAgentGuardrails(
+  {
+    model: openai('gpt-4o'),
+    tools: { file_search, api_call, database_query },
+    system: 'You are a secure assistant. Always validate tool responses.',
+  },
+  {
+    inputGuardrails: [promptInjectionDetector()],
+    outputGuardrails: [
+      mcpSecurityGuardrail({
+        detectExfiltration: true,
+        allowedDomains: ['trusted-api.com'],
+      }),
+      mcpResponseSanitizer(),
+    ],
+    toolGuardrails: [
+      toolEgressPolicy({
+        allowedHosts: ['trusted-api.com'],
+        scanForUrls: true,
+      }),
+    ],
+  },
+);
 ```
-### User-Friendly Error Messages
+### Configuration Options
-Transform technical guardrail messages into user-friendly guidance:
+All security parameters are fully configurable with sensible defaults:
-```typescript
-function createUserFriendlyMessage(guardrailResult): string {
-  const guardrailName = guardrailResult.context?.guardrailName;
+| Option                      | Default | Description                                      |
+| --------------------------- | ------- | ------------------------------------------------ |
+| `injectionThreshold`        | 0.7     | Prompt injection confidence threshold (0-1)      |
+| `maxSuspiciousUrls`         | 0       | Max allowed suspicious URLs (0 = zero tolerance) |
+| `maxContentSize`            | 51200   | Max content size in bytes (50KB default)         |
+| `minEncodedLength`          | 20      | Min encoded content length to analyze            |
+| `encodedInjectionThreshold` | 0.3     | Combined encoded + injection threshold           |
+| `authorityThreshold`        | 0.7     | Authority manipulation detection sensitivity     |
+| `allowedDomains`            | []      | Allowed domains for URL construction             |
+| `customSuspiciousDomains`   | []      | Additional suspicious domain patterns            |
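The relationship between defaults and overrides can be sketched as a plain object merge. This is an illustration only: `McpSecurityOptions`, `DEFAULTS`, and `resolveOptions` are hypothetical names, the default values are copied from the table above, and applying the 0-1 range to all three thresholds is an assumption (the table documents it only for `injectionThreshold`).

```typescript
// Hypothetical types and helper; default values are copied from the options table.
interface McpSecurityOptions {
  injectionThreshold: number;
  maxSuspiciousUrls: number;
  maxContentSize: number;
  minEncodedLength: number;
  encodedInjectionThreshold: number;
  authorityThreshold: number;
  allowedDomains: string[];
  customSuspiciousDomains: string[];
}

const DEFAULTS: McpSecurityOptions = {
  injectionThreshold: 0.7,
  maxSuspiciousUrls: 0,
  maxContentSize: 51200,
  minEncodedLength: 20,
  encodedInjectionThreshold: 0.3,
  authorityThreshold: 0.7,
  allowedDomains: [],
  customSuspiciousDomains: [],
};

// Merges caller overrides onto the defaults and validates threshold ranges.
function resolveOptions(overrides: Partial<McpSecurityOptions> = {}): McpSecurityOptions {
  const merged = { ...DEFAULTS, ...overrides };
  // Assumption: every threshold is a confidence value between 0 and 1.
  const thresholds = ['injectionThreshold', 'encodedInjectionThreshold', 'authorityThreshold'] as const;
  for (const key of thresholds) {
    const value = merged[key];
    if (value < 0 || value > 1) {
      throw new RangeError(`${key} must be between 0 and 1, got ${value}`);
    }
  }
  return merged;
}

const production = resolveOptions({ injectionThreshold: 0.5, maxContentSize: 25600 });
```

Only the keys passed in are changed, so a profile can tighten one or two values and inherit the rest.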
-  switch (guardrailName) {
-    case 'content-length-limit':
-      return 'Your message is too long. Please keep it under 500 characters for the best response.';
+### Performance & Security Balance
-    case 'blocked-keywords':
-      return "I can't help with that topic. Try asking about something else I can assist with.";
+- **High Security**: Lower thresholds, stricter limits, comprehensive scanning
+- **Balanced**: Default settings, good for most production use cases
+- **High Performance**: Higher thresholds, larger limits, selective scanning
-    case 'user-rate-limit':
-      return "You're sending requests too quickly. Please wait a moment before trying again.";
+See complete examples:
-    default:
-      return (
-        guardrailResult.suggestion ||
-        'Please refine your request and try again.'
-      );
-  }
-}
-```
+- [Production MCP Configuration](./examples/44-production-mcp-config.ts) - **New!**
+- [MCP Security Test Suite](./examples/41-mcp-security-test.ts)
+- [Enhanced Security Testing](./examples/43-enhanced-mcp-security-test.ts)
+- [Vulnerability Proof of Concept](./examples/42-mcp-vulnerability-proof.ts)
-## Complete AI SDK Integration
+## Agent Support
-The library seamlessly integrates with all AI SDK functions:
+Guardrails work with AI SDK Agents for multi-step agentic workflows:
-```typescript
-// Create your production-ready model once
-const productionModel = wrapWithGuardrails(openai('gpt-4'), {
-  inputGuardrails: [lengthGuard, spamGuard, rateLimitGuard],
-  outputGuardrails: [qualityGuard, sensitiveInfoGuard],
-  throwOnBlocked: false,
-  onInputBlocked: (executionSummary) => {
-    console.log('Input blocked:', executionSummary.blockedResults[0]?.message);
+```ts
+import { openai } from '@ai-sdk/openai';
+import { withAgentGuardrails, defineOutputGuardrail } from 'ai-sdk-guardrails';
+import { tool } from 'ai';
+import { z } from 'zod';
+// Define tools for the agent
+const searchTool = tool({
+  description: 'Search for information',
+  inputSchema: z.object({ query: z.string() }),
+  execute: async ({ query }) => `Results for: ${query}`,
+});
-    // Enhanced analytics available in v4.0.0
-    console.log(`Execution time: ${executionSummary.totalExecutionTime}ms`);
-    console.log(
-      `Guardrails: ${executionSummary.stats.blocked} blocked, ${executionSummary.stats.passed} passed`,
-    );
+// Create agent with guardrails
+const agent = withAgentGuardrails(
+  {
+    model: openai('gpt-4o'),
+    tools: { search: searchTool },
+    system: 'You are a helpful research assistant.',
   },
-  onOutputBlocked: (executionSummary) => {
-    console.log(
-      'Output filtered:',
-      executionSummary.blockedResults[0]?.message,
-    );
-    // Track comprehensive metrics
-    analytics.track('output_blocked', {
-      severity: executionSummary.blockedResults[0]?.severity,
-      totalGuardrails: executionSummary.guardrailsExecuted,
-      executionTime: executionSummary.totalExecutionTime,
-    });
+  {
+    outputGuardrails: [
+      defineOutputGuardrail({
+        name: 'tool-usage-required',
+        description: 'Ensures agent uses search tools',
+        execute: async (params) => {
+          const hasToolCall = params.result.steps?.some(
+            (step) => step.type === 'tool-call',
+          );
+          return {
+            tripwireTriggered: !hasToolCall,
+            message: hasToolCall
+              ? 'Tool usage validated'
+              : 'Must use search tools for research',
+            severity: 'high',
+          };
+        },
+      }),
+    ],
+    throwOnBlocked: true,
   },
-});
-// Use with any AI SDK function
-const textResult = await generateText({
-  model: productionModel,
-  prompt: 'Write a professional email response',
-});
+);
-const objectResult = await generateObject({
-  model: productionModel,
-  prompt: 'Create a user profile',
-  schema: userProfileSchema,
-});
-const textStream = await streamText({
-  model: productionModel,
-  prompt: 'Explain our product features',
+// Use the guarded agent
+const result = await agent.generate({
+  prompt: 'Research the latest AI developments',
 });
 ```
-## Examples
+## API
-Explore **30 comprehensive examples** that demonstrate practical performance optimization, security protection, quality assurance, and enterprise-grade safety patterns:
+| Export                                                                                                      | Description                                                                      |
+| ----------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- |
+| `defineInputGuardrail`, `defineOutputGuardrail`                                                             | Create guardrails with clear messages, severity, and metadata.                   |
+| `withGuardrails`, `createGuardrails`, `withAgentGuardrails`                                                 | Attach guardrails to AI SDK models and agents via middleware.                    |
+| `executeInputGuardrails`, `executeOutputGuardrails`                                                         | Run guardrails programmatically (outside middleware) and get structured results. |
+| `retry`, `retryHelpers`                                                                                     | Standalone auto-retry utilities with validation and backoff.                     |
+| `GuardrailsError`, `GuardrailsInputError`, `GuardrailsOutputError`, `isGuardrailsError`, `extractErrorInfo` | Structured errors and helpers for robust handling.                               |
+| `exponentialBackoff`, `linearBackoff`, `fixedBackoff`, `jitteredExponentialBackoff`, `backoffPresets`       | Backoff strategies to control retry pacing.                                      |
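For readers unfamiliar with the backoff names in the last row, the sketch below shows what each strategy conventionally computes. The signatures are hypothetical and chosen for this example; the library's own `exponentialBackoff`, `linearBackoff`, `fixedBackoff`, and `jitteredExponentialBackoff` may take different arguments.

```typescript
// Hypothetical delay calculators; the library's own backoff helpers may use different signatures.
// `attempt` starts at 1 and the result is a delay in milliseconds.
type Backoff = (attempt: number) => number;

// Same delay on every attempt.
const fixed = (ms: number): Backoff => () => ms;

// Delay grows by a constant step per attempt.
const linear = (stepMs: number): Backoff => (attempt) => stepMs * attempt;

// Delay doubles per attempt, capped at maxMs.
const exponential = (baseMs: number, maxMs: number): Backoff => (attempt) =>
  Math.min(maxMs, baseMs * 2 ** (attempt - 1));

// Full jitter: a random delay between 0 and the exponential delay.
const jittered = (
  baseMs: number,
  maxMs: number,
  random: () => number = Math.random,
): Backoff => (attempt) => Math.floor(random() * exponential(baseMs, maxMs)(attempt));

const delays = [1, 2, 3, 4, 5].map(exponential(100, 1000));
```

Jitter is usually preferred when many clients retry at once, because it spreads the retries out instead of synchronizing them.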
-### Core Foundation Examples
+See source for built-in helpers:
-- **[Input Length Limits](examples/01-input-length-limit.ts)** - Foundation patterns for input validation
-- **[Blocked Keywords](examples/02-blocked-keywords.ts)** - Block prompts with specific keywords and content filtering
-- **[Output Length Check](examples/04-output-length-check.ts)** - Ensure minimum output length and quality control
-- **[Quality Assessment](examples/06-quality-assessment.ts)** - Assess response quality and content analysis
-- **[Combined Protection](examples/07-combined-protection.ts)** - Simple input/output validation for efficiency and quality
-- **[Simple Combined Protection](examples/07a-simple-combined-protection.ts)** - Simplified combined guardrails example
-- **[Blocking vs Warning](examples/08-blocking-vs-warning.ts)** - Compare blocking and warning modes with error handling
+- Input helpers: `./src/guardrails/input.ts`
+- Output helpers: `./src/guardrails/output.ts`
-### Security & Protection Examples
-- **[PII Detection](examples/03-pii-detection.ts)** - Detect and block personal information in inputs
-- **[Sensitive Output Filter](examples/05-sensitive-output-filter.ts)** - Filter sensitive data from responses
-- **[Prompt Injection Detection](examples/16-prompt-injection-detection.ts)** - Comprehensive prompt injection detection with pattern matching and heuristic scoring
-- **[Tool Call Validation](examples/17-tool-call-validation.ts)** - Tool call validation with security patterns and dangerous operation detection
-- **[Basic Tool Allowlist](examples/17a-basic-tool-allowlist.ts)** - Basic tool allowlisting for secure tool usage
-- **[Tool Parameter Validation](examples/17b-tool-parameter-validation.ts)** - Validate tool parameters for security
-- **[Secret Leakage Scan](examples/18-secret-leakage-scan.ts)** - Secret leakage scanning with automatic redaction and entropy calculation
-- **[Jailbreak Detection](examples/30-jailbreak-detection.ts)** - Jailbreak detection with safe response templates and pattern recognition
+## Examples
-### Content Quality & Validation Examples
+Browse runnable examples for streaming, compliance, safety, and more:
-- **[Autoevals Guardrails](examples/31-autoevals-guardrails.ts)** - AI-powered quality evaluation using Autoevals library for factuality checking
-- **[Business Logic](examples/14-business-logic.ts)** - Custom business rules, work hours, and professional standards
-- **[LLM-as-Judge](examples/15-llm-as-judge.ts)** - AI-powered quality evaluation and scoring
-- **[Simple Quality Judge](examples/15a-simple-quality-judge.ts)** - Simplified quality assessment example
-- **[Hallucination Detection](examples/19-hallucination-detection.ts)** - Hallucination detection with LLM-as-judge verification and fact-checking
-- **[Response Consistency](examples/22-response-consistency.ts)** - Response consistency validation and coherence checking
+- Index and commands: [examples/README.md](./examples/README.md)
-### Compliance & Regulation Examples
+Quick starts
-- **[Regulated Advice Compliance](examples/21-regulated-advice-compliance.ts)** - Regulated advice compliance with jurisdiction-specific rules (US, EU, UK, CA, AU, JP, CN, IN)
-- **[Role Hierarchy Enforcement](examples/23-role-hierarchy-enforcement.ts)** - Role hierarchy enforcement with multi-layered violation detection
+| Example                    | Description                     | File                                                                              |
+| -------------------------- | ------------------------------- | --------------------------------------------------------------------------------- |
+| Simple combined protection | Minimal input and output setup  | [07a-simple-combined-protection.ts](./examples/07a-simple-combined-protection.ts) |
+| Auto retry on output       | Retry until output meets a rule | [32-auto-retry-output.ts](./examples/32-auto-retry-output.ts)                     |
+| LLM judge auto-retry       | Judge feedback drives retry     | [33-judge-auto-retry.ts](./examples/33-judge-auto-retry.ts)                       |
+| Expected tool use retry    | Enforce/guide tool usage        | [34-expected-tool-use-retry.ts](./examples/34-expected-tool-use-retry.ts)         |
+| Weather assistant          | End-to-end input/output + retry | [33-blog-post-weather-assistant.ts](./examples/33-blog-post-weather-assistant.ts) |
-### Data Integrity & Code Safety Examples
+Input safety
-- **[Schema Validation](examples/09-schema-validation.ts)** - Schema validation and structured output quality
-- **[Object Content Filter](examples/10-object-content-filter.ts)** - Filter inappropriate content in generated objects
-- **[SQL Code Safety](examples/24-sql-code-safety.ts)** - SQL code safety with dangerous operation blocking and injection detection
+| Example            | Description                         | File                                                            |
+| ------------------ | ----------------------------------- | --------------------------------------------------------------- |
+| Input length limit | Enforce max input length            | [01-input-length-limit.ts](./examples/01-input-length-limit.ts) |
+| Blocked keywords   | Block specific terms                | [02-blocked-keywords.ts](./examples/02-blocked-keywords.ts)     |
+| PII detection      | Detect PII before calling the model | [03-pii-detection.ts](./examples/03-pii-detection.ts)           |
+| Rate limiting      | Simple per-user rate limit          | [13-rate-limiting.ts](./examples/13-rate-limiting.ts)           |
-### Network & External Access Examples
+Output safety
-- **[Domain Allowlisting](examples/25-browsing-domain-allowlist.ts)** - Domain allowlisting with URL sanitization and security validation
+| Example                 | Description                         | File                                                                      |
+| ----------------------- | ----------------------------------- | ------------------------------------------------------------------------- |
+| Output length check     | Require min/max output length       | [04-output-length-check.ts](./examples/04-output-length-check.ts)         |
+| Sensitive output filter | Filter secrets and PII in responses | [05-sensitive-output-filter.ts](./examples/05-sensitive-output-filter.ts) |
+| Hallucination detection | Flag uncertain factual claims       | [19-hallucination-detection.ts](./examples/19-hallucination-detection.ts) |
-### Privacy & Memory Management Examples
+Streaming
-- **[Memory Minimization](examples/26-memory-minimization.ts)** - Memory minimization with PII redaction and multiple redaction strategies
-- **[Logging Redaction](examples/27-logging-redaction.ts)** - Logging redaction with secure logging practices and compliance frameworks
+| Example           | Description                        | File                                                                              |
+| ----------------- | ---------------------------------- | --------------------------------------------------------------------------------- |
+| Streaming limits  | Apply limits in buffered streaming | [11-streaming-limits.ts](./examples/11-streaming-limits.ts)                       |
+| Streaming quality | Quality checks with streaming      | [12-streaming-quality.ts](./examples/12-streaming-quality.ts)                     |
+| Early termination | Stop streams early when blocked    | [28-streaming-early-termination.ts](./examples/28-streaming-early-termination.ts) |
-### Safety & Escalation Examples
+Advanced
-- **[Human Review Escalation](examples/20-human-review-escalation.ts)** - Human review escalation with content flagging, review routing, and quality control workflows
-- **[Toxicity & Harassment De-escalation](examples/29-toxicity-harassment-deescalation.ts)** - Toxicity and harassment de-escalation with safe response generation and user escalation tracking
+| Example                    | Description                   | File                                                                            |
+| -------------------------- | ----------------------------- | ------------------------------------------------------------------------------- |
+| Simple quality judge       | Cheaper model judges quality  | [15a-simple-quality-judge.ts](./examples/15a-simple-quality-judge.ts)           |
+| Secret leakage scan        | Scan responses for secrets    | [18-secret-leakage-scan.ts](./examples/18-secret-leakage-scan.ts)               |
+| SQL code safety            | Basic SQL safety checks       | [24-sql-code-safety.ts](./examples/24-sql-code-safety.ts)                       |
+| Role hierarchy enforcement | Enforce role rules in prompts | [23-role-hierarchy-enforcement.ts](./examples/23-role-hierarchy-enforcement.ts) |
-### Streaming Examples
+## Compatibility
-- **[Streaming Limits](examples/11-streaming-limits.ts)** - Apply guardrails to streaming responses with real-time validation
-- **[Streaming Quality](examples/12-streaming-quality.ts)** - Real-time quality monitoring for streams
-- **[Streaming Early Termination](examples/28-streaming-early-termination.ts)** - Streaming early termination with real-time content monitoring and session state management
+- Runtime: Node.js 18+ recommended
+- AI SDK: Compatible with AI SDK 5 (`ai@^5`); wraps any model
+- `generateObject`: for strict object validation, run `executeOutputGuardrails()` after generation
-### Resource Management Examples
+## Architecture
-- **[Rate Limiting](examples/13-rate-limiting.ts)** - Smart rate limiting that prevents resource overuse
+```mermaid
+flowchart LR
+  A[Input] --> B[Input Guardrails]
+  B -->|Valid| C[AI Model]
+  B -->|Blocked| X[No API Call]
+  C --> D[Output Guardrails]
+  D -->|Clean| E[Response]
+  D -->|Blocked| R[Retry/Replace/Throw]
+```
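The flow in the diagram can be sketched in a few lines. This is a simplified illustration with hypothetical `Check`, `Outcome`, and `runPipeline` names, not the library's middleware; it reports a blocked output as a result value and leaves the retry, replace, or throw decision to the caller.

```typescript
// Hypothetical types for illustration; not the library's middleware.
type Check = (text: string) => { blocked: boolean; message?: string };

type Outcome =
  | { status: 'ok'; text: string }
  | { status: 'input-blocked'; message: string } // the model is never called
  | { status: 'output-blocked'; message: string };

async function runPipeline(
  prompt: string,
  model: (prompt: string) => Promise<string>,
  inputChecks: Check[],
  outputChecks: Check[],
): Promise<Outcome> {
  // Input guardrails run first; a block here avoids the API call entirely.
  for (const check of inputChecks) {
    const result = check(prompt);
    if (result.blocked) {
      return { status: 'input-blocked', message: result.message ?? 'blocked' };
    }
  }

  const text = await model(prompt);

  // Output guardrails run on the model's response.
  for (const check of outputChecks) {
    const result = check(text);
    if (result.blocked) {
      return { status: 'output-blocked', message: result.message ?? 'blocked' };
    }
  }
  return { status: 'ok', text };
}
```

Returning a tagged result rather than throwing keeps the sketch small; a real integration would map `output-blocked` onto whichever of the three branches it needs.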
-### Running Examples
+### Design principles
-```bash
-# Install dependencies
-pnpm install
-# Run core foundation examples
-tsx examples/01-input-length-limit.ts      # Basic input validation
-tsx examples/02-blocked-keywords.ts        # Keyword blocking
-tsx examples/04-output-length-check.ts     # Output length validation
-tsx examples/06-quality-assessment.ts      # Quality assessment
-tsx examples/07-combined-protection.ts     # Combined input/output protection
-tsx examples/07a-simple-combined-protection.ts # Simplified combined protection
-tsx examples/08-blocking-vs-warning.ts     # Blocking vs warning modes
-# Run security examples
-tsx examples/03-pii-detection.ts           # PII protection
-tsx examples/05-sensitive-output-filter.ts # Sensitive output filtering
-tsx examples/16-prompt-injection-detection.ts # Prompt injection protection
-tsx examples/17-tool-call-validation.ts    # Tool call validation
-tsx examples/17a-basic-tool-allowlist.ts   # Basic tool allowlisting
-tsx examples/17b-tool-parameter-validation.ts # Tool parameter validation
-tsx examples/18-secret-leakage-scan.ts     # Secret leakage prevention
-tsx examples/30-jailbreak-detection.ts     # Jailbreak detection
-# Run content quality examples
-tsx examples/31-autoevals-guardrails.ts    # AI-powered quality evaluation with Autoevals
-tsx examples/14-business-logic.ts          # Business-specific rules
-tsx examples/15-llm-as-judge.ts            # AI-powered quality control
-tsx examples/15a-simple-quality-judge.ts   # Simplified quality assessment
-tsx examples/19-hallucination-detection.ts # Hallucination detection
-tsx examples/22-response-consistency.ts    # Response consistency
-# Run compliance examples
-tsx examples/21-regulated-advice-compliance.ts # Regulatory compliance
-tsx examples/23-role-hierarchy-enforcement.ts # Role hierarchy enforcement
-# Run data integrity examples
-tsx examples/09-schema-validation.ts       # Schema validation
-tsx examples/10-object-content-filter.ts   # Object content filtering
-tsx examples/24-sql-code-safety.ts         # SQL code safety
-# Run network security examples
-tsx examples/25-browsing-domain-allowlist.ts # Domain allowlisting
-# Run privacy examples
-tsx examples/26-memory-minimization.ts     # Memory minimization
-tsx examples/27-logging-redaction.ts       # Logging redaction
-# Run safety examples
-tsx examples/20-human-review-escalation.ts # Human review escalation
-tsx examples/29-toxicity-harassment-deescalation.ts # Toxicity de-escalation
-# Run streaming examples
-tsx examples/11-streaming-limits.ts        # Streaming limits
-tsx examples/12-streaming-quality.ts       # Streaming quality monitoring
-tsx examples/28-streaming-early-termination.ts # Streaming early termination
-# Run resource management examples
-tsx examples/13-rate-limiting.ts           # Rate limiting
-```
+- Helper-first: simple, chainable APIs with great DX
+- Composable: run multiple guardrails in any order
+- Type-safe: rich TypeScript types and inference
+- Sensible defaults: zero-config to start, full control when you need it
-## 🤝 Contributing
+## Contributing
-Contributions of all sizes are welcome! Please open issues and pull requests on [GitHub](https://github.com/jagreehal/ai-sdk-guardrails).
+Issues and PRs are welcome.
-## 📄 License
+## License
-MIT © [Jag Reehal](https://github.com/jagreehal) – See LICENSE for full details.
+MIT © Jag Reehal. See LICENSE for details.