npm - ai-sdk-guardrails - Versions diffs - 3.0.0 → 5.0.0 - Mend

ai-sdk-guardrails 3.0.0 → 5.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (19)

package/README.md +585 -513
package/package.json +34 -24
package/dist/chunk-HHQ3CIFN.js +0 -12
package/dist/chunk-LLCOPUS6.js +0 -159
package/dist/errors-BTTWMQEI.js +0 -24
package/dist/guardrails/input.cjs +0 -493
package/dist/guardrails/input.d.cts +0 -36
package/dist/guardrails/input.d.ts +0 -36
package/dist/guardrails/input.js +0 -453
package/dist/guardrails/output.cjs +0 -698
package/dist/guardrails/output.d.cts +0 -46
package/dist/guardrails/output.d.ts +0 -46
package/dist/guardrails/output.js +0 -654
package/dist/index.cjs +0 -815
package/dist/index.d.cts +0 -272
package/dist/index.d.ts +0 -272
package/dist/index.js +0 -607
package/dist/types-B9h_0Gyl.d.cts +0 -121
package/dist/types-B9h_0Gyl.d.ts +0 -121

package/README.md CHANGED

@@ -1,651 +1,723 @@
 # AI SDK Guardrails
-A powerful middleware for the Vercel AI SDK that adds safety, quality control, and cost management to your AI applications by intercepting prompts and responses.
+**Input and output validation for the Vercel AI SDK**
-Block harmful inputs, filter low-quality outputs, and gain observability, all in just a few lines of code.
+Add safety checks and quality controls to your AI applications. Guard against prompt injection, prevent sensitive data leaks, and improve output reliability - all while keeping your existing AI SDK code unchanged.
+**Now includes MCP (Model Context Protocol) security guardrails** to help protect against attacks when using AI tools.
+[![npm version](https://img.shields.io/npm/v/ai-sdk-guardrails.svg?logo=npm&label=npm)](https://www.npmjs.com/package/ai-sdk-guardrails)
+[![downloads](https://img.shields.io/npm/dw/ai-sdk-guardrails.svg?label=downloads)](https://www.npmjs.com/package/ai-sdk-guardrails)
+[![bundle size](https://img.shields.io/bundlephobia/minzip/ai-sdk-guardrails.svg?label=minzipped)](https://bundlephobia.com/package/ai-sdk-guardrails)
+[![license](https://img.shields.io/npm/l/ai-sdk-guardrails.svg?label=license)](./LICENSE)
+![types](https://img.shields.io/badge/TypeScript-Ready-3178C6?logo=typescript&logoColor=white)
 ![Guardrails Demo](./media/guardrail-example.gif)
-## ⚡ TL;DR
+## Why this matters
-Quickly add input and output validation to any AI SDK-compatible model.
+- **MCP**: Protect against prompt injection and data exfiltration when using MCP tools
+- **Agents**: Build more reliable and secure agentic workflows
+- **Tool security**: Protect against data exfiltration when using MCP tools
+- **Save costs**: Block unnecessary requests before they hit your model
+- **Improve safety**: Detect PII, block harmful content, prevent prompt injection
+- **Better quality**: Enforce minimum response lengths, validate structure, auto-retry on failures
+- **Easy integration**: Works as middleware with any AI SDK model
-```typescript
-import { openai } from '@ai-sdk/openai';
-import { generateText } from 'ai';
-import {
-  wrapWithGuardrails,
-  defineInputGuardrail,
-  defineOutputGuardrail,
-} from 'ai-sdk-guardrails';
+## Common use cases
-// 1. Define your guardrails
-const inputGuard = defineInputGuardrail({
-  name: 'length-check',
-  execute: async ({ prompt }) =>
-    prompt.length > 100
-      ? { tripwireTriggered: true, message: 'Input too long' }
-      : { tripwireTriggered: false },
-});
+- Content moderation and safety filters
+- PII detection for compliance
+- Output quality requirements (length, format)
+- Prompt injection prevention
+- Tool usage validation
+- Auto-retry on low-quality responses
-const outputGuard = defineOutputGuardrail({
-  name: 'quality-check',
-  execute: async ({ result }) =>
-    result.text.length < 10
-      ? { tripwireTriggered: true, message: 'Response too short' }
-      : { tripwireTriggered: false },
-});
+## Secure AI in Under 60 Seconds
-// 2. Wrap your model
-const guardedModel = wrapWithGuardrails(openai('gpt-4o'), {
-  inputGuardrails: [inputGuard],
-  outputGuardrails: [outputGuard],
-});
+**Step 1:** Install (10 seconds)
-// 3. Use it! Guardrails will run automatically.
-const { text } = await generateText({
-  model: guardedModel,
-  prompt: 'A prompt that is definitely not too long.',
-});
+```bash
+npm install ai-sdk-guardrails
 ```
-## How It Works
+**Step 2:** Import (15 seconds)
-### Without Guardrails (Inefficient, Poor Quality)
-```mermaid
-flowchart LR
-    A[User Input<br/>'hello'] --> B[AI Model] --> C[Response<br/>⚠️ Wastes resources<br/>😞 Often useless]
+```ts
+import { withGuardrails, piiDetector } from 'ai-sdk-guardrails';
 ```
-### With Input Guardrails (Save Resources)
+**Step 3:** Wrap your model (30 seconds)
-```mermaid
-flowchart LR
-    A[User Input<br/>'hello'] --> B[Input Guardrails] --> C[❌ STOPPED<br/>✅ No API call made]
+```ts
+const safeModel = withGuardrails(yourModel, {
+  inputGuardrails: [piiDetector()],
+});
 ```
-### With Output Guardrails (Ensure Quality)
+**Result:** Your model now screens every prompt for PII before the call is made. Add guardrails such as `promptInjectionDetector()` or `minLengthRequirement()` to cover prompt injection and output validation. No architecture changes required.
-```mermaid
-flowchart LR
-    A[AI Response<br/>'Here's my SSN: 123-45-6789'] --> B[Output Guardrails] --> C[❌ BLOCKED<br/>🛡️ Privacy protected]
-```
+## TL;DR
-### Complete Protection
+Copy/paste minimal setup:
-```mermaid
-flowchart LR
-    A[User Input] --> B[Input Guardrails] --> C[AI Model] --> D[Output Guardrails] --> E[Clean Response]
+```ts
+import { generateText } from 'ai';
+import { openai } from '@ai-sdk/openai';
+import {
+  withGuardrails,
+  piiDetector,
+  promptInjectionDetector,
+  minLengthRequirement,
+  mcpSecurityGuardrail,
+} from 'ai-sdk-guardrails';
+const model = withGuardrails(openai('gpt-4o'), {
+  inputGuardrails: [piiDetector(), promptInjectionDetector()],
+  outputGuardrails: [
+    minLengthRequirement(160),
+    mcpSecurityGuardrail({
+      maxContentSize: 51200, // 50KB limit
+      injectionThreshold: 0.7, // Configurable sensitivity
+      allowedDomains: ['api.company.com'], // Domain allowlist
+    }),
+  ],
+});
+const { text } = await generateText({
+  model,
+  prompt: 'Write a friendly intro email.',
+});
 ```
-That's it! Input guardrails optimize resource usage by stopping inefficient requests. Output guardrails ensure quality by filtering responses.
+See runnable examples: [examples/README.md](./examples/README.md)
-## 📦 Installation
+## Quickstart (30 seconds)
+Install with your provider (OpenAI shown):
 ```bash
-npm install ai-sdk-guardrails
+pnpm add ai-sdk-guardrails ai @ai-sdk/openai
+# or: npm i ai-sdk-guardrails ai @ai-sdk/openai
+# or: yarn add ai-sdk-guardrails ai @ai-sdk/openai
+```
-# or
+Wrap your model and keep using `generateText` as usual:
+```ts
+import { generateText } from 'ai';
+import { openai } from '@ai-sdk/openai';
+import { withGuardrails, piiDetector } from 'ai-sdk-guardrails';
-yarn add ai-sdk-guardrails
+const model = withGuardrails(openai('gpt-4o'), {
+  inputGuardrails: [piiDetector()],
+});
-# or
+const { text } = await generateText({
+  model,
+  prompt: 'Write a friendly intro email.',
+});
+```
-pnpm add ai-sdk-guardrails
+## Contents
+- Overview
+- Concepts
+- Installation
+- Usage
+  - Define a guardrail
+  - Built-in helpers
+- Streaming
+- Auto Retry (utility and middleware)
+- Error Handling
+- API
+- Examples
+- Compatibility
+- Architecture
+- Contributing
+## API Overview
+### Primary Functions
+- **`withGuardrails(model, config)`** - Main API for wrapping language models with guardrails
+- **`createGuardrails(config)`** - Factory to create reusable guardrail configurations
+- **`withAgentGuardrails(agentSettings, config)`** - Wrap AI SDK Agents with guardrails
+### Migration from v3.x
+- `wrapWithGuardrails` → `withGuardrails` (alias available, deprecated)
+- `wrapAgentWithGuardrails` → `withAgentGuardrails` (alias available, deprecated)
+- Error classes: `InputBlockedError` → `GuardrailsInputError`, `OutputBlockedError` → `GuardrailsOutputError`
+```ts
+// Before (v3.x - still works but deprecated)
+import { wrapWithGuardrails, InputBlockedError } from 'ai-sdk-guardrails';
+const model = wrapWithGuardrails(openai('gpt-4o'), { ... });
+// After (v4.x - recommended)
+import { withGuardrails, GuardrailsInputError } from 'ai-sdk-guardrails';
+const model = withGuardrails(openai('gpt-4o'), { ... });
+// Factory pattern (new in v4.x)
+import { createGuardrails } from 'ai-sdk-guardrails';
+const guards = createGuardrails({ ... });
+const model = guards(openai('gpt-4o'));
 ```
-## 🚀 Quick Start
+## Concepts
-Add smart validation to your AI applications in just 3 steps:
+- Input guardrails: Validate or block prompts to save cost and enforce rules before the call.
+- Output guardrails: Check results for quality and safety. Block, replace, or retry as needed.
+- Middleware: Guardrails wrap any model via AI SDK middleware. Your app code stays the same.
-### 1. Prevent Unnecessary AI Calls
+## Installation
-```typescript
-import { generateText } from 'ai';
+See Quickstart for installation commands. Add providers you use as needed (e.g., `@ai-sdk/openai`, `@ai-sdk/mistral`).
+## Usage
+### Create custom guardrails
+```ts
 import { openai } from '@ai-sdk/openai';
 import {
-  wrapWithInputGuardrails,
   defineInputGuardrail,
+  defineOutputGuardrail,
+  withGuardrails,
 } from 'ai-sdk-guardrails';
 import { extractTextContent } from 'ai-sdk-guardrails/guardrails/input';
+import { extractContent } from 'ai-sdk-guardrails/guardrails/output';
-// Block inefficient requests before calling the AI model
-const lengthGuard = defineInputGuardrail({
-  name: 'blocked-keywords',
-  execute: async (context) => {
-    const { prompt } = extractTextContent(context);
-    const blockedWords = ['spam', 'test', 'hello'];
-    const foundWord = blockedWords.find((word) =>
-      prompt.toLowerCase().includes(word.toLowerCase()),
-    );
-    if (foundWord) {
-      return {
-        tripwireTriggered: true,
-        message: `Blocked keyword detected: ${foundWord}`,
-        severity: 'medium',
-      };
-    }
-    return { tripwireTriggered: false };
+const businessHours = defineInputGuardrail({
+  name: 'business-hours',
+  execute: async (params) => {
+    const hr = new Date().getHours();
+    return hr >= 9 && hr <= 17
+      ? { tripwireTriggered: false }
+      : { tripwireTriggered: true, message: 'Outside business hours' };
   },
 });
-const optimizedModel = wrapWithInputGuardrails(openai('gpt-4'), {
-  inputGuardrails: [lengthGuard],
+const minQuality = defineOutputGuardrail({
+  name: 'min-quality',
+  execute: async ({ result }) => {
+    const { text } = extractContent(result);
+    return text.length >= 80
+      ? { tripwireTriggered: false }
+      : { tripwireTriggered: true, message: 'Response too short' };
+  },
 });
-// This would normally waste an API call for a useless response
-try {
-  const result = await generateText({
-    model: optimizedModel,
-    prompt: 'hello', // ❌ Blocked - prevents unnecessary API call
-  });
-} catch (error) {
-  console.log('Blocked request, saved money!');
-}
-// This generates valuable content
-const goodResult = await generateText({
-  model: optimizedModel,
-  prompt: 'Write a product description for our new software', // ✅ This creates value
+const model = withGuardrails(openai('gpt-4o'), {
+  inputGuardrails: [businessHours],
+  outputGuardrails: [minQuality],
 });
 ```
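The custom guardrails added above each reduce to a pure decision that returns a tripwire result. The following is a minimal, self-contained sketch of that pattern, not the library's implementation: the `GuardrailResult` type and both function names are local stand-ins invented for illustration and are not exports of `ai-sdk-guardrails`. Note that the sketch closes at 17:00 (`hour < 17`), whereas the README snippet's `hr <= 17` stays open until 17:59.

```ts
// Local stand-in for the result shape used in the README examples.
// Assumption: the library's real type may carry more fields (severity, metadata, ...).
type GuardrailResult =
  | { tripwireTriggered: false }
  | { tripwireTriggered: true; message: string };

// Pure decision function: the hour is passed in, so tests need no clock mocking.
function checkBusinessHours(hour: number): GuardrailResult {
  // Open from 09:00 up to, but not including, 17:00.
  return hour >= 9 && hour < 17
    ? { tripwireTriggered: false }
    : { tripwireTriggered: true, message: 'Outside business hours' };
}

// Same idea as the README's min-quality check, with the threshold as a parameter.
function checkMinLength(text: string, minChars: number): GuardrailResult {
  return text.length >= minChars
    ? { tripwireTriggered: false }
    : {
        tripwireTriggered: true,
        message: `Response too short: ${text.length} < ${minChars}`,
      };
}
```

Keeping the decision separate from the clock and from the model call makes each rule trivially unit-testable; inside an `execute` callback one would pass `new Date().getHours()` or the extracted text.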
-### 2. Ensure Quality Output
+### Built-in helpers
-```typescript
+```ts
+import { openai } from '@ai-sdk/openai';
 import {
-  wrapWithOutputGuardrails,
-  defineOutputGuardrail,
+  withGuardrails,
+  piiDetector,
+  blockedKeywords,
+  contentLengthLimit,
+  promptInjectionDetector,
+  sensitiveDataFilter,
+  minLengthRequirement,
+  confidenceThreshold,
+  mcpSecurityGuardrail,
+  mcpResponseSanitizer,
 } from 'ai-sdk-guardrails';
-import { extractContent } from 'ai-sdk-guardrails/guardrails/output';
-const qualityGuard = defineOutputGuardrail({
-  name: 'sensitive-info-detector',
-  execute: async (context) => {
-    const { text } = extractContent(context.result);
-    // Simple sensitive info patterns
-    const sensitivePatterns = [
-      /\b\d{3}-\d{2}-\d{4}\b/, // SSN
-      /\b[\w\.-]+@[\w\.-]+\.\w+\b/, // Email
-      /\b\d{3}-\d{3}-\d{4}\b/, // Phone
-    ];
-    const foundPattern = sensitivePatterns.find((pattern) =>
-      pattern.test(text),
-    );
-    if (foundPattern) {
-      return {
-        tripwireTriggered: true,
-        message: 'Sensitive information detected in response',
-        severity: 'high',
-      };
-    }
-    return { tripwireTriggered: false };
-  },
+const model = withGuardrails(openai('gpt-4o'), {
+  inputGuardrails: [
+    piiDetector(),
+    promptInjectionDetector({ threshold: 0.7 }),
+    blockedKeywords(['test', 'spam']),
+    contentLengthLimit(4000),
+  ],
+  outputGuardrails: [
+    mcpSecurityGuardrail({
+      detectExfiltration: true,
+      scanEncodedContent: true,
+      allowedDomains: ['trusted-api.com'],
+    }),
+    mcpResponseSanitizer(),
+    sensitiveDataFilter(),
+    minLengthRequirement(160),
+    confidenceThreshold(0.6),
+  ],
 });
+```
-const qualityModel = wrapWithOutputGuardrails(openai('gpt-4'), {
-  outputGuardrails: [qualityGuard],
-  onOutputBlocked: (results) => {
-    console.log('Prevented sensitive data leak:', results[0]?.message);
-  },
-});
+## Streaming
-const result = await generateText({
-  model: qualityModel,
-  prompt: 'Create a user profile example',
-});
-// Automatically blocks responses containing emails, phone numbers, or SSNs
-```
+Works out of the box. By default, guardrails run after the stream ends (buffer mode). For early blocking, enable progressive mode.
-### 3. Custom Business Logic
-```typescript
-const businessHoursGuard = defineInputGuardrail({
-  name: 'business-hours-only',
-  execute: async () => {
-    const hour = new Date().getUTCHours();
-    // Only allow requests between 9 AM and 5 PM UTC
-    if (hour < 9 || hour > 17) {
-      return {
-        tripwireTriggered: true,
-        message:
-          'Requests are only permitted during business hours (9:00-17:00 UTC).',
-        severity: 'low',
-      };
-    }
-    return { tripwireTriggered: false };
-  },
+```ts
+import { streamText } from 'ai';
+import { openai } from '@ai-sdk/openai';
+import { withGuardrails, minLengthRequirement } from 'ai-sdk-guardrails';
+const model = withGuardrails(openai('gpt-4o'), {
+  outputGuardrails: [minLengthRequirement(120)],
+  // Evaluate as tokens arrive; stop or replace early when blocked
+  streamMode: 'progressive',
+  replaceOnBlocked: true,
 });
-const smartEducationModel = wrapWithInputGuardrails(openai('gpt-4'), {
-  inputGuardrails: [businessHoursGuard],
+const { textStream } = await streamText({
+  model,
+  prompt: 'Tell me a short story about a robot.',
 });
+for await (const delta of textStream) process.stdout.write(delta);
 ```
-**That's it!** Your AI application now optimizes resource usage, ensures quality, and prevents inappropriate responses automatically.
+## Auto Retry
-## ✨ Features
+Choose what fits your flow:
-- 🛡️ **Input & Output Guardrails**: Enforce custom safety, compliance, and quality policies on both prompts and LLM responses.
-- 💰 **Cost Control**: Block invalid or wasteful prompts before they are sent to your LLM provider, saving you money.
-- 🎯 **Quality Improvement**: Automatically filter, flag, or retry low-quality or irrelevant model outputs.
-- 🔄 **Streaming Support**: Works seamlessly with both streaming (streamText) and standard (generateText) API responses.
-- 📊 **Observability Hooks**: Built-in callbacks (onInputBlocked, onOutputBlocked, etc.) for logging and monitoring.
-- ⚙️ **Configurable Execution**: Run guardrails in parallel or sequentially and set custom timeouts.
-- 🚀 **AI SDK Native**: Designed from the ground up to integrate cleanly with AI SDK middleware patterns.
+- Standalone utility: Use `retry()` to wrap any generation function with your own validator and backoff.
+- Middleware option: Add `retry` to output guardrails so retries run automatically when a check fails.
-## 📚 API Overview
+### Utility
-| Function                     | Description                                                                   |
-| ---------------------------- | ----------------------------------------------------------------------------- |
-| `defineInputGuardrail()`     | Creates a guardrail to validate, inspect, or block prompts.                   |
-| `defineOutputGuardrail()`    | Creates a guardrail to validate, filter, or re-route LLM outputs.             |
-| `wrapWithGuardrails()`       | ⭐ **Recommended** - The easiest way to add both input and output guardrails. |
-| `wrapWithInputGuardrails()`  | Attaches input-only guardrails to a model.                                    |
-| `wrapWithOutputGuardrails()` | Attaches output-only guardrails to a model.                                   |
-| `InputBlockedError`, etc.    | Custom, structured error types for easy try/catch handling.                   |
+```ts
+import { retry } from 'ai-sdk-guardrails';
+import { generateText } from 'ai';
+import { openai } from '@ai-sdk/openai';
-## 🧠 Design Philosophy
+const result = await retry({
+  generate: (params) => generateText({ model: openai('gpt-4o'), ...params }),
+  params: { prompt: 'Explain backpropagation in depth.' },
+  validate: (r) => ({
+    blocked: (r.text ?? '').length < 500,
+    message: 'Response too short',
+  }),
+  buildRetryParams: ({ lastParams }) => ({
+    ...lastParams,
+    maxOutputTokens: Math.max(800, (lastParams.maxOutputTokens ?? 400) + 300),
+  }),
+  maxRetries: 2,
+});
+```
-- ✅ **Helper-First**: Simple, chainable utility functions provide a great developer experience for fast adoption.
-- 🧩 **Composable**: Multiple guardrails can be chained together and will run in your specified order (or in parallel).
-- 🧾 **Type-Safe**: Full TypeScript support with contextual typing for guardrail inputs, outputs, and metadata.
-- 🧪 **Sensible Defaults**: Get started quickly with zero-config default behaviors that can be easily overridden.
+### Middleware
-## Architecture Overview
+```ts
+import { generateText } from 'ai';
+import { openai } from '@ai-sdk/openai';
+import { wrapWithOutputGuardrails, defineOutputGuardrail } from 'ai-sdk-guardrails';
+import { extractContent } from 'ai-sdk-guardrails/guardrails/output';
-The library leverages the Vercel AI SDK's middleware architecture to provide composable guardrails that integrate seamlessly with your existing AI applications:
+const minLengthGuardrail = defineOutputGuardrail<{ minChars: number }>({
+  name: 'min-output-length',
+  execute: async ({ result }) => {
+    const { text } = extractContent(result);
+    const minChars = text.length + 1;
+    return text.length < minChars
+      ? {
+          tripwireTriggered: true,
+          severity: 'medium',
+          message: `Answer too short: ${text.length} < ${minChars}`,
+          metadata: { minChars },
+        }
+      : { tripwireTriggered: false };
+  },
+});
-```mermaid
-graph TB
-    subgraph "Your Application"
-        App[Your App Code]
-        Config[Guardrail Configuration]
-    end
-    subgraph "AI SDK Guardrails Middleware"
-        InputMW[Input Guardrails Middleware]
-        OutputMW[Output Guardrails Middleware]
-        subgraph "Input Guardrails Layer"
-            Length[Length Validation]
-            Spam[Spam Detection]
-            PII[PII Detection]
-            Business[Business Rules]
-            Custom1[Custom Guards]
-        end
-        subgraph "Output Guardrails Layer"
-            Quality[Quality Assurance]
-            Sensitive[Sensitive Info Filter]
-            Professional[Professional Tone]
-            Factual[Factual Validation]
-            Custom2[Custom Guards]
-        end
-    end
-    subgraph "AI SDK Core"
-        Wrapper[wrapLanguageModel]
-        Generator[generateText/Object/Stream]
-    end
-    subgraph "External Services"
-        AI[AI Model Provider]
-        Log[Logging & Telemetry]
-    end
-    App --> Config
-    Config --> InputMW
-    InputMW --> Length
-    InputMW --> Spam
-    InputMW --> PII
-    InputMW --> Business
-    InputMW --> Custom1
-    InputMW -->|Valid Request| Wrapper
-    InputMW -->|Blocked Request| Log
-    Wrapper --> Generator
-    Generator --> AI
-    AI --> OutputMW
-    OutputMW --> Quality
-    OutputMW --> Sensitive
-    OutputMW --> Professional
-    OutputMW --> Factual
-    OutputMW --> Custom2
-    OutputMW -->|Clean Response| App
-    OutputMW -->|Quality Issues| Log
-    style InputMW fill:#e1f5fe
-    style OutputMW fill:#f3e5f5
-    style AI fill:#fff3e0
-    style App fill:#e8f5e8
-```
+const guarded = wrapWithOutputGuardrails(
+  openai('gpt-4o'),
+  [minLengthGuardrail],
+  {
+    replaceOnBlocked: false,
+    retry: {
+      maxRetries: 1,
+      buildRetryParams: ({ summary, lastParams }) => ({
+        ...lastParams,
+        maxOutputTokens: Math.max(
+          800,
+          (lastParams.maxOutputTokens ?? 400) + 300,
+        ),
+        prompt: [
+          ...(Array.isArray(lastParams.prompt) ? lastParams.prompt : []),
+          {
+            role: 'user' as const,
+            content: [
+              {
+                type: 'text' as const,
+                text: `Note: The previous answer ${summary.blockedResults[0]?.message}. Provide a comprehensive, detailed answer with examples.`,
+              },
+            ],
+          },
+        ],
+      }),
+    },
+  },
+);
-## 🍳 Recipes & Use Cases
+const { text } = await generateText({
+  model: guarded,
+  prompt: 'Explain the significance of the Turing Test in AI history.',
+});
+```
-Guardrails can enforce any custom logic. Here are a few common patterns.
+Tip: Use backoff helpers if you need delays between retries: `exponentialBackoff`, `linearBackoff`, `fixedBackoff`, `jitteredExponentialBackoff`, or `backoffPresets`.
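To make the backoff tip concrete, here is a rough sketch of how such delay schedules are commonly computed. The function names, parameters, and default values below are hypothetical and chosen for illustration; the package's own `exponentialBackoff`, `linearBackoff`, and related helpers may take different arguments.

```ts
// Hypothetical helpers for illustration; not the library's backoff exports.

// Exponential: 250, 500, 1000, ... milliseconds, capped at maxMs. `attempt` is 1-based.
function exponentialDelayMs(attempt: number, baseMs = 250, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

// Linear: grows by a fixed step per attempt, capped at maxMs.
function linearDelayMs(attempt: number, stepMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, stepMs * attempt);
}

// Full jitter: pick uniformly below the exponential delay so that many
// clients retrying at once do not hit the provider in lockstep.
// The random source is injectable to keep the function deterministic in tests.
function jitteredDelayMs(attempt: number, random: () => number = Math.random): number {
  return Math.floor(random() * exponentialDelayMs(attempt));
}
```

Exponential schedules with a cap suit rate-limited providers; jitter matters mostly when many requests fail at the same moment.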
-### Rate Limiting
+## Error Handling
-Pass a userId in the metadata of your generateText call to enforce per-user rate limits.
+Set `throwOnBlocked: true` to throw structured errors you can catch and turn into friendly messages.
-```typescript
-const rateLimitGuard = defineInputGuardrail({
-  name: 'user-rate-limit',
-  execute: async ({ metadata }) => {
-    const userId = metadata?.userId ?? 'anonymous';
-    const allowed = await checkRateLimit(userId); // Your rate-limiting logic
+```ts
+import { isGuardrailsError } from 'ai-sdk-guardrails';
-    return allowed
-      ? { tripwireTriggered: false }
-      : {
-          tripwireTriggered: true,
-          message: `Rate limit exceeded for user: ${userId}`,
-        };
-  },
-});
+try {
+  const { text } = await generateText({ model, prompt: '...' });
+} catch (err) {
+  if (isGuardrailsError(err)) {
+    console.error('Guardrail blocked:', err.message);
+    // err.results gives you details per guardrail
+  } else {
+    console.error('Unexpected error:', err);
+  }
+}
 ```
-### LLM-as-Judge for Quality Scoring
+## Reusable Guardrails Factory
-Use a cheaper, faster model to "judge" the output of a more powerful one.
+Use `createGuardrails()` to create reusable guardrail configurations that can be applied to multiple models:
-```typescript
-const qualityJudge = defineOutputGuardrail({
-  name: 'llm-quality-judge',
-  execute: async ({ result }) => {
-    // Use a cheap model to score the primary model's output
-    const judgement = await generateText({
-      model: openai('gpt-3.5-turbo'),
-      prompt: `Is the following response helpful and safe? Answer YES or NO. \n\nResponse: "${result.text}"`,
-    });
-    const isSafe = judgement.text.includes('YES');
-    return isSafe
-      ? { tripwireTriggered: false }
-      : {
-          tripwireTriggered: true,
-          message: `Output failed LLM-as-judge quality check.`,
-          metadata: { originalText: result.text },
-        };
-  },
+```ts
+import { openai } from '@ai-sdk/openai';
+import { anthropic } from '@ai-sdk/anthropic';
+import { createGuardrails, defineInputGuardrail } from 'ai-sdk-guardrails';
+// Create reusable guardrails configuration
+const productionGuards = createGuardrails({
+  inputGuardrails: [piiDetector(), contentFilter()],
+  outputGuardrails: [qualityCheck(), minLength(100)],
+  throwOnBlocked: true,
 });
-```
-### Advanced Input Validation
+// Apply to multiple models
+const gpt4 = productionGuards(openai('gpt-4o'));
+const claude = productionGuards(anthropic('claude-3-sonnet'));
-```typescript
-import { extractTextContent } from 'ai-sdk-guardrails/guardrails/input';
+// Compose multiple guardrail sets
+const strictLimits = createGuardrails({ inputGuardrails: [maxLength(500)] });
+const piiProtection = createGuardrails({ inputGuardrails: [piiDetector()] });
-const comprehensiveInputGuard = defineInputGuardrail({
-  name: 'comprehensive-input-validation',
-  execute: async (context) => {
-    const { prompt } = extractTextContent(context);
-    // Length validation
-    if (prompt.length < 10) {
-      return {
-        tripwireTriggered: true,
-        message: 'Input too short - likely to produce low-value response',
-        severity: 'medium',
-        suggestion: 'Please provide more detailed input for better results',
-      };
-    }
-    if (prompt.length > 4000) {
-      return {
-        tripwireTriggered: true,
-        message: 'Input too long - may exceed token limits',
-        severity: 'high',
-        suggestion: 'Break your request into smaller, focused parts',
-      };
-    }
-    // Content quality checks
-    const spamPatterns = [
-      /^(.)\1{10,}$/, // Repeated characters
-      /^(test|hello|hi|hey)$/i, // Common spam words
-    ];
-    const foundSpam = spamPatterns.find((pattern) => pattern.test(prompt));
-    if (foundSpam) {
-      return {
-        tripwireTriggered: true,
-        message: 'Low-quality input detected',
-        severity: 'high',
-      };
-    }
-    return { tripwireTriggered: false };
-  },
-});
+// Chain them together
+const model = piiProtection(strictLimits(openai('gpt-4o')));
 ```
-### Professional Output Quality Control
+## MCP Security Guardrails
-```typescript
-import { extractContent } from 'ai-sdk-guardrails/guardrails/output';
+**Production-Ready**: Protect against prompt injection and data exfiltration attacks when using Model Context Protocol (MCP) tools. Based on research into the ["lethal trifecta" vulnerability](https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/) that has affected major AI platforms.
-const professionalQualityGuard = defineOutputGuardrail({
-  name: 'professional-quality-control',
-  execute: async (context) => {
-    const { text } = extractContent(context.result);
-    const qualityIssues = [];
-    // Check for unprofessional language
-    const unprofessionalTerms = ['lol', 'wtf', 'omg', 'ur', 'u r'];
-    const hasUnprofessional = unprofessionalTerms.some((term) =>
-      text.toLowerCase().includes(term),
-    );
-    if (hasUnprofessional) {
-      qualityIssues.push('Contains unprofessional language');
-    }
-    // Check for placeholder text
-    const placeholders = ['[insert', '[add', '[your', 'TODO:', 'FIXME:'];
-    const hasPlaceholders = placeholders.some((placeholder) =>
-      text.includes(placeholder),
-    );
-    if (hasPlaceholders) {
-      qualityIssues.push('Contains placeholder text - incomplete response');
-    }
-    // Check for excessive repetition
-    const sentences = text.split(/[.!?]+/).filter((s) => s.trim());
-    const uniqueSentences = new Set(
-      sentences.map((s) => s.trim().toLowerCase()),
-    );
-    const repetitionRatio = uniqueSentences.size / sentences.length;
-    if (sentences.length > 3 && repetitionRatio < 0.6) {
-      qualityIssues.push('Excessive repetition detected');
-    }
-    if (qualityIssues.length > 0) {
-      return {
-        tripwireTriggered: true,
-        message: `Quality issues found: ${qualityIssues.join(', ')}`,
-        severity: 'medium',
-        suggestion: 'Request a more professional, complete response',
-        metadata: {
-          issues: qualityIssues,
-          quality_score: repetitionRatio,
-        },
-      };
-    }
+### The Problem
-    return { tripwireTriggered: false };
-  },
-});
-```
+AI agents with MCP tools can be vulnerable when they have:
-## 🔄 Streaming Support
+1. **Access to private data** (through tools)
+2. **Exposure to untrusted content** (from tool responses)
+3. **The ability to communicate externally** (make web requests)
-Guardrails work with streams out-of-the-box. Output guardrails will run after the complete response has been streamed and generated.
+Malicious tool responses can contain hidden instructions that trick the AI into exfiltrating sensitive data.
-```typescript
-import { streamText } from 'ai';
+### Production-Ready Solution
-const guardedModel = wrapWithGuardrails(openai('gpt-4o'), {
-  outputGuardrails: [qualityJudge],
-});
+Full configurability with sensible defaults for immediate deployment:
-const { textStream } = await streamText({
-  model: guardedModel,
-  prompt: 'Tell me a short story about a robot.',
+```ts
+import {
+  withGuardrails,
+  promptInjectionDetector,
+  mcpSecurityGuardrail,
+  mcpResponseSanitizer,
+  toolEgressPolicy,
+} from 'ai-sdk-guardrails';
+// Conservative production setup (high security)
+const secureModel = withGuardrails(openai('gpt-4o'), {
+  inputGuardrails: [
+    promptInjectionDetector({ threshold: 0.6, includeExamples: true }),
+  ],
+  outputGuardrails: [
+    mcpSecurityGuardrail({
+      injectionThreshold: 0.5, // Lower = more sensitive
+      maxSuspiciousUrls: 0, // Zero tolerance
+      maxContentSize: 25600, // 25KB limit for performance
+      minEncodedLength: 15, // Detect shorter encoded attacks
+      encodedInjectionThreshold: 0.2, // Combined encoded + injection threshold
+      highRiskThreshold: 0.3, // High-risk cascade blocking
+      authorityThreshold: 0.5, // Authority manipulation detection
+      allowedDomains: ['api.company.com', 'trusted-partner.com'],
+      customSuspiciousDomains: ['evil.com', 'malicious.org'],
+      blockCascadingCalls: true,
+      scanEncodedContent: true,
+      detectExfiltration: true,
+    }),
+    mcpResponseSanitizer(), // Sanitize malicious content instead of blocking
+    toolEgressPolicy({
+      allowedHosts: ['api.company.com', 'trusted-partner.com'],
+      blockedHosts: ['webhook.site', 'requestcatcher.com', 'ngrok.io'],
+      scanForUrls: true,
+    }),
+  ],
 });
+```
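The `allowedDomains` and `allowedHosts` options in the configuration above imply a host allowlist check on URLs found in tool output. How the package performs that check is not shown in this diff; the sketch below is one plausible approach under stated assumptions. Both function names are hypothetical, and treating subdomains of an allowed host as allowed is an assumption, not documented behavior.

```ts
// Hypothetical helpers for illustration; not the library's implementation.

// Returns true when the URL's hostname equals an allowed host or is a subdomain of one.
function isHostAllowed(rawUrl: string, allowedHosts: string[]): boolean {
  let hostname: string;
  try {
    hostname = new URL(rawUrl).hostname.toLowerCase();
  } catch {
    return false; // Unparseable URLs fail closed.
  }
  return allowedHosts.some((allowed) => {
    const host = allowed.toLowerCase();
    // The leading dot rejects look-alikes such as "evilapi.company.com".
    return hostname === host || hostname.endsWith(`.${host}`);
  });
}

// Extracts http(s) URLs from free text and returns those whose host is not allowed.
function findDisallowedUrls(text: string, allowedHosts: string[]): string[] {
  const urls = text.match(/https?:\/\/[^\s"'<>)]+/g) ?? [];
  return urls.filter((url) => !isHostAllowed(url, allowedHosts));
}
```

Parsing with `URL` rather than matching substrings matters here: strings like `https://api.company.com@evil.com/` or `https://api.company.com.evil.com/` contain the trusted name but resolve to a different host.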
-// Stream the response to the client
-for await (const delta of textStream) {
-  process.stdout.write(delta);
+### Environment & Role-Based Configuration
+```ts
+// Different security profiles for different environments
+function getSecurityConfig(env: 'production' | 'staging' | 'development') {
+  const configs = {
+    production: {
+      injectionThreshold: 0.5, // High security
+      maxContentSize: 25600, // 25KB limit
+      authorityThreshold: 0.5, // Very sensitive
+    },
+    staging: {
+      injectionThreshold: 0.7, // Balanced security
+      maxContentSize: 51200, // 50KB default
+      authorityThreshold: 0.7, // Standard sensitivity
+    },
+    development: {
+      injectionThreshold: 0.8, // Lower security, better performance
+      maxContentSize: 102400, // 100KB for testing
+      authorityThreshold: 0.8, // Less restrictive
+    },
+  };
+  return configs[env];
 }
-// The qualityJudge guardrail will run after the stream is complete.
+const productionModel = withGuardrails(openai('gpt-4o'), {
+  outputGuardrails: [mcpSecurityGuardrail(getSecurityConfig('production'))],
+});
 ```
-## 🛠️ Error Handling
+### Attack Vectors Prevented
-When `throwOnBlocked: true` (the default), you can catch structured errors to handle blocks gracefully.
+✅ **Direct prompt injection** - "System: ignore all previous instructions"
+✅ **Tool response poisoning** - Malicious content in MCP tool responses
+✅ **Data exfiltration** - URLs constructed to steal sensitive data
+✅ **Encoded attacks** - Base64/hex hidden malicious instructions
+✅ **Cascading exploits** - Tool responses triggering additional dangerous calls
+✅ **Context poisoning** - Attempts to modify AI behavior mid-conversation
-```typescript
-import { generateText } from 'ai';
-import { isGuardrailsError } from 'ai-sdk-guardrails';
+### Secure MCP Agent Example
-try {
-  const result = await generateText({
-    model: guardedModel,
-    prompt: 'A prompt that might be blocked...',
-  });
-} catch (error) {
-  if (isGuardrailsError(error)) {
-    // Error was thrown by one of our guardrails
-    console.error('Guardrail check failed:', error.message);
-    console.error('Triggered Guards:', error.results);
-  } else {
-    // Some other error occurred
-    console.error('An unexpected error occurred:', error);
-  }
-}
+```ts
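+import { openai } from '@ai-sdk/openai';
+import { withAgentGuardrails } from 'ai-sdk-guardrails';
+// The guardrail helpers and the tools (file_search, api_call, database_query)
+// used below are assumed to be imported or defined elsewhere.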
+const secureAgent = withAgentGuardrails(
+  {
+    model: openai('gpt-4o'),
+    tools: { file_search, api_call, database_query },
+    system: 'You are a secure assistant. Always validate tool responses.',
+  },
+  {
+    inputGuardrails: [promptInjectionDetector()],
+    outputGuardrails: [
+      mcpSecurityGuardrail({
+        detectExfiltration: true,
+        allowedDomains: ['trusted-api.com'],
+      }),
+      mcpResponseSanitizer(),
+    ],
+    toolGuardrails: [
+      toolEgressPolicy({
+        allowedHosts: ['trusted-api.com'],
+        scanForUrls: true,
+      }),
+    ],
+  },
+);
 ```
-### User-Friendly Error Messages
+### Configuration Options
-Transform technical guardrail messages into user-friendly guidance:
+All security parameters are fully configurable with sensible defaults:
-```typescript
-function createUserFriendlyMessage(guardrailResult): string {
-  const guardrailName = guardrailResult.context?.guardrailName;
+| Option                      | Default | Description                                      |
+| --------------------------- | ------- | ------------------------------------------------ |
+| `injectionThreshold`        | 0.7     | Prompt injection confidence threshold (0-1)      |
+| `maxSuspiciousUrls`         | 0       | Max allowed suspicious URLs (0 = zero tolerance) |
+| `maxContentSize`            | 51200   | Max content size in bytes (50KB default)         |
+| `minEncodedLength`          | 20      | Min encoded content length to analyze            |
+| `encodedInjectionThreshold` | 0.3     | Combined encoded + injection threshold           |
+| `authorityThreshold`        | 0.7     | Authority manipulation detection sensitivity     |
+| `allowedDomains`            | []      | Allowed domains for URL construction             |
+| `customSuspiciousDomains`   | []      | Additional suspicious domain patterns            |
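As a rough illustration of how partial overrides could combine with the defaults in this table, here is a self-contained sketch. The default values are copied from the table; the `McpSecurityOptions` type, the `resolveOptions` function, and the 0-1 range check on all three thresholds are assumptions for illustration, not exports or documented behavior of the library.

```typescript
// Hypothetical sketch: names are illustrative, not library exports.
interface McpSecurityOptions {
  injectionThreshold: number;
  maxSuspiciousUrls: number;
  maxContentSize: number;
  minEncodedLength: number;
  encodedInjectionThreshold: number;
  authorityThreshold: number;
  allowedDomains: string[];
  customSuspiciousDomains: string[];
}

// Defaults as listed in the configuration table.
const DEFAULTS: McpSecurityOptions = {
  injectionThreshold: 0.7,
  maxSuspiciousUrls: 0,
  maxContentSize: 51200,
  minEncodedLength: 20,
  encodedInjectionThreshold: 0.3,
  authorityThreshold: 0.7,
  allowedDomains: [],
  customSuspiciousDomains: [],
};

// Merge overrides onto the defaults and reject out-of-range thresholds.
// Assumption: every threshold is a confidence value in [0, 1].
function resolveOptions(
  overrides: Partial<McpSecurityOptions> = {},
): McpSecurityOptions {
  const merged: McpSecurityOptions = { ...DEFAULTS, ...overrides };
  const thresholds = [
    'injectionThreshold',
    'encodedInjectionThreshold',
    'authorityThreshold',
  ] as const;
  for (const key of thresholds) {
    const value = merged[key];
    if (!(value >= 0 && value <= 1)) {
      throw new RangeError(`${key} must be between 0 and 1`);
    }
  }
  return merged;
}
```

For example, `resolveOptions({ injectionThreshold: 0.5 })` tightens injection detection while leaving `maxContentSize` at 51200.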
-  switch (guardrailName) {
-    case 'content-length-limit':
-      return 'Your message is too long. Please keep it under 500 characters for the best response.';
+### Performance & Security Balance
-    case 'blocked-keywords':
-      return "I can't help with that topic. Try asking about something else I can assist with.";
+- **High Security**: Lower thresholds, stricter limits, comprehensive scanning
+- **Balanced**: Default settings, good for most production use cases
+- **High Performance**: Higher thresholds, larger limits, selective scanning
-    case 'user-rate-limit':
-      return "You're sending requests too quickly. Please wait a moment before trying again.";
+See complete examples:
-    default:
-      return (
-        guardrailResult.suggestion ||
-        'Please refine your request and try again.'
-      );
-  }
-}
-```
+- [Production MCP Configuration](./examples/44-production-mcp-config.ts) - **New!**
+- [MCP Security Test Suite](./examples/41-mcp-security-test.ts)
+- [Enhanced Security Testing](./examples/43-enhanced-mcp-security-test.ts)
+- [Vulnerability Proof of Concept](./examples/42-mcp-vulnerability-proof.ts)
-## Complete AI SDK Integration
+## Agent Support
-The library seamlessly integrates with all AI SDK functions:
+Guardrails work with AI SDK Agents for multi-step agentic workflows:
-```typescript
-// Create your production-ready model once
-const productionModel = wrapWithGuardrails(openai('gpt-4'), {
-  inputGuardrails: [lengthGuard, spamGuard, rateLimitGuard],
-  outputGuardrails: [qualityGuard, sensitiveInfoGuard],
-  throwOnBlocked: false,
-  onInputBlocked: (results) => {
-    console.log('Input blocked:', results[0]?.message);
+```ts
+import { openai } from '@ai-sdk/openai';
+import { withAgentGuardrails, defineOutputGuardrail } from 'ai-sdk-guardrails';
+import { tool } from 'ai';
+import { z } from 'zod';
+// Define tools for the agent
+const searchTool = tool({
+  description: 'Search for information',
+  inputSchema: z.object({ query: z.string() }),
+  execute: async ({ query }) => `Results for: ${query}`,
+});
+// Create agent with guardrails
+const agent = withAgentGuardrails(
+  {
+    model: openai('gpt-4o'),
+    tools: { search: searchTool },
+    system: 'You are a helpful research assistant.',
   },
-  onOutputBlocked: (results) => {
-    console.log('Output filtered:', results[0]?.message);
+  {
+    outputGuardrails: [
+      defineOutputGuardrail({
+        name: 'tool-usage-required',
+        description: 'Ensures agent uses search tools',
+        execute: async (params) => {
+          const hasToolCall = params.result.steps?.some(
+            (step) => step.type === 'tool-call',
+          );
+          return {
+            tripwireTriggered: !hasToolCall,
+            message: hasToolCall
+              ? 'Tool usage validated'
+              : 'Must use search tools for research',
+            severity: 'high',
+          };
+        },
+      }),
+    ],
+    throwOnBlocked: true,
   },
-});
+);
-// Use with any AI SDK function
-const textResult = await generateText({
-  model: productionModel,
-  prompt: 'Write a professional email response',
+// Use the guarded agent
+const result = await agent.generate({
+  prompt: 'Research the latest AI developments',
 });
+```
-const objectResult = await generateObject({
-  model: productionModel,
-  prompt: 'Create a user profile',
-  schema: userProfileSchema,
-});
+## API
-const textStream = await streamText({
-  model: productionModel,
-  prompt: 'Explain our product features',
-});
-```
+| Export                                                                                                      | Description                                                                      |
+| ----------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- |
+| `defineInputGuardrail`, `defineOutputGuardrail`                                                             | Create guardrails with clear messages, severity, and metadata.                   |
+| `withGuardrails`, `createGuardrails`, `withAgentGuardrails`                                                 | Attach guardrails to AI SDK models and agents via middleware.                    |
+| `executeInputGuardrails`, `executeOutputGuardrails`                                                         | Run guardrails programmatically (outside middleware) and get structured results. |
+| `retry`, `retryHelpers`                                                                                     | Standalone auto-retry utilities with validation and backoff.                     |
+| `GuardrailsError`, `GuardrailsInputError`, `GuardrailsOutputError`, `isGuardrailsError`, `extractErrorInfo` | Structured errors and helpers for robust handling.                               |
+| `exponentialBackoff`, `linearBackoff`, `fixedBackoff`, `jitteredExponentialBackoff`, `backoffPresets`       | Backoff strategies to control retry pacing.                                      |
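The backoff exports named in the last row correspond to well-known retry pacing patterns. The sketch below shows what fixed, linear, exponential, and jittered delays typically compute. The function names, signatures, 1-based attempt numbering, and the 30-second cap are assumptions for illustration and may differ from the library's actual helpers.

```typescript
// Hypothetical sketch: these are NOT the library's backoff functions.
// A backoff maps a 1-based attempt number to a delay in milliseconds.
type Backoff = (attempt: number) => number;

// Same delay before every retry.
const fixedDelay = (ms: number): Backoff => () => ms;

// Delay grows by `baseMs` with each attempt: base, 2*base, 3*base, ...
const linearDelay = (baseMs: number): Backoff => (attempt) => baseMs * attempt;

// Delay doubles with each attempt, capped at `maxMs`.
const exponentialDelay = (baseMs: number, maxMs = 30000): Backoff =>
  (attempt) => Math.min(maxMs, baseMs * Math.pow(2, attempt - 1));

// "Full jitter": pick a delay in [0, inner(attempt)) to spread out retries.
// `random` is injectable so the behavior can be made deterministic in tests.
const withJitter = (
  inner: Backoff,
  random: () => number = Math.random,
): Backoff => (attempt) => Math.floor(inner(attempt) * random());
```

Jitter matters when many clients retry at once: without it, they all wake up on the same schedule and hit the provider together.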
+See source for built-in helpers:
+- Input helpers: `./src/guardrails/input.ts`
+- Output helpers: `./src/guardrails/output.ts`
 ## Examples
-Explore focused examples that demonstrate practical performance optimization and quality assurance:
+Browse runnable examples for streaming, compliance, safety, and more:
-### Core Examples
+- Index and commands: [examples/README.md](./examples/README.md)
-- **[Basic Composition](examples/basic-composition.ts)** - Simple input/output validation for efficiency and quality
-- **[Basic Guardrails](examples/basic-guardrails.ts)** - Foundation patterns for input/output validation
-- **[Business Logic](examples/business-logic.ts)** - Custom business rules, work hours, and professional standards
-- **[LLM-as-Judge](examples/llm-as-judge.ts)** - AI-powered quality evaluation and scoring
+Quick starts
-### Additional Examples
+| Example                    | Description                     | File                                                                              |
+| -------------------------- | ------------------------------- | --------------------------------------------------------------------------------- |
+| Simple combined protection | Minimal input and output setup  | [07a-simple-combined-protection.ts](./examples/07a-simple-combined-protection.ts) |
+| Auto retry on output       | Retry until output meets a rule | [32-auto-retry-output.ts](./examples/32-auto-retry-output.ts)                     |
+| LLM judge auto-retry       | Judge feedback drives retry     | [33-judge-auto-retry.ts](./examples/33-judge-auto-retry.ts)                       |
+| Expected tool use retry    | Enforce/guide tool usage        | [34-expected-tool-use-retry.ts](./examples/34-expected-tool-use-retry.ts)         |
+| Weather assistant          | End-to-end input/output + retry | [33-blog-post-weather-assistant.ts](./examples/33-blog-post-weather-assistant.ts) |
-- **[Object Guardrails](examples/object-guardrails.ts)** - Schema validation and structured output quality
-- **[Streaming Guardrails](examples/streaming-guardrails.ts)** - Real-time quality monitoring
-- **[Rate Limiting](examples/rate-limit-guardrail.ts)** - Smart rate limiting that prevents resource overuse
-- **[Autoevals Integration](examples/autoevals-guardrails.ts)** - Advanced AI-powered evaluation
+Input safety
-### Running Examples
+| Example            | Description                         | File                                                            |
+| ------------------ | ----------------------------------- | --------------------------------------------------------------- |
+| Input length limit | Enforce max input length            | [01-input-length-limit.ts](./examples/01-input-length-limit.ts) |
+| Blocked keywords   | Block specific terms                | [02-blocked-keywords.ts](./examples/02-blocked-keywords.ts)     |
+| PII detection      | Detect PII before calling the model | [03-pii-detection.ts](./examples/03-pii-detection.ts)           |
+| Rate limiting      | Simple per-user rate limit          | [13-rate-limiting.ts](./examples/13-rate-limiting.ts)           |
-```bash
-# Install dependencies
-pnpm install
-# Interactive examples with better UX
-tsx examples/basic-composition.ts     # Start here - simplest example
-tsx examples/basic-guardrails.ts      # Core patterns with 8 examples
-tsx examples/business-logic.ts        # Business-specific rules
-tsx examples/llm-as-judge.ts          # AI-powered quality control
-# Or run specific examples directly
-tsx examples/basic-guardrails.ts 1    # Run first example only
-tsx examples/streaming-guardrails.ts 3 # Run third streaming example
+Output safety
+| Example                 | Description                         | File                                                                      |
+| ----------------------- | ----------------------------------- | ------------------------------------------------------------------------- |
+| Output length check     | Require min/max output length       | [04-output-length-check.ts](./examples/04-output-length-check.ts)         |
+| Sensitive output filter | Filter secrets and PII in responses | [05-sensitive-output-filter.ts](./examples/05-sensitive-output-filter.ts) |
+| Hallucination detection | Flag uncertain factual claims       | [19-hallucination-detection.ts](./examples/19-hallucination-detection.ts) |
+Streaming
+| Example           | Description                        | File                                                                              |
+| ----------------- | ---------------------------------- | --------------------------------------------------------------------------------- |
+| Streaming limits  | Apply limits in buffered streaming | [11-streaming-limits.ts](./examples/11-streaming-limits.ts)                       |
+| Streaming quality | Quality checks with streaming      | [12-streaming-quality.ts](./examples/12-streaming-quality.ts)                     |
+| Early termination | Stop streams early when blocked    | [28-streaming-early-termination.ts](./examples/28-streaming-early-termination.ts) |
+Advanced
+| Example                    | Description                   | File                                                                            |
+| -------------------------- | ----------------------------- | ------------------------------------------------------------------------------- |
+| Simple quality judge       | Cheaper model judges quality  | [15a-simple-quality-judge.ts](./examples/15a-simple-quality-judge.ts)           |
+| Secret leakage scan        | Scan responses for secrets    | [18-secret-leakage-scan.ts](./examples/18-secret-leakage-scan.ts)               |
+| SQL code safety            | Basic SQL safety checks       | [24-sql-code-safety.ts](./examples/24-sql-code-safety.ts)                       |
+| Role hierarchy enforcement | Enforce role rules in prompts | [23-role-hierarchy-enforcement.ts](./examples/23-role-hierarchy-enforcement.ts) |
+## Compatibility
+- Runtime: Node.js 18+ recommended
+- AI SDK: Compatible with AI SDK 5 (`ai@^5`); wraps any model
+- `generateObject`: for strict object validation, run `executeOutputGuardrails()` after generation
+## Architecture
+```mermaid
+flowchart LR
+  A[Input] --> B[Input Guardrails]
+  B -->|Valid| C[AI Model]
+  B -->|Blocked| X[No API Call]
+  C --> D[Output Guardrails]
+  D -->|Clean| E[Response]
+  D -->|Blocked| R[Retry/Replace/Throw]
 ```
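The flow in the diagram can be sketched in a few lines of plain TypeScript. This is a simplified illustration, not the library's middleware: the `Guard`, `Outcome`, `firstTriggered`, and `runPipeline` names are hypothetical, the guards are synchronous for brevity, and the retry/replace branch is reduced to returning a blocked status. Only the `tripwireTriggered` and `message` field names come from the examples in this README.

```typescript
// Hypothetical sketch of the diagram's flow; not the library's middleware.
interface GuardResult {
  tripwireTriggered: boolean;
  message: string;
}
type Guard = (text: string) => GuardResult;

type Outcome =
  | { status: 'input-blocked'; message: string }
  | { status: 'output-blocked'; message: string }
  | { status: 'ok'; text: string };

// Return the first guard result whose tripwire fired, if any.
function firstTriggered(guards: Guard[], text: string): GuardResult | undefined {
  for (const guard of guards) {
    const result = guard(text);
    if (result.tripwireTriggered) return result;
  }
  return undefined;
}

function runPipeline(
  prompt: string,
  model: (prompt: string) => string,
  inputGuards: Guard[],
  outputGuards: Guard[],
): Outcome {
  // Input guardrails run first; a blocked input never reaches the model.
  const blockedInput = firstTriggered(inputGuards, prompt);
  if (blockedInput) {
    return { status: 'input-blocked', message: blockedInput.message };
  }
  const text = model(prompt);
  // Output guardrails inspect the model's response before it is returned.
  const blockedOutput = firstTriggered(outputGuards, text);
  if (blockedOutput) {
    return { status: 'output-blocked', message: blockedOutput.message };
  }
  return { status: 'ok', text };
}
```

The key property is the "No API Call" branch: because input guards run before `model` is invoked, a blocked prompt incurs no model cost.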
-All examples feature interactive menus with arrow key navigation, multi-selection with checkboxes, and automatic return to the main menu.
+### Design principles
+- Helper-first: simple, chainable APIs with great DX
+- Composable: run multiple guardrails in any order
+- Type-safe: rich TypeScript types and inference
+- Sensible defaults: zero-config to start, full control when you need it
-## 🤝 Contributing
+## Contributing
-Contributions of all sizes are welcome! Please open issues and pull requests on [GitHub](https://github.com/jagreehal/ai-sdk-guardrails).
+Issues and PRs are welcome.
-## 📄 License
+## License
-MIT © [Jag Reehal](https://github.com/jagreehal) – See LICENSE for full details.
+MIT © Jag Reehal. See LICENSE for details.