npm - ai-sdk-guardrails - Versions diffs - 5.0.0 → 5.0.2 - Mend

ai-sdk-guardrails 5.0.0 → 5.0.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (24)

package/README.md +368 -436
package/dist/chunk-CSUDFTRH.js +1067 -0
package/dist/chunk-LVXHWHZC.js +355 -0
package/dist/chunk-ND2ICBTR.js +1209 -0
package/dist/chunk-VKJ5EAS7.js +397 -0
package/dist/guardrails/input.cjs +1109 -0
package/dist/guardrails/input.d.cts +133 -0
package/dist/guardrails/input.d.ts +133 -0
package/dist/guardrails/input.js +37 -0
package/dist/guardrails/output.cjs +1260 -0
package/dist/guardrails/output.d.cts +100 -0
package/dist/guardrails/output.d.ts +100 -0
package/dist/guardrails/output.js +55 -0
package/dist/guardrails/tools.cjs +658 -0
package/dist/guardrails/tools.d.cts +60 -0
package/dist/guardrails/tools.d.ts +60 -0
package/dist/guardrails/tools.js +10 -0
package/dist/index.cjs +6046 -0
package/dist/index.d.cts +1043 -0
package/dist/index.d.ts +1043 -0
package/dist/index.js +3063 -0
package/dist/types-CeZi2BBN.d.cts +727 -0
package/dist/types-CeZi2BBN.d.ts +727 -0
package/package.json +31 -22

package/README.md CHANGED

@@ -1,10 +1,8 @@
 # AI SDK Guardrails
-**Input and output validation for the Vercel AI SDK**
+## Safety and quality controls for Vercel AI SDK
-Add safety checks and quality controls to your AI applications. Guard against prompt injection, prevent sensitive data leaks, and improve output reliability - all while keeping your existing AI SDK code unchanged.
-**Now includes MCP (Model Context Protocol) security guardrails** to help protect against attacks when using AI tools.
+Add guardrails to your AI applications in one line of code. Block PII, prevent prompt injection, enforce output quality - while keeping your existing telemetry and observability stack intact.
 [![npm version](https://img.shields.io/npm/v/ai-sdk-guardrails.svg?logo=npm&label=npm)](https://www.npmjs.com/package/ai-sdk-guardrails)
 [![downloads](https://img.shields.io/npm/dw/ai-sdk-guardrails.svg?label=downloads)](https://www.npmjs.com/package/ai-sdk-guardrails)
@@ -14,52 +12,79 @@ Add safety checks and quality controls to your AI applications. Guard against pr
 ![Guardrails Demo](./media/guardrail-example.gif)
-## Why this matters
+## Drop-in Guardrails for any AI model
-- **MCP**: Protect against prompt injection and data exfiltration when using MCP tools
-- **Agent**: Have more reliable and secure agentic workflows
-- **Tool security**: Protect against data exfiltration when using MCP tools
-- **Save costs**: Block unnecessary requests before they hit your model
-- **Improve safety**: Detect PII, block harmful content, prevent prompt injection
-- **Better quality**: Enforce minimum response lengths, validate structure, auto-retry on failures
-- **Easy integration**: Works as middleware with any AI SDK model
+```ts
+import { withGuardrails, piiDetector } from 'ai-sdk-guardrails';
+const model = openai('gpt-4o'); // or any other AI model
-## Common use cases
+// Everything else stays the same
+const safeModel = withGuardrails(model, {
+  inputGuardrails: [piiDetector()],
+});
-- Content moderation and safety filters
-- PII detection for compliance
-- Output quality requirements (length, format)
-- Prompt injection prevention
-- Tool usage validation
-- Auto-retry on low-quality responses
+// Your existing code, telemetry, and logging still works
+await generateText({ model: safeModel, prompt: '...' });
+```
-## Secure AI in Under 60 Seconds
+**That's it.** Your AI now blocks PII automatically.
-**Step 1:** Install (10 seconds)
+## Installation
 ```bash
 npm install ai-sdk-guardrails
 ```
-**Step 2:** Import (15 seconds)
+## 🧙‍♂️ No-Code Wizard (New!)
+**Don't want to write code?** Use our visual wizard to configure guardrails:
+1. **Open the wizard**: [wizard-prototype/index.html](./wizard-prototype/index.html)
+2. **Choose your use case**: Content moderation, data protection, quality assurance, or security
+3. **Select guardrails**: Pick from 40+ built-in guardrails
+4. **Configure settings**: Adjust thresholds and parameters with sliders and toggles
+5. **Copy generated code**: Get production-ready TypeScript code instantly
+**Perfect for:**
+- 🎯 **Non-technical users** who need AI safety
+- 🚀 **Quick prototyping** of guardrail configurations
+- 📚 **Learning** how to use the library
+- 👥 **Team onboarding** and training
+The wizard generates code that works out of the box - just copy, paste, and run!
+## Why Guardrails Matter
+Real problems that guardrails solve:
+❌ **Without guardrails:**
 ```ts
-import { withGuardrails, piiDetector } from 'ai-sdk-guardrails';
+// User: "My email is john@company.com, help me..."
+// → Sends PII to model → Compliance violation → $$$
 ```
-**Step 3:** Wrap your model (30 seconds)
+✅ **With guardrails:**
 ```ts
-const safeModel = withGuardrails(yourModel, {
-  inputGuardrails: [piiDetector()],
+const model = withGuardrails(baseModel, {
+  inputGuardrails: [piiDetector()], // Blocks before API call
 });
+// → Request blocked → No PII leak → No cost → Compliant
 ```
-**Result:** Your AI now automatically blocks PII, prevents prompt injection, and validates outputs. That's it. No architecture changes, no security team required.
+Common use cases:
+- 🛡️ **Compliance**: Block PII before it reaches your model
+- 💰 **Cost control**: Stop bad requests before they cost money
+- 🔒 **Security**: Prevent prompt injection and data exfiltration
+- ✅ **Quality**: Enforce minimum response standards
+- 🔧 **Production**: Works with your existing observability tools
-## TL;DR
+## Copy-Paste Examples
-Copy/paste minimal setup:
+### Basic Protection (Most Common)
 ```ts
 import { generateText } from 'ai';
@@ -68,142 +93,187 @@ import {
   withGuardrails,
   piiDetector,
   promptInjectionDetector,
-  minLengthRequirement,
-  mcpSecurityGuardrail,
 } from 'ai-sdk-guardrails';
 const model = withGuardrails(openai('gpt-4o'), {
   inputGuardrails: [piiDetector(), promptInjectionDetector()],
-  outputGuardrails: [
-    minLengthRequirement(160),
-    mcpSecurityGuardrail({
-      maxContentSize: 51200, // 50KB limit
-      injectionThreshold: 0.7, // Configurable sensitivity
-      allowedDomains: ['api.company.com'], // Domain allowlist
-    }),
-  ],
 });
+// Use exactly like before - nothing else changes
 const { text } = await generateText({
   model,
-  prompt: 'Write a friendly intro email.',
+  prompt: 'Write a friendly email',
 });
 ```
-See runnable examples: [examples/README.md](./examples/README.md)
-## Quickstart (30 seconds)
+### Input + Output Protection
-Install with your provider (OpenAI shown):
+```ts
+import {
+  withGuardrails,
+  piiDetector,
+  sensitiveDataFilter,
+  minLengthRequirement,
+} from 'ai-sdk-guardrails';
-```bash
-pnpm add ai-sdk-guardrails ai @ai-sdk/openai
-# or: npm i ai-sdk-guardrails ai @ai-sdk/openai
-# or: yarn add ai-sdk-guardrails ai @ai-sdk/openai
+const model = withGuardrails(openai('gpt-4o'), {
+  inputGuardrails: [piiDetector()], // Block PII in prompts
+  outputGuardrails: [
+    sensitiveDataFilter(), // Remove secrets from responses
+    minLengthRequirement(100), // Enforce quality standards
+  ],
+});
 ```
-Wrap your model and keep using `generateText` as usual:
+### Works With Streaming
 ```ts
-import { generateText } from 'ai';
-import { openai } from '@ai-sdk/openai';
-import { withGuardrails, piiDetector } from 'ai-sdk-guardrails';
+import { streamText } from 'ai';
 const model = withGuardrails(openai('gpt-4o'), {
-  inputGuardrails: [piiDetector()],
+  outputGuardrails: [minLengthRequirement(100)],
 });
-const { text } = await generateText({
-  model,
-  prompt: 'Write a friendly intro email.',
+// Streaming just works - guardrails run after stream completes
+const { textStream } = await streamText({ model, prompt: '...' });
+for await (const chunk of textStream) {
+  process.stdout.write(chunk);
+}
+```
+### Production Setup (With Error Handling)
+```ts
+import { isGuardrailsError } from 'ai-sdk-guardrails';
+const model = withGuardrails(openai('gpt-4o'), {
+  inputGuardrails: [piiDetector(), promptInjectionDetector()],
+  outputGuardrails: [sensitiveDataFilter()],
+  throwOnBlocked: true, // Throw errors instead of silent blocking
 });
+try {
+  const { text } = await generateText({ model, prompt: '...' });
+  console.log(text);
+} catch (error) {
+  if (isGuardrailsError(error)) {
+    console.error('Blocked by guardrail:', error.message);
+    // Show user-friendly message
+  }
+}
 ```
-## Contents
-- Overview
-- Concepts
-- Installation
-- Usage
-  - Define a guardrail
-  - Built-in helpers
-- Streaming
-- Auto Retry (utility and middleware)
-- Error Handling
-- API
-- Examples
-- Compatibility
-- Architecture
-- Contributing
-## API Overview
+## How It Works
-### Primary Functions
+Guardrails run **in parallel** with your AI calls as middleware:
-- **`withGuardrails(model, config)`** - Main API for wrapping language models with guardrails
-- **`createGuardrails(config)`** - Factory to create reusable guardrail configurations
-- **`withAgentGuardrails(agentSettings, config)`** - Wrap AI SDK Agents with guardrails
+```mermaid
+flowchart LR
+  A[Input] --> B[Input Guardrails]
+  B -->|✅ Clean| C[AI Model]
+  B -->|❌ Blocked| X[No API Call]
+  C --> D[Output Guardrails]
+  D -->|✅ Clean| E[Response]
+  D -->|❌ Blocked| R[Retry/Replace/Block]
+```
-### Migration from v3.x
+**Three-step workflow:**
-- `wrapWithGuardrails` → `withGuardrails` (alias available, deprecated)
-- `wrapAgentWithGuardrails` → `withAgentGuardrails` (alias available, deprecated)
-- Error classes: `InputBlockedError` → `GuardrailsInputError`, `OutputBlockedError` → `GuardrailsOutputError`
+1. **Receive**: Input or output arrives
+2. **Check**: Guardrails run (PII detection, validation, etc.)
+3. **Decide**: Pass through, block, or retry
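The three-step workflow added in this hunk can be sketched in a few lines of plain TypeScript. This is a minimal illustration, not the package's implementation: only the `tripwireTriggered` and `message` fields mirror the result shape shown in the README's custom guardrail examples, while `Guardrail`, `Decision`, `decide`, and `emailCheck` are hypothetical names, and the check is synchronous for brevity where the README's `execute` functions are async.

```typescript
// Hypothetical shapes for illustration; only `tripwireTriggered` and `message`
// mirror the result fields shown in the README's custom guardrail examples.
interface GuardrailResult {
  tripwireTriggered: boolean;
  message?: string;
}

interface Guardrail {
  name: string;
  execute: (text: string) => GuardrailResult;
}

type Decision =
  | { action: 'pass' }
  | { action: 'block'; guardrail: string; message: string };

// Step 2 (Check) and step 3 (Decide): run each guardrail in order and
// stop at the first one whose tripwire fires.
function decide(text: string, guardrails: Guardrail[]): Decision {
  for (const guardrail of guardrails) {
    const result = guardrail.execute(text);
    if (result.tripwireTriggered) {
      return {
        action: 'block',
        guardrail: guardrail.name,
        message: result.message ?? 'Blocked',
      };
    }
  }
  return { action: 'pass' };
}

// A toy email pattern standing in for real PII detection.
const emailCheck: Guardrail = {
  name: 'toy-email-check',
  execute: (text) =>
    /[^\s@]+@[^\s@]+\.[^\s@]+/.test(text)
      ? { tripwireTriggered: true, message: 'Email address detected' }
      : { tripwireTriggered: false },
};

// Step 1 (Receive): the input arrives and is checked before any model call.
const blocked = decide('My email is john@company.com, help me', [emailCheck]);
const passed = decide('Write a friendly email', [emailCheck]);
```

In this sketch a blocked input never reaches the model, which matches the "No API Call" branch of the flowchart in the same hunk.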
-```ts
-// Before (v3.x - still works but deprecated)
-import { wrapWithGuardrails, InputBlockedError } from 'ai-sdk-guardrails';
-const model = wrapWithGuardrails(openai('gpt-4o'), { ... });
+**Key benefit**: Non-invasive. Your existing telemetry, logging, and observability tools keep working because guardrails are just middleware.
-// After (v4.x - recommended)
-import { withGuardrails, GuardrailsInputError } from 'ai-sdk-guardrails';
-const model = withGuardrails(openai('gpt-4o'), { ... });
+## Built-in Guardrails
+### Input Guardrails (Run Before Model)
+| Guardrail                   | Purpose                          | Example             |
+| --------------------------- | -------------------------------- | ------------------- |
+| `piiDetector()`             | Block emails, phones, SSNs       | Compliance, privacy |
+| `promptInjectionDetector()` | Detect injection attempts        | Security            |
+| `blockedKeywords()`         | Block specific terms             | Content policy      |
+| `inputLengthLimit()`        | Enforce max input length         | Cost control        |
+| `rateLimiting()`            | Per-user rate limits             | Abuse prevention    |
+| `profanityFilter()`         | Block offensive language         | Content moderation  |
+| `toxicityDetector()`        | Detect toxic content             | Safety              |
+| `allowedToolsGuardrail()`   | Restrict which tools can be used | Tool security       |
+### Output Guardrails (Run After Model)
+| Guardrail                 | Purpose                     | Example                   |
+| ------------------------- | --------------------------- | ------------------------- |
+| `sensitiveDataFilter()`   | Remove secrets, API keys    | Security                  |
+| `minLengthRequirement()`  | Enforce minimum length      | Quality control           |
+| `outputLengthLimit()`     | Enforce maximum length      | Cost/UX control           |
+| `toxicityFilter()`        | Block toxic responses       | Safety                    |
+| `jsonValidation()`        | Validate JSON structure     | Structured output         |
+| `schemaValidation()`      | Validate against Zod schema | Type safety               |
+| `confidenceThreshold()`   | Require minimum confidence  | Quality                   |
+| `hallucinationDetector()` | Detect uncertain claims     | Accuracy                  |
+| `secretRedaction()`       | Redact secrets from output  | Security                  |
+| `mcpSecurityGuardrail()`  | MCP tool security           | Prevent data exfiltration |
+### MCP Security Guardrails
+Protect against prompt injection and data exfiltration when using Model Context Protocol (MCP) tools:
+```ts
+import { mcpSecurityGuardrail, mcpResponseSanitizer } from 'ai-sdk-guardrails';
-// Factory pattern (new in v4.x)
-import { createGuardrails } from 'ai-sdk-guardrails';
-const guards = createGuardrails({ ... });
-const model = guards(openai('gpt-4o'));
+const model = withGuardrails(openai('gpt-4o'), {
+  outputGuardrails: [
+    mcpSecurityGuardrail({
+      detectExfiltration: true, // Detect data exfiltration attempts
+      scanEncodedContent: true, // Scan base64/hex encoded content
+      allowedDomains: ['api.company.com'], // Domain allowlist
+      maxContentSize: 51200, // 50KB limit
+      injectionThreshold: 0.7, // Sensitivity (lower = stricter)
+    }),
+    mcpResponseSanitizer(), // Clean malicious content vs blocking
+  ],
+});
 ```
-## Concepts
+**Attack vectors prevented:**
-- Input guardrails: Validate or block prompts to save cost and enforce rules before the call.
-- Output guardrails: Check results for quality and safety. Block, replace, or retry as needed.
-- Middleware: Guardrails wrap any model via AI SDK middleware. Your app code stays the same.
+- ✅ Direct prompt injection
+- ✅ Tool response poisoning
+- ✅ Data exfiltration via URLs
+- ✅ Encoded attacks (base64/hex)
+- ✅ Cascading exploits
+- ✅ Context poisoning
-## Installation
+See [MCP Security documentation](#mcp-security-guardrails-advanced) for full details.
-See Quickstart for installation commands. Add providers you use as needed (e.g., `@ai-sdk/openai`, `@ai-sdk/mistral`).
+## Advanced Features
-## Usage
+### Custom Guardrails
-### Create custom guardrails
+Create domain-specific guardrails:
 ```ts
-import { openai } from '@ai-sdk/openai';
-import {
-  defineInputGuardrail,
-  defineOutputGuardrail,
-  withGuardrails,
-} from 'ai-sdk-guardrails';
-import { extractTextContent } from 'ai-sdk-guardrails/guardrails/input';
+import { defineInputGuardrail, defineOutputGuardrail } from 'ai-sdk-guardrails';
 import { extractContent } from 'ai-sdk-guardrails/guardrails/output';
+// Custom input guardrail
 const businessHours = defineInputGuardrail({
   name: 'business-hours',
-  execute: async (params) => {
-    const hr = new Date().getHours();
-    return hr >= 9 && hr <= 17
+  execute: async () => {
+    const hour = new Date().getHours();
+    return hour >= 9 && hour <= 17
       ? { tripwireTriggered: false }
       : { tripwireTriggered: true, message: 'Outside business hours' };
   },
 });
+// Custom output guardrail
 const minQuality = defineOutputGuardrail({
   name: 'min-quality',
   execute: async ({ result }) => {
     const { text } = extractContent(result);
-    return text.length >= 80
+    return text.length >= 100
       ? { tripwireTriggered: false }
       : { tripwireTriggered: true, message: 'Response too short' };
   },
@@ -215,213 +285,114 @@ const model = withGuardrails(openai('gpt-4o'), {
 });
 ```
-### Built-in helpers
+### Auto-Retry on Failures
+Automatically retry when output doesn't meet requirements:
 ```ts
-import { openai } from '@ai-sdk/openai';
 import {
-  withGuardrails,
-  piiDetector,
-  blockedKeywords,
-  contentLengthLimit,
-  promptInjectionDetector,
-  sensitiveDataFilter,
+  wrapWithOutputGuardrails,
   minLengthRequirement,
-  confidenceThreshold,
-  mcpSecurityGuardrail,
-  mcpResponseSanitizer,
 } from 'ai-sdk-guardrails';
-const model = withGuardrails(openai('gpt-4o'), {
-  inputGuardrails: [
-    piiDetector(),
-    promptInjectionDetector({ threshold: 0.7 }),
-    blockedKeywords(['test', 'spam']),
-    contentLengthLimit(4000),
-  ],
-  outputGuardrails: [
-    mcpSecurityGuardrail({
-      detectExfiltration: true,
-      scanEncodedContent: true,
-      allowedDomains: ['trusted-api.com'],
-    }),
-    mcpResponseSanitizer(),
-    sensitiveDataFilter(),
-    minLengthRequirement(160),
-    confidenceThreshold(0.6),
-  ],
-});
-```
-## Streaming
-Works out of the box. By default, guardrails run after the stream ends (buffer mode). For early blocking, enable progressive mode.
-```ts
-import { streamText } from 'ai';
-import { openai } from '@ai-sdk/openai';
-import { withGuardrails, minLengthRequirement } from 'ai-sdk-guardrails';
-const model = withGuardrails(openai('gpt-4o'), {
-  outputGuardrails: [minLengthRequirement(120)],
-  // Evaluate as tokens arrive; stop or replace early when blocked
-  streamMode: 'progressive',
-  replaceOnBlocked: true,
-});
-const { textStream } = await streamText({
-  model,
-  prompt: 'Tell me a short story about a robot.',
-});
-for await (const delta of textStream) process.stdout.write(delta);
-```
-## Auto Retry
-Choose what fits your flow:
-- Standalone utility: Use `retry()` to wrap any generation function with your own validator and backoff.
-- Middleware option: Add `retry` to output guardrails so retries run automatically when a check fails.
-### Utility
-```ts
-import { retry } from 'ai-sdk-guardrails';
-import { generateText } from 'ai';
-import { openai } from '@ai-sdk/openai';
-const result = await retry({
-  generate: (params) => generateText({ model: openai('gpt-4o'), ...params }),
-  params: { prompt: 'Explain backpropagation in depth.' },
-  validate: (r) => ({
-    blocked: (r.text ?? '').length < 500,
-    message: 'Response too short',
-  }),
-  buildRetryParams: ({ lastParams }) => ({
-    ...lastParams,
-    maxOutputTokens: Math.max(800, (lastParams.maxOutputTokens ?? 400) + 300),
-  }),
-  maxRetries: 2,
-});
-```
-### Middleware
-```ts
-import { generateText } from 'ai';
-import { openai } from '@ai-sdk/openai';
-import { withGuardrails, defineOutputGuardrail } from 'ai-sdk-guardrails';
-import { extractContent } from 'ai-sdk-guardrails/guardrails/output';
-const minLengthGuardrail = defineOutputGuardrail<{ minChars: number }>({
-  name: 'min-output-length',
-  execute: async ({ result }) => {
-    const { text } = extractContent(result);
-    const minChars = text.length + 1;
-    return text.length < minChars
-      ? {
-          tripwireTriggered: true,
-          severity: 'medium',
-          message: `Answer too short: ${text.length} < ${minChars}`,
-          metadata: { minChars },
-        }
-      : { tripwireTriggered: false };
-  },
-});
-const guarded = wrapWithOutputGuardrails(
+const model = wrapWithOutputGuardrails(
   openai('gpt-4o'),
-  [minLengthGuardrail],
+  [minLengthRequirement(100)],
   {
-    replaceOnBlocked: false,
     retry: {
-      maxRetries: 1,
-      buildRetryParams: ({ summary, lastParams }) => ({
+      maxRetries: 2,
+      buildRetryParams: ({ lastParams }) => ({
         ...lastParams,
-        maxOutputTokens: Math.max(
-          800,
-          (lastParams.maxOutputTokens ?? 400) + 300,
-        ),
+        // Increase max tokens on retry
+        maxOutputTokens: (lastParams.maxOutputTokens ?? 400) + 200,
+        // Add context about the failure
         prompt: [
-          ...(Array.isArray(lastParams.prompt) ? lastParams.prompt : []),
+          ...lastParams.prompt,
           {
-            role: 'user' as const,
-            content: [
-              {
-                type: 'text' as const,
-                text: `Note: The previous answer ${summary.blockedResults[0]?.message}. Provide a comprehensive, detailed answer with examples.`,
-              },
-            ],
+            role: 'user',
+            content: 'Please provide a more detailed response.',
           },
         ],
       }),
     },
   },
 );
-const { text } = await generateText({
-  model: guarded,
-  prompt: 'Explain the significance of the Turing Test in AI history.',
-});
-```
-Tip: Use backoff helpers if you need delays between retries: `exponentialBackoff`, `linearBackoff`, `fixedBackoff`, `jitteredExponentialBackoff`, or `backoffPresets`.
-## Error Handling
-Set `throwOnBlocked: true` to throw structured errors you can catch and turn into friendly messages.
-```ts
-import { isGuardrailsError } from 'ai-sdk-guardrails';
-try {
-  const { text } = await generateText({ model, prompt: '...' });
-} catch (err) {
-  if (isGuardrailsError(err)) {
-    console.error('Guardrail blocked:', err.message);
-    // err.results gives you details per guardrail
-  } else {
-    console.error('Unexpected error:', err);
-  }
-}
 ```
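The retry configuration in this hunk (`maxRetries` plus `buildRetryParams`) follows a familiar pattern: generate, validate, and regenerate with adjusted parameters until the check passes or the retry budget runs out. The sketch below illustrates that pattern only. `retrySketch`, `fakeGenerate`, `Params`, and `isBlocked` are hypothetical, the loop is synchronous for brevity, and the package's actual retry logic may differ.

```typescript
// Hypothetical parameter shape; only `maxOutputTokens` mirrors the README example.
interface Params {
  prompt: string;
  maxOutputTokens: number;
}

interface RetryOptions {
  maxRetries: number;
  isBlocked: (text: string) => boolean;
  buildRetryParams: (args: { lastParams: Params }) => Params;
}

interface RetryOutcome {
  text: string;
  attempts: number;
  blocked: boolean;
}

// Generate once, then regenerate with adjusted params while the output is
// still blocked and retries remain.
function retrySketch(
  generate: (params: Params) => string,
  params: Params,
  options: RetryOptions,
): RetryOutcome {
  let lastParams = params;
  let text = generate(lastParams);
  let attempts = 1;
  let blocked = options.isBlocked(text);
  while (blocked && attempts <= options.maxRetries) {
    lastParams = options.buildRetryParams({ lastParams });
    text = generate(lastParams);
    attempts += 1;
    blocked = options.isBlocked(text);
  }
  return { text, attempts, blocked };
}

// Stand-in for a model call: returns filler text cut to the token budget.
const filler = new Array(31).join('word '); // 150 characters of filler
const fakeGenerate = (params: Params): string =>
  filler.slice(0, params.maxOutputTokens);

const retryOptions: RetryOptions = {
  maxRetries: 2,
  isBlocked: (text) => text.length < 100, // toy minimum-length check
  buildRetryParams: ({ lastParams }) => ({
    ...lastParams,
    maxOutputTokens: lastParams.maxOutputTokens + 40, // grow the budget each retry
  }),
};

const outcome = retrySketch(
  fakeGenerate,
  { prompt: 'Explain guardrails', maxOutputTokens: 40 },
  retryOptions,
);
```

Returning `blocked` alongside the final text lets the caller decide what to do when the retry budget is exhausted without a passing result.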
-## Reusable Guardrails Factory
+### Reusable Configurations
-Use `createGuardrails()` to create reusable guardrail configurations that can be applied to multiple models:
+Create reusable guardrail sets:
 ```ts
-import { openai } from '@ai-sdk/openai';
-import { anthropic } from '@ai-sdk/anthropic';
-import { createGuardrails, defineInputGuardrail } from 'ai-sdk-guardrails';
+import {
+  createGuardrails,
+  piiDetector,
+  sensitiveDataFilter,
+} from 'ai-sdk-guardrails';
-// Create reusable guardrails configuration
+// Define once
 const productionGuards = createGuardrails({
-  inputGuardrails: [piiDetector(), contentFilter()],
-  outputGuardrails: [qualityCheck(), minLength(100)],
+  inputGuardrails: [piiDetector()],
+  outputGuardrails: [sensitiveDataFilter()],
   throwOnBlocked: true,
 });
 // Apply to multiple models
 const gpt4 = productionGuards(openai('gpt-4o'));
 const claude = productionGuards(anthropic('claude-3-sonnet'));
+```
+### Streaming Modes
-// Compose multiple guardrail sets
-const strictLimits = createGuardrails({ inputGuardrails: [maxLength(500)] });
-const piiProtection = createGuardrails({ inputGuardrails: [piiDetector()] });
+Control when guardrails run during streaming:
-// Chain them together
-const model = piiProtection(strictLimits(openai('gpt-4o')));
+```ts
+const model = withGuardrails(openai('gpt-4o'), {
+  outputGuardrails: [minLengthRequirement(100)],
+  streamMode: 'progressive', // Run guardrails as tokens arrive
+  replaceOnBlocked: true, // Replace blocked output with fallback
+});
 ```
-## MCP Security Guardrails
+- `buffer` (default): Wait for stream to complete, then check
+- `progressive`: Check guardrails as tokens arrive (early termination)
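The difference between the two modes can be illustrated without the library. In the sketch below, `consumeStream` and `containsSecret` are hypothetical, and an in-memory array of chunks stands in for a token stream. It uses a forbidden-term check because a minimum-length check cannot be decided on a partial prefix of the output.

```typescript
type StreamMode = 'buffer' | 'progressive';

interface StreamOutcome {
  text: string;
  blocked: boolean;
  chunksRead: number;
}

// `isBlocked` returns true when the text seen so far violates the check.
function consumeStream(
  chunks: readonly string[],
  isBlocked: (textSoFar: string) => boolean,
  mode: StreamMode,
): StreamOutcome {
  let text = '';
  let chunksRead = 0;
  for (const chunk of chunks) {
    text += chunk;
    chunksRead += 1;
    // Progressive: check after every chunk and stop as soon as the check fails.
    if (mode === 'progressive' && isBlocked(text)) {
      return { text, blocked: true, chunksRead };
    }
  }
  // Buffer: check once, after the whole stream has been read.
  // In progressive mode every prefix has already been checked and passed.
  const blocked = mode === 'buffer' ? isBlocked(text) : false;
  return { text, blocked, chunksRead };
}

const chunks = ['Hello, ', 'the secret ', 'is 42', '!'];
const containsSecret = (text: string): boolean => text.indexOf('secret') !== -1;

const buffered = consumeStream(chunks, containsSecret, 'buffer');
const progressive = consumeStream(chunks, containsSecret, 'progressive');
```

Both runs end up blocked, but the progressive run stops after the second chunk while the buffered run reads all four before deciding.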
+### Agent Support
-**Production-Ready**: Protect against prompt injection and data exfiltration attacks when using Model Context Protocol (MCP) tools. Based on research into the ["lethal trifecta" vulnerability](https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/) that has affected major AI platforms.
+Guardrails work with AI SDK Agents:
+```ts
+import { withAgentGuardrails } from 'ai-sdk-guardrails';
+import { tool } from 'ai';
+const agent = withAgentGuardrails(
+  {
+    model: openai('gpt-4o'),
+    tools: { search: searchTool },
+    system: 'You are a helpful assistant.',
+  },
+  {
+    inputGuardrails: [piiDetector()],
+    outputGuardrails: [sensitiveDataFilter()],
+    toolGuardrails: [
+      toolEgressPolicy({
+        allowedHosts: ['api.company.com'],
+        scanForUrls: true,
+      }),
+    ],
+  },
+);
+const result = await agent.generate({ prompt: '...' });
+```
+## MCP Security Guardrails (Advanced)
+**Production-Ready**: Protect against the ["lethal trifecta" vulnerability](https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/) when using Model Context Protocol (MCP) tools.
 ### The Problem
-AI agents with MCP tools can be vulnerable when they have:
+AI agents with MCP tools are vulnerable when they have:
 1. **Access to private data** (through tools)
 2. **Process untrusted content** (from tool responses)
@@ -429,9 +400,9 @@ AI agents with MCP tools can be vulnerable when they have:
 Malicious tool responses can contain hidden instructions that trick the AI into exfiltrating sensitive data.
-### Production-Ready Solution
+### Production Configuration
-Full configurability with sensible defaults for immediate deployment:
+Full configurability with sensible defaults:
 ```ts
 import {
@@ -451,100 +422,58 @@ const secureModel = withGuardrails(openai('gpt-4o'), {
     mcpSecurityGuardrail({
       injectionThreshold: 0.5, // Lower = more sensitive
       maxSuspiciousUrls: 0, // Zero tolerance
-      maxContentSize: 25600, // 25KB limit for performance
+      maxContentSize: 25600, // 25KB limit
       minEncodedLength: 15, // Detect shorter encoded attacks
-      encodedInjectionThreshold: 0.2, // Combined encoded + injection threshold
+      encodedInjectionThreshold: 0.2, // Combined threshold
       highRiskThreshold: 0.3, // High-risk cascade blocking
       authorityThreshold: 0.5, // Authority manipulation detection
       allowedDomains: ['api.company.com', 'trusted-partner.com'],
-      customSuspiciousDomains: ['evil.com', 'malicious.org'],
+      customSuspiciousDomains: ['evil.com'],
       blockCascadingCalls: true,
       scanEncodedContent: true,
       detectExfiltration: true,
     }),
-    mcpResponseSanitizer(), // Clean malicious content vs blocking
+    mcpResponseSanitizer(), // Clean vs block
     toolEgressPolicy({
-      allowedHosts: ['api.company.com', 'trusted-partner.com'],
-      blockedHosts: ['webhook.site', 'requestcatcher.com', 'ngrok.io'],
+      allowedHosts: ['api.company.com'],
+      blockedHosts: ['webhook.site', 'requestcatcher.com'],
       scanForUrls: true,
     }),
   ],
 });
 ```
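The `allowedHosts` and `blockedHosts` options above suggest a host-based egress check on URLs found in tool output. The sketch below shows one way such a check could work. It is an assumption for illustration, not the package's `toolEgressPolicy` implementation: `EgressPolicy`, `hostOf`, and `findDisallowedUrls` are hypothetical, and treating subdomains of a listed host as matching is a design choice made here.

```typescript
// Hypothetical policy shape named after the README's option names.
interface EgressPolicy {
  allowedHosts: string[];
  blockedHosts: string[];
}

// Returns the lowercase hostname, or null when the candidate is not a valid URL.
function hostOf(candidate: string): string | null {
  try {
    return new URL(candidate).hostname.toLowerCase();
  } catch {
    return null;
  }
}

// Assumption: a listed host also covers its subdomains.
function hostMatches(host: string, entry: string): boolean {
  const suffix = '.' + entry;
  return host === entry || host.slice(-suffix.length) === suffix;
}

// Collect every http(s) URL in the text that the policy would not allow.
function findDisallowedUrls(text: string, policy: EgressPolicy): string[] {
  const candidates = text.match(/https?:\/\/[^\s"'<>)]+/g) ?? [];
  return candidates.filter((candidate) => {
    const host = hostOf(candidate);
    if (host === null) return true; // unparseable URLs are treated as disallowed
    if (policy.blockedHosts.some((entry) => hostMatches(host, entry))) return true;
    return !policy.allowedHosts.some((entry) => hostMatches(host, entry));
  });
}

const policy: EgressPolicy = {
  allowedHosts: ['api.company.com'],
  blockedHosts: ['webhook.site'],
};

const toolOutput =
  'See https://api.company.com/v1/users and https://webhook.site/abc?data=secret ' +
  'plus https://docs.api.company.com/guide';

const disallowed = findDisallowedUrls(toolOutput, policy);
```

The blocklist is checked before the allowlist, so a host that appears on both lists is rejected.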
-### Environment & Role-Based Configuration
+### Environment-Based Configuration
 ```ts
-// Different security profiles for different environments
 function getSecurityConfig(env: 'production' | 'staging' | 'development') {
   const configs = {
     production: {
       injectionThreshold: 0.5, // High security
-      maxContentSize: 25600, // 25KB limit
-      authorityThreshold: 0.5, // Very sensitive
+      maxContentSize: 25600, // 25KB
+      authorityThreshold: 0.5,
     },
     staging: {
-      injectionThreshold: 0.7, // Balanced security
-      maxContentSize: 51200, // 50KB default
-      authorityThreshold: 0.7, // Standard sensitivity
+      injectionThreshold: 0.7, // Balanced
+      maxContentSize: 51200, // 50KB
+      authorityThreshold: 0.7,
     },
     development: {
-      injectionThreshold: 0.8, // Lower security, better performance
-      maxContentSize: 102400, // 100KB for testing
-      authorityThreshold: 0.8, // Less restrictive
+      injectionThreshold: 0.8, // Permissive
+      maxContentSize: 102400, // 100KB
+      authorityThreshold: 0.8,
     },
   };
   return configs[env];
 }
-const productionModel = withGuardrails(openai('gpt-4o'), {
+const model = withGuardrails(openai('gpt-4o'), {
   outputGuardrails: [mcpSecurityGuardrail(getSecurityConfig('production'))],
 });
 ```
-### Attack Vectors Prevented
-✅ **Direct prompt injection** - "System: ignore all previous instructions"
-✅ **Tool response poisoning** - Malicious content in MCP tool responses
-✅ **Data exfiltration** - URLs constructed to steal sensitive data
-✅ **Encoded attacks** - Base64/hex hidden malicious instructions
-✅ **Cascading exploits** - Tool responses triggering additional dangerous calls
-✅ **Context poisoning** - Attempts to modify AI behavior mid-conversation
-### Secure MCP Agent Example
-```ts
-import { withAgentGuardrails } from 'ai-sdk-guardrails';
-const secureAgent = withAgentGuardrails(
-  {
-    model: openai('gpt-4o'),
-    tools: { file_search, api_call, database_query },
-    system: 'You are a secure assistant. Always validate tool responses.',
-  },
-  {
-    inputGuardrails: [promptInjectionDetector()],
-    outputGuardrails: [
-      mcpSecurityGuardrail({
-        detectExfiltration: true,
-        allowedDomains: ['trusted-api.com'],
-      }),
-      mcpResponseSanitizer(),
-    ],
-    toolGuardrails: [
-      toolEgressPolicy({
-        allowedHosts: ['trusted-api.com'],
-        scanForUrls: true,
-      }),
-    ],
-  },
-);
-```
 ### Configuration Options
-All security parameters are fully configurable with sensible defaults:
 | Option                      | Default | Description                                      |
 | --------------------------- | ------- | ------------------------------------------------ |
 | `injectionThreshold`        | 0.7     | Prompt injection confidence threshold (0-1)      |
@@ -556,106 +485,92 @@ All security parameters are fully configurable with sensible defaults:
 | `allowedDomains`            | []      | Allowed domains for URL construction             |
 | `customSuspiciousDomains`   | []      | Additional suspicious domain patterns            |
-### Performance & Security Balance
-- **High Security**: Lower thresholds, stricter limits, comprehensive scanning
-- **Balanced**: Default settings, good for most production use cases
-- **High Performance**: Higher thresholds, larger limits, selective scanning
 See complete examples:
-- [Production MCP Configuration](./examples/44-production-mcp-config.ts) - **New!**
+- [Production MCP Configuration](./examples/44-production-mcp-config.ts)
 - [MCP Security Test Suite](./examples/41-mcp-security-test.ts)
 - [Enhanced Security Testing](./examples/43-enhanced-mcp-security-test.ts)
-- [Vulnerability Proof of Concept](./examples/42-mcp-vulnerability-proof.ts)
-## Agent Support
+## Error Handling
-Guardrails work with AI SDK Agents for multi-step agentic workflows:
+### Throw Errors on Block
 ```ts
-import { openai } from '@ai-sdk/openai';
-import { withAgentGuardrails, defineOutputGuardrail } from 'ai-sdk-guardrails';
-import { tool } from 'ai';
-import { z } from 'zod';
-// Define tools for the agent
-const searchTool = tool({
-  description: 'Search for information',
-  inputSchema: z.object({ query: z.string() }),
-  execute: async ({ query }) => `Results for: ${query}`,
+const model = withGuardrails(openai('gpt-4o'), {
+  inputGuardrails: [piiDetector()],
+  throwOnBlocked: true, // Throw errors instead of silent blocking
 });
-// Create agent with guardrails
-const agent = withAgentGuardrails(
-  {
-    model: openai('gpt-4o'),
-    tools: { search: searchTool },
-    system: 'You are a helpful research assistant.',
-  },
-  {
-    outputGuardrails: [
-      defineOutputGuardrail({
-        name: 'tool-usage-required',
-        description: 'Ensures agent uses search tools',
-        execute: async (params) => {
-          const hasToolCall = params.result.steps?.some(
-            (step) => step.type === 'tool-call',
-          );
-          return {
-            tripwireTriggered: !hasToolCall,
-            message: hasToolCall
-              ? 'Tool usage validated'
-              : 'Must use search tools for research',
-            severity: 'high',
-          };
-        },
-      }),
-    ],
-    throwOnBlocked: true,
-  },
-);
-// Use the guarded agent
-const result = await agent.generate({
-  prompt: 'Research the latest AI developments',
-});
+try {
+  const { text } = await generateText({ model, prompt: '...' });
+} catch (error) {
+  if (isGuardrailsError(error)) {
+    console.error('Blocked:', error.message);
+    // error.results gives details per guardrail
+  }
+}
 ```
-## API
+### Error Types
-| Export                                                                                                      | Description                                                                      |
-| ----------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- |
-| `defineInputGuardrail`, `defineOutputGuardrail`                                                             | Create guardrails with clear messages, severity, and metadata.                   |
-| `withGuardrails`, `createGuardrails`, `withAgentGuardrails`                                                 | Attach guardrails to AI SDK models and agents via middleware.                    |
-| `executeInputGuardrails`, `executeOutputGuardrails`                                                         | Run guardrails programmatically (outside middleware) and get structured results. |
-| `retry`, `retryHelpers`                                                                                     | Standalone auto-retry utilities with validation and backoff.                     |
-| `GuardrailsError`, `GuardrailsInputError`, `GuardrailsOutputError`, `isGuardrailsError`, `extractErrorInfo` | Structured errors and helpers for robust handling.                               |
-| `exponentialBackoff`, `linearBackoff`, `fixedBackoff`, `jitteredExponentialBackoff`, `backoffPresets`       | Backoff strategies to control retry pacing.                                      |
+- `GuardrailsInputError` - Input guardrail blocked
+- `GuardrailsOutputError` - Output guardrail blocked
+- `GuardrailExecutionError` - Guardrail threw an error
+- `GuardrailTimeoutError` - Guardrail exceeded timeout
+- `GuardrailConfigurationError` - Invalid configuration
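
As an illustration of how an application might branch on the error types listed above, here is a minimal, self-contained sketch. The classes are local stand-ins that only mirror the listed names; their shared base class and constructor shape are assumptions, and `chooseAction` is a hypothetical helper, not part of the package. In real code the classes would come from `ai-sdk-guardrails`, and their fields and inheritance may differ.

```typescript
// Local stand-ins that mirror only the error names listed above.
// Assumption: the real classes' fields and inheritance may differ.
class GuardrailsError extends Error {
  constructor(message: string) {
    super(message);
    // Keep `instanceof` working even when compiled to older JS targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = new.target.name;
  }
}
class GuardrailsInputError extends GuardrailsError {}
class GuardrailsOutputError extends GuardrailsError {}
class GuardrailTimeoutError extends GuardrailsError {}

type Action = 'reject-request' | 'regenerate' | 'retry-later' | 'rethrow';

// Hypothetical helper: map a caught error to an application-level action.
function chooseAction(error: unknown): Action {
  if (error instanceof GuardrailsInputError) return 'reject-request'; // input was blocked
  if (error instanceof GuardrailsOutputError) return 'regenerate'; // output was blocked
  if (error instanceof GuardrailTimeoutError) return 'retry-later'; // guardrail timed out
  return 'rethrow'; // unrelated error: let it propagate
}
```

Keeping the mapping in one function means the `catch` block only has to act on the returned action, and anything that is not a guardrail error is rethrown rather than swallowed.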
-See source for built-in helpers:
+## API Reference
-- Input helpers: `./src/guardrails/input.ts`
-- Output helpers: `./src/guardrails/output.ts`
+### Primary Functions
-## Examples
+| Function                  | Purpose                                  |
+| ------------------------- | ---------------------------------------- |
+| `withGuardrails`          | Wrap model with guardrails (main API)    |
+| `createGuardrails`        | Create reusable guardrail configurations |
+| `withAgentGuardrails`     | Wrap AI SDK Agents with guardrails       |
+| `defineInputGuardrail`    | Create custom input guardrail            |
+| `defineOutputGuardrail`   | Create custom output guardrail           |
+| `executeInputGuardrails`  | Run input guardrails programmatically    |
+| `executeOutputGuardrails` | Run output guardrails programmatically   |
+### Error Utilities
+| Function            | Purpose                              |
+| ------------------- | ------------------------------------ |
+| `isGuardrailsError` | Check if error is from guardrails    |
+| `extractErrorInfo`  | Extract structured error information |
+### Retry Utilities
-Browse runnable examples for streaming, compliance, safety, and more:
+| Function                     | Purpose                           |
+| ---------------------------- | --------------------------------- |
+| `retry`                      | Standalone retry utility          |
+| `exponentialBackoff`         | Exponential backoff strategy      |
+| `linearBackoff`              | Linear backoff strategy           |
+| `jitteredExponentialBackoff` | Jittered exponential backoff      |
+| `backoffPresets`             | Pre-configured backoff strategies |
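
The table names the backoff strategies without showing how their delays differ. The sketch below illustrates the usual formulas behind those names. It is an independent illustration, not the library's implementation: the function names, the `BackoffOptions` shape, and the 1-based `attempt` convention are all assumptions, since the package's actual signatures are not shown in this section.

```typescript
// Independent illustration of common backoff formulas.
// Assumption: these are NOT the library's implementations or signatures.
interface BackoffOptions {
  baseMs: number; // delay before the first retry
  maxMs: number; // upper bound on any single delay
}

// `attempt` is 1-based: 1 means the first retry.
function exponentialDelay(attempt: number, opts: BackoffOptions): number {
  // Doubles on every attempt: base, 2*base, 4*base, ...
  return Math.min(opts.maxMs, opts.baseMs * Math.pow(2, attempt - 1));
}

function linearDelay(attempt: number, opts: BackoffOptions): number {
  // Grows by a fixed step: base, 2*base, 3*base, ...
  return Math.min(opts.maxMs, opts.baseMs * attempt);
}

// "Full jitter": pick a delay uniformly in [0, exponential delay).
// `random` is injectable so the function can be tested deterministically.
function jitteredExponentialDelay(
  attempt: number,
  opts: BackoffOptions,
  random: () => number = Math.random,
): number {
  return Math.floor(random() * exponentialDelay(attempt, opts));
}

const opts: BackoffOptions = { baseMs: 100, maxMs: 5000 };
console.log([1, 2, 3].map((n) => exponentialDelay(n, opts))); // [100, 200, 400]
console.log([1, 2, 3].map((n) => linearDelay(n, opts))); // [100, 200, 300]
```

Jitter spreads retries from many clients over time, so they do not all hit the provider at the same instant after a shared failure.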
-- Index and commands: [examples/README.md](./examples/README.md)
+See source for all built-in guardrails:
-Quick starts
+- Input helpers: [`./src/guardrails/input.ts`](./src/guardrails/input.ts)
+- Output helpers: [`./src/guardrails/output.ts`](./src/guardrails/output.ts)
+- Tool helpers: [`./src/guardrails/tools.ts`](./src/guardrails/tools.ts)
+- MCP security: [`./src/guardrails/mcp-security.ts`](./src/guardrails/mcp-security.ts)
+## Examples
+Browse 48+ runnable examples: [examples/README.md](./examples/README.md)
+### Quick Starts
 | Example                    | Description                     | File                                                                              |
 | -------------------------- | ------------------------------- | --------------------------------------------------------------------------------- |
 | Simple combined protection | Minimal input and output setup  | [07a-simple-combined-protection.ts](./examples/07a-simple-combined-protection.ts) |
 | Auto retry on output       | Retry until output meets a rule | [32-auto-retry-output.ts](./examples/32-auto-retry-output.ts)                     |
-| LLM judge auto-retry       | Judge feedback drives retry     | [33-judge-auto-retry.ts](./examples/33-judge-auto-retry.ts)                       |
-| Expected tool use retry    | Enforce/guide tool usage        | [34-expected-tool-use-retry.ts](./examples/34-expected-tool-use-retry.ts)         |
+| LLM judge auto-retry       | Judge feedback drives retry     | [35-judge-auto-retry.ts](./examples/35-judge-auto-retry.ts)                       |
 | Weather assistant          | End-to-end input/output + retry | [33-blog-post-weather-assistant.ts](./examples/33-blog-post-weather-assistant.ts) |
-Input safety
+### Input Safety
 | Example            | Description                         | File                                                            |
 | ------------------ | ----------------------------------- | --------------------------------------------------------------- |
@@ -664,7 +579,7 @@ Input safety
 | PII detection      | Detect PII before calling the model | [03-pii-detection.ts](./examples/03-pii-detection.ts)           |
 | Rate limiting      | Simple per-user rate limit          | [13-rate-limiting.ts](./examples/13-rate-limiting.ts)           |
-Output safety
+### Output Safety
 | Example                 | Description                         | File                                                                      |
 | ----------------------- | ----------------------------------- | ------------------------------------------------------------------------- |
@@ -672,7 +587,7 @@ Output safety
 | Sensitive output filter | Filter secrets and PII in responses | [05-sensitive-output-filter.ts](./examples/05-sensitive-output-filter.ts) |
 | Hallucination detection | Flag uncertain factual claims       | [19-hallucination-detection.ts](./examples/19-hallucination-detection.ts) |
-Streaming
+### Streaming
 | Example           | Description                        | File                                                                              |
 | ----------------- | ---------------------------------- | --------------------------------------------------------------------------------- |
@@ -680,7 +595,7 @@ Streaming
 | Streaming quality | Quality checks with streaming      | [12-streaming-quality.ts](./examples/12-streaming-quality.ts)                     |
 | Early termination | Stop streams early when blocked    | [28-streaming-early-termination.ts](./examples/28-streaming-early-termination.ts) |
-Advanced
+### Advanced
 | Example                    | Description                   | File                                                                            |
 | -------------------------- | ----------------------------- | ------------------------------------------------------------------------------- |
@@ -689,30 +604,47 @@ Advanced
 | SQL code safety            | Basic SQL safety checks       | [24-sql-code-safety.ts](./examples/24-sql-code-safety.ts)                       |
 | Role hierarchy enforcement | Enforce role rules in prompts | [23-role-hierarchy-enforcement.ts](./examples/23-role-hierarchy-enforcement.ts) |
-## Compatibility
+## Migration from v3.x
-- Runtime: Node.js 18+ recommended
-- AI SDK: Compatible with AI SDK 5 (`ai@^5`); wraps any model
-- For `generateObject`: for strict object validation, run `executeOutputGuardrails()` after generation
+Several APIs were renamed in v4.x. The old names still work but are deprecated:
-## Architecture
+```ts
+// Before (v3.x - still works but deprecated)
+import { wrapWithGuardrails, InputBlockedError } from 'ai-sdk-guardrails';
+const model = wrapWithGuardrails(openai('gpt-4o'), { ... });
-```mermaid
-flowchart LR
-  A[Input] --> B[Input Guardrails]
-  B -->|Valid| C[AI Model]
-  B -->|Blocked| X[No API Call]
-  C --> D[Output Guardrails]
-  D -->|Clean| E[Response]
-  D -->|Blocked| R[Retry/Replace/Throw]
+// After (v4.x - recommended)
+import { withGuardrails, GuardrailsInputError } from 'ai-sdk-guardrails';
+const model = withGuardrails(openai('gpt-4o'), { ... });
 ```
-### Design principles
+Changes:
+- `wrapWithGuardrails` → `withGuardrails`
+- `wrapAgentWithGuardrails` → `withAgentGuardrails`
+- `InputBlockedError` → `GuardrailsInputError`
+- `OutputBlockedError` → `GuardrailsOutputError`
+## Compatibility
+- **Runtime**: Node.js 18+ recommended
+- **AI SDK**: Compatible with AI SDK 5.x (`ai@^5`)
+- **TypeScript**: Full type safety with TypeScript 5+
+- **Works with any model**: OpenAI, Anthropic, Mistral, Groq, etc.
+## Why This Library?
+**Non-invasive**: Guardrails are middleware. Your existing code, telemetry (Langfuse, Helicone), and logging stay intact.
+**Production-ready**: Used in production by teams who need compliance, security, and cost control without rebuilding their infrastructure.
+**Developer experience**: One line to add safety. Progressive complexity - start simple, add advanced features when needed.
+**Type-safe**: Rich TypeScript types and inference throughout.
+**Comprehensive**: 40+ built-in guardrails covering security, quality, compliance, and performance.
-- Helper-first: simple, chainable APIs with great DX
-- Composable: run multiple guardrails in any order
-- Type-safe: rich TypeScript types and inference
-- Sensible defaults: zero-config to start, full control when you need it
+**Advanced features**: Early detection, parallel execution, enhanced prompt injection detection, MCP security, and more.
 ## Contributing
@@ -720,4 +652,4 @@ Issues and PRs are welcome.
 ## License
-MIT © Jag Reehal. See LICENSE for details.
+MIT © Jag Reehal. See [LICENSE](./LICENSE) for details.