ai-sdk-guardrails 5.0.0 → 5.0.2
- package/README.md +368 -436
- package/dist/chunk-CSUDFTRH.js +1067 -0
- package/dist/chunk-LVXHWHZC.js +355 -0
- package/dist/chunk-ND2ICBTR.js +1209 -0
- package/dist/chunk-VKJ5EAS7.js +397 -0
- package/dist/guardrails/input.cjs +1109 -0
- package/dist/guardrails/input.d.cts +133 -0
- package/dist/guardrails/input.d.ts +133 -0
- package/dist/guardrails/input.js +37 -0
- package/dist/guardrails/output.cjs +1260 -0
- package/dist/guardrails/output.d.cts +100 -0
- package/dist/guardrails/output.d.ts +100 -0
- package/dist/guardrails/output.js +55 -0
- package/dist/guardrails/tools.cjs +658 -0
- package/dist/guardrails/tools.d.cts +60 -0
- package/dist/guardrails/tools.d.ts +60 -0
- package/dist/guardrails/tools.js +10 -0
- package/dist/index.cjs +6046 -0
- package/dist/index.d.cts +1043 -0
- package/dist/index.d.ts +1043 -0
- package/dist/index.js +3063 -0
- package/dist/types-CeZi2BBN.d.cts +727 -0
- package/dist/types-CeZi2BBN.d.ts +727 -0
- package/package.json +31 -22
package/README.md
CHANGED
|
@@ -1,10 +1,8 @@
|
|
|
1
1
|
# AI SDK Guardrails
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
## Safety and quality controls for Vercel AI SDK
|
|
4
4
|
|
|
5
|
-
Add
|
|
6
|
-
|
|
7
|
-
**Now includes MCP (Model Context Protocol) security guardrails** to help protect against attacks when using AI tools.
|
|
5
|
+
Add guardrails to your AI applications in one line of code. Block PII, prevent prompt injection, enforce output quality - while keeping your existing telemetry and observability stack intact.
|
|
8
6
|
|
|
9
7
|
[](https://www.npmjs.com/package/ai-sdk-guardrails)
|
|
10
8
|
[](https://www.npmjs.com/package/ai-sdk-guardrails)
|
|
@@ -14,52 +12,79 @@ Add safety checks and quality controls to your AI applications. Guard against pr
|
|
|
14
12
|
|
|
15
13
|

|
|
16
14
|
|
|
17
|
-
##
|
|
15
|
+
## Drop-in Guardrails for any AI model
|
|
18
16
|
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
- **Save costs**: Block unnecessary requests before they hit your model
|
|
23
|
-
- **Improve safety**: Detect PII, block harmful content, prevent prompt injection
|
|
24
|
-
- **Better quality**: Enforce minimum response lengths, validate structure, auto-retry on failures
|
|
25
|
-
- **Easy integration**: Works as middleware with any AI SDK model
|
|
17
|
+
```ts
|
|
18
|
+
import { withGuardrails, piiDetector } from 'ai-sdk-guardrails';
|
|
19
|
+
const model = openai('gpt-4o'); // or any other AI model
|
|
26
20
|
|
|
27
|
-
|
|
21
|
+
// Everything else stays the same
|
|
22
|
+
const safeModel = withGuardrails(model, {
|
|
23
|
+
inputGuardrails: [piiDetector()],
|
|
24
|
+
});
|
|
28
25
|
|
|
29
|
-
|
|
30
|
-
|
|
31
|
-
|
|
32
|
-
- Prompt injection prevention
|
|
33
|
-
- Tool usage validation
|
|
34
|
-
- Auto-retry on low-quality responses
|
|
26
|
+
// Your existing code, telemetry, and logging still works
|
|
27
|
+
await generateText({ model: safeModel, prompt: '...' });
|
|
28
|
+
```
|
|
35
29
|
|
|
36
|
-
|
|
30
|
+
**That's it.** Your AI now blocks PII automatically.
|
|
37
31
|
|
|
38
|
-
|
|
32
|
+
## Installation
|
|
39
33
|
|
|
40
34
|
```bash
|
|
41
35
|
npm install ai-sdk-guardrails
|
|
42
36
|
```
|
|
43
37
|
|
|
44
|
-
|
|
38
|
+
## 🧙‍♂️ No-Code Wizard (New!)
|
|
39
|
+
|
|
40
|
+
**Don't want to write code?** Use our visual wizard to configure guardrails:
|
|
41
|
+
|
|
42
|
+
1. **Open the wizard**: [wizard-prototype/index.html](./wizard-prototype/index.html)
|
|
43
|
+
2. **Choose your use case**: Content moderation, data protection, quality assurance, or security
|
|
44
|
+
3. **Select guardrails**: Pick from 40+ built-in guardrails
|
|
45
|
+
4. **Configure settings**: Adjust thresholds and parameters with sliders and toggles
|
|
46
|
+
5. **Copy generated code**: Get production-ready TypeScript code instantly
|
|
47
|
+
|
|
48
|
+
**Perfect for:**
|
|
49
|
+
|
|
50
|
+
- 🎯 **Non-technical users** who need AI safety
|
|
51
|
+
- 🚀 **Quick prototyping** of guardrail configurations
|
|
52
|
+
- 📚 **Learning** how to use the library
|
|
53
|
+
- 👥 **Team onboarding** and training
|
|
54
|
+
|
|
55
|
+
The wizard generates code that works out of the box - just copy, paste, and run!
|
|
56
|
+
|
|
57
|
+
## Why Guardrails Matter
|
|
58
|
+
|
|
59
|
+
Real problems that guardrails solve:
|
|
60
|
+
|
|
61
|
+
❌ **Without guardrails:**
|
|
45
62
|
|
|
46
63
|
```ts
|
|
47
|
-
|
|
64
|
+
// User: "My email is john@company.com, help me..."
|
|
65
|
+
// → Sends PII to model → Compliance violation → $$$
|
|
48
66
|
```
|
|
49
67
|
|
|
50
|
-
**
|
|
68
|
+
✅ **With guardrails:**
|
|
51
69
|
|
|
52
70
|
```ts
|
|
53
|
-
const
|
|
54
|
-
inputGuardrails: [piiDetector()],
|
|
71
|
+
const model = withGuardrails(baseModel, {
|
|
72
|
+
inputGuardrails: [piiDetector()], // Blocks before API call
|
|
55
73
|
});
|
|
74
|
+
// → Request blocked → No PII leak → No cost → Compliant
|
|
56
75
|
```
|
|
57
76
|
|
|
58
|
-
|
|
77
|
+
Common use cases:
|
|
78
|
+
|
|
79
|
+
- 🛡️ **Compliance**: Block PII before it reaches your model
|
|
80
|
+
- 💰 **Cost control**: Stop bad requests before they cost money
|
|
81
|
+
- 🔒 **Security**: Prevent prompt injection and data exfiltration
|
|
82
|
+
- ✅ **Quality**: Enforce minimum response standards
|
|
83
|
+
- 🔧 **Production**: Works with your existing observability tools
|
|
59
84
|
|
|
60
|
-
##
|
|
85
|
+
## Copy-Paste Examples
|
|
61
86
|
|
|
62
|
-
|
|
87
|
+
### Basic Protection (Most Common)
|
|
63
88
|
|
|
64
89
|
```ts
|
|
65
90
|
import { generateText } from 'ai';
|
|
@@ -68,142 +93,187 @@ import {
|
|
|
68
93
|
withGuardrails,
|
|
69
94
|
piiDetector,
|
|
70
95
|
promptInjectionDetector,
|
|
71
|
-
minLengthRequirement,
|
|
72
|
-
mcpSecurityGuardrail,
|
|
73
96
|
} from 'ai-sdk-guardrails';
|
|
74
97
|
|
|
75
98
|
const model = withGuardrails(openai('gpt-4o'), {
|
|
76
99
|
inputGuardrails: [piiDetector(), promptInjectionDetector()],
|
|
77
|
-
outputGuardrails: [
|
|
78
|
-
minLengthRequirement(160),
|
|
79
|
-
mcpSecurityGuardrail({
|
|
80
|
-
maxContentSize: 51200, // 50KB limit
|
|
81
|
-
injectionThreshold: 0.7, // Configurable sensitivity
|
|
82
|
-
allowedDomains: ['api.company.com'], // Domain allowlist
|
|
83
|
-
}),
|
|
84
|
-
],
|
|
85
100
|
});
|
|
86
101
|
|
|
102
|
+
// Use exactly like before - nothing else changes
|
|
87
103
|
const { text } = await generateText({
|
|
88
104
|
model,
|
|
89
|
-
prompt: 'Write a friendly
|
|
105
|
+
prompt: 'Write a friendly email',
|
|
90
106
|
});
|
|
91
107
|
```
|
|
92
108
|
|
|
93
|
-
|
|
94
|
-
|
|
95
|
-
## Quickstart (30 seconds)
|
|
109
|
+
### Input + Output Protection
|
|
96
110
|
|
|
97
|
-
|
|
111
|
+
```ts
|
|
112
|
+
import {
|
|
113
|
+
withGuardrails,
|
|
114
|
+
piiDetector,
|
|
115
|
+
sensitiveDataFilter,
|
|
116
|
+
minLengthRequirement,
|
|
117
|
+
} from 'ai-sdk-guardrails';
|
|
98
118
|
|
|
99
|
-
|
|
100
|
-
|
|
101
|
-
|
|
102
|
-
|
|
119
|
+
const model = withGuardrails(openai('gpt-4o'), {
|
|
120
|
+
inputGuardrails: [piiDetector()], // Block PII in prompts
|
|
121
|
+
outputGuardrails: [
|
|
122
|
+
sensitiveDataFilter(), // Remove secrets from responses
|
|
123
|
+
minLengthRequirement(100), // Enforce quality standards
|
|
124
|
+
],
|
|
125
|
+
});
|
|
103
126
|
```
|
|
104
127
|
|
|
105
|
-
|
|
128
|
+
### Works With Streaming
|
|
106
129
|
|
|
107
130
|
```ts
|
|
108
|
-
import {
|
|
109
|
-
import { openai } from '@ai-sdk/openai';
|
|
110
|
-
import { withGuardrails, piiDetector } from 'ai-sdk-guardrails';
|
|
131
|
+
import { streamText } from 'ai';
|
|
111
132
|
|
|
112
133
|
const model = withGuardrails(openai('gpt-4o'), {
|
|
113
|
-
|
|
134
|
+
outputGuardrails: [minLengthRequirement(100)],
|
|
114
135
|
});
|
|
115
136
|
|
|
116
|
-
|
|
117
|
-
|
|
118
|
-
|
|
137
|
+
// Streaming just works - guardrails run after stream completes
|
|
138
|
+
const { textStream } = await streamText({ model, prompt: '...' });
|
|
139
|
+
for await (const chunk of textStream) {
|
|
140
|
+
process.stdout.write(chunk);
|
|
141
|
+
}
|
|
142
|
+
```
|
|
143
|
+
|
|
144
|
+
### Production Setup (With Error Handling)
|
|
145
|
+
|
|
146
|
+
```ts
|
|
147
|
+
import { isGuardrailsError } from 'ai-sdk-guardrails';
|
|
148
|
+
|
|
149
|
+
const model = withGuardrails(openai('gpt-4o'), {
|
|
150
|
+
inputGuardrails: [piiDetector(), promptInjectionDetector()],
|
|
151
|
+
outputGuardrails: [sensitiveDataFilter()],
|
|
152
|
+
throwOnBlocked: true, // Throw errors instead of silent blocking
|
|
119
153
|
});
|
|
154
|
+
|
|
155
|
+
try {
|
|
156
|
+
const { text } = await generateText({ model, prompt: '...' });
|
|
157
|
+
console.log(text);
|
|
158
|
+
} catch (error) {
|
|
159
|
+
if (isGuardrailsError(error)) {
|
|
160
|
+
console.error('Blocked by guardrail:', error.message);
|
|
161
|
+
// Show user-friendly message
|
|
162
|
+
}
|
|
163
|
+
}
|
|
120
164
|
```
|
|
121
165
|
|
|
122
|
-
##
|
|
123
|
-
|
|
124
|
-
- Overview
|
|
125
|
-
- Concepts
|
|
126
|
-
- Installation
|
|
127
|
-
- Usage
|
|
128
|
-
- Define a guardrail
|
|
129
|
-
- Built-in helpers
|
|
130
|
-
- Streaming
|
|
131
|
-
- Auto Retry (utility and middleware)
|
|
132
|
-
- Error Handling
|
|
133
|
-
- API
|
|
134
|
-
- Examples
|
|
135
|
-
- Compatibility
|
|
136
|
-
- Architecture
|
|
137
|
-
- Contributing
|
|
138
|
-
|
|
139
|
-
## API Overview
|
|
166
|
+
## How It Works
|
|
140
167
|
|
|
141
|
-
|
|
168
|
+
Guardrails run **in parallel** with your AI calls as middleware:
|
|
142
169
|
|
|
143
|
-
|
|
144
|
-
|
|
145
|
-
|
|
170
|
+
```mermaid
|
|
171
|
+
flowchart LR
|
|
172
|
+
A[Input] --> B[Input Guardrails]
|
|
173
|
+
B -->|✅ Clean| C[AI Model]
|
|
174
|
+
B -->|❌ Blocked| X[No API Call]
|
|
175
|
+
C --> D[Output Guardrails]
|
|
176
|
+
D -->|✅ Clean| E[Response]
|
|
177
|
+
D -->|❌ Blocked| R[Retry/Replace/Block]
|
|
178
|
+
```
|
|
146
179
|
|
|
147
|
-
|
|
180
|
+
**Three-step workflow:**
|
|
148
181
|
|
|
149
|
-
|
|
150
|
-
|
|
151
|
-
|
|
182
|
+
1. **Receive**: Input or output arrives
|
|
183
|
+
2. **Check**: Guardrails run (PII detection, validation, etc.)
|
|
184
|
+
3. **Decide**: Pass through, block, or retry
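
The three steps above can be sketched without the library. This is a minimal illustration, not the package's implementation: `Guardrail`, `Decision`, `emailCheck`, and `runGuardrails` are hypothetical names, and only the `{ tripwireTriggered, message }` result shape is taken from the custom guardrail examples in this README.

```ts
// Hypothetical stand-ins for illustration; not the ai-sdk-guardrails API.
type GuardrailResult = { tripwireTriggered: boolean; message?: string };
type Guardrail = { name: string; execute: (text: string) => GuardrailResult };

type Decision =
  | { action: 'pass'; text: string }
  | { action: 'block'; blockedBy: string; message: string };

// Toy email check standing in for a real PII detector.
const emailCheck: Guardrail = {
  name: 'email-check',
  execute: (text) =>
    /[^\s@]+@[^\s@]+\.[^\s@]+/.test(text)
      ? { tripwireTriggered: true, message: 'Email address detected' }
      : { tripwireTriggered: false },
};

// 1. Receive the text, 2. run each check, 3. decide on the first tripwire.
function runGuardrails(text: string, guardrails: Guardrail[]): Decision {
  for (const guardrail of guardrails) {
    const result = guardrail.execute(text);
    if (result.tripwireTriggered) {
      return {
        action: 'block',
        blockedBy: guardrail.name,
        message: result.message ?? 'Blocked',
      };
    }
  }
  return { action: 'pass', text };
}
```

In this sketch a blocked input never reaches the model call, which is the "No API Call" branch of the flowchart.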
|
|
152
185
|
|
|
153
|
-
|
|
154
|
-
// Before (v3.x - still works but deprecated)
|
|
155
|
-
import { wrapWithGuardrails, InputBlockedError } from 'ai-sdk-guardrails';
|
|
156
|
-
const model = wrapWithGuardrails(openai('gpt-4o'), { ... });
|
|
186
|
+
**Key benefit**: Non-invasive. Your existing telemetry, logging, and observability tools keep working because guardrails are just middleware.
|
|
157
187
|
|
|
158
|
-
|
|
159
|
-
|
|
160
|
-
|
|
188
|
+
## Built-in Guardrails
|
|
189
|
+
|
|
190
|
+
### Input Guardrails (Run Before Model)
|
|
191
|
+
|
|
192
|
+
| Guardrail | Purpose | Example |
|
|
193
|
+
| --------------------------- | -------------------------------- | ------------------- |
|
|
194
|
+
| `piiDetector()` | Block emails, phones, SSNs | Compliance, privacy |
|
|
195
|
+
| `promptInjectionDetector()` | Detect injection attempts | Security |
|
|
196
|
+
| `blockedKeywords()` | Block specific terms | Content policy |
|
|
197
|
+
| `inputLengthLimit()` | Enforce max input length | Cost control |
|
|
198
|
+
| `rateLimiting()` | Per-user rate limits | Abuse prevention |
|
|
199
|
+
| `profanityFilter()` | Block offensive language | Content moderation |
|
|
200
|
+
| `toxicityDetector()` | Detect toxic content | Safety |
|
|
201
|
+
| `allowedToolsGuardrail()` | Restrict which tools can be used | Tool security |
|
|
202
|
+
|
|
203
|
+
### Output Guardrails (Run After Model)
|
|
204
|
+
|
|
205
|
+
| Guardrail | Purpose | Example |
|
|
206
|
+
| ------------------------- | --------------------------- | ------------------------- |
|
|
207
|
+
| `sensitiveDataFilter()` | Remove secrets, API keys | Security |
|
|
208
|
+
| `minLengthRequirement()` | Enforce minimum length | Quality control |
|
|
209
|
+
| `outputLengthLimit()` | Enforce maximum length | Cost/UX control |
|
|
210
|
+
| `toxicityFilter()` | Block toxic responses | Safety |
|
|
211
|
+
| `jsonValidation()` | Validate JSON structure | Structured output |
|
|
212
|
+
| `schemaValidation()` | Validate against Zod schema | Type safety |
|
|
213
|
+
| `confidenceThreshold()` | Require minimum confidence | Quality |
|
|
214
|
+
| `hallucinationDetector()` | Detect uncertain claims | Accuracy |
|
|
215
|
+
| `secretRedaction()` | Redact secrets from output | Security |
|
|
216
|
+
| `mcpSecurityGuardrail()` | MCP tool security | Prevent data exfiltration |
|
|
217
|
+
|
|
218
|
+
### MCP Security Guardrails
|
|
219
|
+
|
|
220
|
+
Protect against prompt injection and data exfiltration when using Model Context Protocol (MCP) tools:
|
|
221
|
+
|
|
222
|
+
```ts
|
|
223
|
+
import { mcpSecurityGuardrail, mcpResponseSanitizer } from 'ai-sdk-guardrails';
|
|
161
224
|
|
|
162
|
-
|
|
163
|
-
|
|
164
|
-
|
|
165
|
-
|
|
225
|
+
const model = withGuardrails(openai('gpt-4o'), {
|
|
226
|
+
outputGuardrails: [
|
|
227
|
+
mcpSecurityGuardrail({
|
|
228
|
+
detectExfiltration: true, // Detect data exfiltration attempts
|
|
229
|
+
scanEncodedContent: true, // Scan base64/hex encoded content
|
|
230
|
+
allowedDomains: ['api.company.com'], // Domain allowlist
|
|
231
|
+
maxContentSize: 51200, // 50KB limit
|
|
232
|
+
injectionThreshold: 0.7, // Sensitivity (lower = stricter)
|
|
233
|
+
}),
|
|
234
|
+
mcpResponseSanitizer(), // Clean malicious content vs blocking
|
|
235
|
+
],
|
|
236
|
+
});
|
|
166
237
|
```
|
|
167
238
|
|
|
168
|
-
|
|
239
|
+
**Attack vectors prevented:**
|
|
169
240
|
|
|
170
|
-
-
|
|
171
|
-
-
|
|
172
|
-
-
|
|
241
|
+
- ✅ Direct prompt injection
|
|
242
|
+
- ✅ Tool response poisoning
|
|
243
|
+
- ✅ Data exfiltration via URLs
|
|
244
|
+
- ✅ Encoded attacks (base64/hex)
|
|
245
|
+
- ✅ Cascading exploits
|
|
246
|
+
- ✅ Context poisoning
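
To make "data exfiltration via URLs" concrete, here is a rough sketch of a domain allowlist check. It is an assumption about the general technique, not how `mcpSecurityGuardrail` is implemented; `findDisallowedHosts` is a hypothetical helper, and only the `allowedDomains` idea comes from the example above.

```ts
// Hypothetical helper for illustration; not part of ai-sdk-guardrails.
// Returns every host found in `text` that is not covered by the allowlist.
function findDisallowedHosts(text: string, allowedDomains: string[]): string[] {
  // Capture the host part of each http(s) URL in the text.
  const urlPattern = /https?:\/\/([^\s\/:?#]+)/gi;
  const disallowed: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = urlPattern.exec(text)) !== null) {
    const host = match[1].toLowerCase();
    // Allow exact matches and subdomains of an allowed domain.
    const allowed = allowedDomains.some(
      (domain) =>
        host === domain || host.slice(-(domain.length + 1)) === '.' + domain,
    );
    if (!allowed && disallowed.indexOf(host) === -1) {
      disallowed.push(host);
    }
  }
  return disallowed;
}
```

A guardrail built on a check like this could trigger its tripwire whenever the returned list is non-empty.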
|
|
173
247
|
|
|
174
|
-
|
|
248
|
+
See [MCP Security documentation](#mcp-security-guardrails-advanced) for full details.
|
|
175
249
|
|
|
176
|
-
|
|
250
|
+
## Advanced Features
|
|
177
251
|
|
|
178
|
-
|
|
252
|
+
### Custom Guardrails
|
|
179
253
|
|
|
180
|
-
|
|
254
|
+
Create domain-specific guardrails:
|
|
181
255
|
|
|
182
256
|
```ts
|
|
183
|
-
import {
|
|
184
|
-
import {
|
|
185
|
-
defineInputGuardrail,
|
|
186
|
-
defineOutputGuardrail,
|
|
187
|
-
withGuardrails,
|
|
188
|
-
} from 'ai-sdk-guardrails';
|
|
189
|
-
import { extractTextContent } from 'ai-sdk-guardrails/guardrails/input';
|
|
257
|
+
import { defineInputGuardrail, defineOutputGuardrail } from 'ai-sdk-guardrails';
|
|
190
258
|
import { extractContent } from 'ai-sdk-guardrails/guardrails/output';
|
|
191
259
|
|
|
260
|
+
// Custom input guardrail
|
|
192
261
|
const businessHours = defineInputGuardrail({
|
|
193
262
|
name: 'business-hours',
|
|
194
|
-
execute: async (
|
|
195
|
-
const
|
|
196
|
-
return
|
|
263
|
+
execute: async () => {
|
|
264
|
+
const hour = new Date().getHours();
|
|
265
|
+
return hour >= 9 && hour <= 17
|
|
197
266
|
? { tripwireTriggered: false }
|
|
198
267
|
: { tripwireTriggered: true, message: 'Outside business hours' };
|
|
199
268
|
},
|
|
200
269
|
});
|
|
201
270
|
|
|
271
|
+
// Custom output guardrail
|
|
202
272
|
const minQuality = defineOutputGuardrail({
|
|
203
273
|
name: 'min-quality',
|
|
204
274
|
execute: async ({ result }) => {
|
|
205
275
|
const { text } = extractContent(result);
|
|
206
|
-
return text.length >=
|
|
276
|
+
return text.length >= 100
|
|
207
277
|
? { tripwireTriggered: false }
|
|
208
278
|
: { tripwireTriggered: true, message: 'Response too short' };
|
|
209
279
|
},
|
|
@@ -215,213 +285,114 @@ const model = withGuardrails(openai('gpt-4o'), {
|
|
|
215
285
|
});
|
|
216
286
|
```
|
|
217
287
|
|
|
218
|
-
###
|
|
288
|
+
### Auto-Retry on Failures
|
|
289
|
+
|
|
290
|
+
Automatically retry when output doesn't meet requirements:
|
|
219
291
|
|
|
220
292
|
```ts
|
|
221
|
-
import { openai } from '@ai-sdk/openai';
|
|
222
293
|
import {
|
|
223
|
-
|
|
224
|
-
piiDetector,
|
|
225
|
-
blockedKeywords,
|
|
226
|
-
contentLengthLimit,
|
|
227
|
-
promptInjectionDetector,
|
|
228
|
-
sensitiveDataFilter,
|
|
294
|
+
wrapWithOutputGuardrails,
|
|
229
295
|
minLengthRequirement,
|
|
230
|
-
confidenceThreshold,
|
|
231
|
-
mcpSecurityGuardrail,
|
|
232
|
-
mcpResponseSanitizer,
|
|
233
296
|
} from 'ai-sdk-guardrails';
|
|
234
297
|
|
|
235
|
-
const model =
|
|
236
|
-
inputGuardrails: [
|
|
237
|
-
piiDetector(),
|
|
238
|
-
promptInjectionDetector({ threshold: 0.7 }),
|
|
239
|
-
blockedKeywords(['test', 'spam']),
|
|
240
|
-
contentLengthLimit(4000),
|
|
241
|
-
],
|
|
242
|
-
outputGuardrails: [
|
|
243
|
-
mcpSecurityGuardrail({
|
|
244
|
-
detectExfiltration: true,
|
|
245
|
-
scanEncodedContent: true,
|
|
246
|
-
allowedDomains: ['trusted-api.com'],
|
|
247
|
-
}),
|
|
248
|
-
mcpResponseSanitizer(),
|
|
249
|
-
sensitiveDataFilter(),
|
|
250
|
-
minLengthRequirement(160),
|
|
251
|
-
confidenceThreshold(0.6),
|
|
252
|
-
],
|
|
253
|
-
});
|
|
254
|
-
```
|
|
255
|
-
|
|
256
|
-
## Streaming
|
|
257
|
-
|
|
258
|
-
Works out of the box. By default, guardrails run after the stream ends (buffer mode). For early blocking, enable progressive mode.
|
|
259
|
-
|
|
260
|
-
```ts
|
|
261
|
-
import { streamText } from 'ai';
|
|
262
|
-
import { openai } from '@ai-sdk/openai';
|
|
263
|
-
import { withGuardrails, minLengthRequirement } from 'ai-sdk-guardrails';
|
|
264
|
-
|
|
265
|
-
const model = withGuardrails(openai('gpt-4o'), {
|
|
266
|
-
outputGuardrails: [minLengthRequirement(120)],
|
|
267
|
-
// Evaluate as tokens arrive; stop or replace early when blocked
|
|
268
|
-
streamMode: 'progressive',
|
|
269
|
-
replaceOnBlocked: true,
|
|
270
|
-
});
|
|
271
|
-
|
|
272
|
-
const { textStream } = await streamText({
|
|
273
|
-
model,
|
|
274
|
-
prompt: 'Tell me a short story about a robot.',
|
|
275
|
-
});
|
|
276
|
-
|
|
277
|
-
for await (const delta of textStream) process.stdout.write(delta);
|
|
278
|
-
```
|
|
279
|
-
|
|
280
|
-
## Auto Retry
|
|
281
|
-
|
|
282
|
-
Choose what fits your flow:
|
|
283
|
-
|
|
284
|
-
- Standalone utility: Use `retry()` to wrap any generation function with your own validator and backoff.
|
|
285
|
-
- Middleware option: Add `retry` to output guardrails so retries run automatically when a check fails.
|
|
286
|
-
|
|
287
|
-
### Utility
|
|
288
|
-
|
|
289
|
-
```ts
|
|
290
|
-
import { retry } from 'ai-sdk-guardrails';
|
|
291
|
-
import { generateText } from 'ai';
|
|
292
|
-
import { openai } from '@ai-sdk/openai';
|
|
293
|
-
|
|
294
|
-
const result = await retry({
|
|
295
|
-
generate: (params) => generateText({ model: openai('gpt-4o'), ...params }),
|
|
296
|
-
params: { prompt: 'Explain backpropagation in depth.' },
|
|
297
|
-
validate: (r) => ({
|
|
298
|
-
blocked: (r.text ?? '').length < 500,
|
|
299
|
-
message: 'Response too short',
|
|
300
|
-
}),
|
|
301
|
-
buildRetryParams: ({ lastParams }) => ({
|
|
302
|
-
...lastParams,
|
|
303
|
-
maxOutputTokens: Math.max(800, (lastParams.maxOutputTokens ?? 400) + 300),
|
|
304
|
-
}),
|
|
305
|
-
maxRetries: 2,
|
|
306
|
-
});
|
|
307
|
-
```
|
|
308
|
-
|
|
309
|
-
### Middleware
|
|
310
|
-
|
|
311
|
-
```ts
|
|
312
|
-
import { generateText } from 'ai';
|
|
313
|
-
import { openai } from '@ai-sdk/openai';
|
|
314
|
-
import { withGuardrails, defineOutputGuardrail } from 'ai-sdk-guardrails';
|
|
315
|
-
import { extractContent } from 'ai-sdk-guardrails/guardrails/output';
|
|
316
|
-
|
|
317
|
-
const minLengthGuardrail = defineOutputGuardrail<{ minChars: number }>({
|
|
318
|
-
name: 'min-output-length',
|
|
319
|
-
execute: async ({ result }) => {
|
|
320
|
-
const { text } = extractContent(result);
|
|
321
|
-
const minChars = text.length + 1;
|
|
322
|
-
return text.length < minChars
|
|
323
|
-
? {
|
|
324
|
-
tripwireTriggered: true,
|
|
325
|
-
severity: 'medium',
|
|
326
|
-
message: `Answer too short: ${text.length} < ${minChars}`,
|
|
327
|
-
metadata: { minChars },
|
|
328
|
-
}
|
|
329
|
-
: { tripwireTriggered: false };
|
|
330
|
-
},
|
|
331
|
-
});
|
|
332
|
-
|
|
333
|
-
const guarded = wrapWithOutputGuardrails(
|
|
298
|
+
const model = wrapWithOutputGuardrails(
|
|
334
299
|
openai('gpt-4o'),
|
|
335
|
-
[
|
|
300
|
+
[minLengthRequirement(100)],
|
|
336
301
|
{
|
|
337
|
-
replaceOnBlocked: false,
|
|
338
302
|
retry: {
|
|
339
|
-
maxRetries:
|
|
340
|
-
buildRetryParams: ({
|
|
303
|
+
maxRetries: 2,
|
|
304
|
+
buildRetryParams: ({ lastParams }) => ({
|
|
341
305
|
...lastParams,
|
|
342
|
-
|
|
343
|
-
|
|
344
|
-
|
|
345
|
-
),
|
|
306
|
+
// Increase max tokens on retry
|
|
307
|
+
maxOutputTokens: (lastParams.maxOutputTokens ?? 400) + 200,
|
|
308
|
+
// Add context about the failure
|
|
346
309
|
prompt: [
|
|
347
|
-
...
|
|
310
|
+
...lastParams.prompt,
|
|
348
311
|
{
|
|
349
|
-
role: 'user'
|
|
350
|
-
content:
|
|
351
|
-
{
|
|
352
|
-
type: 'text' as const,
|
|
353
|
-
text: `Note: The previous answer ${summary.blockedResults[0]?.message}. Provide a comprehensive, detailed answer with examples.`,
|
|
354
|
-
},
|
|
355
|
-
],
|
|
312
|
+
role: 'user',
|
|
313
|
+
content: 'Please provide a more detailed response.',
|
|
356
314
|
},
|
|
357
315
|
],
|
|
358
316
|
}),
|
|
359
317
|
},
|
|
360
318
|
},
|
|
361
319
|
);
|
|
362
|
-
|
|
363
|
-
const { text } = await generateText({
|
|
364
|
-
model: guarded,
|
|
365
|
-
prompt: 'Explain the significance of the Turing Test in AI history.',
|
|
366
|
-
});
|
|
367
|
-
```
|
|
368
|
-
|
|
369
|
-
Tip: Use backoff helpers if you need delays between retries: `exponentialBackoff`, `linearBackoff`, `fixedBackoff`, `jitteredExponentialBackoff`, or `backoffPresets`.
|
|
370
|
-
|
|
371
|
-
## Error Handling
|
|
372
|
-
|
|
373
|
-
Set `throwOnBlocked: true` to throw structured errors you can catch and turn into friendly messages.
|
|
374
|
-
|
|
375
|
-
```ts
|
|
376
|
-
import { isGuardrailsError } from 'ai-sdk-guardrails';
|
|
377
|
-
|
|
378
|
-
try {
|
|
379
|
-
const { text } = await generateText({ model, prompt: '...' });
|
|
380
|
-
} catch (err) {
|
|
381
|
-
if (isGuardrailsError(err)) {
|
|
382
|
-
console.error('Guardrail blocked:', err.message);
|
|
383
|
-
// err.results gives you details per guardrail
|
|
384
|
-
} else {
|
|
385
|
-
console.error('Unexpected error:', err);
|
|
386
|
-
}
|
|
387
|
-
}
|
|
388
320
|
```
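
The retry behaviour can be pictured as a plain loop. The sketch below is a simplified, synchronous illustration under stated assumptions: `retryUntilValid`, `Params`, and `Validation` are hypothetical names, real generation calls are asynchronous, and the library's actual retry logic may differ.

```ts
// Hypothetical sketch; not the ai-sdk-guardrails retry implementation.
type Params = { prompt: string; maxOutputTokens?: number };
type Validation = { blocked: boolean; message?: string };

function retryUntilValid(
  generate: (params: Params) => string,
  validate: (text: string) => Validation,
  buildRetryParams: (lastParams: Params) => Params,
  params: Params,
  maxRetries: number,
): { text: string; attempts: number; blocked: boolean } {
  let current = params;
  let text = generate(current);
  let attempts = 1;
  // Retry while the output is blocked and the retry budget is not used up.
  while (validate(text).blocked && attempts <= maxRetries) {
    current = buildRetryParams(current);
    text = generate(current);
    attempts += 1;
  }
  return { text, attempts, blocked: validate(text).blocked };
}
```

With `maxRetries: 2` the loop makes at most three generation calls: the initial attempt plus two retries.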
|
|
389
321
|
|
|
390
|
-
|
|
322
|
+
### Reusable Configurations
|
|
391
323
|
|
|
392
|
-
|
|
324
|
+
Create reusable guardrail sets:
|
|
393
325
|
|
|
394
326
|
```ts
|
|
395
|
-
import {
|
|
396
|
-
|
|
397
|
-
|
|
327
|
+
import {
|
|
328
|
+
createGuardrails,
|
|
329
|
+
piiDetector,
|
|
330
|
+
sensitiveDataFilter,
|
|
331
|
+
} from 'ai-sdk-guardrails';
|
|
398
332
|
|
|
399
|
-
//
|
|
333
|
+
// Define once
|
|
400
334
|
const productionGuards = createGuardrails({
|
|
401
|
-
inputGuardrails: [piiDetector()
|
|
402
|
-
outputGuardrails: [
|
|
335
|
+
inputGuardrails: [piiDetector()],
|
|
336
|
+
outputGuardrails: [sensitiveDataFilter()],
|
|
403
337
|
throwOnBlocked: true,
|
|
404
338
|
});
|
|
405
339
|
|
|
406
340
|
// Apply to multiple models
|
|
407
341
|
const gpt4 = productionGuards(openai('gpt-4o'));
|
|
408
342
|
const claude = productionGuards(anthropic('claude-3-sonnet'));
|
|
343
|
+
```
|
|
344
|
+
|
|
345
|
+
### Streaming Modes
|
|
409
346
|
|
|
410
|
-
|
|
411
|
-
const strictLimits = createGuardrails({ inputGuardrails: [maxLength(500)] });
|
|
412
|
-
const piiProtection = createGuardrails({ inputGuardrails: [piiDetector()] });
|
|
347
|
+
Control when guardrails run during streaming:
|
|
413
348
|
|
|
414
|
-
|
|
415
|
-
const model =
|
|
349
|
+
```ts
|
|
350
|
+
const model = withGuardrails(openai('gpt-4o'), {
|
|
351
|
+
outputGuardrails: [minLengthRequirement(100)],
|
|
352
|
+
streamMode: 'progressive', // Run guardrails as tokens arrive
|
|
353
|
+
replaceOnBlocked: true, // Replace blocked output with fallback
|
|
354
|
+
});
|
|
416
355
|
```
|
|
417
356
|
|
|
418
|
-
|
|
357
|
+
- `buffer` (default): Wait for stream to complete, then check
|
|
358
|
+
- `progressive`: Check guardrails as tokens arrive (early termination)
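
The difference between the two modes can be sketched over a plain array of chunks. This is an illustration only: `bufferMode`, `progressiveMode`, and `Check` are hypothetical names, and withholding all output when a buffered check fails is an assumption rather than documented library behaviour.

```ts
// Hypothetical sketch; these names are not part of ai-sdk-guardrails.
type Check = (textSoFar: string) => boolean; // true means "blocked"
type StreamOutcome = { emitted: string[]; blocked: boolean };

// Buffer mode: collect every chunk, then run the check once on the full text.
function bufferMode(chunks: string[], isBlocked: Check): StreamOutcome {
  const blocked = isBlocked(chunks.join(''));
  return { emitted: blocked ? [] : chunks.slice(), blocked };
}

// Progressive mode: re-check the accumulated text after each chunk
// and stop before emitting the chunk that triggers the check.
function progressiveMode(chunks: string[], isBlocked: Check): StreamOutcome {
  const emitted: string[] = [];
  let text = '';
  for (const chunk of chunks) {
    text += chunk;
    if (isBlocked(text)) {
      return { emitted, blocked: true };
    }
    emitted.push(chunk);
  }
  return { emitted, blocked: false };
}
```

Progressive checking suits guardrails that can fail early, such as a forbidden term appearing mid-stream. A check like `minLengthRequirement` can only be judged once the stream has finished.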
|
|
359
|
+
|
|
360
|
+
### Agent Support
|
|
419
361
|
|
|
420
|
-
|
|
362
|
+
Guardrails work with AI SDK Agents:
|
|
363
|
+
|
|
364
|
+
```ts
|
|
365
|
+
import { withAgentGuardrails } from 'ai-sdk-guardrails';
|
|
366
|
+
import { tool } from 'ai';
|
|
367
|
+
|
|
368
|
+
const agent = withAgentGuardrails(
|
|
369
|
+
{
|
|
370
|
+
model: openai('gpt-4o'),
|
|
371
|
+
tools: { search: searchTool },
|
|
372
|
+
system: 'You are a helpful assistant.',
|
|
373
|
+
},
|
|
374
|
+
{
|
|
375
|
+
inputGuardrails: [piiDetector()],
|
|
376
|
+
outputGuardrails: [sensitiveDataFilter()],
|
|
377
|
+
toolGuardrails: [
|
|
378
|
+
toolEgressPolicy({
|
|
379
|
+
allowedHosts: ['api.company.com'],
|
|
380
|
+
scanForUrls: true,
|
|
381
|
+
}),
|
|
382
|
+
],
|
|
383
|
+
},
|
|
384
|
+
);
|
|
385
|
+
|
|
386
|
+
const result = await agent.generate({ prompt: '...' });
|
|
387
|
+
```
|
|
388
|
+
|
|
389
|
+
## MCP Security Guardrails (Advanced)
|
|
390
|
+
|
|
391
|
+
**Production-Ready**: Protect against the ["lethal trifecta" vulnerability](https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/) when using Model Context Protocol (MCP) tools.
|
|
421
392
|
|
|
422
393
|
### The Problem
|
|
423
394
|
|
|
424
|
-
AI agents with MCP tools
|
|
395
|
+
AI agents with MCP tools are vulnerable when they have:
|
|
425
396
|
|
|
426
397
|
1. **Access to private data** (through tools)
|
|
427
398
|
2. **Process untrusted content** (from tool responses)
|
|
@@ -429,9 +400,9 @@ AI agents with MCP tools can be vulnerable when they have:
|
|
|
429
400
|
|
|
430
401
|
Malicious tool responses can contain hidden instructions that trick the AI into exfiltrating sensitive data.
|
|
431
402
|
|
|
432
|
-
### Production
|
|
403
|
+
### Production Configuration
|
|
433
404
|
|
|
434
|
-
Full configurability with sensible defaults
|
|
405
|
+
Full configurability with sensible defaults:
|
|
435
406
|
|
|
436
407
|
```ts
|
|
437
408
|
import {
|
|
@@ -451,100 +422,58 @@ const secureModel = withGuardrails(openai('gpt-4o'), {
|
|
|
451
422
|
mcpSecurityGuardrail({
|
|
452
423
|
injectionThreshold: 0.5, // Lower = more sensitive
|
|
453
424
|
maxSuspiciousUrls: 0, // Zero tolerance
|
|
454
|
-
maxContentSize: 25600, // 25KB limit
|
|
425
|
+
maxContentSize: 25600, // 25KB limit
|
|
455
426
|
minEncodedLength: 15, // Detect shorter encoded attacks
|
|
456
|
-
encodedInjectionThreshold: 0.2, // Combined
|
|
427
|
+
encodedInjectionThreshold: 0.2, // Combined threshold
|
|
457
428
|
highRiskThreshold: 0.3, // High-risk cascade blocking
|
|
458
429
|
authorityThreshold: 0.5, // Authority manipulation detection
|
|
459
430
|
allowedDomains: ['api.company.com', 'trusted-partner.com'],
|
|
460
|
-
customSuspiciousDomains: ['evil.com'
|
|
431
|
+
customSuspiciousDomains: ['evil.com'],
|
|
461
432
|
blockCascadingCalls: true,
|
|
462
433
|
scanEncodedContent: true,
|
|
463
434
|
detectExfiltration: true,
|
|
464
435
|
}),
|
|
465
|
-
mcpResponseSanitizer(), // Clean
|
|
436
|
+
mcpResponseSanitizer(), // Clean vs block
|
|
466
437
|
toolEgressPolicy({
|
|
467
|
-
allowedHosts: ['api.company.com'
|
|
468
|
-
blockedHosts: ['webhook.site', 'requestcatcher.com'
|
|
438
|
+
allowedHosts: ['api.company.com'],
|
|
439
|
+
blockedHosts: ['webhook.site', 'requestcatcher.com'],
|
|
469
440
|
scanForUrls: true,
|
|
470
441
|
}),
|
|
471
442
|
],
|
|
472
443
|
});
|
|
473
444
|
```
|
|
474
445
|
|
|
475
|
-
### Environment
|
|
446
|
+
### Environment-Based Configuration
|
|
476
447
|
|
|
477
448
|
```ts
|
|
478
|
-
// Different security profiles for different environments
|
|
479
449
|
function getSecurityConfig(env: 'production' | 'staging' | 'development') {
|
|
480
450
|
const configs = {
|
|
481
451
|
production: {
|
|
482
452
|
injectionThreshold: 0.5, // High security
|
|
483
|
-
maxContentSize: 25600, // 25KB
|
|
484
|
-
authorityThreshold: 0.5,
|
|
453
|
+
maxContentSize: 25600, // 25KB
|
|
454
|
+
authorityThreshold: 0.5,
|
|
485
455
|
},
|
|
486
456
|
staging: {
|
|
487
|
-
injectionThreshold: 0.7, // Balanced
|
|
488
|
-
maxContentSize: 51200, // 50KB
|
|
489
|
-
authorityThreshold: 0.7,
|
|
457
|
+
injectionThreshold: 0.7, // Balanced
|
|
458
|
+
maxContentSize: 51200, // 50KB
|
|
459
|
+
authorityThreshold: 0.7,
|
|
490
460
|
},
|
|
491
461
|
development: {
|
|
492
|
-
injectionThreshold: 0.8, //
|
|
493
|
-
maxContentSize: 102400, // 100KB
|
|
494
|
-
authorityThreshold: 0.8,
|
|
462
|
+
injectionThreshold: 0.8, // Permissive
|
|
463
|
+
maxContentSize: 102400, // 100KB
|
|
464
|
+
authorityThreshold: 0.8,
|
|
495
465
|
},
|
|
496
466
|
};
|
|
497
467
|
return configs[env];
|
|
498
468
|
}
|
|
499
469
|
|
|
500
|
-
const
|
|
470
|
+
const model = withGuardrails(openai('gpt-4o'), {
|
|
501
471
|
outputGuardrails: [mcpSecurityGuardrail(getSecurityConfig('production'))],
|
|
502
472
|
});
|
|
503
473
|
```
|
|
504
474
|
|
|
505
|
-
### Attack Vectors Prevented
|
|
506
|
-
|
|
507
|
-
✅ **Direct prompt injection** - "System: ignore all previous instructions"
|
|
508
|
-
✅ **Tool response poisoning** - Malicious content in MCP tool responses
|
|
509
|
-
✅ **Data exfiltration** - URLs constructed to steal sensitive data
|
|
510
|
-
✅ **Encoded attacks** - Base64/hex hidden malicious instructions
|
|
511
|
-
✅ **Cascading exploits** - Tool responses triggering additional dangerous calls
|
|
512
|
-
✅ **Context poisoning** - Attempts to modify AI behavior mid-conversation
|
|
513
|
-
|
|
514
|
-
### Secure MCP Agent Example
|
|
515
|
-
|
|
516
|
-
```ts
|
|
517
|
-
import { withAgentGuardrails } from 'ai-sdk-guardrails';
|
|
518
|
-
|
|
519
|
-
const secureAgent = withAgentGuardrails(
|
|
520
|
-
{
|
|
521
|
-
model: openai('gpt-4o'),
|
|
522
|
-
tools: { file_search, api_call, database_query },
|
|
523
|
-
system: 'You are a secure assistant. Always validate tool responses.',
|
|
524
|
-
},
|
|
525
|
-
{
|
|
526
|
-
inputGuardrails: [promptInjectionDetector()],
|
|
527
|
-
outputGuardrails: [
|
|
528
|
-
mcpSecurityGuardrail({
|
|
529
|
-
detectExfiltration: true,
|
|
530
|
-
allowedDomains: ['trusted-api.com'],
|
|
531
|
-
}),
|
|
532
|
-
mcpResponseSanitizer(),
|
|
533
|
-
],
|
|
534
|
-
toolGuardrails: [
|
|
535
|
-
toolEgressPolicy({
|
|
536
|
-
allowedHosts: ['trusted-api.com'],
|
|
537
|
-
scanForUrls: true,
|
|
538
|
-
}),
|
|
539
|
-
],
|
|
540
|
-
},
|
|
541
|
-
);
|
|
542
|
-
```
|
|
543
|
-
|
|
544
475
|
### Configuration Options
|
|
545
476
|
|
|
546
|
-
All security parameters are fully configurable with sensible defaults:
|
|
547
|
-
|
|
548
477
|
| Option | Default | Description |
|
|
549
478
|
| --------------------------- | ------- | ------------------------------------------------ |
|
|
550
479
|
| `injectionThreshold` | 0.7 | Prompt injection confidence threshold (0-1) |
|
|
@@ -556,106 +485,92 @@ All security parameters are fully configurable with sensible defaults:
|
|
|
556
485
|
| `allowedDomains` | [] | Allowed domains for URL construction |
|
|
557
486
|
| `customSuspiciousDomains` | [] | Additional suspicious domain patterns |
|
|
558
487
|
|
|
559
|
-
### Performance & Security Balance
|
|
560
|
-
|
|
561
|
-
- **High Security**: Lower thresholds, stricter limits, comprehensive scanning
|
|
562
|
-
- **Balanced**: Default settings, good for most production use cases
|
|
563
|
-
- **High Performance**: Higher thresholds, larger limits, selective scanning
|
|
564
|
-
|
|
565
488
|
See complete examples:
|
|
566
489
|
|
|
567
|
-
- [Production MCP Configuration](./examples/44-production-mcp-config.ts)
|
|
490
|
+
- [Production MCP Configuration](./examples/44-production-mcp-config.ts)
|
|
568
491
|
- [MCP Security Test Suite](./examples/41-mcp-security-test.ts)
|
|
569
492
|
- [Enhanced Security Testing](./examples/43-enhanced-mcp-security-test.ts)
|
|
570
|
-
- [Vulnerability Proof of Concept](./examples/42-mcp-vulnerability-proof.ts)
|
|
571
493
|
|
|
572
|
-
##
|
|
494
|
+
## Error Handling
|
|
573
495
|
|
|
574
|
-
|
|
496
|
+
### Throw Errors on Block
|
|
575
497
|
|
|
576
498
|
```ts
|
|
577
|
-
|
|
578
|
-
|
|
579
|
-
|
|
580
|
-
import { z } from 'zod';
|
|
581
|
-
|
|
582
|
-
// Define tools for the agent
|
|
583
|
-
const searchTool = tool({
|
|
584
|
-
description: 'Search for information',
|
|
585
|
-
inputSchema: z.object({ query: z.string() }),
|
|
586
|
-
execute: async ({ query }) => `Results for: ${query}`,
|
|
499
|
+
const model = withGuardrails(openai('gpt-4o'), {
|
|
500
|
+
inputGuardrails: [piiDetector()],
|
|
501
|
+
throwOnBlocked: true, // Throw errors instead of silent blocking
|
|
587
502
|
});
|
|
588
503
|
|
|
589
|
-
|
|
590
|
-
const
|
|
591
|
-
|
|
592
|
-
|
|
593
|
-
|
|
594
|
-
|
|
595
|
-
}
|
|
596
|
-
|
|
597
|
-
outputGuardrails: [
|
|
598
|
-
defineOutputGuardrail({
|
|
599
|
-
name: 'tool-usage-required',
|
|
600
|
-
description: 'Ensures agent uses search tools',
|
|
601
|
-
execute: async (params) => {
|
|
602
|
-
const hasToolCall = params.result.steps?.some(
|
|
603
|
-
(step) => step.type === 'tool-call',
|
|
604
|
-
);
|
|
605
|
-
|
|
606
|
-
return {
|
|
607
|
-
tripwireTriggered: !hasToolCall,
|
|
608
|
-
message: hasToolCall
|
|
609
|
-
? 'Tool usage validated'
|
|
610
|
-
: 'Must use search tools for research',
|
|
611
|
-
severity: 'high',
|
|
612
|
-
};
|
|
613
|
-
},
|
|
614
|
-
}),
|
|
615
|
-
],
|
|
616
|
-
throwOnBlocked: true,
|
|
617
|
-
},
|
|
618
|
-
);
|
|
619
|
-
|
|
620
|
-
// Use the guarded agent
|
|
621
|
-
const result = await agent.generate({
|
|
622
|
-
prompt: 'Research the latest AI developments',
|
|
623
|
-
});
|
|
504
|
+
try {
|
|
505
|
+
const { text } = await generateText({ model, prompt: '...' });
|
|
506
|
+
} catch (error) {
|
|
507
|
+
if (isGuardrailsError(error)) {
|
|
508
|
+
console.error('Blocked:', error.message);
|
|
509
|
+
// error.results gives details per guardrail
|
|
510
|
+
}
|
|
511
|
+
}
|
|
624
512
|
```
|
|
625
513
|
|
|
626
|
-
|
|
514
|
+
### Error Types
|
|
627
515
|
|
|
628
|
-
|
|
629
|
-
|
|
630
|
-
|
|
631
|
-
|
|
632
|
-
|
|
633
|
-
| `retry`, `retryHelpers` | Standalone auto-retry utilities with validation and backoff. |
|
|
634
|
-
| `GuardrailsError`, `GuardrailsInputError`, `GuardrailsOutputError`, `isGuardrailsError`, `extractErrorInfo` | Structured errors and helpers for robust handling. |
|
|
635
|
-
| `exponentialBackoff`, `linearBackoff`, `fixedBackoff`, `jitteredExponentialBackoff`, `backoffPresets` | Backoff strategies to control retry pacing. |
|
|
516
|
+
- `GuardrailsInputError` - Input guardrail blocked
|
|
517
|
+
- `GuardrailsOutputError` - Output guardrail blocked
|
|
518
|
+
- `GuardrailExecutionError` - Guardrail threw an error
|
|
519
|
+
- `GuardrailTimeoutError` - Guardrail exceeded timeout
|
|
520
|
+
- `GuardrailConfigurationError` - Invalid configuration
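
The error types above lend themselves to branching with `instanceof`. The sketch below shows one possible mapping from error type to recovery action. The classes are local stand-ins that only reuse the exported names so the snippet runs on its own; the shared `GuardrailsError` base class, the `chooseAction` helper, and the action names are assumptions, not the package's API.

```ts
// Stand-ins so the sketch runs without the package. In real code,
// import the error classes from 'ai-sdk-guardrails' instead.
class GuardrailsError extends Error {
  constructor(message: string) {
    super(message);
    // Keeps instanceof working even when compiled to older JS targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}
class GuardrailsInputError extends GuardrailsError {}
class GuardrailsOutputError extends GuardrailsError {}
class GuardrailTimeoutError extends GuardrailsError {}

// Hypothetical recovery actions for illustration only.
type Action = 'reject-request' | 'regenerate' | 'retry-later' | 'rethrow';

function chooseAction(error: unknown): Action {
  if (error instanceof GuardrailsInputError) return 'reject-request';
  if (error instanceof GuardrailsOutputError) return 'regenerate';
  if (error instanceof GuardrailTimeoutError) return 'retry-later';
  return 'rethrow'; // unrelated errors are not swallowed
}
```

Falling through to `'rethrow'` keeps unrelated failures (network errors, bugs) visible instead of treating them as guardrail blocks.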
|
|
636
521
|
|
|
637
|
-
|
|
522
|
+
## API Reference
|
|
638
523
|
|
|
639
|
-
|
|
640
|
-
- Output helpers: `./src/guardrails/output.ts`
|
|
524
|
+
### Primary Functions
|
|
641
525
|
|
|
642
|
-
|
|
526
|
+
| Function | Purpose |
|
|
527
|
+
| ------------------------- | ---------------------------------------- |
|
|
528
|
+
| `withGuardrails` | Wrap model with guardrails (main API) |
|
|
529
|
+
| `createGuardrails` | Create reusable guardrail configurations |
|
|
530
|
+
| `withAgentGuardrails` | Wrap AI SDK Agents with guardrails |
|
|
531
|
+
| `defineInputGuardrail` | Create custom input guardrail |
|
|
532
|
+
| `defineOutputGuardrail` | Create custom output guardrail |
|
|
533
|
+
| `executeInputGuardrails` | Run input guardrails programmatically |
|
|
534
|
+
| `executeOutputGuardrails` | Run output guardrails programmatically |
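
As a rough mental model of what running input guardrails programmatically involves, the sketch below runs a list of guardrails against a prompt and reports which ones tripped. It is a synchronous simplification with local stand-in types: the `tripwireTriggered` and `message` fields follow this README's examples, while `runInputGuardrails`, the `{ prompt }` input shape, and both sample guardrails are hypothetical and do not reflect the package's implementation.

```ts
// Local stand-in types; NOT the package's types or implementation.
interface GuardrailResult {
  tripwireTriggered: boolean;
  message?: string;
}
interface InputGuardrail {
  name: string;
  execute: (input: { prompt: string }) => GuardrailResult;
}

// Runs every guardrail and returns the names of those that tripped.
function runInputGuardrails(guardrails: InputGuardrail[], prompt: string): string[] {
  return guardrails
    .filter((g) => g.execute({ prompt }).tripwireTriggered)
    .map((g) => g.name);
}

// Two toy guardrails for illustration.
const maxLength: InputGuardrail = {
  name: 'max-length',
  execute: ({ prompt }) => ({
    tripwireTriggered: prompt.length > 20,
    message: 'Prompt too long',
  }),
};
const noEmail: InputGuardrail = {
  name: 'no-email',
  execute: ({ prompt }) => ({
    tripwireTriggered: /\S+@\S+\.\S+/.test(prompt),
    message: 'Email address detected',
  }),
};
```

Calling `runInputGuardrails([maxLength, noEmail], prompt)` returns an empty array when nothing trips, which a caller could treat as permission to call the model.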
|
|
535
|
+
|
|
536
|
+
### Error Utilities
|
|
537
|
+
|
|
538
|
+
| Function | Purpose |
|
|
539
|
+
| ------------------- | ------------------------------------ |
|
|
540
|
+
| `isGuardrailsError` | Check if error is from guardrails |
|
|
541
|
+
| `extractErrorInfo` | Extract structured error information |
|
|
542
|
+
|
|
543
|
+
### Retry Utilities
|
|
643
544
|
|
|
644
|
-
|
|
545
|
+
| Function | Purpose |
|
|
546
|
+
| ---------------------------- | --------------------------------- |
|
|
547
|
+
| `retry` | Standalone retry utility |
|
|
548
|
+
| `exponentialBackoff` | Exponential backoff strategy |
|
|
549
|
+
| `linearBackoff` | Linear backoff strategy |
|
|
550
|
+
| `jitteredExponentialBackoff` | Jittered exponential backoff |
|
|
551
|
+
| `backoffPresets` | Pre-configured backoff strategies |
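
The strategy names in this table correspond to well-known delay formulas. The sketch below shows stand-alone versions of each, assuming a 100 ms base, doubling per attempt, and a 10-second cap. The function names, parameters, and defaults are hypothetical; the package's own signatures and presets may differ.

```ts
// Hypothetical stand-alone versions; NOT the package's signatures.
// Attempts are 1-based: attempt 1 is the first retry.

// Doubles the delay each attempt, capped at maxMs.
function exponentialDelayMs(attempt: number, baseMs = 100, maxMs = 10000): number {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

// Grows the delay by a fixed step each attempt, capped at maxMs.
function linearDelayMs(attempt: number, stepMs = 100, maxMs = 10000): number {
  return Math.min(maxMs, stepMs * attempt);
}

// "Full jitter": a uniform random delay below the exponential delay.
// The random source is injectable so the function can be tested.
function jitteredDelayMs(
  attempt: number,
  baseMs = 100,
  maxMs = 10000,
  random: () => number = Math.random,
): number {
  return Math.floor(random() * exponentialDelayMs(attempt, baseMs, maxMs));
}
```

Jitter spreads retries from many clients over time, which helps avoid synchronized retry bursts against a rate-limited model provider.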
|
|
645
552
|
|
|
646
|
-
|
|
553
|
+
See source for all built-in guardrails:
|
|
647
554
|
|
|
648
|
-
|
|
555
|
+
- Input helpers: [`./src/guardrails/input.ts`](./src/guardrails/input.ts)
|
|
556
|
+
- Output helpers: [`./src/guardrails/output.ts`](./src/guardrails/output.ts)
|
|
557
|
+
- Tool helpers: [`./src/guardrails/tools.ts`](./src/guardrails/tools.ts)
|
|
558
|
+
- MCP security: [`./src/guardrails/mcp-security.ts`](./src/guardrails/mcp-security.ts)
|
|
559
|
+
|
|
560
|
+
## Examples
|
|
561
|
+
|
|
562
|
+
Browse 48+ runnable examples: [examples/README.md](./examples/README.md)
|
|
563
|
+
|
|
564
|
+
### Quick Starts
|
|
649
565
|
|
|
650
566
|
| Example | Description | File |
|
|
651
567
|
| -------------------------- | ------------------------------- | --------------------------------------------------------------------------------- |
|
|
652
568
|
| Simple combined protection | Minimal input and output setup | [07a-simple-combined-protection.ts](./examples/07a-simple-combined-protection.ts) |
|
|
653
569
|
| Auto retry on output | Retry until output meets a rule | [32-auto-retry-output.ts](./examples/32-auto-retry-output.ts) |
|
|
654
|
-
| LLM judge auto-retry | Judge feedback drives retry | [
|
|
655
|
-
| Expected tool use retry | Enforce/guide tool usage | [34-expected-tool-use-retry.ts](./examples/34-expected-tool-use-retry.ts) |
|
|
570
|
+
| LLM judge auto-retry | Judge feedback drives retry | [35-judge-auto-retry.ts](./examples/35-judge-auto-retry.ts) |
|
|
656
571
|
| Weather assistant | End-to-end input/output + retry | [33-blog-post-weather-assistant.ts](./examples/33-blog-post-weather-assistant.ts) |
|
|
657
572
|
|
|
658
|
-
Input safety
|
|
573
|
+
### Input Safety
|
|
659
574
|
|
|
660
575
|
| Example | Description | File |
|
|
661
576
|
| ------------------ | ----------------------------------- | --------------------------------------------------------------- |
|
|
@@ -664,7 +579,7 @@ Input safety
|
|
|
664
579
|
| PII detection | Detect PII before calling the model | [03-pii-detection.ts](./examples/03-pii-detection.ts) |
|
|
665
580
|
| Rate limiting | Simple per-user rate limit | [13-rate-limiting.ts](./examples/13-rate-limiting.ts) |
|
|
666
581
|
|
|
667
|
-
Output safety
|
|
582
|
+
### Output Safety
|
|
668
583
|
|
|
669
584
|
| Example | Description | File |
|
|
670
585
|
| ----------------------- | ----------------------------------- | ------------------------------------------------------------------------- |
|
|
@@ -672,7 +587,7 @@ Output safety
|
|
|
672
587
|
| Sensitive output filter | Filter secrets and PII in responses | [05-sensitive-output-filter.ts](./examples/05-sensitive-output-filter.ts) |
|
|
673
588
|
| Hallucination detection | Flag uncertain factual claims | [19-hallucination-detection.ts](./examples/19-hallucination-detection.ts) |
|
|
674
589
|
|
|
675
|
-
Streaming
|
|
590
|
+
### Streaming
|
|
676
591
|
|
|
677
592
|
| Example | Description | File |
|
|
678
593
|
| ----------------- | ---------------------------------- | --------------------------------------------------------------------------------- |
|
|
@@ -680,7 +595,7 @@ Streaming
|
|
|
680
595
|
| Streaming quality | Quality checks with streaming | [12-streaming-quality.ts](./examples/12-streaming-quality.ts) |
|
|
681
596
|
| Early termination | Stop streams early when blocked | [28-streaming-early-termination.ts](./examples/28-streaming-early-termination.ts) |
|
|
682
597
|
|
|
683
|
-
Advanced
|
|
598
|
+
### Advanced
|
|
684
599
|
|
|
685
600
|
| Example | Description | File |
|
|
686
601
|
| -------------------------- | ----------------------------- | ------------------------------------------------------------------------------- |
|
|
@@ -689,30 +604,47 @@ Advanced
|
|
|
689
604
|
| SQL code safety | Basic SQL safety checks | [24-sql-code-safety.ts](./examples/24-sql-code-safety.ts) |
|
|
690
605
|
| Role hierarchy enforcement | Enforce role rules in prompts | [23-role-hierarchy-enforcement.ts](./examples/23-role-hierarchy-enforcement.ts) |
|
|
691
606
|
|
|
692
|
-
##
|
|
607
|
+
## Migration from v3.x
|
|
693
608
|
|
|
694
|
-
|
|
695
|
-
- AI SDK: Compatible with AI SDK 5 (`ai@^5`); wraps any model
|
|
696
|
-
- For `generateObject`: for strict object validation, run `executeOutputGuardrails()` after generation
|
|
609
|
+
API naming has been improved in v4.x (old names still work but are deprecated):
|
|
697
610
|
|
|
698
|
-
|
|
611
|
+
```ts
|
|
612
|
+
// Before (v3.x - still works but deprecated)
|
|
613
|
+
import { wrapWithGuardrails, InputBlockedError } from 'ai-sdk-guardrails';
|
|
614
|
+
const model = wrapWithGuardrails(openai('gpt-4o'), { ... });
|
|
699
615
|
|
|
700
|
-
|
|
701
|
-
|
|
702
|
-
|
|
703
|
-
B -->|Valid| C[AI Model]
|
|
704
|
-
B -->|Blocked| X[No API Call]
|
|
705
|
-
C --> D[Output Guardrails]
|
|
706
|
-
D -->|Clean| E[Response]
|
|
707
|
-
D -->|Blocked| R[Retry/Replace/Throw]
|
|
616
|
+
// After (v4.x - recommended)
|
|
617
|
+
import { withGuardrails, GuardrailsInputError } from 'ai-sdk-guardrails';
|
|
618
|
+
const model = withGuardrails(openai('gpt-4o'), { ... });
|
|
708
619
|
```
|
|
709
620
|
|
|
710
|
-
|
|
621
|
+
Changes:
|
|
622
|
+
|
|
623
|
+
- `wrapWithGuardrails` → `withGuardrails`
|
|
624
|
+
- `wrapAgentWithGuardrails` → `withAgentGuardrails`
|
|
625
|
+
- `InputBlockedError` → `GuardrailsInputError`
|
|
626
|
+
- `OutputBlockedError` → `GuardrailsOutputError`
|
|
627
|
+
|
|
628
|
+
## Compatibility
|
|
629
|
+
|
|
630
|
+
- **Runtime**: Node.js 18+ recommended
|
|
631
|
+
- **AI SDK**: Compatible with AI SDK 5.x (`ai@^5`)
|
|
632
|
+
- **TypeScript**: Full type safety with TypeScript 5+
|
|
633
|
+
- **Works with any model**: OpenAI, Anthropic, Mistral, Groq, etc.
|
|
634
|
+
|
|
635
|
+
## Why This Library?
|
|
636
|
+
|
|
637
|
+
**Non-invasive**: Guardrails are middleware. Your existing code, telemetry (Langfuse, Helicone), and logging stay intact.
|
|
638
|
+
|
|
639
|
+
**Production-ready**: Used in production by teams who need compliance, security, and cost control without rebuilding their infrastructure.
|
|
640
|
+
|
|
641
|
+
**Developer experience**: One line to add safety. Progressive complexity - start simple, add advanced features when needed.
|
|
642
|
+
|
|
643
|
+
**Type-safe**: Rich TypeScript types and inference throughout.
|
|
644
|
+
|
|
645
|
+
**Comprehensive**: 40+ built-in guardrails covering security, quality, compliance, and performance.
|
|
711
646
|
|
|
712
|
-
|
|
713
|
-
- Composable: run multiple guardrails in any order
|
|
714
|
-
- Type-safe: rich TypeScript types and inference
|
|
715
|
-
- Sensible defaults: zero-config to start, full control when you need it
|
|
647
|
+
**Advanced features**: Early detection, parallel execution, enhanced prompt injection detection, MCP security, and more.
|
|
716
648
|
|
|
717
649
|
## Contributing
|
|
718
650
|
|
|
@@ -720,4 +652,4 @@ Issues and PRs are welcome.
|
|
|
720
652
|
|
|
721
653
|
## License
|
|
722
654
|
|
|
723
|
-
MIT © Jag Reehal. See LICENSE for details.
|
|
655
|
+
MIT © Jag Reehal. See [LICENSE](./LICENSE) for details.
|