truthguard-ai 0.1.3 → 0.1.5
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +159 -46
- package/package.json +1 -1
- package/README.full.bak +0 -363
package/README.md
CHANGED

@@ -1,8 +1,8 @@
 # TruthGuard
 
-**
+**Standardized grounding validation for tool-calling AI agents.**
 
-> Detect when an agent's response contradicts the data returned by the tools it called — without LLM-as-judge overhead.
+> Detect when an agent's response contradicts the data returned by the tools it called — deterministically, without LLM-as-judge overhead.
 
 [](https://www.npmjs.com/package/truthguard-ai)
 [](LICENSE)
@@ -11,24 +11,26 @@
 
 ## The Problem
 
-Most "hallucinations" in tool-calling agents are **grounding failures** — the agent calls a tool, gets accurate data, then ignores it, miscalculates, or fabricates from empty results.
+Most "hallucinations" in tool-calling agents are **grounding failures** — the agent calls a tool, gets accurate data, and then ignores it, miscalculates, or fabricates from empty results. The source of truth is already in the trace.
 
 ## The Solution
 
-TruthGuard extracts factual claims from the agent's response
+TruthGuard extracts factual claims from the agent's response, cross-references them against tool outputs, and reports grounding failures with standardized codes — like OBD diagnostic codes for AI.
 
-```
-npm install truthguard
+```
+npm install truthguard
 ```
 
+**Zero LLM calls.** 30+ deterministic failure detectors. Runs in <50ms.
+
 ---
 
-## Quick Start
+## Quick Start — 3 Minutes
 
-### Evaluate a trace
+### 1. Evaluate a trace
 
 ```typescript
-import { TraceBuilder, GroundingEngine, generateReport } from 'truthguard
+import { TraceBuilder, GroundingEngine, generateReport } from 'truthguard';
 
 const trace = new TraceBuilder({ traceId: 'run-001' })
   .addUserInput('How many employees are on leave today?')
@@ -37,7 +39,7 @@ const trace = new TraceBuilder({ traceId: 'run-001' })
     { employeeId: 'E01', name: 'Ana Jovic', status: 'on_leave' },
     { employeeId: 'E02', name: 'Ivan Petrovic', status: 'on_leave' },
   ])
-  .addFinalResponse('There are 3 employees on leave today.')
+  .addFinalResponse('There are 3 employees on leave today.') // ← Bug: says 3, data shows 2
   .build();
 
 const engine = new GroundingEngine();
@@ -45,39 +47,36 @@ const report = engine.evaluate(trace);
 
 console.log(report.groundingScore); // 0.5
 console.log(report.detectedFailures[0]); // { type: 'grounding.data_ignored', severity: 'high' }
+
+const { text } = generateReport(report);
+console.log(text);
 ```
 
-###
+### 2. Add a CI quality gate
 
 ```typescript
-import {
-import OpenAI from 'openai';
+import { loadDataset, runDataset, evaluateGate, loadGateConfig } from 'truthguard';
 
-const
-
-
-
-```
-
-```typescript
-import { wrapAnthropic } from 'truthguard-ai';
-import Anthropic from '@anthropic-ai/sdk';
+const entries = loadDataset('./test-cases.jsonl');
+const result = runDataset(entries);
+const gate = loadGateConfig('.ai-rcp-gate.yml');
+const verdict = evaluateGate(result, gate);
 
-
-
-
-}
+if (!verdict.pass) {
+  console.error(verdict.report);
+  process.exit(1);
+}
 ```
 
-###
+### 3. Monitor in production (proxy mode)
 
 Works with **any language** — PHP, Python, Go, Java, Ruby, C#:
 
 ```bash
-npx truthguard
+npx truthguard observe --port 3001
 ```
 
-
+Change your AI base URL:
 ```
 # OpenAI
 OPENAI_BASE_URL=http://localhost:3001/proxy/openai
@@ -86,17 +85,72 @@ OPENAI_BASE_URL=http://localhost:3001/proxy/openai
 ANTHROPIC_BASE_URL=http://localhost:3001/proxy/anthropic
 ```
 
-Your app works
+Your app works exactly the same. TruthGuard transparently proxies requests and evaluates grounding in the background.
 
-
+---
 
-
-
+## Detection
+
+30+ deterministic failure detectors across grounding, orchestration, and reasoning categories.
+
+Examples:
+- Fabrication from empty tool results
+- Math errors against correct tool data
+- Ignored or altered tool data in the response
+- Entity mismatches and mix-ups
+
+---
+
+## Features
+
+### Diagnostic Advisor
+
+Every detected failure includes actionable diagnostics — root cause identification, evidence, and remediation guidance.
+
+### Policy Engine
+
+Configure per-failure actions — block, warn, or observe:
+
+```typescript
+import { wrapOpenAI, GroundingError } from 'truthguard';
+import OpenAI from 'openai';
+
+const openai = wrapOpenAI(new OpenAI(), {
+  mode: 'block',
+  threshold: 0.85,
+  policy: {
+    rules: {
+      'grounding.empty_fabrication': 'block',
+      'grounding.math_error': 'warn',
+      'reasoning.overconfident_language': 'observe',
+    },
+  },
+});
+```
+
+### Baseline Regression Detection
+
+```typescript
+import { createSnapshot, saveBaseline, loadBaseline, compareToBaseline } from 'truthguard';
+
+// Save after a known-good run
+const snapshot = createSnapshot(result, 'v1.2-main');
+saveBaseline('.ai-rcp-baseline.json', snapshot);
+
+// Compare after changes
+const comparison = compareToBaseline(newResult, snapshot);
+if (!comparison.withinTolerance) {
+  console.error('Regression detected:', comparison.report);
+}
 ```
 
 ### MCP Server (VS Code, Cursor)
 
-Use TruthGuard from your IDE —
+Use TruthGuard directly from your IDE — no terminal needed.
+
+**Setup (one time):**
+1. In VS Code: `Ctrl+Shift+P` → **"MCP: Open User Configuration"**
+2. Add this to `mcp.json`:
 
 ```json
 {
@@ -104,17 +158,29 @@ Use TruthGuard from your IDE — add to `.vscode/mcp.json`:
     "truthguard": {
       "type": "stdio",
      "command": "npx",
-      "args": ["-y", "truthguard
+      "args": ["-y", "truthguard", "mcp"]
     }
   }
 }
 ```
 
+3. Restart VS Code
+
+**Usage:** In Copilot Chat, say: *"Call truthguard verify_response with this trace: {...}"*
+
+8 tools available: `verify_response`, `quick_check`, `check_trace_quality`, `list_rules`, `get_failure_info`, `evaluate_with_policy`, `get_live_traces`, `get_trace_report`
+
+The last two tools bridge proxy results to your IDE — ask Copilot *"Call get_live_traces"* to see recent production evaluations.
+
+Full setup guide: [docs/getting-started.md](docs/getting-started.md#ide--mcp-server-vs-code-cursor)
+
 ### Express Middleware
 
 ```typescript
-import
+import express from 'express';
+import { groundingMiddleware, FileStore } from 'truthguard';
 
+const app = express();
 app.post('/api/chat', groundingMiddleware({
   mode: 'warn',
   store: new FileStore('./traces/grounding.jsonl'),
@@ -124,21 +190,68 @@ app.post('/api/chat', groundingMiddleware({
 
 ---
 
-##
+## CLI
+
+```bash
+npx truthguard debug trace.json                   # Evaluate one trace
+npx truthguard run dataset.jsonl                  # Batch dataset evaluation
+npx truthguard run dataset.jsonl --gate gate.yml  # CI quality gate
+npx truthguard observe --port 3001                # Start observe server + proxy
+```
+
+---
+
+## CI/CD Integration
+
+### GitHub Actions
+
+```yaml
+# .github/workflows/truthguard-gate.yml
+name: TruthGuard Quality Gate
+on: [push, pull_request]
+
+jobs:
+  grounding-gate:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+      - run: npm ci
+      - run: npx truthguard run test-cases.jsonl --gate .ai-rcp-gate.yml
+```
 
-
-
-
-
-
--
+### Gate config (`.ai-rcp-gate.yml`)
+
+```yaml
+name: "Grounding Quality Gate"
+assertions:
+  - metric: grounding_score
+    operator: ">="
+    threshold: 0.90
+  - metric: failure_count
+    operator: "<="
+    threshold: 0
+  - metric: pass_rate
+    operator: ">="
+    threshold: 1.0
+```
 
 ---
 
-##
+## How It Works
 
-
+1. **Extract** factual claims from the agent's response
+2. **Match** each claim against tool output data
+3. **Detect** failure patterns across grounding, orchestration, and reasoning
+4. **Score** overall grounding quality
+5. **Diagnose** root causes with actionable remediation
+
+**No LLM calls.** 100% deterministic. Configurable tolerances and multi-language support (13+ languages).
+
+---
 
 ## License
 
-MIT
+MIT
package/package.json
CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "truthguard-ai",
-  "version": "0.1.
+  "version": "0.1.5",
   "description": "TruthGuard — Standardized grounding validation for tool-calling AI agents. Detect, diagnose, and prevent grounding failures.",
   "main": "dist-npm/thin.js",
   "types": "dist-npm/thin.d.ts",
package/README.full.bak
DELETED

@@ -1,363 +0,0 @@
-# TruthGuard
-
-**Standardized grounding validation for tool-calling AI agents.**
-
-> Detect when an agent's response contradicts the data returned by the tools it called — deterministically, without LLM-as-judge overhead.
-
-[](https://www.npmjs.com/package/truthguard)
-[](LICENSE)
-
----
-
-## The Problem
-
-Most "hallucinations" in tool-calling agents are **grounding failures** — the agent calls a tool, gets accurate data, and then ignores it, miscalculates, or fabricates from empty results. The source of truth is already in the trace.
-
-## The Solution
-
-TruthGuard extracts factual claims from the agent's response, cross-references them against tool outputs, and reports grounding failures with standardized codes — like OBD diagnostic codes for AI.
-
-```
-npm install truthguard
-```
-
-**Zero LLM calls.** Deterministic regex extraction + fuzzy matching. 30 detection rules across 4 categories. Runs in <50ms.
-
----
-
-## Quick Start — 3 Minutes
-
-### 1. Evaluate a trace
-
-```typescript
-import { TraceBuilder, GroundingEngine, generateReport } from 'truthguard';
-
-const trace = new TraceBuilder({ traceId: 'run-001' })
-  .addUserInput('How many employees are on leave today?')
-  .addToolCall('getLeaveRecords', { date: '2024-03-15' })
-  .addToolOutput('getLeaveRecords', [
-    { employeeId: 'E01', name: 'Ana Jovic', status: 'on_leave' },
-    { employeeId: 'E02', name: 'Ivan Petrovic', status: 'on_leave' },
-  ])
-  .addFinalResponse('There are 3 employees on leave today.') // ← Bug: says 3, data shows 2
-  .build();
-
-const engine = new GroundingEngine();
-const report = engine.evaluate(trace);
-
-console.log(report.groundingScore); // 0.5
-console.log(report.detectedFailures[0]); // { type: 'grounding.data_ignored', severity: 'high' }
-
-const { text } = generateReport(report);
-console.log(text);
-```
-
-### 2. Add a CI quality gate
-
-```typescript
-import { loadDataset, runDataset, evaluateGate, loadGateConfig } from 'truthguard';
-
-const entries = loadDataset('./test-cases.jsonl');
-const result = runDataset(entries);
-const gate = loadGateConfig('.ai-rcp-gate.yml');
-const verdict = evaluateGate(result, gate);
-
-if (!verdict.pass) {
-  console.error(verdict.report);
-  process.exit(1);
-}
-```
-
-### 3. Monitor in production (proxy mode)
-
-Works with **any language** — PHP, Python, Go, Java, Ruby, C#:
-
-```bash
-npx truthguard observe --port 3001
-```
-
-Change your AI base URL:
-```php
-// Before: ANTHROPIC_BASE_URL=https://api.anthropic.com
-// After:
-ANTHROPIC_BASE_URL=http://localhost:3001/proxy/anthropic
-```
-
-Your app works exactly the same. TruthGuard transparently proxies requests and evaluates grounding in the background.
-
----
-
-## Detection Rules (30)
-
-### Grounding (16 rules)
-
-| Code | Description |
-|------|-------------|
-| `empty_fabrication` | Tool returned `[]`, agent fabricated results |
-| `no_tool_call` | Factual question answered without calling any tool |
-| `math_error` | Incorrect calculation from correct tool data |
-| `data_ignored` | Tool data altered or ignored in response |
-| `wrong_query` | Tool called with incorrect parameters |
-| `entity_mismatch` | Agent mixed up entities from results |
-| `hallucinated_entity` | Agent invented entity not in tool data |
-| `partial_answer` | Only part of the question answered |
-| `question_not_answered` | Core question not addressed |
-| `selective_omission` | Some tool results selectively excluded |
-| `tool_error_ignored` | Tool error not handled |
-| `stale_knowledge` | Used outdated data instead of tool results |
-| `incomplete_response` | Empty or fallback response despite having data |
-| `irrelevant_context` | Used unrelated data from different context |
-| `contradictory_claims` | Response contains self-contradicting statements |
-| `unverified_value` | Factual values with no tool data to verify against |
-
-### Orchestration (8 rules)
-
-| Code | Description |
-|------|-------------|
-| `malformed_tool_input` | Bad parameter format in tool call |
-| `raw_output_leak` | XML/JSON markup leaked into response |
-| `intermediate_response_leak` | "Let me check..." text shown to user |
-| `excessive_tool_calls` | Redundant repeated tool invocations |
-| `token_limit_truncation` | Response cut off by token limit |
-| `rate_limit_degradation` | Quality degraded due to rate limiting |
-| `quota_exhaustion` | API quota exceeded |
-| `model_fallback` | Unexpected model fallback |
-
-### Reasoning (4) & Safety (2)
-
-`scope_mismatch`, `overconfident_language`, `language_mismatch`, `duplicate_user_input`, `prompt_leak`, `sensitive_data_exposure`
-
----
-
-## Features
-
-### Diagnostic Advisor
-
-Every detected failure includes root cause analysis, evidence from the trace, and two remediation paths:
-
-```typescript
-import { generateAdvisorReport, formatAdvisorReport } from 'truthguard';
-
-const advisor = generateAdvisorReport(report, trace);
-console.log(formatAdvisorReport(advisor));
-// REPAIR ORDER:
-//   1. Fix grounding.no_tool_call (root cause)
-//   2. Fix grounding.unverified_value (likely resolves after #1)
-// PROMPT HINT: "Always call the relevant tool before answering factual questions"
-// CODE GUARD: if (!trace.hasToolCall()) return forceToolCall(query);
-```
-
-### Policy Engine
-
-Configure per-failure actions — block, warn, or observe:
-
-```typescript
-import { wrapOpenAI, GroundingError } from 'truthguard';
-import OpenAI from 'openai';
-
-const openai = wrapOpenAI(new OpenAI(), {
-  mode: 'block',
-  threshold: 0.85,
-  policy: {
-    rules: {
-      'grounding.empty_fabrication': 'block',
-      'grounding.math_error': 'warn',
-      'reasoning.overconfident_language': 'observe',
-    },
-  },
-});
-```
-
-### Baseline Regression Detection
-
-```typescript
-import { createSnapshot, saveBaseline, loadBaseline, compareToBaseline } from 'truthguard';
-
-// Save after a known-good run
-const snapshot = createSnapshot(result, 'v1.2-main');
-saveBaseline('.ai-rcp-baseline.json', snapshot);
-
-// Compare after changes
-const comparison = compareToBaseline(newResult, snapshot);
-if (!comparison.withinTolerance) {
-  console.error('Regression detected:', comparison.report);
-}
-```
-
-### MCP Server (VS Code, Cursor)
-
-Use TruthGuard directly from your IDE — no terminal needed.
-
-**Setup (one time):**
-1. In VS Code: `Ctrl+Shift+P` → **"MCP: Open User Configuration"**
-2. Add this to `mcp.json`:
-
-```json
-{
-  "servers": {
-    "truthguard": {
-      "type": "stdio",
-      "command": "npx",
-      "args": ["-y", "truthguard", "mcp"]
-    }
-  }
-}
-```
-
-3. Restart VS Code
-
-**Usage:** In Copilot Chat, say: *"Call truthguard verify_response with this trace: {...}"*
-
-8 tools available: `verify_response`, `quick_check`, `check_trace_quality`, `list_rules`, `get_failure_info`, `evaluate_with_policy`, `get_live_traces`, `get_trace_report`
-
-The last two tools bridge proxy results to your IDE — ask Copilot *"Call get_live_traces"* to see recent production evaluations.
-
-Full setup guide: [docs/getting-started.md](docs/getting-started.md#ide--mcp-server-vs-code-cursor)
-
-### Express Middleware
-
-```typescript
-import express from 'express';
-import { groundingMiddleware, FileStore } from 'truthguard';
-
-const app = express();
-app.post('/api/chat', groundingMiddleware({
-  mode: 'warn',
-  store: new FileStore('./traces/grounding.jsonl'),
-  extractTrace: (req, res, body) => body.trace,
-}));
-```
-
----
-
-## CLI
-
-```bash
-npx truthguard debug trace.json                   # Evaluate one trace
-npx truthguard run dataset.jsonl                  # Batch dataset evaluation
-npx truthguard run dataset.jsonl --gate gate.yml  # CI quality gate
-npx truthguard observe --port 3001                # Start observe server + proxy
-```
-
----
-
-## CI/CD Integration
-
-### GitHub Actions
-
-```yaml
-# .github/workflows/truthguard-gate.yml
-name: TruthGuard Quality Gate
-on: [push, pull_request]
-
-jobs:
-  grounding-gate:
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v4
-      - uses: actions/setup-node@v4
-        with:
-          node-version: '20'
-      - run: npm ci
-      - run: npx truthguard run test-cases.jsonl --gate .ai-rcp-gate.yml
-```
-
-### Gate config (`.ai-rcp-gate.yml`)
-
-```yaml
-name: "Grounding Quality Gate"
-assertions:
-  - metric: grounding_score
-    operator: ">="
-    threshold: 0.90
-  - metric: failure_count
-    operator: "<="
-    threshold: 0
-  - metric: pass_rate
-    operator: ">="
-    threshold: 1.0
-```
-
----
-
-## How It Works
-
-```
-Agent Response → Claim Extraction → Matcher  → Rules  → Report
-                 (regex: numbers,   (numeric,  (30      (score,
-                  dates, names,      count,     rules)   failures,
-                  counts)            date,               advisor)
-                                     name)
-```
-
-1. **Extract** factual claims from the agent's text response (numbers, dates, names, counts)
-2. **Match** each claim against values in tool outputs (with configurable tolerances)
-3. **Detect** failure patterns using 30 rules across 4 categories
-4. **Score** grounding quality with severity-weighted formula
-5. **Diagnose** root causes and suggest repair sequence
-
-**No LLM calls.** 100% deterministic. ~55% claim coverage (numbers, dates, names, counts). L2 structured matching (booleans, enums, key-values) extends to ~70-75%.
-
----
-
-## Configurable Tolerances
-
-```yaml
-# .ai-rcp.yml
-tolerances:
-  numeric:
-    relative_tolerance: 0.05  # ±5% for numbers
-    rounding_allowed: true
-  count:
-    exact_match: true
-  date:
-    exact_match: true
-  name:
-    fuzzy_match: true
-    threshold: 0.85  # Jaro-Winkler similarity
-```
-
----
-
-## Language Support
-
-- **Claim extraction:** Numbers, dates (7 formats incl. European DD.MM.YYYY), Serbian months (januar–decembar), relative dates (yesterday/juče, pre N dana)
-- **Unit conversion:** 13 languages (EN, SR, ES, FR, PT, RU, HI, AR, BN, ZH, JA...)
-- **Vague qualifier guard:** English + Serbian (oko, otprilike, negde)
-- **Name matching:** Diacritics-aware (ć→c, š→s) via Jaro-Winkler
-
----
-
-## Architecture
-
-```
-src/
-├── Trace/        TraceBuilder SDK + multi-turn support
-├── Claims/       Claim extraction (regex, multilingual)
-├── Matchers/     Numeric, count, date, name matchers
-├── Rules/        30 detection rules (4 categories)
-├── Grounding/    Engine orchestration + entity-aware grounding
-├── Advisor/      Diagnostic advisor (RCA, repair sequence, hints)
-├── Registry/     Failure registry (severity, suppression graph)
-├── Policy/       Per-failure enforcement (block/warn/observe)
-├── Reports/      JSON + text report generators
-├── Config/       YAML tolerance configuration
-├── Gate/         CI/CD quality gate
-├── Baseline/     Snapshot regression detection
-├── Runner/       JSONL dataset batch evaluation
-├── Mode/         Pipeline (debug/ci/observe/warn/block)
-├── Store/        FileStore + InMemoryStore
-├── Alerting/     Console, Webhook, Callback dispatchers
-├── Middleware/   Express middleware factory
-├── SDK/          OpenAI wrapper (auto trace capture)
-├── Proxy/        Transparent AI API proxy builders
-├── MCP/          MCP Server (8 IDE tools)
-├── L2/           Structured context matching (boolean, enum, key-value)
-└── cli/          CLI commands
-```
-
----
-
-## License
-
-MIT