@boshu2/vibe-check 2.2.1 → 2.4.0
- package/.agents/plans/2025-12-28-ai-safety-integration-plan.md +326 -0
- package/.agents/plans/2025-12-29-complexity-driver-plan.md +225 -0
- package/.agents/plans/2025-12-29-complexity-drivers-plan.md +253 -0
- package/.agents/research/2025-12-28-ai-platform-security-integration.md +295 -0
- package/.agents/research/2025-12-29-complexity-driver-architecture.md +392 -0
- package/.agents/research/2025-12-29-complexity-drivers.md +227 -0
- package/.beads/README.md +81 -0
- package/.beads/config.yaml +62 -0
- package/.beads/interactions.jsonl +0 -0
- package/.beads/issues.jsonl +21 -0
- package/.beads/metadata.json +4 -0
- package/.gitattributes +3 -0
- package/AGENTS.md +40 -0
- package/CHANGELOG.md +69 -0
- package/CLAUDE.md +75 -0
- package/README.md +71 -0
- package/dist/ai-safety/contract-drift.d.ts +14 -0
- package/dist/ai-safety/contract-drift.d.ts.map +1 -0
- package/dist/ai-safety/contract-drift.js +230 -0
- package/dist/ai-safety/contract-drift.js.map +1 -0
- package/dist/ai-safety/index.d.ts +43 -0
- package/dist/ai-safety/index.d.ts.map +1 -0
- package/dist/ai-safety/index.js +177 -0
- package/dist/ai-safety/index.js.map +1 -0
- package/dist/ai-safety/scope-violation.d.ts +18 -0
- package/dist/ai-safety/scope-violation.d.ts.map +1 -0
- package/dist/ai-safety/scope-violation.js +150 -0
- package/dist/ai-safety/scope-violation.js.map +1 -0
- package/dist/ai-safety/secret-leakage.d.ts +18 -0
- package/dist/ai-safety/secret-leakage.d.ts.map +1 -0
- package/dist/ai-safety/secret-leakage.js +188 -0
- package/dist/ai-safety/secret-leakage.js.map +1 -0
- package/dist/ai-safety/token-spiral.d.ts +17 -0
- package/dist/ai-safety/token-spiral.d.ts.map +1 -0
- package/dist/ai-safety/token-spiral.js +183 -0
- package/dist/ai-safety/token-spiral.js.map +1 -0
- package/dist/ai-safety/types.d.ts +122 -0
- package/dist/ai-safety/types.d.ts.map +1 -0
- package/dist/ai-safety/types.js +32 -0
- package/dist/ai-safety/types.js.map +1 -0
- package/dist/analyzers/complexity.d.ts +92 -0
- package/dist/analyzers/complexity.d.ts.map +1 -0
- package/dist/analyzers/complexity.js +79 -0
- package/dist/analyzers/complexity.js.map +1 -0
- package/dist/analyzers/modularity.d.ts +3 -1
- package/dist/analyzers/modularity.d.ts.map +1 -1
- package/dist/analyzers/modularity.js +32 -6
- package/dist/analyzers/modularity.js.map +1 -1
- package/dist/cli.js +2 -1
- package/dist/cli.js.map +1 -1
- package/dist/commands/driver.d.ts +18 -0
- package/dist/commands/driver.d.ts.map +1 -0
- package/dist/commands/driver.js +58 -0
- package/dist/commands/driver.js.map +1 -0
- package/dist/commands/index.d.ts +1 -0
- package/dist/commands/index.d.ts.map +1 -1
- package/dist/commands/index.js +1 -0
- package/dist/commands/index.js.map +1 -1
- package/dist/commands/modularity.d.ts +2 -0
- package/dist/commands/modularity.d.ts.map +1 -1
- package/dist/commands/modularity.js +86 -7
- package/dist/commands/modularity.js.map +1 -1
- package/dist/commands/session.d.ts +9 -0
- package/dist/commands/session.d.ts.map +1 -1
- package/dist/commands/session.js +42 -0
- package/dist/commands/session.js.map +1 -1
- package/dist/commands/watch.d.ts.map +1 -1
- package/dist/commands/watch.js +59 -0
- package/dist/commands/watch.js.map +1 -1
- package/drivers/README.md +327 -0
- package/drivers/go.sh +131 -0
- package/drivers/java.sh +137 -0
- package/drivers/javascript.sh +134 -0
- package/drivers/php.sh +132 -0
- package/drivers/python.sh +90 -0
- package/drivers/rust.sh +132 -0
- package/package.json +4 -1
@@ -0,0 +1,253 @@
+---
+date: 2025-12-29
+type: Plan
+topic: "Add complexity drivers for Rust, PHP, and Java"
+research: ".agents/research/2025-12-29-complexity-drivers.md"
+tags: [plan, drivers, complexity, rust, php, java]
+status: COMPLETED
+---
+
+# Plan: Additional Complexity Drivers
+
+**Created:** 2025-12-29
+**Research:** .agents/research/2025-12-29-complexity-drivers.md
+**Vibe Level:** L4 (high confidence - following established driver pattern)
+
+---
+
+## Overview
+
+Add complexity drivers for **Rust**, **PHP**, and **Java** to expand vibe-check's language coverage. These are the highest-value additions based on language popularity (TIOBE/Stack Overflow 2025) and tool quality. All three have tools with JSON output, making implementation straightforward.
+
+---
+
+## Approach
+
+Follow existing driver pattern established by `python.sh`, `javascript.sh`, and `go.sh`:
+
+1. Shell script wrapper around language-specific tool
+2. Transform tool output to `ComplexityReport` JSON schema
+3. Update CLI error messages to include new driver
+4. Update documentation
+
+**Why this approach:**
+- Proven pattern (3 working drivers)
+- Language-agnostic kernel stays clean
+- Shell scripts are portable and don't add npm dependencies
+
+---
+
+## Features
+
+### Feature 1: Rust Driver
+
+**Priority:** P1
+**Type:** feature
+**Depends On:** None
+
+**Acceptance Criteria:**
+- [ ] `drivers/rust.sh` created wrapping `rust-code-analysis-cli`
+- [ ] Outputs valid `ComplexityReport` JSON
+- [ ] Handles missing tool with helpful error message
+- [ ] Handles empty directories gracefully
+- [ ] CLI updated to list `rust` as available driver
+- [ ] Documentation updated (drivers/README.md, README.md)
+
+**Files Affected:**
+- `drivers/rust.sh` - New driver script
+- `drivers/README.md` - Add Rust section
+- `README.md` - Add Rust to Available Drivers list
+- `src/commands/driver.ts` - Update available drivers message
+- `src/commands/modularity.ts` - Update available drivers message
+
+**Test Strategy:**
+1. Create test Rust file with known complexity
+2. Run `./drivers/rust.sh /path/to/test`
+3. Verify JSON output matches schema
+4. Run `vibe-check driver rust /path/to/test`
+5. Run `vibe-check modularity --with-complexity rust`
+
+**Tool Details:**
+```bash
+# Install
+cargo install rust-code-analysis-cli
+
+# Usage (native JSON)
+rust-code-analysis-cli -m -O json -p /path/to/src
+```
+
+---
+
+### Feature 2: PHP Driver
+
+**Priority:** P2
+**Type:** feature
+**Depends On:** None (can be done in parallel with Rust)
+
+**Acceptance Criteria:**
+- [ ] `drivers/php.sh` created wrapping `phpmd`
+- [ ] Outputs valid `ComplexityReport` JSON
+- [ ] Handles missing tool with helpful error message
+- [ ] Handles empty directories gracefully
+- [ ] CLI updated to list `php` as available driver
+- [ ] Documentation updated
+
+**Files Affected:**
+- `drivers/php.sh` - New driver script
+- `drivers/README.md` - Add PHP section
+- `README.md` - Add PHP to Available Drivers list
+- `src/commands/driver.ts` - Update available drivers message
+- `src/commands/modularity.ts` - Update available drivers message
+
+**Test Strategy:**
+1. Create test PHP file with known complexity
+2. Run `./drivers/php.sh /path/to/test`
+3. Verify JSON output matches schema
+4. Run `vibe-check driver php /path/to/test`
+
+**Tool Details:**
+```bash
+# Install
+composer global require phpmd/phpmd
+
+# Usage (native JSON)
+phpmd /path/to/src json codesize
+```
+
+---
+
+### Feature 3: Java Driver
+
+**Priority:** P2
+**Type:** feature
+**Depends On:** None (can be done in parallel)
+
+**Acceptance Criteria:**
+- [ ] `drivers/java.sh` created wrapping PMD
+- [ ] Outputs valid `ComplexityReport` JSON
+- [ ] Handles missing tool with helpful error message
+- [ ] Handles empty directories gracefully
+- [ ] CLI updated to list `java` as available driver
+- [ ] Documentation updated
+
+**Files Affected:**
+- `drivers/java.sh` - New driver script
+- `drivers/README.md` - Add Java section
+- `README.md` - Add Java to Available Drivers list
+- `src/commands/driver.ts` - Update available drivers message
+- `src/commands/modularity.ts` - Update available drivers message
+
+**Test Strategy:**
+1. Create test Java file with known complexity
+2. Run `./drivers/java.sh /path/to/test`
+3. Verify JSON output matches schema
+4. Run `vibe-check driver java /path/to/test`
+
+**Tool Details:**
+```bash
+# Install (download PMD)
+# https://pmd.github.io/
+
+# Usage (JSON via flag)
+pmd check -d /path/to/src -R category/java/design.xml -f json
+```
+
+**Note:** PMD requires JRE. Document this clearly.
+
+---
+
+## Implementation Order
+
+| Step | Feature | Depends On | Validation |
+|------|---------|------------|------------|
+| 1 | Rust Driver | - | `vibe-check driver rust ./test` produces valid JSON |
+| 2 | PHP Driver | - | `vibe-check driver php ./test` produces valid JSON |
+| 3 | Java Driver | - | `vibe-check driver java ./test` produces valid JSON |
+
+**Note:** All three can be implemented in parallel - no dependencies between them.
+
+---
+
+## Beads Issues to Create
+
+After approval, these issues will be created:
+
+| ID | Title | Type | Priority | Depends On |
+|----|-------|------|----------|------------|
+| TBD | Epic: Additional Complexity Drivers | epic | P1 | - |
+| TBD | Create Rust complexity driver | feature | P1 | Epic |
+| TBD | Create PHP complexity driver | feature | P2 | Epic |
+| TBD | Create Java complexity driver | feature | P2 | Epic |
+
+---
+
+## Output Format Reference
+
+All drivers must output JSON matching this schema:
+
+```typescript
+{
+  tool: string; // "rust-code-analysis", "phpmd", "pmd"
+  language: string; // "rust", "php", "java"
+  generatedAt: string; // ISO timestamp
+  files: {
+    [filepath: string]: {
+      functions: Array<{
+        name: string;
+        complexity: number;
+        grade: 'A' | 'B' | 'C' | 'D' | 'E' | 'F';
+        line: number;
+        endLine?: number;
+      }>;
+      avgComplexity: number;
+      maxComplexity: number;
+      grade: 'A' | 'B' | 'C' | 'D' | 'E' | 'F';
+    };
+  };
+  summary: {
+    totalFiles: number;
+    totalFunctions: number;
+    avgComplexity: number;
+    gradeDistribution: Record<'A'|'B'|'C'|'D'|'E'|'F', number>;
+  };
+}
+```
+
+**Grade Thresholds:**
+- A: 1-5 (simple)
+- B: 6-10 (acceptable)
+- C: 11-20 (consider refactoring)
+- D: 21-30 (refactor)
+- E: 31-40 (high risk)
+- F: 41+ (unmaintainable)
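
The schema and thresholds above leave the grading step implicit. The following is a minimal TypeScript sketch of that step, not the package's actual code. `gradeFor` and `summarizeFile` are hypothetical names. Two assumptions are not stated in the plan: a complexity of 0 is treated as grade A, and a file's grade follows its most complex function.

```typescript
type Grade = "A" | "B" | "C" | "D" | "E" | "F";

interface FunctionComplexity {
  name: string;
  complexity: number;
  grade: Grade;
  line: number;
  endLine?: number;
}

// Map a complexity score to a grade using the thresholds listed in the plan.
// Assumption: scores below 1 (e.g. 0) also map to "A".
function gradeFor(complexity: number): Grade {
  if (complexity <= 5) return "A";
  if (complexity <= 10) return "B";
  if (complexity <= 20) return "C";
  if (complexity <= 30) return "D";
  if (complexity <= 40) return "E";
  return "F";
}

// Build one per-file entry of the report from raw tool measurements.
function summarizeFile(
  raw: Array<{ name: string; complexity: number; line: number }>
) {
  const functions: FunctionComplexity[] = raw.map((f) => ({
    name: f.name,
    complexity: f.complexity,
    line: f.line,
    grade: gradeFor(f.complexity),
  }));
  const total = functions.reduce((sum, f) => sum + f.complexity, 0);
  const avgComplexity = functions.length > 0 ? total / functions.length : 0;
  const maxComplexity = functions.reduce(
    (max, f) => Math.max(max, f.complexity),
    0
  );
  // Assumption: the file grade is derived from the worst function.
  return { functions, avgComplexity, maxComplexity, grade: gradeFor(maxComplexity) };
}
```

Deriving the file grade from `avgComplexity` instead would be an equally plausible reading of the schema; the plan does not say which one the drivers use.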
+
+---
+
+## Rollback Procedure
+
+Each driver is independent. To rollback:
+1. `git revert <commit>` for the specific driver
+2. Remove the `drivers/<lang>.sh` file
+3. Update CLI error messages to remove language
+
+---
+
+## Not In Scope
+
+- Ruby driver (P3 - requires text parsing)
+- C# driver (P3 - requires XML parsing)
+- Automated tool installation
+- Windows-specific handling
+
+---
+
+## Next Steps
+
+1. Review and approve this plan
+2. Run beads issue creation (below)
+3. `bd ready` to see unblocked issues
+4. `/implement` to execute
+
+---
+
+**Output:** .agents/plans/2025-12-29-complexity-drivers-plan.md
@@ -0,0 +1,295 @@
+---
+date: 2025-12-28
+type: Research
+topic: "Integrating ai-platform security and LLM validation into vibe-check"
+tags: [research, security, llm-validation, integration, ai-platform, vibe-check]
+status: COMPLETE
+---
+
+# Research: Integrating AI-Platform Security & LLM Validation into Vibe-Check
+
+**Created:** 2025-12-28
+**Goal:** Understand how to integrate the security and LLM validation components from ai-platform into vibe-check for detecting AI agent misbehavior patterns.
+
+---
+
+## Executive Summary
+
+The **ai-platform** repository (`/Users/fullerbt/workspaces/work/ai-platform`) contains a comprehensive security and LLM validation stack built for production use in classified environments. Key components include: **RBAC access control**, **audit logging** (SIEM-ready), **agent execution tracking** (Prometheus metrics), **rate limiting with token budgets**, and **contract-driven agent testing**.
+
+**Recommendation:** Integrate these concepts into vibe-check as a new `ai-safety/` or `agent-validation/` module, extending the existing inner-loop failure pattern detection to include LLM-specific antipatterns like **prompt injection attempts**, **hallucination markers**, **scope violations**, and **contract drift**.
+
+---
+
+## Current State
+
+### What Exists in ai-platform
+
+| Component | Location | Purpose |
+|-----------|----------|---------|
+| **Access Control** | `services/gateway/access_control.py` | RBAC for agents/tools via OIDC groups |
+| **Audit Logging** | `services/gateway/audit.py` | JSON request/response logging for SIEM |
+| **Agent Audit** | `services/gateway/agent_audit.py` | Agent execution tracking + Prometheus metrics |
+| **Rate Limiting** | `services/gateway/rate_limit.py` | Per-user rate limits + token budgets |
+| **Config Validator** | `services/etl/app/config_validator.py` | Startup config health checks |
+| **Security Tests** | `tests/agents/test_agent_security.py` | Boundary violation detection |
+| **Contract Tests** | `tests/agents/test_agent_contract_validation.py` | Output contract compliance |
+| **Platform Validator** | `services/etl/scripts/validate-platform.py` | End-to-end LLM platform checks |
+
+### Key Patterns Worth Porting to vibe-check
+
+1. **Agent Scope Boundary Detection** (`test_agent_security.py`)
+   - Detects when agents attempt actions outside their declared scope
+   - Pattern: Check if commit changes files/APIs not in the agent's domain
+
+2. **Hallucination Markers** (`test_agent_security.py:50-70`)
+   - Detects file path hallucination (made-up paths)
+   - Pattern: Commits referencing non-existent files or phantom dependencies
+
+3. **Secret Leakage Detection** (`test_agent_security.py:120-150`)
+   - Regex patterns for API keys, tokens, credentials
+   - Pattern: Commits accidentally exposing secrets
+
+4. **Contract Drift Detection** (`test_agent_contract_validation.py`)
+   - Validates agent responses match expected structure
+   - Pattern: Detect when AI outputs stop conforming to expected formats
+
+5. **Token Budget Tracking** (`rate_limit.py`)
+   - Tracks token consumption per user/session
+   - Pattern: Detect runaway token usage (context explosion)
+
+### Key Files
+
+| File | Purpose | Relevance to vibe-check |
+|------|---------|-------------------------|
+| `services/gateway/access_control.py` | RBAC via OIDC groups | Could map to commit author scope detection |
+| `services/gateway/audit.py` | JSON audit logging | Structure for storing AI interaction events |
+| `services/gateway/agent_audit.py` | Prometheus metrics + Langfuse traces | Pattern for tracking AI agent behavior over time |
+| `tests/agents/test_agent_security.py` | Security test patterns | Regex patterns and violation detection logic |
+| `services/gateway/rate_limit.py` | Token budget tracking | Pattern for detecting context window abuse |
+
+### Existing Patterns in vibe-check
+
+vibe-check already has:
+- **Inner-loop failure detection** (`src/inner-loop/`) with 4 detectors
+- **Spiral detection** with pattern regexes in `watch.ts`
+- **Session tracking** with baseline comparison
+- **NDJSON storage** for historical pattern analysis
+- **Prometheus-like metrics** (not exposed, but structured similarly)
+
+---
+
+## Findings
+
+### Finding 1: Security Validation Patterns Are Git-Analyzable
+
+**Evidence:** The ai-platform security tests detect violations by analyzing:
+- Commit messages for intent signals
+- File changes for scope violations
+- Code content for secret patterns
+
+**Implication:** These patterns can be adapted for vibe-check's commit-based analysis:
+```
+ai-platform pattern → vibe-check integration
+-------------------------------------------
+Scope violation test → Detect commits touching files outside declared domain
+Secret leakage test → Detect commits adding API keys/tokens
+Hallucination test → Detect commits referencing non-existent paths
+Contract drift test → Detect commits breaking expected output formats
+```
+
+### Finding 2: Agent Audit Structure Is Session-Compatible
+
+**Evidence:** `agent_audit.py` tracks events in a structure similar to vibe-check sessions:
+```python
+audit_record = {
+    "correlation_id": str, # → session_id
+    "agent_name": str, # → (new field)
+    "user_identity": str, # → (already tracked)
+    "tokens_used": { # → (new metric)
+        "input": int,
+        "output": int,
+        "total": int
+    },
+    "tool_invocations": [...], # → (map to commit file changes)
+    "duration_ms": int # → (session duration)
+}
+```
+
+**Implication:** Can extend `active-session.json` with agent/LLM metadata.
+
+### Finding 3: Rate Limiting = Token Spiral Detection
+
+**Evidence:** `rate_limit.py` implements token budget tracking:
+```python
+@dataclass
+class UserRateLimit:
+    max_tokens_per_minute: int = 100_000
+    max_tokens_per_hour: int = 1_000_000
+    max_tokens_per_day: int = 10_000_000
+```
+
+**Implication:** vibe-check could detect "token spirals" - sessions where token consumption explodes, indicating:
+- Context window abuse
+- Prompt stuffing
+- Repeated failed attempts (AI trying same thing repeatedly)
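
Since vibe-check only sees git history, a token spiral would have to be inferred rather than measured. The following is a minimal TypeScript sketch of one way to do that, not the package's implementation. The names `CommitSize`, `estimateTokens`, and `findTokenSpikes` are hypothetical, and both the 4-characters-per-token ratio and the 3x spike factor are assumed rough heuristics.

```typescript
interface CommitSize {
  hash: string;
  charsChanged: number; // characters added or removed by the commit
}

// Assumption: roughly 4 characters per token. This is a coarse heuristic,
// not a tokenizer, so the result is only useful as a relative signal.
const CHARS_PER_TOKEN = 4;

function estimateTokens(charsChanged: number): number {
  return Math.ceil(charsChanged / CHARS_PER_TOKEN);
}

// Flag commits whose estimated tokens exceed `factor` times the mean of
// all earlier commits in the same session (commits ordered oldest first).
function findTokenSpikes(commits: CommitSize[], factor: number = 3): string[] {
  const spikes: string[] = [];
  let runningTotal = 0;
  commits.forEach((commit, index) => {
    const tokens = estimateTokens(commit.charsChanged);
    if (index > 0) {
      const baseline = runningTotal / index;
      if (baseline > 0 && tokens > factor * baseline) {
        spikes.push(commit.hash);
      }
    }
    runningTotal += tokens;
  });
  return spikes;
}
```

Comparing each commit against the session's own running mean keeps the check relative, so an inaccurate characters-per-token ratio shifts every estimate equally and does not change which commits are flagged.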
+
+### Finding 4: Secret Detection Regex Is Ready to Use
+
+**Evidence:** `test_agent_security.py:120-150`:
+```python
+SECRET_PATTERNS = [
+    r'sk-[a-zA-Z0-9]{48}', # OpenAI
+    r'ghp_[a-zA-Z0-9]{36}', # GitHub PAT
+    r'glpat-[a-zA-Z0-9]{20}', # GitLab PAT
+    r'AKIA[0-9A-Z]{16}', # AWS Access Key
+    r'xox[baprs]-[a-zA-Z0-9-]+', # Slack tokens
+]
+```
+
+**Implication:** Can add a `secret-leakage.ts` detector to inner-loop.
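
To illustrate what porting these patterns could look like, here is a minimal TypeScript sketch. It is not the shipped `secret-leakage.ts`. The regexes are carried over from the Python list above; the `SecretFinding` shape and the `scanDiffForSecrets` function are hypothetical, as is the choice to scan only added lines of a unified diff.

```typescript
interface SecretFinding {
  pattern: string; // which kind of secret matched
  line: number; // 1-based line number within the diff text
}

// Regexes ported from the Python SECRET_PATTERNS list. No "g" flag is used,
// so RegExp.test() carries no state between calls.
const SECRET_PATTERNS: Array<{ name: string; regex: RegExp }> = [
  { name: "OpenAI", regex: /sk-[a-zA-Z0-9]{48}/ },
  { name: "GitHub PAT", regex: /ghp_[a-zA-Z0-9]{36}/ },
  { name: "GitLab PAT", regex: /glpat-[a-zA-Z0-9]{20}/ },
  { name: "AWS Access Key", regex: /AKIA[0-9A-Z]{16}/ },
  { name: "Slack token", regex: /xox[baprs]-[a-zA-Z0-9-]+/ },
];

// Scan only added lines ("+" prefix), skipping the "+++" file header.
// Findings record the pattern name and location, never the secret itself.
function scanDiffForSecrets(diff: string): SecretFinding[] {
  const findings: SecretFinding[] = [];
  diff.split("\n").forEach((text, index) => {
    const isAdded = text.charAt(0) === "+" && text.slice(0, 3) !== "+++";
    if (!isAdded) return;
    SECRET_PATTERNS.forEach((entry) => {
      if (entry.regex.test(text)) {
        findings.push({ pattern: entry.name, line: index + 1 });
      }
    });
  });
  return findings;
}
```

Restricting the scan to added lines means a commit that removes a leaked key is not itself reported.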
+
+### Finding 5: Contract Validation Is Output-Pattern Detection
+
+**Evidence:** `test_agent_contract_validation.py` validates agent outputs against YAML-defined contracts:
+```yaml
+agents:
+  - name: code-review
+    output_contract:
+      required_sections: ["summary", "issues", "suggestions"]
+      max_length: 5000
+```
+
+**Implication:** vibe-check could detect "contract drift" - when AI commits start deviating from expected patterns:
+- Commit messages becoming less structured
+- PR descriptions missing required sections
+- Code comments becoming vague or absent
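
One way to make "less structured" measurable is to compare how many commit messages follow a fixed header shape early and late in a session. The following is a minimal TypeScript sketch under that assumption, not the package's detector. Here "structured" is assumed to mean a Conventional Commits style header (`type(scope): subject`), and `conformanceRate` and `contractDrift` are hypothetical names.

```typescript
// Assumption: a structured message starts with a header such as
// "feat(cli): add driver" or "fix: handle empty dir".
const CONVENTIONAL_HEADER = /^[a-z]+(\([^)]+\))?!?: .+/;

// Fraction of messages whose first line matches the header shape.
// An empty list counts as fully conforming.
function conformanceRate(messages: string[]): number {
  if (messages.length === 0) return 1;
  const matching = messages.filter((message) =>
    CONVENTIONAL_HEADER.test(message.split("\n")[0])
  ).length;
  return matching / messages.length;
}

// Compare the first and second half of a session (oldest message first).
// A positive result means later commits conform less than earlier ones.
function contractDrift(messages: string[]): number {
  const mid = Math.floor(messages.length / 2);
  return conformanceRate(messages.slice(0, mid)) - conformanceRate(messages.slice(mid));
}
```

A caller would still need to choose a threshold for reporting drift; the research does not specify one.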
+
+---
+
+## Constraints
+
+| Constraint | Impact | Mitigation |
+|------------|--------|------------|
+| ai-platform is Python, vibe-check is TypeScript | Can't share code directly | Port patterns/logic, not code |
+| ai-platform runs in Kubernetes, vibe-check is CLI | Different runtime context | Adapt to git-based detection |
+| ai-platform has Prometheus/Langfuse dependencies | Heavy deps for CLI tool | Use file-based storage, optionally export |
+| Token tracking requires LLM API integration | vibe-check only sees git | Estimate tokens from commit size/complexity |
+
+---
+
+## Risks
+
+| Risk | Likelihood | Impact | Mitigation |
+|------|------------|--------|------------|
+| Over-engineering simple CLI tool | Medium | High | Start with 2-3 highest-value patterns |
+| False positives on secret detection | Medium | Medium | Require confidence threshold, allow suppression |
+| Token estimation inaccuracy | High | Low | Use as relative signal, not absolute |
+| Breaking existing inner-loop interface | Low | High | Extend, don't replace |
+
+---
+
+## Recommendation
+
+**Approach:** Add a new `src/ai-safety/` module to vibe-check with 4 new detectors:
+
+### Phase 1: High-Value Ports (Recommended First)
+
+1. **Secret Leakage Detector** (`src/ai-safety/secret-leakage.ts`)
+   - Port regex patterns from `test_agent_security.py`
+   - Scan commit diffs for exposed secrets
+   - Integrate into `session end` output
+
+2. **Scope Violation Detector** (`src/ai-safety/scope-violation.ts`)
+   - Detect commits touching files outside declared domain
+   - Requires domain configuration (similar to access_control.py mappings)
+   - Warning when agent strays from its lane
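
The scope check described above can be sketched as a path-prefix filter. This is a minimal TypeScript illustration, not the shipped `scope-violation.ts`. It assumes the "declared domain" is configured as a list of allowed path prefixes; `isInScope` and `findScopeViolations` are hypothetical names.

```typescript
// Assumption: a domain is declared as path prefixes such as "src/ai-safety/".
function isInScope(file: string, allowedPrefixes: string[]): boolean {
  return allowedPrefixes.some((prefix) => file.indexOf(prefix) === 0);
}

// For each commit hash, list the touched files that fall outside the domain.
// Commits with no out-of-scope files are omitted from the result.
function findScopeViolations(
  filesPerCommit: Record<string, string[]>,
  allowedPrefixes: string[]
): Record<string, string[]> {
  const violations: Record<string, string[]> = {};
  Object.keys(filesPerCommit).forEach((hash) => {
    const outside = filesPerCommit[hash].filter(
      (file) => !isInScope(file, allowedPrefixes)
    );
    if (outside.length > 0) {
      violations[hash] = outside;
    }
  });
  return violations;
}
```

Prefix matching is the simplest possible domain model; glob patterns or per-agent mappings would fit the same function signature.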
+
+### Phase 2: Medium-Value Ports
+
+3. **Contract Drift Detector** (`src/ai-safety/contract-drift.ts`)
+   - Detect when commit message patterns degrade
+   - Track deviation from conventional commit format
+   - Alert when AI stops following established patterns
+
+4. **Token Spiral Estimator** (`src/ai-safety/token-spiral.ts`)
+   - Estimate token usage from commit size/complexity
+   - Detect sessions with exploding context
+   - Use as relative metric (not absolute)
+
+### Integration Points
+
+```typescript
+// src/ai-safety/index.ts
+export interface AISafetyAnalysis {
+  secretsDetected: SecretLeakageResult;
+  scopeViolations: ScopeViolationResult;
+  contractDrift: ContractDriftResult;
+  tokenSpiral: TokenSpiralResult;
+  summary: {
+    totalIssues: number;
+    criticalIssues: number;
+    warningIssues: number;
+    overallHealth: 'healthy' | 'warning' | 'critical';
+  };
+  recommendations: string[];
+}
+
+export function analyzeAISafety(
+  commits: Commit[],
+  filesPerCommit: Map<string, string[]>,
+  config?: AISafetyConfig
+): AISafetyAnalysis;
+```
+
+**Integration into existing commands:**
+- `session end` → Add `ai_safety` section to JSON output
+- `watch` → Add real-time alerts for secret detection
+- `insights` → Add AI safety pattern history
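
The `summary` block of `AISafetyAnalysis` can be derived from the individual detector findings. Below is a minimal TypeScript sketch of that roll-up, not the package's code. The `Issue` shape and the `summarize` function are hypothetical, and the rule that any critical issue makes the session `critical` is an assumption.

```typescript
type Severity = "critical" | "warning";

// Hypothetical flattened form of one detector finding.
interface Issue {
  detector: string;
  severity: Severity;
}

interface SafetySummary {
  totalIssues: number;
  criticalIssues: number;
  warningIssues: number;
  overallHealth: "healthy" | "warning" | "critical";
}

// Roll individual findings up into the summary fields.
// Assumption: one critical issue is enough to mark the session critical.
function summarize(issues: Issue[]): SafetySummary {
  const criticalIssues = issues.filter((i) => i.severity === "critical").length;
  const warningIssues = issues.filter((i) => i.severity === "warning").length;
  let overallHealth: SafetySummary["overallHealth"] = "healthy";
  if (criticalIssues > 0) {
    overallHealth = "critical";
  } else if (warningIssues > 0) {
    overallHealth = "warning";
  }
  return {
    totalIssues: issues.length,
    criticalIssues,
    warningIssues,
    overallHealth,
  };
}
```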
+
+### Why This Approach
+
+1. **Non-breaking:** Extends existing architecture, doesn't replace
+2. **High-value first:** Secret leakage is immediately useful
+3. **Familiar patterns:** Same structure as existing inner-loop detectors
+4. **Low deps:** No new dependencies required (pure TypeScript)
+5. **Git-native:** Works with existing commit-based analysis
+
+---
+
+## Alternatives Considered
+
+1. **Import ai-platform as dependency** - Rejected: Different language, heavy deps
+2. **Create shared npm/pip packages** - Rejected: Over-engineering for 4 patterns
+3. **Keep validation in ai-platform only** - Rejected: vibe-check users need these patterns
+4. **Port entire audit system** - Rejected: Too complex, different runtime
+
+---
+
+## Next Steps
+
+1. Run `/plan` to create implementation plan from this research
+2. Plan will create beads issues for each detector
+3. Implement in priority order: secrets → scope → contract → token
+
+---
+
+## Appendix: Key Source File References
+
+### ai-platform Security Components
+- `/Users/fullerbt/workspaces/work/ai-platform/services/gateway/access_control.py` - RBAC implementation
+- `/Users/fullerbt/workspaces/work/ai-platform/services/gateway/audit.py` - Audit logging
+- `/Users/fullerbt/workspaces/work/ai-platform/services/gateway/agent_audit.py` - Agent execution tracking
+- `/Users/fullerbt/workspaces/work/ai-platform/services/gateway/rate_limit.py` - Rate limiting
+- `/Users/fullerbt/workspaces/work/ai-platform/tests/agents/test_agent_security.py` - Security tests
+- `/Users/fullerbt/workspaces/work/ai-platform/tests/agents/test_agent_contract_validation.py` - Contract tests
+
+### vibe-check Integration Points
+- `/Users/fullerbt/workspaces/personal/vibe-check/src/inner-loop/index.ts` - Existing detector orchestrator
+- `/Users/fullerbt/workspaces/personal/vibe-check/src/commands/session.ts` - Session end output
+- `/Users/fullerbt/workspaces/personal/vibe-check/src/commands/watch.ts` - Real-time monitoring
+- `/Users/fullerbt/workspaces/personal/vibe-check/src/types.ts` - Type definitions
+
+---
+
+**Output:** .agents/research/2025-12-28-ai-platform-security-integration.md