@lhi/tdd-audit 1.16.0 → 1.20.0

package/README.md CHANGED
@@ -1,162 +1,283 @@
1
1
  # @lhi/tdd-audit
2
+ ![Coverage](https://img.shields.io/badge/coverage-98%25-brightgreen)
2
3
  [![tdd-audit](https://img.shields.io/badge/tdd--audit-passing-brightgreen)](https://www.npmjs.com/package/@lhi/tdd-audit) <!-- tdd-audit-badge -->
3
4
 
4
- > **v1.15.0**: Security skill installer for **Claude Code, Gemini CLI, Cursor, Codex, and OpenCode**. Patches vulnerabilities using a Red-Green-Refactor exploit-test protocol: prove the hole exists, apply the fix, prove it's closed. Enforces 95% test coverage, README badge, and SECURITY.md on every audit.
5
+ > **Your AI-generated code is probably vulnerable right now.** SQL injection. Hardcoded secrets. Prompt injection backdoors. The same assistant that built your feature in 30 minutes didn't think twice about security. `tdd-audit` finds the holes, proves they're real, and closes them, with every fix backed by a failing test before and a passing test after.
6
+
7
+ ## One command. Proven secure.
8
+
9
+ ```bash
10
+ npx @lhi/tdd-audit --local --claude
11
+ ```
12
+
13
+ That's it. In seconds you get:
14
+
15
+ - A severity-ranked findings report (CRITICAL → LOW) with the exact file and line
16
+ - Exploit tests that **prove the vulnerability is real** — not theoretical
17
+ - Patches that close each hole, verified by a passing test suite
18
+ - ≥ 95% test coverage enforced, a README security badge, and a `SECURITY.md` ready for auditors
19
+
20
+ No security expertise required. No config needed to start.
21
+
22
+ ---
23
+
24
+ ## Why this exists
25
+
26
+ Vibecoders move fast. AI assistants hallucinate security. The result: apps with SQL injection in the ORM layer, JWT algorithm confusion, hardcoded API keys one `git log` away from leaking, and LLM prompt injection that hands your backend to anyone who knows the trick.
27
+
28
+ PMs and security officers feel it too — "is this thing actually secure?" has no good answer when there are no tests proving it.
29
+
30
+ `tdd-audit` gives you the answer. Every vulnerability is proven closed by a test, not just patched and hoped for the best.
31
+
32
+ ---
5
33
 
6
34
  ## Install
7
35
 
8
36
  ```bash
9
- npx @lhi/tdd-audit
37
+ # Claude Code (recommended)
38
+ npx @lhi/tdd-audit --local --claude
39
+
40
+ # Gemini CLI / Codex / OpenCode / Cursor
41
+ npx @lhi/tdd-audit --local
10
42
  ```
11
43
 
12
44
  On first run the installer:
13
45
 
14
- 1. Scans your codebase for **57 vulnerability patterns** across 6 scanner modules and prints a severity-ranked report
15
- 2. Scaffolds `__tests__/security/` with a framework-matched exploit test boilerplate
16
- 3. Adds `test:security` to `package.json`
17
- 4. Creates `.github/workflows/security-tests.yml` with SHA-pinned actions and `npm audit`
18
- 5. Installs the `/tdd-audit` skill for your AI agent
46
+ 1. Scaffolds `__tests__/security/` with framework-matched exploit test boilerplate
47
+ 2. Adds `test:security` to `package.json`
48
+ 3. Creates `.github/workflows/security-tests.yml` with SHA-pinned actions and `npm audit` on every PR
49
+ 4. Installs the `/tdd-audit` skill in your AI agent
19
50
 
20
- ### Flags
51
+ Then open your agent and type `/tdd-audit`. It handles the rest.
21
52
 
22
- | Flag | Description |
23
- |---|---|
24
- | `--local` | Install into the current project instead of `~` |
25
- | `--claude` | Use `.claude/` instead of `.agents/` |
26
- | `--with-hooks` | Add a pre-commit hook that blocks commits on failing security tests |
27
- | `--skip-scan` | Skip the vulnerability scan on install |
28
- | `--scan` / `--scan-only` | Scan only — no install, no code changes |
29
- | `--json` | Output findings as JSON |
30
- | `--format sarif` | Output findings as SARIF 2.1.0 (GitHub code scanning) |
31
- | `--config <path>` | Load config from an explicit file path |
32
-
33
- ### Platform
34
-
35
- | Platform | Command |
53
+ ### Install flags
54
+
55
+ | Flag | What it does |
36
56
  |---|---|
37
- | Claude Code | `npx @lhi/tdd-audit --local --claude` |
38
- | Gemini CLI / Codex / OpenCode | `npx @lhi/tdd-audit --local` |
57
+ | `--local` | Install into the current project (recommended) |
58
+ | `--claude` | Use `.claude/` for Claude Code |
59
+ | `--with-hooks` | Block commits that break security tests |
60
+ | `--skip-scan` | Skip the initial vulnerability scan |
61
+ | `--config <path>` | Load config from a specific path |
39
62
 
40
- ## Usage
63
+ ---
41
64
 
42
- ```text
43
- /tdd-audit
44
- ```
65
+ ## What gets caught
45
66
 
46
- The agent detects your stack, presents a CRITICAL → LOW findings report, waits for confirmation, then works through each vulnerability one at a time using Red-Green-Refactor. Pass `--scan` for a report-only run with no code changes.
67
+ 100+ patterns across Node.js, Python, Go, React, React Native, Flutter, and Expo, including the AI-specific vulnerabilities that most scanners miss entirely:
47
68
 
48
- ## Config file
69
+ **Standard OWASP holes** — SQL/NoSQL/Command injection · Path traversal · Broken auth · XSS · IDOR · Mass assignment · SSRF · Open redirect · XXE · Insecure deserialization · Prototype pollution · Weak crypto · Hardcoded secrets · TLS bypass
49
70
 
50
- Scaffold a starter config with a single command:
71
+ **AI / LLM-specific** (the ones that will actually get you hacked in 2025) — LLM prompt injection · Eval of model output · LangChain ShellTool / ExecChain · Unbounded agent loops · MCP credential leakage · GitHub Actions expression injection · Hardcoded provider keys (OpenAI, Anthropic, Gemini, Cohere, Mistral, HuggingFace) · Missing `max_tokens` · Dynamic require from user input · VM sandbox escape · Electron `nodeIntegration: true`
51
72
 
52
- ```bash
53
- npx @lhi/tdd-audit init
54
- # or at a custom path:
55
- npx @lhi/tdd-audit init ~/configs/my-audit.json
56
- ```
73
+ **Vibecoding anti-patterns** — `localStorage` token storage · `Math.random()` for session IDs · `process.env.SECRET || "hardcoded-fallback"` · Silent exception swallowing · Insecure WebSocket URLs
74
+
75
+ ---
76
+
77
+ ## How it works
78
+
79
+ The full `/tdd-audit` skill run follows Red-Green-Refactor for every finding:
80
+
81
+ 1. **Detect** — scans your stack, scopes patterns to what's actually relevant
82
+ 2. **Report** — presents a CRITICAL → LOW findings table with plain-language risk and effort estimate. Waits for your sign-off before touching anything
83
+ 3. **Red** — writes an exploit test that **fails** (proves the hole is real)
84
+ 4. **Green** — applies the patch (test now passes)
85
+ 5. **Refactor** — runs the full suite (zero regressions)
86
+ 6. **Harden** — security headers, rate limiting, `npm audit`, secret scan, production error handling
87
+ 7. **Coverage gate** — pushes test coverage to ≥ 95% line and branch
88
+ 8. **Badge + SECURITY.md** — updates your README badge, creates `SECURITY.md` in GitHub Security Advisory format
89
+
90
+ Nothing is marked done until a test proves it.
91
+
92
+ ---
93
+
94
+ ## For PMs and security officers
95
+
96
+ You need evidence, not promises. `tdd-audit` produces:
97
+
98
+ - **Exploit tests** — a failing test per vulnerability, committed to source, proves the hole existed
99
+ - **Passing tests** — the fix is proven by the test suite, not just code review
100
+ - **`--format report`** — a markdown compliance report with findings table, fix evidence, patch commits, and coverage gate result; ready to attach to SOC 2, ISO 27001, or vendor security questionnaire
101
+ - **`--sbom`** — CycloneDX Software Bill of Materials (required for US federal contracts under EO 14028)
102
+ - **`SECURITY.md`** — GitHub Security Advisory format with your security contact, supported versions, and hardening summary
103
+ - **Webhook + Slack notifications** — findings summary delivered to your security channel on every scan
57
104
 
58
- `.tdd-audit.json` — all CLI flags settable here, loaded automatically from your project root:
105
+ Configure your security contact in `.tdd-audit.json`:
59
106
 
60
107
  ```json
61
108
  {
62
- "provider": "openai",
63
- "model": "gpt-4o",
64
- "apiKeyEnv": "OPENAI_API_KEY",
65
- "baseUrl": null,
66
- "output": "text",
67
- "severityThreshold": "LOW",
68
- "port": 3000,
69
- "serverApiKey": null,
70
- "trustProxy": false,
71
- "ignore": ["node_modules", "dist", "build", "coverage"]
109
+ "security_name": "Alice Smith",
110
+ "security_email": "security@yourorg.com"
72
111
  }
73
112
  ```
74
113
 
75
- Point to a config anywhere with `--config`:
114
+ Both fields are optional: use one, both, or neither. When set, they appear in `SECURITY.md`, the compliance report, and webhook payloads.
115
+
116
+ ---
117
+
118
+ ## AI Audit (`--ai`)
119
+
120
+ Let the LLM explore and report on your codebase directly:
76
121
 
77
122
  ```bash
78
- npx @lhi/tdd-audit serve --config ~/configs/prod-audit.json
123
+ npx @lhi/tdd-audit --ai \
124
+ --provider anthropic \
125
+ --api-key $ANTHROPIC_API_KEY \
126
+ --depth tier-2 \
127
+ --format json
79
128
  ```
80
129
 
81
- ## REST API + AI remediation
130
+ ### Depth tiers
131
+
132
+ | Tier | Mode | What you get | Billing unit |
133
+ |---|---|---|---|
134
+ | `tier-1` | Scan only | File, line, severity, snippet | per report |
135
+ | `tier-2` | Scan only | + risk explanation, effort estimate, CWE, OWASP, references | per report |
136
+ | `tier-3` | Full audit, read-only | + copy-ready patches and test snippets — you apply manually | per report |
137
+ | `tier-4` | Full audit, writes | LLM applies every patch via `write_file` | per applied patch |
82
138
 
83
139
  ```bash
84
- # Start the API server (now powered by Fastify)
85
- npx @lhi/tdd-audit serve --port 3000 --api-key YOUR_SECRET
140
+ # Fast scan: just the findings
141
+ npx @lhi/tdd-audit --ai --depth tier-1 --format json
86
142
 
87
- # Scan any path and get JSON back
88
- curl -X POST http://localhost:3000/scan \
89
- -H "Authorization: Bearer YOUR_SECRET" \
90
- -d '{"path": "."}' | jq '.summary'
143
+ # Full report with context, no changes made
144
+ npx @lhi/tdd-audit --ai --depth tier-2 --format json
91
145
 
92
- # Full automated pipeline: scan + remediate in one shot
93
- curl -X POST http://localhost:3000/audit \
94
- -H "Authorization: Bearer YOUR_SECRET" \
95
- -H "Content-Type: application/json" \
96
- -d '{"path": ".", "provider": "anthropic", "apiKey": "sk-ant-..."}' \
97
- | jq '.jobId'
146
+ # Copy-ready patches: apply yourself
147
+ npx @lhi/tdd-audit --ai --depth tier-3 --format json
148
+
149
+ # Let the LLM fix everything
150
+ npx @lhi/tdd-audit --ai --depth tier-4 --allow-writes
151
+ ```
98
152
 
99
- # Poll job status
100
- curl http://localhost:3000/jobs/<jobId>
153
+ ### AI flags
101
154
 
102
- # Or stream real-time updates via SSE
103
- curl -N http://localhost:3000/jobs/<jobId>/stream
155
+ | Flag | Description |
156
+ |---|---|
157
+ | `--ai` | Enable LLM agentic audit |
158
+ | `--depth tier-1\|2\|3\|4` | Output depth tier (default: `tier-1`) |
159
+ | `--allow-writes` | Permit the LLM to write files (auto-enabled for `tier-4`) |
160
+ | `--provider <name>` | `anthropic` \| `openai` \| `gemini` \| `ollama` |
161
+ | `--api-key <key>` | Provider API key |
162
+ | `--model <name>` | Model override (e.g. `claude-opus-4-6`, `gpt-4o`) |
163
+ | `--base-url <url>` | Any OpenAI-compatible service |
164
+ | `--format json\|sarif\|report` | Structured output format |
165
+ | `--verbose` | Print tool call details to stderr |
166
+
167
+ ---
168
+
169
+ ## CI integration
170
+
171
+ ### PR gate — block merges on new findings
172
+
173
+ ```yaml
174
+ - run: npx @lhi/tdd-audit@latest --pr --threshold HIGH
175
+ ```
176
+
177
+ Exits non-zero if any finding meets or exceeds the threshold. Sub-second — no AI, no agents, pure static scan. Wire into branch protection rules to stop vulnerable code from merging.
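
As a rough illustration of the "meets or exceeds the threshold" rule described above, the Python sketch below compares findings against a severity scale running from LOW up to CRITICAL. The function name and the list-of-dicts input shape are assumptions made for this example; it is not the package's actual implementation.

```python
# Hypothetical sketch of a severity gate; names are illustrative, not tdd-audit internals.
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]  # ascending severity


def gate_exit_code(findings, threshold="HIGH"):
    """Return 1 if any finding is at or above the threshold, else 0."""
    limit = SEVERITY_ORDER.index(threshold)
    blocked = any(SEVERITY_ORDER.index(f["severity"]) >= limit for f in findings)
    return 1 if blocked else 0


if __name__ == "__main__":
    sample = [
        {"name": "SQL Injection", "severity": "HIGH"},
        {"name": "Weak crypto", "severity": "MEDIUM"},
    ]
    print(gate_exit_code(sample, "HIGH"))      # 1: the HIGH finding blocks the merge
    print(gate_exit_code(sample, "CRITICAL"))  # 0: nothing reaches CRITICAL
```

A CI step would pass the returned value to the process exit status, which is what lets branch protection treat the scan as a required check.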
178
+
179
+ ### Org-wide posture scan
180
+
181
+ ```bash
182
+ npx @lhi/tdd-audit@latest --org my-github-org --format report
183
+ ```
184
+
185
+ Scans every repo in the org, produces a cross-org summary and a compliance report. Fires your webhook/Slack with the aggregate payload.
186
+
187
+ ---
188
+
189
+ ## Config file
190
+
191
+ ```bash
192
+ npx @lhi/tdd-audit init # scaffold .tdd-audit.json
193
+ npx @lhi/tdd-audit init --provider anthropic # with Anthropic defaults
194
+ ```
195
+
196
+ `.tdd-audit.json` — everything settable here, CLI flags always win:
104
197
 
105
- # Use any OpenAI-compatible service (Groq, OpenRouter, Together AI, etc.)
106
- npx @lhi/tdd-audit serve \
107
- --provider openai \
108
- --base-url https://api.groq.com/openai/v1 \
109
- --api-key $GROQ_API_KEY \
110
- --model llama-3.3-70b-versatile
198
+ ```json
199
+ {
200
+ "provider": "anthropic",
201
+ "model": "claude-opus-4-6",
202
+ "apiKeyEnv": "ANTHROPIC_API_KEY",
203
+ "severityThreshold": "HIGH",
204
+ "ignore": ["node_modules", "dist", "coverage"],
205
+
206
+ "security_name": "Alice Smith",
207
+ "security_email": "security@yourorg.com",
208
+
209
+ "webhook_url": "https://hooks.yourorg.com/security",
210
+ "slack_webhook": "https://hooks.slack.com/services/...",
211
+ "slack_channel": "#security-alerts",
212
+
213
+ "severity_overrides": {
214
+ "CORS Wildcard": "CRITICAL"
215
+ }
216
+ }
111
217
  ```
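
The "CLI flags always win" rule can be pictured as a layered merge. The Python sketch below is only an illustration of that precedence, assuming a three-layer order (defaults, then config file, then flags); `DEFAULTS`, `resolve_settings`, and the use of `None` for an unset flag are hypothetical and not taken from the package.

```python
import json

# Hypothetical defaults; the real tool's defaults may differ.
DEFAULTS = {"severityThreshold": "LOW", "ignore": ["node_modules"]}


def resolve_settings(config_text, cli_flags):
    """Merge defaults < config file < CLI flags; flags set to None count as unset."""
    file_config = json.loads(config_text) if config_text else {}
    explicit_flags = {k: v for k, v in cli_flags.items() if v is not None}
    # Later dicts override earlier ones, so explicit CLI flags come last.
    return {**DEFAULTS, **file_config, **explicit_flags}
```

With a config file that sets `"severityThreshold": "HIGH"` and a command line that passes `CRITICAL`, the merged value is `CRITICAL`, while keys the command line leaves unset keep their config-file values.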
112
218
 
113
- Supported providers: `anthropic` · `openai` · `gemini` · `ollama` (local) · **any OpenAI-compatible endpoint via `--base-url`**
219
+ Full schema: [docs/configuration.md](docs/configuration.md)
114
220
 
115
- ### Endpoints
221
+ ---
222
+
223
+ ## REST API
224
+
225
+ ```bash
226
+ npx @lhi/tdd-audit serve --port 3000 --api-key YOUR_SECRET
227
+ ```
116
228
 
117
229
  | Method | Path | Auth | Description |
118
230
  |---|---|---|---|
119
- | `GET` | `/health` | No | Version + liveness check |
120
- | `POST` | `/scan` | Yes | Scan a path, return findings |
121
- | `POST` | `/remediate` | Yes | AI-fix a findings list; returns `jobId` |
122
- | `POST` | `/audit` | Yes | Full scan+remediate pipeline; returns `jobId` |
231
+ | `GET` | `/health` | No | Liveness check |
232
+ | `POST` | `/audit/ai` | Yes | LLM audit with depth tiers; returns `jobId` |
233
+ | `POST` | `/scan` | Yes | Static scan; returns findings immediately |
234
+ | `POST` | `/remediate` | Yes | AI-fix a provided findings list; returns `jobId` |
123
235
  | `GET` | `/jobs/:id` | Yes | Poll job status |
124
- | `GET` | `/jobs/:id/stream` | Yes | SSE stream of real-time job progress |
125
-
126
- ## Output formats
236
+ | `GET` | `/jobs/:id/stream` | Yes | SSE stream of live LLM output |
127
237
 
128
238
  ```bash
129
- npx @lhi/tdd-audit --scan --json # structured JSON
130
- npx @lhi/tdd-audit --scan --format sarif # GitHub code scanning (inline PR annotations)
131
- npx @lhi/tdd-audit --scan # human-readable text (default)
239
+ # Start an AI audit
240
+ curl -X POST http://localhost:3000/audit/ai \
241
+ -H "Authorization: Bearer YOUR_SECRET" \
242
+ -H "Content-Type: application/json" \
243
+ -d '{"provider": "anthropic", "apiKey": "sk-ant-...", "depth": "tier-2"}'
244
+
245
+ # Stream results live
246
+ curl -N http://localhost:3000/jobs/<jobId>/stream
132
247
  ```
133
248
 
134
- ## Testing
249
+ Supported providers: `anthropic` · `openai` · `gemini` · `ollama` · **any OpenAI-compatible endpoint via `--base-url`**
250
+
251
+ ---
135
252
 
136
- 586 tests across unit, E2E, and security suites:
253
+ ## Testing
137
254
 
138
255
  ```bash
139
- npm test # full suite
140
- npm run test:unit # unit tests with coverage (96.6% branch coverage)
256
+ npm test # full suite (841 tests)
257
+ npm run test:unit # unit tests with coverage
141
258
  npm run test:security # security regression tests only
142
259
  npm run test:e2e # end-to-end REST API tests
143
260
  ```
144
261
 
145
- Security tests cover prompt injection, path traversal, rate limiting, timing-safe auth, job store bounds, SARIF schema, and more. See [__tests__/security/](__tests__/security/) for all 22 regression tests.
262
+ Security tests cover: prompt injection · path traversal · SSRF via webhook and baseUrl · rate limiting · timing-safe auth · XFF bypass · job store bounds · SARIF schema · AI key redaction · coverage skip detection · and more. Every past vulnerability is a permanent regression test.
263
+
264
+ ---
146
265
 
147
266
  ## Documentation
148
267
 
149
268
  | | |
150
269
  |---|---|
151
- | [REST API](docs/rest-api.md) | Endpoints, auth, rate limiting, trust-proxy, request/response schema |
152
- | [AI Remediation](docs/ai-remediation.md) | Provider setup, `--base-url` for compatible APIs, config file |
153
- | [Scanner](docs/scanner.md) | Architecture, detection logic, false-positive handling |
154
- | [Vulnerability Patterns](docs/vulnerability-patterns.md) | All 57 patterns — descriptions, grep signatures, fix pointers |
270
+ | [Configuration](docs/configuration.md) | Full schema: all fields, CLI equivalents, payload schemas |
271
+ | [REST API](docs/rest-api.md) | Endpoints, auth, rate limiting, depth tiers, targeted apply |
272
+ | [AI Remediation](docs/ai-remediation.md) | Provider setup, `--base-url` for compatible APIs |
273
+ | [Vulnerability Patterns](docs/vulnerability-patterns.md) | All 100+ patterns — descriptions, grep signatures, fix pointers |
155
274
  | [TDD Protocol](docs/tdd-protocol.md) | Red-Green-Refactor in full, with framework templates for all 6 stacks |
156
275
  | [Agentic AI Security](docs/agentic-ai-security.md) | ASI01–ASI10 — prompt injection, MCP supply chain, Actions injection |
157
276
  | [Hardening](docs/hardening.md) | Phase 4 controls — Helmet, CSP, CSRF, rate limiting, gitleaks, SRI |
158
277
  | [CI/CD](docs/ci-cd.md) | Workflow templates, existing pipeline integration, secret leak prevention |
159
278
 
279
+ ---
280
+
160
281
  ## License
161
282
 
162
283
  MIT
package/SKILL.md CHANGED
@@ -1,4 +1,8 @@
1
1
  ---
2
+ # IMPORTANT CONSTRAINTS
3
+ # 1. NEVER add AI attribution trailers to commits (no "Co-Authored-By: Claude ..." lines).
4
+ # 2. NEVER add AI attribution to any file, comment, or commit message.
5
+ # 3. All commits must appear as human-authored work only.
2
6
  name: TDD Remediation Protocol
3
7
  description: A comprehensive toolkit for applying Red-Green-Refactor to fix security vulnerabilities.
4
8
  category: security
@@ -12,6 +16,8 @@ audit_status: safe
12
16
 
13
17
  # TDD Remediation Protocol
14
18
 
19
+ > **Non-negotiable rule:** Never add AI attribution to any commit, file, or comment. Do not append `Co-Authored-By`, `Generated by`, or any similar trailer. All commits must be plain, human-authored messages.
20
+
15
21
  Applying Test-Driven Development (TDD) to code that has already been generated requires Test-Driven Remediation. You must prove the security hole exists by writing a test that exploits it, apply the fix, and then prove the hole is closed.
16
22
 
17
23
  ## Autonomous Audit Mode
@@ -1,6 +1,31 @@
1
1
  # AI Remediation
2
2
 
3
- Pass a provider and API key to have tdd-audit autonomously generate exploit tests, patches, and regression checks for each finding, no agent required.
3
+ `tdd-audit --ai` runs a fully agentic LLM audit: the model explores your codebase with tool calls, identifies vulnerabilities, and (depending on the depth tier) provides copy-ready patches or applies them directly. No agent shell required.
4
+
5
+ ---
6
+
7
+ ## Depth tiers
8
+
9
+ | Tier | What you get | Writes files? | Billing unit |
10
+ |---|---|---|---|
11
+ | `tier-1` | Scan report: name, severity, file, line, snippet | no | per report |
12
+ | `tier-2` | + risk, effort, CWE, OWASP category, references | no | per report |
13
+ | `tier-3` | + copy-ready `patch` and `testSnippet` per finding | no | per report |
14
+ | `tier-4` | LLM applies patches via `write_file`; `patchesApplied` count in envelope | yes | per applied patch |
15
+
16
+ ```bash
17
+ # Fast scan report
18
+ npx @lhi/tdd-audit --ai --depth tier-1 --format json
19
+
20
+ # Rich report with CWE/OWASP — review and decide what to fix
21
+ npx @lhi/tdd-audit --ai --depth tier-2 --format json
22
+
23
+ # Patch included in every finding — copy and apply yourself
24
+ npx @lhi/tdd-audit --ai --depth tier-3 --format json
25
+
26
+ # Let the LLM write the fixes
27
+ npx @lhi/tdd-audit --ai --depth tier-4
28
+ ```
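
Because each tier adds fields on top of the previous one, a consumer may want to check that a finding carries what its tier promises. The Python sketch below is one hedged way to do that. The key names are taken from the sample envelope later in this document; the keys for the OWASP category and references are not shown there, so they are left out rather than guessed, and `missing_fields` is a hypothetical helper name.

```python
# Fields each tier adds, per the depth-tier table. Tiers are cumulative.
# OWASP and references keys are omitted because their names are not shown in the source.
TIER_FIELDS = {
    "tier-1": ["name", "severity", "file", "line", "snippet"],
    "tier-2": ["risk", "effort", "cwe"],
    "tier-3": ["patch", "testSnippet"],
}
TIER_SEQUENCE = ["tier-1", "tier-2", "tier-3"]


def missing_fields(finding, depth):
    """List the fields a finding lacks for the given read-only depth tier."""
    expected = []
    for tier in TIER_SEQUENCE[: TIER_SEQUENCE.index(depth) + 1]:
        expected.extend(TIER_FIELDS[tier])
    return [key for key in expected if key not in finding]
```

An empty list means the finding is complete for that tier; anything else names the fields to look for before relying on the report.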
4
29
 
5
30
  ---
6
31
 
@@ -25,13 +50,13 @@ Edit `.tdd-audit.json`:
25
50
  `apiKeyEnv` names the environment variable to read the key from — no key ever touches disk. Then just:
26
51
 
27
52
  ```bash
28
- npx @lhi/tdd-audit serve
53
+ npx @lhi/tdd-audit --ai --depth tier-2 --format json
29
54
  ```
30
55
 
31
56
  Point to a config at any path:
32
57
 
33
58
  ```bash
34
- npx @lhi/tdd-audit serve --config ~/configs/my-audit.json
59
+ npx @lhi/tdd-audit --ai --config ~/configs/my-audit.json --depth tier-3
35
60
  ```
36
61
 
37
62
  ---
@@ -40,15 +65,17 @@ npx @lhi/tdd-audit serve --config ~/configs/my-audit.json
40
65
 
41
66
  ```bash
42
67
  # Anthropic
43
- npx @lhi/tdd-audit serve \
68
+ npx @lhi/tdd-audit --ai \
44
69
  --provider anthropic \
45
- --api-key $ANTHROPIC_API_KEY
70
+ --api-key $ANTHROPIC_API_KEY \
71
+ --depth tier-2
46
72
 
47
73
  # OpenAI
48
- npx @lhi/tdd-audit serve \
74
+ npx @lhi/tdd-audit --ai \
49
75
  --provider openai \
50
76
  --api-key $OPENAI_API_KEY \
51
- --model gpt-4o-mini
77
+ --model gpt-4o-mini \
78
+ --depth tier-1
52
79
  ```
53
80
 
54
81
  ---
@@ -60,28 +87,28 @@ The API key is sent in the `Authorization: Bearer` header — never in the URL.
60
87
 
61
88
  ```bash
62
89
  # Groq (fast inference)
63
- npx @lhi/tdd-audit serve \
90
+ npx @lhi/tdd-audit --ai \
64
91
  --provider openai \
65
92
  --base-url https://api.groq.com/openai/v1 \
66
93
  --model llama-3.3-70b-versatile \
67
94
  --api-key $GROQ_API_KEY
68
95
 
69
96
  # OpenRouter (access 200+ models)
70
- npx @lhi/tdd-audit serve \
97
+ npx @lhi/tdd-audit --ai \
71
98
  --provider openai \
72
99
  --base-url https://openrouter.ai/api/v1 \
73
100
  --model meta-llama/llama-3.3-70b-instruct \
74
101
  --api-key $OPENROUTER_API_KEY
75
102
 
76
103
  # Together AI
77
- npx @lhi/tdd-audit serve \
104
+ npx @lhi/tdd-audit --ai \
78
105
  --provider openai \
79
106
  --base-url https://api.together.xyz/v1 \
80
107
  --model mistralai/Mixtral-8x7B-Instruct-v0.1 \
81
108
  --api-key $TOGETHER_API_KEY
82
109
 
83
110
  # LM Studio / vLLM / llama.cpp (fully local)
84
- npx @lhi/tdd-audit serve \
111
+ npx @lhi/tdd-audit --ai \
85
112
  --provider openai \
86
113
  --base-url http://localhost:1234/v1 \
87
114
  --model local-model
@@ -116,53 +143,95 @@ In `.tdd-audit.json`:
116
143
  ## REST API usage
117
144
 
118
145
  ```bash
119
- # 1. Scan and get findings
120
- FINDINGS=$(curl -s -X POST http://localhost:3000/scan \
121
- -H "Authorization: Bearer $SERVER_KEY" \
122
- -H "Content-Type: application/json" \
123
- -d '{"path": "."}' | jq '.findings')
146
+ # Start the server
147
+ npx @lhi/tdd-audit serve --port 3000 --api-key $SERVER_KEY
124
148
 
125
- # 2. Submit remediation job (using Groq via --base-url)
126
- JOB=$(curl -s -X POST http://localhost:3000/remediate \
149
+ # Launch a tier-2 audit job
150
+ JOB=$(curl -s -X POST http://localhost:3000/audit/ai \
127
151
  -H "Authorization: Bearer $SERVER_KEY" \
128
152
  -H "Content-Type: application/json" \
129
153
  -d "{
130
- \"findings\": $FINDINGS,
131
154
  \"provider\": \"openai\",
132
- \"apiKey\": \"$GROQ_API_KEY\",
133
- \"baseUrl\": \"https://api.groq.com/openai/v1\",
134
- \"model\": \"llama-3.3-70b-versatile\",
135
- \"severity\": \"HIGH\"
136
- }")
155
+ \"apiKey\": \"$GROQ_API_KEY\",
156
+ \"baseUrl\": \"https://api.groq.com/openai/v1\",
157
+ \"model\": \"llama-3.3-70b-versatile\",
158
+ \"depth\": \"tier-2\"
159
+ }" | jq -r '.jobId')
160
+
161
+ # Poll until done
162
+ while true; do
163
+ STATUS=$(curl -s http://localhost:3000/jobs/$JOB \
164
+ -H "Authorization: Bearer $SERVER_KEY" | jq -r '.status')
165
+ [ "$STATUS" = "done" ] || [ "$STATUS" = "error" ] && break
166
+ sleep 3
167
+ done
168
+
169
+ # Print summary
170
+ curl -s http://localhost:3000/jobs/$JOB \
171
+ -H "Authorization: Bearer $SERVER_KEY" | jq '.result.summary'
172
+ ```
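
The shell loop above polls indefinitely if the job never reaches `done` or `error`. The Python sketch below shows the same polling idea with an upper bound, as one possible client-side pattern. `wait_for_job` and its parameters are hypothetical; `fetch_status` stands for any caller-supplied function that returns the `status` field from `GET /jobs/:id`, and intermediate status values such as `"running"` are assumed for illustration.

```python
import time


def wait_for_job(fetch_status, interval=3.0, max_polls=100, sleep=time.sleep):
    """Poll until the job reports 'done' or 'error', or give up after max_polls."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in ("done", "error"):
            return status
        sleep(interval)  # wait before asking again
    raise TimeoutError("job did not finish within the polling budget")
```

Passing the fetch function in keeps the loop independent of any HTTP library, and the `sleep` parameter makes it easy to exercise without real delays.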
173
+
174
+ ### Targeted apply (tier-4)
137
175
 
138
- JOB_ID=$(echo $JOB | jq -r '.jobId')
176
+ Take a single finding from a tier-3 report and apply its patch without re-scanning:
177
+
178
+ ```bash
179
+ # Get the first finding from a previous tier-3 job
180
+ FINDING=$(curl -s http://localhost:3000/jobs/$TIER3_JOB_ID \
181
+ -H "Authorization: Bearer $SERVER_KEY" \
182
+ | jq '.result.findings[0]')
139
183
 
140
- # 3. Poll for results
141
- curl -s "http://localhost:3000/jobs/$JOB_ID" \
142
- -H "Authorization: Bearer $SERVER_KEY" | jq '.status'
184
+ # Apply only that patch
185
+ curl -s -X POST http://localhost:3000/audit/ai \
186
+ -H "Authorization: Bearer $SERVER_KEY" \
187
+ -H "Content-Type: application/json" \
188
+ -d "{
189
+ \"provider\": \"anthropic\",
190
+ \"apiKey\": \"$ANTHROPIC_API_KEY\",
191
+ \"depth\": \"tier-4\",
192
+ \"findings\": [$FINDING]
193
+ }" | jq '.jobId'
143
194
  ```
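
When the request is built from a script rather than a shell one-liner, serializing the body with a JSON library avoids hand-escaping quotes inside the finding. The Python sketch below assembles the same body as the curl example above; the helper name is hypothetical, and the field names are the ones shown in that example.

```python
import json


def build_targeted_apply_body(finding, provider, api_key):
    """Build the JSON body for a tier-4 request that applies one known finding."""
    return json.dumps({
        "provider": provider,
        "apiKey": api_key,
        "depth": "tier-4",
        "findings": [finding],  # a single finding taken from a tier-3 report
    })
```

The returned string can be sent as the POST body to `/audit/ai` with the same `Authorization` and `Content-Type` headers as in the curl example.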
144
195
 
196
+ See [REST API](rest-api.md) for the full endpoint reference.
197
+
145
198
  ---
146
199
 
147
- ## What the model returns
200
+ ## Output envelope
148
201
 
149
- For each finding the remediator sends a structured prompt and expects back:
202
+ All structured output (`--format json` or via REST API) returns the same envelope shape regardless of tier:
150
203
 
151
204
  ```json
152
205
  {
153
- "exploitTest": {
154
- "filename": "__tests__/security/xss-comments.test.js",
155
- "content": "..."
156
- },
157
- "patch": {
158
- "filename": "src/routes/comments.js",
159
- "diff": "--- a/src/routes/comments.js\n+++ ..."
160
- },
161
- "refactorChecks": ["npm test", "npm run test:security"]
206
+ "version": "1.17.0",
207
+ "provider": "anthropic",
208
+ "model": "claude-opus-4-6",
209
+ "depth": "tier-3",
210
+ "mode": "full",
211
+ "stack": "Node.js / Express",
212
+ "summary": { "CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 0 },
213
+ "patchesApplied": 0,
214
+ "findings": [
215
+ {
216
+ "name": "SQL Injection",
217
+ "severity": "HIGH",
218
+ "file": "src/db.js",
219
+ "line": 42,
220
+ "snippet": "db.query(userInput)",
221
+ "risk": "Full database read/write via UNION injection",
222
+ "effort": "low",
223
+ "cwe": "CWE-89",
224
+ "patch": "const stmt = db.prepare('SELECT ...');\nstmt.run(id);",
225
+ "testSnippet": "test('prevents injection', () => { ... });"
226
+ }
227
+ ],
228
+ "likelyFalsePositives": [],
229
+ "remediation": [],
230
+ "scannedAt": "2026-03-26T12:00:00.000Z"
162
231
  }
163
232
  ```
164
233
 
165
- The result is returned as-is from the API; review and apply patches manually or pipe into your own automation.
234
+ `patchesApplied` is always `0` for tiers 1–3 (no files written). For tier-4 it counts `remediation` entries where `status === "fixed"` — the billable unit.
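
The counting rule above can be written out in a few lines. This Python sketch recomputes the number from an envelope, which may be useful for cross-checking a report. The function name is hypothetical, and any `status` value other than `"fixed"` (such as `"failed"` in the usage below) is an assumption for illustration.

```python
def count_patches_applied(envelope):
    """Count remediation entries marked 'fixed'; tiers 1-3 never write files."""
    if envelope.get("depth") != "tier-4":
        return 0
    return sum(
        1 for entry in envelope.get("remediation", [])
        if entry.get("status") == "fixed"
    )
```

For a tier-4 envelope whose `remediation` list holds two `"fixed"` entries and one `"failed"` entry, the function returns 2, which should match the `patchesApplied` value reported in the envelope.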
166
235
 
167
236
  ---
168
237
 
@@ -174,9 +243,12 @@ ollama pull codellama
174
243
  ollama serve
175
244
 
176
245
  # Run tdd-audit against it
177
- npx @lhi/tdd-audit serve \
246
+ npx @lhi/tdd-audit --ai \
178
247
  --provider ollama \
179
- --model codellama
248
+ --model codellama \
249
+ --depth tier-1
180
250
  ```
181
251
 
182
252
  No API key required. Ollama must be running on `http://localhost:11434`.
253
+
254
+ Ollama does not support the tool-use API, so the audit runs in single-shot mode: the project files are bundled into the prompt rather than explored interactively with tool calls. For best results, use `tier-1` or `tier-2` with Ollama.