@lhi/tdd-audit 1.0.0 → 1.1.0

package/README.md CHANGED
@@ -1,48 +1,109 @@
1
1
  # @lhi/tdd-audit
2
2
 
3
- Anti-Gravity Skill for TDD Remediation. This package securely patches code vulnerabilities by utilizing a Test-Driven Remediation (Red-Green-Refactor) protocol.
3
+ Anti-Gravity Skill for TDD Remediation. Patches security vulnerabilities by applying a Test-Driven Remediation (Red-Green-Refactor) protocol — you prove the hole exists, apply the fix, and prove it's closed.
4
+
5
+ ## What happens on install
6
+
7
+ Running the installer does five things immediately:
8
+
9
+ 1. **Scans your codebase** for common vulnerability patterns (SQL injection, IDOR, XSS, command injection, path traversal, broken auth) and prints findings to stdout
10
+ 2. **Scaffolds `__tests__/security/`** with a framework-matched boilerplate exploit test
11
+ 3. **Adds `test:security`** to your `package.json` scripts (Node.js projects)
12
+ 4. **Creates `.github/workflows/security-tests.yml`** so the CI gate exists from day one
13
+ 5. **Installs the `/tdd-audit` workflow shortcode** for your agent
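The scan in step 1 can be pictured as a line-by-line regex pass. This is a minimal sketch, assuming a single simplified pattern and an in-memory string rather than files on disk; `scanText` and `SQLI_TEMPLATE` are hypothetical names used for illustration and are not part of the package's API.

```javascript
// Hypothetical, simplified pattern: a template literal that contains
// SELECT followed by an interpolation (`${...}`).
const SQLI_TEMPLATE = /`SELECT[^`]*\$\{/i;

// Return one finding per matching line, using 1-based line numbers.
function scanText(source, fileName) {
  const findings = [];
  source.split('\n').forEach((line, index) => {
    if (SQLI_TEMPLATE.test(line)) {
      findings.push({
        name: 'SQL Injection',
        file: fileName,
        line: index + 1,
        snippet: line.trim().slice(0, 80),
      });
    }
  });
  return findings;
}

const sample = [
  'const id = req.params.id;',
  'db.query(`SELECT * FROM users WHERE id = ${id}`);',
  "db.query('SELECT * FROM users WHERE id = $1', [id]);",
].join('\n');

// Only the interpolated query on line 2 is reported.
console.log(scanText(sample, 'routes/users.js'));
```

A regex pass like this only surfaces candidates; each match still needs a human or agent review before it counts as a vulnerability.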
4
14
 
5
15
  ## Installation
6
16
 
7
- You can install this skill globally so that it is available to the Anti-Gravity agent across all of your projects:
17
+ Install globally so the skill is available across all your projects:
8
18
 
9
19
  ```bash
10
20
  npx @lhi/tdd-audit
11
21
  ```
12
22
 
13
- Or run it directly if you have cloned the repository:
23
+ Or clone and run directly:
14
24
 
15
25
  ```bash
16
26
  node index.js
17
27
  ```
18
28
 
19
- ### Local Installation
29
+ ### Flags
20
30
 
21
- If you prefer to install the skill and its workflow strictly to your current workspace instead of globally, use the `--local` flag:
31
+ | Flag | Description |
32
+ |---|---|
33
+ | `--local` | Install skill files to the current project directory instead of `~` |
34
+ | `--claude` | Use `.claude/` instead of `.agents/` as the skill directory |
35
+ | `--with-hooks` | Install a pre-commit hook that blocks commits if security tests fail |
36
+ | `--skip-scan` | Skip the automatic vulnerability scan on install |
22
37
 
38
+ **Install to a Claude Code project with pre-commit protection:**
23
39
  ```bash
24
- npx @lhi/tdd-audit --local
25
- # or
26
- node index.js --local
40
+ npx @lhi/tdd-audit --local --claude --with-hooks
27
41
  ```
28
42
 
29
- This will create an `.agents` folder in your current directory.
43
+ ### Framework Detection
30
44
 
31
- *Note: Regardless of whether you install globally or locally, the boilerplate security tests will always be scaffolded into your current project's directory at `__tests__/security`.*
45
+ The installer automatically detects your project's test framework and scaffolds the right boilerplate:
46
+
47
+ | Detected | Boilerplate | `test:security` command |
48
+ |---|---|---|
49
+ | `jest` / `supertest` | `sample.exploit.test.js` | `jest --testPathPattern=__tests__/security` |
50
+ | `vitest` | `sample.exploit.test.vitest.js` | `vitest run __tests__/security` |
51
+ | `mocha` | `sample.exploit.test.js` | `mocha '__tests__/security/**/*.spec.js'` |
52
+ | `pytest.ini` / `pyproject.toml` | `sample.exploit.test.pytest.py` | `pytest tests/security/ -v` |
53
+ | `go.mod` | `sample.exploit.test.go` | `go test ./security/... -v` |
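The order of checks matters when a project has more than one marker. The sketch below mirrors the precedence used by `detectFramework` in this release's `index.js` (Vitest first, then Jest or Supertest, then Mocha, then Python markers, then `go.mod`, falling back to Jest). `pickFramework` is a hypothetical pure helper written for illustration; it takes its inputs as arguments instead of reading the filesystem.

```javascript
// Hypothetical helper: `deps` is the merged dependencies/devDependencies map,
// `files` is the list of file names present in the project root.
function pickFramework(deps, files) {
  if (deps.vitest) return 'vitest';
  if (deps.jest || deps.supertest) return 'jest';
  if (deps.mocha) return 'mocha';
  const pythonMarkers = ['pytest.ini', 'pyproject.toml', 'setup.py', 'requirements.txt'];
  if (pythonMarkers.some((name) => files.includes(name))) return 'pytest';
  if (files.includes('go.mod')) return 'go';
  return 'jest'; // fallback when nothing is recognised
}

console.log(pickFramework({ jest: '^29.0.0', vitest: '^1.0.0' }, [])); // vitest
console.log(pickFramework({}, ['go.mod']));                            // go
console.log(pickFramework({}, []));                                    // jest
```

Keeping the decision in a pure function like this makes the precedence easy to unit test without creating temporary project directories.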
32
54
 
33
55
  ## Usage
34
56
 
35
- Once installed, you can trigger the autonomous audit in your Anti-Gravity chat using the provided slash command:
57
+ Once installed, trigger the autonomous audit in your agent:
36
58
 
37
59
  ```text
38
60
  /tdd-audit
39
61
  ```
40
62
 
41
- This will instruct the agent to:
42
- 1. Explore the designated structure to find any vulnerabilities.
43
- 2. Exploit the vulnerability with a failing test (Red).
44
- 3. Patch the flaw to make the test pass (Green).
45
- 4. Ensure no regressions occur (Refactor).
63
+ The agent will:
64
+ 1. Scan the codebase and present a severity-ranked findings report (CRITICAL / HIGH / MEDIUM / LOW)
65
+ 2. Wait for your confirmation before making any changes
66
+ 3. For each confirmed vulnerability, apply the full Red-Green-Refactor loop:
67
+ - **Red**: write an exploit test that fails, proving the vulnerability exists
68
+ - **Green**: apply the targeted patch, making the test pass
69
+ - **Refactor**: run the full suite to confirm no regressions
70
+ 4. Deliver a final Remediation Summary table
71
+
72
+ The agent works one vulnerability at a time and does not advance until the current one is fully proven closed.
73
+
74
+ ## Running security tests manually
75
+
76
+ ```bash
77
+ # Node.js
78
+ npm run test:security
79
+
80
+ # Python
81
+ pytest tests/security/ -v
82
+
83
+ # Go
84
+ go test ./security/... -v
85
+ ```
86
+
87
+ ## CI/CD
88
+
89
+ The installer creates `.github/workflows/security-tests.yml` for your stack. It runs on every pull request targeting `main` — any exploit test that regresses will block the merge.
90
+
91
+ To add this gate to an existing CI pipeline manually:
92
+
93
+ ```yaml
94
+ - name: Run security exploit tests
95
+ run: npm run test:security # or pytest tests/security/, or go test ./security/...
96
+ ```
97
+
98
+ ## Pre-commit Hook
99
+
100
+ The `--with-hooks` flag appends a security gate to `.git/hooks/pre-commit`. Commits are blocked if any exploit test fails:
101
+
102
+ ```
103
+ ❌ Security tests failed. Commit blocked.
104
+ ```
105
+
106
+ The hook is non-destructive — it appends to any existing hook content rather than overwriting it.
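The append-only behaviour can be sketched as a small pure function. This is a simplified illustration, assuming the marker comment `# tdd-remediation: security gate` that the installer writes; `appendGate` is a hypothetical name, and the real installer also handles file permissions and a missing `.git` directory.

```javascript
const MARKER = '# tdd-remediation: security gate';

// Hypothetical helper: returns the new hook content, or the input unchanged
// if the gate is already present (so repeated installs are idempotent).
function appendGate(existing, testCommand) {
  const base = existing || '#!/bin/sh\n';
  if (base.includes(MARKER)) return base;
  const gate = [
    MARKER,
    testCommand,
    'if [ $? -ne 0 ]; then',
    '  echo "Security tests failed. Commit blocked."',
    '  exit 1',
    'fi',
    '',
  ].join('\n');
  return base.trimEnd() + '\n\n' + gate;
}

const once = appendGate('#!/bin/sh\nnpm run lint\n', 'npm run test:security --silent');
const twice = appendGate(once, 'npm run test:security --silent');
console.log(once);
console.log(once === twice); // true
```

One caveat worth checking in your own hook: if the existing script ends with an explicit `exit`, anything appended after it never runs.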
46
107
 
47
108
  ## License
48
109
 
package/SKILL.md CHANGED
@@ -9,13 +9,13 @@ Applying Test-Driven Development (TDD) to code that has already been generated r
9
9
 
10
10
  ## Autonomous Audit Mode
11
11
  If the user asks you to "Run the TDD Remediation Auto-Audit" or asks you to implement this on your own:
12
- 1. **Explore**: Proactively use your tools (like `grep_search`, `view_file`, and `list_dir`) to scan the user's repository. Focus on `controllers/`, `routes/`, `api/`, and database files. Search for anti-patterns: missing authorization checks, unparameterized SQL queries, and lack of sanitization.
13
- 2. **Plan**: Identify the active vulnerabilities and outline them to the user.
14
- 3. **Self-Implement**: For *each* vulnerability found, autonomously execute the complete 3-phase protocol:
12
+ 1. **Explore**: Proactively use `Glob`, `Grep`, and `Read` to scan the repository. Focus on `controllers/`, `routes/`, `api/`, `middleware/`, and database files. Search for anti-patterns: unparameterized SQL queries, missing ownership checks, unsafe HTML rendering, and command injection sinks. Full search patterns are in [auto-audit.md](./prompts/auto-audit.md).
13
+ 2. **Plan**: Present a structured list of vulnerabilities (grouped by severity: CRITICAL / HIGH / MEDIUM / LOW) and get confirmation before making any changes.
14
+ 3. **Self-Implement**: For *each* confirmed vulnerability, autonomously execute the complete 3-phase protocol:
15
15
  - **[Phase 1 (Red)](./prompts/red-phase.md)**: Write the exploit test ensuring it fails.
16
16
  - **[Phase 2 (Green)](./prompts/green-phase.md)**: Write the security patch ensuring the test passes.
17
- - **[Phase 3 (Refactor)](./prompts/refactor-phase.md)**: Clean the code and ensure no business logic broke.
18
- Move methodically through the vulnerabilities one by one.
17
+ - **[Phase 3 (Refactor)](./prompts/refactor-phase.md)**: Run the full test suite and ensure no business logic broke.
18
+ Move methodically through vulnerabilities one by one, CRITICAL-first. Do not advance until the current vulnerability is fully remediated.
19
19
 
20
20
  ---
21
21
 
package/index.js CHANGED
@@ -4,56 +4,251 @@ const fs = require('fs');
4
4
  const path = require('path');
5
5
  const os = require('os');
6
6
 
7
- const isLocal = process.argv.includes('--local');
7
+ const args = process.argv.slice(2);
8
+ const isLocal = args.includes('--local');
9
+ const isClaude = args.includes('--claude');
10
+ const withHooks = args.includes('--with-hooks');
11
+ const skipScan = args.includes('--skip-scan');
12
+
8
13
  const agentBaseDir = isLocal ? process.cwd() : os.homedir();
14
+ const agentDirName = isClaude ? '.claude' : '.agents';
15
+ const projectDir = process.cwd();
16
+
17
+ const targetSkillDir = path.join(agentBaseDir, agentDirName, 'skills', 'tdd-remediation');
18
+ const targetWorkflowDir = path.join(agentBaseDir, agentDirName, 'workflows');
19
+ const targetTestDir = path.join(projectDir, '__tests__', 'security');
20
+
21
+ // ─── 1. Framework Detection ──────────────────────────────────────────────────
22
+
23
+ function detectFramework() {
24
+ const pkgPath = path.join(projectDir, 'package.json');
25
+ if (fs.existsSync(pkgPath)) {
26
+ try {
27
+ const pkg = JSON.parse(fs.readFileSync(pkgPath, 'utf8'));
28
+ const deps = { ...(pkg.dependencies || {}), ...(pkg.devDependencies || {}) };
29
+ if (deps.vitest) return 'vitest';
30
+ if (deps.jest || deps.supertest) return 'jest';
31
+ if (deps.mocha) return 'mocha';
32
+ } catch {}
33
+ }
34
+ if (
35
+ fs.existsSync(path.join(projectDir, 'pytest.ini')) ||
36
+ fs.existsSync(path.join(projectDir, 'pyproject.toml')) ||
37
+ fs.existsSync(path.join(projectDir, 'setup.py')) ||
38
+ fs.existsSync(path.join(projectDir, 'requirements.txt'))
39
+ ) return 'pytest';
40
+ if (fs.existsSync(path.join(projectDir, 'go.mod'))) return 'go';
41
+ return 'jest';
42
+ }
43
+
44
+ const framework = detectFramework();
45
+
46
+ // ─── 2. Quick Scan ───────────────────────────────────────────────────────────
9
47
 
10
- const targetSkillDir = path.join(agentBaseDir, '.agents', 'skills', 'tdd-remediation');
11
- const targetWorkflowDir = path.join(agentBaseDir, '.agents', 'workflows');
12
- const targetTestDir = path.join(process.cwd(), '__tests__', 'security');
48
+ const VULN_PATTERNS = [
49
+ { name: 'SQL Injection', severity: 'CRITICAL', pattern: /(`SELECT[^`]*\$\{|"SELECT[^"]*"\s*\+|execute\(f"|cursor\.execute\(.*%s|\.query\(`[^`]*\$\{)/i },
50
+ { name: 'Command Injection', severity: 'CRITICAL', pattern: /\bexec(Sync)?\s*\(.*req\.(params|body|query)|subprocess\.(run|Popen|call)\([^)]*shell\s*=\s*True/i },
51
+ { name: 'IDOR', severity: 'HIGH', pattern: /findById\s*\(\s*req\.(params|body|query)\.|findOne\s*\(\s*\{[^}]*id\s*:\s*req\.(params|body|query)/i },
52
+ { name: 'XSS', severity: 'HIGH', pattern: /[^/]innerHTML\s*=(?!=)|dangerouslySetInnerHTML\s*=\s*\{\{|document\.write\s*\(|res\.send\s*\(`[^`]*\$\{req\./i },
53
+ { name: 'Path Traversal', severity: 'HIGH', pattern: /(readFile|sendFile|createReadStream|open)\s*\(.*req\.(params|body|query)|path\.join\s*\([^)]*req\.(params|body|query)/i },
54
+ { name: 'Broken Auth', severity: 'HIGH', pattern: /jwt\.decode\s*\((?![^;]*\.verify)|verify\s*:\s*false|secret\s*=\s*['"][a-z0-9]{1,20}['"]/i },
55
+ ];
13
56
 
14
- console.log(`Installing TDD Remediation Skill (${isLocal ? 'Local' : 'Global'})...`);
57
+ const SCAN_EXTENSIONS = new Set(['.js', '.ts', '.jsx', '.tsx', '.mjs', '.py', '.go']);
58
+ const SKIP_DIRS = new Set(['node_modules', '.git', 'dist', 'build', '.next', 'out', '__pycache__', 'venv', '.venv', 'vendor']);
15
59
 
16
- // 1. Install the Skill
17
- if (!fs.existsSync(targetSkillDir)) {
18
- fs.mkdirSync(targetSkillDir, { recursive: true });
60
+ function* walkFiles(dir) {
61
+ let entries;
62
+ try { entries = fs.readdirSync(dir, { withFileTypes: true }); } catch { return; }
63
+ for (const entry of entries) {
64
+ if (SKIP_DIRS.has(entry.name)) continue;
65
+ const fullPath = path.join(dir, entry.name);
66
+ if (entry.isDirectory()) yield* walkFiles(fullPath);
67
+ else if (SCAN_EXTENSIONS.has(path.extname(entry.name))) yield fullPath;
68
+ }
19
69
  }
20
70
 
21
- // Copy the specific skill files and directories
22
- const filesToCopy = ['SKILL.md', 'prompts', 'templates'];
23
- for (const item of filesToCopy) {
24
- const sourcePath = path.join(__dirname, item);
25
- const targetPath = path.join(targetSkillDir, item);
26
- if (fs.existsSync(sourcePath)) {
27
- fs.cpSync(sourcePath, targetPath, { recursive: true });
71
+ function quickScan() {
72
+ const findings = [];
73
+ for (const filePath of walkFiles(projectDir)) {
74
+ let lines;
75
+ try { lines = fs.readFileSync(filePath, 'utf8').split('\n'); } catch { continue; }
76
+ for (let i = 0; i < lines.length; i++) {
77
+ for (const vuln of VULN_PATTERNS) {
78
+ if (vuln.pattern.test(lines[i])) {
79
+ findings.push({
80
+ severity: vuln.severity,
81
+ name: vuln.name,
82
+ file: path.relative(projectDir, filePath),
83
+ line: i + 1,
84
+ snippet: lines[i].trim().slice(0, 80),
85
+ });
86
+ break; // one finding per line
87
+ }
88
+ }
89
+ }
28
90
  }
91
+ return findings;
92
+ }
93
+
94
+ function printFindings(findings) {
95
+ if (findings.length === 0) {
96
+ console.log(' ✅ No obvious vulnerability patterns detected.\n');
97
+ return;
98
+ }
99
+ const bySeverity = { CRITICAL: [], HIGH: [], MEDIUM: [], LOW: [] };
100
+ for (const f of findings) (bySeverity[f.severity] || bySeverity.LOW).push(f);
101
+ const icons = { CRITICAL: '🔴', HIGH: '🟠', MEDIUM: '🟡', LOW: '🔵' };
102
+
103
+ console.log(`\n Found ${findings.length} potential issue(s):\n`);
104
+ for (const [sev, list] of Object.entries(bySeverity)) {
105
+ if (!list.length) continue;
106
+ for (const f of list) {
107
+ console.log(` ${icons[sev]} [${sev}] ${f.name} — ${f.file}:${f.line}`);
108
+ console.log(` ${f.snippet}`);
109
+ }
110
+ }
111
+ console.log('\n Run /tdd-audit in your agent to remediate.\n');
112
+ }
113
+
114
+ // ─── 3. Install Skill Files ───────────────────────────────────────────────────
115
+
116
+ console.log(`\nInstalling TDD Remediation Skill (${isLocal ? 'local' : 'global'}, framework: ${framework})...\n`);
117
+
118
+ if (!fs.existsSync(targetSkillDir)) fs.mkdirSync(targetSkillDir, { recursive: true });
119
+
120
+ for (const item of ['SKILL.md', 'prompts', 'templates']) {
121
+ const src = path.join(__dirname, item);
122
+ const dest = path.join(targetSkillDir, item);
123
+ if (fs.existsSync(src)) fs.cpSync(src, dest, { recursive: true });
29
124
  }
30
125
 
31
- // 2. Scaffold the security-tests directory
126
+ // ─── 4. Scaffold Security Test Boilerplate ────────────────────────────────────
127
+
32
128
  if (!fs.existsSync(targetTestDir)) {
33
129
  fs.mkdirSync(targetTestDir, { recursive: true });
34
- console.log(`Created security test directory at ${targetTestDir}`);
130
+ console.log(`✅ Created ${path.relative(projectDir, targetTestDir)}/`);
131
+ }
132
+
133
+ const testTemplateMap = {
134
+ jest: 'sample.exploit.test.js',
135
+ vitest: 'sample.exploit.test.vitest.js',
136
+ mocha: 'sample.exploit.test.js',
137
+ pytest: 'sample.exploit.test.pytest.py',
138
+ go: 'sample.exploit.test.go',
139
+ };
140
+
141
+ const testTemplateName = testTemplateMap[framework];
142
+ const srcTest = path.join(__dirname, 'templates', testTemplateName);
143
+ const destTest = path.join(targetTestDir, testTemplateName);
144
+
145
+ if (!fs.existsSync(destTest) && fs.existsSync(srcTest)) {
146
+ fs.copyFileSync(srcTest, destTest);
147
+ console.log(`✅ Scaffolded ${path.relative(projectDir, destTest)}`);
35
148
  }
36
149
 
37
- const sourceTestFile = path.join(__dirname, 'templates', 'sample.exploit.test.js');
38
- const targetTestFile = path.join(targetTestDir, 'sample.exploit.test.js');
150
+ // ─── 5. Install Workflow Shortcode ────────────────────────────────────────────
39
151
 
40
- if (!fs.existsSync(targetTestFile)) {
41
- fs.copyFileSync(sourceTestFile, targetTestFile);
42
- console.log(`Scaffolded boilerplate exploit test at ${targetTestFile}`);
152
+ if (!fs.existsSync(targetWorkflowDir)) fs.mkdirSync(targetWorkflowDir, { recursive: true });
153
+ const srcWorkflow = path.join(__dirname, 'workflows', 'tdd-audit.md');
154
+ const destWorkflow = path.join(targetWorkflowDir, 'tdd-audit.md');
155
+ if (fs.existsSync(srcWorkflow)) {
156
+ fs.copyFileSync(srcWorkflow, destWorkflow);
157
+ console.log(`✅ Installed /tdd-audit workflow shortcode`);
43
158
  }
44
159
 
45
- // 3. Install the workflow shortcode
46
- if (!fs.existsSync(targetWorkflowDir)) {
47
- fs.mkdirSync(targetWorkflowDir, { recursive: true });
160
+ // ─── 6. Inject test:security into package.json ────────────────────────────────
161
+
162
+ const pkgPath = path.join(projectDir, 'package.json');
163
+ if (framework !== 'pytest' && framework !== 'go' && fs.existsSync(pkgPath)) {
164
+ try {
165
+ const pkg = JSON.parse(fs.readFileSync(pkgPath, 'utf8'));
166
+ if (!pkg.scripts?.['test:security']) {
167
+ pkg.scripts = pkg.scripts || {};
168
+ pkg.scripts['test:security'] = {
169
+ jest: 'jest --testPathPattern=__tests__/security --forceExit',
170
+ vitest: 'vitest run __tests__/security',
171
+ mocha: "mocha '__tests__/security/**/*.spec.js'",
172
+ }[framework] || 'jest --testPathPattern=__tests__/security --forceExit';
173
+ fs.writeFileSync(pkgPath, JSON.stringify(pkg, null, 2) + '\n');
174
+ console.log(`✅ Added "test:security" script to package.json`);
175
+ } else {
176
+ console.log(` "test:security" already in package.json — skipped`);
177
+ }
178
+ } catch (e) {
179
+ console.warn(` ⚠️ Could not update package.json: ${e.message}`);
180
+ }
181
+ }
182
+
183
+ // ─── 7. Scaffold CI Workflow ─────────────────────────────────────────────────
184
+
185
+ const ciWorkflowDir = path.join(projectDir, '.github', 'workflows');
186
+ const ciWorkflowPath = path.join(ciWorkflowDir, 'security-tests.yml');
187
+
188
+ if (!fs.existsSync(ciWorkflowPath)) {
189
+ const ciTemplateMap = {
190
+ jest: 'security-tests.node.yml',
191
+ vitest: 'security-tests.node.yml',
192
+ mocha: 'security-tests.node.yml',
193
+ pytest: 'security-tests.python.yml',
194
+ go: 'security-tests.go.yml',
195
+ };
196
+ const ciTemplatePath = path.join(__dirname, 'templates', 'workflows', ciTemplateMap[framework]);
197
+ if (fs.existsSync(ciTemplatePath)) {
198
+ fs.mkdirSync(ciWorkflowDir, { recursive: true });
199
+ fs.copyFileSync(ciTemplatePath, ciWorkflowPath);
200
+ console.log(`✅ Scaffolded .github/workflows/security-tests.yml`);
201
+ }
202
+ } else {
203
+ console.log(` .github/workflows/security-tests.yml already exists — skipped`);
204
+ }
205
+
206
+ // ─── 8. Pre-commit Hook (opt-in) ─────────────────────────────────────────────
207
+
208
+ if (withHooks) {
209
+ const gitDir = path.join(projectDir, '.git');
210
+ if (fs.existsSync(gitDir)) {
211
+ const hooksDir = path.join(gitDir, 'hooks');
212
+ if (!fs.existsSync(hooksDir)) fs.mkdirSync(hooksDir);
213
+ const hookPath = path.join(hooksDir, 'pre-commit');
214
+
215
+ const testCmd = {
216
+ pytest: 'pytest tests/security/ -q',
217
+ go: 'go test ./security/... -v',
218
+ }[framework] || 'npm run test:security --silent';
219
+
220
+ const injection = [
221
+ '# tdd-remediation: security gate',
222
+ testCmd,
223
+ 'if [ $? -ne 0 ]; then',
224
+ ' printf "\\n\\033[0;31m❌ Security tests failed. Commit blocked.\\033[0m\\n"',
225
+ ' exit 1',
226
+ 'fi',
227
+ '',
228
+ ].join('\n');
229
+
230
+ const existing = fs.existsSync(hookPath) ? fs.readFileSync(hookPath, 'utf8') : '#!/bin/sh\n';
231
+ if (existing.includes('tdd-remediation')) {
232
+ console.log(` Pre-commit hook already has security gate — skipped`);
233
+ } else {
234
+ const newContent = existing.trimEnd() + '\n\n' + injection;
235
+ fs.writeFileSync(hookPath, newContent);
236
+ fs.chmodSync(hookPath, '755');
237
+ console.log(`✅ Installed pre-commit hook (.git/hooks/pre-commit)`);
238
+ }
239
+ } else {
240
+ console.warn(` ⚠️ No .git directory found — skipping pre-commit hook`);
241
+ }
48
242
  }
49
243
 
50
- const sourceWorkflowFile = path.join(__dirname, 'workflows', 'tdd-audit.md');
51
- const targetWorkflowFile = path.join(targetWorkflowDir, 'tdd-audit.md');
244
+ // ─── 9. Quick Scan ───────────────────────────────────────────────────────────
52
245
 
53
- if (fs.existsSync(sourceWorkflowFile)) {
54
- fs.copyFileSync(sourceWorkflowFile, targetWorkflowFile);
55
- console.log(`Installed shortcode workflow at ${targetWorkflowFile}`);
246
+ if (!skipScan) {
247
+ process.stdout.write('\n🔍 Scanning for vulnerability patterns...');
248
+ const findings = quickScan();
249
+ process.stdout.write('\n');
250
+ printFindings(findings);
56
251
  }
57
252
 
58
- console.log(`Successfully installed TDD Remediation skill to ${targetSkillDir}`);
59
- console.log('You can now use `/tdd-audit` in your Anti-Gravity chat!');
253
+ console.log(`\nSkill installed to ${path.relative(os.homedir(), targetSkillDir)}`);
254
+ console.log('Run /tdd-audit in your agent to begin remediation.\n');
package/package.json CHANGED
@@ -1,13 +1,40 @@
1
1
  {
2
2
  "name": "@lhi/tdd-audit",
3
- "version": "1.0.0",
4
- "description": "Anti-Gravity Skill for TDD Remediation",
3
+ "version": "1.1.0",
4
+ "description": "Anti-Gravity Skill for TDD Remediation. Patches security vulnerabilities using a Red-Green-Refactor protocol with automated exploit tests.",
5
5
  "main": "index.js",
6
6
  "bin": {
7
- "tdd-audit": "./index.js"
7
+ "tdd-audit": "index.js"
8
8
  },
9
+ "files": [
10
+ "index.js",
11
+ "SKILL.md",
12
+ "prompts/",
13
+ "templates/",
14
+ "workflows/",
15
+ "README.md",
16
+ "LICENSE"
17
+ ],
9
18
  "scripts": {
10
- "test": "echo \"Error: no test specified\" && exit 1"
19
+ "test": "node index.js --local --skip-scan && echo 'Smoke test passed'",
20
+ "test:security": "jest --testPathPattern=__tests__/security --forceExit"
21
+ },
22
+ "keywords": [
23
+ "security",
24
+ "tdd",
25
+ "test-driven-development",
26
+ "vulnerability",
27
+ "remediation",
28
+ "exploit",
29
+ "red-green-refactor",
30
+ "owasp",
31
+ "audit",
32
+ "claude",
33
+ "ai-agent",
34
+ "skill"
35
+ ],
36
+ "engines": {
37
+ "node": ">=16.7.0"
11
38
  },
12
39
  "author": "Kyra Lee",
13
40
  "license": "MIT"
@@ -1,19 +1,120 @@
1
1
  # TDD Remediation: Auto-Audit Mode
2
2
 
3
- When invoked in Auto-Audit mode, you must proactively secure the user's entire repository without waiting for explicit files to be provided.
3
+ When invoked in Auto-Audit mode, proactively secure the user's entire repository without waiting for explicit files to be provided.
4
4
 
5
5
  ## Phase 0: Discovery
6
- 1. **Explore the Architecture**: Use your `list_dir` and `view_file` tools to understand the project structure. Look for directories named `controllers`, `routes`, `api`, `services`, or `models`.
7
- 2. **Search for Anti-Patterns**: Use your `grep_search` tool to look for common vulnerabilities:
8
- - *SQL Injection*: Search for raw query strings, e.g., `` `SELECT * FROM users WHERE id = ${req.body.id}` ``
9
- - *IDOR*: Search for direct lookups without tenant or user ID checks.
10
- - *XSS*: Search for raw HTML rendering `innerHTML`, `dangerouslySetInnerHTML`, or similar sinks.
11
- 3. **Present Findings**: Provide a list of identified vulnerabilities to the user before proceeding.
12
-
13
- ## Phase 1 to 3: Remediation Engine
14
- For each vulnerability approved for fixing, you must rigorously apply the RED-GREEN-REFACTOR protocol:
15
- 1. **[RED](./red-phase.md)**: Write the exploit test in `__tests__/security/` and run it to prove the vulnerability exists.
16
- 2. **[GREEN](./green-phase.md)**: Write the patch and run the tests to prove the exploit is blocked.
17
- 3. **[REFACTOR](./refactor-phase.md)**: Ensure standard functionality is maintained and existing tests pass.
18
-
19
- Do not move to the next vulnerability until the current one is fully remediated and tested.
6
+
7
+ ### 0a. Explore the Architecture
8
+ Use `Glob` and `Read` to understand the project structure. Focus on:
9
+ - `controllers/`, `routes/`, `api/`, `handlers/`: request entry points
10
+ - `services/`, `models/`, `db/`, `repositories/`: data access
11
+ - `middleware/`, `utils/`, `helpers/`, `lib/`: shared utilities
12
+ - Config files: `*.env`, `config.js`, `settings.py` — secrets and security settings
13
+
14
+ ### 0b. Search for Anti-Patterns
15
+ Use `Grep` with the following patterns to surface candidates. Read the matched files to confirm before reporting.
16
+
17
+ **SQL Injection**
18
+ ```
19
+ `SELECT.*\$\{ # template literal SQL (JS/TS)
20
+ "SELECT.*" \+ # string concatenation SQL (Java/Python/JS)
21
+ execute\(f" # f-string SQL (Python)
22
+ cursor\.execute\(.*% # %-formatted SQL (Python)
23
+ raw\( # Django raw() queries
24
+ \.query\(` # tagged template DB calls
25
+ ```
26
+
27
+ **IDOR / Missing Ownership Checks**
28
+ ```
29
+ findById\(req\. # lookup directly from request params without user scope
30
+ params\.id # request param used in a DB lookup
31
+ req\.body\.userId # trusting client-supplied user ID
32
+ findOne\(\{.*id:.*req # DB findOne keyed only to request param
33
+ ```
34
+
35
+ **XSS / Unsafe Rendering**
36
+ ```
37
+ innerHTML\s*= # direct DOM write
38
+ dangerouslySetInnerHTML # React unsafe HTML
39
+ \.write\( # document.write
40
+ res\.send\(.*req\. # reflecting request data directly into response
41
+ render_template_string # Flask dynamic template with user input
42
+ ```
43
+
44
+ **Command Injection**
45
+ ```
46
+ exec\(.*req\. # exec with request data
47
+ execSync\(.*req\. # execSync with request data
48
+ shell=True # Python subprocess with shell=True
49
+ child_process # review all child_process usages
50
+ ```
51
+
52
+ **Path Traversal**
53
+ ```
54
+ readFile.*req\. # file read from request param
55
+ sendFile.*req\. # file send from request param
56
+ join.*req\.params # path.join with user input
57
+ open\(.*request\. # Python file open with request data
58
+ ```
59
+
60
+ **Broken Authentication**
61
+ ```
62
+ jwt\.decode\( # JWT decoded but not verified
63
+ verify.*false # verification disabled
64
+ secret.*=.*['"] # hardcoded secrets
65
+ Bearer.*hardcoded # hardcoded tokens
66
+ ```
67
+
68
+ **Missing Rate Limiting**
69
+ ```
70
+ router\.(post|put|delete) # mutation routes (check for rate-limit middleware)
71
+ app\.post\( # POST handlers (check for rate-limit middleware)
72
+ ```
73
+
74
+ ### 0c. Present Findings
75
+ Before touching any code, output a structured **Audit Report** with this format:
76
+
77
+ ```
78
+ ## Audit Findings
79
+
80
+ ### CRITICAL
81
+ - [ ] [SQLi] `src/routes/users.js:34` — raw template literal in SELECT query
82
+ - [ ] [IDOR] `src/controllers/docs.js:87` — findById(req.params.id) with no ownership check
83
+
84
+ ### HIGH
85
+ - [ ] [XSS] `src/api/comments.js:52` — req.body.content reflected via res.send()
86
+ - [ ] [CmdInj] `src/utils/export.js:19` — exec() called with req.body.filename
87
+
88
+ ### MEDIUM
89
+ - [ ] [PathTraversal] `src/routes/files.js:41` — path.join with req.params.name, no bounds check
90
+ - [ ] [BrokenAuth] `src/middleware/auth.js:12` — JWT decoded without signature verification
91
+
92
+ ### LOW / INFORMATIONAL
93
+ - [ ] [RateLimit] `src/routes/auth.js` — /login endpoint has no rate limiting
94
+ ```
95
+
96
+ Ask the user to confirm the list before beginning remediation. If they say "fix all" or "proceed", work through them top-down (CRITICAL first).
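The top-down ordering can be sketched as a stable sort over the confirmed findings. This is an illustration only; `orderFindings` and the shape of the finding objects are assumptions, not something the prompt prescribes.

```javascript
const SEVERITY_ORDER = ['CRITICAL', 'HIGH', 'MEDIUM', 'LOW'];

// Sort a copy of the findings so the most severe come first;
// unknown severities are placed last.
function orderFindings(findings) {
  const rank = (severity) => {
    const index = SEVERITY_ORDER.indexOf(severity);
    return index === -1 ? SEVERITY_ORDER.length : index;
  };
  return [...findings].sort((a, b) => rank(a.severity) - rank(b.severity));
}

const queue = orderFindings([
  { id: 'XSS', severity: 'HIGH' },
  { id: 'RateLimit', severity: 'LOW' },
  { id: 'SQLi', severity: 'CRITICAL' },
]);
console.log(queue.map((f) => f.id)); // [ 'SQLi', 'XSS', 'RateLimit' ]
```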
97
+
98
+ ---
99
+
100
+ ## Phase 1–3: Remediation Engine
101
+
102
+ For **each** confirmed vulnerability, rigorously apply the RED-GREEN-REFACTOR protocol in order:
103
+
104
+ 1. **[RED](./red-phase.md)**: Write the exploit test in `__tests__/security/` and run it to prove the vulnerability exists (test must fail).
105
+ 2. **[GREEN](./green-phase.md)**: Apply the targeted patch. Run the exploit test — it must now pass.
106
+ 3. **[REFACTOR](./refactor-phase.md)**: Run the full test suite. All tests must be green before moving on.
107
+
108
+ **Do not move to the next vulnerability until the current one is fully remediated and all tests pass.**
109
+
110
+ After all vulnerabilities are addressed, output a final **Remediation Summary**:
111
+
112
+ ```
113
+ ## Remediation Summary
114
+
115
+ | Vulnerability | File | Status | Test File |
116
+ |---|---|---|---|
117
+ | SQLi | src/routes/users.js:34 | ✅ Fixed | __tests__/security/sqli-users.test.js |
118
+ | IDOR | src/controllers/docs.js:87 | ✅ Fixed | __tests__/security/idor-docs.test.js |
119
+ | XSS | src/api/comments.js:52 | ✅ Fixed | __tests__/security/xss-comments.test.js |
120
+ ```
@@ -1,12 +1,210 @@
1
1
  # TDD Remediation: The Patch (Green Phase)
2
2
 
3
- Once the failing test is committed to the codebase, it is time to write the remediation code.
3
+ Once the failing exploit test is committed, write the minimum code required to make it pass. Do not over-engineer: a targeted fix is safer than a rewrite.
4
4
 
5
5
  ## Action
6
- Apply the AI-generated security patch to the relevant routes, database configurations, sanitization utilities, or controllers.
6
+ Apply a security patch to the relevant routes, middleware, database layer, or sanitization utilities. Run the test suite. The exploit test from Phase 1 (Red) must now pass.
7
7
 
8
8
  ## Protocol
9
- Run the test suite again. The exploit test from **Phase 1 (Red)** must now be blocked gracefully resulting in a passing test suite.
9
+ 1. Identify the **root cause**, not just the symptom. A 500 error is not a security fix.
10
+ 2. Apply the narrowest patch that closes the vulnerability.
11
+ 3. Run the full test suite. The exploit test must pass AND all pre-existing tests must remain green.
12
+ 4. If the test still fails, your patch is incomplete — do not move on.
10
13
 
11
14
  ## Goal
12
- Prove definitively that the specific vulnerability is patched without relying on manual clicking, guessing, or superficial UI changes. If the test still fails, your security fix is incomplete.
15
+ Prove definitively that the specific vulnerability is closed without relying on manual testing, guessing, or superficial UI changes.
16
+
17
+ ---
18
+
19
+ ## Vulnerability-Specific Patch Strategies
20
+
21
+ ### IDOR (Insecure Direct Object Reference) / Tenant Isolation
22
+
23
+ **Root cause:** Resource lookups that use a user-supplied ID without verifying ownership.
24
+
25
+ **Fix:** Scope every database query to the authenticated user's ID or tenant ID. Never trust the client.
26
+
27
+ ```javascript
28
+ // BEFORE (vulnerable)
29
+ const record = await db.records.findById(req.params.id);
30
+
31
+ // AFTER (patched)
32
+ const record = await db.records.findOne({
33
+ id: req.params.id,
34
+ userId: req.user.id, // enforce ownership at query level
35
+ });
36
+ if (!record) return res.status(403).json({ error: 'Forbidden' });
37
+ ```
38
+
39
+ ```python
40
+ # BEFORE (vulnerable)
41
+ record = db.query(Record).filter(Record.id == record_id).first()
42
+
43
+ # AFTER (patched)
44
+ record = db.query(Record).filter(
45
+ Record.id == record_id,
46
+ Record.user_id == current_user.id # enforce ownership
47
+ ).first()
48
+ if not record:
49
+ raise HTTPException(status_code=403, detail="Forbidden")
50
+ ```
51
+
52
+ **Libraries:** Built-in ORM scoping; no extra library needed.
53
+
54
+ ---
55
+
56
+ ### XSS (Cross-Site Scripting)
57
+
58
+ **Root cause:** User input is reflected into HTML, JS, or DOM without encoding or sanitization.
59
+
60
+ **Fix options (choose the appropriate layer):**
61
+ - **Storage:** Sanitize on write using a safe library.
62
+ - **Rendering:** Escape on output; never use `innerHTML` with user data.
63
+ - **API responses:** Set `Content-Type: application/json` strictly; never reflect raw input.
64
+
65
+ ```javascript
66
+ // BEFORE (vulnerable — Express)
67
+ res.send(`<p>Hello ${req.query.name}</p>`);
68
+
69
+ // AFTER — Option A: escape on output
70
+ const escapeHtml = require('escape-html');
71
+ res.send(`<p>Hello ${escapeHtml(req.query.name)}</p>`);
72
+
73
+ // AFTER — Option B: sanitize rich HTML (for WYSIWYG content)
74
+ const DOMPurify = require('isomorphic-dompurify');
75
+ const clean = DOMPurify.sanitize(req.body.content, { ALLOWED_TAGS: ['b', 'i', 'em'] });
76
+ res.json({ content: clean });
77
+ ```
78
+
79
+ ```python
80
+ # BEFORE (vulnerable: user input is interpolated into the template source, so autoescaping never sees it)
81
+ return render_template_string(f"<p>{user_input}</p>")
82
+
83
+ # AFTER: escape the value explicitly (or pass it as a template variable so Jinja2 autoescaping applies)
84
+ from markupsafe import escape
85
+ return f"<p>{escape(user_input)}</p>"
86
+
87
+ # For sanitizing rich HTML
88
+ import bleach
89
+ clean = bleach.clean(user_input, tags=['b', 'i', 'em'], strip=True)
90
+ ```
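To make "escape on output" concrete, here is a hand-rolled sketch of HTML entity escaping. It is an illustration only, not the implementation of `escape-html` or `markupsafe`, and real code should prefer those libraries; `escapeHtmlSketch` is a hypothetical name.

```javascript
// Characters that can change the meaning of HTML text or attribute values.
const HTML_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Replace each special character with its entity so the browser renders
// the input as text instead of parsing it as markup.
function escapeHtmlSketch(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

console.log(escapeHtmlSketch('<script>alert(1)</script>'));
// prints: &lt;script&gt;alert(1)&lt;/script&gt;
```

This covers HTML text and quoted attribute contexts only; values placed inside `<script>` blocks, URLs, or CSS need context-specific encoding.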
91
+
92
+ **Libraries:** `escape-html`, `isomorphic-dompurify` (Node); `markupsafe`, `bleach` (Python).
93
+
94
+ ---
95
+
96
+ ### SQL Injection
97
+
98
+ **Root cause:** User input is concatenated directly into a SQL query string.
99
+
100
+ **Fix:** Use parameterized queries or ORM methods exclusively. Never build SQL strings from user input.
101
+
102
+ ```javascript
103
+ // BEFORE (vulnerable)
104
+ const result = await db.query(`SELECT * FROM users WHERE email = '${email}'`);
105
+
106
+ // AFTER — parameterized (node-postgres / pg)
107
+ const result = await db.query('SELECT * FROM users WHERE email = $1', [email]);
108
+
109
+ // AFTER: ORM (Sequelize syntax shown; other ORMs such as Prisma also parameterize their query builders)
110
+ const user = await User.findOne({ where: { email } }); // safe by default
111
+ ```
112
+
113
+ ```python
114
+ # BEFORE (vulnerable)
115
+ cursor.execute(f"SELECT * FROM users WHERE email = '{email}'")
116
+
117
+ # AFTER — parameterized
118
+ cursor.execute("SELECT * FROM users WHERE email = %s", (email,))
119
+
120
+ # AFTER — ORM (SQLAlchemy)
121
+ user = db.query(User).filter(User.email == email).first()
122
+ ```
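The difference between the two approaches shows up in what the database driver receives. The sketch below involves no database; `buildUnsafe` and `buildParameterized` are hypothetical helpers, and the `{ text, values }` shape mirrors the query-config object that node-postgres accepts.

```javascript
const payload = "' OR '1'='1";

// Concatenation: the payload becomes part of the SQL text itself.
function buildUnsafe(email) {
  return `SELECT * FROM users WHERE email = '${email}'`;
}

// Parameterization: the SQL text is fixed and the value travels separately.
function buildParameterized(email) {
  return { text: 'SELECT * FROM users WHERE email = $1', values: [email] };
}

console.log(buildUnsafe(payload));
// prints: SELECT * FROM users WHERE email = '' OR '1'='1'
console.log(buildParameterized(payload).text);
// prints: SELECT * FROM users WHERE email = $1
```

In the first case the `WHERE` clause is rewritten into a tautology; in the second the payload is only ever compared against the `email` column as a literal string.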
123
+
124
+ **Libraries:** Use your existing ORM. Never use raw string interpolation for queries.
125
+
126
+ ---
127
+
128
+ ### Command Injection
129
+
130
+ **Root cause:** User input is passed to `exec`, `spawn`, `subprocess.run(shell=True)`, or similar without validation.
131
+
132
+ **Fix:** Use argument arrays (never shell strings), allowlists, or eliminate the shell call entirely.
133
+
134
+ ```javascript
135
+ // BEFORE (vulnerable)
136
+ const { exec } = require('child_process');
137
+ exec(`convert ${req.body.filename} output.png`); // shell injection possible
138
+
139
+ // AFTER — use execFile/spawn with argument array (no shell)
140
+ const { execFile } = require('child_process');
141
+ const safeName = path.basename(req.body.filename); // strip path traversal too
142
+ execFile('convert', [safeName, 'output.png']); // no shell expansion
143
+ ```
144
+
145
+ ```python
146
+ # BEFORE (vulnerable)
147
+ subprocess.run(f"ffmpeg -i {filename} output.mp4", shell=True)
148
+
149
+ # AFTER — argument list, no shell
150
+ import subprocess, os
151
+ safe_name = os.path.basename(filename)
152
+ subprocess.run(["ffmpeg", "-i", safe_name, "output.mp4"]) # shell=False by default
153
+ ```
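The fix above also mentions allowlists, which the examples do not show. A minimal sketch, assuming filenames are restricted to a conservative character set; the pattern and the name `isSafeFilename` are illustrative choices, not a rule from this package.

```javascript
// Allowlist: letters, digits, dot, underscore, hyphen, up to 255 characters.
// The first character may not be a hyphen (could be parsed as an option by
// the called program) or a dot (hidden files, "..").
const SAFE_FILENAME = /^[A-Za-z0-9_][A-Za-z0-9._-]{0,254}$/;

function isSafeFilename(name) {
  return typeof name === 'string' && SAFE_FILENAME.test(name);
}

console.log(isSafeFilename('report.pdf'));                   // prints true
console.log(isSafeFilename('report.pdf; rm -rf /tmp/test')); // prints false
console.log(isSafeFilename('-version'));                     // prints false
console.log(isSafeFilename('../etc/passwd'));                // prints false
```

Validating before calling `execFile` lets the handler return a 400 for hostile input rather than passing a mangled name to the external program.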
154
+
155
+ ---
156
+
157
+ ### Path Traversal
158
+
159
+ **Root cause:** User-supplied file paths are used to read/write files without normalization or bounds checking.
160
+
161
+ **Fix:** Normalize the path and assert it stays within the allowed directory.
162
+
163
+ ```javascript
164
+ // BEFORE (vulnerable)
165
+ const filePath = path.join(__dirname, 'uploads', req.params.filename);
166
+ res.sendFile(filePath); // '../../../etc/passwd' bypass possible
167
+
168
+ // AFTER
169
+ const UPLOADS_DIR = path.resolve(__dirname, 'uploads');
170
+ const requested = path.resolve(UPLOADS_DIR, req.params.filename);
171
+ if (!requested.startsWith(UPLOADS_DIR + path.sep)) {
172
+ return res.status(400).json({ error: 'Invalid path' });
173
+ }
174
+ res.sendFile(requested);
175
+ ```
176
+
177
+ ```python
178
+ # AFTER (Python)
179
+ import os
180
+ UPLOADS_DIR = os.path.realpath("uploads")
181
+ requested = os.path.realpath(os.path.join(UPLOADS_DIR, filename))
182
+ if not requested.startswith(UPLOADS_DIR + os.sep):
183
+     raise HTTPException(status_code=400, detail="Invalid path")
184
+ ```
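The bounds check can be packaged as a small helper and tried against a few inputs. This is a sketch using only Node's built-in `path` module; `resolveInside` is a hypothetical name.

```javascript
const path = require('path');

// Resolve userPath against baseDir. Returns the absolute path when it stays
// inside baseDir, or null when it escapes.
function resolveInside(baseDir, userPath) {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, userPath);
  return target.startsWith(base + path.sep) ? target : null;
}

const base = path.resolve('uploads');
console.log(resolveInside(base, 'photo.png') === path.join(base, 'photo.png')); // prints true
console.log(resolveInside(base, '../../../etc/passwd')); // prints null
console.log(resolveInside(base, '../uploads-evil/x'));   // prints null
```

The last case is why the comparison appends `path.sep`: without it, a sibling directory whose name merely begins with `uploads` would pass the prefix check. Note that `path.resolve` does not follow symlinks; if the uploads directory can contain symlinks, resolving with `fs.realpathSync` first is one option.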
185
+
186
+ ---
187
+
188
+ ### Broken Authentication / Missing Authorization Middleware
189
+
190
+ **Root cause:** Routes lack authentication checks, or JWTs/sessions are not validated on sensitive endpoints.
191
+
192
+ **Fix:** Apply authentication middleware globally and opt routes out explicitly, rather than opting in per route.
193
+
194
+ ```javascript
195
+ // AFTER — Express: apply auth globally, then define public routes above it
196
+ app.get('/health', (req, res) => res.send('ok')); // public
197
+ app.use(requireAuth); // all routes below are protected
198
+
199
+ // Middleware (assumes `const jwt = require('jsonwebtoken');` earlier in the file)
200
+ function requireAuth(req, res, next) {
201
+ const token = req.headers.authorization?.split(' ')[1];
202
+ if (!token) return res.status(401).json({ error: 'Unauthorized' });
203
+ try {
204
+ req.user = jwt.verify(token, process.env.JWT_SECRET);
205
+ next();
206
+ } catch {
207
+ return res.status(401).json({ error: 'Invalid token' });
208
+ }
209
+ }
210
+ ```
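The header parsing in the middleware above, `split(' ')[1]`, accepts any scheme, so `Basic abc` would also yield a token candidate. Token verification would still reject it, so the following is optional hardening rather than a required fix. A sketch of a stricter parser; `parseBearerToken` is a hypothetical helper name.

```javascript
// Extract the token from an Authorization header only when the scheme is
// exactly "Bearer" and a single token follows. Returns null otherwise.
function parseBearerToken(header) {
  if (typeof header !== 'string') return null;
  const match = /^Bearer +(\S+)$/i.exec(header);
  return match ? match[1] : null;
}

console.log(parseBearerToken('Bearer abc.def.ghi')); // prints abc.def.ghi
console.log(parseBearerToken('Basic abc'));          // prints null
console.log(parseBearerToken(undefined));            // prints null
```

Inside `requireAuth`, the result could replace the `split` expression, with a `null` return mapped to the existing 401 response.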
@@ -17,32 +17,106 @@ Establish a measurable baseline. You now have a weaponized test case.
17
17
  ## Vulnerability-Specific Strategies
18
18
 
19
19
  ### IDOR (Insecure Direct Object Reference) / Tenant Isolation
20
- Assert that User A receives a 403 Forbidden or 404 Not Found when trying to manipulate User B's resources.
21
- * **Jest/Supertest:** `expect(response.status).toBe(403);`
22
- * **Playwright:** Verify the UI displays an unauthorized banner instead of loading the other user's dashboard.
20
+ Authenticate as User B and request a resource that belongs to User A using its ID directly.
21
+ Assert a 403 Forbidden or 404 Not Found — not a 200 returning someone else's data.
22
+ ```javascript
23
+ // Jest/Supertest
24
+ const res = await request(app)
25
+ .get(`/api/documents/${userA_doc_id}`)
26
+ .set('Authorization', `Bearer ${userB_token}`);
27
+ expect(res.status).toBe(403); // currently returns 200 with userA's data — RED
28
+ ```
29
+ ```python
30
+ # PyTest
31
+ def test_idor_exploit(client, user_b_token, user_a_resource_id):
32
+     res = client.get(f'/api/documents/{user_a_resource_id}',
33
+                      headers={'Authorization': f'Bearer {user_b_token}'})
34
+     assert res.status_code == 403 # currently 200 (RED)
35
+ ```
23
36
 
24
37
  ### XSS (Cross-Site Scripting)
25
- Submit an aggressive payload like `<script>alert(1)</script>` or `<img src=x onerror=alert(1)>`.
26
- * **Jest/Supertest:** Assert that the raw response body either HTML-escapes the payload (`&lt;script&gt;`) or rejects the input entirely.
27
- * **Playwright:** Attempt to inject the payload in a form field and verify that the script is not evaluated in the DOM.
38
+ Submit `<script>alert(1)</script>` or `<img src=x onerror=alert(1)>` as user input.
39
+ Assert the raw response body either HTML-escapes the payload or rejects the input entirely.
40
+ ```javascript
41
+ const payload = '<script>alert(1)</script>';
42
+ const res = await request(app).post('/api/comments').send({ body: payload });
43
+ // Should be escaped in the response — currently reflected raw — RED
44
+ expect(res.body.comment.body).not.toContain('<script>');
45
+ expect(res.body.comment.body).toContain('&lt;script&gt;');
46
+ ```
28
47
 
29
48
  ### SQL Injection
30
- Submit payloads attempting tautologies (e.g., `' OR 1=1 --`) or union-based extraction.
31
- * **Assertion:** Expect a 400 Bad Request or parameter rejection, and verify that the database did not actually execute the malformed query or return all records.
49
+ Submit tautology payloads (`' OR '1'='1`) or union-based extraction attempts.
50
+ Assert a 400 Bad Request or that the response does not return all records.
51
+ ```javascript
52
+ const res = await request(app)
53
+ .get('/api/users')
54
+ .query({ email: "' OR '1'='1" });
55
+ expect(res.status).toBe(400); // currently 200 with all user records — RED
56
+ expect(res.body.users).toBeUndefined();
57
+ ```
58
+ ```python
59
+ def test_sql_injection(client):
60
+     res = client.get('/api/users', params={'email': "' OR '1'='1"})
61
+     assert res.status_code == 400 # currently 200 returning all users (RED)
62
+ ```
63
+
64
+ ### Command Injection
65
+ Submit shell metacharacters in input that gets passed to a shell command.
66
+ Assert the dangerous characters are rejected (400) — not executed.
67
+ ```javascript
68
+ const res = await request(app)
69
+ .post('/api/export')
70
+ .send({ filename: 'report.pdf; rm -rf /tmp/test' });
71
+ expect(res.status).toBe(400); // currently executes the command — RED
72
+ ```
73
+
74
+ ### Path Traversal
75
+ Submit a `../` sequence in a file path parameter.
76
+ Assert a 400 Bad Request or that the server does not serve files outside the uploads directory.
77
+ ```javascript
78
+ const res = await request(app)
79
+ .get('/api/files/download')
80
+ .query({ name: '../../../etc/passwd' });
81
+ expect(res.status).toBe(400); // currently returns file contents — RED
82
+ ```
83
+
84
+ ### Broken Authentication (Unprotected Route)
85
+ Call a protected endpoint with no Authorization header.
86
+ Assert a 401 Unauthorized — not a 200 with data.
87
+ ```javascript
88
+ const res = await request(app).get('/api/admin/users'); // no auth header
89
+ expect(res.status).toBe(401); // currently returns 200 — RED
90
+ ```
32
91
 
33
92
  ---
34
93
 
35
- ## Framework Templates to Provide
94
+ ## Framework Templates
36
95
 
37
96
  ### Jest / Supertest (Node.js)
38
97
  ```javascript
39
- const response = await request(app).post('/api/endpoint').send({ exploit: true });
40
- expect(response.status).toBe(403); // Fails because it currently returns 200
98
+ const request = require('supertest');
99
+ const app = require('../../app');
100
+
101
+ describe('[VulnType] - Red Phase', () => {
102
+ it('SHOULD block [exploit description]', async () => {
103
+ const res = await request(app)
104
+ .post('/api/vulnerable-endpoint')
105
+ .send({ input: '<exploit payload>' });
106
+
107
+ expect(res.status).toBe(403); // currently 200 — this test MUST fail (Red)
108
+ expect(res.body.data).not.toContain('<exploit payload>');
109
+ });
110
+ });
41
111
  ```
42
112
 
43
- ### PyTest (Python)
113
+ ### PyTest (Python / FastAPI / Flask)
44
114
  ```python
45
- def test_idor_exploit(client, user_b_token):
46
- response = client.get('/api/user_a_resource/', headers={'Authorization': f'Bearer {user_b_token}'})
47
- assert response.status_code == 403 # Fails because it currently returns 200
115
+ def test_vuln_type_exploit(client, attacker_token):
116
+     response = client.post(
117
+         '/api/vulnerable-endpoint',
118
+         json={'input': '<exploit payload>'},
119
+         headers={'Authorization': f'Bearer {attacker_token}'}
120
+     )
121
+     assert response.status_code == 403 # currently 200 (RED)
48
122
  ```
@@ -1,14 +1,47 @@
1
1
  # TDD Remediation: Regression & Refactor (Refactor Phase)
2
2
 
3
- Security fixes can sometimes be heavy-handed and break core functionality. Now that the perimeter is secure, we must ensure the application still functions.
3
+ Security fixes can be heavy-handed and break legitimate functionality. The perimeter is now secure; confirm nothing else broke, then clean up.
4
4
 
5
5
  ## Action
6
- Run standard functional tests alongside the new security tests.
6
+ Run the **full** test suite: security tests + all pre-existing functional/integration tests.
7
7
 
8
8
  ## Protocol
9
- 1. Clean up the code and remove redundancies.
10
- 2. Ensure the intended business logic remains completely intact.
11
- 3. If a functional test breaks, **revert the patch** and prompt the AI to try a different security approach. Security that breaks functionality is not a successful patch.
9
+
10
+ ### Step 1: Verify the Green baseline
11
+ ```bash
12
+ npm test # or pytest, go test ./..., etc.
13
+ ```
14
+ All tests must be green. If any pre-existing functional test now fails, **stop and revert the security patch.** A security fix that breaks functionality is a failed fix — return to Phase 2 with a narrower approach.
15
+
16
+ ### Step 2: Check for regressions by category
17
+ Go through this checklist before closing the vulnerability:
18
+
19
+ - [ ] **Happy-path flows still work** — legitimate users can still access their own resources
20
+ - [ ] **Error messages are safe** — no stack traces, internal paths, or sensitive data leaked in error responses
21
+ - [ ] **Auth bypass not introduced** — the fix doesn't create a new unprotected code path
22
+ - [ ] **Performance acceptable** — the patch doesn't add unbounded DB queries or blocking I/O
23
+ - [ ] **No secrets in code** — patch doesn't hardcode keys, tokens, or credentials
24
+
25
+ ### Step 3: Clean the patch
26
+ - Remove any debugging `console.log` or `print` statements added during patching
27
+ - Extract reusable security logic into middleware or utility functions if it appears in more than one place
28
+ - Add a brief comment only if the security rationale is non-obvious (e.g., `// Scope query to owner to prevent IDOR`)
29
+
30
+ ### Step 4: Lock it in
31
+ - Ensure the exploit test in `__tests__/security/` has a clear, descriptive name
32
+ - Confirm the test file will be picked up by your CI security test job
33
+ - If applicable, add the CVE reference or ticket ID as a comment in the test
12
34
 
13
35
  ## Goal
14
- Maintain the speed and functionality of the rapid prototype while successfully hardening the perimeter. The ultimate goal is a fully passing test suite (security tests + functional tests).
36
+ A fully passing test suite (security tests + functional tests) with clean, reviewable code. The vulnerability is provably closed and provably non-regressive.
37
+
38
+ ---
39
+
40
+ ## When to revert and retry
41
+
42
+ Revert the patch (`git checkout -- <file>`) and return to Phase 2 if:
43
+ - A functional test fails after applying the security fix
44
+ - The fix introduces a new 401/403 for a legitimate user flow
45
+ - Performance degrades measurably under load (e.g., O(n) queries replacing O(1))
46
+
47
+ When you retry, describe the constraint to the AI: *"The previous fix broke X — find a narrower approach that still closes the vulnerability."*
@@ -0,0 +1,50 @@
1
+ // TDD Remediation: Red Phase Sample Test (Go)
2
+ //
3
+ // Replace the boilerplate below with the specific exploit you are trying to verify.
4
+ // This test MUST fail initially (Red Phase). Once you apply the security fix,
5
+ // this test MUST pass (Green Phase).
6
+ //
7
+ // Place this file in: security/exploit_test.go (or __tests__/security/)
8
+ // Run with: go test ./security/... -v
9
+
10
+ package security_test
11
+
12
+ import (
13
+ "net/http"
14
+ "net/http/httptest"
15
+ "strings"
16
+ "testing"
17
+
18
+ // Update with your module path:
19
+ // "github.com/your-org/your-app/server"
20
+ )
21
+
22
+ func TestShouldNotAllowExploitationOfVulnerability(t *testing.T) {
23
+ // 1. Arrange: set up your router/handler
24
+ // router := server.NewRouter()
25
+ // server := httptest.NewServer(router)
26
+ // defer server.Close()
27
+
28
+ // 2. Act: send the exploit payload
29
+ exploitPayload := `{"input": "exploit payload here"}`
30
+ req, err := http.NewRequest(
31
+ http.MethodPost,
32
+ "/api/vulnerable-endpoint",
33
+ strings.NewReader(exploitPayload),
34
+ )
35
+ if err != nil {
36
+ t.Fatal(err)
37
+ }
38
+ req.Header.Set("Content-Type", "application/json")
39
+ req.Header.Set("Authorization", "Bearer attacker-token-here")
40
+
41
+ rr := httptest.NewRecorder()
42
+ // router.ServeHTTP(rr, req)
43
+
44
+ // 3. Assert: the system MUST block the exploit (currently returns 200 — RED)
45
+ if rr.Code != http.StatusForbidden {
46
+ t.Errorf("expected 403 Forbidden, got %d — vulnerability not blocked (Red Phase)", rr.Code)
47
+ }
48
+
49
+ t.Skip("Replace this boilerplate with your specific exploit test, then remove this Skip")
50
+ }
@@ -0,0 +1,68 @@
1
+ """
2
+ TDD Remediation: Red Phase Sample Test (PyTest)
3
+
4
+ Replace the boilerplate below with the specific exploit you are trying to verify.
5
+ This test MUST fail initially (Red Phase). Once you apply the security fix,
6
+ this test MUST pass (Green Phase).
7
+
8
+ Usage with FastAPI:
9
+ from fastapi.testclient import TestClient
10
+ from app.main import app
11
+ client = TestClient(app)
12
+
13
+ Usage with Flask:
14
+ from app import create_app
15
+ client = create_app().test_client()
16
+ """
17
+
18
+ import pytest
19
+
20
+
21
+ # Update this fixture to match your app setup
22
+ @pytest.fixture
23
+ def client():
24
+     # FastAPI example:
25
+     # from fastapi.testclient import TestClient
26
+     # from app.main import app
27
+     # return TestClient(app)
28
+
29
+     # Flask example:
30
+     # from app import create_app
31
+     # app = create_app({"TESTING": True})
32
+     # return app.test_client()
33
+
34
+     raise NotImplementedError("Configure the client fixture for your framework")
35
+
36
+
37
+ @pytest.fixture
38
+ def attacker_token():
39
+     """Return a valid auth token for a different user (the attacker)."""
40
+     # Return a JWT or session token for user B when testing IDOR against user A
41
+     return "attacker-token-here"
42
+
43
+
44
+ class TestSecurityRedPhase:
45
+
46
+     def test_should_not_allow_exploitation_of_vulnerability(self, client, attacker_token):
47
+         """
48
+         SHOULD NOT allow unauthorized exploitation of [VULNERABILITY].
49
+         This test MUST FAIL before the patch is applied.
50
+         """
51
+         # 1. Arrange the exploit payload
52
+         exploit_payload = {
53
+             # "input": "' OR '1'='1", # SQL injection example
54
+             # "name": "<script>alert(1)</script>", # XSS example
55
+         }
56
+
57
+         # 2. Act: Execute the exploit against the system
58
+         response = client.post(
59
+             "/api/vulnerable-endpoint",
60
+             json=exploit_payload,
61
+             headers={"Authorization": f"Bearer {attacker_token}"},
62
+         )
63
+
64
+         # 3. Assert: The system MUST block the exploit gracefully (403, 400, or sanitized response)
65
+         assert response.status_code == 403 # currently returns 200 (RED)
66
+
67
+         # For XSS or SQLi, ensure the payload is not reflected:
68
+         # assert exploit_payload["input"] not in response.text
@@ -0,0 +1,35 @@
1
+ /**
2
+ * TDD Remediation: Red Phase Sample Test (Vitest)
3
+ *
4
+ * Replace the boilerplate below with the specific exploit you are trying to verify.
5
+ * This test MUST fail initially (Red Phase). Once you apply the security fix,
6
+ * this test MUST pass (Green Phase).
7
+ */
8
+
9
+ import { describe, it, expect } from 'vitest';
10
+ import supertest from 'supertest';
11
+ import app from '../../app'; // update with the path to your app
12
+
13
+ const request = supertest(app);
14
+
15
+ describe('Security Vulnerability Remediation - Red Phase', () => {
16
+
17
+ it('SHOULD NOT allow unauthorized exploitation of [VULNERABILITY]', async () => {
18
+ // 1. Arrange the exploit payload
19
+ const exploitPayload = {
20
+ // e.g. input: "1; DROP TABLE users"
21
+ };
22
+
23
+ // 2. Act: Execute the exploit against the system
24
+ const response = await request
25
+ .post('/api/vulnerable-endpoint')
26
+ .send(exploitPayload);
27
+
28
+ // 3. Assert: The system MUST block the exploit gracefully (e.g. 403, 400, sanitization)
29
+ expect(response.status).toBe(403);
30
+
31
+ // For XSS or SQLi, ensure the response body does NOT reflect the payload
32
+ // expect(response.body.data).not.toContain(exploitPayload.input);
33
+ });
34
+
35
+ });
@@ -0,0 +1,22 @@
1
+ name: Security Tests
2
+
3
+ on:
4
+   push:
5
+     branches: [main, master]
6
+   pull_request:
7
+     branches: [main, master]
8
+
9
+ jobs:
10
+   security-tests:
11
+     name: Exploit Test Suite
12
+     runs-on: ubuntu-latest
13
+
14
+     steps:
15
+       - uses: actions/checkout@v4
16
+
17
+       - uses: actions/setup-go@v5
18
+         with:
19
+           go-version: '1.22'
20
+
21
+       - name: Run security exploit tests
22
+         run: go test ./security/... -v
@@ -0,0 +1,26 @@
1
+ name: Security Tests
2
+
3
+ on:
4
+   push:
5
+     branches: [main, master]
6
+   pull_request:
7
+     branches: [main, master]
8
+
9
+ jobs:
10
+   security-tests:
11
+     name: Exploit Test Suite
12
+     runs-on: ubuntu-latest
13
+
14
+     steps:
15
+       - uses: actions/checkout@v4
16
+
17
+       - uses: actions/setup-node@v4
18
+         with:
19
+           node-version: '20'
20
+           cache: 'npm'
21
+
22
+       - name: Install dependencies
23
+         run: npm ci
24
+
25
+       - name: Run security exploit tests
26
+         run: npm run test:security
@@ -0,0 +1,25 @@
1
+ name: Security Tests
2
+
3
+ on:
4
+   push:
5
+     branches: [main, master]
6
+   pull_request:
7
+     branches: [main, master]
8
+
9
+ jobs:
10
+   security-tests:
11
+     name: Exploit Test Suite
12
+     runs-on: ubuntu-latest
13
+
14
+     steps:
15
+       - uses: actions/checkout@v4
16
+
17
+       - uses: actions/setup-python@v5
18
+         with:
19
+           python-version: '3.12'
20
+
21
+       - name: Install dependencies
22
+         run: pip install -r requirements.txt
23
+
24
+       - name: Run security exploit tests
25
+         run: pytest tests/security/ -v
@@ -1,6 +1,16 @@
1
1
  ---
2
2
  description: Run the complete TDD Remediation Autonomous Audit
3
3
  ---
4
- Please use the TDD Remediation Protocol Auto-Audit skill (`.agents/skills/tdd-remediation/SKILL.md`) to secure this repository.
4
+ Please use the TDD Remediation Protocol Auto-Audit skill (located in the `skills/tdd-remediation` folder) to secure this repository.
5
5
 
6
- Begin by exploring the structure to find any vulnerabilities or anti-patterns in the codebase. Then, for every issue you find, show me the list of vulnerabilities, and rigorously apply the Red-Green-Refactor loop to write the exploit tests, patch the flaws, and ensure no regressions occurred.
6
+ Follow the full Auto-Audit protocol from `auto-audit.md`:
7
+
8
+ 1. **Explore** the codebase using Glob, Grep, and Read. Focus on controllers, routes, middleware, and database layers. Search for the vulnerability patterns defined in Phase 0 of the auto-audit prompt.
9
+ 2. **Present** a structured Audit Report, grouped by severity (CRITICAL / HIGH / MEDIUM / LOW), and wait for my confirmation before making any changes.
10
+ 3. **Remediate** each confirmed vulnerability one at a time, top-down by severity, applying the full Red-Green-Refactor loop:
11
+ - Write the exploit test (Red — must fail)
12
+ - Apply the patch (Green — test must pass)
13
+ - Run the full suite (Refactor — no regressions)
14
+ 4. **Report** a final Remediation Summary table when all issues are addressed.
15
+
16
+ Do not skip steps. Do not advance to the next vulnerability until the current one is fully proven closed by a passing test.