@lhi/tdd-audit 1.0.0 → 1.1.1
- package/README.md +77 -16
- package/SKILL.md +5 -5
- package/index.js +244 -32
- package/package.json +31 -4
- package/prompts/auto-audit.md +116 -15
- package/prompts/green-phase.md +202 -4
- package/prompts/red-phase.md +89 -15
- package/prompts/refactor-phase.md +39 -6
- package/templates/sample.exploit.test.go +50 -0
- package/templates/sample.exploit.test.pytest.py +68 -0
- package/templates/sample.exploit.test.vitest.js +35 -0
- package/templates/workflows/security-tests.go.yml +22 -0
- package/templates/workflows/security-tests.node.yml +26 -0
- package/templates/workflows/security-tests.python.yml +25 -0
- package/workflows/tdd-audit.md +12 -2
package/README.md
CHANGED
@@ -1,48 +1,109 @@
 # @lhi/tdd-audit

-Anti-Gravity Skill for TDD Remediation.
+Anti-Gravity Skill for TDD Remediation. Patches security vulnerabilities by applying a Test-Driven Remediation (Red-Green-Refactor) protocol — you prove the hole exists, apply the fix, and prove it's closed.
+
+## What happens on install
+
+Running the installer does five things immediately:
+
+1. **Scans your codebase** for common vulnerability patterns (SQL injection, IDOR, XSS, command injection, path traversal, broken auth) and prints findings to stdout
+2. **Scaffolds `__tests__/security/`** with a framework-matched boilerplate exploit test
+3. **Adds `test:security`** to your `package.json` scripts (Node.js projects)
+4. **Creates `.github/workflows/security-tests.yml`** so the CI gate exists from day one
+5. **Installs the `/tdd-audit` workflow shortcode** for your agent

 ## Installation

-
+Install globally so the skill is available across all your projects:

 ```bash
 npx @lhi/tdd-audit
 ```

-Or run
+Or clone and run directly:

 ```bash
 node index.js
 ```

-###
+### Flags

-
+| Flag | Description |
+|---|---|
+| `--local` | Install skill files to the current project directory instead of `~` |
+| `--claude` | Use `.claude/` instead of `.agents/` as the skill directory |
+| `--with-hooks` | Install a pre-commit hook that blocks commits if security tests fail |
+| `--skip-scan` | Skip the automatic vulnerability scan on install |

+**Install to a Claude Code project with pre-commit protection:**
 ```bash
-npx @lhi/tdd-audit --local
-# or
-node index.js --local
+npx @lhi/tdd-audit --local --claude --with-hooks
 ```

-
+### Framework Detection

-
+The installer automatically detects your project's test framework and scaffolds the right boilerplate:
+
+| Detected | Boilerplate | `test:security` command |
+|---|---|---|
+| `jest` / `supertest` | `sample.exploit.test.js` | `jest --testPathPattern=__tests__/security` |
+| `vitest` | `sample.exploit.test.vitest.js` | `vitest run __tests__/security` |
+| `mocha` | `sample.exploit.test.js` | `mocha '__tests__/security/**/*.spec.js'` |
+| `pytest.ini` / `pyproject.toml` | `sample.exploit.test.pytest.py` | `pytest tests/security/ -v` |
+| `go.mod` | `sample.exploit.test.go` | `go test ./security/... -v` |

 ## Usage

-Once installed,
+Once installed, trigger the autonomous audit in your agent:

 ```text
 /tdd-audit
 ```

-
-1.
-2.
-3.
-
+The agent will:
+1. Scan the codebase and present a severity-ranked findings report (CRITICAL / HIGH / MEDIUM / LOW)
+2. Wait for your confirmation before making any changes
+3. For each confirmed vulnerability, apply the full Red-Green-Refactor loop:
+   - **Red** — write an exploit test that fails, proving the vulnerability exists
+   - **Green** — apply the targeted patch, making the test pass
+   - **Refactor** — run the full suite to confirm no regressions
+4. Deliver a final Remediation Summary table
+
+The agent works one vulnerability at a time and does not advance until the current one is fully proven closed.
+
+## Running security tests manually
+
+```bash
+# Node.js
+npm run test:security
+
+# Python
+pytest tests/security/ -v
+
+# Go
+go test ./security/... -v
+```
+
+## CI/CD
+
+The installer creates `.github/workflows/security-tests.yml` for your stack. It runs on every pull request targeting `main` — any exploit test that regresses will block the merge.
+
+To add this gate to an existing CI pipeline manually:
+
+```yaml
+- name: Run security exploit tests
+  run: npm run test:security # or pytest tests/security/, or go test ./security/...
+```
+
+## Pre-commit Hook
+
+The `--with-hooks` flag appends a security gate to `.git/hooks/pre-commit`. Commits are blocked if any exploit test fails:
+
+```
+❌ Security tests failed. Commit blocked.
+```
+
+The hook is non-destructive — it appends to any existing hook content rather than overwriting it.

 ## License

package/SKILL.md
CHANGED
@@ -9,13 +9,13 @@ Applying Test-Driven Development (TDD) to code that has already been generated r

 ## Autonomous Audit Mode
 If the user asks you to "Run the TDD Remediation Auto-Audit" or asks you to implement this on your own:
-1. **Explore**: Proactively use
-2. **Plan**:
-3. **Self-Implement**: For *each* vulnerability
+1. **Explore**: Proactively use `Glob`, `Grep`, and `Read` to scan the repository. Focus on `controllers/`, `routes/`, `api/`, `middleware/`, and database files. Search for anti-patterns: unparameterized SQL queries, missing ownership checks, unsafe HTML rendering, and command injection sinks. Full search patterns are in [auto-audit.md](./prompts/auto-audit.md).
+2. **Plan**: Present a structured list of vulnerabilities (grouped by severity: CRITICAL / HIGH / MEDIUM / LOW) and get confirmation before making any changes.
+3. **Self-Implement**: For *each* confirmed vulnerability, autonomously execute the complete 3-phase protocol:
    - **[Phase 1 (Red)](./prompts/red-phase.md)**: Write the exploit test ensuring it fails.
    - **[Phase 2 (Green)](./prompts/green-phase.md)**: Write the security patch ensuring the test passes.
-   - **[Phase 3 (Refactor)](./prompts/refactor-phase.md)**:
-Move methodically through
+   - **[Phase 3 (Refactor)](./prompts/refactor-phase.md)**: Run the full test suite and ensure no business logic broke.
+Move methodically through vulnerabilities one by one, CRITICAL-first. Do not advance until the current vulnerability is fully remediated.

 ---

package/index.js
CHANGED
@@ -4,56 +4,268 @@ const fs = require('fs');
 const path = require('path');
 const os = require('os');

-const
+const args = process.argv.slice(2);
+const isLocal = args.includes('--local');
+const isClaude = args.includes('--claude');
+const withHooks = args.includes('--with-hooks');
+const skipScan = args.includes('--skip-scan');
+
 const agentBaseDir = isLocal ? process.cwd() : os.homedir();
+const agentDirName = isClaude ? '.claude' : '.agents';
+const projectDir = process.cwd();
+
+const targetSkillDir = path.join(agentBaseDir, agentDirName, 'skills', 'tdd-remediation');
+const targetWorkflowDir = path.join(agentBaseDir, agentDirName, 'workflows');
+
+// ─── 1. Framework Detection ──────────────────────────────────────────────────
+
+function detectFramework() {
+  const pkgPath = path.join(projectDir, 'package.json');
+  if (fs.existsSync(pkgPath)) {
+    try {
+      const pkg = JSON.parse(fs.readFileSync(pkgPath, 'utf8'));
+      const deps = { ...(pkg.dependencies || {}), ...(pkg.devDependencies || {}) };
+      if (deps.vitest) return 'vitest';
+      if (deps.jest || deps.supertest) return 'jest';
+      if (deps.mocha) return 'mocha';
+    } catch {}
+  }
+  if (
+    fs.existsSync(path.join(projectDir, 'pytest.ini')) ||
+    fs.existsSync(path.join(projectDir, 'pyproject.toml')) ||
+    fs.existsSync(path.join(projectDir, 'setup.py')) ||
+    fs.existsSync(path.join(projectDir, 'requirements.txt'))
+  ) return 'pytest';
+  if (fs.existsSync(path.join(projectDir, 'go.mod'))) return 'go';
+  return 'jest';
+}
+
+const framework = detectFramework();
+
+// ─── 2. Test Directory Detection ─────────────────────────────────────────────
+
+function detectTestBaseDir() {
+  // Respect an existing convention before inventing one
+  const candidates = ['__tests__', 'tests', 'test', 'spec'];
+  for (const dir of candidates) {
+    if (fs.existsSync(path.join(projectDir, dir))) return dir;
+  }
+  // Framework-informed defaults when no directory exists yet
+  if (framework === 'pytest') return 'tests';
+  if (framework === 'go') return 'test';
+  return '__tests__';
+}

-const
-const
-const targetTestDir = path.join(process.cwd(), '__tests__', 'security');
+const testBaseDir = detectTestBaseDir();
+const targetTestDir = path.join(projectDir, testBaseDir, 'security');

-
+// ─── 3. Quick Scan ───────────────────────────────────────────────────────────

-
-
-
+const VULN_PATTERNS = [
+  { name: 'SQL Injection', severity: 'CRITICAL', pattern: /(`SELECT[^`]*\$\{|"SELECT[^"]*"\s*\+|execute\(f"|cursor\.execute\(.*%s|\.query\(`[^`]*\$\{)/i },
+  { name: 'Command Injection', severity: 'CRITICAL', pattern: /\bexec(Sync)?\s*\(.*req\.(params|body|query)|subprocess\.(run|Popen|call)\([^)]*shell\s*=\s*True/i },
+  { name: 'IDOR', severity: 'HIGH', pattern: /findById\s*\(\s*req\.(params|body|query)\.|findOne\s*\(\s*\{[^}]*id\s*:\s*req\.(params|body|query)/i },
+  { name: 'XSS', severity: 'HIGH', pattern: /[^/]innerHTML\s*=(?!=)|dangerouslySetInnerHTML\s*=\s*\{\{|document\.write\s*\(|res\.send\s*\(`[^`]*\$\{req\./i },
+  { name: 'Path Traversal', severity: 'HIGH', pattern: /(readFile|sendFile|createReadStream|open)\s*\(.*req\.(params|body|query)|path\.join\s*\([^)]*req\.(params|body|query)/i },
+  { name: 'Broken Auth', severity: 'HIGH', pattern: /jwt\.decode\s*\((?![^;]*\.verify)|verify\s*:\s*false|secret\s*=\s*['"][a-z0-9]{1,20}['"]/i },
+];
+
+const SCAN_EXTENSIONS = new Set(['.js', '.ts', '.jsx', '.tsx', '.mjs', '.py', '.go']);
+const SKIP_DIRS = new Set(['node_modules', '.git', 'dist', 'build', '.next', 'out', '__pycache__', 'venv', '.venv', 'vendor']);
+
+function* walkFiles(dir) {
+  let entries;
+  try { entries = fs.readdirSync(dir, { withFileTypes: true }); } catch { return; }
+  for (const entry of entries) {
+    if (SKIP_DIRS.has(entry.name)) continue;
+    const fullPath = path.join(dir, entry.name);
+    if (entry.isDirectory()) yield* walkFiles(fullPath);
+    else if (SCAN_EXTENSIONS.has(path.extname(entry.name))) yield fullPath;
+  }
 }

-
-const
-for (const
-
-
-
-
+function quickScan() {
+  const findings = [];
+  for (const filePath of walkFiles(projectDir)) {
+    let lines;
+    try { lines = fs.readFileSync(filePath, 'utf8').split('\n'); } catch { continue; }
+    for (let i = 0; i < lines.length; i++) {
+      for (const vuln of VULN_PATTERNS) {
+        if (vuln.pattern.test(lines[i])) {
+          findings.push({
+            severity: vuln.severity,
+            name: vuln.name,
+            file: path.relative(projectDir, filePath),
+            line: i + 1,
+            snippet: lines[i].trim().slice(0, 80),
+          });
+          break; // one finding per line
+        }
+      }
+    }
   }
+  return findings;
 }

-
+function printFindings(findings) {
+  if (findings.length === 0) {
+    console.log(' ✅ No obvious vulnerability patterns detected.\n');
+    return;
+  }
+  const bySeverity = { CRITICAL: [], HIGH: [], MEDIUM: [], LOW: [] };
+  for (const f of findings) (bySeverity[f.severity] || bySeverity.LOW).push(f);
+  const icons = { CRITICAL: '🔴', HIGH: '🟠', MEDIUM: '🟡', LOW: '🔵' };
+
+  console.log(`\n Found ${findings.length} potential issue(s):\n`);
+  for (const [sev, list] of Object.entries(bySeverity)) {
+    if (!list.length) continue;
+    for (const f of list) {
+      console.log(` ${icons[sev]} [${sev}] ${f.name} — ${f.file}:${f.line}`);
+      console.log(` ${f.snippet}`);
+    }
+  }
+  console.log('\n Run /tdd-audit in your agent to remediate.\n');
+}
+
+// ─── 4. Install Skill Files ───────────────────────────────────────────────────
+
+console.log(`\nInstalling TDD Remediation Skill (${isLocal ? 'local' : 'global'}, framework: ${framework}, test dir: ${testBaseDir}/)...\n`);
+
+if (!fs.existsSync(targetSkillDir)) fs.mkdirSync(targetSkillDir, { recursive: true });
+
+for (const item of ['SKILL.md', 'prompts', 'templates']) {
+  const src = path.join(__dirname, item);
+  const dest = path.join(targetSkillDir, item);
+  if (fs.existsSync(src)) fs.cpSync(src, dest, { recursive: true });
+}
+
+// ─── 5. Scaffold Security Test Boilerplate ────────────────────────────────────
+
 if (!fs.existsSync(targetTestDir)) {
   fs.mkdirSync(targetTestDir, { recursive: true });
-console.log(
+  console.log(`✅ Created ${path.relative(projectDir, targetTestDir)}/`);
 }

-const
-
+const testTemplateMap = {
+  jest: 'sample.exploit.test.js',
+  vitest: 'sample.exploit.test.vitest.js',
+  mocha: 'sample.exploit.test.js',
+  pytest: 'sample.exploit.test.pytest.py',
+  go: 'sample.exploit.test.go',
+};
+
+const testTemplateName = testTemplateMap[framework];
+const srcTest = path.join(__dirname, 'templates', testTemplateName);
+const destTest = path.join(targetTestDir, testTemplateName);
+
+if (!fs.existsSync(destTest) && fs.existsSync(srcTest)) {
+  fs.copyFileSync(srcTest, destTest);
+  console.log(`✅ Scaffolded ${path.relative(projectDir, destTest)}`);
+}

-
-
-
+// ─── 6. Install Workflow Shortcode ────────────────────────────────────────────
+
+if (!fs.existsSync(targetWorkflowDir)) fs.mkdirSync(targetWorkflowDir, { recursive: true });
+const srcWorkflow = path.join(__dirname, 'workflows', 'tdd-audit.md');
+const destWorkflow = path.join(targetWorkflowDir, 'tdd-audit.md');
+if (fs.existsSync(srcWorkflow)) {
+  fs.copyFileSync(srcWorkflow, destWorkflow);
+  console.log(`✅ Installed /tdd-audit workflow shortcode`);
+}
+
+// ─── 7. Inject test:security into package.json ────────────────────────────────
+
+const pkgPath = path.join(projectDir, 'package.json');
+if (framework !== 'pytest' && framework !== 'go' && fs.existsSync(pkgPath)) {
+  try {
+    const pkg = JSON.parse(fs.readFileSync(pkgPath, 'utf8'));
+    if (!pkg.scripts?.['test:security']) {
+      pkg.scripts = pkg.scripts || {};
+      const secDir = `${testBaseDir}/security`;
+      pkg.scripts['test:security'] = {
+        jest: `jest --testPathPattern=${secDir} --forceExit`,
+        vitest: `vitest run ${secDir}`,
+        mocha: `mocha '${secDir}/**/*.spec.js'`,
+      }[framework] || `jest --testPathPattern=${secDir} --forceExit`;
+      fs.writeFileSync(pkgPath, JSON.stringify(pkg, null, 2) + '\n');
+      console.log(`✅ Added "test:security" script to package.json`);
+    } else {
+      console.log(` "test:security" already in package.json — skipped`);
+    }
+  } catch (e) {
+    console.warn(` ⚠️ Could not update package.json: ${e.message}`);
+  }
 }

-//
-
-
+// ─── 8. Scaffold CI Workflow ─────────────────────────────────────────────────
+
+const ciWorkflowDir = path.join(projectDir, '.github', 'workflows');
+const ciWorkflowPath = path.join(ciWorkflowDir, 'security-tests.yml');
+
+if (!fs.existsSync(ciWorkflowPath)) {
+  const ciTemplateMap = {
+    jest: 'security-tests.node.yml',
+    vitest: 'security-tests.node.yml',
+    mocha: 'security-tests.node.yml',
+    pytest: 'security-tests.python.yml',
+    go: 'security-tests.go.yml',
+  };
+  const ciTemplatePath = path.join(__dirname, 'templates', 'workflows', ciTemplateMap[framework]);
+  if (fs.existsSync(ciTemplatePath)) {
+    fs.mkdirSync(ciWorkflowDir, { recursive: true });
+    fs.copyFileSync(ciTemplatePath, ciWorkflowPath);
+    console.log(`✅ Scaffolded .github/workflows/security-tests.yml`);
+  }
+} else {
+  console.log(` .github/workflows/security-tests.yml already exists — skipped`);
+}
+
+// ─── 9. Pre-commit Hook (opt-in) ─────────────────────────────────────────────
+
+if (withHooks) {
+  const gitDir = path.join(projectDir, '.git');
+  if (fs.existsSync(gitDir)) {
+    const hooksDir = path.join(gitDir, 'hooks');
+    if (!fs.existsSync(hooksDir)) fs.mkdirSync(hooksDir);
+    const hookPath = path.join(hooksDir, 'pre-commit');
+
+    const testCmd = {
+      pytest: 'pytest tests/security/ -q',
+      go: 'go test ./security/... -v',
+    }[framework] || 'npm run test:security --silent';
+
+    const injection = [
+      '# tdd-remediation: security gate',
+      testCmd,
+      'if [ $? -ne 0 ]; then',
+      ' printf "\\n\\033[0;31m❌ Security tests failed. Commit blocked.\\033[0m\\n"',
+      ' exit 1',
+      'fi',
+      '',
+    ].join('\n');
+
+    const existing = fs.existsSync(hookPath) ? fs.readFileSync(hookPath, 'utf8') : '#!/bin/sh\n';
+    if (existing.includes('tdd-remediation')) {
+      console.log(` Pre-commit hook already has security gate — skipped`);
+    } else {
+      const newContent = existing.trimEnd() + '\n\n' + injection;
+      fs.writeFileSync(hookPath, newContent);
+      fs.chmodSync(hookPath, '755');
+      console.log(`✅ Installed pre-commit hook (.git/hooks/pre-commit)`);
+    }
+  } else {
+    console.warn(` ⚠️ No .git directory found — skipping pre-commit hook`);
+  }
 }

-
-const targetWorkflowFile = path.join(targetWorkflowDir, 'tdd-audit.md');
+// ─── 10. Quick Scan ──────────────────────────────────────────────────────────

-if (
-
-
+if (!skipScan) {
+  process.stdout.write('\n🔍 Scanning for vulnerability patterns...');
+  const findings = quickScan();
+  process.stdout.write('\n');
+  printFindings(findings);
 }

-console.log(
-console.log('
+console.log(`\nSkill installed to ${path.relative(os.homedir(), targetSkillDir)}`);
+console.log('Run /tdd-audit in your agent to begin remediation.\n');
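Because `quickScan()` tests each line on its own against the regexes in `VULN_PATTERNS`, it may help to see how one pattern behaves on a few inputs. The sketch below copies the SQL Injection regex from the diff above and runs it against hypothetical sample lines made up for this example; it is an illustration of the matching behavior, not part of the package.

```javascript
// Pattern copied from the 'SQL Injection' entry in VULN_PATTERNS.
const SQLI = /(`SELECT[^`]*\$\{|"SELECT[^"]*"\s*\+|execute\(f"|cursor\.execute\(.*%s|\.query\(`[^`]*\$\{)/i;

// Hypothetical source lines. They are ordinary (non-template) strings,
// so `${id}` stays literal text instead of being interpolated.
const samples = {
  // Interpolated template literal: the case the pattern is aimed at.
  interpolated: 'db.query(`SELECT * FROM users WHERE id = ${id}`)',
  // Parameterized query in the node-postgres style.
  parameterizedJs: "db.query('SELECT * FROM users WHERE id = $1', [id])",
  // Parameterized query in the Python DB-API style, which uses %s placeholders.
  parameterizedPy: 'cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))',
  // One interpolated query split across two source lines.
  multiLineFirst: 'const q = `SELECT * FROM users',
  multiLineSecond: '  WHERE id = ${id}`;',
};

// The regex has no `g` flag, so repeated .test() calls carry no state.
const results = {};
for (const [label, line] of Object.entries(samples)) {
  results[label] = SQLI.test(line);
}

console.log(results);
```

With these inputs the interpolated query is flagged and the `$1` form is not. The `%s` sample is also flagged even though it passes its value separately, and neither half of the split query matches, so scan output is best read as a list of candidates to confirm by reading the code.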
package/package.json
CHANGED
@@ -1,13 +1,40 @@
 {
   "name": "@lhi/tdd-audit",
-  "version": "1.
-  "description": "Anti-Gravity Skill for TDD Remediation",
+  "version": "1.1.1",
+  "description": "Anti-Gravity Skill for TDD Remediation. Patches security vulnerabilities using a Red-Green-Refactor protocol with automated exploit tests.",
   "main": "index.js",
   "bin": {
-    "tdd-audit": "
+    "tdd-audit": "index.js"
   },
+  "files": [
+    "index.js",
+    "SKILL.md",
+    "prompts/",
+    "templates/",
+    "workflows/",
+    "README.md",
+    "LICENSE"
+  ],
   "scripts": {
-    "test": "
+    "test": "node index.js --local --skip-scan && echo 'Smoke test passed'",
+    "test:security": "jest --testPathPattern=__tests__/security --forceExit"
+  },
+  "keywords": [
+    "security",
+    "tdd",
+    "test-driven-development",
+    "vulnerability",
+    "remediation",
+    "exploit",
+    "red-green-refactor",
+    "owasp",
+    "audit",
+    "claude",
+    "ai-agent",
+    "skill"
+  ],
+  "engines": {
+    "node": ">=16.7.0"
   },
   "author": "Kyra Lee",
   "license": "MIT"
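The `test:security` script above is the jest form of the command the installer writes into consumer projects, with `__tests__/security` as the directory. In `index.js` the pytest and Go commands for the pre-commit hook use fixed paths (`tests/security/`, `./security/...`), and the mocha glob targets `*.spec.js` while the scaffolded file is named `sample.exploit.test.js`. One possible way to keep the command and the scaffolded directory in step is sketched below. `securityTestCommand` is a hypothetical helper, not a function in the package, and the `*.test.js` mocha glob and the derived pytest and Go paths are assumptions of this sketch.

```javascript
// Hypothetical helper: builds the security test command for a framework
// from the detected base directory instead of a fixed path.
function securityTestCommand(framework, testBaseDir) {
  const secDir = `${testBaseDir}/security`;
  const commands = {
    jest: `jest --testPathPattern=${secDir} --forceExit`,
    vitest: `vitest run ${secDir}`,
    // Assumption: match *.test.js so sample.exploit.test.js is picked up.
    mocha: `mocha '${secDir}/**/*.test.js'`,
    // Assumption: derive the pytest and Go paths from the same directory.
    pytest: `pytest ${secDir}/ -q`,
    go: `go test ./${secDir}/... -v`,
  };
  // Fall back to jest, which is also the installer's default framework.
  return commands[framework] || commands.jest;
}

console.log(securityTestCommand('jest', '__tests__'));
console.log(securityTestCommand('pytest', 'tests'));
console.log(securityTestCommand('go', 'test'));
```

For `jest` with `__tests__` this yields the same string as the `test:security` entry in this `package.json`; the other branches differ from the published code only where a fixed path is replaced by the detected one.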
package/prompts/auto-audit.md
CHANGED
@@ -1,19 +1,120 @@
 # TDD Remediation: Auto-Audit Mode

-When invoked in Auto-Audit mode,
+When invoked in Auto-Audit mode, proactively secure the user's entire repository without waiting for explicit files to be provided.

 ## Phase 0: Discovery
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
+### 0a. Explore the Architecture
+Use `Glob` and `Read` to understand the project structure. Focus on:
+- `controllers/`, `routes/`, `api/`, `handlers/` — request entry points
+- `services/`, `models/`, `db/`, `repositories/` — data access
+- `middleware/`, `utils/`, `helpers/`, `lib/` — shared utilities
+- Config files: `*.env`, `config.js`, `settings.py` — secrets and security settings
+
+### 0b. Search for Anti-Patterns
+Use `Grep` with the following patterns to surface candidates. Read the matched files to confirm before reporting.
+
+**SQL Injection**
+```
+`SELECT.*\$\{ # template literal SQL (JS/TS)
+"SELECT.*" \+ # string concatenation SQL (Java/Python/JS)
+execute\(f" # f-string SQL (Python)
+cursor\.execute\(.*% # %-formatted SQL (Python)
+raw\( # Django raw() queries
+\.query\(` # tagged template DB calls
+```
+
+**IDOR / Missing Ownership Checks**
+```
+findById\(req\. # lookup directly from request params without user scope
+params\.id # request param used in a DB lookup
+req\.body\.userId # trusting client-supplied user ID
+findOne\(\{.*id:.*req # DB findOne keyed only to request param
+```
+
+**XSS / Unsafe Rendering**
+```
+innerHTML\s*= # direct DOM write
+dangerouslySetInnerHTML # React unsafe HTML
+\.write\( # document.write
+res\.send\(.*req\. # reflecting request data directly into response
+render_template_string # Flask dynamic template with user input
+```
+
+**Command Injection**
+```
+exec\(.*req\. # exec with request data
+execSync\(.*req\. # execSync with request data
+shell=True # Python subprocess with shell=True
+child_process # review all child_process usages
+```
+
+**Path Traversal**
+```
+readFile.*req\. # file read from request param
+sendFile.*req\. # file send from request param
+join.*req\.params # path.join with user input
+open\(.*request\. # Python file open with request data
+```
+
+**Broken Authentication**
+```
+jwt\.decode\( # JWT decoded but not verified
+verify.*false # verification disabled
+secret.*=.*['"] # hardcoded secrets
+Bearer.*hardcoded # hardcoded tokens
+```
+
+**Missing Rate Limiting**
+```
+router\.(post|put|delete) # mutation routes (check for rate-limit middleware)
+app\.post\( # POST handlers (check for rate-limit middleware)
+```
+
+### 0c. Present Findings
+Before touching any code, output a structured **Audit Report** with this format:
+
+```
+## Audit Findings
+
+### CRITICAL
+- [ ] [SQLi] `src/routes/users.js:34` — raw template literal in SELECT query
+- [ ] [IDOR] `src/controllers/docs.js:87` — findById(req.params.id) with no ownership check
+
+### HIGH
+- [ ] [XSS] `src/api/comments.js:52` — req.body.content reflected via res.send()
+- [ ] [CmdInj] `src/utils/export.js:19` — exec() called with req.body.filename
+
+### MEDIUM
+- [ ] [PathTraversal] `src/routes/files.js:41` — path.join with req.params.name, no bounds check
+- [ ] [BrokenAuth] `src/middleware/auth.js:12` — JWT decoded without signature verification
+
+### LOW / INFORMATIONAL
+- [ ] [RateLimit] `src/routes/auth.js` — /login endpoint has no rate limiting
+```
+
+Ask the user to confirm the list before beginning remediation. If they say "fix all" or "proceed", work through them top-down (CRITICAL first).
+
+---
+
+## Phase 1–3: Remediation Engine
+
+For **each** confirmed vulnerability, rigorously apply the RED-GREEN-REFACTOR protocol in order:
+
+1. **[RED](./red-phase.md)**: Write the exploit test in the project's security test directory (e.g., `tests/security/`, `__tests__/security/`, `test/security/` — wherever the installer scaffolded the boilerplate) and run it to prove the vulnerability exists (test must fail).
+2. **[GREEN](./green-phase.md)**: Apply the targeted patch. Run the exploit test — it must now pass.
+3. **[REFACTOR](./refactor-phase.md)**: Run the full test suite. All tests must be green before moving on.
+
+**Do not move to the next vulnerability until the current one is fully remediated and all tests pass.**
+
+After all vulnerabilities are addressed, output a final **Remediation Summary**:
+
+```
+## Remediation Summary
+
+| Vulnerability | File | Status | Test File |
+|---|---|---|---|
+| SQLi | src/routes/users.js:34 | ✅ Fixed | __tests__/security/sqli-users.test.js |
+| IDOR | src/controllers/docs.js:87 | ✅ Fixed | __tests__/security/idor-docs.test.js |
+| XSS | src/api/comments.js:52 | ✅ Fixed | __tests__/security/xss-comments.test.js |
+```
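The Audit Report layout in section 0c is meant to be written by the agent, but its structure can also be sketched as a small formatter. The function `renderAuditReport` and the finding fields `tag`, `file`, `line`, and `note` are hypothetical names chosen for this example, and the sketch simplifies the last heading to `LOW` rather than `LOW / INFORMATIONAL`.

```javascript
// Severity headings in the order the prompt asks for (CRITICAL first).
const SEVERITY_ORDER = ['CRITICAL', 'HIGH', 'MEDIUM', 'LOW'];

// Same dash separator the report format uses, written as an escape.
const DASH = '\u2014';

// Hypothetical formatter: groups findings by severity and renders
// one unchecked checklist item per finding.
function renderAuditReport(findings) {
  const lines = ['## Audit Findings'];
  for (const severity of SEVERITY_ORDER) {
    const group = findings.filter((f) => f.severity === severity);
    if (group.length === 0) continue; // omit empty severity sections
    lines.push('', `### ${severity}`);
    for (const f of group) {
      // The line number is optional, as in the RateLimit example above.
      const location = f.line ? `${f.file}:${f.line}` : f.file;
      lines.push(`- [ ] [${f.tag}] \`${location}\` ${DASH} ${f.note}`);
    }
  }
  return lines.join('\n');
}

// Usage with two made-up findings, deliberately given out of order.
const report = renderAuditReport([
  { severity: 'HIGH', tag: 'XSS', file: 'src/api/comments.js', line: 52, note: 'req.body.content reflected via res.send()' },
  { severity: 'CRITICAL', tag: 'SQLi', file: 'src/routes/users.js', line: 34, note: 'raw template literal in SELECT query' },
]);
console.log(report);
```

Iterating over a fixed severity list, rather than over the findings in discovery order, is what keeps the top-down (CRITICAL first) ordering regardless of how the scan surfaced them.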
package/prompts/green-phase.md
CHANGED
|
@@ -1,12 +1,210 @@
|
|
|
1
1
|
# TDD Remediation: The Patch (Green Phase)
|
|
2
2
|
|
|
3
|
-
Once the failing test is committed
|
|
3
|
+
Once the failing exploit test is committed, write the minimum code required to make it pass. Do not over-engineer — a targeted fix is safer than a rewrite.
|
|
4
4
|
|
|
5
5
|
## Action
|
|
6
|
-
Apply
|
|
6
|
+
Apply a security patch to the relevant routes, middleware, database layer, or sanitization utilities. Run the test suite. The exploit test from Phase 1 (Red) must now pass.
|
|
7
7
|
|
|
8
8
|
## Protocol
|
|
9
|
-
|
|
9
|
+
1. Identify the **root cause** — not just the symptom. A 500 error is not a security fix.
|
|
10
|
+
2. Apply the narrowest patch that closes the vulnerability.
|
|
11
|
+
3. Run the full test suite. The exploit test must pass AND all pre-existing tests must remain green.
|
|
12
|
+
4. If the test still fails, your patch is incomplete — do not move on.
|
|
10
13
|
|
|
11
14
|
## Goal
|
|
12
|
-
Prove definitively that the specific vulnerability is
|
|
15
|
+
Prove definitively that the specific vulnerability is closed without relying on manual testing, guessing, or superficial UI changes.
|
|
16
|
+
|
|
17
|
+
---
|
|
18
|
+
|
|
19
|
+
## Vulnerability-Specific Patch Strategies
|
|
20
|
+
|
|
21
|
+
### IDOR (Insecure Direct Object Reference) / Tenant Isolation
|
|
22
|
+
|
|
23
|
+
**Root cause:** Resource lookups that use a user-supplied ID without verifying ownership.
|
|
24
|
+
|
|
25
|
+
**Fix:** Scope every database query to the authenticated user's ID or tenant ID. Never trust the client.
|
|
26
|
+
|
|
27
|
+
```javascript
|
|
28
|
+
// BEFORE (vulnerable)
|
|
29
|
+
const record = await db.records.findById(req.params.id);
|
|
30
|
+
|
|
31
|
+
// AFTER (patched)
|
|
32
|
+
const record = await db.records.findOne({
|
|
33
|
+
id: req.params.id,
|
|
34
|
+
userId: req.user.id, // enforce ownership at query level
|
|
35
|
+
});
|
|
36
|
+
if (!record) return res.status(403).json({ error: 'Forbidden' });
|
|
37
|
+
```
|
|
38
|
+
|
|
39
|
+
```python
|
|
40
|
+
# BEFORE (vulnerable)
|
|
41
|
+
record = db.query(Record).filter(Record.id == record_id).first()
|
|
42
|
+
|
|
43
|
+
# AFTER (patched)
|
|
44
|
+
record = db.query(Record).filter(
|
|
45
|
+
Record.id == record_id,
|
|
46
|
+
Record.user_id == current_user.id # enforce ownership
|
|
47
|
+
).first()
|
|
48
|
+
if not record:
|
|
49
|
+
raise HTTPException(status_code=403, detail="Forbidden")
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
**Libraries:** Built-in ORM scoping; no extra library needed.
|
|
53
|
+
|
|
54
|
+
---
|
|
55
|
+
|
|
56
|
+
### XSS (Cross-Site Scripting)
|
|
57
|
+
|
|
58
|
+
**Root cause:** User input is reflected into HTML, JS, or DOM without encoding or sanitization.
|
|
59
|
+
|
|
60
|
+
**Fix options (choose the appropriate layer):**
|
|
61
|
+
- **Storage:** Sanitize on write using a safe library.
|
|
62
|
+
- **Rendering:** Escape on output; never use `innerHTML` with user data.
|
|
63
|
+
- **API responses:** Set `Content-Type: application/json` strictly; never reflect raw input.
|
|
64
|
+
|
|
65
|
+
```javascript
|
|
66
|
+
// BEFORE (vulnerable — Express)
|
|
67
|
+
res.send(`<p>Hello ${req.query.name}</p>`);
|
|
68
|
+
|
|
69
|
+
// AFTER — Option A: escape on output
|
|
70
|
+
const escapeHtml = require('escape-html');
|
|
71
|
+
res.send(`<p>Hello ${escapeHtml(req.query.name)}</p>`);
|
|
72
|
+
|
|
73
|
+
// AFTER — Option B: sanitize rich HTML (for WYSIWYG content)
|
|
74
|
+
const DOMPurify = require('isomorphic-dompurify');
|
|
75
|
+
const clean = DOMPurify.sanitize(req.body.content, { ALLOWED_TAGS: ['b', 'i', 'em'] });
|
|
76
|
+
res.json({ content: clean });
|
|
77
|
+
```
|
|
78
|
+
|
|
79
|
+
```python
|
|
80
|
+
# BEFORE (vulnerable: user input is interpolated into the template source itself, so Jinja2 autoescaping never sees it)
|
|
81
|
+
return render_template_string(f"<p>{user_input}</p>")
|
|
82
|
+
|
|
83
|
+
# AFTER: escape the value explicitly with markupsafe before building the HTML
|
|
84
|
+
from markupsafe import escape
|
|
85
|
+
return f"<p>{escape(user_input)}</p>"
|
|
86
|
+
|
|
87
|
+
# For sanitizing rich HTML
|
|
88
|
+
import bleach
|
|
89
|
+
clean = bleach.clean(user_input, tags=['b', 'i', 'em'], strip=True)
|
|
90
|
+
```
|
|
91
|
+
|
|
92
|
+
**Libraries:** `escape-html`, `isomorphic-dompurify` (Node); `markupsafe`, `bleach` (Python).
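As a dependency-free illustration of the "escape on output" option, the sketch below uses `html.escape` from the Python standard library in place of the libraries listed above. The `render_greeting` helper is a hypothetical name introduced only for this example.

```python
import html

payload = "<script>alert(1)</script>"


def render_greeting(name):
    """Build an HTML fragment with the untrusted value escaped at output time."""
    # html.escape replaces &, <, > and (by default) quote characters with entities.
    return f"<p>Hello {html.escape(name)}</p>"


rendered = render_greeting(payload)
```

The payload still appears in the page, but only as inert text (`&lt;script&gt;...`), which is the condition the Red-phase XSS test is meant to assert.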
|
|
93
|
+
|
|
94
|
+
---
|
|
95
|
+
|
|
96
|
+
### SQL Injection
|
|
97
|
+
|
|
98
|
+
**Root cause:** User input is concatenated directly into a SQL query string.
|
|
99
|
+
|
|
100
|
+
**Fix:** Use parameterized queries or ORM methods exclusively. Never build SQL strings from user input.
|
|
101
|
+
|
|
102
|
+
```javascript
|
|
103
|
+
// BEFORE (vulnerable)
|
|
104
|
+
const result = await db.query(`SELECT * FROM users WHERE email = '${email}'`);
|
|
105
|
+
|
|
106
|
+
// AFTER — parameterized (node-postgres / pg)
|
|
107
|
+
const result = await db.query('SELECT * FROM users WHERE email = $1', [email]);
|
|
108
|
+
|
|
109
|
+
// AFTER: ORM (Sequelize syntax shown; Prisma's query API also binds values as parameters)
|
|
110
|
+
const user = await User.findOne({ where: { email } }); // safe by default
|
|
111
|
+
```
|
|
112
|
+
|
|
113
|
+
```python
|
|
114
|
+
# BEFORE (vulnerable)
|
|
115
|
+
cursor.execute(f"SELECT * FROM users WHERE email = '{email}'")
|
|
116
|
+
|
|
117
|
+
# AFTER: parameterized (%s placeholders as in psycopg2; sqlite3 uses ? instead)
|
|
118
|
+
cursor.execute("SELECT * FROM users WHERE email = %s", (email,))
|
|
119
|
+
|
|
120
|
+
# AFTER — ORM (SQLAlchemy)
|
|
121
|
+
user = db.query(User).filter(User.email == email).first()
|
|
122
|
+
```
|
|
123
|
+
|
|
124
|
+
**Libraries:** Use your existing ORM. Never use raw string interpolation for queries.
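The difference between the two query styles can be observed directly with an in-memory `sqlite3` database. This is a sketch under stated assumptions: the `users` table and the sample addresses are made up for the demonstration, and `sqlite3` uses `?` placeholders rather than `%s`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("a@example.com",), ("b@example.com",)],
)

payload = "' OR '1'='1"

# Vulnerable: the payload becomes part of the SQL text and rewrites the WHERE clause.
vulnerable_rows = conn.execute(
    f"SELECT email FROM users WHERE email = '{payload}'"
).fetchall()

# Patched: the payload is bound as a value and compared literally.
safe_rows = conn.execute(
    "SELECT email FROM users WHERE email = ?", (payload,)
).fetchall()
```

The concatenated query returns every row, while the parameterized one returns none because no user has that literal string as an email address.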
|
|
125
|
+
|
|
126
|
+
---
|
|
127
|
+
|
|
128
|
+
### Command Injection
|
|
129
|
+
|
|
130
|
+
**Root cause:** User input is passed to `exec`, `spawn`, `subprocess.run(shell=True)`, or similar without validation.
|
|
131
|
+
|
|
132
|
+
**Fix:** Use argument arrays (never shell strings), allowlists, or eliminate the shell call entirely.
|
|
133
|
+
|
|
134
|
+
```javascript
|
|
135
|
+
// BEFORE (vulnerable)
|
|
136
|
+
const { exec } = require('child_process');
|
|
137
|
+
exec(`convert ${req.body.filename} output.png`); // shell injection possible
|
|
138
|
+
|
|
139
|
+
// AFTER — use execFile/spawn with argument array (no shell)
|
|
140
|
+
const { execFile } = require('child_process');
const path = require('path');
|
|
141
|
+
const safeName = path.basename(req.body.filename); // strip path traversal too
|
|
142
|
+
execFile('convert', [safeName, 'output.png']); // no shell expansion
|
|
143
|
+
```
|
|
144
|
+
|
|
145
|
+
```python
|
|
146
|
+
# BEFORE (vulnerable)
|
|
147
|
+
subprocess.run(f"ffmpeg -i {filename} output.mp4", shell=True)
|
|
148
|
+
|
|
149
|
+
# AFTER — argument list, no shell
|
|
150
|
+
import subprocess, os
|
|
151
|
+
safe_name = os.path.basename(filename)
|
|
152
|
+
subprocess.run(["ffmpeg", "-i", safe_name, "output.mp4"]) # shell=False by default
|
|
153
|
+
```
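To see why an argument list is safe, the sketch below passes a payload containing shell metacharacters to a child process without a shell. A second Python interpreter stands in for the external tool (an assumption made so the example runs anywhere); it simply echoes the first argument it received.

```python
import subprocess
import sys

payload = "report.pdf; echo INJECTED"

# No shell is involved, so ';' is not a command separator here.
# The child receives the whole payload as one literal argument.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", payload],
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout.strip()
```

The child sees the full string, semicolon included, as a single file name. Validation such as `os.path.basename` or an allowlist is still worth applying on top, since a literal argument can still name a file the caller should not touch.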
|
|
154
|
+
|
|
155
|
+
---
|
|
156
|
+
|
|
157
|
+
### Path Traversal
|
|
158
|
+
|
|
159
|
+
**Root cause:** User-supplied file paths are used to read/write files without normalization or bounds checking.
|
|
160
|
+
|
|
161
|
+
**Fix:** Normalize the path and assert it stays within the allowed directory.
|
|
162
|
+
|
|
163
|
+
```javascript
|
|
164
|
+
// BEFORE (vulnerable)
|
|
165
|
+
const filePath = path.join(__dirname, 'uploads', req.params.filename);
|
|
166
|
+
res.sendFile(filePath); // '../../../etc/passwd' bypass possible
|
|
167
|
+
|
|
168
|
+
// AFTER
|
|
169
|
+
const UPLOADS_DIR = path.resolve(__dirname, 'uploads');
|
|
170
|
+
const requested = path.resolve(UPLOADS_DIR, req.params.filename);
|
|
171
|
+
if (!requested.startsWith(UPLOADS_DIR + path.sep)) {
|
|
172
|
+
return res.status(400).json({ error: 'Invalid path' });
|
|
173
|
+
}
|
|
174
|
+
res.sendFile(requested);
|
|
175
|
+
```
|
|
176
|
+
|
|
177
|
+
```python
|
|
178
|
+
# AFTER (Python)
|
|
179
|
+
import os
|
|
180
|
+
UPLOADS_DIR = os.path.realpath("uploads")
|
|
181
|
+
requested = os.path.realpath(os.path.join(UPLOADS_DIR, filename))
|
|
182
|
+
if not requested.startswith(UPLOADS_DIR + os.sep):
|
|
183
|
+
    raise HTTPException(status_code=400, detail="Invalid path")
|
|
184
|
+
```
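The normalize-then-check step can be exercised outside a web framework. The sketch below is one possible variant that compares with `os.path.commonpath` instead of a string prefix; `resolve_inside` is a hypothetical helper name, and the example assumes a POSIX-style filesystem.

```python
import os
import tempfile


def resolve_inside(base_dir, user_path):
    """Return the resolved path for user_path, or None if it lands outside base_dir."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, user_path))
    # commonpath compares whole path components, so "/uploads-evil" cannot
    # pass as being inside "/uploads".
    if os.path.commonpath([base, candidate]) != base:
        return None
    return candidate


with tempfile.TemporaryDirectory() as uploads:
    allowed = resolve_inside(uploads, "report.txt")
    escaped = resolve_inside(uploads, "../../../etc/passwd")
    absolute = resolve_inside(uploads, "/etc/passwd")
    allowed_ok = allowed == os.path.join(os.path.realpath(uploads), "report.txt")
```

Both the `../` sequence and the absolute path resolve outside the uploads directory and are rejected, which is the behavior the Red-phase path traversal test expects to see as a 400.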
|
|
185
|
+
|
|
186
|
+
---
|
|
187
|
+
|
|
188
|
+
### Broken Authentication / Missing Authorization Middleware
|
|
189
|
+
|
|
190
|
+
**Root cause:** Routes lack authentication checks, or JWTs/sessions are not validated on sensitive endpoints.
|
|
191
|
+
|
|
192
|
+
**Fix:** Apply authentication middleware globally and opt routes out explicitly, rather than opting in per route.
|
|
193
|
+
|
|
194
|
+
```javascript
|
|
195
|
+
// AFTER (Express): register public routes first, then apply auth to everything below
|
|
196
|
+
app.get('/health', (req, res) => res.send('ok')); // public
|
|
197
|
+
app.use(requireAuth); // all routes below are protected
|
|
198
|
+
|
|
199
|
+
// Middleware
|
|
200
|
+
function requireAuth(req, res, next) {
|
|
201
|
+
const token = req.headers.authorization?.split(' ')[1];
|
|
202
|
+
if (!token) return res.status(401).json({ error: 'Unauthorized' });
|
|
203
|
+
try {
|
|
204
|
+
req.user = jwt.verify(token, process.env.JWT_SECRET);
|
|
205
|
+
next();
|
|
206
|
+
} catch {
|
|
207
|
+
return res.status(401).json({ error: 'Invalid token' });
|
|
208
|
+
}
|
|
209
|
+
}
|
|
210
|
+
```
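The "protected by default, public by explicit opt-out" idea is independent of Express. The sketch below models it as a plain function: every path requires a valid token unless it is listed in `PUBLIC_PATHS`. The HMAC-signed token is a toy stand-in for a JWT, and `SECRET`, `sign`, and `authorize` are hypothetical names used only for illustration.

```python
import hashlib
import hmac

SECRET = b"demo-secret-for-illustration-only"  # assumption: real code loads this from the environment
PUBLIC_PATHS = {"/health"}  # the explicit opt-out list


def sign(user_id):
    """Issue a toy token of the form '<user_id>.<hex HMAC>' (stand-in for a JWT)."""
    mac = hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()
    return f"{user_id}.{mac}"


def authorize(path, headers):
    """Return (status, user_id); every path is protected unless it is in PUBLIC_PATHS."""
    if path in PUBLIC_PATHS:
        return 200, None
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme != "Bearer" or not token:
        return 401, None
    user_id, _, mac = token.rpartition(".")
    expected = hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many characters matched via timing.
    if not user_id or not hmac.compare_digest(mac.encode(), expected.encode()):
        return 401, None
    return 200, user_id


no_header = authorize("/api/admin/users", {})
forged = authorize("/api/admin/users", {"Authorization": "Bearer alice.deadbeef"})
valid = authorize("/api/admin/users", {"Authorization": f"Bearer {sign('alice')}"})
public = authorize("/health", {})
```

A newly added route is covered automatically, because it would have to be added to `PUBLIC_PATHS` on purpose to become reachable without a token.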
|
package/prompts/red-phase.md
CHANGED
|
@@ -17,32 +17,106 @@ Establish a measurable baseline. You now have a weaponized test case.
|
|
|
17
17
|
## Vulnerability-Specific Strategies
|
|
18
18
|
|
|
19
19
|
### IDOR (Insecure Direct Object Reference) / Tenant Isolation
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
|
|
20
|
+
Authenticate as User B and request a resource that belongs to User A using its ID directly.
|
|
21
|
+
Assert a 403 Forbidden or 404 Not Found — not a 200 returning someone else's data.
|
|
22
|
+
```javascript
|
|
23
|
+
// Jest/Supertest
|
|
24
|
+
const res = await request(app)
|
|
25
|
+
.get(`/api/documents/${userA_doc_id}`)
|
|
26
|
+
.set('Authorization', `Bearer ${userB_token}`);
|
|
27
|
+
expect(res.status).toBe(403); // currently returns 200 with userA's data — RED
|
|
28
|
+
```
|
|
29
|
+
```python
|
|
30
|
+
# PyTest
|
|
31
|
+
def test_idor_exploit(client, user_b_token, user_a_resource_id):
|
|
32
|
+
    res = client.get(f'/api/documents/{user_a_resource_id}',
|
|
33
|
+
                     headers={'Authorization': f'Bearer {user_b_token}'})
|
|
34
|
+
    assert res.status_code == 403  # currently 200 (RED)
|
|
35
|
+
```
|
|
23
36
|
|
|
24
37
|
### XSS (Cross-Site Scripting)
|
|
25
|
-
Submit
|
|
26
|
-
|
|
27
|
-
|
|
38
|
+
Submit `<script>alert(1)</script>` or `<img src=x onerror=alert(1)>` as user input.
|
|
39
|
+
Assert the raw response body either HTML-escapes the payload or rejects the input entirely.
|
|
40
|
+
```javascript
|
|
41
|
+
const payload = '<script>alert(1)</script>';
|
|
42
|
+
const res = await request(app).post('/api/comments').send({ body: payload });
|
|
43
|
+
// Should be escaped in the response — currently reflected raw — RED
|
|
44
|
+
expect(res.body.comment.body).not.toContain('<script>');
|
|
45
|
+
expect(res.body.comment.body).toContain('&lt;script&gt;');
|
|
46
|
+
```
|
|
28
47
|
|
|
29
48
|
### SQL Injection
|
|
30
|
-
Submit payloads
|
|
31
|
-
|
|
49
|
+
Submit tautology payloads (`' OR '1'='1`) or union-based extraction attempts.
|
|
50
|
+
Assert a 400 Bad Request or that the response does not return all records.
|
|
51
|
+
```javascript
|
|
52
|
+
const res = await request(app)
|
|
53
|
+
.get('/api/users')
|
|
54
|
+
.query({ email: "' OR '1'='1" });
|
|
55
|
+
expect(res.status).toBe(400); // currently 200 with all user records — RED
|
|
56
|
+
expect(res.body.users).toBeUndefined();
|
|
57
|
+
```
|
|
58
|
+
```python
|
|
59
|
+
def test_sql_injection(client):
|
|
60
|
+
    res = client.get('/api/users', params={'email': "' OR '1'='1"})
|
|
61
|
+
    assert res.status_code == 400  # currently 200 returning all users (RED)
|
|
62
|
+
```
|
|
63
|
+
|
|
64
|
+
### Command Injection
|
|
65
|
+
Submit shell metacharacters in input that gets passed to a shell command.
|
|
66
|
+
Assert the dangerous characters are rejected (400) — not executed.
|
|
67
|
+
```javascript
|
|
68
|
+
const res = await request(app)
|
|
69
|
+
.post('/api/export')
|
|
70
|
+
.send({ filename: 'report.pdf; rm -rf /tmp/test' });
|
|
71
|
+
expect(res.status).toBe(400); // currently executes the command — RED
|
|
72
|
+
```
|
|
73
|
+
|
|
74
|
+
### Path Traversal
|
|
75
|
+
Submit a `../` sequence in a file path parameter.
|
|
76
|
+
Assert a 400 Bad Request or that the server does not serve files outside the uploads directory.
|
|
77
|
+
```javascript
|
|
78
|
+
const res = await request(app)
|
|
79
|
+
.get('/api/files/download')
|
|
80
|
+
.query({ name: '../../../etc/passwd' });
|
|
81
|
+
expect(res.status).toBe(400); // currently returns file contents — RED
|
|
82
|
+
```
|
|
83
|
+
|
|
84
|
+
### Broken Authentication (Unprotected Route)
|
|
85
|
+
Call a protected endpoint with no Authorization header.
|
|
86
|
+
Assert a 401 Unauthorized — not a 200 with data.
|
|
87
|
+
```javascript
|
|
88
|
+
const res = await request(app).get('/api/admin/users'); // no auth header
|
|
89
|
+
expect(res.status).toBe(401); // currently returns 200 — RED
|
|
90
|
+
```
|
|
32
91
|
|
|
33
92
|
---
|
|
34
93
|
|
|
35
|
-
## Framework Templates
|
|
94
|
+
## Framework Templates
|
|
36
95
|
|
|
37
96
|
### Jest / Supertest (Node.js)
|
|
38
97
|
```javascript
|
|
39
|
-
const
|
|
40
|
-
|
|
98
|
+
const request = require('supertest');
|
|
99
|
+
const app = require('../../app');
|
|
100
|
+
|
|
101
|
+
describe('[VulnType] - Red Phase', () => {
|
|
102
|
+
it('SHOULD block [exploit description]', async () => {
|
|
103
|
+
const res = await request(app)
|
|
104
|
+
.post('/api/vulnerable-endpoint')
|
|
105
|
+
.send({ input: '<exploit payload>' });
|
|
106
|
+
|
|
107
|
+
expect(res.status).toBe(403); // currently 200 — this test MUST fail (Red)
|
|
108
|
+
expect(res.body.data).not.toContain('<exploit payload>');
|
|
109
|
+
});
|
|
110
|
+
});
|
|
41
111
|
```
|
|
42
112
|
|
|
43
|
-
### PyTest (Python)
|
|
113
|
+
### PyTest (Python / FastAPI / Flask)
|
|
44
114
|
```python
|
|
45
|
-
def
|
|
46
|
-
response = client.
|
|
47
|
-
|
|
115
|
+
def test_vuln_type_exploit(client, attacker_token):
|
|
116
|
+
    response = client.post(
|
|
117
|
+
        '/api/vulnerable-endpoint',
|
|
118
|
+
        json={'input': '<exploit payload>'},
|
|
119
|
+
        headers={'Authorization': f'Bearer {attacker_token}'}
|
|
120
|
+
    )
|
|
121
|
+
    assert response.status_code == 403  # currently 200 (RED)
|
|
48
122
|
```
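The reason a Red-phase test must fail first can be shown without any web framework. In the sketch below, two in-memory handlers stand in for an endpoint before and after the patch; `DOCUMENTS`, both handler functions, and `exploit_blocked` are hypothetical names used only for this illustration.

```python
# Hypothetical in-memory data store: document 1 belongs to alice.
DOCUMENTS = {1: {"owner": "alice", "body": "private"}}


def get_document_vulnerable(doc_id, user):
    """Looks up by ID only, so any authenticated user gets a 200."""
    doc = DOCUMENTS.get(doc_id)
    return (200, doc["body"]) if doc else (404, None)


def get_document_patched(doc_id, user):
    """Same lookup, but a missing document or mismatched owner yields 403."""
    doc = DOCUMENTS.get(doc_id)
    if doc is None or doc["owner"] != user:
        return 403, None
    return 200, doc["body"]


def exploit_blocked(handler):
    """The Red-phase assertion: user 'bob' must not read alice's document."""
    status, _ = handler(1, "bob")
    return status == 403


red = exploit_blocked(get_document_vulnerable)   # False before the patch
green = exploit_blocked(get_document_patched)    # True after the patch
```

The same assertion is used in both phases. Only the code under test changes, which is what lets a passing result count as proof that the hole is closed.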
|
|
package/prompts/refactor-phase.md
CHANGED
|
@@ -1,14 +1,47 @@
|
|
|
1
1
|
# TDD Remediation: Regression & Refactor (Refactor Phase)
|
|
2
2
|
|
|
3
|
-
Security fixes can
|
|
3
|
+
Security fixes can be heavy-handed and break legitimate functionality. The perimeter is now secure — confirm nothing else broke, then clean up.
|
|
4
4
|
|
|
5
5
|
## Action
|
|
6
|
-
Run
|
|
6
|
+
Run the **full** test suite: security tests + all pre-existing functional/integration tests.
|
|
7
7
|
|
|
8
8
|
## Protocol
|
|
9
|
-
|
|
10
|
-
|
|
11
|
-
|
|
9
|
+
|
|
10
|
+
### Step 1: Verify the Green baseline
|
|
11
|
+
```bash
|
|
12
|
+
npm test # or pytest, go test ./..., etc.
|
|
13
|
+
```
|
|
14
|
+
All tests must be green. If any pre-existing functional test now fails, **stop and revert the security patch.** A security fix that breaks functionality is a failed fix — return to Phase 2 with a narrower approach.
|
|
15
|
+
|
|
16
|
+
### Step 2: Check for regressions by category
|
|
17
|
+
Go through this checklist before closing the vulnerability:
|
|
18
|
+
|
|
19
|
+
- [ ] **Happy-path flows still work** — legitimate users can still access their own resources
|
|
20
|
+
- [ ] **Error messages are safe** — no stack traces, internal paths, or sensitive data leaked in error responses
|
|
21
|
+
- [ ] **Auth bypass not introduced** — the fix doesn't create a new unprotected code path
|
|
22
|
+
- [ ] **Performance acceptable** — the patch doesn't add unbounded DB queries or blocking I/O
|
|
23
|
+
- [ ] **No secrets in code** — patch doesn't hardcode keys, tokens, or credentials
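The "error messages are safe" item can be checked mechanically. A minimal sketch, assuming a hypothetical `to_client_error` helper that logs the full exception server-side and returns only a generic body to the client:

```python
import logging

logger = logging.getLogger("security")


def to_client_error(exc, status=500):
    """Log full details server-side; hand the client only a generic message."""
    logger.error("request failed", exc_info=exc)
    generic = {
        400: "Bad Request",
        403: "Forbidden",
        404: "Not Found",
    }.get(status, "Internal Server Error")
    return status, {"error": generic}


try:
    # Hypothetical failing operation whose message would expose an internal path.
    open("/nonexistent/internal/path/config.yml")
except OSError as exc:
    status, body = to_client_error(exc)
```

A regression test can then assert that the response body contains none of the internal path, exception class name, or traceback text.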
|
|
24
|
+
|
|
25
|
+
### Step 3: Clean the patch
|
|
26
|
+
- Remove any debugging `console.log` or `print` statements added during patching
|
|
27
|
+
- Extract reusable security logic into middleware or utility functions if it appears in more than one place
|
|
28
|
+
- Add a brief comment only if the security rationale is non-obvious (e.g., `// Scope query to owner to prevent IDOR`)
|
|
29
|
+
|
|
30
|
+
### Step 4: Lock it in
|
|
31
|
+
- Ensure the exploit test in `__tests__/security/` has a clear, descriptive name
|
|
32
|
+
- Confirm the test file will be picked up by your CI security test job
|
|
33
|
+
- If applicable, add the CVE reference or ticket ID as a comment in the test
|
|
12
34
|
|
|
13
35
|
## Goal
|
|
14
|
-
|
|
36
|
+
A fully passing test suite (security tests + functional tests) with clean, reviewable code. The vulnerability is provably closed and provably non-regressive.
|
|
37
|
+
|
|
38
|
+
---
|
|
39
|
+
|
|
40
|
+
## When to revert and retry
|
|
41
|
+
|
|
42
|
+
Revert the patch (`git checkout -- <file>`) and return to Phase 2 if:
|
|
43
|
+
- A functional test fails after applying the security fix
|
|
44
|
+
- The fix introduces a new 401/403 for a legitimate user flow
|
|
45
|
+
- Performance degrades measurably under load (e.g., O(n) queries replacing O(1))
|
|
46
|
+
|
|
47
|
+
When you retry, describe the constraint to the AI: *"The previous fix broke X — find a narrower approach that still closes the vulnerability."*
|
|
@@ -0,0 +1,50 @@
|
|
|
1
|
+
// TDD Remediation: Red Phase Sample Test (Go)
|
|
2
|
+
//
|
|
3
|
+
// Replace the boilerplate below with the specific exploit you are trying to verify.
|
|
4
|
+
// This test MUST fail initially (Red Phase). Once you apply the security fix,
|
|
5
|
+
// this test MUST pass (Green Phase).
|
|
6
|
+
//
|
|
7
|
+
// Place this file in: security/exploit_test.go (or __tests__/security/)
|
|
8
|
+
// Run with: go test ./security/... -v
|
|
9
|
+
|
|
10
|
+
package security_test
|
|
11
|
+
|
|
12
|
+
import (
|
|
13
|
+
"net/http"
|
|
14
|
+
"net/http/httptest"
|
|
15
|
+
"strings"
|
|
16
|
+
"testing"
|
|
17
|
+
|
|
18
|
+
// Update with your module path:
|
|
19
|
+
// "github.com/your-org/your-app/server"
|
|
20
|
+
)
|
|
21
|
+
|
|
22
|
+
func TestShouldNotAllowExploitationOfVulnerability(t *testing.T) {
|
|
23
|
+
// 1. Arrange: set up your router/handler
|
|
24
|
+
// router := server.NewRouter()
|
|
25
|
+
// server := httptest.NewServer(router)
|
|
26
|
+
// defer server.Close()
|
|
27
|
+
|
|
28
|
+
// 2. Act: send the exploit payload
|
|
29
|
+
exploitPayload := `{"input": "exploit payload here"}`
|
|
30
|
+
req, err := http.NewRequest(
|
|
31
|
+
http.MethodPost,
|
|
32
|
+
"/api/vulnerable-endpoint",
|
|
33
|
+
strings.NewReader(exploitPayload),
|
|
34
|
+
)
|
|
35
|
+
if err != nil {
|
|
36
|
+
t.Fatal(err)
|
|
37
|
+
}
|
|
38
|
+
req.Header.Set("Content-Type", "application/json")
|
|
39
|
+
req.Header.Set("Authorization", "Bearer attacker-token-here")
|
|
40
|
+
|
|
41
|
+
rr := httptest.NewRecorder()
|
|
42
|
+
// router.ServeHTTP(rr, req)
|
|
43
|
+
|
|
44
|
+
// 3. Assert: the system MUST block the exploit (currently returns 200 — RED)
|
|
45
|
+
if rr.Code != http.StatusForbidden {
|
|
46
|
+
t.Errorf("expected 403 Forbidden, got %d — vulnerability not blocked (Red Phase)", rr.Code)
|
|
47
|
+
}
|
|
48
|
+
|
|
49
|
+
t.Skip("Replace this boilerplate with your specific exploit test, then remove this Skip")
|
|
50
|
+
}
|
|
@@ -0,0 +1,68 @@
|
|
|
1
|
+
"""
|
|
2
|
+
TDD Remediation: Red Phase Sample Test (PyTest)
|
|
3
|
+
|
|
4
|
+
Replace the boilerplate below with the specific exploit you are trying to verify.
|
|
5
|
+
This test MUST fail initially (Red Phase). Once you apply the security fix,
|
|
6
|
+
this test MUST pass (Green Phase).
|
|
7
|
+
|
|
8
|
+
Usage with FastAPI:
|
|
9
|
+
from fastapi.testclient import TestClient
|
|
10
|
+
from app.main import app
|
|
11
|
+
client = TestClient(app)
|
|
12
|
+
|
|
13
|
+
Usage with Flask:
|
|
14
|
+
from app import create_app
|
|
15
|
+
client = create_app().test_client()
|
|
16
|
+
"""
|
|
17
|
+
|
|
18
|
+
import pytest
|
|
19
|
+
|
|
20
|
+
|
|
21
|
+
# Update this fixture to match your app setup
|
|
22
|
+
@pytest.fixture
|
|
23
|
+
def client():
|
|
24
|
+
# FastAPI example:
|
|
25
|
+
# from fastapi.testclient import TestClient
|
|
26
|
+
# from app.main import app
|
|
27
|
+
# return TestClient(app)
|
|
28
|
+
|
|
29
|
+
# Flask example:
|
|
30
|
+
# from app import create_app
|
|
31
|
+
# app = create_app({"TESTING": True})
|
|
32
|
+
# return app.test_client()
|
|
33
|
+
|
|
34
|
+
raise NotImplementedError("Configure the client fixture for your framework")
|
|
35
|
+
|
|
36
|
+
|
|
37
|
+
@pytest.fixture
|
|
38
|
+
def attacker_token():
|
|
39
|
+
"""Return a valid auth token for a different user (the attacker)."""
|
|
40
|
+
# Return a JWT or session token for user B when testing IDOR against user A
|
|
41
|
+
return "attacker-token-here"
|
|
42
|
+
|
|
43
|
+
|
|
44
|
+
class TestSecurityRedPhase:
|
|
45
|
+
|
|
46
|
+
def test_should_not_allow_exploitation_of_vulnerability(self, client, attacker_token):
|
|
47
|
+
"""
|
|
48
|
+
SHOULD NOT allow unauthorized exploitation of [VULNERABILITY].
|
|
49
|
+
This test MUST FAIL before the patch is applied.
|
|
50
|
+
"""
|
|
51
|
+
# 1. Arrange the exploit payload
|
|
52
|
+
exploit_payload = {
|
|
53
|
+
# "input": "' OR '1'='1", # SQL injection example
|
|
54
|
+
# "name": "<script>alert(1)</script>", # XSS example
|
|
55
|
+
}
|
|
56
|
+
|
|
57
|
+
# 2. Act: Execute the exploit against the system
|
|
58
|
+
response = client.post(
|
|
59
|
+
"/api/vulnerable-endpoint",
|
|
60
|
+
json=exploit_payload,
|
|
61
|
+
headers={"Authorization": f"Bearer {attacker_token}"},
|
|
62
|
+
)
|
|
63
|
+
|
|
64
|
+
# 3. Assert: The system MUST block the exploit gracefully (403, 400, or sanitized response)
|
|
65
|
+
assert response.status_code == 403 # currently returns 200 — RED
|
|
66
|
+
|
|
67
|
+
# For XSS or SQLi, ensure the payload is not reflected:
|
|
68
|
+
# assert exploit_payload["input"] not in response.text
|
|
@@ -0,0 +1,35 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* TDD Remediation: Red Phase Sample Test (Vitest)
|
|
3
|
+
*
|
|
4
|
+
* Replace the boilerplate below with the specific exploit you are trying to verify.
|
|
5
|
+
* This test MUST fail initially (Red Phase). Once you apply the security fix,
|
|
6
|
+
* this test MUST pass (Green Phase).
|
|
7
|
+
*/
|
|
8
|
+
|
|
9
|
+
import { describe, it, expect } from 'vitest';
|
|
10
|
+
import supertest from 'supertest';
|
|
11
|
+
import app from '../../app'; // update with the path to your app
|
|
12
|
+
|
|
13
|
+
const request = supertest(app);
|
|
14
|
+
|
|
15
|
+
describe('Security Vulnerability Remediation - Red Phase', () => {
|
|
16
|
+
|
|
17
|
+
it('SHOULD NOT allow unauthorized exploitation of [VULNERABILITY]', async () => {
|
|
18
|
+
// 1. Arrange the exploit payload
|
|
19
|
+
const exploitPayload = {
|
|
20
|
+
// e.g. input: "1; DROP TABLE users"
|
|
21
|
+
};
|
|
22
|
+
|
|
23
|
+
// 2. Act: Execute the exploit against the system
|
|
24
|
+
const response = await request
|
|
25
|
+
.post('/api/vulnerable-endpoint')
|
|
26
|
+
.send(exploitPayload);
|
|
27
|
+
|
|
28
|
+
// 3. Assert: The system MUST block the exploit gracefully (e.g. 403, 400, sanitization)
|
|
29
|
+
expect(response.status).toBe(403);
|
|
30
|
+
|
|
31
|
+
// For XSS or SQLi, ensure the response body does NOT reflect the payload
|
|
32
|
+
// expect(response.body.data).not.toContain(exploitPayload.input);
|
|
33
|
+
});
|
|
34
|
+
|
|
35
|
+
});
|
|
@@ -0,0 +1,22 @@
|
|
|
1
|
+
name: Security Tests
|
|
2
|
+
|
|
3
|
+
on:
|
|
4
|
+
push:
|
|
5
|
+
branches: [main, master]
|
|
6
|
+
pull_request:
|
|
7
|
+
branches: [main, master]
|
|
8
|
+
|
|
9
|
+
jobs:
|
|
10
|
+
security-tests:
|
|
11
|
+
name: Exploit Test Suite
|
|
12
|
+
runs-on: ubuntu-latest
|
|
13
|
+
|
|
14
|
+
steps:
|
|
15
|
+
- uses: actions/checkout@v4
|
|
16
|
+
|
|
17
|
+
- uses: actions/setup-go@v5
|
|
18
|
+
with:
|
|
19
|
+
go-version: '1.22'
|
|
20
|
+
|
|
21
|
+
- name: Run security exploit tests
|
|
22
|
+
run: go test ./security/... -v
|
|
@@ -0,0 +1,26 @@
|
|
|
1
|
+
name: Security Tests
|
|
2
|
+
|
|
3
|
+
on:
|
|
4
|
+
push:
|
|
5
|
+
branches: [main, master]
|
|
6
|
+
pull_request:
|
|
7
|
+
branches: [main, master]
|
|
8
|
+
|
|
9
|
+
jobs:
|
|
10
|
+
security-tests:
|
|
11
|
+
name: Exploit Test Suite
|
|
12
|
+
runs-on: ubuntu-latest
|
|
13
|
+
|
|
14
|
+
steps:
|
|
15
|
+
- uses: actions/checkout@v4
|
|
16
|
+
|
|
17
|
+
- uses: actions/setup-node@v4
|
|
18
|
+
with:
|
|
19
|
+
node-version: '20'
|
|
20
|
+
cache: 'npm'
|
|
21
|
+
|
|
22
|
+
- name: Install dependencies
|
|
23
|
+
run: npm ci
|
|
24
|
+
|
|
25
|
+
- name: Run security exploit tests
|
|
26
|
+
run: npm run test:security
|
|
@@ -0,0 +1,25 @@
|
|
|
1
|
+
name: Security Tests
|
|
2
|
+
|
|
3
|
+
on:
|
|
4
|
+
push:
|
|
5
|
+
branches: [main, master]
|
|
6
|
+
pull_request:
|
|
7
|
+
branches: [main, master]
|
|
8
|
+
|
|
9
|
+
jobs:
|
|
10
|
+
security-tests:
|
|
11
|
+
name: Exploit Test Suite
|
|
12
|
+
runs-on: ubuntu-latest
|
|
13
|
+
|
|
14
|
+
steps:
|
|
15
|
+
- uses: actions/checkout@v4
|
|
16
|
+
|
|
17
|
+
- uses: actions/setup-python@v5
|
|
18
|
+
with:
|
|
19
|
+
python-version: '3.12'
|
|
20
|
+
|
|
21
|
+
- name: Install dependencies
|
|
22
|
+
run: pip install -r requirements.txt
|
|
23
|
+
|
|
24
|
+
- name: Run security exploit tests
|
|
25
|
+
run: pytest tests/security/ -v
|
package/workflows/tdd-audit.md
CHANGED
|
@@ -1,6 +1,16 @@
|
|
|
1
1
|
---
|
|
2
2
|
description: Run the complete TDD Remediation Autonomous Audit
|
|
3
3
|
---
|
|
4
|
-
Please use the TDD Remediation Protocol Auto-Audit skill (
|
|
4
|
+
Please use the TDD Remediation Protocol Auto-Audit skill (located in the `skills/tdd-remediation` folder) to secure this repository.
|
|
5
5
|
|
|
6
|
-
|
|
6
|
+
Follow the full Auto-Audit protocol from `auto-audit.md`:
|
|
7
|
+
|
|
8
|
+
1. **Explore** the codebase using Glob, Grep, and Read. Focus on controllers, routes, middleware, and database layers. Search for the vulnerability patterns defined in Phase 0 of the auto-audit prompt.
|
|
9
|
+
2. **Present** a structured Audit Report, grouped by severity (CRITICAL / HIGH / MEDIUM / LOW), and wait for my confirmation before making any changes.
|
|
10
|
+
3. **Remediate** each confirmed vulnerability one at a time, top-down by severity, applying the full Red-Green-Refactor loop:
|
|
11
|
+
- Write the exploit test (Red — must fail)
|
|
12
|
+
- Apply the patch (Green — test must pass)
|
|
13
|
+
- Run the full suite (Refactor — no regressions)
|
|
14
|
+
4. **Report** a final Remediation Summary table when all issues are addressed.
|
|
15
|
+
|
|
16
|
+
Do not skip steps. Do not advance to the next vulnerability until the current one is fully proven closed by a passing test.
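The "top-down by severity" ordering in step 3 can be sketched as a simple sort. The findings list and its field names below are hypothetical; the actual Audit Report format is defined by the auto-audit prompt, not by this example.

```python
# Lower number means remediated earlier.
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}

# Hypothetical findings as they might come out of the Explore step.
findings = [
    {"id": "F1", "severity": "MEDIUM", "type": "XSS"},
    {"id": "F2", "severity": "CRITICAL", "type": "SQL injection"},
    {"id": "F3", "severity": "LOW", "type": "verbose error message"},
    {"id": "F4", "severity": "HIGH", "type": "IDOR"},
    {"id": "F5", "severity": "CRITICAL", "type": "command injection"},
]


def remediation_queue(findings):
    """Order findings by severity; ties keep their original report order."""
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f["severity"]])


queue = [f["id"] for f in remediation_queue(findings)]
```

Each item in the queue would then go through the full Red-Green-Refactor loop before the next one is started.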
|