@codexstar/bug-hunter 3.0.0
- package/CHANGELOG.md +151 -0
- package/LICENSE +21 -0
- package/README.md +665 -0
- package/SKILL.md +624 -0
- package/bin/bug-hunter +222 -0
- package/evals/evals.json +362 -0
- package/modes/_dispatch.md +121 -0
- package/modes/extended.md +94 -0
- package/modes/fix-loop.md +115 -0
- package/modes/fix-pipeline.md +384 -0
- package/modes/large-codebase.md +212 -0
- package/modes/local-sequential.md +143 -0
- package/modes/loop.md +125 -0
- package/modes/parallel.md +113 -0
- package/modes/scaled.md +76 -0
- package/modes/single-file.md +38 -0
- package/modes/small.md +86 -0
- package/package.json +56 -0
- package/prompts/doc-lookup.md +44 -0
- package/prompts/examples/hunter-examples.md +131 -0
- package/prompts/examples/skeptic-examples.md +87 -0
- package/prompts/fixer.md +103 -0
- package/prompts/hunter.md +146 -0
- package/prompts/recon.md +159 -0
- package/prompts/referee.md +122 -0
- package/prompts/skeptic.md +143 -0
- package/prompts/threat-model.md +122 -0
- package/scripts/bug-hunter-state.cjs +537 -0
- package/scripts/code-index.cjs +541 -0
- package/scripts/context7-api.cjs +133 -0
- package/scripts/delta-mode.cjs +219 -0
- package/scripts/dep-scan.cjs +343 -0
- package/scripts/doc-lookup.cjs +316 -0
- package/scripts/fix-lock.cjs +167 -0
- package/scripts/init-test-fixture.sh +19 -0
- package/scripts/payload-guard.cjs +197 -0
- package/scripts/run-bug-hunter.cjs +892 -0
- package/scripts/tests/bug-hunter-state.test.cjs +87 -0
- package/scripts/tests/code-index.test.cjs +57 -0
- package/scripts/tests/delta-mode.test.cjs +47 -0
- package/scripts/tests/fix-lock.test.cjs +36 -0
- package/scripts/tests/fixtures/flaky-worker.cjs +63 -0
- package/scripts/tests/fixtures/low-confidence-worker.cjs +73 -0
- package/scripts/tests/fixtures/success-worker.cjs +42 -0
- package/scripts/tests/payload-guard.test.cjs +41 -0
- package/scripts/tests/run-bug-hunter.test.cjs +403 -0
- package/scripts/tests/test-utils.cjs +59 -0
- package/scripts/tests/worktree-harvest.test.cjs +297 -0
- package/scripts/triage.cjs +528 -0
- package/scripts/worktree-harvest.cjs +516 -0
- package/templates/subagent-wrapper.md +109 -0
package/modes/local-sequential.md
ADDED
@@ -0,0 +1,143 @@
# Local-Sequential Mode (no subagents — default fallback)

Run all pipeline phases in the main agent's own context window. This is the **most common execution mode** — most agent runtimes will land here because subagent dispatch requires specific tooling that isn't always available.

This is NOT a degraded mode. The skill is designed to work fully here.

## How It Works

You (the orchestrating agent) play each role yourself, sequentially. Between phases you **write outputs to files** so later phases can reference them without holding everything in working memory.

All state files go in `.bug-hunter/` relative to the working directory.

## Phase A: Recon (map the codebase)

**Check for triage data first.** The orchestrator runs `triage.cjs` in Step 1 and writes `.bug-hunter/triage.json`. If this file exists, triage has already:
- Discovered and classified all source files (domains + riskMap)
- Computed FILE_BUDGET from actual file sizes
- Built a priority-ordered scanOrder for Hunters
- Determined the execution strategy

1. Read `SKILL_DIR/prompts/recon.md` with the Read tool — do NOT act from memory.

2. **If `.bug-hunter/triage.json` exists:**
   - Read it. Use `triage.riskMap` as the initial risk map (skip file discovery + classification).
   - Use `triage.fileBudget` as FILE_BUDGET (skip computation).
   - Use `triage.scanOrder` as the file order for Phase B.
   - Recon's remaining job: read 3-5 key files from CRITICAL domains to identify **tech stack** (framework, auth mechanism, database, key dependencies) and **trust boundary patterns** (how routes are defined, how auth middleware is applied, etc.).
   - If git is available, check recently changed files with `git log`.
   - Write your Recon output to `.bug-hunter/recon.md` — include the tech stack, patterns, and the triage-provided risk map.

3. **If `.bug-hunter/triage.json` does NOT exist** (fallback — Recon called directly):
   - Execute the full Recon instructions: discover files, classify, compute FILE_BUDGET.
   - Use search tools (`rg`, `grep`, or manual Read) to find trust boundaries.
   - Measure file sizes to compute FILE_BUDGET (default: 40 if measurement fails).
   - Write complete output to `.bug-hunter/recon.md`.

4. Parse your output: extract the risk map, FILE_BUDGET, tech stack, and scan order. You will use these in all subsequent phases.

**If Recon fails or you cannot complete it:** Skip Recon. Set FILE_BUDGET=40. Use triage's scanOrder if available, otherwise use a flat file list ordered by directory depth. Continue to Phase B.
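
The triage-or-fallback decision in Phase A can be sketched in Node.js. This is a minimal illustration under stated assumptions, not the skill's actual code: `loadScanPlan` is a hypothetical helper, the only field names taken from the text are `riskMap`, `fileBudget`, and `scanOrder`, and the default of 40 follows the fallback rule above.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Fallback budget named in the text; used when triage gives no usable value.
const DEFAULT_FILE_BUDGET = 40;

// Hypothetical helper: builds the plan Phase B needs, preferring triage data.
function loadScanPlan(stateDir, fallbackFiles = []) {
  const triagePath = path.join(stateDir, "triage.json");
  try {
    const triage = JSON.parse(fs.readFileSync(triagePath, "utf8"));
    const budgetOk = Number.isInteger(triage.fileBudget) && triage.fileBudget > 0;
    return {
      source: "triage",
      riskMap: triage.riskMap ?? {},
      fileBudget: budgetOk ? triage.fileBudget : DEFAULT_FILE_BUDGET,
      scanOrder: Array.isArray(triage.scanOrder) ? triage.scanOrder : fallbackFiles,
    };
  } catch {
    // Missing or unreadable triage.json: flat list ordered by directory depth.
    const byDepth = [...fallbackFiles].sort(
      (a, b) => a.split("/").length - b.split("/").length || a.localeCompare(b)
    );
    return {
      source: "fallback",
      riskMap: {},
      fileBudget: DEFAULT_FILE_BUDGET,
      scanOrder: byDepth,
    };
  }
}
```

Calling `loadScanPlan(".bug-hunter", files)` returns a `source` field so the Recon output can record whether the risk map came from triage or from the fallback path.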

## Phase B: Hunter (deep scan for bugs)

1. Read `SKILL_DIR/prompts/hunter.md` with the Read tool.
2. Read `SKILL_DIR/prompts/doc-lookup.md` with the Read tool.
3. **Switch mindset**: you are now a Bug Hunter. Your ONLY job is to find behavioral bugs.
4. Execute the Hunter instructions yourself:
   - Read files in risk-map order: CRITICAL → HIGH → MEDIUM.
   - For each file, use the Read tool. Do NOT rely on memory from earlier phases.
   - Apply the mandatory security checklist sweep (Phase 3 in hunter.md) on every CRITICAL and HIGH file.
   - Track which files you actually read — be honest about coverage.
   - For each bug found, record it in the exact BUG-N format specified in hunter.md.
5. Write your complete findings to `.bug-hunter/findings.md`.

**Context management:** If you notice earlier files becoming hazy in your memory:
- STOP expanding to new files.
- Record your honest coverage in FILES SCANNED / FILES SKIPPED.
- Complete the current file thoroughly rather than skimming more files.
- The pipeline will handle partial coverage via gap-fill or `--loop` mode.

**Chunked execution (when files > FILE_BUDGET):**

If the Recon risk map contains more files than FILE_BUDGET, do NOT try to read them all in one pass. Instead:

1. Initialize state:
   ```bash
   node "$SKILL_DIR/scripts/bug-hunter-state.cjs" init ".bug-hunter/state.json" "local-sequential" ".bug-hunter/source-files.json" 30
   ```
2. For each chunk:
   a. Get next chunk:
      ```bash
      node "$SKILL_DIR/scripts/bug-hunter-state.cjs" next-chunk ".bug-hunter/state.json"
      ```
   b. Mark in-progress:
      ```bash
      node "$SKILL_DIR/scripts/bug-hunter-state.cjs" mark-chunk ".bug-hunter/state.json" "<chunk-id>" in_progress
      ```
   c. Run the Hunter scan on this chunk's files only.
   d. Write chunk findings to `.bug-hunter/chunk-<id>-findings.json`.
   e. Record findings in state:
      ```bash
      node "$SKILL_DIR/scripts/bug-hunter-state.cjs" record-findings ".bug-hunter/state.json" ".bug-hunter/chunk-<id>-findings.json" "local-sequential"
      ```
   f. Mark done:
      ```bash
      node "$SKILL_DIR/scripts/bug-hunter-state.cjs" mark-chunk ".bug-hunter/state.json" "<chunk-id>" done
      ```
3. After all chunks: merge findings from `.bug-hunter/state.json` into `.bug-hunter/findings.md`.
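
To make the chunking idea concrete, here is a rough sketch of splitting a priority-ordered file list under a size cap. It only illustrates the concept: `buildChunks`, the `chunk-N` id scheme, and the `pending` status are assumptions, and the real `bug-hunter-state.cjs` may partition and name chunks differently.

```javascript
// Split a priority-ordered file list into chunks no larger than chunkSize.
function buildChunks(scanOrder, chunkSize) {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks = [];
  for (let i = 0; i < scanOrder.length; i += chunkSize) {
    chunks.push({
      id: `chunk-${chunks.length + 1}`, // hypothetical id scheme
      status: "pending",                // hypothetical initial status
      files: scanOrder.slice(i, i + chunkSize),
    });
  }
  return chunks;
}

// 70 files with a cap of 30 (the value passed to `init` above) gives 3 chunks.
const files = Array.from({ length: 70 }, (_, i) => `src/file-${i}.js`);
const chunks = buildChunks(files, 30);
console.log(chunks.map((c) => `${c.id}:${c.files.length}`).join(" "));
// prints "chunk-1:30 chunk-2:30 chunk-3:10"
```

Because the input is already in risk order, the earliest chunks hold the CRITICAL and HIGH files, so an interrupted run still covers the most important files first.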

**Gap-fill:** After scanning, compare FILES SCANNED against the risk map. If any CRITICAL or HIGH files are in FILES SKIPPED, read them now and append any new findings. If you truly cannot read them (context exhaustion), leave them in FILES SKIPPED.

If TOTAL FINDINGS: 0, skip Phases C and D. Go directly to Step 7 (Final Report) in SKILL.md.

## Phase C: Skeptic (challenge your own findings)

1. Read `SKILL_DIR/prompts/skeptic.md` with the Read tool.
2. Read `SKILL_DIR/prompts/doc-lookup.md` with the Read tool.
3. **Switch mindset completely**: you are now the Skeptic. Your job is to DISPROVE false positives. Forget the pride of finding them — you want to kill weak claims.
4. Read `.bug-hunter/findings.md` to get the findings list.
5. For EACH finding:
   - Re-read the actual code at the reported file and line with the Read tool. This is MANDATORY — do not evaluate from memory.
   - Read all cross-referenced files.
   - Mentally trace the runtime trigger: does the code actually behave the way the Hunter claimed?
   - Check framework/middleware protections the Hunter may have missed.
   - Apply the risk calculation: `EV = (confidence% × points) - ((100 - confidence%) × 2 × points)`. Only DISPROVE when EV is positive (confidence > 67%).
   - For Critical bugs: need >67% confidence AND all cross-references read.
6. Write your complete Skeptic output to `.bug-hunter/skeptic.md` in the format from skeptic.md.
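
The risk calculation in step 5 can be checked with a few lines of arithmetic. The sketch below implements the formula exactly as written; `disproveEV` and `shouldDisprove` are illustrative names, and `points` stands for whatever point value skeptic.md assigns to the finding (an assumption here, since that file is not shown).

```javascript
// Expected value of a DISPROVE call, per the formula in the text.
// The result is in percent-point units (confidence is 0..100, not 0..1).
function disproveEV(confidencePct, points) {
  const gain = confidencePct * points;             // payoff when the disproof is right
  const loss = (100 - confidencePct) * 2 * points; // double penalty when it is wrong
  return gain - loss;
}

// Break-even: c * p = (100 - c) * 2 * p  =>  3c = 200  =>  c = 66.67 (approx.)
function shouldDisprove(confidencePct, points) {
  return disproveEV(confidencePct, points) > 0;
}

console.log(disproveEV(70, 10)); // prints 100
console.log(disproveEV(60, 10)); // prints -200
```

The break-even point does not depend on `points`, which is why a single 67% threshold works for every severity.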

**Important:** When switching from Hunter to Skeptic, genuinely try to disprove your own findings. The point of this phase is adversarial review. If you cannot genuinely argue against a finding, ACCEPT it and move on — do not waste time rubber-stamping.

## Phase D: Referee (final verdicts)

1. Read `SKILL_DIR/prompts/referee.md` with the Read tool.
2. **Switch mindset**: you are the impartial Referee. You trust neither the Hunter nor the Skeptic.
3. Read both `.bug-hunter/findings.md` and `.bug-hunter/skeptic.md`.
4. For each finding:
   - **Tier 1 (all Critical + top 15 by severity):** Re-read the actual code yourself a THIRD time using the Read tool. Construct the runtime trigger independently. Make your own judgment.
   - **Tier 2 (remaining):** Evaluate evidence quality. Whose code quotes are more specific? Whose runtime trigger is more concrete?
5. Make final REAL BUG / NOT A BUG verdicts with severity calibration.
6. Write the final Referee report to `.bug-hunter/referee.md`.
7. Proceed to Step 7 (Final Report) in SKILL.md.

## State Files Summary

After a complete local-sequential run, these files should exist:

| File | Phase | Content |
|------|-------|---------|
| `.bug-hunter/recon.md` | A | Risk map, file metrics, tech stack |
| `.bug-hunter/findings.md` | B | All Hunter findings in BUG-N format |
| `.bug-hunter/skeptic.md` | C | Skeptic challenges and decisions |
| `.bug-hunter/referee.md` | D | Final verdicts and confirmed bugs |
| `.bug-hunter/state.json` | B (chunked) | Chunk progress, findings ledger |
| `.bug-hunter/source-files.json` | A | Source file list (for state init) |

## Coverage Enforcement

After Phase D, check coverage:

- If all CRITICAL and HIGH files were scanned: proceed to Final Report.
- If any CRITICAL/HIGH files were skipped:
  - If `--loop` mode: the ralph-loop will iterate and cover them next.
  - If not `--loop`: include a coverage WARNING in the Final Report and recommend `--loop`.
- Do NOT claim "full coverage" or "audit complete" unless every CRITICAL and HIGH file was actually read with the Read tool and has status DONE.
package/modes/loop.md
ADDED
@@ -0,0 +1,125 @@
# Ralph-Loop Mode (`--loop`)

When `--loop` is present, the bug-hunter wraps itself in a ralph-loop that keeps iterating until the audit achieves full coverage. This is for thorough, autonomous audits where you want every file examined.

## CRITICAL: Starting the ralph-loop

**You MUST call the `ralph_start` tool to begin the loop.** Without this call, the loop will not iterate.

When `LOOP_MODE=true` is set (from `--loop` flag), before running the first pipeline iteration:

1. Build the task content from the TODO.md template below.
2. Call the `ralph_start` tool:

   ```
   ralph_start({
     name: "bug-hunter-audit",
     taskContent: <the TODO.md content below>,
     maxIterations: 10
   })
   ```

3. The ralph-loop system will then drive iteration. Each iteration:
   - You receive the task prompt with the current checklist state.
   - You execute one iteration of the bug-hunt pipeline (steps below).
   - You update `.bug-hunter/coverage.md` with results.
   - If ALL CRITICAL/HIGH files are DONE → output `<promise>COMPLETE</promise>` to end the loop.
   - Otherwise → call `ralph_done` to proceed to the next iteration.

**Do NOT manually loop or re-invoke yourself.** The ralph-loop system handles iteration automatically after you call `ralph_start`.

## How it works

1. **First iteration**: Run the normal pipeline (Recon → Hunters → Skeptics → Referee). At the end, write a coverage report to `.bug-hunter/coverage.md` using the machine-parseable format below.

2. **Coverage check**: After each iteration, evaluate:
   - If ALL CRITICAL and HIGH files show status DONE → output `<promise>COMPLETE</promise>` → loop ends
   - If any CRITICAL/HIGH files are SKIPPED or PARTIAL → call `ralph_done` → loop continues
   - If only MEDIUM files remain uncovered → output `<promise>COMPLETE</promise>` (MEDIUM gaps are acceptable)

3. **Subsequent iterations**: Each new iteration reads `.bug-hunter/coverage.md` to see what's already been done, then runs the pipeline ONLY on uncovered files. New findings are appended to the cumulative bug list.

## Coverage file format (machine-parseable)

**`.bug-hunter/coverage.md`:**
```markdown
# Bug Hunt Coverage
SCHEMA_VERSION: 2

## Meta
ITERATION: [N]
STATUS: [IN_PROGRESS | COMPLETE]
TOTAL_BUGS_FOUND: [N]
TIMESTAMP: [ISO 8601]
CHECKSUM: [line_count of Files section]|[line_count of Bugs section]

## Files
<!-- One line per file. Format: TIER|PATH|STATUS|ITERATION_SCANNED|BUGS_FOUND -->
<!-- STATUS: DONE | PARTIAL | SKIPPED -->
<!-- BUGS_FOUND: comma-separated BUG-IDs, or NONE -->
CRITICAL|src/auth/login.ts|DONE|1|BUG-3,BUG-7
CRITICAL|src/auth/middleware.ts|DONE|1|NONE
HIGH|src/api/users.ts|DONE|1|BUG-12
HIGH|src/api/payments.ts|SKIPPED|0|
MEDIUM|src/utils/format.ts|SKIPPED|0|
TEST|src/auth/login.test.ts|CONTEXT|1|

## Bugs
<!-- One line per confirmed bug. Format: BUG-ID|SEVERITY|FILE|LINES|ONE_LINE_DESCRIPTION -->
BUG-3|Critical|src/auth/login.ts|45-52|JWT token not validated before use
BUG-7|Medium|src/auth/login.ts|89|Password comparison uses timing-unsafe equality
BUG-12|Low|src/api/users.ts|120-125|Missing null check on optional profile field
```
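
Since the format is meant to be machine-parseable, a small parser shows how the Files rows could be read. This is a sketch, not the package's parser: `parseFileRow` and `sectionRows` are hypothetical helpers. It treats an empty BUGS_FOUND field the same as `NONE`, and it does not reject the `TEST` tier or `CONTEXT` status that appear in the sample row, even though the STATUS comment does not list them.

```javascript
// Parse one Files row: TIER|PATH|STATUS|ITERATION_SCANNED|BUGS_FOUND
function parseFileRow(line) {
  const parts = line.split("|");
  if (parts.length !== 5) {
    throw new Error(`expected 5 fields, got ${parts.length}: ${line}`);
  }
  const [tier, path, status, iteration, bugs] = parts.map((p) => p.trim());
  return {
    tier,
    path,
    status,
    iteration: Number.parseInt(iteration, 10),
    // Both "NONE" and an empty field mean no bugs recorded for this file.
    bugs: bugs === "" || bugs === "NONE" ? [] : bugs.split(",").map((b) => b.trim()),
  };
}

// Collect data rows under "## <name>", skipping blank lines and HTML comments.
function sectionRows(text, name) {
  const rows = [];
  let inside = false;
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line.startsWith("## ")) {
      inside = line === `## ${name}`;
      continue;
    }
    if (inside && line !== "" && !line.startsWith("<!--")) rows.push(line);
  }
  return rows;
}

const sample = [
  "# Bug Hunt Coverage",
  "## Files",
  "<!-- STATUS: DONE | PARTIAL | SKIPPED -->",
  "CRITICAL|src/auth/login.ts|DONE|1|BUG-3,BUG-7",
  "HIGH|src/api/payments.ts|SKIPPED|0|",
  "",
  "## Bugs",
  "BUG-3|Critical|src/auth/login.ts|45-52|JWT token not validated before use",
].join("\n");
const parsedFiles = sectionRows(sample, "Files").map(parseFileRow);
console.log(parsedFiles.length); // prints 2
```

One limitation of a pipe-delimited format is that a file path containing `|` would break the five-field split, so this sketch fails loudly in that case rather than guessing.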

## TODO.md task content for ralph_start

Use this as the `taskContent` parameter when calling `ralph_start`:

**For `--loop` (scan only):**
```markdown
# Bug Hunt Audit

## Coverage Tasks
- [ ] All CRITICAL files scanned
- [ ] All HIGH files scanned
- [ ] Findings verified through Skeptic+Referee pipeline

## Completion
- [ ] ALL_TASKS_COMPLETE

## Instructions
1. Read .bug-hunter/coverage.md for previous iteration state
2. Parse the Files table — collect all lines where STATUS is not DONE and TIER is CRITICAL or HIGH
3. Run bug-hunter pipeline on those files only
4. Update coverage file: change STATUS to DONE, add BUG-IDs
5. Output <promise>COMPLETE</promise> when all CRITICAL/HIGH files are DONE
6. Otherwise call ralph_done to continue to the next iteration
```

## Coverage file validation

At the start of each iteration, validate the coverage file:
1. Check `SCHEMA_VERSION: 2` exists on line 2 — if missing, this is a v1 file; migrate by adding the header
2. Parse the CHECKSUM field: `[file_lines]|[bug_lines]` — count actual lines in Files and Bugs sections
3. If counts don't match the checksum, the file may be corrupted. Warn: "Coverage file checksum mismatch (expected X|Y, got A|B). Re-scanning affected files." Then set any files with mismatched data to STATUS=PARTIAL for re-scan.
4. If the file fails to parse entirely (malformed lines, missing sections), rename it to `.bug-hunter/coverage.md.bak` and start fresh. Warn user.

Update the CHECKSUM every time you write to the coverage file.
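
The checksum comparison in steps 2 and 3 might look like the sketch below. One assumption is worth flagging: the text does not say whether "line count" includes comment and blank lines, so this version counts only data rows. `countSectionRows` and `verifyChecksum` are hypothetical names.

```javascript
// Count data rows (non-blank, non-comment) under each "## " heading.
function countSectionRows(text) {
  const counts = {};
  let current = null;
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line.startsWith("## ")) {
      current = line.slice(3);
      counts[current] = 0;
    } else if (current && line !== "" && !line.startsWith("<!--")) {
      counts[current] += 1;
    }
  }
  return counts;
}

// Compare the stored CHECKSUM against the actual Files|Bugs row counts.
function verifyChecksum(text) {
  const match = text.match(/^CHECKSUM:\s*(\d+)\|(\d+)\s*$/m);
  const counts = countSectionRows(text);
  const actual = `${counts.Files ?? 0}|${counts.Bugs ?? 0}`;
  if (!match) return { ok: false, expected: null, actual };
  const expected = `${match[1]}|${match[2]}`;
  return { ok: expected === actual, expected, actual };
}

const coverage = [
  "# Bug Hunt Coverage",
  "SCHEMA_VERSION: 2",
  "",
  "## Meta",
  "ITERATION: 1",
  "CHECKSUM: 2|1",
  "",
  "## Files",
  "CRITICAL|src/auth/login.ts|DONE|1|BUG-3",
  "HIGH|src/api/payments.ts|SKIPPED|0|",
  "",
  "## Bugs",
  "BUG-3|Critical|src/auth/login.ts|45-52|JWT token not validated before use",
].join("\n");
console.log(verifyChecksum(coverage).ok); // prints true
```

A count-based checksum only detects added or dropped rows. It will not notice an edit inside a row, which is presumably why step 4 also covers files that fail to parse.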

## Iteration behavior

Each iteration after the first:
1. Read `.bug-hunter/coverage.md` — parse the Files table
2. Collect all lines where STATUS != DONE and TIER is CRITICAL or HIGH
3. If none remain → output `<promise>COMPLETE</promise>` (this ends the ralph-loop)
4. Otherwise, run the pipeline on remaining files only (use small/parallel mode based on count)
5. Update the coverage file: set STATUS to DONE for scanned files, append new bugs to the Bugs section
6. Increment ITERATION counter
7. Call `ralph_done` to proceed to the next iteration
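
Steps 2 and 3 reduce to one filter over the parsed Files rows. The sketch below assumes rows have already been parsed into objects with `tier`, `path`, and `status` fields (a hypothetical shape); `nextLoopAction` is an illustrative name, and the returned `signal` strings simply echo the two outcomes named in the text.

```javascript
// Decide what to signal at the end of an iteration, given parsed Files rows.
function nextLoopAction(fileRows) {
  const pending = fileRows
    .filter((f) => (f.tier === "CRITICAL" || f.tier === "HIGH") && f.status !== "DONE")
    .map((f) => f.path);
  // MEDIUM (and any other tier) never blocks completion.
  return pending.length === 0
    ? { signal: "<promise>COMPLETE</promise>", pending }
    : { signal: "ralph_done", pending };
}

const rows = [
  { tier: "CRITICAL", path: "src/auth/login.ts", status: "DONE" },
  { tier: "HIGH", path: "src/api/payments.ts", status: "SKIPPED" },
  { tier: "MEDIUM", path: "src/utils/format.ts", status: "SKIPPED" },
];
const decision = nextLoopAction(rows);
console.log(decision.signal, decision.pending.join(","));
// prints "ralph_done src/api/payments.ts"
```

Here the loop continues because `src/api/payments.ts` is still SKIPPED. The MEDIUM file is also unscanned but does not appear in `pending`.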

## Safety

- Max 10 iterations by default (set via `ralph_start({ maxIterations: 10 })`)
- Each iteration only scans NEW files — no re-scanning already-DONE files
- User can stop anytime with ESC or `/ralph-stop`
- All state is in `.bug-hunter/coverage.md` — fully resumable, machine-parseable
package/modes/parallel.md
ADDED
@@ -0,0 +1,113 @@
# Parallel Mode (11–FILE_BUDGET files) — sequential-first hybrid

This mode handles medium-sized scan targets. The deep Hunter scans all files sequentially.
An optional read-only dual-lens **scout pass** can run in parallel to provide hints.
All phases are dispatched using the `AGENT_BACKEND` selected during SKILL preflight.

---

## Triage Integration

Before any phase, check for `.bug-hunter/triage.json` (written by Step 1). If present:
- Use `triage.riskMap` as the risk map — skip Recon's file classification.
- Use `triage.scanOrder` as the Hunter's file order.
- Use `triage.fileBudget` as FILE_BUDGET.
- Recon becomes an enrichment pass: identify tech stack and trust boundary patterns only.

---

## Step 4: Run Recon

Dispatch Recon using the standard dispatch pattern (see `_dispatch.md`, role=`recon`).

**If triage data exists**, tell Recon to use the triage risk map and only identify tech stack + patterns. Pass the triage JSON path as phase-specific context.

**If no triage data**, Recon does full file discovery and classification.

After Recon completes, read `.bug-hunter/recon.md` to extract the risk map, tech stack, and FILE_BUDGET.

Report architecture summary to user.

---

## Step 5: Optional read-only dual-lens scout pass (safe parallel)

**This step is optional** — skip it if the codebase is straightforward or if `AGENT_BACKEND = "local-sequential"`.

Launch two scout Hunters in parallel on CRITICAL+HIGH files only:

1. Generate payloads:
   ```
   node "$SKILL_DIR/scripts/payload-guard.cjs" generate triage-hunter ".bug-hunter/payloads/scout-hunter-a.json"
   node "$SKILL_DIR/scripts/payload-guard.cjs" generate triage-hunter ".bug-hunter/payloads/scout-hunter-b.json"
   ```
2. Fill payloads: Scout-A = security lens, Scout-B = logic lens. Both scan the same CRITICAL+HIGH files.
3. Validate both payloads.
4. Dispatch in parallel:
   ```
   subagent({ tasks: [
     { agent: "scout-hunter-security", task: "<security scout template>", output: ".bug-hunter/scout-a.md" },
     { agent: "scout-hunter-logic", task: "<logic scout template>", output: ".bug-hunter/scout-b.md" }
   ]})
   ```
5. Wait for both. Merge scout shortlists into hints for the deep Hunter.

**Scout pass rules:**
- Scouts are READ-ONLY — they never modify files or state.
- If either scout dispatch fails, disable scout pass and continue to Step 5-deep without hints.
- Scout findings alone are NOT the final result — they only inform the deep scan.

---

## Step 5-deep: Run Deep Hunter

Dispatch Hunter using the standard dispatch pattern (see `_dispatch.md`, role=`hunter`).

Pass to the Hunter:
- File list in risk-map order. If triage exists, use `triage.scanOrder`.
- Risk map from Recon (or triage).
- Tech stack from Recon.
- If scout hints exist (from Step 5), use them to prioritize certain code sections, but scan all files regardless.
- `doc-lookup.md` contents as phase-specific context.

After completion, read `.bug-hunter/findings.md`.

**Merge scout + deep findings:** If scout pass ran, compare scout findings with deep Hunter findings. Promote any scout-only findings (bugs the deep Hunter missed) into the findings list for Skeptic review.
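
One way to picture the promotion step is a set difference keyed on location. The sketch below is an assumption-heavy illustration: it treats two findings as the same bug when `file` and `lines` match exactly, which is cruder than the judgment an agent would apply, and the `origin` tag is a hypothetical field added so the Skeptic can see where a finding came from.

```javascript
// Two findings count as the same bug when file and line range match exactly.
const locationKey = (f) => `${f.file}:${f.lines}`;

// Append scout findings that the deep Hunter did not report.
function promoteScoutOnly(deepFindings, scoutFindings) {
  const seen = new Set(deepFindings.map(locationKey));
  const merged = [...deepFindings];
  for (const f of scoutFindings) {
    if (seen.has(locationKey(f))) continue; // already reported at this location
    seen.add(locationKey(f));
    merged.push({ ...f, origin: "scout" }); // hypothetical provenance tag
  }
  return merged;
}

const deep = [{ file: "src/auth/login.ts", lines: "45-52", claim: "JWT not validated" }];
const scouts = [
  { file: "src/auth/login.ts", lines: "45-52", claim: "token unchecked" },
  { file: "src/api/users.ts", lines: "120-125", claim: "missing null check" },
  { file: "src/api/users.ts", lines: "120-125", claim: "profile may be null" },
];
const mergedFindings = promoteScoutOnly(deep, scouts);
console.log(mergedFindings.length); // prints 2
```

Both scouts flagged the same lines in `src/api/users.ts`, so only the first is promoted. A real merge would likely also need to renumber BUG-IDs, which this sketch leaves out.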

If TOTAL FINDINGS: 0, skip Skeptic and Referee. Go to Step 7 (Final Report) in SKILL.md.

---

## Step 5-verify: Gap-fill check

Same as small mode: compare FILES SCANNED vs risk map, re-scan any missed CRITICAL/HIGH files.

---

## Step 6: Run Skeptic

Dispatch Skeptic using the standard dispatch pattern (see `_dispatch.md`, role=`skeptic`).

For parallel mode, you may split into two Skeptics by directory if findings span multiple services:
- Skeptic-A: bugs in service/directory A
- Skeptic-B: bugs in service/directory B

Pass to each Skeptic only the bugs in their assigned scope. After completion, merge results.

If only one service/directory: use a single Skeptic.

---

## Step 7: Run Referee

Dispatch Referee using the standard dispatch pattern (see `_dispatch.md`, role=`referee`).

Pass the merged Hunter findings + Skeptic challenges.

After completion, read `.bug-hunter/referee.md`.

---

## After Step 7

Proceed to **Step 7** (Final Report) in SKILL.md.
package/modes/scaled.md
ADDED
@@ -0,0 +1,76 @@
# Scaled Mode (FILE_BUDGET×2+1 to FILE_BUDGET×3 files) — state-driven sequential

This mode handles large targets requiring 3+ chunks with full resume state.
All phases are dispatched using the `AGENT_BACKEND` selected during SKILL preflight.

---

## Triage Integration

Before any phase, check for `.bug-hunter/triage.json` (written by Step 1). If present:
- Use `triage.riskMap` as the risk map — skip Recon's file classification.
- Use `triage.scanOrder` as the chunk-building source (files already priority-ordered).
- Use `triage.fileBudget` as FILE_BUDGET and chunk size cap.
- Use `triage.domains` for service-aware partitioning if available.
- Recon becomes an enrichment pass: identify tech stack and trust boundary patterns only.

---

## Step 4: Run Recon

Dispatch Recon using the standard dispatch pattern (see `_dispatch.md`, role=`recon`).

Same as Extended mode: Recon enriches triage data with tech stack and patterns. If no triage, Recon does full discovery.

---

## Step 5: Run Chunked Hunters with Resume State

### 5a. Build chunks and initialize state

Same as Extended mode. Partition from `triage.scanOrder` or risk map. Initialize state:
```bash
node "$SKILL_DIR/scripts/bug-hunter-state.cjs" init ".bug-hunter/state.json" "scaled" ".bug-hunter/source-files.json" 30
```

### 5b. Execute chunks with hash-based skip filtering

Before each chunk, apply skip filtering to avoid re-scanning files already processed (handles resume after interruption):
```bash
node "$SKILL_DIR/scripts/bug-hunter-state.cjs" hash-filter ".bug-hunter/state.json" ".bug-hunter/chunk-<id>-files.json"
```

For each chunk: dispatch Hunter, record findings, mark done — same pattern as Extended mode.

### 5c. Cross-chunk consistency

After all chunks complete:
1. Merge findings from state into `.bug-hunter/findings.md`.
2. Run consistency check: look for duplicate BUG-IDs across chunks and conflicting claims on the same file/line.
3. Resolve conflicts: keep the finding with the stronger evidence.
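
The detection half of the consistency check is mechanical and can be sketched as two groupings. The finding shape (`bugId`, `file`, `lines`, `claim`) and the function name `checkConsistency` are assumptions for illustration. Deciding which conflicting finding has the stronger evidence is left to the agent, so this sketch only reports what needs a decision.

```javascript
// Report BUG-IDs used more than once and locations with differing claims.
function checkConsistency(findings) {
  const byId = new Map();
  const byLocation = new Map();
  for (const f of findings) {
    byId.set(f.bugId, [...(byId.get(f.bugId) ?? []), f]);
    const loc = `${f.file}:${f.lines}`;
    byLocation.set(loc, [...(byLocation.get(loc) ?? []), f]);
  }
  const duplicateIds = [...byId]
    .filter(([, group]) => group.length > 1)
    .map(([id]) => id);
  const conflicts = [...byLocation]
    .filter(([, group]) => new Set(group.map((f) => f.claim)).size > 1)
    .map(([loc]) => loc);
  return { duplicateIds, conflicts };
}

const merged = [
  { bugId: "BUG-1", file: "a.js", lines: "10", claim: "unchecked input" },
  { bugId: "BUG-1", file: "b.js", lines: "5", claim: "race on write" },
  { bugId: "BUG-2", file: "a.js", lines: "10", claim: "input is sanitized upstream" },
];
const report = checkConsistency(merged);
console.log(report.duplicateIds.join(","), report.conflicts.join(","));
// prints "BUG-1 a.js:10"
```

Duplicate IDs arise naturally when each chunk numbers its findings from 1, so renumbering after the merge is one plausible fix. Conflicts at the same location need the evidence comparison described in step 3.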

If TOTAL FINDINGS: 0, skip Skeptic and Referee. Go to Step 7 (Final Report) in SKILL.md.

---

## Step 6: Run Skeptic(s)

Dispatch 1-2 Skeptics by directory using the standard dispatch pattern (see `_dispatch.md`, role=`skeptic`).

Split bugs by directory/service for focused scope. Merge results.

---

## Step 7: Run Referee

Dispatch Referee using the standard dispatch pattern (see `_dispatch.md`, role=`referee`).

Pass merged Hunter findings + Skeptic challenges.

---

## After Step 7

Proceed to **Step 7** (Final Report) in SKILL.md.

If `--loop` was specified and coverage is incomplete, the ralph-loop will iterate to cover remaining files.
package/modes/single-file.md
ADDED
@@ -0,0 +1,38 @@
# Single-File Mode (1 file)

All phases are dispatched using `AGENT_BACKEND` selected during SKILL preflight.
Recon is skipped — a single file doesn't need codebase mapping.

---

## Step 4: Run Hunter

Dispatch Hunter using the standard dispatch pattern (see `_dispatch.md`, role=`hunter`).

Pass the single file path as the file list. No risk map needed — the file is implicitly CRITICAL.

For `local-sequential`: read the prompt file and scan the single file yourself.

After completion, read `.bug-hunter/findings.md`.

If TOTAL FINDINGS: 0, go to Step 7 (Final Report) in SKILL.md.

---

## Step 5: Run Skeptic

Dispatch Skeptic using the standard dispatch pattern (see `_dispatch.md`, role=`skeptic`).

Inject the Hunter's findings.

After completion, read `.bug-hunter/skeptic.md`.

---

## Step 6: Run Referee

Dispatch Referee using the standard dispatch pattern (see `_dispatch.md`, role=`referee`).

Inject Hunter + Skeptic reports.

After completion, read `.bug-hunter/referee.md`. Go to Step 7 (Final Report) in SKILL.md.
package/modes/small.md
ADDED
@@ -0,0 +1,86 @@
# Small Mode (2–10 files)

This mode handles small scan targets where all files fit in a single pass.
All phases are dispatched using the `AGENT_BACKEND` selected during SKILL preflight.

---

## Triage Integration

Before any phase, check for `.bug-hunter/triage.json` (written by Step 1). If present:
- Use `triage.riskMap` as the risk map — skip Recon's file classification.
- Use `triage.scanOrder` as the Hunter's file order.
- Recon becomes an enrichment pass: identify tech stack and trust boundary patterns only.

---

## Step 4: Run Recon

Dispatch Recon using the standard dispatch pattern (see `_dispatch.md`, role=`recon`).

**If triage data exists**, tell Recon to use the triage risk map and only identify tech stack + patterns. Pass the triage JSON path as phase-specific context.

**If no triage data**, Recon does full file discovery and classification.

After Recon completes, read `.bug-hunter/recon.md` to extract the risk map and tech stack.

Report architecture summary to user.

---

## Step 5: Run Hunter

Dispatch Hunter using the standard dispatch pattern (see `_dispatch.md`, role=`hunter`).

Pass to the Hunter:
- File list in risk-map order (CRITICAL → HIGH → MEDIUM). If triage exists, use `triage.scanOrder`.
- Risk map from Recon (or triage).
- Tech stack from Recon.
- `doc-lookup.md` contents as phase-specific context.

After completion, read `.bug-hunter/findings.md`.

If TOTAL FINDINGS: 0, skip Skeptic and Referee. Go to Step 7 (Final Report) in SKILL.md.

---

## Step 5-verify: Gap-fill check

Compare the Hunter's FILES SCANNED list against the risk map.

If any CRITICAL or HIGH files appear in FILES SKIPPED:

**local-sequential:** Read the missed files yourself now and scan them for bugs. Append new findings to `.bug-hunter/findings.md`.

**subagent/teams:** Launch a second Hunter on ONLY the missed files using the standard dispatch pattern. Merge gap findings into `.bug-hunter/findings.md`.

---

## Step 6: Run Skeptic

Dispatch Skeptic using the standard dispatch pattern (see `_dispatch.md`, role=`skeptic`).

Pass to the Skeptic:
- Hunter findings from `.bug-hunter/findings.md` (compact format: bugId, severity, file, lines, claim, evidence, runtimeTrigger).
- Tech stack from Recon.
- `doc-lookup.md` contents as phase-specific context.

After completion, read `.bug-hunter/skeptic.md`.

---

## Step 7: Run Referee

Dispatch Referee using the standard dispatch pattern (see `_dispatch.md`, role=`referee`).

Pass to the Referee:
- Hunter findings from `.bug-hunter/findings.md`.
- Skeptic challenges from `.bug-hunter/skeptic.md`.

After completion, read `.bug-hunter/referee.md`.

---

## After Step 7

Proceed to **Step 7** (Final Report) in SKILL.md. The Referee output in `.bug-hunter/referee.md` provides the confirmed bugs table, dismissed findings, and coverage stats needed for the final report.
package/package.json
ADDED
@@ -0,0 +1,56 @@
{
  "name": "@codexstar/bug-hunter",
  "version": "3.0.0",
  "description": "Adversarial AI bug hunter — multi-agent pipeline finds security vulnerabilities, logic errors, and runtime bugs, then fixes them autonomously. Works with Claude Code, Cursor, Codex CLI, Copilot, Kiro, and more.",
  "license": "MIT",
  "type": "commonjs",
  "bin": {
    "bug-hunter": "bin/bug-hunter"
  },
  "engines": {
    "node": ">=18.0.0"
  },
  "keywords": [
    "ai",
    "agent",
    "skill",
    "bug-hunter",
    "security",
    "vulnerability",
    "code-review",
    "auto-fix",
    "claude-code",
    "cursor",
    "codex",
    "copilot",
    "kiro",
    "multi-agent",
    "adversarial",
    "static-analysis"
  ],
  "files": [
    "bin/",
    "scripts/",
    "prompts/",
    "templates/",
    "modes/",
    "evals/",
    "SKILL.md",
    "README.md",
    "CHANGELOG.md",
    "LICENSE"
  ],
  "repository": {
    "type": "git",
    "url": "git+https://github.com/codexstar69/bug-hunter.git"
  },
  "bugs": {
    "url": "https://github.com/codexstar69/bug-hunter/issues"
  },
  "homepage": "https://github.com/codexstar69/bug-hunter#readme",
  "scripts": {
    "test": "node --test scripts/tests/*.test.cjs",
    "doctor": "node bin/bug-hunter doctor",
    "postinstall": "node -e \"console.log('\\n Run: bug-hunter install to set up the skill\\n Run: bug-hunter doctor to check your environment\\n')\""
  }
}