@codexstar/bug-hunter 3.0.0 → 3.0.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (77)
  1. package/CHANGELOG.md +149 -83
  2. package/README.md +150 -15
  3. package/SKILL.md +94 -27
  4. package/agents/openai.yaml +4 -0
  5. package/bin/bug-hunter +9 -3
  6. package/docs/images/2026-03-12-fix-plan-rollout.png +0 -0
  7. package/docs/images/2026-03-12-hero-bug-hunter-overview.png +0 -0
  8. package/docs/images/2026-03-12-machine-readable-artifacts.png +0 -0
  9. package/docs/images/2026-03-12-pr-review-flow.png +0 -0
  10. package/docs/images/2026-03-12-security-pack.png +0 -0
  11. package/docs/images/adversarial-debate.png +0 -0
  12. package/docs/images/doc-verify-fix-plan.png +0 -0
  13. package/docs/images/hero.png +0 -0
  14. package/docs/images/pipeline-overview.png +0 -0
  15. package/docs/images/security-finding-card.png +0 -0
  16. package/docs/plans/2026-03-11-structured-output-migration-plan.md +288 -0
  17. package/docs/plans/2026-03-12-audit-bug-fixes-surgical-plan.md +193 -0
  18. package/docs/plans/2026-03-12-enterprise-security-pack-e2e-plan.md +59 -0
  19. package/docs/plans/2026-03-12-local-security-skills-integration-plan.md +39 -0
  20. package/docs/plans/2026-03-12-pr-review-strategic-fix-flow.md +78 -0
  21. package/evals/evals.json +366 -102
  22. package/modes/extended.md +2 -2
  23. package/modes/fix-loop.md +30 -30
  24. package/modes/fix-pipeline.md +32 -6
  25. package/modes/large-codebase.md +14 -15
  26. package/modes/local-sequential.md +44 -20
  27. package/modes/loop.md +56 -56
  28. package/modes/parallel.md +3 -3
  29. package/modes/scaled.md +2 -2
  30. package/modes/single-file.md +3 -3
  31. package/modes/small.md +11 -11
  32. package/package.json +10 -1
  33. package/prompts/fixer.md +37 -23
  34. package/prompts/hunter.md +39 -20
  35. package/prompts/referee.md +34 -20
  36. package/prompts/skeptic.md +25 -22
  37. package/schemas/coverage.schema.json +67 -0
  38. package/schemas/examples/findings.invalid.json +13 -0
  39. package/schemas/examples/findings.valid.json +17 -0
  40. package/schemas/findings.schema.json +76 -0
  41. package/schemas/fix-plan.schema.json +94 -0
  42. package/schemas/fix-report.schema.json +105 -0
  43. package/schemas/fix-strategy.schema.json +99 -0
  44. package/schemas/recon.schema.json +31 -0
  45. package/schemas/referee.schema.json +46 -0
  46. package/schemas/shared.schema.json +51 -0
  47. package/schemas/skeptic.schema.json +21 -0
  48. package/scripts/bug-hunter-state.cjs +35 -12
  49. package/scripts/code-index.cjs +11 -4
  50. package/scripts/fix-lock.cjs +95 -25
  51. package/scripts/payload-guard.cjs +24 -10
  52. package/scripts/pr-scope.cjs +181 -0
  53. package/scripts/render-report.cjs +346 -0
  54. package/scripts/run-bug-hunter.cjs +667 -32
  55. package/scripts/schema-runtime.cjs +273 -0
  56. package/scripts/schema-validate.cjs +40 -0
  57. package/scripts/tests/bug-hunter-state.test.cjs +68 -3
  58. package/scripts/tests/code-index.test.cjs +15 -0
  59. package/scripts/tests/fix-lock.test.cjs +60 -2
  60. package/scripts/tests/fixtures/flaky-worker.cjs +6 -1
  61. package/scripts/tests/fixtures/low-confidence-worker.cjs +8 -2
  62. package/scripts/tests/fixtures/success-worker.cjs +6 -1
  63. package/scripts/tests/payload-guard.test.cjs +154 -2
  64. package/scripts/tests/pr-scope.test.cjs +212 -0
  65. package/scripts/tests/render-report.test.cjs +180 -0
  66. package/scripts/tests/run-bug-hunter.test.cjs +686 -2
  67. package/scripts/tests/security-skills-integration.test.cjs +29 -0
  68. package/scripts/tests/skills-packaging.test.cjs +30 -0
  69. package/scripts/tests/worktree-harvest.test.cjs +66 -0
  70. package/scripts/worktree-harvest.cjs +62 -9
  71. package/skills/README.md +19 -0
  72. package/skills/commit-security-scan/SKILL.md +63 -0
  73. package/skills/security-review/SKILL.md +57 -0
  74. package/skills/threat-model-generation/SKILL.md +47 -0
  75. package/skills/vulnerability-validation/SKILL.md +59 -0
  76. package/templates/subagent-wrapper.md +12 -3
  77. package/modes/_dispatch.md +0 -121
@@ -1,121 +0,0 @@
- # Shared Dispatch Patterns
-
- This file defines how to dispatch each pipeline role (Recon, Hunter, Skeptic, Referee, Fixer) using any `AGENT_BACKEND`. Mode files reference this instead of duplicating dispatch boilerplate.
-
- ---
-
- ## Dispatch by Backend
-
- ### local-sequential
-
- You execute the role yourself:
-
- 1. Read the prompt file: `read({ path: "$SKILL_DIR/prompts/<role>.md" })`
- 2. If the role needs doc-lookup: also read `$SKILL_DIR/prompts/doc-lookup.md`
- 3. **Switch mindset** to the role (important for Skeptic/Referee — genuinely adversarial)
- 4. Execute the role's instructions using the Read tool to examine source files
- 5. Write output to the role's output file (see Output Files table below)
-
- ### subagent
-
- 1. Read the prompt file: `read({ path: "$SKILL_DIR/prompts/<role>.md" })`
- 2. Read the wrapper template: `read({ path: "$SKILL_DIR/templates/subagent-wrapper.md" })`
- 3. Generate payload:
- ```bash
- node "$SKILL_DIR/scripts/payload-guard.cjs" generate <role> ".bug-hunter/payloads/<role>-<context>.json"
- ```
- 4. Edit the payload JSON — fill in `skillDir`, `targetFiles`, and role-specific fields
- 5. Validate:
- ```bash
- node "$SKILL_DIR/scripts/payload-guard.cjs" validate <role> ".bug-hunter/payloads/<role>-<context>.json"
- ```
- 6. Fill the subagent-wrapper template variables:
- - `{ROLE_NAME}` = role name (see table below)
- - `{ROLE_DESCRIPTION}` = role description (see table below)
- - `{PROMPT_CONTENT}` = full contents of the prompt .md file
- - `{TARGET_DESCRIPTION}` = what is being scanned
- - `{SKILL_DIR}` = absolute path to skill directory
- - `{FILE_LIST}` = files in scan order (CRITICAL first)
- - `{RISK_MAP}` = risk classification from triage or Recon
- - `{TECH_STACK}` = framework, auth, DB from Recon
- - `{PHASE_SPECIFIC_CONTEXT}` = role-specific context (see below)
- - `{OUTPUT_FILE_PATH}` = output file path
- 7. Dispatch:
- ```
- subagent({ agent: "<role>-agent", task: "<filled template>", output: "<output-path>" })
- ```
- 8. Read the output file after completion
-
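The template-filling step in the removed subagent instructions (step 6) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the package's implementation: `fillTemplate` is a hypothetical helper, the two-line template is made up, and only the `{UPPER_SNAKE}` placeholder syntax is taken from the removed file.

```javascript
"use strict";

// Hypothetical helper (not part of the package): substitute {NAME}
// placeholders in a wrapper template with values from `vars`.
function fillTemplate(template, vars) {
  const missing = new Set();
  // A single pass with a replacer function: substituted values are not
  // re-scanned, so prompt content that itself contains braces stays intact.
  const filled = template.replace(/\{([A-Z_]+)\}/g, (match, name) => {
    if (Object.prototype.hasOwnProperty.call(vars, name)) {
      return String(vars[name]);
    }
    missing.add(name);
    return match; // keep the placeholder so the error message can name it
  });
  if (missing.size > 0) {
    throw new Error(`Unfilled template variables: ${[...missing].join(", ")}`);
  }
  return filled;
}

// Made-up template for illustration; the real wrapper uses more variables.
const template =
  "You are {ROLE_NAME}: {ROLE_DESCRIPTION}\nWrite results to {OUTPUT_FILE_PATH}.";

const task = fillTemplate(template, {
  ROLE_NAME: "hunter",
  ROLE_DESCRIPTION: "find behavioral bugs in source code",
  OUTPUT_FILE_PATH: ".bug-hunter/findings.md",
});
console.log(task);
```

Failing loudly on an unfilled placeholder is a design choice of this sketch; it keeps a half-filled prompt from being dispatched to a subagent.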
- ### teams
-
- Same as subagent, but dispatch with:
- ```
- teams({ tasks: [{ text: "<filled template>" }], maxTeammates: 1 })
- ```
-
- ### interactive_shell
-
- ```
- interactive_shell({ command: 'pi "<filled task prompt>"', mode: "dispatch" })
- ```
-
- ---
-
- ## Role Reference
-
- | Role | Prompt File | Role Description | Output File | Phase-Specific Context |
- |------|-------------|-----------------|-------------|----------------------|
- | `recon` | `prompts/recon.md` | Reconnaissance agent — map the codebase and classify files by risk | `.bug-hunter/recon.md` | Triage JSON path (if exists) |
- | `hunter` | `prompts/hunter.md` | Bug Hunter — find behavioral bugs in source code | `.bug-hunter/findings.md` | `doc-lookup.md` + risk map + tech stack |
- | `skeptic` | `prompts/skeptic.md` | Skeptic — adversarial review to disprove false positives | `.bug-hunter/skeptic.md` | Hunter findings (compact: bugId, severity, file, lines, claim, evidence, runtimeTrigger) + `doc-lookup.md` |
- | `referee` | `prompts/referee.md` | Referee — impartial final judge of all findings | `.bug-hunter/referee.md` | Hunter findings + Skeptic challenges |
- | `fixer` | `prompts/fixer.md` | Surgical code fixer — implement minimal fixes for confirmed bugs | `.bug-hunter/fix-report.md` | Confirmed bugs from Referee + tech stack + `doc-lookup.md` |
-
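One way to read the removed Role Reference table is as a lookup from role name to prompt and output paths. The sketch below assumes exactly that; the paths are copied from the table, while `ROLES`, `resolveRole`, and the example skill directory are hypothetical names introduced for illustration.

```javascript
"use strict";

const path = require("node:path");

// Prompt and output paths as listed in the removed Role Reference table.
const ROLES = {
  recon:   { prompt: "prompts/recon.md",   output: ".bug-hunter/recon.md" },
  hunter:  { prompt: "prompts/hunter.md",  output: ".bug-hunter/findings.md" },
  skeptic: { prompt: "prompts/skeptic.md", output: ".bug-hunter/skeptic.md" },
  referee: { prompt: "prompts/referee.md", output: ".bug-hunter/referee.md" },
  fixer:   { prompt: "prompts/fixer.md",   output: ".bug-hunter/fix-report.md" },
};

// Hypothetical helper: resolve the prompt path against the skill directory.
// path.posix keeps the example output the same on every platform.
function resolveRole(role, skillDir) {
  if (!Object.prototype.hasOwnProperty.call(ROLES, role)) {
    throw new Error(`Unknown role: ${role}`);
  }
  const entry = ROLES[role];
  return {
    promptPath: path.posix.join(skillDir, entry.prompt),
    outputPath: entry.output,
  };
}

// "/opt/skills/bug-hunter" is an assumed install location.
console.log(resolveRole("skeptic", "/opt/skills/bug-hunter"));
```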
- ---
-
- ## Fixer Dispatch: Worktree Isolation (subagent/teams only)
-
- When `WORKTREE_MODE=true`, the Fixer runs in a managed git worktree for isolation. The orchestrator handles the full lifecycle — the Fixer just edits and commits.
-
- **Key differences from other role dispatches:**
-
- 1. The worktree is created by the orchestrator via `worktree-harvest.cjs prepare` BEFORE dispatch.
- 2. The Fixer's working directory is set to the worktree's absolute path, not the project root.
- 3. The Fixer MUST `git add` + `git commit` each fix (uncommitted work = `FIX_FAILED`).
- 4. The orchestrator harvests commits via `worktree-harvest.cjs harvest` AFTER dispatch.
- 5. The orchestrator cleans up via `worktree-harvest.cjs cleanup` AFTER harvest.
-
- **CRITICAL — do NOT use `isolation: "worktree"` on the Agent tool:**
- The Agent tool's built-in worktree isolation creates an ephemeral branch and auto-cleans on exit, which loses Fixer commits. We manage worktrees ourselves so the Fixer commits land directly on the fix branch.
-
- **Fixer-specific template variables for `{PHASE_SPECIFIC_CONTEXT}`:**
- - `WORKTREE_DIR: <absolute path to worktree>`
- - `FIX_BRANCH: <branch name>`
- - `COMMIT_FORMAT: fix(bug-hunter): BUG-N — [description]`
- - Worktree isolation rules (see `{WORKTREE_RULES}` in subagent-wrapper.md)
-
- **Lifecycle diagram:**
- ```
- Orchestrator                            Fixer (in worktree)
-   |                                     |
-   |-- prepare (worktree-harvest.cjs) -->|
-   |                                     |-- read code
-   |                                     |-- edit files
-   |                                     |-- git add + commit per bug
-   |                                     |-- report done
-   |<-- harvest (worktree-harvest.cjs) --|
-   |-- cleanup (worktree-harvest.cjs)    |
-   |-- verify on fix branch              |
- ```
-
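The ordering in the removed lifecycle diagram (prepare, dispatch, harvest, cleanup, verify) can be sketched from the orchestrator's side as below. This is only an illustration: the `steps` object and `runFixerInWorktree` are hypothetical, the arguments of the `worktree-harvest.cjs` subcommands are not shown in the removed file and are not modelled here, and running cleanup in a `finally` block is an assumption of this sketch rather than documented behavior.

```javascript
"use strict";

// Hypothetical orchestrator wrapper. Each entry in `steps` stands in for a
// real phase (for example, invoking worktree-harvest.cjs prepare/harvest/cleanup).
// Synchronous for brevity.
function runFixerInWorktree(steps) {
  const worktree = steps.prepare();        // created BEFORE dispatch
  let harvested;
  try {
    steps.dispatchFixer(worktree);         // Fixer edits and commits in the worktree
    harvested = steps.harvest(worktree);   // collect commits AFTER dispatch
  } finally {
    steps.cleanup(worktree);               // assumption: clean up even if a step throws
  }
  steps.verify(harvested);                 // verify on the fix branch
  return harvested;
}

// Stub phases that only record the order in which they are called.
const calls = [];
const harvestedCommits = runFixerInWorktree({
  prepare: () => {
    calls.push("prepare");
    return { dir: "/tmp/example-worktree", branch: "example-fix-branch" };
  },
  dispatchFixer: () => { calls.push("dispatch"); },
  harvest: () => { calls.push("harvest"); return ["example-commit-id"]; },
  cleanup: () => { calls.push("cleanup"); },
  verify: () => { calls.push("verify"); },
});

console.log(calls.join(" -> "));
// prints "prepare -> dispatch -> harvest -> cleanup -> verify"
```

Passing the phases in as functions keeps the ordering logic testable without touching git.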
- ---
-
- ## Context Pruning Rules
-
- When passing data between phases, include only what the receiving role needs:
-
- **To Skeptic:** For each bug: BUG-ID, severity, file, lines, claim, evidence, runtimeTrigger, cross-references. Omit: Hunter's internal reasoning, scan coverage stats, FILES SCANNED/SKIPPED metadata.
-
- **To Referee:** Full Hunter findings + full Skeptic challenges. The Referee needs both sides to judge.
-
- **To Fixer:** For each confirmed bug: BUG-ID, severity, file, line range, description, suggested fix direction, tech stack context. Omit: Skeptic challenges, Referee reasoning.
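The Skeptic pruning rule in the removed section amounts to an allow-list over each finding. A minimal sketch follows, assuming findings are plain objects: the keys `bugId` through `runtimeTrigger` follow the compact field list in the removed Role Reference table, while `crossReferences`, `internalReasoning`, and all sample values are assumed names and data used only for illustration.

```javascript
"use strict";

// Allow-list of fields passed to the Skeptic. `crossReferences` is an
// assumed key name; the removed file only says "cross-references".
const SKEPTIC_FIELDS = [
  "bugId", "severity", "file", "lines",
  "claim", "evidence", "runtimeTrigger", "crossReferences",
];

// Copy only the allow-listed keys that are actually present on the finding.
function pick(obj, keys) {
  const out = {};
  for (const key of keys) {
    if (Object.prototype.hasOwnProperty.call(obj, key)) {
      out[key] = obj[key];
    }
  }
  return out;
}

// Hypothetical helper: everything not on the allow-list is dropped.
function pruneForSkeptic(findings) {
  return findings.map((finding) => pick(finding, SKEPTIC_FIELDS));
}

// Sample finding with made-up values.
const sampleFindings = [{
  bugId: "BUG-1",
  severity: "high",
  file: "src/example.js",
  lines: "10-14",
  claim: "example claim",
  evidence: "example evidence",
  runtimeTrigger: "example trigger",
  internalReasoning: "Hunter's scratch notes (omitted for the Skeptic)",
}];

console.log(pruneForSkeptic(sampleFindings));
```

An allow-list fails safe here: a new internal field added to the Hunter output is omitted by default instead of leaking into the Skeptic's context.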