pan-wizard 2.9.1 → 3.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (75)
  1. package/README.md +31 -9
  2. package/agents/pan-conductor.md +189 -0
  3. package/agents/pan-counterfactual.md +112 -0
  4. package/agents/pan-debugger.md +15 -1
  5. package/agents/pan-distiller.md +82 -0
  6. package/agents/pan-document_code.md +21 -0
  7. package/agents/pan-executor.md +16 -0
  8. package/agents/pan-hardener.md +113 -0
  9. package/agents/pan-integration-checker.md +2 -0
  10. package/agents/pan-knowledge.md +81 -0
  11. package/agents/pan-meta-reviewer.md +91 -0
  12. package/agents/pan-optimizer.md +242 -0
  13. package/agents/pan-plan-checker.md +2 -0
  14. package/agents/pan-previewer.md +98 -0
  15. package/agents/pan-project-researcher.md +4 -4
  16. package/agents/pan-reviewer.md +2 -0
  17. package/agents/pan-verifier.md +2 -0
  18. package/bin/install-lib.cjs +197 -0
  19. package/bin/install.js +2048 -1959
  20. package/commands/pan/cost.md +132 -0
  21. package/commands/pan/exec-phase.md +15 -0
  22. package/commands/pan/focus-auto.md +168 -3
  23. package/commands/pan/focus-exec.md +21 -1
  24. package/commands/pan/focus-scan.md +6 -0
  25. package/commands/pan/git.md +223 -0
  26. package/commands/pan/knowledge.md +129 -0
  27. package/commands/pan/learn.md +61 -0
  28. package/commands/pan/map-codebase.md +15 -0
  29. package/commands/pan/mcp-bridge.md +145 -0
  30. package/commands/pan/milestone-done.md +9 -0
  31. package/commands/pan/optimize.md +86 -0
  32. package/commands/pan/plan-phase.md +11 -0
  33. package/commands/pan/preview.md +114 -0
  34. package/commands/pan/profile.md +37 -0
  35. package/commands/pan/review-deep.md +128 -0
  36. package/commands/pan/verify-phase.md +11 -0
  37. package/commands/pan/what-if.md +146 -0
  38. package/hooks/dist/pan-cost-logger.js +102 -0
  39. package/hooks/dist/pan-statusline.js +154 -108
  40. package/hooks/dist/pan-trace-logger.js +197 -0
  41. package/package.json +1 -1
  42. package/pan-wizard-core/bin/lib/bridge.cjs +269 -0
  43. package/pan-wizard-core/bin/lib/bus.cjs +251 -0
  44. package/pan-wizard-core/bin/lib/codebase.cjs +118 -0
  45. package/pan-wizard-core/bin/lib/commands.cjs +1 -0
  46. package/pan-wizard-core/bin/lib/constants.cjs +44 -1
  47. package/pan-wizard-core/bin/lib/context-budget.cjs +27 -0
  48. package/pan-wizard-core/bin/lib/core.cjs +91 -6
  49. package/pan-wizard-core/bin/lib/cost.cjs +359 -0
  50. package/pan-wizard-core/bin/lib/distill.cjs +510 -0
  51. package/pan-wizard-core/bin/lib/focus.cjs +108 -3
  52. package/pan-wizard-core/bin/lib/git.cjs +407 -0
  53. package/pan-wizard-core/bin/lib/init.cjs +5 -5
  54. package/pan-wizard-core/bin/lib/knowledge.cjs +331 -0
  55. package/pan-wizard-core/bin/lib/memory.cjs +252 -0
  56. package/pan-wizard-core/bin/lib/optimize.cjs +653 -0
  57. package/pan-wizard-core/bin/lib/phase.cjs +40 -13
  58. package/pan-wizard-core/bin/lib/preview.cjs +480 -0
  59. package/pan-wizard-core/bin/lib/review-deep.cjs +280 -0
  60. package/pan-wizard-core/bin/lib/roadmap.cjs +4 -4
  61. package/pan-wizard-core/bin/lib/state.cjs +2 -2
  62. package/pan-wizard-core/bin/lib/verify.cjs +34 -1
  63. package/pan-wizard-core/bin/lib/whatif.cjs +289 -0
  64. package/pan-wizard-core/bin/pan-tools.cjs +317 -4
  65. package/pan-wizard-core/templates/playbook.md +53 -0
  66. package/pan-wizard-core/templates/preview-report.md +93 -0
  67. package/pan-wizard-core/templates/roadmap.md +24 -24
  68. package/pan-wizard-core/templates/state.md +12 -9
  69. package/pan-wizard-core/workflows/exec-phase.md +97 -0
  70. package/pan-wizard-core/workflows/learn.md +91 -0
  71. package/pan-wizard-core/workflows/optimize.md +139 -0
  72. package/pan-wizard-core/workflows/plan-phase.md +28 -1
  73. package/pan-wizard-core/workflows/quick.md +7 -0
  74. package/pan-wizard-core/workflows/verify-phase.md +16 -0
  75. package/scripts/build-hooks.js +3 -1
@@ -0,0 +1,132 @@
1
+ ---
2
+ name: pan:cost
3
+ group: Observability
4
+ description: Show token usage and estimated cost across PAN commands and agents
5
+ argument-hint: "[report|append|clear] [--format json|table|chart] [--since YYYY-MM-DD] [--until YYYY-MM-DD]"
6
+ allowed-tools:
7
+ - Read
8
+ - Bash
9
+ ---
10
+
11
+ <objective>
12
+ Report token usage and estimated cost across all PAN invocations in this project.
13
+
14
+ Reads `.planning/metrics/tokens.jsonl` — an append-only log where each line is one call (agent or command) with token counts and model. Cost is computed from a built-in rate table (overridable via `.planning/config.json` → `cost.rates`).
15
+
16
+ Default output is JSON for piping. Use `--format table` for human-readable tables or `--format chart` for an ASCII bar chart of daily spend.
17
+ </objective>
18
+
19
+ <execution_context>
20
+ @~/.claude/pan-wizard-core/bin/lib/cost.cjs
21
+ </execution_context>
22
+
23
+ <subcommands>
24
+
25
+ ### `report` (default)
26
+
27
+ Aggregate all records into totals + breakdowns by agent, command, tier, and day.
28
+
29
+ ```
30
+ pan-tools cost report [--format json|table|chart] [--since YYYY-MM-DD] [--until YYYY-MM-DD]
31
+ ```
32
+
33
+ **Flags:**
34
+ - `--format` — `json` (default, for tools) | `table` (aligned text columns) | `chart` (per-day ASCII bars).
35
+ - `--since` — ISO date lower bound (inclusive). Records without `ts` always pass.
36
+ - `--until` — ISO date upper bound (inclusive).
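The window flags above can be read as an inclusive date filter. The following is a minimal sketch under stated assumptions, not the code in `cost.cjs`: the function name `inWindow` is hypothetical, and comparing only the `YYYY-MM-DD` prefix of `ts` is an assumption about how the bounds are applied.

```javascript
// Hypothetical sketch of the --since / --until window; not taken from cost.cjs.
// Assumes `ts` is an ISO-8601 string, so its first 10 characters are YYYY-MM-DD
// and plain string comparison orders the dates correctly.
function inWindow(record, since = null, until = null) {
  if (!record.ts) return true; // records without `ts` always pass
  const day = String(record.ts).slice(0, 10);
  if (since && day < since) return false; // lower bound is inclusive
  if (until && day > until) return false; // upper bound is inclusive
  return true;
}

const records = [
  { agent: 'pan-planner', ts: '2026-04-01T09:00:00Z' },
  { agent: 'pan-executor', ts: '2026-04-18T17:30:00Z' },
  { agent: 'pan-verifier' }, // no timestamp, so it is never filtered out
];
const kept = records.filter((r) => inWindow(r, '2026-04-10', '2026-04-18'));
console.log(kept.map((r) => r.agent)); // [ 'pan-executor', 'pan-verifier' ]
```

Because untimestamped records always pass, a windowed total can include calls made outside the window; that is worth keeping in mind when comparing against a provider bill.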
37
+
38
+ **JSON output shape:**
39
+ ```json
40
+ {
41
+ "totals": {
42
+ "calls": 42,
43
+ "input_tokens": 123456,
44
+ "output_tokens": 4567,
45
+ "cache_read_tokens": 50000,
46
+ "cache_write_tokens": 5000,
47
+ "cost_usd": 2.1234,
48
+ "cost_unknown": 0
49
+ },
50
+ "cache_hit_rate_pct": 40.5,
51
+ "by_agent": { "pan-planner": { "calls": 8, "input": 50000, ... } },
52
+ "by_command": { ... },
53
+ "by_tier": { ... },
54
+ "by_day": { "2026-04-18": { ... } },
55
+ "window": { "since": null, "until": null }
56
+ }
57
+ ```
58
+
59
+ ### `append`
60
+
61
+ Append a single cost record. Normally called by instrumented agent spawns; users rarely invoke directly.
62
+
63
+ ```
64
+ pan-tools cost append \
65
+ [--agent <name>] [--command <name>] [--model <id>] [--tier reasoning|mid|fast] \
66
+ [--input-tokens N] [--output-tokens N] \
67
+ [--cache-read-tokens N] [--cache-write-tokens N] \
68
+ [--phase <num>] [--session <id>]
69
+ ```
70
+
71
+ Missing fields are stored as `null` / `0`. Cost is auto-computed when `model` or `tier` resolves to a known rate.
72
+
73
+ ### `clear`
74
+
75
+ Delete the cost log. Useful at the start of a billing cycle.
76
+
77
+ ```
78
+ pan-tools cost clear
79
+ ```
80
+
81
+ </subcommands>
82
+
83
+ <rate_table>
84
+ Default rates (USD per million tokens) as of 2026-04. Override per-model in `.planning/config.json`:
85
+
86
+ ```json
87
+ {
88
+ "cost": {
89
+ "rates": {
90
+ "claude-opus-4-7": { "input": 15.0, "output": 75.0, "cache_read": 1.5, "cache_write": 18.75 },
91
+ "my-custom-model": { "input": 1.0, "output": 2.0, "cache_read": 0.1, "cache_write": 1.25 }
92
+ }
93
+ }
94
+ }
95
+ ```
96
+
97
+ When a record has neither a known model nor a known tier, its cost is `null` and it counts toward `totals.cost_unknown`.
98
+ </rate_table>
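As an illustration of the arithmetic implied by a per-million-token rate table, here is a minimal sketch. It is not the implementation in `cost.cjs`: the names `RATES` and `estimateCostUsd` are hypothetical, the record field names are borrowed from the JSON totals shown earlier, and the tier fallback mentioned for `append` is deliberately left out.

```javascript
// Hypothetical sketch; names are illustrative and not taken from cost.cjs.
// Rates are USD per million tokens, copied from the override example above.
const RATES = {
  'claude-opus-4-7': { input: 15.0, output: 75.0, cache_read: 1.5, cache_write: 18.75 },
};

function estimateCostUsd(record, rates = RATES) {
  const rate = rates[record.model];
  if (!rate) return null; // unknown model: the record would count toward cost_unknown
  const weighted =
    (record.input_tokens || 0) * rate.input +
    (record.output_tokens || 0) * rate.output +
    (record.cache_read_tokens || 0) * rate.cache_read +
    (record.cache_write_tokens || 0) * rate.cache_write;
  return weighted / 1e6; // convert from per-million-token rates
}

const cost = estimateCostUsd({
  model: 'claude-opus-4-7',
  input_tokens: 1000000,
  output_tokens: 100000,
});
console.log(cost); // 22.5
```

In this sketch missing token counts are treated as `0`, matching the note that missing fields are stored as `null` / `0`.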
99
+
100
+ <workflow>
101
+
102
+ **Daily check:** run `/pan:cost --format chart` at the end of a working day to see the spend shape.
103
+
104
+ **Before shipping:** run `/pan:cost --since 2026-04-01 --format table` to get a total for the billing period.
105
+
106
+ **After an expensive run:** check `by_agent` and `by_command` to see which stage drove the spend.
107
+
108
+ **To reconcile with the provider bill:** providers report total tokens; PAN's log is append-only with ISO-8601 timestamps, so set `--since` / `--until` to the provider's billing window before comparing totals.
109
+
110
+ </workflow>
111
+
112
+ <instrumentation_note>
113
+
114
+ Token records are written by any caller that knows its usage, typically the host runtime or a wrapper. PAN ships the log format + aggregator (this command); the capture hook itself is opt-in (Wave 5 of Spec B v2). Where the hook is not enabled, records can be appended manually via `pan-tools cost append` or by external scripts reading the provider API.
115
+
116
+ If `.planning/metrics/tokens.jsonl` is empty, `/pan:cost` returns zero totals — the feature is inert, not broken.
117
+
118
+ </instrumentation_note>
119
+
120
+ <runtime_compatibility>
121
+
122
+ | Runtime | Support |
123
+ |---------|---------|
124
+ | Claude Code | Full — data format + aggregation + all output formats |
125
+ | OpenCode | Full aggregator; token capture depends on OpenCode's own hooks |
126
+ | Gemini | Full aggregator; token capture depends on Gemini CLI instrumentation |
127
+ | Codex | Full aggregator; token capture via external script |
128
+ | Copilot CLI | Full aggregator; Copilot doesn't currently expose per-call usage |
129
+
130
+ The aggregator is runtime-agnostic. What varies across runtimes is how records *get into* `tokens.jsonl` in the first place.
131
+
132
+ </runtime_compatibility>
@@ -61,6 +61,8 @@ Phase: $ARGUMENTS
61
61
  - `--skip-tests` — Skip automatic test generation after execution completes.
62
62
  - `--skip-review` — Skip automatic code review after execution completes.
63
63
  - `--fast` — Skip both test generation and code review (implies `--skip-tests --skip-review`).
64
+ - `--deep-review` (v3.4+) — After the normal reviewer step, also run `/pan:review-deep <phase>` (security audit via pan-hardener + cross-check via pan-meta-reviewer). Produces `.planning/reviews/<N>/deep-review.md`. Recommended for phases touching auth, payment, PII, migrations, or public APIs. Costs roughly 3× a normal review.
65
+ - `--hierarchical` (v3.4+, Claude + Opus 4.7 only) — Spawn `pan-conductor` as a top-level orchestrator that decomposes the phase and spawns executor/reviewer/verifier sub-agents in sequence. Bounded by a safety harness: max 2 nesting levels, 12 spawns per phase, a budget ceiling, and the `.planning/orchestration/abort` kill-switch. On non-Claude runtimes or older models, this flag is a no-op: it emits a warning and falls back to flat exec. Use only for large phases (≥4 autonomous plans) where the wall-clock reduction justifies the ~20-30% orchestration tax.
64
66
 
65
67
  Context files are resolved inside the workflow via `pan-tools init execute-phase` and per-subagent `<files_to_read>` blocks.
66
68
  </context>
@@ -85,6 +87,19 @@ Each execution stage has a restricted set of appropriate actions. Using the wron
85
87
  - Wave commit: git operations only — all code changes must be done before committing
86
88
  </action_gating>
87
89
 
90
+ <cache_priming>
91
+ **Before Discovery, prime the prompt cache once per invocation.** All subagents spawned within the next 5 minutes will hit the cache instead of re-sending the full context.
92
+
93
+ Run once:
94
+ ```
95
+ pan-tools cache prime --summary
96
+ ```
97
+
98
+ This returns `{blocks: [{path, bytes, cache}], total_bytes, sha}` for the cacheable set (project.md, requirements.md, roadmap.md, state.md, standards.md). The `sha` is stable across identical inputs, so repeated calls within the phase hit cached reads.
99
+
100
+ When spawning subagents for wave execution, include the cacheable block paths in each agent's system-context so the host runtime (Claude Code with Opus 4.7) can mark them `cache_control: ephemeral`. On non-Claude runtimes or older models, this step is a no-op — nothing breaks, just no savings.
101
+ </cache_priming>
102
+
88
103
  <process>
89
104
  Execute the execute-phase workflow from @~/.claude/pan-wizard-core/workflows/exec-phase.md end-to-end.
90
105
  Preserve all workflow gates (wave execution, checkpoint handling, verification, state updates, routing).
@@ -58,8 +58,10 @@ Which category should this auto campaign focus on?
58
58
  5. **docs** — Stale documentation, missing command descriptions (P5-P6)
59
59
  6. **optimize** — Performance bottlenecks, redundant computation, robustness hardening (P1-P4)
60
60
  7. **prompts** — Execute micro-prompt documents sequentially, or generate them from specs (P0-P6)
61
+ 8. **security** — OWASP Top 10 violations, STRIDE threats, auth/injection/crypto hardening (P0-P2)
62
+ 9. **distill** — AI code-bloat: phantom try/catch, unused imports, repeated blocks, premature abstraction, god functions (P1-P5)
61
63
 
62
- Reply with a number (1-7) or category name.
64
+ Reply with a number (1-9) or category name.
63
65
  ```
64
66
 
65
67
  **After the user replies, map their response to a category name:**
@@ -70,6 +72,8 @@ Reply with a number (1-7) or category name.
70
72
  - "5" or "docs" → SELECTED_CATEGORY = docs
71
73
  - "6" or "optimize" → SELECTED_CATEGORY = optimize
72
74
  - "7" or "prompts" → SELECTED_CATEGORY = prompts
75
+ - "8" or "security" → SELECTED_CATEGORY = security
76
+ - "9" or "distill" → SELECTED_CATEGORY = distill
73
77
 
74
78
  Wait for the user's reply before proceeding. Do not guess or pick a default category.
75
79
 
@@ -85,11 +89,12 @@ Wait for the user's reply before proceeding. Do not guess or pick a default cate
85
89
  ```
86
90
  /pan:focus-auto [--category CAT] [--mode MODE] [--budget N] [--max-cycles N]
87
91
  [--total-budget N] [--continue] [--stop] [--status] [--dry-run]
92
+ [--deep-review]
88
93
  ```
89
94
 
90
95
  | Flag | Default | Description |
91
96
  |------|---------|-------------|
92
- | `--category` | null (all) | cleanup, tests, stability, features, docs, optimize, prompts |
97
+ | `--category` | null (all) | cleanup, tests, stability, features, docs, optimize, prompts, security, distill |
93
98
  | `--mode` | category-dependent | bugfix, balanced, features, full |
94
99
  | `--budget` | category-dependent | Points per cycle (5-100) |
95
100
  | `--max-cycles` | 10 | Maximum iterations (1-50) |
@@ -98,6 +103,7 @@ Wait for the user's reply before proceeding. Do not guess or pick a default cate
98
103
  | `--stop` | — | Gracefully stop active run |
99
104
  | `--status` | — | Show current campaign progress |
100
105
  | `--dry-run` | — | Show plan without executing |
106
+ | `--deep-review` | off | After every exec cycle, run an inline OWASP security check on the changed files. A `block` or `review_required` verdict stops the campaign (the 6th layer of the safety harness). Works with all categories. |
101
107
 
102
108
  ## Category Defaults
103
109
 
@@ -110,6 +116,7 @@ Wait for the user's reply before proceeding. Do not guess or pick a default cate
110
116
  | docs | P5-P6 | balanced | 30 |
111
117
  | optimize | P1-P4 | balanced | 50 |
112
118
  | prompts | P0-P6 | balanced | 100 |
119
+ | security | P0-P2 | bugfix | 40 |
113
120
 
114
121
  ## Pipeline
115
122
 
@@ -173,6 +180,11 @@ Perform a deep codebase scan to find actionable work items with evidence.
173
180
  - **features:** roadmap items not yet implemented, README promises without backing code
174
181
  - **docs:** stale documentation, missing command descriptions
175
182
  - **optimize:** N+1 operations (file I/O / network calls inside loops), redundant re-computation (`JSON.parse`/`stringify` of same data), synchronous blocking in async modules (`readFileSync`/`execSync` alongside async exports), algorithmic complexity (nested `.find()`/`.filter()` in loops creating O(n²)+), unnecessary allocations in hot paths (spread in loops, string concat vs `join()`), regex construction inside loops (should be hoisted), unbounded collection growth (`.push()` without size limits), swallowed errors (`catch {}` / `catch { /* */ }`), suboptimal data structures (array `.includes()` where Set is better), dead assignments, unguarded property access on nullable values (`.length`/`.split()`/`.match()[0]` without null check)
183
+ - **security:** Three-pass approach:
184
+ - **Pass 1 — Injection & crypto (inline grep):** Scan source files for `eval(`, `execSync`, `exec(`, string concatenation in SQL patterns (`` `SELECT...${`` / `"SELECT..."+`), `md5(`/`sha1(`/`createHash('md5'`/`createHash('sha1'`, hardcoded secrets (`password\s*=\s*['"]`, `api_key\s*=\s*['"]`, `secret\s*=\s*['"]`), and `Math.random()` used for security purposes.
185
+ - **Pass 2 — Auth & access control (inline grep):** Routes without auth middleware (look for `router.get/post/put/delete` without preceding `app.use(...auth...)`), `req.params.id` used directly without ownership check, `JSON.parse(` on `req.body` without schema validation, CORS `origin: '*'` or `Access-Control-Allow-Origin: *`, verbose errors that expose stack traces (`res.json({ stack:`).
186
+ - **Pass 3 — Semantic depth (Agent tool, optional):** For M/L items where grep found a suspicious pattern but fix guidance needs code-path tracing, use the Agent tool with Explore subagent to read the specific file and confirm exploitability before including in the batch.
187
+ - **Classification:** Map findings to priorities: OWASP critical/exploit-ready → P0, High/auth-bypass → P1, Medium/defense-in-depth → P2. Drop LOW/INFO — they don't meet the P0-P2 filter.
176
188
  - **prompts:** Two operational modes — detect which applies:
177
189
  - **Execute mode:** Find micro-prompt documents (`.md` files containing ordered prompt blocks, e.g., `## Prompt 1`, `## Prompt 2`, or numbered checklist items `- [ ] Prompt: ...`). Look in `.planning/`, project root, and `docs/` for files matching patterns: `*prompts*`, `*micro-prompt*`, `*prompt-plan*`, `*prompt-sequence*`. Each unchecked/incomplete prompt block is one work item.
178
190
  - **Generate mode:** Find specification documents (files matching `*spec*`, `*prd*`, `*requirements*`, `*feature*` in `.planning/`, `docs/specs/`, project root) that do NOT already have a corresponding micro-prompt document. Each spec needing decomposition is one work item.
@@ -271,6 +283,32 @@ A failed item never blocks subsequent items.
271
283
  5. Stage specific changed files (not `git add -A`) and commit with accurate message listing only verified items
272
284
  6. Count: `items_completed`, `items_failed`, `points_used`
273
285
 
286
+ **If `--deep-review` flag is active (run after commit, before recording cycle):**
287
+
288
+ Get changed files from this cycle's commit:
289
+ ```bash
290
+ CHANGED=$(git diff HEAD~1 --name-only 2>/dev/null | grep -E '\.(js|ts|jsx|tsx|py|go|rb|java|php)$')
291
+ ```
292
+
293
+ Run inline OWASP security check on changed files only:
294
+ - Grep each changed file for critical patterns:
295
+ - Injection: `eval(`, `execSync(`, SQL string concat (`` `SELECT...${`` ), `child_process.exec(`
296
+ - Crypto: `createHash('md5'`, `createHash('sha1'`, `Math.random()` near auth/token/secret context
297
+ - Auth bypass: routes with no auth guard added, `req.params` used as DB key without ownership check
298
+ - Secrets: `password\s*=\s*['"]`, `apiKey\s*=\s*['"]`, `token\s*=\s*['"]` assigned to a literal value
299
+ - Score findings by severity: critical (exploit-ready) → BLOCK; high (auth/injection surface) → WARN; medium/low → LOG
300
+
301
+ **Handle deep-review verdict:**
302
+
303
+ | Severity found | Verdict | Action |
304
+ |---------------|---------|--------|
305
+ | Critical pattern in changed file | `block` | STOP campaign — do NOT record cycle, revert last commit, present finding to user |
306
+ | High pattern in changed file | `review_required` | STOP campaign — record cycle as completed, flag finding, recommend manual review |
307
+ | Medium/low only | `ok_with_minor` | Continue — append findings to `.planning/focus/security-log-<date>.md` |
308
+ | No patterns | `ok` | Continue silently |
309
+
310
+ Write all non-ok findings to `.planning/focus/security-log-<date>.md` with file:line references.
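The verdict table above amounts to "take the worst severity found and act on it". A minimal sketch of that rule follows; the function name `deepReviewVerdict`, the `SEVERITY_RANK` table, and the shape of the returned object are all hypothetical and only illustrate the table, not PAN's actual code.

```javascript
// Hypothetical sketch of the deep-review verdict table; names are illustrative.
const SEVERITY_RANK = { critical: 3, high: 2, medium: 1, low: 1 };

function deepReviewVerdict(findings) {
  // The worst severity across all findings in the changed files decides the verdict.
  const worst = findings.reduce(
    (max, f) => Math.max(max, SEVERITY_RANK[f.severity] || 0),
    0
  );
  if (worst === 3) {
    // Critical: stop, revert the last commit, do not record the cycle.
    return { verdict: 'block', stop: true, revertCommit: true, recordCycle: false };
  }
  if (worst === 2) {
    // High: stop, but the cycle is still recorded as completed.
    return { verdict: 'review_required', stop: true, revertCommit: false, recordCycle: true };
  }
  if (worst === 1) {
    // Medium/low only: continue, findings go to the security log.
    return { verdict: 'ok_with_minor', stop: false, revertCommit: false, recordCycle: true };
  }
  return { verdict: 'ok', stop: false, revertCommit: false, recordCycle: true };
}

console.log(deepReviewVerdict([{ severity: 'medium' }, { severity: 'critical' }]).verdict);
```

Findings with an unrecognized severity are ignored in this sketch; a real implementation might prefer to treat them conservatively.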
311
+
274
312
  #### Step 2.4: Record Cycle
275
313
 
276
314
  Run: `pan-tools focus auto --update --items-completed N --items-failed N --points-used N --tests-before N --tests-after N --batch-file <path>`
@@ -282,6 +320,8 @@ Check the response for stop conditions:
282
320
  - `zero_completed`: No items completed in this cycle — go to Phase 3
283
321
  - `diminishing_returns`: Optimize only — cycle efficiency < 30% of previous cycle — go to Phase 3
284
322
  - `prompts_complete`: Prompts only — all prompts in document executed — go to Phase 3
323
+ - `security_complete`: Security only — scan found no HIGH/CRITICAL items remaining — go to Phase 3
324
+ - `deep_review_block`: `--deep-review` only — critical pattern detected in changed files — go to Phase 3 with warning
285
325
  - `null`: Continue to next cycle
286
326
 
287
327
  #### Step 2.5: Inter-Cycle Context Management
@@ -293,6 +333,24 @@ Between cycles, manage context to prevent quality degradation over long campaign
293
333
 
294
334
  Display one-line cycle summary: `Cycle N/M | X/Y pts | Z items done | Tests: A -> B`
295
335
 
336
+ #### Step 2.5a: Reflection Gate (Opus 4.7 thinking-capable models only)
337
+
338
+ Before committing to the next cycle, call the reflection helper:
339
+
340
+ ```
341
+ echo '{"run": <run-state>, "cycle": <just-completed-cycle>, "batch": <proposed-next-batch>, "tier": "reasoning"}' \
342
+ | pan-tools focus reflection
343
+ ```
344
+
345
+ The helper returns `{reflect: true, prompt: "..."}` when the current model tier supports extended thinking. If `reflect: true`, think through the prompt — which asks whether running another cycle is worthwhile given telemetry and remaining items — and respond with JSON: `{"continue": true|false, "rationale": "..."}`.
346
+
347
+ - If `continue: false`: stop the campaign and treat as a user-reason stop (preserve state, skip to Phase 3).
348
+ - If `continue: true`: proceed to the next cycle.
349
+
350
+ If the helper returns `reflect: false` (tier doesn't support thinking, or `reflection_enabled: false` in run state, or no next batch): skip this step silently and continue to the next cycle.
351
+
352
+ The reflection gate catches "zero progress" or "wrong category" drift earlier than the automatic stop rules.
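The branching described for the reflection gate can be summarized in a few lines. This is a sketch only: `nextStep` is a hypothetical name, and treating a missing or malformed model answer as "continue" is an assumption the text does not spell out.

```javascript
// Hypothetical sketch of the reflection-gate decision; not PAN's actual code.
// helperReply: what `pan-tools focus reflection` returned, e.g. { reflect: true, prompt: "..." }
// modelAnswer: the JSON the model produced, e.g. { continue: false, rationale: "..." }
function nextStep(helperReply, modelAnswer) {
  if (!helperReply || !helperReply.reflect) {
    return 'next-cycle'; // reflect: false means the gate is skipped silently
  }
  if (modelAnswer && modelAnswer.continue === false) {
    return 'phase-3'; // treated as a user-reason stop; state is preserved
  }
  return 'next-cycle'; // assumption: anything other than an explicit false continues
}

console.log(nextStep({ reflect: true, prompt: 'Is another cycle worthwhile?' },
                     { continue: false, rationale: 'no items left in category' }));
```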
353
+
296
354
  **Attention anchor — emit after every cycle summary:**
297
355
  ```
298
356
  Remaining: {cycles_left} cycles | {budget_remaining}/{total_budget} pts | Safety: {active_harness_warnings}
@@ -323,7 +381,7 @@ Then continue immediately to the next cycle (back to Step 2.1).
323
381
 
324
382
  3. Remove safety tag: `git tag -d focus-auto-baseline 2>/dev/null`
325
383
 
326
- ## 5-Layer Safety Harness
384
+ ## 6-Layer Safety Harness
327
385
 
328
386
  | Layer | Mechanism | Action |
329
387
  |-------|-----------|--------|
@@ -332,6 +390,7 @@ Then continue immediately to the next cycle (back to Step 2.1).
332
390
  | Iteration limit | `--max-cycles N` | Hard stop on loop count |
333
391
  | Regression circuit breaker | tests_after < tests_before | Immediate stop, status=stopped |
334
392
  | Zero-completed guard | 0 items done in a cycle | Stop — further cycles won't help |
393
+ | Security gate (`--deep-review`) | Critical/high OWASP pattern in changed files | Revert last commit (critical) or flag for manual review (high), stop campaign |
335
394
 
336
395
  ## 9 Behavioral Rules
337
396
 
@@ -430,6 +489,112 @@ When a specification document is found that doesn't have a matching micro-prompt
430
489
 
431
490
  **After generation:** The document is written and committed. The next cycle will detect it in execute mode and begin executing prompts sequentially.
432
491
 
492
+ ## Security Category — Execution Details
493
+
494
+ The security category scans for OWASP Top 10 violations (the A0x category IDs in the table below follow the 2021 numbering) and STRIDE threats, then fixes them cycle by cycle until the scan returns zero HIGH/CRITICAL findings.
495
+
496
+ ### Scan approach (Step 2.1)
497
+
498
+ Three passes per cycle:
499
+
500
+ **Pass 1 — Fast grep scan (always runs):**
501
+
502
+ | OWASP | Grep pattern | Priority |
503
+ |-------|-------------|---------|
504
+ | A03 Injection | `eval(`, `execSync(`, `` `SELECT.*\${ ``, `child_process.exec(` | P0 |
505
+ | A02 Crypto | `createHash\(['"]md5\|sha1`, `Math\.random\(\)` near auth/token | P0 |
506
+ | A01 Access | Route without auth middleware, IDOR (raw `req.params.id` to DB) | P1 |
507
+ | A05 Misconfig | `origin:\s*['"]?\*`, `Access-Control-Allow-Origin: \*`, stack in response | P1 |
508
+ | A07 Auth | No session expiry, credentials in URL params | P1 |
509
+ | A04 Design | Missing rate-limit on auth/payment endpoints | P2 |
510
+ | A09 Logging | Security events (`login`, `payment`, `admin`) with no log call nearby | P2 |
511
+
512
+ **Pass 2 — Structural check (always runs):**
513
+ - Read route files and check: does every mutating endpoint (POST/PUT/PATCH/DELETE) have auth middleware before the handler?
514
+ - Check for hardcoded secrets: grep for `['"][A-Za-z0-9_]{20,}['"]` assigned to variables named `key`/`token`/`secret`/`password`/`apiKey`
515
+ - Check for prototype pollution risk: `Object.assign(req.body)` or spread from untrusted input into a stored object
516
+
517
+ **Pass 3 — Semantic depth (Agent tool, for M/L items only):**
518
+ When a pattern match needs code-path confirmation, spawn an Explore subagent:
519
+ > "Read [file]. Confirm whether [line N] is reachable from an unauthenticated request path and whether the input is sanitized before use."
520
+
521
+ Use the confirmation to decide whether to include the item at P0/P1 or drop it as a false positive.
522
+
523
+ ### Item classification
524
+
525
+ | Hardener severity | Focus priority | Example |
526
+ |------------------|----------------|---------|
527
+ | Critical | P0 | `eval(req.body.code)` — direct RCE |
528
+ | High | P1 | Auth bypass on admin route |
529
+ | Medium | P2 | Rate-limiting absent on login |
530
+ | Low / Info | DROP | Missing security header on non-sensitive route |
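The classification table, together with the `security_complete` stop rule (zero P0/P1 items), can be sketched as a small mapping. The names `PRIORITY_BY_SEVERITY`, `classify`, and `securityComplete` are hypothetical and exist only for this illustration.

```javascript
// Hypothetical sketch of severity-to-priority classification; names are illustrative.
const PRIORITY_BY_SEVERITY = { critical: 'P0', high: 'P1', medium: 'P2' };

function classify(findings) {
  return findings
    .map((f) => ({ ...f, priority: PRIORITY_BY_SEVERITY[String(f.severity).toLowerCase()] }))
    .filter((f) => f.priority !== undefined); // low / info findings are dropped
}

// security_complete: no P0 or P1 items remain; P2 items may still be present.
function securityComplete(items) {
  return items.every((item) => item.priority !== 'P0' && item.priority !== 'P1');
}

const items = classify([
  { id: 'eval-on-request-body', severity: 'Critical' },
  { id: 'no-rate-limit-on-login', severity: 'Medium' },
  { id: 'missing-header', severity: 'Low' },
]);
console.log(items.map((i) => `${i.id}:${i.priority}`));
console.log(securityComplete(items));
```

With the sample input, the low-severity finding is dropped and the campaign is not yet complete because a P0 item remains.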
531
+
532
+ ### Execution (Step 2.3)
533
+
534
+ Treat each security item as a STANDARD or FULL item regardless of effort estimate:
535
+
536
+ 1. **State threat:** "This is [OWASP category]. The exploit path is: [attacker does X → Y → data/system compromised]."
537
+ 2. **Read the file** — confirm the pattern is real, not a false positive
538
+ 3. **Implement the fix** — use established patterns (parameterized queries, allowlists, bcrypt, rate-limit middleware)
539
+ 4. **Write or update the test** — every security fix MUST have a test that proves the vulnerability is closed (e.g., send the malicious payload, assert 400/403 not 200)
540
+ 5. **Run full test suite** — regression check before marking DONE
541
+
542
+ ### Stop condition
543
+
544
+ `security_complete` fires when the scan finds zero P0/P1 items. P2 items (medium) may remain — they won't stop the campaign unless `zero_completed` fires (no items at all).
545
+
546
+ A security campaign that ends with `security_complete` means: no critical or high OWASP violations found in the scanned files. Medium/low items can be addressed in subsequent targeted passes or documented as accepted risk.
547
+
548
+ ---
549
+
550
+ ## Distill Category — Execution Details
551
+
552
+ The `distill` category targets **AI-generated code bloat** with a 5-pass pipeline based on the SOTA agentic-refactoring architecture (deterministic-first, LLM-on-narrow-spans).
553
+
554
+ ### Pipeline
555
+
556
+ | Pass | What | Cost | Tier output |
557
+ |------|------|------|-------------|
558
+ | 1 | **Deterministic patterns** — phantom try/catch, unused imports, magic numbers, long functions, wide param lists | Free | safe / review |
559
+ | 2 | **AST-style analysis** — single-instance factories, deep nesting | Free | review |
560
+ | 3 | **Cross-file graph** — repeated 5+ line blocks, unreferenced exports | Free | review |
561
+ | 4 | **LLM judgment** — pan-distiller agent receives ONLY flagged spans (max 50 lines context per finding); validates pattern, refines tier, proposes minimal rewrite | LLM tokens | safe / review / risky |
562
+ | 5 | **Cross-session memory** — compares findings to `.planning/memory/distill-patterns.md`; flags **regressed** patterns ("we already fixed this") | Free | metadata |
563
+
564
+ ### Safety Tiers
565
+
566
+ | Tier | Rule | Action |
567
+ |------|------|--------|
568
+ | `safe` | Deterministic, behavior-preserving (e.g., remove unused import) | Auto-applied |
569
+ | `review_required` | Behavior preserved under invariants but human should verify | Surfaced to user |
570
+ | `risky` | Cross-file impact or might surface latent bugs | Never auto-applied |
571
+
572
+ A finding with confidence below 0.85 is automatically downgraded to `review_required` regardless of its original tier.
573
+
574
+ ### Bloat Budget
575
+
576
+ After each cycle, distill computes:
577
+ - **touched_loc** — total LOC modified in cycle
578
+ - **removable_loc** — sum of `loc_saved` across findings
579
+ - **essential_loc** — touched_loc − removable_loc
580
+ - **bloat ratio** — touched_loc / essential_loc
581
+
582
+ Default threshold: **2.0x**. If a cycle's ratio exceeds threshold, the bloat budget gate flags it for review.
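A worked instance of the bloat-budget arithmetic may help. The sketch below is hypothetical and not taken from `distill.cjs`: `bloatBudget` is an illustrative name, and returning an unbounded ratio when no essential lines remain is an assumption about an edge case the text does not cover.

```javascript
// Hypothetical sketch of the bloat budget computation; names are illustrative.
function bloatBudget(touchedLoc, findings, threshold = 2.0) {
  // removable_loc: sum of loc_saved across findings
  const removableLoc = findings.reduce((sum, f) => sum + (f.loc_saved || 0), 0);
  // essential_loc: touched_loc minus removable_loc
  const essentialLoc = touchedLoc - removableLoc;
  // Assumption: if nothing essential remains, treat the ratio as unbounded.
  const ratio = essentialLoc > 0 ? touchedLoc / essentialLoc : Infinity;
  return { touchedLoc, removableLoc, essentialLoc, ratio, flagged: ratio > threshold };
}

// 300 touched lines, 180 of them removable: 300 / 120 = 2.5, above the 2.0x default.
const budget = bloatBudget(300, [{ loc_saved: 120 }, { loc_saved: 60 }]);
console.log(budget.ratio, budget.flagged); // 2.5 true
```

Under the default threshold, a cycle is flagged once more than half of the touched lines are judged removable.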
583
+
584
+ ### Stop condition
585
+
586
+ `distill_complete` fires when the scan finds zero bloat findings. The codebase is fully distilled for the patterns the deterministic + AST + graph passes detect.
587
+
588
+ ### CLI
589
+
590
+ ```bash
591
+ node ~/.claude/pan-wizard-core/bin/pan-tools.cjs distill scan
592
+ node ~/.claude/pan-wizard-core/bin/pan-tools.cjs distill analyze [--touched-loc N] [--bloat-threshold X]
593
+ node ~/.claude/pan-wizard-core/bin/pan-tools.cjs distill report
594
+ ```
595
+
596
+ `scan` returns findings. `analyze` adds bloat budget + regressed pattern detection. `report` writes findings to `.planning/memory/distill-patterns.md` for the next session.
597
+
433
598
  <failure_pattern_capture>
434
599
  When the same failure pattern appears in 2+ items within a campaign, capture it for future runs.
435
600
 
@@ -116,6 +116,7 @@ HARD STOP conditions (do not proceed to next stage):
116
116
  - `--dry-run` — Run Stages 1-2 only (show what WOULD be executed)
117
117
  - `--no-commit` — Skip the commit step in Stage 6
118
118
  - `--continue` — Resume a previously interrupted execution
119
+ - `--deep-review` (v3.4+) — After each high-stakes item's execution, run `/pan:review-deep` for that item (security audit via pan-hardener + cross-check via pan-meta-reviewer). Each item that triggers the deep pass takes roughly 3× as long; use for batches touching auth/payment/migrations.
119
120
 
120
121
  ---
121
122
 
@@ -209,7 +210,14 @@ This catches emergent interactions: 5 "add try-catch" fixes might reveal the mod
209
210
  1. **Check Project Status** — git status, recent commits
210
211
  2. **Test Baseline** — run test suite, record current counts
211
212
  3. **Create rollback snapshot** — git tag for safety
212
- 4. **Report** — Output session start summary
213
+ 4. **Prime prompt cache** — `pan-tools cache prime --summary` (once; all sub-agents in the next 5 min hit cached context)
214
+ 5. **Report** — Output session start summary
215
+
216
+ **Circular optimization — init trace:**
217
+ ```bash
218
+ node ~/.claude/pan-wizard-core/bin/pan-tools.cjs optimize trace init \
219
+ --description "focus-exec session" --command "focus-exec" 2>/dev/null || true
220
+ ```
213
221
 
214
222
  **Record baseline:**
215
223
  ```
@@ -243,6 +251,13 @@ Display the execution batch to user, then continue automatically.
243
251
  ### 3.0 Pre-Execution Setup
244
252
  1. Cache project facts — do NOT re-read later
245
253
  2. Create/update progress tracker with the batch table
254
+ 3. Classify stages for parallel tool use:
255
+ ```
256
+ pan-tools focus classify-stages --raw
257
+ ```
258
+ The CLI reads the latest batch and returns `{waves, parallelism_hint}`. When `parallelism_hint` is `emit-micro-in-parallel` or `emit-standard-in-parallel`, all reads and greps for items in the current wave SHOULD be emitted in a single assistant turn (parallel tool calls). Opus 4.7 is markedly better at emitting parallel tool calls than earlier models; use that to collapse Stage 3 latency on MICRO-heavy batches.
259
+
260
+ Serialize on `FULL` tier items — each is its own wave.
246
261
 
247
262
  ### 3.1 Process Items by Tier
248
263
 
@@ -386,6 +401,11 @@ Unless `--no-commit`:
386
401
  - Record session summary (items completed, tests before/after, budget used)
387
402
  - Append error patterns if any failures occurred
388
403
 
404
+ ### 6.3.5 Circular optimization — end trace
405
+ ```bash
406
+ node ~/.claude/pan-wizard-core/bin/pan-tools.cjs optimize trace end 2>/dev/null || true
407
+ ```
408
+
389
409
  ### 6.4 Final Report
390
410
 
391
411
  ```markdown
@@ -58,6 +58,12 @@ When `/pan:focus-scan` is invoked, execute all phases without stopping. Do not a
58
58
 
59
59
  ## Phase 0: Orientation & Baseline Snapshot
60
60
 
61
+ **Circular optimization — init trace:**
62
+ ```bash
63
+ node ~/.claude/pan-wizard-core/bin/pan-tools.cjs optimize trace init \
64
+ --description "focus-scan" --command "focus-scan" 2>/dev/null || true
65
+ ```
66
+
61
67
  ### 0.1 Read Current State
62
68
  Read these files to establish baseline:
63
69