@a-canary/pi-director 0.1.0

@@ -0,0 +1,99 @@
1
+ # Phase Loop
2
+
3
+ Core execution loop for a single PLAN.md phase. The director agent follows this for each phase.
4
+
5
+ ## Input
6
+ - PLAN.md with phases, steps, and gates
7
+ - CHOICES.md for context and constraints
8
+ - Current phase number (first incomplete)
9
+
10
+ ## Loop
11
+
12
+ ```
13
+ ┌──────────────────┐
14
+ │ 1. Read Gates │ ← PLAN.md + CHOICES.md
15
+ └────────┬─────────┘
16
+
17
+ ┌──────────────────┐
18
+ │ 2. Recon │ ← scout agents (operational, parallel, many tool calls)
19
+ └────────┬─────────┘
20
+
21
+ ┌──────────────────┐
22
+ │ 3. Plan │ ← planner agent (tactical, few tool calls)
23
+ └────────┬─────────┘
24
+
25
+ ┌──────────────────┐
26
+ │ 4. Critique │ ← critic agent (strategic, ZERO tools, decision tree)
27
+ └────────┬─────────┘
28
+
29
+ ┌──────────────────┐
30
+ │ 5. Finalize │ ← planner resolves decision tree branches (tactical)
31
+ └────────┬─────────┘
32
+
33
+ ┌──────────────────┐
34
+ │ 6. Build & Test │ ← builder + reviewer agents (operational/tactical)
35
+ └────────┬─────────┘
36
+
37
+ ┌──────────────────┐
38
+ │ 7. Gate Critique │ ← critic reviews results (strategic, ZERO tools)
39
+ └────────┬─────────┘
40
+ pass │ │ fail
41
+ ▼ ▼
42
+ ✅ Next → diagnose → retry or ❌ STOP
43
+ ```
44
+
45
+ ## Step Details
46
+
47
+ ### 1. Read Gates
48
+ - Parse PLAN.md for current phase's steps and gates
49
+ - Parse CHOICES.md for relevant decisions
50
+ - Identify: exit criteria, assumptions, blockers
51
+
52
+ ### 2. Recon
53
+ Spawn parallel read-only agents:
54
+ - **scout**: `find`, `grep`, `read` relevant codebase areas
55
+ - **scout** (web): context7/web-search if phase references external APIs/libraries
56
+ - Output: compressed context for builder handoff
57
+
58
+ ### 3. Plan (tactical)
59
+ Delegate to **planner** agent with recon findings:
60
+ - Planner synthesizes concrete steps with file paths and function names
61
+ - Few tool calls — reads specific files to confirm assumptions
62
+ - Produces structured plan for critique
63
+
64
+ ### 4. Critique (strategic — zero tools)
65
+ Delegate to **critic** agent with:
66
+ - Recon summary (from step 2)
67
+ - Proposed plan (from step 3)
68
+ - CHOICES.md context + priority ladder
69
+ - Critic produces: approval/improvements + **decision tree** (max 8 leaves) for unknowns
70
+ - Critic uses maximum thinking depth for elevated reasoning
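To illustrate the 8-leaf cap on the critic's decision tree, here is a minimal sketch. The `Node` class and the `count_leaves` and `within_leaf_cap` helpers are hypothetical names chosen for this example; the package does not define a tree format in this document.

```python
from dataclasses import dataclass, field
from typing import List

MAX_LEAVES = 8  # cap stated in the Critique step


@dataclass
class Node:
    """One node of a critic decision tree (hypothetical shape)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def count_leaves(node: Node) -> int:
    """Count nodes without children, i.e. the concrete outcomes."""
    if not node.children:
        return 1
    return sum(count_leaves(child) for child in node.children)


def within_leaf_cap(root: Node) -> bool:
    """True when the tree respects the max-8-leaves rule."""
    return count_leaves(root) <= MAX_LEAVES


# Example tree: one unknown with three possible outcomes.
tree = Node("Does an API client already exist?", [
    Node("yes", [Node("extend it"), Node("wrap it")]),
    Node("no", [Node("create it")]),
])
```

A planner could run such a check before resolving branches in step 5 and ask the critic to prune when it fails.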
71
+
72
+ ### 5. Finalize (tactical)
73
+ Delegate back to **planner** to incorporate critique:
74
+ - Resolve decision tree branches using tool calls (check conditions)
75
+ - Apply critic's improvements
76
+ - Delegate to **writer** to update PLAN.md
77
+ - If critique rejected the plan → rework or STOP
78
+
79
+ ### 6. Build & Test (operational)
80
+ - Delegate to **builder** agents (parallel when tasks touch independent files)
81
+ - After each builder: delegate to **reviewer** (tactical)
82
+ - Reviewer issues → delegate fixes to builder
83
+ - Run test suite after implementation
84
+
85
+ ### 7. Gate Critique (strategic — zero tools)
86
+ Delegate to **critic** with:
87
+ - Phase gate results + exit criteria
88
+ - Regression check results (see [regression-check.md](regression-check.md))
89
+ - Implementation summary
90
+ - Critic approves, or produces decision tree for remediation
91
+ - If critic rejects → diagnose and fix via tactical/operational agents, or STOP if infeasible
92
+ - All gates pass → mark phase complete in PLAN.md
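The control flow of the seven steps can be sketched roughly as follows. This is an illustration only: the step names, the `run_phase` function, and the `max_retries` limit are assumptions, and "diagnose and fix" is modelled simply as re-running Build & Test.

```python
from typing import Callable, Dict

# Steps 1-6 in the order the loop diagram lists them.
STEP_ORDER = ["read_gates", "recon", "plan", "critique", "finalize", "build_and_test"]

Step = Callable[[dict], dict]


def run_phase(steps: Dict[str, Step],
              gate_critique: Callable[[dict], bool],
              max_retries: int = 1) -> str:
    """Run one phase and return "complete" or "stop"."""
    context: dict = {}
    for name in STEP_ORDER:
        # Each step receives the accumulated context and returns an updated one.
        context = steps[name](context)

    attempts = 0
    while not gate_critique(context):  # step 7: Gate Critique
        if attempts >= max_retries:
            return "stop"  # remediation judged infeasible (assumed retry limit)
        attempts += 1
        # Assumption: remediation is a rebuild; a real director would diagnose first.
        context = steps["build_and_test"](context)
    return "complete"
```

Passing the steps in as callables keeps the sketch independent of how agents are actually spawned.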
93
+
94
+ ## Multi-Phase Mode
95
+
96
+ When user says "do all phases" or "implement the plan":
97
+ 1. Execute steps 1-7 for current phase
98
+ 2. On success, loop to step 1 for next phase
99
+ 3. Continue until all phases complete or hard stop hit
@@ -0,0 +1,63 @@
1
+ # Regression Check
2
+
3
+ Verify that changes don't regress higher-priority concerns (M-0100, A-0100).
4
+
5
+ ## When to Run
6
+ - After every build step (Step 6 of phase loop)
7
+ - As part of gate check (Step 7 of phase loop)
8
+ - Before marking any phase complete
9
+
10
+ ## Check Order
11
+
12
+ Run checks top-down through the priority ladder. Stop at first regression.
13
+
14
+ ### 1. UX Quality Regression
15
+ - Do all existing user-facing features still work?
16
+ - Has any error message become less clear?
17
+ - Has any workflow gained extra steps?
18
+ - Do interactive elements still respond correctly?
19
+
20
+ **How to verify:**
21
+ ```bash
22
+ # Run existing tests
23
+ npm test 2>&1
24
+ # Check for removed/changed public APIs
25
+ git diff --name-only HEAD~1 | grep -E '\.(ts|js|py)$'
26
+ # Manual: does the happy path still work?
27
+ ```
28
+
29
+ ### 2. Security Regression
30
+ - Are there new unvalidated inputs?
31
+ - Are secrets still protected (no hardcoded keys)?
32
+ - Are dependencies from trusted sources?
33
+ - Are permissions still properly scoped?
34
+
35
+ **How to verify:**
36
+ ```bash
37
+ # Check for hardcoded secrets
38
+ grep -rnE 'password|secret|api_key|token' --include='*.ts' --include='*.js' --exclude-dir=node_modules .
39
+ # Check new dependencies
40
+ git diff HEAD~1 -- package.json
41
+ ```
42
+
43
+ ### 3. Scale Regression (if past scale gate)
44
+ - Does the change add O(n²) or worse operations?
45
+ - Are there new unbounded loops or recursions?
46
+ - Are new resources properly cleaned up?
47
+
48
+ ### 4. Efficiency Check (informational only)
49
+ - Note any efficiency impacts but don't block
50
+ - Log as suggestion for future optimization
51
+
52
+ ## Output
53
+
54
+ ```markdown
55
+ ### Regression Check
56
+ - [x] UX Quality: {pass/fail — details}
57
+ - [x] Security: {pass/fail — details}
58
+ - [ ] Scale: {pass/fail — details}
59
+ - [ ] Efficiency: {noted — details}
60
+ ```
61
+
62
+ If any check fails at a level higher than the current work's priority:
63
+ → **Hard stop. Do not proceed.**
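The top-down order and the hard-stop rule could be sketched like this. The function names are hypothetical, and each check is assumed to be a callable that returns `True` on pass. A level left out of `checks` (for example Scale before the scale gate) is skipped.

```python
from typing import Callable, Dict, Optional

# Priority ladder, highest first, as listed under "Check Order".
LADDER = ["UX Quality", "Security", "Scale", "Efficiency"]
BLOCKING = {"UX Quality", "Security", "Scale"}  # Efficiency is informational only


def first_regression(checks: Dict[str, Callable[[], bool]]) -> Optional[str]:
    """Run blocking checks top-down; return the first failing level, else None."""
    for level in LADDER:
        if level in BLOCKING and level in checks and not checks[level]():
            return level  # stop at the first regression
    return None


def is_hard_stop(failed_level: Optional[str], work_level: str) -> bool:
    """Hard stop when the failed level outranks the current work's level."""
    if failed_level is None:
        return False
    return LADDER.index(failed_level) < LADDER.index(work_level)
```

For example, a Security failure found while doing Scale work gives `is_hard_stop("Security", "Scale") == True`.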
@@ -0,0 +1,48 @@
1
+ # Choose — Project Intent Clarification
2
+
3
+ Wrapper around pi-choose-wisely for managing CHOICES.md within the director workflow.
4
+
5
+ ## When to Use
6
+ - User asks to clarify project intent, scope, goals
7
+ - User runs `/choose`
8
+ - New project without CHOICES.md
9
+
10
+ ## Routing
11
+
12
+ | User Input | Delegate To |
13
+ |------------|-------------|
14
+ | No CHOICES.md exists | `pi-choose-wisely:choose-wisely` → bootstrap mode (scan docs, extract choices) |
15
+ | "audit" / "check" | `pi-choose-wisely:choose-wisely` → audit mode (8 validation checks) |
16
+ | "init" / "interview" | `pi-choose-wisely:choose-wisely` → interview mode (structured planning) |
17
+ | Describes a change | `pi-choose-wisely:choose-wisely` → change mode (apply + cascade) |
18
+ | "replan" / "plan" | `pi-choose-wisely:replan` → gap analysis → generate PLAN.md |
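As a rough sketch, the routing table above could be expressed as a small function. Whole-word keyword matching and the precedence between keywords are assumptions made for this example; the skill names are taken from the table.

```python
import re
from typing import Tuple

CHOOSE = "pi-choose-wisely:choose-wisely"
REPLAN = "pi-choose-wisely:replan"


def route(user_input: str, choices_exists: bool) -> Tuple[str, str]:
    """Return (skill, mode) for a /choose request."""
    words = set(re.findall(r"[a-z]+", user_input.lower()))
    if not choices_exists:
        return (CHOOSE, "bootstrap")
    if words & {"audit", "check"}:
        return (CHOOSE, "audit")
    if words & {"init", "interview"}:
        return (CHOOSE, "interview")
    if words & {"replan", "plan"}:
        return (REPLAN, "gap analysis")
    # Anything else is treated as a described change.
    return (CHOOSE, "change")
```

Matching on whole words avoids routing a sentence such as "explain the cache" to replan because it happens to contain "plan".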
19
+
20
+ ## Process
21
+
22
+ ### Step 1 — Delegate to pi-choose-wisely
23
+ Route the user's request to the appropriate pi-choose-wisely operation (see table above). Pass through all user context.
24
+
25
+ ### Step 2 — Post-Change Pipeline
26
+ After any CHOICES.md modification:
27
+
28
+ 1. **Cascade audit** — pi-choose-wisely runs this automatically (upward, lateral, downward checks)
29
+ 2. **Priority ladder check** — verify new/changed choices respect M-0100 ordering
30
+ 3. **Suggest replan** — if PLAN.md exists, ask: "CHOICES.md changed. Regenerate PLAN.md?" If yes, delegate to `pi-choose-wisely:replan`
31
+ 4. **Suggest /next** — "Run `/next` to see how this affects recommendations?"
32
+
33
+ ### Step 3 — New Choice Validation
34
+ For any new choice added, verify:
35
+ - Has a `Supports:` line (unless top-level Mission)
36
+ - ID is a fresh UID (never reuse or renumber existing IDs)
37
+ - Positioned correctly within its section (position = priority)
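The first two checks lend themselves to a mechanical test. The sketch below assumes a choice is available as an ID plus its body text; `validate_new_choice` is a hypothetical helper, and the position check is left out because it depends on the surrounding section.

```python
from typing import List, Set


def validate_new_choice(choice_id: str,
                        body: str,
                        existing_ids: Set[str],
                        is_top_level_mission: bool = False) -> List[str]:
    """Return a list of problems; an empty list means the choice passes."""
    problems: List[str] = []

    # A Supports: line is required unless this is the top-level Mission.
    has_supports = any(line.strip().startswith("Supports:")
                       for line in body.splitlines())
    if not has_supports and not is_top_level_mission:
        problems.append("missing Supports: line")

    # IDs are never reused, so a new choice must not collide with a known ID.
    if choice_id in existing_ids:
        problems.append("ID already in use")

    return problems
```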
38
+
39
+ See [pipeline.md](lib/pipeline.md) for the full intent-to-execution flow.
40
+
41
+ ## Integration Points
42
+ - After CHOICES.md changes → suggest `/build` if PLAN.md exists
43
+ - After CHOICES.md changes → suggest `/next` for impact analysis
44
+ - CHOICES.md gaps feed into `/next` recommendations via choice-scanner
45
+
46
+ ## Delegates To
47
+ - `pi-choose-wisely:choose-wisely` skill (all CHOICES.md operations)
48
+ - `pi-choose-wisely:replan` skill (gap analysis → PLAN.md generation)
@@ -0,0 +1,83 @@
1
+ # Intent-to-Execution Pipeline
2
+
3
+ How project intent flows from CHOICES.md through to running code.
4
+
5
+ ## Flow
6
+
7
+ ```
8
+ User Intent
9
+
10
+
11
+ ┌─────────────────┐
12
+ │ /choose │ ← Clarify WHY and WHAT
13
+ │ CHOICES.md │
14
+ └────────┬────────┘
15
+ │ changes
16
+
17
+ ┌─────────────────┐
18
+ │ replan │ ← Gap analysis: choices vs reality
19
+ │ PLAN.md │
20
+ └────────┬────────┘
21
+ │ phases
22
+
23
+ ┌─────────────────┐
24
+ │ /build │ ← Execute phases: HOW
25
+ │ code + tests │
26
+ └────────┬────────┘
27
+ │ results
28
+
29
+ ┌─────────────────┐
30
+ │ /next │ ← Analyze what happened
31
+ │ NEXT.md │
32
+ └────────┬────────┘
33
+ │ recommendations
34
+
35
+ Back to /choose or /build
36
+ ```
37
+
38
+ ## Artifact Lifecycle
39
+
40
+ ### CHOICES.md
41
+ - **Created by**: `/choose` (bootstrap or interview)
42
+ - **Modified by**: `/choose` (change + cascade)
43
+ - **Read by**: `/build` (constraints), `/next` (gap analysis)
44
+ - **Owned by**: pi-choose-wisely
45
+
46
+ ### PLAN.md
47
+ - **Created by**: replan (from CHOICES.md gap analysis)
48
+ - **Modified by**: `/build` (marks phases complete, refines steps)
49
+ - **Read by**: `/next` (incomplete phases), `/build` (current phase)
50
+ - **Owned by**: pi-choose-wisely:replan + pi-director:/build
51
+
52
+ ### NEXT.md
53
+ - **Created by**: `/next` (analysis engine)
54
+ - **Modified by**: User approval (items deferred/dismissed)
55
+ - **Read by**: User (recommendations), `/choose` (scope changes), `/build` (approved items)
56
+ - **Owned by**: pi-director:/next
57
+
58
+ ## Autonomy Boundary
59
+
60
+ CHOICES.md is the autonomy boundary:
61
+ - **Inside scope** → director acts freely (bugs, gaps, refactors aligned with choices)
62
+ - **Outside scope** → NEXT.md surfaces it, user decides via `/choose`
63
+
64
+ ```
65
+ CHOICES.md (user-steered)
66
+
67
+ ├── In scope? ──→ Director acts autonomously
68
+ │ (build, fix, refactor, test)
69
+
70
+ └── Out of scope? → NEXT.md (agent-discovered)
71
+
72
+ └── User accepts? → Update CHOICES.md → Director can act
73
+ ```
74
+
75
+ ## Cycle
76
+
77
+ 1. `/choose` → user steers intent (interview, feedback)
78
+ 2. replan → generate phases from intent
79
+ 3. `/build` → execute phases autonomously (within scope)
80
+ 4. `/next` → surface out-of-scope issues for user review
81
+ 5. User accepts items → back to `/choose`
82
+
83
+ Each cycle tightens alignment between intent and implementation.
@@ -0,0 +1,84 @@
1
+ # Next — Analysis & Recommendation Engine
2
+
3
+ Analyze project data and generate ranked recommendations in NEXT.md.
4
+
5
+ ## When to Use
6
+ - User asks "what should I do next?"
7
+ - User runs `/next`
8
+ - Nightly scheduled analysis
9
+
10
+ ## Data Sources
11
+ 1. **Session history** — `.pi/agent/sessions/*.jsonl` — patterns of repeated work, failed attempts
12
+ 2. **Correction logs** — `.pi/corrections.jsonl` — systematic failures (via pi-upskill)
13
+ 3. **Code analysis** — complexity, test coverage gaps, dead code, large files
14
+ 4. **CHOICES.md** — unimplemented choices, stale decisions
15
+ 5. **PLAN.md** — incomplete phases, blocked items
16
+ 6. **App output/logs** — runtime errors, performance issues
17
+
18
+ ## Process
19
+
20
+ ### Step 1 — Gather
21
+ Spawn **parallel** scout agents, one per scanner module:
22
+ - [Session scanner](lib/session-scanner.md): parse recent sessions for failure patterns, token waste, repeated manual fixes
23
+ - [Code scanner](lib/code-scanner.md): find complexity hotspots, untested code, large files, dead exports
24
+ - [Choice scanner](lib/choice-scanner.md): diff CHOICES.md against codebase reality
25
+ - [Log scanner](lib/log-scanner.md): parse app logs for recurring errors
26
+
27
+ Each scanner runs as an independent subagent. All four run in parallel.
28
+
29
+ ### Step 2 — Analyze
30
+ Synthesize findings into recommendation categories:
31
+ - **refactor** — code quality improvements with clear before/after
32
+ - **simplify** — remove unnecessary complexity, dead code, over-abstraction
33
+ - **scope-change** — CHOICES.md additions/removals based on evidence
34
+ - **ux-improvement** — user experience issues found in logs or session patterns
35
+ - **upskill** — repeated agent failures suggesting a new skill or rule
36
+ - **debt** — technical debt items with effort estimates
37
+
38
+ ### Step 3 — Rank
39
+ Apply the [ranking algorithm](lib/ranker.md):
40
+ - **Impact** × **Effort** × **Evidence** = priority score (1-27)
41
+ - Filter through priority ladder (M-0100): UX Quality > Security > Scale > Efficiency
42
+ - Flag any recommendation that would regress a higher priority with ⚠️
43
+
44
+ ### Step 4 — Write NEXT.md
45
+ Generate structured output:
46
+
47
+ ```markdown
48
+ # NEXT.md — Recommended Actions
49
+
50
+ Generated: {date}
51
+ Sources analyzed: {count} sessions, {count} corrections, {count} files
52
+
53
+ ## Priority 1: {title}
54
+ Category: refactor | Impact: high | Effort: small
55
+ Evidence: {what data supports this}
56
+ Action: {specific steps}
57
+
58
+ ## Priority 2: {title}
59
+ ...
60
+ ```
61
+
62
+ ### Step 5 — Classify & Route
63
+
64
+ **Within CHOICES.md scope** → director handles autonomously (no NEXT.md entry needed):
65
+ - Bug fixes aligned with existing choices
66
+ - Test failures for implemented features
67
+ - Implementation gaps for existing choices
68
+ - Refactors that support existing architecture decisions
69
+
70
+ **Outside CHOICES.md scope** → write to NEXT.md for user review:
71
+ - Problems that contradict CHOICES.md decisions
72
+ - Opportunities that expand beyond current scope
73
+ - New concerns not addressed by any existing choice
74
+ - Trade-offs that require user judgment
75
+
76
+ ### Step 6 — Present for Approval
77
+ Show NEXT.md items (scope-external only). User selects items to:
78
+ - Accept → feeds into `/choose` to update CHOICES.md, then director can act
79
+ - Defer (stays in NEXT.md for next cycle)
80
+ - Dismiss (removed with reason logged)
81
+
82
+ ## Output
83
+ - Writes `NEXT.md` in project root
84
+ - Returns summary of top recommendations for user selection
@@ -0,0 +1,51 @@
1
+ # Choice Scanner
2
+
3
+ Instructions for a scout agent to diff CHOICES.md against codebase reality.
4
+
5
+ ## Process
6
+
7
+ 1. Read CHOICES.md, extract all choices with IDs and titles
8
+ 2. For each choice, assess implementation status:
9
+
10
+ ### Status Assessment
11
+
12
+ | Status | Evidence |
13
+ |--------|----------|
14
+ | **Fulfilled** | Code exists, tests pass, feature works |
15
+ | **Partial** | Some code exists, incomplete or untested |
16
+ | **Not started** | No implementation evidence |
17
+ | **Stale** | Code exists but contradicts current choice wording |
18
+ | **Orphaned** | Implementation exists for a choice that was removed |
19
+
20
+ 3. Check for gaps:
21
+ - Choices in Technology section without matching `package.json` deps
22
+ - Choices in Architecture section without matching directory structure
23
+ - Choices in Features section without matching code paths
24
+
25
+ 4. Check PLAN.md alignment:
26
+ - Plan phases that reference removed/changed choices
27
+ - Completed plan phases whose choices have since changed
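The Technology gap check from step 3 could be sketched as follows. The mapping from choice ID to npm package name is a hypothetical input that a scout would have to extract from CHOICES.md; only the `dependencies` and `devDependencies` keys of `package.json` are consulted.

```python
import json
from typing import Dict, List


def missing_dependencies(technology_choices: Dict[str, str],
                         package_json_text: str) -> List[str]:
    """Return IDs of Technology choices whose package is not declared.

    technology_choices maps choice ID -> npm package name (assumed input).
    """
    manifest = json.loads(package_json_text)
    declared = set(manifest.get("dependencies", {})) | set(manifest.get("devDependencies", {}))
    return sorted(cid for cid, pkg in technology_choices.items()
                  if pkg not in declared)
```

Each returned ID would become one entry under "Gaps" in the output format below.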
28
+
29
+ ## Output Format
30
+
31
+ ```markdown
32
+ ## Choice-Reality Gap Analysis
33
+
34
+ CHOICES.md: {N} choices
35
+ Codebase: {M} source files
36
+
37
+ ### Status
38
+ - ✓ Fulfilled: {count} — {IDs}
39
+ - ◐ Partial: {count} — {IDs}
40
+ - ✗ Not started: {count} — {IDs}
41
+ - ⚠ Stale: {count} — {IDs}
42
+ - 👻 Orphaned: {count} — {descriptions}
43
+
44
+ ### Gaps
45
+ 1. {choice ID}: {what's missing}
46
+ 2. ...
47
+
48
+ ### PLAN.md Drift
49
+ 1. Phase {N}: {misalignment description}
50
+ 2. ...
51
+ ```
@@ -0,0 +1,57 @@
1
+ # Code Scanner
2
+
3
+ Instructions for a scout agent to find code quality issues.
4
+
5
+ ## Process
6
+
7
+ 1. Get project file list:
8
+ ```bash
9
+ find . -type f \( -name "*.ts" -o -name "*.js" -o -name "*.py" -o -name "*.go" -o -name "*.rs" \) \
10
+ -not -path '*/node_modules/*' -not -path '*/.git/*' -not -path '*/dist/*'
11
+ ```
12
+
13
+ 2. For each source file, check:
14
+
15
+ ### Complexity Hotspots
16
+ - Files > 300 lines: `wc -l` on each file
17
+ - Functions > 50 lines: grep for function definitions, count lines to next function/end
18
+ - Deeply nested code: grep for 4+ levels of indentation
19
+
20
+ ### Test Coverage Gaps
21
+ - Source files without corresponding test file (`*.test.*`, `*.spec.*`)
22
+ - Test files that exist but are empty or have no assertions
23
+ - Untested exports: public functions/classes not referenced in tests
24
+
25
+ ### Dead Code
26
+ - Exported functions not imported anywhere else
27
+ - Files not imported by any other file
28
+ - Unused dependencies in package.json
29
+
30
+ ### Dependency Health
31
+ - `package.json` deps vs what's actually imported
32
+ - Outdated major versions (if lockfile available)
33
+
34
+ ## Output Format
35
+
36
+ ```markdown
37
+ ## Code Analysis
38
+
39
+ Files scanned: {count}
40
+ Total lines: {count}
41
+
42
+ ### Complexity Hotspots
43
+ 1. `{path}` — {lines} lines, {reason}
44
+ 2. ...
45
+
46
+ ### Missing Tests
47
+ 1. `{path}` — no test file found
48
+ 2. ...
49
+
50
+ ### Dead Code Candidates
51
+ 1. `{path}:{export}` — not imported anywhere
52
+ 2. ...
53
+
54
+ ### Dependency Issues
55
+ 1. `{package}` — {issue}
56
+ 2. ...
57
+ ```
@@ -0,0 +1,55 @@
1
+ # Log Scanner
2
+
3
+ Instructions for a scout agent to parse application output logs.
4
+
5
+ ## Process
6
+
7
+ 1. Find log files:
8
+ ```bash
9
+ find . -name "*.log" -not -path '*/node_modules/*' -not -path '*/.git/*' 2>/dev/null
10
+ ls logs/ 2>/dev/null
11
+ ls .pi/*.log 2>/dev/null
12
+ ```
13
+
14
+ 2. Also check for common log patterns:
15
+ - `npm test` output (captured in CI or local)
16
+ - Docker logs if containerized
17
+ - stderr output captured in session files
18
+
19
+ 3. For each log source, extract:
20
+
21
+ ### Error Patterns
22
+ - Repeated error messages (same error 3+ times)
23
+ - Stack traces with common root causes
24
+ - Warnings that escalated to errors over time
25
+
26
+ ### Performance Signals
27
+ - Slow operations (timeout warnings, > 5s responses)
28
+ - Memory warnings
29
+ - Rate limit hits
30
+
31
+ ### Runtime Issues
32
+ - Deprecated API usage warnings
33
+ - Unhandled promise rejections
34
+ - Missing environment variables
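The "same error 3+ times" rule could be sketched as below. The log line shape (`ERROR` followed by a message) and the digit-collapsing normalisation are assumptions for this example; real logs may need a different pattern.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Assumed log shape: the word ERROR, then a colon or whitespace, then the message.
ERROR_RE = re.compile(r"\bERROR\b[:\s]+(.*)")


def recurring_errors(lines: Iterable[str], threshold: int = 3) -> List[Tuple[str, int]]:
    """Return (message, count) pairs for errors seen at least `threshold` times."""
    counts: Counter = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            # Collapse digit runs so messages differing only in numbers group together.
            message = re.sub(r"\d+", "N", match.group(1).strip())
            counts[message] += 1
    return [(msg, n) for msg, n in counts.most_common() if n >= threshold]
```

The pairs map directly onto the "Recurring Errors" entries of the output format.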
35
+
36
+ ## Output Format
37
+
38
+ ```markdown
39
+ ## Log Analysis
40
+
41
+ Log sources found: {count}
42
+ Period: {date range}
43
+
44
+ ### Recurring Errors
45
+ 1. `{error message}` — {count} occurrences — source: {file/component}
46
+ 2. ...
47
+
48
+ ### Performance Issues
49
+ 1. {description} — {frequency} — impact: {assessment}
50
+ 2. ...
51
+
52
+ ### Warnings
53
+ 1. {description} — {count} occurrences
54
+ 2. ...
55
+ ```
@@ -0,0 +1,72 @@
1
+ # Recommendation Ranker
2
+
3
+ How to score and rank recommendations from scanner outputs.
4
+
5
+ ## Input
6
+ Combined findings from all scanners: session, code, choice, log.
7
+
8
+ ## Scoring
9
+
10
+ Each recommendation gets three scores (1-3):
11
+
12
+ ### Impact (how much does fixing this improve the project?)
13
+ - **3 (high)**: Affects core UX, blocks features, or causes user-visible issues
14
+ - **2 (medium)**: Improves maintainability, reduces debt, prevents future problems
15
+ - **1 (low)**: Nice-to-have cleanup, minor optimization
16
+
17
+ ### Effort (how much work to fix? less work scores higher)
18
+ - **3 (small)**: < 1 hour, single file, clear fix
19
+ - **2 (medium)**: 1-4 hours, multiple files, some design needed
20
+ - **1 (large)**: > 4 hours, architectural change, needs planning
21
+
22
+ ### Evidence (how strong is the supporting data?)
23
+ - **3 (strong)**: 5+ signals from multiple sources
24
+ - **2 (moderate)**: 2-4 signals, or a single strong signal
25
+ - **1 (weak)**: 1 signal, inference-based
26
+
27
+ ### Priority Score
28
+ `priority = impact × effort × evidence` (max 27, min 1)
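A minimal sketch of the score, using the three scales defined above. The dictionary and function names are hypothetical; note that Effort is inverted, so a small fix contributes 3.

```python
IMPACT = {"high": 3, "medium": 2, "low": 1}
EFFORT = {"small": 3, "medium": 2, "large": 1}  # less work scores higher
EVIDENCE = {"strong": 3, "moderate": 2, "weak": 1}


def priority_score(impact: str, effort: str, evidence: str) -> int:
    """priority = impact x effort x evidence, each factor in 1..3."""
    return IMPACT[impact] * EFFORT[effort] * EVIDENCE[evidence]


# A high-impact, small, strongly evidenced fix gets the maximum score.
best = priority_score("high", "small", "strong")
```

Because each factor is 1, 2, or 3, only ten distinct scores can occur (1, 2, 3, 4, 6, 8, 9, 12, 18, 27), so ties are common and a secondary sort key may be useful.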
29
+
30
+ ## Categories
31
+
32
+ Assign each recommendation exactly one category:
33
+ - **refactor** — restructure code without changing behavior
34
+ - **simplify** — remove complexity, dead code, over-abstraction
35
+ - **scope-change** — add/remove/modify a CHOICES.md decision
36
+ - **ux-improvement** — improve user experience based on evidence
37
+ - **upskill** — create/modify agent skill or rule to prevent recurring failures
38
+ - **debt** — address accumulated technical debt
39
+
40
+ ## Priority Ladder Filter
41
+
42
+ After ranking, verify each recommendation respects M-0100:
43
+ - UX Quality recommendations always rank above Security-only items
44
+ - Security items rank above Scale-only items
45
+ - Scale items rank above Efficiency-only items
46
+ - Any recommendation that would regress a higher priority is flagged with ⚠️
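One way to apply the ladder is to sort by ladder level first and by score second. The sketch below assumes each recommendation has already been tagged with a single ladder level; the dictionary keys and `ladder_sort` are hypothetical.

```python
from typing import Dict, List

# M-0100 priority ladder, highest first.
LADDER = ["UX Quality", "Security", "Scale", "Efficiency"]


def ladder_sort(recs: List[Dict]) -> List[Dict]:
    """Order by ladder level, then by priority score descending.

    Each rec is a dict with "title", "level" (one of LADDER) and "score".
    """
    return sorted(recs, key=lambda r: (LADDER.index(r["level"]), -r["score"]))
```

With this key, a UX Quality item scoring 4 still ranks above an Efficiency item scoring 27, which is what the first three rules require.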
47
+
48
+ ## Scope Classification
49
+
50
+ Before ranking, classify each finding:
51
+
52
+ ### In-scope (autonomous)
53
+ The finding relates to an existing CHOICES.md decision. The director can act without approval.
54
+ - Mark as `scope: in` — these don't go to NEXT.md
55
+ - Route directly to `/build` or fix inline
56
+
57
+ ### Out-of-scope (needs approval)
58
+ The finding conflicts with, expands beyond, or is not covered by CHOICES.md.
59
+ - Mark as `scope: out` — these go to NEXT.md
60
+ - User must accept (→ update CHOICES.md) before director can act
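The split can be sketched with a deliberately simple heuristic: a finding is in scope when every choice ID it relates to exists in CHOICES.md and it does not contradict any of them. The `Finding` shape and this heuristic are assumptions; real classification needs the scout's judgment.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Finding:
    title: str
    choice_ids: Set[str] = field(default_factory=set)  # CHOICES.md IDs it relates to
    conflicts: bool = False  # True if it contradicts an existing choice


def classify(finding: Finding, known_ids: Set[str]) -> str:
    """Return "in" or "out", matching the `scope:` markers above."""
    covered = bool(finding.choice_ids) and finding.choice_ids <= known_ids
    return "in" if covered and not finding.conflicts else "out"


def split_by_scope(findings: List[Finding],
                   known_ids: Set[str]) -> Tuple[List[Finding], List[Finding]]:
    """Return (in_scope, out_of_scope), preserving input order."""
    in_scope = [f for f in findings if classify(f, known_ids) == "in"]
    out_of_scope = [f for f in findings if classify(f, known_ids) == "out"]
    return in_scope, out_of_scope
```

Only the second list would be scored and written to NEXT.md.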
61
+
62
+ ## Output
63
+
64
+ Top 10 out-of-scope recommendations for NEXT.md, sorted by priority score descending:
65
+
66
+ ```markdown
67
+ ## Priority 1: {title}
68
+ Category: {category} | Impact: {high/med/low} | Effort: {small/med/large} | Score: {N}
69
+ Evidence: {what data supports this — scanner, count, files}
70
+ Action: {specific steps to take}
71
+ Supports: {CHOICES.md IDs affected, if any}
72
+ ```
@@ -0,0 +1,53 @@
1
+ # Session Scanner
2
+
3
+ Instructions for a scout agent to analyze pi session history.
4
+
5
+ ## Input
6
+ - Session files: `.pi/agent/sessions/*.jsonl` (current project)
7
+ - Global sessions: `~/.pi/agent/sessions/*.jsonl` (cross-project patterns)
8
+
9
+ ## Process
10
+
11
+ 1. List session files, sorted by date (newest first), limit to last 14 days
12
+ 2. For each session file, parse JSONL — each line is a message object
13
+ 3. Look for these patterns:
14
+
15
+ ### Failure Patterns
16
+ - Messages with `tool_error` or failed bash commands (exit code != 0)
17
+ - Repeated attempts at the same operation (3+ tries = signal)
18
+ - User corrections ("no", "wrong", "actually", "I meant")
19
+
20
+ ### Token Waste
21
+ - Sessions > 50 messages without a commit (spinning)
22
+ - Large file reads followed by small edits (could have used grep)
23
+ - Repeated context re-establishment (agent forgot prior work)
24
+
25
+ ### Repeated Manual Work
26
+ - Same file edited across 3+ sessions (hotspot)
27
+ - Same bash command run across 3+ sessions (should be automated)
28
+ - Same question asked across sessions (missing documentation)
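A rough sketch of two of these checks follows. The message fields `role` and `text` are assumptions, since the session JSONL schema is not described here; the helper names are hypothetical.

```python
import json
import re
from collections import Counter
from typing import Dict, List, Set

# Correction phrases listed under Failure Patterns, matched as whole words.
CORRECTION_RE = re.compile(r"\b(no|wrong|actually|i meant)\b", re.IGNORECASE)


def parse_jsonl(text: str) -> List[dict]:
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def count_corrections(messages: List[dict]) -> int:
    """Count user messages that look like corrections (assumed fields: role, text)."""
    return sum(
        1 for m in messages
        if m.get("role") == "user" and CORRECTION_RE.search(m.get("text", ""))
    )


def seen_in_many_sessions(items_by_session: Dict[str, Set[str]],
                          threshold: int = 3) -> List[str]:
    """Items (files or commands) that appear in at least `threshold` sessions."""
    counts = Counter(item for items in items_by_session.values() for item in items)
    return sorted(item for item, n in counts.items() if n >= threshold)
```

Using sets per session means an item is counted once per session, which matches the "across 3+ sessions" wording.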
29
+
30
+ ## Output Format
31
+
32
+ ```markdown
33
+ ## Session Analysis
34
+
35
+ Period: {date range}
36
+ Sessions analyzed: {count}
37
+
38
+ ### Failure Patterns
39
+ 1. {pattern} — seen {N} times — files: {list}
40
+ 2. ...
41
+
42
+ ### Token Waste Signals
43
+ 1. {pattern} — est. {N} tokens wasted — sessions: {list}
44
+ 2. ...
45
+
46
+ ### Repeated Manual Work
47
+ 1. {file/command} — {N} occurrences — suggestion: {what to automate}
48
+ 2. ...
49
+
50
+ ### Hotspot Files
51
+ 1. `{path}` — touched in {N} sessions — last: {date}
52
+ 2. ...
53
+ ```