@a-canary/pi-director 0.1.0
- package/.pi/corrections.jsonl +3 -0
- package/CHOICES.md +244 -0
- package/PLAN.md +108 -0
- package/README.md +97 -0
- package/agents/README.md +42 -0
- package/agents/builder.md +37 -0
- package/agents/critic.md +74 -0
- package/agents/director.md +133 -0
- package/agents/planner.md +44 -0
- package/agents/reviewer.md +35 -0
- package/agents/scout.md +37 -0
- package/agents/writer.md +28 -0
- package/extensions/nightly-analysis.ts +229 -0
- package/package.json +39 -0
- package/skills/build/SKILL.md +53 -0
- package/skills/build/lib/hard-stops.md +66 -0
- package/skills/build/lib/phase-loop.md +99 -0
- package/skills/build/lib/regression-check.md +63 -0
- package/skills/choose/SKILL.md +48 -0
- package/skills/choose/lib/pipeline.md +83 -0
- package/skills/next/SKILL.md +84 -0
- package/skills/next/lib/choice-scanner.md +51 -0
- package/skills/next/lib/code-scanner.md +57 -0
- package/skills/next/lib/log-scanner.md +55 -0
- package/skills/next/lib/ranker.md +72 -0
- package/skills/next/lib/session-scanner.md +53 -0
- package/templates/NEXT.md +35 -0
- package/test/agents.test.ts +63 -0
- package/test/next-template.test.ts +41 -0
- package/test/package.test.ts +59 -0
- package/test/skills.test.ts +52 -0
package/.pi/corrections.jsonl
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
1
|
+
{"timestamp":"2026-03-17T14:57:22.799Z","failure":"Used \"aaron\" as author name in package.json instead of \"a-canary\" github username","correction":"Always use \"a-canary\" as author name — it's the github username for all packages","tokens_wasted":500,"source":"user","strength":"strong"}
|
|
2
|
+
{"timestamp":"2026-03-17T15:10:39.460Z","failure":"Renumbered CHOICES.md IDs sequentially when adding new choices, causing cascading reference updates","correction":"CHOICES.md IDs are UIDs — never renumber. Position = priority, not ID number. New choices get next available ID.","tokens_wasted":2000,"source":"user","strength":"strong"}
|
|
3
|
+
{"timestamp":"2026-03-17T16:35:37.017Z","failure":"Using \"aaron\" as git/npm author instead of \"a-canary\"","correction":"Always use GitHub username \"a-canary\" for all git commits, npm author fields, and attribution. Never use \"aaron\".","tokens_wasted":100,"source":"user","strength":"strong"}
|
package/CHOICES.md
ADDED
|
@@ -0,0 +1,244 @@
|
|
|
1
|
+
# CHOICES.md — Source of Plan
|
|
2
|
+
|
|
3
|
+
All project choices in priority order. Higher choices constrain lower ones.
|
|
4
|
+
Use `/choose-wisely:choose` to add, change, remove, or reorder choices.
|
|
5
|
+
Use `/choose-wisely:choose-audit` to check for contradictions and structural issues.
|
|
6
|
+
|
|
7
|
+
## Rules
|
|
8
|
+
|
|
9
|
+
- **Position = Priority**: higher constrains lower, no exceptions
|
|
10
|
+
- **Gravity rule**: changing a choice triggers cascading review
|
|
11
|
+
- **Section order is fixed**: Mission > User Experiences > Features > Operations > Data > Architecture > Technology > Implementation
|
|
12
|
+
- **Supports line required**: every choice (except top) lists IDs it directly supports
|
|
13
|
+
- **Architecture is tool-agnostic**: Architecture describes patterns; Technology names tools
|
|
14
|
+
- **No status values**: git diff is the change record
|
|
15
|
+
|
|
16
|
+
### Choice Entry Format
|
|
17
|
+
|
|
18
|
+
```
|
|
19
|
+
### X-0001: Title of choice
|
|
20
|
+
Supports: X-0000, Y-0000
|
|
21
|
+
|
|
22
|
+
One to two lines of rationale. Not a spec. Just why this choice was made.
|
|
23
|
+
```
|
|
24
|
+
|
|
25
|
+
- ID format: `PREFIX-NNNN` (globally unique 4-digit number)
|
|
26
|
+
- Prefixes: `M-` (Mission), `UX-` (User Experiences), `F-` (Features), `O-` (Operations), `D-` (Data), `A-` (Architecture), `T-` (Technology), `I-` (Implementation)
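The ID rules above can be checked mechanically. The following is a minimal sketch, not code from the package: `parseChoiceId` and `findDuplicateIds` are hypothetical helper names. Because the same number appears under several prefixes in this file (for example `M-0100`, `O-0100`, `A-0100`), the sketch assumes the full `PREFIX-NNNN` string is the unique key.

```typescript
// Section prefixes, in the fixed section order given in the Rules list.
const SECTION_PREFIXES = ["M", "UX", "F", "O", "D", "A", "T", "I"] as const;
type SectionPrefix = (typeof SECTION_PREFIXES)[number];

interface ChoiceId {
  prefix: SectionPrefix;
  number: string; // kept as a string so leading zeros survive
}

// Parse an ID such as "UX-0003"; returns null when the shape is wrong.
function parseChoiceId(raw: string): ChoiceId | null {
  const match = /^([A-Z]+)-(\d{4})$/.exec(raw);
  if (match === null) return null;
  const prefix = match[1] as string;
  const digits = match[2] as string;
  if ((SECTION_PREFIXES as readonly string[]).indexOf(prefix) === -1) return null;
  return { prefix: prefix as SectionPrefix, number: digits };
}

// Report every full ID that occurs more than once (assumption: the whole
// PREFIX-NNNN string is the unique key, not the number alone).
function findDuplicateIds(ids: string[]): string[] {
  const seen: string[] = [];
  const duplicates: string[] = [];
  for (const id of ids) {
    if (seen.indexOf(id) === -1) seen.push(id);
    else if (duplicates.indexOf(id) === -1) duplicates.push(id);
  }
  return duplicates;
}
```

A check like this could run before any edit to CHOICES.md is accepted, since IDs are meant to stay stable once assigned.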
|
|
27
|
+
|
|
28
|
+
---
|
|
29
|
+
|
|
30
|
+
## Mission
|
|
31
|
+
|
|
32
|
+
### M-0100: Priority ladder — UX Quality → Security → Scale → Efficiency
|
|
33
|
+
|
|
34
|
+
All components follow a strict priority ladder. Higher priorities must never regress when pursuing lower ones. Each level is a release gate:
|
|
35
|
+
1. **UX Quality** — prove it works well for users → UX testing gate
|
|
36
|
+
2. **Security** — prove it's safe → beta release gate
|
|
37
|
+
3. **Scale** — prove it handles growth → full release gate
|
|
38
|
+
4. **Efficiency** — optimize cost/speed → ongoing
|
|
39
|
+
|
|
40
|
+
### M-0001: Autonomous project development agent
|
|
41
|
+
Supports: M-0100
|
|
42
|
+
|
|
43
|
+
pi-director is the brain of pi-based development. It understands project intent, recommends what to do next, and executes development autonomously through specialized subagents. It replaces ad-hoc manual orchestration with a structured decision→plan→build loop.
|
|
44
|
+
|
|
45
|
+
### M-0002: Installable npm package for any pi project
|
|
46
|
+
Supports: M-0001
|
|
47
|
+
|
|
48
|
+
Every pi project (pi-default, pi-admin, etc.) installs pi-director via npm. It brings the full director stack: agents, skills, and extensions. No manual ~/.pi/ setup needed.
|
|
49
|
+
|
|
50
|
+
### M-0003: Data-driven decision making
|
|
51
|
+
Supports: M-0001
|
|
52
|
+
|
|
53
|
+
Recommendations come from evidence — session history, correction logs, code analysis, app output logs — not guesswork. The agent observes patterns and surfaces what matters.
|
|
54
|
+
|
|
55
|
+
---
|
|
56
|
+
|
|
57
|
+
## User Experiences
|
|
58
|
+
|
|
59
|
+
### UX-0001: Three clear operations with distinct outputs
|
|
60
|
+
Supports: M-0001
|
|
61
|
+
|
|
62
|
+
The user has exactly three high-level commands: `/next` (what should I do?), `/choose` (clarify intent), `/build` (execute the plan). Each produces a distinct artifact: NEXT.md, CHOICES.md, PLAN.md.
|
|
63
|
+
|
|
64
|
+
### UX-0002: CHOICES.md is user-steered intent
|
|
65
|
+
Supports: UX-0001
|
|
66
|
+
|
|
67
|
+
CHOICES.md represents what the user has decided through interviews and feedback. It is the user's voice. Agents may clean up language for coherence and clarity, but never change intent, add choices, remove choices, or reorder priorities without user direction.
|
|
68
|
+
|
|
69
|
+
### UX-0003: NEXT.md surfaces agent-discovered issues outside CHOICES.md scope
|
|
70
|
+
Supports: UX-0001, M-0003
|
|
71
|
+
|
|
72
|
+
NEXT.md contains problems agents found that conflict with or expand beyond CHOICES.md. These require user approval before action — they represent scope changes, new concerns, or contradictions the user hasn't addressed yet.
|
|
73
|
+
|
|
74
|
+
### UX-0004: Director acts autonomously within CHOICES.md scope
|
|
75
|
+
Supports: UX-0002, UX-0003
|
|
76
|
+
|
|
77
|
+
Any issue that falls within the scope of CHOICES.md — bugs, test failures, implementation gaps, refactors aligned with existing choices — the director fixes autonomously. No approval needed. Only issues outside CHOICES.md scope go to NEXT.md for user review.
|
|
78
|
+
|
|
79
|
+
---
|
|
80
|
+
|
|
81
|
+
## Features
|
|
82
|
+
|
|
83
|
+
### F-0001: Next — Analysis and recommendation engine
|
|
84
|
+
Supports: UX-0001, UX-0002
|
|
85
|
+
|
|
86
|
+
Analyzes session history, correction logs (pi-upskill), code quality, test coverage, and app output logs. Produces NEXT.md with ranked, categorized recommendations: refactors, simplifications, scope changes, UX improvements, upskilling opportunities.
|
|
87
|
+
|
|
88
|
+
### F-0002: Choose — Project intent clarification
|
|
89
|
+
Supports: UX-0001
|
|
90
|
+
|
|
91
|
+
Wraps pi-choose-wisely. Manages CHOICES.md through structured interview, bootstrap from existing docs, cascading audit on changes. Defines the WHY and WHAT.
|
|
92
|
+
|
|
93
|
+
### F-0003: Build — TDD iterative development loop
|
|
94
|
+
Supports: UX-0001, UX-0003
|
|
95
|
+
|
|
96
|
+
Executes PLAN.md phases using the director pattern: establish gates → recon → refine plan → feasibility experiments → build/test/gate check. Marks phases complete, loops until done or blocked.
|
|
97
|
+
|
|
98
|
+
### F-0004: Nightly analysis cron
|
|
99
|
+
Supports: F-0001, M-0003
|
|
100
|
+
|
|
101
|
+
Runs the analysis engine on a schedule (nightly or configurable). Produces fresh NEXT.md so the developer starts each day with prioritized recommendations.
|
|
102
|
+
|
|
103
|
+
### F-0005: Parallel subagent delegation
|
|
104
|
+
Supports: F-0003, M-0001
|
|
105
|
+
|
|
106
|
+
Uses pi's subagent system to run independent tasks in parallel — scout + web-search simultaneously, multiple builders on independent files, reviewer while next phase plans.
|
|
107
|
+
|
|
108
|
+
### F-0006: Replan — Gap analysis between CHOICES.md and current state
|
|
109
|
+
Supports: F-0002, F-0003
|
|
110
|
+
|
|
111
|
+
Wraps pi-choose-wisely's replan skill. Compares CHOICES.md against codebase reality, generates PLAN.md for next implementation phase. Bridge between intent and execution.
|
|
112
|
+
|
|
113
|
+
---
|
|
114
|
+
|
|
115
|
+
## Operations
|
|
116
|
+
|
|
117
|
+
### O-0100: Four release gates matching priority ladder
|
|
118
|
+
Supports: M-0100
|
|
119
|
+
|
|
120
|
+
Each component progresses through gates in order. No gate may regress a prior one:
|
|
121
|
+
1. **UX Testing** — prove UX quality with real usage → internal/alpha
|
|
122
|
+
2. **Security Audit** — prove safety with review + hardening → beta publish
|
|
123
|
+
3. **Scale Testing** — prove it handles load/growth → full publish
|
|
124
|
+
4. **Efficiency Optimization** — reduce cost/latency → ongoing post-release
|
|
125
|
+
|
|
126
|
+
### O-0001: Subagent model tiering via pi-model-router
|
|
127
|
+
Supports: F-0005
|
|
128
|
+
|
|
129
|
+
Model tiers routed by pi-model-router. Strategic is the most expensive tier and must be used sparingly — only for elevated reasoning with zero tool calls.
|
|
130
|
+
|
|
131
|
+
### O-0101: Strategic models as thinking-only critics
|
|
132
|
+
Supports: O-0001, M-0100
|
|
133
|
+
|
|
134
|
+
Strategic models never receive tools. They are critics: given structured input curated by cheaper agents, they review, improve, and produce decision trees (max 8 leaves). This eliminates expensive tool-call loops and focuses strategic spend on pure reasoning at maximum thinking depth.
|
|
135
|
+
|
|
136
|
+
### O-0102: Recon→Plan→Critique→Finalize→Build pipeline
|
|
137
|
+
Supports: O-0101
|
|
138
|
+
|
|
139
|
+
The standard execution pipeline:
|
|
140
|
+
1. **Recon** (operational) — retrieve and summarize context, many tool calls
|
|
141
|
+
2. **Plan** (tactical) — synthesize plan from context, few tool calls
|
|
142
|
+
3. **Critique** (strategic) — review plan, provide improvements and/or decision tree, zero tool calls
|
|
143
|
+
4. **Finalize** (tactical) — incorporate critique, resolve decision tree branches with tool calls
|
|
144
|
+
5. **Build** (operational) — execute finalized plan
|
|
145
|
+
|
|
146
|
+
A similar critique loop applies to implementation validation and phase gate results.
|
|
147
|
+
|
|
148
|
+
### O-0002: Hard stop vs soft issue classification
|
|
149
|
+
Supports: F-0003
|
|
150
|
+
|
|
151
|
+
Hard stops (mission infeasible, security breach, external dep broken) require user input. Soft issues (API changed, test failure, review nits) are handled autonomously. Clear escalation policy.
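As an illustration of the escalation policy, here is a small sketch that maps the six examples named above to an action. The `IssueKind` labels and the `classifyIssue` function are hypothetical names introduced for this example; the full classification lives in `skills/build/lib/hard-stops.md`, which may differ.

```typescript
// Issue kinds taken from the examples in O-0002; the string labels are assumed.
type IssueKind =
  | "mission-infeasible"
  | "security-breach"
  | "external-dep-broken"
  | "api-changed"
  | "test-failure"
  | "review-nit";

type Escalation = "ask-user" | "handle-autonomously";

// Hard stops require user input; everything else is a soft issue.
const HARD_STOP_KINDS: readonly IssueKind[] = [
  "mission-infeasible",
  "security-breach",
  "external-dep-broken",
];

function classifyIssue(kind: IssueKind): Escalation {
  return HARD_STOP_KINDS.indexOf(kind) !== -1 ? "ask-user" : "handle-autonomously";
}
```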
|
|
152
|
+
|
|
153
|
+
---
|
|
154
|
+
|
|
155
|
+
## Data
|
|
156
|
+
|
|
157
|
+
### D-0001: NEXT.md — Recommendation artifact
|
|
158
|
+
Supports: F-0001
|
|
159
|
+
|
|
160
|
+
Structured markdown with ranked recommendations. Categories: refactor, simplify, scope-change, ux-improvement, upskill, debt. Each item has rationale, effort estimate, and supporting evidence (source file, session, log line).
|
|
161
|
+
|
|
162
|
+
### D-0002: Session and correction log analysis
|
|
163
|
+
Supports: F-0001, M-0003
|
|
164
|
+
|
|
165
|
+
Reads pi session history (.pi/agent/sessions/*.jsonl) and correction logs (.pi/corrections.jsonl) to identify patterns: repeated failures, wasted tokens, recurring manual fixes.
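To make the correction-log analysis concrete, the sketch below parses JSONL text and totals wasted tokens. The field names come from the three entries in `.pi/corrections.jsonl`; the function names are hypothetical, and the input is passed as a string so the example stays independent of how the file is read.

```typescript
// Shape of one line in .pi/corrections.jsonl, as seen in the entries above.
interface Correction {
  timestamp: string;
  failure: string;
  correction: string;
  tokens_wasted: number;
  source: string;
  strength: string;
}

// Parse JSONL: one JSON object per non-empty line.
function parseCorrections(jsonl: string): Correction[] {
  return jsonl
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as Correction);
}

// Sum of tokens_wasted across all entries.
function totalTokensWasted(entries: Correction[]): number {
  return entries.reduce((sum, entry) => sum + entry.tokens_wasted, 0);
}

// Entries sorted from most to least costly, without mutating the input.
function costliestFirst(entries: Correction[]): Correction[] {
  return entries.slice().sort((a, b) => b.tokens_wasted - a.tokens_wasted);
}
```

For the three entries shown in `.pi/corrections.jsonl` (500, 2000, and 100 tokens), the total is 2600, and the ID-renumbering failure sorts first.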
|
|
166
|
+
|
|
167
|
+
### D-0003: CHOICES.md and PLAN.md as source of truth
|
|
168
|
+
Supports: F-0002, F-0003
|
|
169
|
+
|
|
170
|
+
CHOICES.md defines intent (managed by pi-choose-wisely). PLAN.md defines execution phases (managed by replan + director). Both are markdown, version-controlled, human-readable.
|
|
171
|
+
|
|
172
|
+
---
|
|
173
|
+
|
|
174
|
+
## Architecture
|
|
175
|
+
|
|
176
|
+
### A-0100: Non-regression constraint across priority levels
|
|
177
|
+
Supports: M-0100, O-0100
|
|
178
|
+
|
|
179
|
+
Every change must pass a regression check: does this degrade UX quality? If pursuing security, does it make the UX worse? If pursuing scale, does it compromise security or UX? Gate checks in /build enforce this — no phase completes if a higher-priority concern regresses.
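The non-regression rule can be sketched as a check against the levels above the one currently being pursued. This is an illustrative reading of A-0100, not the package's gate logic: the level labels and both function names are assumptions, and how a regression is detected is left to the caller.

```typescript
// Priority ladder from M-0100, highest priority first. Labels are assumed.
const PRIORITY_LADDER = ["ux-quality", "security", "scale", "efficiency"] as const;
type Priority = (typeof PRIORITY_LADDER)[number];

// Levels that must not regress while working on `target`: every level above it.
function levelsToProtect(target: Priority): Priority[] {
  return PRIORITY_LADDER.slice(0, PRIORITY_LADDER.indexOf(target));
}

// A phase passes only if none of the protected levels is reported as regressed.
function passesRegressionGate(target: Priority, regressed: Priority[]): boolean {
  const protectedLevels = levelsToProtect(target);
  return regressed.every((level) => protectedLevels.indexOf(level) === -1);
}
```

Under this reading, a scale-focused phase that regresses security fails the gate, while one that only costs some efficiency passes.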
|
|
180
|
+
|
|
181
|
+
### A-0001: Three-layer architecture
|
|
182
|
+
Supports: M-0001
|
|
183
|
+
|
|
184
|
+
Layer 1: Skills (next, choose, build) — user-facing operations invoked by commands.
|
|
185
|
+
Layer 2: Agent definitions (director, builder, planner, reviewer, scout, writer) — specialized roles spawned as subagents.
|
|
186
|
+
Layer 3: Analysis modules — data readers for sessions, logs, code metrics.
|
|
187
|
+
|
|
188
|
+
### A-0002: Skills as orchestrators, agents as executors
|
|
189
|
+
Supports: A-0001, F-0005
|
|
190
|
+
|
|
191
|
+
Skills contain the workflow logic (what to do in what order). Agent .md files define persona and constraints. Skills spawn agents via pi's subagent system. Skills never implement code directly.
|
|
192
|
+
|
|
193
|
+
### A-0003: Extension for nightly/scheduled analysis
|
|
194
|
+
Supports: F-0004, A-0001
|
|
195
|
+
|
|
196
|
+
A pi extension handles the cron/scheduling aspect. It invokes the `next` skill on a timer and writes NEXT.md. Follows the pi-pi.ts pattern for spawning analysis agents.
|
|
197
|
+
|
|
198
|
+
### A-0004: Dependency on pi-choose-wisely for CHOICES.md operations
|
|
199
|
+
Supports: F-0002, F-0006
|
|
200
|
+
|
|
201
|
+
pi-director does not duplicate CHOICES.md logic. It depends on pi-choose-wisely as an npm peer dependency. The `choose` and `replan` skills are re-exported/wrapped.
|
|
202
|
+
|
|
203
|
+
### A-0005: Dependency on pi-upskill for correction analysis
|
|
204
|
+
Supports: F-0001, D-0002
|
|
205
|
+
|
|
206
|
+
pi-director consumes pi-upskill's correction logs and analyze skill as input to the recommendation engine. pi-upskill remains a separate package.
|
|
207
|
+
|
|
208
|
+
---
|
|
209
|
+
|
|
210
|
+
## Technology
|
|
211
|
+
|
|
212
|
+
### T-0001: Pi package format with npm distribution
|
|
213
|
+
Supports: M-0002, A-0001
|
|
214
|
+
|
|
215
|
+
Structured as a pi package (package.json with `pi.skills`, `pi.agents`, `pi.extensions`). Published to npm as `@a-canary/pi-director`. Installed per-project.
|
|
216
|
+
|
|
217
|
+
### T-0002: TypeScript extension following pi-pi.ts patterns
|
|
218
|
+
Supports: A-0003
|
|
219
|
+
|
|
220
|
+
The scheduling extension is TypeScript using pi's ExtensionAPI. Spawns agents via `spawn("pi", ...)` like pi-pi.ts does. Uses TUI widgets for status display.
|
|
221
|
+
|
|
222
|
+
### T-0003: Peer dependencies on pi-choose-wisely, pi-upskill, and pi-model-router
|
|
223
|
+
Supports: A-0004, A-0005, O-0001
|
|
224
|
+
|
|
225
|
+
`peerDependencies`: `@a-canary/pi-choose-wisely`, `@a-canary/pi-upskill`, `pi-model-router`, `@mariozechner/pi-coding-agent`. pi-model-router handles tier resolution so agents always get the right model for their role.
|
|
226
|
+
|
|
227
|
+
---
|
|
228
|
+
|
|
229
|
+
## Implementation
|
|
230
|
+
|
|
231
|
+
### I-0001: Migrate agent definitions from ~/.pi/agent/agents/
|
|
232
|
+
Supports: A-0001, M-0002
|
|
233
|
+
|
|
234
|
+
Move director.md, builder.md, planner.md, reviewer.md, scout.md, writer.md from global ~/.pi/agent/agents/ into this package's agents/ directory. Update model references to use model groups.
|
|
235
|
+
|
|
236
|
+
### I-0002: Skill files reference agent definitions relatively
|
|
237
|
+
Supports: A-0002, I-0001
|
|
238
|
+
|
|
239
|
+
Skills discover agents from the package's agents/ directory. No hardcoded paths. Agent discovery uses `ls` on known locations (package agents/, project .pi/agents/, global ~/.pi/agent/agents/).
|
|
240
|
+
|
|
241
|
+
### I-0003: Test with vitest
|
|
242
|
+
Supports: M-0002
|
|
243
|
+
|
|
244
|
+
Unit tests for analysis modules (session parsing, recommendation ranking). Integration tests for skill workflows using mock sessions. Follows pi-model-router's test pattern.
|
package/PLAN.md
ADDED
|
@@ -0,0 +1,108 @@
|
|
|
1
|
+
# PLAN.md — Implementation Plan
|
|
2
|
+
|
|
3
|
+
## Objective
|
|
4
|
+
Implement pi-director as a functional pi package with three working operations (/next, /choose, /build) that orchestrate subagents autonomously.
|
|
5
|
+
|
|
6
|
+
## Scope
|
|
7
|
+
**In**: Agent discovery, model tiering, /next analysis engine, /build phase execution, /choose integration, nightly extension, tests.
|
|
8
|
+
**Out**: Publishing to npm (separate step), UI/TUI beyond status widgets, multi-project orchestration.
|
|
9
|
+
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
## Phase 1: Agent Foundation — Model Tiering & Discovery
|
|
13
|
+
Make agents use model groups instead of hardcoded models. Add relative discovery.
|
|
14
|
+
|
|
15
|
+
### Steps
|
|
16
|
+
- [x] 1.1 Update all agent .md frontmatter to use model groups (strategic/tactical/operational/scout) instead of hardcoded provider/model strings
|
|
17
|
+
- [x] 1.2 Update director.md agent discovery to check package-relative agents/ dir, then project .pi/agents/, then ~/.pi/agent/agents/
|
|
18
|
+
- [x] 1.3 Add director.md reference to the three skills (/next, /choose, /build) so it knows its operational modes
|
|
19
|
+
- [x] 1.4 Create agents/README.md documenting model tier assignments and discovery order
|
|
20
|
+
|
|
21
|
+
### Gates
|
|
22
|
+
- [x] All agent .md files use model group names, not provider/model strings
|
|
23
|
+
- [x] Director agent discovery section references package-relative path first
|
|
24
|
+
|
|
25
|
+
---
|
|
26
|
+
|
|
27
|
+
## Phase 2: /next — Session & Code Analysis Engine
|
|
28
|
+
Build the data gathering and recommendation pipeline.
|
|
29
|
+
|
|
30
|
+
### Steps
|
|
31
|
+
- [x] 2.1 Create skills/next/lib/session-scanner.md — instructions for scout agent to parse .pi/agent/sessions/*.jsonl, extract failure patterns, token waste, repeated operations
|
|
32
|
+
- [x] 2.2 Create skills/next/lib/code-scanner.md — instructions for scout agent to find complexity hotspots (files >300 lines, functions >50 lines, untested code, dead exports)
|
|
33
|
+
- [x] 2.3 Create skills/next/lib/choice-scanner.md — instructions for scout agent to diff CHOICES.md against codebase, find unimplemented/stale choices
|
|
34
|
+
- [x] 2.4 Create skills/next/lib/log-scanner.md — instructions for scout agent to parse app logs for recurring errors
|
|
35
|
+
- [x] 2.5 Create skills/next/lib/ranker.md — ranking algorithm: impact × inverse-effort × evidence-strength
|
|
36
|
+
- [x] 2.6 Update skills/next/SKILL.md to reference lib modules and define the parallel gather → analyze → rank → write workflow
|
|
37
|
+
- [x] 2.7 Create templates/NEXT.md — template for generated recommendations file
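The ranking formula in step 2.5 (impact × inverse-effort × evidence-strength) can be sketched as follows. The numeric scales and the names `scoreRecommendation` and `rankRecommendations` are assumptions made for illustration; `skills/next/lib/ranker.md` defines the real scales and may weight the factors differently.

```typescript
interface Recommendation {
  title: string;
  impact: number;   // assumed scale: 1 (low) to 5 (high)
  effort: number;   // assumed scale: 1 (small) to 5 (large); must be positive
  evidence: number; // assumed scale: 0 (none) to 1 (strong)
}

// score = impact * (1 / effort) * evidence
function scoreRecommendation(r: Recommendation): number {
  if (r.effort <= 0) throw new RangeError("effort must be positive");
  return r.impact * (1 / r.effort) * r.evidence;
}

// Highest score first; the input array is left untouched.
function rankRecommendations(items: Recommendation[]): Recommendation[] {
  return items
    .slice()
    .sort((a, b) => scoreRecommendation(b) - scoreRecommendation(a));
}
```

With these assumed scales, a cheap, moderately evidenced fix can outrank an expensive rewrite that has higher raw impact, which matches the intent of dividing by effort.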
|
|
38
|
+
|
|
39
|
+
### Gates
|
|
40
|
+
- [x] Each scanner module is a self-contained instruction set a scout agent can execute
|
|
41
|
+
- [x] SKILL.md references all lib modules and describes parallel dispatch
|
|
42
|
+
- [x] templates/NEXT.md exists with placeholder structure
|
|
43
|
+
|
|
44
|
+
---
|
|
45
|
+
|
|
46
|
+
## Phase 3: /build — Director Phase Execution
|
|
47
|
+
Refine the build skill to be the canonical phase execution loop.
|
|
48
|
+
|
|
49
|
+
### Steps
|
|
50
|
+
- [x] 3.1 Extract the core loop from agents/director.md into skills/build/lib/phase-loop.md as reusable reference
|
|
51
|
+
- [x] 3.2 Create skills/build/lib/regression-check.md — priority ladder regression verification
|
|
52
|
+
- [x] 3.3 Create skills/build/lib/hard-stops.md — enumeration of hard vs soft issues with decision tree
|
|
53
|
+
- [x] 3.4 Update skills/build/SKILL.md to reference lib modules
|
|
54
|
+
- [x] 3.5 Slim down agents/director.md to reference build skill instead of duplicating the loop
|
|
55
|
+
|
|
56
|
+
### Gates
|
|
57
|
+
- [x] Director agent references build skill for phase execution
|
|
58
|
+
- [x] Hard stop vs soft issue classification is documented in lib/hard-stops.md
|
|
59
|
+
- [x] No duplication between director.md and build/SKILL.md
|
|
60
|
+
|
|
61
|
+
---
|
|
62
|
+
|
|
63
|
+
## Phase 4: /choose — pi-choose-wisely Integration
|
|
64
|
+
Wire choose skill to delegate to pi-choose-wisely and bridge to replan.
|
|
65
|
+
|
|
66
|
+
### Steps
|
|
67
|
+
- [x] 4.1 Update skills/choose/SKILL.md with concrete delegation instructions: which pi-choose-wisely skill to invoke for each operation (bootstrap, audit, change, interview)
|
|
68
|
+
- [x] 4.2 Add replan bridge: after CHOICES.md changes, auto-suggest regenerating PLAN.md
|
|
69
|
+
- [x] 4.3 Add /next bridge: after CHOICES.md changes, note that /next should re-analyze
|
|
70
|
+
- [x] 4.4 Document the CHOICES.md → replan → PLAN.md → /build pipeline in skills/choose/lib/pipeline.md
|
|
71
|
+
- [x] 4.5 Define autonomy boundary: CHOICES.md (user-steered) vs NEXT.md (agent-discovered, out-of-scope)
|
|
72
|
+
|
|
73
|
+
### Gates
|
|
74
|
+
- [x] Choose skill has explicit delegation paths for all pi-choose-wisely operations
|
|
75
|
+
- [x] Pipeline documentation shows complete flow from intent to execution
|
|
76
|
+
|
|
77
|
+
---
|
|
78
|
+
|
|
79
|
+
## Phase 5: Nightly Extension
|
|
80
|
+
TypeScript extension for scheduled analysis.
|
|
81
|
+
|
|
82
|
+
### Steps
|
|
83
|
+
- [x] 5.1 Create extensions/nightly-analysis.ts following pi-pi.ts pattern — spawns pi with /next skill on configurable schedule
|
|
84
|
+
- [x] 5.2 Add TUI widget showing last analysis time and top 3 recommendations from NEXT.md
|
|
85
|
+
- [x] 5.3 Add /nightly-status command to show schedule and last run
|
|
86
|
+
- [x] 5.4 Add configuration for schedule (default: daily at 2am, configurable via /nightly-set)
|
|
87
|
+
|
|
88
|
+
### Gates
|
|
89
|
+
- [ ] Extension loads without errors when pi starts
|
|
90
|
+
- [x] /nightly-status command responds with schedule info
|
|
91
|
+
- [x] Widget renders placeholder when no NEXT.md exists
|
|
92
|
+
|
|
93
|
+
---
|
|
94
|
+
|
|
95
|
+
## Phase 6: Tests & Validation
|
|
96
|
+
Verify the package works end-to-end.
|
|
97
|
+
|
|
98
|
+
### Steps
|
|
99
|
+
- [x] 6.1 Create test/agents.test.ts — validate all agent .md files parse correctly (frontmatter, model groups, required fields)
|
|
100
|
+
- [x] 6.2 Create test/skills.test.ts — validate all SKILL.md files exist and have required sections
|
|
101
|
+
- [x] 6.3 Create test/package.test.ts — validate package.json pi config points to real directories
|
|
102
|
+
- [x] 6.4 Create test/next-template.test.ts — validate NEXT.md template structure
|
|
103
|
+
- [x] 6.5 Run full test suite, fix any failures — 72/72 pass
|
|
104
|
+
- [x] 6.6 Update README.md with final structure and usage
|
|
105
|
+
|
|
106
|
+
### Gates
|
|
107
|
+
- [x] `npm test` passes with all tests green
|
|
108
|
+
- [x] README.md reflects actual package contents
|
package/README.md
ADDED
|
@@ -0,0 +1,97 @@
|
|
|
1
|
+
# pi-director
|
|
2
|
+
|
|
3
|
+
Autonomous project director for [pi](https://github.com/mariozechner/pi). Three operations, three artifacts:
|
|
4
|
+
|
|
5
|
+
| Command | Operation | Artifact | Question |
|
|
6
|
+
|---------|-----------|----------|----------|
|
|
7
|
+
| `/next` | Analyze & Recommend | NEXT.md | What's outside scope? |
|
|
8
|
+
| `/choose` | Clarify Intent | CHOICES.md | Why and what? |
|
|
9
|
+
| `/build` | TDD Development | PLAN.md | How to implement? |
|
|
10
|
+
|
|
11
|
+
## Autonomy Model
|
|
12
|
+
|
|
13
|
+
- **CHOICES.md** — user-steered intent. Only the user modifies it (via interviews and feedback).
|
|
14
|
+
- **Within CHOICES.md scope** — director acts autonomously. Bugs, gaps, refactors aligned with existing choices need no approval.
|
|
15
|
+
- **Outside CHOICES.md scope** — surfaces in NEXT.md for user review. Scope changes, contradictions, and new concerns require user acceptance before action.
|
|
16
|
+
|
|
17
|
+
## Priority Ladder
|
|
18
|
+
|
|
19
|
+
All work follows: **UX Quality > Security > Scale > Efficiency**. Each level is a release gate. Higher priorities never regress when pursuing lower ones.
|
|
20
|
+
|
|
21
|
+
## Install
|
|
22
|
+
|
|
23
|
+
```bash
|
|
24
|
+
npm install @a-canary/pi-director @a-canary/pi-choose-wisely @a-canary/pi-upskill
|
|
25
|
+
```
|
|
26
|
+
|
|
27
|
+
## Architecture
|
|
28
|
+
|
|
29
|
+
```
|
|
30
|
+
┌──────────────────────────────────────────┐
|
|
31
|
+
│ pi-director │
|
|
32
|
+
│ ┌───────┐ ┌────────┐ ┌──────┐ │
|
|
33
|
+
│ │ /next │ │/choose │ │/build│ │
|
|
34
|
+
│ └───┬───┘ └───┬────┘ └──┬───┘ │
|
|
35
|
+
│ │ │ │ │
|
|
36
|
+
│ ┌───▼──────────▼──────────▼──────┐ │
|
|
37
|
+
│ │ Subagent Orchestration │ │
|
|
38
|
+
│ │ scout planner builder │ │
|
|
39
|
+
│ │ reviewer writer │ │
|
|
40
|
+
│ └────────────────────────────────┘ │
|
|
41
|
+
│ │
|
|
42
|
+
│ ┌─────────────────────────────────┐ │
|
|
43
|
+
│ │ Nightly Extension (cron) │ │
|
|
44
|
+
│ │ /nightly-status /nightly-run │ │
|
|
45
|
+
│ └─────────────────────────────────┘ │
|
|
46
|
+
├──────────────────────────────────────────┤
|
|
47
|
+
│ pi-choose-wisely │ pi-upskill │
|
|
48
|
+
│ CHOICES.md mgmt │ corrections │
|
|
49
|
+
└──────────────────────────────────────────┘
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
## Package Structure
|
|
53
|
+
|
|
54
|
+
```
|
|
55
|
+
pi-director/
|
|
56
|
+
├── agents/ # Subagent definitions
|
|
57
|
+
│ ├── director.md # tactical — orchestration
|
|
58
|
+
│ ├── planner.md # tactical — architecture
|
|
59
|
+
│ ├── reviewer.md # tactical — code review
|
|
60
|
+
│ ├── builder.md # operational — implementation
|
|
61
|
+
│ ├── scout.md # scout — fast recon
|
|
62
|
+
│ └── writer.md # operational — documentation
|
|
63
|
+
├── skills/
|
|
64
|
+
│ ├── next/ # /next — analysis engine
|
|
65
|
+
│ │ ├── SKILL.md
|
|
66
|
+
│ │ └── lib/ # scanner modules + ranker
|
|
67
|
+
│ ├── build/ # /build — TDD phase loop
|
|
68
|
+
│ │ ├── SKILL.md
|
|
69
|
+
│ │ └── lib/ # phase-loop, hard-stops, regression-check
|
|
70
|
+
│ └── choose/ # /choose — wraps pi-choose-wisely
|
|
71
|
+
│ ├── SKILL.md
|
|
72
|
+
│ └── lib/ # pipeline documentation
|
|
73
|
+
├── extensions/
|
|
74
|
+
│ └── nightly-analysis.ts # scheduled /next execution
|
|
75
|
+
├── templates/
|
|
76
|
+
│ └── NEXT.md # recommendation output format
|
|
77
|
+
├── CHOICES.md # project intent
|
|
78
|
+
├── PLAN.md # implementation phases
|
|
79
|
+
└── package.json
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
## Commands
|
|
83
|
+
|
|
84
|
+
| Command | Description |
|
|
85
|
+
|---------|-------------|
|
|
86
|
+
| `/next` | Analyze project data, generate recommendations |
|
|
87
|
+
| `/choose` | Clarify project intent (wraps pi-choose-wisely) |
|
|
88
|
+
| `/build` | Execute PLAN.md phases via TDD loop |
|
|
89
|
+
| `/nightly-status` | Show analysis schedule and last run |
|
|
90
|
+
| `/nightly-run` | Trigger analysis immediately |
|
|
91
|
+
| `/nightly-set <hour>` | Set daily analysis hour (0-23) |
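For the daily schedule, one way to compute the next run from an hour in the 0-23 range is sketched below. This is an assumption about how such scheduling could work, not the logic in `extensions/nightly-analysis.ts`; `nextRunAt` is a hypothetical name, and local time is assumed.

```typescript
// Returns the next moment, strictly after `now`, at which the local clock
// reads `hour`:00. Rejects hours outside the 0-23 range used by /nightly-set.
function nextRunAt(now: Date, hour: number): Date {
  if (Math.floor(hour) !== hour || hour < 0 || hour > 23) {
    throw new RangeError("hour must be an integer from 0 to 23");
  }
  const next = new Date(now.getTime());
  next.setHours(hour, 0, 0, 0);
  // If that time has already passed today (or is exactly now), use tomorrow.
  if (next.getTime() <= now.getTime()) {
    next.setDate(next.getDate() + 1);
  }
  return next;
}
```

A timer could then sleep for `nextRunAt(new Date(), 2).getTime() - Date.now()` milliseconds before triggering the analysis, using the 2am default from PLAN.md.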
|
|
92
|
+
|
|
93
|
+
## Dependencies
|
|
94
|
+
|
|
95
|
+
- `@a-canary/pi-choose-wisely` — CHOICES.md management, replan
|
|
96
|
+
- `@a-canary/pi-upskill` — Correction analysis, session learning
|
|
97
|
+
- `@mariozechner/pi-coding-agent` — Pi runtime
|
package/agents/README.md
ADDED
|
@@ -0,0 +1,42 @@
|
|
|
1
|
+
# Agents
|
|
2
|
+
|
|
3
|
+
Specialized subagent definitions for pi-director.
|
|
4
|
+
|
|
5
|
+
## Model Tiers
|
|
6
|
+
|
|
7
|
+
| Agent | Tier | Tools | Rationale |
|
|
8
|
+
|-------|------|-------|-----------|
|
|
9
|
+
| critic | strategic | **none** | Pure reasoning — reviews, improves, produces decision trees |
|
|
10
|
+
| director | tactical | read, bash, grep, find, ls | Orchestrates pipeline, delegates to specialists |
|
|
11
|
+
| planner | tactical | read, grep, find, ls | Synthesizes plans from recon context |
|
|
12
|
+
| reviewer | tactical | read, grep, find, ls, bash | Code review for quality and security |
|
|
13
|
+
| builder | operational | read, write, edit, bash, grep, find, ls | High-throughput implementation |
|
|
14
|
+
| scout | scout | read, grep, find, ls, bash | Cheapest tier for fast read-only recon |
|
|
15
|
+
| writer | operational | read, write, edit, grep, find, ls | Documentation updates |
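The tier and tool assignments in the table can be verified against each agent's frontmatter. The sketch below is a simplified illustration, assuming `key: value` frontmatter between `---` lines with Unix line endings, as in the agent files in this package; `parseAgentFrontmatter` and `violatesStrategicPolicy` are hypothetical names, and a real check would likely use a YAML parser.

```typescript
interface AgentMeta {
  name: string;
  model: string;   // model group: strategic, tactical, operational, or scout
  tools: string[]; // empty when the `tools:` line has no value
}

// Extract name, model, and tools from a leading frontmatter block.
function parseAgentFrontmatter(markdown: string): AgentMeta | null {
  const match = /^---\n([\s\S]*?)\n---/.exec(markdown);
  if (match === null) return null;
  let agentName: string | undefined;
  let model: string | undefined;
  let toolsRaw = "";
  for (const line of (match[1] as string).split("\n")) {
    const idx = line.indexOf(":");
    if (idx === -1) continue;
    const key = line.slice(0, idx).trim();
    const value = line.slice(idx + 1).trim();
    if (key === "name") agentName = value;
    else if (key === "model") model = value;
    else if (key === "tools") toolsRaw = value;
  }
  if (agentName === undefined || model === undefined) return null;
  const tools = toolsRaw
    .split(",")
    .map((tool) => tool.trim())
    .filter((tool) => tool.length > 0);
  return { name: agentName, model, tools };
}

// Strategic agents must have zero tools (O-0101).
function violatesStrategicPolicy(meta: AgentMeta): boolean {
  return meta.model === "strategic" && meta.tools.length > 0;
}
```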
|
|
16
|
+
|
|
17
|
+
## Strategic Model Philosophy
|
|
18
|
+
|
|
19
|
+
Strategic models are the most expensive. We maximize their value by:
|
|
20
|
+
1. **Zero tools** — no tool-call loops burning tokens
|
|
21
|
+
2. **Curated input** — cheaper agents gather and summarize context first
|
|
22
|
+
3. **Structured output** — decision trees (max 8 leaves) that cheaper agents can evaluate and execute
|
|
23
|
+
4. **Maximum thinking depth** — ultrathink/extended thinking for elevated reasoning
|
|
24
|
+
|
|
25
|
+
### Standard Pipeline
|
|
26
|
+
```
|
|
27
|
+
operational (recon) → tactical (plan) → strategic (critique) → tactical (finalize) → operational (build)
|
|
28
|
+
```
|
|
29
|
+
|
|
30
|
+
Strategic activation happens exactly twice per phase: plan critique and gate critique.
|
|
31
|
+
|
|
32
|
+
## Discovery Order
|
|
33
|
+
|
|
34
|
+
1. **Package agents** — `@a-canary/pi-director/agents/` (always available)
|
|
35
|
+
2. **Project agents** — `.pi/agents/` (project-specific overrides)
|
|
36
|
+
3. **Global agents** — `~/.pi/agent/agents/` (user-level fallback)
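The discovery order can be sketched as a first-match search over the three directories. This is a hedged illustration: the function names are hypothetical, the existence check is injected so the example does not depend on the filesystem, and the sketch assumes the first directory containing the agent wins. The list above does not say whether a project-level file should instead override the package copy.

```typescript
// Candidate directories in the documented order: package, project, global.
function agentSearchPaths(packageRoot: string, projectRoot: string, homeDir: string): string[] {
  return [
    `${packageRoot}/agents`,
    `${projectRoot}/.pi/agents`,
    `${homeDir}/.pi/agent/agents`,
  ];
}

// Returns the first existing `<dir>/<agent>.md`, or null when none exists.
// `exists` is supplied by the caller (for example, a wrapper around fs.existsSync).
function resolveAgent(
  agent: string,
  dirs: string[],
  exists: (path: string) => boolean,
): string | null {
  for (const dir of dirs) {
    const candidate = `${dir}/${agent}.md`;
    if (exists(candidate)) return candidate;
  }
  return null;
}
```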
|
|
37
|
+
|
|
38
|
+
## Priority Ladder
|
|
39
|
+
|
|
40
|
+
All agents operate under M-0100: **UX Quality > Security > Scale > Efficiency**
|
|
41
|
+
|
|
42
|
+
No agent action may regress a higher-priority concern.
|
package/agents/builder.md
ADDED
|
@@ -0,0 +1,37 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: builder
|
|
3
|
+
description: Implementation agent. Writes code, runs tests, commits incrementally.
|
|
4
|
+
model: operational
|
|
5
|
+
tools: read, write, edit, bash, grep, find, ls
|
|
6
|
+
---
|
|
7
|
+
You are a builder agent. Implement the requested changes. Write clean, minimal code. Follow existing patterns.
|
|
8
|
+
|
|
9
|
+
## Process
|
|
10
|
+
|
|
11
|
+
1. Read the plan or task description fully.
|
|
12
|
+
2. Understand existing code patterns before writing.
|
|
13
|
+
3. Make changes incrementally — smallest working step first.
|
|
14
|
+
4. Run tests after each significant change.
|
|
15
|
+
5. Fix any failures before moving on.
|
|
16
|
+
|
|
17
|
+
## Rules
|
|
18
|
+
|
|
19
|
+
- Follow existing code style and patterns in the project.
|
|
20
|
+
- Write tests when the project has a test framework.
|
|
21
|
+
- Keep functions under 50 lines.
|
|
22
|
+
- No premature abstraction — three instances before extracting.
|
|
23
|
+
- No speculative features — build what's asked, nothing more.
|
|
24
|
+
|
|
25
|
+
## Output format
|
|
26
|
+
|
|
27
|
+
## Completed
|
|
28
|
+
What was done.
|
|
29
|
+
|
|
30
|
+
## Files changed
|
|
31
|
+
- `path/to/file.ts` — what changed
|
|
32
|
+
|
|
33
|
+
## Tests
|
|
34
|
+
{test results or "no test framework"}
|
|
35
|
+
|
|
36
|
+
## Notes
|
|
37
|
+
Anything the caller should know.
|
package/agents/critic.md
ADDED
|
@@ -0,0 +1,74 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: critic
|
|
3
|
+
description: Strategic thinking-only reviewer. Zero tools. Produces structured feedback and decision trees from curated input.
|
|
4
|
+
model: strategic
|
|
5
|
+
tools:
|
|
6
|
+
thinking: ultrathink
|
|
7
|
+
---
|
|
8
|
+
You are a critic agent. You receive structured input curated by other agents and produce elevated analysis. You have NO tools — your value is pure reasoning at maximum depth.
|
|
9
|
+
|
|
10
|
+
## Input Format
|
|
11
|
+
|
|
12
|
+
You receive a structured review request:
|
|
13
|
+
```
|
|
14
|
+
## Context
|
|
15
|
+
{summarized codebase/project state from scout agents}
|
|
16
|
+
|
|
17
|
+
## Proposal
|
|
18
|
+
{plan, implementation, or gate results from tactical/operational agents}
|
|
19
|
+
|
|
20
|
+
## Review Criteria
|
|
21
|
+
{what to evaluate against — CHOICES.md decisions, priority ladder, etc.}
|
|
22
|
+
```
|
|
23
|
+
|
|
24
|
+
## Output Format
|
|
25
|
+
|
|
26
|
+
Always produce structured output, with a decision tree when applicable:
|
|
27
|
+
|
|
28
|
+
### Feedback
|
|
29
|
+
|
|
30
|
+
```
|
|
31
|
+
## Assessment
|
|
32
|
+
{overall judgment: approve / improve / reject}
|
|
33
|
+
|
|
34
|
+
## Strengths
|
|
35
|
+
1. {what's good and why}
|
|
36
|
+
|
|
37
|
+
## Issues
|
|
38
|
+
1. {problem} — {impact} — {suggested fix}
|
|
39
|
+
|
|
40
|
+
## Decision Tree
|
|
41
|
+
When the correct approach depends on conditions the critic cannot verify (requires tool calls), produce a decision tree:
|
|
42
|
+
|
|
43
|
+
IF {condition that needs tool verification}
|
|
44
|
+
├── TRUE: {approach A — specific instructions}
|
|
45
|
+
└── FALSE:
|
|
46
|
+
IF {second condition}
|
|
47
|
+
├── TRUE: {approach B}
|
|
48
|
+
└── FALSE: {approach C}
|
|
49
|
+
|
|
50
|
+
Max 8 leaf nodes. Each leaf must be self-contained and actionable.
|
|
51
|
+
```
|
|
52
|
+
|
|
53
|
+
## Rules
|
|
54
|
+
|
|
55
|
+
- **Never request tool access.** If you need data, say what data and why — the calling agent will gather it.
|
|
56
|
+
- **Decision trees for uncertainty.** When the right answer depends on runtime state, produce branching paths instead of guessing.
|
|
57
|
+
- **Max 8 leaves** per decision tree. If more complex, decompose into sequential decisions.
|
|
58
|
+
- **Priority ladder always.** Evaluate against M-0100: UX Quality > Security > Scale > Efficiency.
|
|
59
|
+
- **Be specific.** Reference file paths, choice IDs, and concrete alternatives.
|
|
60
|
+
- **Reject boldly.** If a proposal violates CHOICES.md or regresses a higher priority, say so clearly.
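The 8-leaf limit on decision trees is easy to enforce on the calling side. Below is a minimal sketch, assuming a binary IF/TRUE/FALSE tree like the one in the output format; the `DecisionNode` type and function names are hypothetical and not part of the package.

```typescript
// A binary decision tree: each branch tests one condition, each leaf is an action.
type DecisionNode =
  | { kind: "leaf"; action: string }
  | { kind: "branch"; condition: string; ifTrue: DecisionNode; ifFalse: DecisionNode };

const MAX_LEAVES = 8;

// Count leaf nodes recursively.
function countLeaves(node: DecisionNode): number {
  return node.kind === "leaf"
    ? 1
    : countLeaves(node.ifTrue) + countLeaves(node.ifFalse);
}

// True when the tree respects the 8-leaf limit.
function withinLeafLimit(root: DecisionNode): boolean {
  return countLeaves(root) <= MAX_LEAVES;
}
```

The two-condition tree in the output format has three leaves (approaches A, B, and C), so it passes; a tree that fails this check would be sent back for decomposition into sequential decisions.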
|
|
61
|
+
|
|
62
|
+
## Use Cases
|
|
63
|
+
|
|
64
|
+
### Plan Review
|
|
65
|
+
Input: recon summary + proposed plan
|
|
66
|
+
Output: improved plan + decision tree for unknowns
|
|
67
|
+
|
|
68
|
+
### Implementation Review
|
|
69
|
+
Input: code diff summary + test results + CHOICES.md context
|
|
70
|
+
Output: approval/rejection + specific issues + fix approaches
|
|
71
|
+
|
|
72
|
+
### Gate Review
|
|
73
|
+
Input: phase gate results + exit criteria + regression check
|
|
74
|
+
Output: pass/fail judgment + issues + remediation decision tree
|