@fro.bot/systematic 2.3.2 → 2.4.0

Files changed (71)
  1. package/README.md +12 -13
  2. package/agents/design/design-implementation-reviewer.md +2 -19
  3. package/agents/design/design-iterator.md +2 -31
  4. package/agents/design/figma-design-sync.md +2 -22
  5. package/agents/docs/ankane-readme-writer.md +2 -19
  6. package/agents/document-review/adversarial-document-reviewer.md +3 -2
  7. package/agents/document-review/coherence-reviewer.md +5 -7
  8. package/agents/document-review/design-lens-reviewer.md +3 -4
  9. package/agents/document-review/feasibility-reviewer.md +3 -4
  10. package/agents/document-review/product-lens-reviewer.md +25 -6
  11. package/agents/document-review/scope-guardian-reviewer.md +3 -4
  12. package/agents/document-review/security-lens-reviewer.md +3 -4
  13. package/agents/research/best-practices-researcher.md +4 -21
  14. package/agents/research/framework-docs-researcher.md +2 -19
  15. package/agents/research/git-history-analyzer.md +2 -19
  16. package/agents/research/issue-intelligence-analyst.md +2 -24
  17. package/agents/research/learnings-researcher.md +7 -28
  18. package/agents/research/repo-research-analyst.md +3 -32
  19. package/agents/research/slack-researcher.md +128 -0
  20. package/agents/review/agent-native-reviewer.md +109 -195
  21. package/agents/review/architecture-strategist.md +3 -19
  22. package/agents/review/cli-agent-readiness-reviewer.md +1 -27
  23. package/agents/review/code-simplicity-reviewer.md +5 -19
  24. package/agents/review/data-integrity-guardian.md +3 -19
  25. package/agents/review/data-migration-expert.md +3 -19
  26. package/agents/review/deployment-verification-agent.md +3 -19
  27. package/agents/review/pattern-recognition-specialist.md +4 -20
  28. package/agents/review/performance-oracle.md +3 -31
  29. package/agents/review/project-standards-reviewer.md +5 -5
  30. package/agents/review/schema-drift-detector.md +3 -19
  31. package/agents/review/security-sentinel.md +3 -25
  32. package/agents/review/testing-reviewer.md +3 -3
  33. package/agents/workflow/pr-comment-resolver.md +54 -22
  34. package/agents/workflow/spec-flow-analyzer.md +2 -25
  35. package/package.json +1 -1
  36. package/skills/agent-native-architecture/SKILL.md +28 -27
  37. package/skills/agent-native-architecture/references/agent-execution-patterns.md +3 -3
  38. package/skills/agent-native-architecture/references/agent-native-testing.md +1 -1
  39. package/skills/agent-native-architecture/references/mobile-patterns.md +1 -1
  40. package/skills/andrew-kane-gem-writer/SKILL.md +5 -5
  41. package/skills/ce-brainstorm/SKILL.md +43 -181
  42. package/skills/ce-compound/SKILL.md +143 -89
  43. package/skills/ce-compound-refresh/SKILL.md +48 -5
  44. package/skills/ce-ideate/SKILL.md +27 -242
  45. package/skills/ce-plan/SKILL.md +165 -81
  46. package/skills/ce-review/SKILL.md +348 -125
  47. package/skills/ce-review/references/findings-schema.json +5 -0
  48. package/skills/ce-review/references/persona-catalog.md +2 -2
  49. package/skills/ce-review/references/resolve-base.sh +5 -2
  50. package/skills/ce-review/references/subagent-template.md +25 -3
  51. package/skills/ce-work/SKILL.md +95 -242
  52. package/skills/ce-work-beta/SKILL.md +154 -301
  53. package/skills/dhh-rails-style/SKILL.md +13 -12
  54. package/skills/document-review/SKILL.md +56 -109
  55. package/skills/document-review/references/findings-schema.json +0 -23
  56. package/skills/document-review/references/subagent-template.md +13 -18
  57. package/skills/dspy-ruby/SKILL.md +8 -8
  58. package/skills/every-style-editor/SKILL.md +3 -2
  59. package/skills/frontend-design/SKILL.md +2 -3
  60. package/skills/git-commit/SKILL.md +1 -1
  61. package/skills/git-commit-push-pr/SKILL.md +81 -265
  62. package/skills/git-worktree/SKILL.md +20 -21
  63. package/skills/lfg/SKILL.md +10 -17
  64. package/skills/onboarding/SKILL.md +2 -2
  65. package/skills/onboarding/scripts/inventory.mjs +31 -7
  66. package/skills/proof/SKILL.md +134 -28
  67. package/skills/resolve-pr-feedback/SKILL.md +7 -2
  68. package/skills/setup/SKILL.md +1 -1
  69. package/skills/test-browser/SKILL.md +10 -11
  70. package/skills/test-xcode/SKILL.md +6 -3
  71. package/dist/lib/manifest.d.ts +0 -39
@@ -1,14 +1,16 @@
1
1
  ---
2
2
  name: ce:plan
3
- description: Transform feature descriptions or requirements into structured implementation plans grounded in repo patterns and research. Use when the user says 'plan this', 'create a plan', 'write a tech plan', 'plan the implementation', 'how should we build', 'what's the approach for', 'break this down', or when a brainstorm/requirements document is ready for technical planning. Best when requirements are at least roughly defined; for exploratory or ambiguous requests, prefer ce:brainstorm first.
4
- argument-hint: '[feature description, requirements doc path, or improvement idea]'
3
+ description: "Create structured plans for any multi-step task -- software features, research workflows, events, study plans, or any goal that benefits from structured breakdown. Also deepen existing plans with interactive review of sub-agent findings. Use for plan creation when the user says 'plan this', 'create a plan', 'write a tech plan', 'plan the implementation', 'how should we build', 'what's the approach for', 'break this down', 'plan a trip', 'create a study plan', or when a brainstorm/requirements document is ready for planning. Use for plan deepening when the user says 'deepen the plan', 'deepen my plan', 'deepening pass', or uses 'deepen' in reference to a plan."
4
+ argument-hint: "[optional: feature description, requirements doc path, plan path to deepen, or any task to plan]"
5
5
  ---
6
6
 
7
7
  # Create Technical Plan
8
8
 
9
9
  **Note: The current year is 2026.** Use this when dating plans and searching for recent documentation.
10
10
 
11
- `ce:brainstorm` defines **WHAT** to build. `ce:plan` defines **HOW** to build it. `ce:work` executes the plan.
11
+ `ce:brainstorm` defines **WHAT** to build. `ce:plan` defines **HOW** to build it. `ce:work` executes the plan. A prior brainstorm is useful context but never required — `ce:plan` works from any input: a requirements doc, a bug report, a feature idea, or a rough description.
12
+
13
+ **When directly invoked, always plan.** Never classify a direct invocation as "not a planning task" and abandon the workflow. If the input is unclear, ask clarifying questions or use the planning bootstrap (Phase 0.4) to establish enough context — but always stay in the planning workflow.
12
14
 
13
15
  This workflow produces a durable implementation plan. It does **not** implement code, run tests, or learn from execution-time results. If the answer depends on changing code and seeing what happens, that belongs in `ce:work`, not here.
14
16
 
@@ -22,9 +24,11 @@ Ask one question at a time. Prefer a concise single-select choice when natural o
22
24
 
23
25
  <feature_description> #$ARGUMENTS </feature_description>
24
26
 
25
- **If the feature description above is empty, ask the user:** "What would you like to plan? Please describe the feature, bug fix, or improvement you have in mind."
27
+ **If the feature description above is empty, ask the user:** "What would you like to plan? Describe the task, goal, or project you have in mind." Then wait for their response before continuing.
28
+
29
+ If the input is present but unclear or underspecified, do not abandon — ask one or two clarifying questions, or proceed to Phase 0.4's planning bootstrap to establish enough context. The goal is always to help the user plan, never to exit the workflow.
26
30
 
27
- Do not proceed until you have a clear planning input.
31
+ **IMPORTANT: All file references in the plan document must use repo-relative paths (e.g., `src/models/user.rb`), never absolute paths (e.g., `/Users/name/Code/project/src/models/user.rb`). This applies everywhere — implementation unit file lists, pattern references, origin document links, and prose mentions. Absolute paths break portability across machines, worktrees, and teammates.**
28
32
 
29
33
  ## Core Principles
30
34
 
@@ -41,11 +45,11 @@ Do not proceed until you have a clear planning input.
41
45
  Every plan should contain:
42
46
  - A clear problem frame and scope boundary
43
47
  - Concrete requirements traceability back to the request or origin document
44
- - Exact file paths for the work being proposed
48
+ - Repo-relative file paths for the work being proposed (never absolute paths — see Planning Rules)
45
49
  - Explicit test file paths for feature-bearing implementation units
46
50
  - Decisions with rationale, not just tasks
47
51
  - Existing patterns or code references to follow
48
- - Specific test scenarios and verification outcomes
52
+ - Enumerated test scenarios for each feature-bearing unit, specific enough that an implementer knows exactly what to test without inventing coverage themselves
49
53
  - Clear dependencies and sequencing
50
54
 
51
55
  A plan is ready when an implementer can start confidently without needing the plan to write the code for them.
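Review note: the repo-relative path rule introduced in the hunks above can be checked mechanically. The following is a minimal sketch, not part of the package. The helper name `find_absolute_paths` is hypothetical, and it assumes file references appear in backticks inside the plan text.

```python
import re

# Hypothetical helper, not shipped with @fro.bot/systematic.
# Assumption: file references in a plan are wrapped in backticks.
_BACKTICKED = re.compile(r"`([^`\n]+)`")
# POSIX root, home shortcut, or Windows drive prefix.
_ABSOLUTE = re.compile(r"^(/|~/|[A-Za-z]:[\\/])")


def find_absolute_paths(plan_text: str) -> list[str]:
    """Return backticked tokens that look like absolute file paths."""
    hits = []
    for token in _BACKTICKED.findall(plan_text):
        # Require a second separator so slash commands such as `/ce:plan`
        # are not reported as paths.
        has_separator = "/" in token[1:] or "\\" in token
        if _ABSOLUTE.match(token) and has_separator:
            hits.append(token)
    return hits
```

A check like this could run over a drafted plan before it is written to disk; any hit would be rewritten as a repo-relative path such as `src/models/user.rb`.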
@@ -61,6 +65,28 @@ If the user references an existing plan file or there is an obvious recent match
61
65
  - Confirm whether to update it in place or create a new plan
62
66
  - If updating, preserve completed checkboxes and revise only the still-relevant sections
63
67
 
68
+ **Deepen intent:** The word "deepen" (or "deepening") in reference to a plan is the primary trigger for the deepening fast path. When the user says "deepen the plan", "deepen my plan", "run a deepening pass", or similar, the target document is a **plan** in `docs/plans/`, not a requirements document. Use any path, keyword, or context the user provides to identify the right plan. If a path is provided, verify it is actually a plan document. If the match is not obvious, confirm with the user before proceeding.
69
+
70
+ Words like "strengthen", "confidence", "gaps", and "rigor" are NOT sufficient on their own to trigger deepening. These words appear in normal editing requests ("strengthen that section about the diagram", "there are gaps in the test scenarios") and should not cause a holistic deepening pass. Only treat them as deepening intent when the request clearly targets the plan as a whole and does not name a specific section or content area to change — and even then, prefer to confirm with the user before entering the deepening flow.
71
+
72
+ Once the plan is identified and appears complete (all major sections present, implementation units defined, `status: active`):
73
+ - If the plan lacks YAML frontmatter (non-software plans use a simple `# Title` heading with `Created:` date instead of frontmatter), route to `references/universal-planning.md` for editing or deepening instead of Phase 5.3. Non-software plans do not use the software confidence check.
74
+ - Otherwise, short-circuit to Phase 5.3 (Confidence Check and Deepening) in **interactive mode**. This avoids re-running the full planning workflow and gives the user control over which findings are integrated.
75
+
76
+ Normal editing requests (e.g., "update the test scenarios", "add a new implementation unit", "strengthen the risk section") should NOT trigger the fast path — they follow the standard resume flow.
77
+
78
+ If the plan already has a `deepened: YYYY-MM-DD` frontmatter field and there is no explicit user request to re-deepen, the fast path still applies the same confidence-gap evaluation — it does not force deepening.
79
+
80
+ #### 0.1b Classify Task Domain
81
+
82
+ If the task involves building, modifying, or architecting software (references code, repos, APIs, databases, or asks to build/modify/deploy), continue to Phase 0.2.
83
+
84
+ If the task is about a non-software domain and describes a multi-step goal worth planning, read `references/universal-planning.md` and follow that workflow instead. Skip all subsequent phases.
85
+
86
+ If genuinely ambiguous (e.g., "plan a migration" with no other context), ask the user before routing.
87
+
88
+ For everything else (quick questions, error messages, factual lookups) **only when auto-selected**, respond directly without any planning workflow. When directly invoked by the user, treat the input as a planning request — ask clarifying questions if needed, but do not exit the workflow.
89
+
64
90
  #### 0.2 Find Upstream Requirements Document
65
91
 
66
92
  Before asking planning questions, search `docs/brainstorms/` for files matching `*-requirements.md`.
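Review note: the deepening trigger rules added in this hunk can be approximated with a keyword check. This is only a sketch under stated assumptions: the function name `classify_plan_request` and its three return labels are hypothetical, and the skill itself relies on the agent's judgment and on confirming with the user, not on string matching.

```python
import re

# Hypothetical approximation of the "Deepen intent" rules; not part of the package.
_DEEPEN = re.compile(r"\bdeepen(ing)?\b")
# Words the diff says are NOT sufficient on their own to trigger deepening.
_WEAK_SIGNALS = ("strengthen", "confidence", "gaps", "rigor")


def classify_plan_request(request: str) -> str:
    """Return 'deepen', 'confirm', or 'edit' for a request about an existing plan."""
    text = request.lower()
    if _DEEPEN.search(text):
        return "deepen"  # primary trigger for the fast path
    mentions_weak_signal = any(word in text for word in _WEAK_SIGNALS)
    # Assumption: the literal word "section" stands in for "names a specific area".
    names_a_section = "section" in text
    if mentions_weak_signal and "plan" in text and not names_a_section:
        return "confirm"  # whole-plan request: ask the user before deepening
    return "edit"  # normal editing request: standard resume flow
```

Under these assumptions, "deepen the plan" maps to the fast path, "strengthen the risk section" stays a normal edit, and "add rigor to the plan" prompts a confirmation first.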
@@ -90,12 +116,12 @@ If a relevant requirements document exists:
90
116
 
91
117
  If no relevant requirements document exists, planning may proceed from the user's request directly.
92
118
 
93
- #### 0.4 No-Requirements-Doc Fallback
119
+ #### 0.4 Planning Bootstrap (No Requirements Doc or Unclear Input)
94
120
 
95
- If no relevant requirements document exists:
96
- - Assess whether the request is already clear enough for direct technical planning
97
- - If the ambiguity is mainly product framing, user behavior, or scope definition, recommend `ce:brainstorm` first
98
- - If the user wants to continue here anyway, run a short planning bootstrap instead of refusing
121
+ If no relevant requirements document exists, or the input needs more structure:
122
+ - Assess whether the request is already clear enough for direct technical planning — if so, continue to Phase 0.5
123
+ - If the ambiguity is mainly product framing, user behavior, or scope definition, recommend `ce:brainstorm` as a suggestion — but always offer to continue planning here as well
124
+ - If the user wants to continue here (or was already explicit about wanting a plan), run the planning bootstrap below
99
125
 
100
126
  The planning bootstrap should establish:
101
127
  - Problem frame
@@ -110,6 +136,11 @@ If the bootstrap uncovers major unresolved product questions:
110
136
  - Recommend `ce:brainstorm` again
111
137
  - If the user still wants to continue, require explicit assumptions before proceeding
112
138
 
139
+ If the bootstrap reveals that a different workflow would serve the user better:
140
+
141
+ - **Symptom without a root cause** (user describes broken behavior but hasn't identified why) — announce that investigation is needed before planning and load the `ce:debug` skill. A plan requires a known problem to solve; debugging identifies what that problem is. Announce the routing clearly: "This needs investigation before planning — switching to ce:debug to find the root cause."
142
+ - **Clear task ready to execute** (known root cause, obvious fix, no architectural decisions) — suggest `ce:work` as a faster alternative alongside continuing with planning. The user decides.
143
+
113
144
  #### 0.5 Classify Outstanding Questions Before Planning
114
145
 
115
146
  If the origin document contains `Resolve Before Planning` or similar blocking questions:
@@ -144,9 +175,8 @@ Prepare a concise planning context summary (a paragraph or two) to pass as input
144
175
 
145
176
  Run these agents in parallel:
146
177
 
147
- - task systematic:research:repo-research-analyst(Scope: technology, architecture, patterns. {planning context summary})
148
- - task systematic:research:learnings-researcher(planning context summary)
149
-
178
+ - Task systematic:research:repo-research-analyst(Scope: technology, architecture, patterns. {planning context summary})
179
+ - Task systematic:research:learnings-researcher(planning context summary)
150
180
  Collect:
151
181
  - Technology stack and versions (used in section 1.2 to make sharper external research decisions)
152
182
  - Architectural patterns and conventions to follow
@@ -154,6 +184,12 @@ Collect:
154
184
  - AGENTS.md guidance that materially affects the plan, with AGENTS.md used only as compatibility fallback when present
155
185
  - Institutional learnings from `docs/solutions/`
156
186
 
187
+ **Slack context** (opt-in) — never auto-dispatch. Route by condition:
188
+
189
+ - **Tools available + user asked**: Dispatch `systematic:research:slack-researcher` with the planning context summary in parallel with other Phase 1.1 agents. If the origin document has a Slack context section, pass it verbatim so the researcher focuses on gaps. Include findings in consolidation.
190
+ - **Tools available + user didn't ask**: Note in output: "Slack tools detected. Ask me to search Slack for organizational context at any point, or include it in your next prompt."
191
+ - **No tools + user asked**: Note in output: "Slack context was requested but no Slack tools are available. Install and authenticate the Slack plugin to enable organizational context search."
192
+
157
193
  #### 1.1b Detect Execution Posture Signals
158
194
 
159
195
  Decide whether the plan should carry a lightweight execution posture signal.
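Review note: the Slack context routing added in this hunk is a two-input decision table. A minimal sketch follows. The enum and function names are hypothetical, and the fourth case (no tools, user did not ask) is an assumption, since the diff lists only three conditions.

```python
from enum import Enum


class SlackAction(Enum):
    DISPATCH = "dispatch slack-researcher in parallel with Phase 1.1 agents"
    HINT_AVAILABLE = "note that Slack tools were detected"
    NOTE_UNAVAILABLE = "note that no Slack tools are available"
    SKIP = "do nothing"  # assumption: the diff states no rule for this case


def route_slack_context(tools_available: bool, user_asked: bool) -> SlackAction:
    """Map the two conditions to an action; never dispatch unless the user asked."""
    if tools_available and user_asked:
        return SlackAction.DISPATCH
    if tools_available:
        return SlackAction.HINT_AVAILABLE
    if user_asked:
        return SlackAction.NOTE_UNAVAILABLE
    return SlackAction.SKIP
```

Writing the table out this way makes the opt-in property visible: `DISPATCH` is reachable only when `user_asked` is true.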
@@ -162,7 +198,6 @@ Look for signals such as:
162
198
  - The user explicitly asks for TDD, test-first, or characterization-first work
163
199
  - The origin document calls for test-first implementation or exploratory hardening of legacy code
164
200
  - Local research shows the target area is legacy, weakly tested, or historically fragile, suggesting characterization coverage before changing behavior
165
- - The user asks for external delegation, says "use codex", "delegate mode", or mentions token conservation -- add `Execution target: external-delegate` to implementation units that are pure code writing
166
201
 
167
202
  When the signal is clear, carry it forward silently in the relevant implementation units.
168
203
 
@@ -190,12 +225,13 @@ The repo-research-analyst output includes a structured Technology & Infrastructu
190
225
 
191
226
  **Always lean toward external research when:**
192
227
  - The topic is high-risk: security, payments, privacy, external APIs, migrations, compliance
193
- - The codebase lacks relevant local patterns
228
+ - The codebase lacks relevant local patterns -- fewer than 3 direct examples of the pattern this plan needs
229
+ - Local patterns exist for an adjacent domain but not the exact one -- e.g., the codebase has HTTP clients but not webhook receivers, or has background jobs but not event-driven pub/sub. Adjacent patterns suggest the team is comfortable with the technology layer but may not know domain-specific pitfalls. When this signal is present, frame the external research query around the domain gap specifically, not the general technology
194
230
  - The user is exploring unfamiliar territory
195
231
  - The technology scan found the relevant layer absent or thin in the codebase
196
232
 
197
233
  **Skip external research when:**
198
- - The codebase already shows a strong local pattern
234
+ - The codebase already shows a strong local pattern -- multiple direct examples (not adjacent-domain), recently touched, following current conventions
199
235
  - The user already knows the intended shape
200
236
  - Additional external context would add little practical value
201
237
  - The technology scan found the relevant layer well-established with existing examples to follow
@@ -208,23 +244,36 @@ Announce the decision briefly before continuing. Examples:
208
244
 
209
245
  If Step 1.2 indicates external research is useful, run these agents in parallel:
210
246
 
211
- - task systematic:research:best-practices-researcher(planning context summary)
212
- - task systematic:research:framework-docs-researcher(planning context summary)
247
+ - Task systematic:research:best-practices-researcher(planning context summary)
248
+ - Task systematic:research:framework-docs-researcher(planning context summary)
213
249
 
214
250
  #### 1.4 Consolidate Research
215
251
 
216
252
  Summarize:
217
253
  - Relevant codebase patterns and file paths
218
254
  - Relevant institutional learnings
255
+ - Organizational context from Slack conversations, if gathered (prior discussions, decisions, or domain knowledge relevant to the feature)
219
256
  - External references and best practices, if gathered
220
257
  - Related issues, PRs, or prior art
221
258
  - Any constraints that should materially shape the plan
222
259
 
260
+ #### 1.4b Reclassify Depth When Research Reveals External Contract Surfaces
261
+
262
+ If the current classification is **Lightweight** and Phase 1 research found that the work touches any of these external contract surfaces, reclassify to **Standard**:
263
+
264
+ - Environment variables consumed by external systems, CI, or other repositories
265
+ - Exported public APIs, CLI flags, or command-line interface contracts
266
+ - CI/CD configuration files (`.github/workflows/`, `Dockerfile`, deployment scripts)
267
+ - Shared types or interfaces imported by downstream consumers
268
+ - Documentation referenced by external URLs or linked from other systems
269
+
270
+ This ensures flow analysis (Phase 1.5) runs and the confidence check (Phase 5.3) applies critical-section bonuses. Announce the reclassification briefly: "Reclassifying to Standard — this change touches [environment variables / exported APIs / CI config] with external consumers."
271
+
223
272
  #### 1.5 Flow and Edge-Case Analysis (Conditional)
224
273
 
225
274
  For **Standard** or **Deep** plans, or when user flow completeness is still unclear, run:
226
275
 
227
- - task systematic:workflow:spec-flow-analyzer(planning context summary, research findings)
276
+ - Task systematic:workflow:spec-flow-analyzer(planning context summary, research findings)
228
277
 
229
278
  Use the output to:
230
279
  - Identify missing edge cases, state transitions, or handoff gaps
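Review note: the reclassification rule in the new section 1.4b reduces to a set intersection. This sketch assumes hypothetical surface labels and a hypothetical function name `reclassify_depth`; the skill describes the surfaces in prose and does not define identifiers for them.

```python
# Hypothetical labels for the five external contract surfaces listed in 1.4b.
EXTERNAL_CONTRACT_SURFACES = frozenset({
    "env_vars",                # consumed by external systems, CI, or other repos
    "public_api_or_cli",       # exported APIs, CLI flags
    "ci_cd_config",            # .github/workflows/, Dockerfile, deploy scripts
    "shared_types",            # imported by downstream consumers
    "externally_linked_docs",  # referenced by external URLs
})


def reclassify_depth(current: str, surfaces_touched: set[str]) -> str:
    """Promote Lightweight to Standard when any external contract surface is touched."""
    if current == "Lightweight" and surfaces_touched & EXTERNAL_CONTRACT_SURFACES:
        return "Standard"
    # Standard and Deep plans are left unchanged; the rule only promotes.
    return current
```

The rule only ever promotes: a plan already classified as Standard or Deep keeps its classification regardless of which surfaces it touches.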
@@ -292,6 +341,7 @@ Before detailing implementation units, decide whether an overview would help a r
292
341
  | Data pipeline or transformation | Data flow sketch |
293
342
  | State-heavy lifecycle | State diagram |
294
343
  | Complex branching logic | Flowchart |
344
+ | Mode/flag combinations or multi-input behavior | Decision matrix (inputs -> outcomes) |
295
345
  | Single-component with non-obvious shape | Pseudo-code sketch |
296
346
 
297
347
  **When to skip it:**
@@ -305,18 +355,36 @@ Frame every sketch with: *"This illustrates the intended approach and is directi
305
355
 
306
356
  Keep sketches concise — enough to validate direction, not enough to copy-paste into production.
307
357
 
358
+ #### 3.4b Output Structure (Optional)
359
+
360
+ For greenfield plans that create a new directory structure (new plugin, service, package, or module), include an `## Output Structure` section with a file tree showing the expected layout. This gives reviewers the overall shape before diving into per-unit details.
361
+
362
+ **When to include it:**
363
+ - The plan creates 3+ new files in a new directory hierarchy
364
+ - The directory layout itself is a meaningful design decision
365
+
366
+ **When to skip it:**
367
+ - The plan only modifies existing files
368
+ - The plan creates 1-2 files in an existing directory — the per-unit file lists are sufficient
369
+
370
+ The tree is a scope declaration showing the expected output shape. It is not a constraint — the implementer may adjust the structure if implementation reveals a better layout. The per-unit `**Files:**` sections remain authoritative for what each unit creates or modifies.
371
+
308
372
  #### 3.5 Define Each Implementation Unit
309
373
 
310
374
  For each unit, include:
311
375
  - **Goal** - what this unit accomplishes
312
376
  - **Requirements** - which requirements or success criteria it advances
313
377
  - **Dependencies** - what must exist first
314
- - **Files** - exact file paths to create, modify, or test
378
+ - **Files** - repo-relative file paths to create, modify, or test (never absolute paths)
315
379
  - **Approach** - key decisions, data flow, component boundaries, or integration notes
316
- - **Execution note** - optional, only when the unit benefits from a non-default execution posture such as test-first, characterization-first, or external delegation
380
+ - **Execution note** - optional, only when the unit benefits from a non-default execution posture such as test-first or characterization-first
317
381
  - **Technical design** - optional pseudo-code or diagram when the unit's approach is non-obvious and prose alone would leave it ambiguous. Frame explicitly as directional guidance, not implementation specification
318
382
  - **Patterns to follow** - existing code or conventions to mirror
319
- - **Test scenarios** - specific behaviors, edge cases, and failure paths to cover
383
+ - **Test scenarios** - enumerate the specific test cases the implementer should write, right-sized to the unit's complexity and risk. Consider each category below and include scenarios from every category that applies to this unit. A simple config change may need one scenario; a payment flow may need a dozen. The quality signal is specificity — each scenario should name the input, action, and expected outcome so the implementer doesn't have to invent coverage. For units with no behavioral change (pure config, scaffolding, styling), use `Test expectation: none -- [reason]` instead of leaving the field blank.
384
+ - **Happy path behaviors** - core functionality with expected inputs and outputs
385
+ - **Edge cases** (when the unit has meaningful boundaries) - boundary values, empty inputs, nil/null states, concurrent access
386
+ - **Error and failure paths** (when the unit has failure modes) - invalid input, downstream service failures, timeout behavior, permission denials
387
+ - **Integration scenarios** (when the unit crosses layers) - behaviors that mocks alone will not prove, e.g., "creating X triggers callback Y which persists Z". Include these for any unit touching callbacks, middleware, or multi-layer interactions
320
388
  - **Verification** - how an implementer should know the unit is complete, expressed as outcomes rather than shell command scripts
321
389
 
322
390
  Every feature-bearing unit should include the test file path in `**Files:**`.
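Review note: the revised test-scenario guidance asks each scenario to name an input or action and an expected outcome, prefixed with a category. A minimal sketch of that shape is below. The `Scenario` class and its field names are hypothetical, and the rendered bullet format is one reading of the template line, not a format the package enforces.

```python
from dataclasses import dataclass

# Categories named in the updated plan template.
CATEGORIES = ("Happy path", "Edge case", "Error path", "Integration")


@dataclass(frozen=True)
class Scenario:
    category: str
    given: str     # the specific input or action
    expected: str  # the expected outcome

    def render(self) -> str:
        """Render one plan bullet: 'Category: input/action -> expected outcome'."""
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")
        return f"- {self.category}: {self.given} -> {self.expected}"


def render_scenarios(scenarios: list[Scenario], reason_if_none: str = "") -> str:
    """Render a unit's scenarios, or the explicit 'none' marker for no behavioral change."""
    if not scenarios:
        return f"Test expectation: none -- {reason_if_none}"
    return "\n".join(s.render() for s in scenarios)
```

For example, `Scenario("Edge case", "empty cart submitted", "returns validation error").render()` yields a bullet an implementer can turn into a test without inventing coverage, and an empty list yields the `Test expectation: none -- [reason]` line instead of a blank field.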
@@ -325,7 +393,6 @@ Use `Execution note` sparingly. Good uses include:
325
393
  - `Execution note: Start with a failing integration test for the request/response contract.`
326
394
  - `Execution note: Add characterization coverage before modifying this legacy parser.`
327
395
  - `Execution note: Implement new domain behavior test-first.`
328
- - `Execution note: Execution target: external-delegate`
329
396
 
330
397
  Do not expand units into literal `RED/GREEN/REFACTOR` substeps.
331
398
 
@@ -386,7 +453,7 @@ type: [feat|fix|refactor]
386
453
  status: active
387
454
  date: YYYY-MM-DD
388
455
  origin: docs/brainstorms/YYYY-MM-DD-<topic>-requirements.md # include when planning from a requirements doc
389
- deepened: YYYY-MM-DD # optional, set later by deepen-plan when the plan is substantively strengthened
456
+ deepened: YYYY-MM-DD # optional, set when the confidence check substantively strengthens the plan
390
457
  ---
391
458
 
392
459
  # [Plan Title]
@@ -408,6 +475,12 @@ deepened: YYYY-MM-DD # optional, set later by deepen-plan when the plan is subs
408
475
 
409
476
  - [Explicit non-goal or exclusion]
410
477
 
478
+ <!-- Optional: When some items are planned work that will happen in a separate PR, issue,
479
+ or repo, use this sub-heading to distinguish them from true non-goals. -->
480
+ ### Deferred to Separate Tasks
481
+
482
+ - [Work that will be done separately]: [Where or when -- e.g., "separate PR in repo-x", "future iteration"]
483
+
411
484
  ## Context & Research
412
485
 
413
486
  ### Relevant Code and Patterns
@@ -436,6 +509,14 @@ deepened: YYYY-MM-DD # optional, set later by deepen-plan when the plan is subs
436
509
 
437
510
  - [Question or unknown]: [Why it is intentionally deferred]
438
511
 
512
+ <!-- Optional: Include when the plan creates a new directory structure (greenfield plugin,
513
+ new service, new package). Shows the expected output shape at a glance. Omit for plans
514
+ that only modify existing files. This is a scope declaration, not a constraint --
515
+ the implementer may adjust the structure if implementation reveals a better layout. -->
516
+ ## Output Structure
517
+
518
+ [directory tree showing new directories and files]
519
+
439
520
  <!-- Optional: Include this section only when the work involves DSL design, multi-component
440
521
  integration, complex data flow, state-heavy lifecycle, or other cases where prose alone
441
522
  would leave the approach shape ambiguous. Omit it entirely for well-patterned or
@@ -464,7 +545,7 @@ deepened: YYYY-MM-DD # optional, set later by deepen-plan when the plan is subs
464
545
  **Approach:**
465
546
  - [Key design or sequencing decision]
466
547
 
467
- **Execution note:** [Optional test-first, characterization-first, external-delegate, or other execution posture signal]
548
+ **Execution note:** [Optional test-first, characterization-first, or other execution posture signal]
468
549
 
469
550
  **Technical design:** *(optional -- pseudo-code or diagram when the unit's approach is non-obvious. Directional guidance, not implementation specification.)*
470
551
 
@@ -472,8 +553,8 @@ deepened: YYYY-MM-DD # optional, set later by deepen-plan when the plan is subs
472
553
  - [Existing file, class, or pattern]
473
554
 
474
555
  **Test scenarios:**
475
- - [Specific scenario with expected behavior]
476
- - [Edge case or failure path]
556
+ <!-- Include only categories that apply to this unit. Omit categories that don't. For units with no behavioral change, use "Test expectation: none -- [reason]" instead of leaving this section blank. -->
557
+ - [Scenario: specific input/action -> expected outcome. Prefix with category — Happy path, Edge case, Error path, or Integration — to signal intent]
477
558
 
478
559
  **Verification:**
479
560
  - [Outcome that should hold when this unit is complete]
@@ -485,10 +566,13 @@ deepened: YYYY-MM-DD # optional, set later by deepen-plan when the plan is subs
485
566
  - **State lifecycle risks:** [Partial-write, cache, duplicate, or cleanup concerns]
486
567
  - **API surface parity:** [Other interfaces that may require the same change]
487
568
  - **Integration coverage:** [Cross-layer scenarios unit tests alone will not prove]
569
+ - **Unchanged invariants:** [Existing APIs, interfaces, or behaviors that this plan explicitly does not change — and how the new work relates to them. Include when the change touches shared surfaces and reviewers need blast-radius assurance]
488
570
 
489
571
  ## Risks & Dependencies
490
572
 
491
- - [Meaningful risk, dependency, or sequencing concern]
573
+ | Risk | Mitigation |
574
+ |------|------------|
575
+ | [Meaningful risk] | [How it is addressed or accepted] |
492
576
 
493
577
  ## Documentation / Operational Notes
494
578
 
@@ -519,7 +603,9 @@ For larger `Deep` plans, extend the core template only when useful with sections
519
603
 
520
604
  ## Risk Analysis & Mitigation
521
605
 
522
- - [Risk]: [Mitigation]
606
+ | Risk | Likelihood | Impact | Mitigation |
607
+ |------|-----------|--------|------------|
608
+ | [Risk] | [Low/Med/High] | [Low/Med/High] | [How addressed] |
523
609
 
524
610
  ## Phased Delivery
525
611
 
@@ -540,6 +626,7 @@ For larger `Deep` plans, extend the core template only when useful with sections
540
626
 
541
627
  #### 4.3 Planning Rules
542
628
 
629
+ - **All file paths must be repo-relative** — never use absolute paths like `/Users/name/Code/project/src/file.ts`. Use `src/file.ts` instead. Absolute paths make plans non-portable across machines, worktrees, and teammates. When a plan targets a different repo than the document's home, state the target repo once at the top of the plan (e.g., `**Target repo:** my-other-project`) and use repo-relative paths throughout
543
630
  - Prefer path plus class/component/pattern references over brittle line numbers
544
631
  - Keep implementation units checkable with `- [ ]` syntax for progress tracking
545
632
  - Do not include implementation code — no imports, exact method signatures, or framework-specific syntax
@@ -549,6 +636,10 @@ For larger `Deep` plans, extend the core template only when useful with sections
549
636
  - Do not expand implementation units into micro-step `RED/GREEN/REFACTOR` instructions
550
637
  - Do not pretend an execution-time question is settled just to make the plan look complete
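The repo-relative path rule above lends itself to a mechanical check. The following is a minimal sketch, not part of the package: the `ABSOLUTE_PATH` pattern and the `find_absolute_paths` helper are hypothetical names, and the list of root directories they look for is an assumption chosen for illustration.

```python
import re

# Hypothetical pattern (an assumption, not from the skill): POSIX paths rooted
# at common home/system directories. The lookbehind skips matches that are
# really the tail of a relative path or URL, such as "docs/home/x".
ABSOLUTE_PATH = re.compile(
    r"(?<![\w./-])(/(?:Users|home|root|var|opt|tmp|etc)/[^\s`'\")]+)"
)


def find_absolute_paths(plan_text: str) -> list[tuple[int, str]]:
    """Return (line_number, path) pairs for each absolute path in the plan."""
    hits = []
    for line_number, line in enumerate(plan_text.splitlines(), start=1):
        for match in ABSOLUTE_PATH.finditer(line):
            hits.append((line_number, match.group(1)))
    return hits


plan = (
    "- Modify `/Users/name/Code/project/src/file.ts`\n"
    "- Modify `src/file.ts`\n"
)
for line_number, path in find_absolute_paths(plan):
    print(f"line {line_number}: absolute path {path}")
```

A check like this only catches the listed roots; it is meant to flag likely offenders for a reviewer, not to prove a plan is portable.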
551
638
 
639
+ #### 4.4 Visual Communication in Plan Documents
640
+
641
+ When the plan contains 4+ implementation units with non-linear dependencies, 3+ interacting surfaces in System-Wide Impact, 3+ behavioral modes/variants in Overview or Problem Frame, or 3+ interacting decisions in Key Technical Decisions or alternatives in Alternative Approaches, read `references/visual-communication.md` for diagram and table guidance. This covers plan-structure visuals (dependency graphs, interaction diagrams, comparison tables) — not solution-design diagrams, which are covered in Section 3.4.
642
+
552
643
  ### Phase 5: Final Review, Write File, and Handoff
553
644
 
554
645
  #### 5.1 Review Before Writing
@@ -559,10 +650,15 @@ Before finalizing, check:
559
650
  - Every major decision is grounded in the origin document or research
560
651
  - Each implementation unit is concrete, dependency-ordered, and implementation-ready
561
652
  - If test-first or characterization-first posture was explicit or strongly implied, the relevant units carry it forward with a lightweight `Execution note`
562
- - Test scenarios are specific without becoming test code
653
+ - Each feature-bearing unit has test scenarios from every applicable category (happy path, edge cases, error paths, integration) — right-sized to the unit's complexity, not padded or skimped
654
+ - Test scenarios name specific inputs, actions, and expected outcomes without becoming test code
655
+ - Feature-bearing units with blank or missing test scenarios are flagged as incomplete — feature-bearing units must have actual test scenarios, not just an annotation. The `Test expectation: none -- [reason]` annotation is only valid for non-feature-bearing units (pure config, scaffolding, styling)
563
656
  - Deferred items are explicit and not hidden as fake certainty
564
657
  - If a High-Level Technical Design section is included, it uses the right medium for the work, carries the non-prescriptive framing, and does not contain implementation code (no imports, exact signatures, or framework-specific syntax)
565
658
  - Per-unit technical design fields, if present, are concise and directional rather than copy-paste-ready
659
+ - If the plan creates a new directory structure, would an Output Structure tree help reviewers see the overall shape?
660
+ - If Scope Boundaries lists items that are planned work for a separate PR or task, are they under `### Deferred to Separate Tasks` rather than mixed with true non-goals?
661
+ - Would a visual aid (dependency graph, interaction diagram, comparison table) help a reader grasp the plan structure faster than scanning prose alone?
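The test-scenario completeness check in the list above can be sketched as a small data model. This is an illustration under stated assumptions: `ImplementationUnit`, its fields, and `incomplete_units` are hypothetical names, and the skill itself applies this check by reading the plan, not by running code.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ImplementationUnit:
    # All field names here are hypothetical, chosen for illustration only.
    name: str
    feature_bearing: bool
    test_scenarios: list[str] = field(default_factory=list)
    # Mirrors the `Test expectation: none -- [reason]` annotation.
    test_expectation_none_reason: Optional[str] = None


def incomplete_units(units: list[ImplementationUnit]) -> list[str]:
    """Return names of feature-bearing units that lack real test scenarios.

    The annotation does not rescue a feature-bearing unit: it is only
    valid for non-feature-bearing units (pure config, scaffolding, styling).
    """
    flagged = []
    for unit in units:
        has_scenarios = any(s.strip() for s in unit.test_scenarios)
        if unit.feature_bearing and not has_scenarios:
            flagged.append(unit.name)
    return flagged


units = [
    ImplementationUnit("Add export endpoint", True, ["Empty dataset returns an empty file"]),
    ImplementationUnit("Add retry logic", True, [], "covered elsewhere"),
    ImplementationUnit("Update lint config", False, [], "pure config"),
]
print(incomplete_units(units))
```

In this sketch only the second unit is flagged: it is feature-bearing, has no scenarios, and relies on an annotation that the rule reserves for non-feature-bearing units.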
566
662
 
567
663
  If the plan originated from a requirements document, re-read that document and verify:
568
664
  - The chosen approach still matches the product intent
@@ -574,7 +670,7 @@ If the plan originated from a requirements document, re-read that document and v
574
670
 
575
671
  **REQUIRED: Write the plan file to disk before presenting any options.**
576
672
 
577
- Use the write tool to save the complete plan to:
673
+ Use the Write tool to save the complete plan to:
578
674
 
579
675
  ```text
580
676
  docs/plans/YYYY-MM-DD-NNN-<type>-<descriptive-name>-plan.md
@@ -588,66 +684,54 @@ Plan written to docs/plans/[filename]
588
684
 
589
685
  **Pipeline mode:** If invoked from an automated workflow such as LFG, SLFG, or any `disable-model-invocation` context, skip interactive questions. Make the needed choices automatically and proceed to writing the plan.
590
686
 
591
- #### 5.3 Post-Generation Options
687
+ #### 5.3 Confidence Check and Deepening
592
688
 
593
- After writing the plan file, present the options using the platform's blocking question tool when available (see Interaction Method). Otherwise present numbered options in chat and wait for the user's reply before proceeding.
689
+ After writing the plan file, automatically evaluate whether the plan needs strengthening.
594
690
 
595
- **Question:** "Plan ready at `docs/plans/YYYY-MM-DD-NNN-<type>-<name>-plan.md`. What would you like to do next?"
691
+ **Two deepening modes:**
596
692
 
597
- **Options:**
598
- 1. **Open plan in editor** - Open the plan file for review
599
- 2. **Run `/deepen-plan`** - Stress-test weak sections with targeted research when the plan needs more confidence
600
- 3. **Run `document-review` skill** - Improve the plan through structured document review
601
- 4. **Share to Proof** - Upload the plan for collaborative review and sharing
602
- 5. **Start `/ce:work`** - Begin implementing this plan in the current environment
603
- 6. **Start `/ce:work` in another session** - Begin implementing in a separate agent session when the current platform supports it
604
- 7. **Create Issue** - Create an issue in the configured tracker
693
+ - **Auto mode** (default during plan generation): Runs without asking the user for approval. The user sees what is being strengthened but does not need to make a decision. Sub-agent findings are synthesized directly into the plan.
694
+ - **Interactive mode** (activated by the re-deepen fast path in Phase 0.1): The user explicitly asked to deepen an existing plan. Sub-agent findings are presented individually for review before integration. The user can accept, reject, or discuss each agent's findings. Only accepted findings are synthesized into the plan.
605
695
 
606
- Based on selection:
607
- - **Open plan in editor** → Open `docs/plans/<plan_filename>.md` using the current platform's file-open or editor mechanism (e.g., `open` on macOS, `xdg-open` on Linux, or the IDE's file-open API)
608
- - **`/deepen-plan`** → Call `/deepen-plan` with the plan path
609
- - **`document-review` skill** → Load the `document-review` skill with the plan path
610
- - **Share to Proof** → Upload the plan:
611
- ```bash
612
- CONTENT=$(cat docs/plans/<plan_filename>.md)
613
- TITLE="Plan: <plan title from frontmatter>"
614
- RESPONSE=$(curl -s -X POST https://www.proofeditor.ai/share/markdown \
615
- -H "Content-Type: application/json" \
616
- -d "$(jq -n --arg title "$TITLE" --arg markdown "$CONTENT" --arg by "ai:compound" '{title: $title, markdown: $markdown, by: $by}')")
617
- PROOF_URL=$(echo "$RESPONSE" | jq -r '.tokenUrl')
618
- ```
619
- Display `View & collaborate in Proof: <PROOF_URL>` if successful, then return to the options
620
- - **`/ce:work`** → Call `/ce:work` with the plan path
621
- - **`/ce:work` in another session** → If the current platform supports launching a separate agent session, start `/ce:work` with the plan path there. Otherwise, explain the limitation briefly and offer to run `/ce:work` in the current session instead.
622
- - **Create Issue** → Follow the Issue Creation section below
623
- - **Other** → Accept free text for revisions and loop back to options
696
+ Interactive mode exists because on-demand deepening is a different user posture — the user already has a plan they are invested in and wants to be surgical about what changes. This applies whether the plan was generated by this skill, written by hand, or produced by another tool.
624
697
 
625
- If running with ultrathink enabled, or the platform's reasoning/effort level is set to max or extra-high, automatically run `/deepen-plan` only when the plan is `Standard` or `Deep`, high-risk, or still shows meaningful confidence gaps in decisions, sequencing, system-wide impact, risks, or verification.
698
+ `document-review` and this confidence check are different:
699
+ - Use the `document-review` skill when the document needs clarity, simplification, completeness, or scope control
700
+ - This confidence check strengthens rationale, sequencing, risk treatment, and system-wide thinking when the plan is structurally sound but still needs stronger grounding
626
701
 
627
- ## Issue Creation
702
+ **Pipeline mode:** This phase always runs in auto mode in pipeline/disable-model-invocation contexts. No user interaction needed.
628
703
 
629
- When the user selects "Create Issue", detect their project tracker from `AGENTS.md`:
704
+ ##### 5.3.1 Classify Plan Depth and Topic Risk
630
705
 
631
- 1. Look for `project_tracker: github` or `project_tracker: linear`
632
- 2. If GitHub:
706
+ Determine the plan depth from the document:
707
+ - **Lightweight** - small, bounded, low ambiguity, usually 2-4 implementation units
708
+ - **Standard** - moderate complexity, some technical decisions, usually 3-6 units
709
+ - **Deep** - cross-cutting, high-risk, or strategically important work, usually 4-8 units or phased delivery
633
710
 
634
- ```bash
635
- gh issue create --title "<type>: <title>" --body-file <plan_path>
636
- ```
711
+ Build a risk profile. Treat these as high-risk signals:
712
+ - Authentication, authorization, or security-sensitive behavior
713
+ - Payments, billing, or financial flows
714
+ - Data migrations, backfills, or persistent data changes
715
+ - External APIs or third-party integrations
716
+ - Privacy, compliance, or user data handling
717
+ - Cross-interface parity or multi-surface behavior
718
+ - Significant rollout, monitoring, or operational concerns
637
719
 
638
- 3. If Linear:
720
+ ##### 5.3.2 Gate: Decide Whether to Deepen
639
721
 
640
- ```bash
641
- linear issue create --title "<title>" --description "$(cat <plan_path>)"
642
- ```
722
+ - **Lightweight** plans usually do not need deepening unless they are high-risk
723
+ - **Standard** plans often benefit when one or more important sections still look thin
724
+ - **Deep** or high-risk plans often benefit from a targeted second pass
725
+ - **Thin local grounding override:** If Phase 1.2 triggered external research because local patterns were thin (fewer than 3 direct examples, or only an adjacent-domain match), always proceed to scoring regardless of how grounded the plan appears. When the plan was built on unfamiliar territory, claims about system behavior are more likely to be assumptions than verified facts. The scoring pass is cheap: if the plan is genuinely solid, scoring finds nothing and exits quickly
643
726
 
644
- 4. If no tracker is configured:
645
- - Ask which tracker they use using the platform's blocking question tool when available (see Interaction Method)
646
- - Suggest adding the tracker to `AGENTS.md` for future runs
727
+ If the plan already appears sufficiently grounded and the thin-grounding override does not apply, report "Confidence check passed — no sections need strengthening" and skip to Phase 5.3.8 (Document Review). Document-review always runs regardless of whether deepening was needed — the two tools catch different classes of issues.
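The gate in 5.3.2 can be read as a small decision function. The sketch below is one possible reading, not the skill's implementation: `PlanProfile`, its fields, and `should_proceed_to_scoring` are hypothetical names, and the source's "usually" and "often" are hardened into fixed rules purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class PlanProfile:
    # Hypothetical fields; the skill derives these by reading the plan.
    depth: str                  # "Lightweight", "Standard", or "Deep"
    high_risk: bool             # any high-risk signal from 5.3.1 is present
    thin_sections: int          # count of important sections that look thin
    thin_local_grounding: bool  # Phase 1.2 fell back to external research


def should_proceed_to_scoring(plan: PlanProfile) -> bool:
    """Decide whether the confidence check should go on to scoring."""
    # The thin-grounding override wins regardless of depth or apparent quality.
    if plan.thin_local_grounding:
        return True
    # Assumption: "often benefit" is treated as "always" for Deep or high-risk.
    if plan.high_risk or plan.depth == "Deep":
        return True
    # Standard plans proceed when at least one important section looks thin.
    if plan.depth == "Standard":
        return plan.thin_sections >= 1
    # Lightweight, low-risk plans skip deepening.
    return False


plan = PlanProfile("Lightweight", high_risk=False, thin_sections=0, thin_local_grounding=True)
if should_proceed_to_scoring(plan):
    print("Proceed to scoring")
else:
    print("Confidence check passed; continue to document review")
```

Either branch still ends in document review, since the source states that review runs whether or not deepening was needed.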
647
728
 
648
- After issue creation:
649
- - Display the issue URL
650
- - Ask whether to proceed to `/ce:work`
729
+ ##### 5.3.3–5.3.7 Deepening Execution
651
730
 
652
- NEVER CODE! Research, decide, and write the plan.
731
+ When deepening is warranted, read `references/deepening-workflow.md` for confidence scoring checklists, section-to-agent dispatch mapping, execution mode selection, research execution, interactive finding review, and plan synthesis instructions. Execute steps 5.3.3 through 5.3.7 from that file, then return here for 5.3.8.
732
+
733
+ ##### 5.3.8–5.4 Document Review, Final Checks, and Post-Generation Options
653
734
 
735
+ When reaching this phase, read `references/plan-handoff.md` for document review instructions (5.3.8), final checks and cleanup (5.3.9), post-generation options menu (5.4), and issue creation. Do not load this file earlier. Document review is mandatory — do not skip it even if the confidence check already ran.
736
+
737
+ NEVER CODE! Research, decide, and write the plan.