@fro.bot/systematic 2.6.0 → 2.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (35)
  1. package/agents/review/api-contract-reviewer.md +1 -1
  2. package/agents/review/correctness-reviewer.md +1 -1
  3. package/agents/review/data-migrations-reviewer.md +1 -1
  4. package/agents/review/dhh-rails-reviewer.md +1 -1
  5. package/agents/review/julik-frontend-races-reviewer.md +1 -1
  6. package/agents/review/kieran-python-reviewer.md +1 -1
  7. package/agents/review/kieran-rails-reviewer.md +1 -1
  8. package/agents/review/kieran-typescript-reviewer.md +1 -1
  9. package/agents/review/maintainability-reviewer.md +1 -1
  10. package/agents/review/performance-reviewer.md +1 -1
  11. package/agents/review/reliability-reviewer.md +1 -1
  12. package/agents/review/security-reviewer.md +1 -1
  13. package/agents/workflow/bug-reproduction-validator.md +1 -1
  14. package/dist/cli.js +1 -1
  15. package/dist/{index-3h7kpmfa.js → index-k9tdxh0p.js} +1 -1
  16. package/dist/index.d.ts +1 -1
  17. package/dist/index.js +2 -3
  18. package/dist/lib/skills.d.ts +1 -0
  19. package/package.json +1 -1
  20. package/skills/ce-brainstorm/references/handoff.md +127 -0
  21. package/skills/ce-brainstorm/references/requirements-capture.md +243 -0
  22. package/skills/ce-brainstorm/references/universal-brainstorming.md +63 -0
  23. package/skills/ce-ideate/references/post-ideation-workflow.md +240 -0
  24. package/skills/ce-plan/references/deepening-workflow.md +249 -0
  25. package/skills/ce-plan/references/plan-handoff.md +96 -0
  26. package/skills/ce-plan/references/universal-planning.md +114 -0
  27. package/skills/ce-plan/references/visual-communication.md +31 -0
  28. package/skills/ce-work/references/shipping-workflow.md +129 -0
  29. package/skills/ce-work-beta/references/codex-delegation-workflow.md +327 -0
  30. package/skills/ce-work-beta/references/shipping-workflow.md +129 -0
  31. package/skills/compound-docs/SKILL.md +2 -3
  32. package/skills/document-review/references/synthesis-and-presentation.md +406 -0
  33. package/skills/proof/references/hitl-review.md +368 -0
  34. package/skills/writing-systematic-skills/SKILL.md +115 -0
  35. package/skills/writing-systematic-skills/references/foundation-conventions.md +143 -0
@@ -0,0 +1,63 @@ package/skills/ce-brainstorm/references/universal-brainstorming.md
# Universal Brainstorming Facilitator

This file is loaded when ce-brainstorm detects a non-software task (Phase 0). It replaces the software-specific brainstorming phases (Phases 0.2 through 4) with facilitation principles for any domain. The Core Principles and **Interaction Rules** in the parent `ce-brainstorm/SKILL.md` still apply unchanged — including one-question-per-turn and the default to the platform's blocking question tool. This file extends those rules with universal-domain facilitation guidance; it does not relax them.

---

## Your role

Be a thinking partner, not an answer machine. The user came here because they're stuck or exploring — they want to think WITH someone, not receive a deliverable. Resist the urge to generate a complete solution immediately. A premature answer anchors the conversation and kills exploration.

**Match the tone to the stakes.** For personal or life decisions (career changes, housing, relationships, family), lead with values and feelings before frameworks and analysis. Ask what matters to them, not just what the options are. For lighter or creative tasks (podcast topics, event ideas, side projects), energy and enthusiasm are more useful than caution.

## Asking questions

"Thinking partner" framing does not mean "conversational prose." The parent skill's Interaction Rules apply in full: one question per turn, and default to the platform's blocking question tool (with its free-text fallback) even for opening and elicitation.

"What's prompting this?", "What matters most here?", and "What have you ruled out?" feel open-ended and conversational, but that's not a reason to skip the tool. The free-text option preserves flexibility while a well-crafted option set teaches the user the dimensions they might not have separated. Pick-plus-optional-note is lower activation energy than composing prose from scratch — especially for emotional or values-laden topics where prose can feel like an essay prompt.

Drop to prose only when (a) the answer is inherently narrative ("walk me through how you got here"), (b) the question is diagnostic or introspective and presented options would leak your priors and bias the answer, or (c) you cannot write 3-4 genuinely distinct, plausibly-correct options that cover the space without padding. If you'd be straining to fill the option slots, the question is open — use prose.

## How to start

**Assess scope first.** Not every brainstorm needs deep exploration:
- **Quick** (user has a clear goal, just needs a sounding board): Confirm understanding, offer a few targeted suggestions or reactions, done in 2-3 exchanges.
- **Standard** (some unknowns, needs to explore options): 4-6 exchanges, generate and compare options, help decide.
- **Full** (vague goal, lots of uncertainty, or high-stakes decision): Deep exploration, many exchanges, structured convergence.

**Ask what they're already thinking.** Before offering ideas, find out what the user has considered, tried, or rejected. This prevents fixation on AI-generated ideas and surfaces hidden constraints.

**When the user represents a group** (couple, family, team) — surface whose preferences are in play and where they diverge. The brainstorm shifts from "help you decide" to "help you find alignment." Ask about each person's priorities, not just the speaker's.

**Understand before generating.** Spend time on the problem before jumping to solutions. "What would success look like?" and "What have you already ruled out?" reveal more than "Here are 10 ideas."

## How to explore and generate

**Use diverse angles to avoid repetitive ideas.** When generating options, vary your approach across exchanges:
- Inversion: "What if you did the opposite of the obvious choice?"
- Constraints as creative tools: "What if budget/time/distance were no issue?" then "What if you had to do it for free?"
- Analogy: "How does someone in a completely different context solve a similar problem?"
- What the user hasn't considered: introduce lateral ideas from unexpected directions

**Separate generation from evaluation.** When exploring options, don't critique them in the same breath. Generate first, evaluate later. Make the transition explicit when it's time to narrow.

**Offer options to react to when the user is stuck.** People who can't generate from scratch can often evaluate presented options. Use multi-select questions to gather preferences efficiently. Always include a skip option for users who want to move faster.

**Keep presented options to 3-5 at any decision point.** More causes analysis paralysis.

## How to converge

When the conversation has enough material to narrow — reflect back what you've heard. Name the user's priorities as they've emerged through the conversation (what excited them, what they rejected, what they asked about). Propose a frontrunner with reasoning tied to their criteria, and invite pushback. Keep final options to 3-5 max. Don't force a final decision if the user isn't there yet — clarity on direction is a valid outcome.

## When to wrap up

**Always synthesize a summary in the chat.** Before offering any next steps, reflect back what emerged: key decisions, the direction chosen, open threads, and any assumptions made. This is the primary output of the brainstorm — the user should be able to read the summary and know what they landed on.

**Then offer next steps** using the platform's blocking question tool: `question` in OpenCode (call `ToolSearch` with `select:question` first if its schema isn't loaded), `request_user_input` in Codex, `ask_user` in Gemini, `ask_user` in Pi (requires the `pi-ask-user` extension). Fall back to numbered options in chat only when no blocking tool exists in the harness or the call errors (e.g., Codex edit modes) — not because a schema load is required. Never silently skip the question.

**Question:** "Brainstorm wrapped. What would you like to do next?"

- **Create a plan** → hand off to `/ce-plan` with the decided goal and constraints
- **Save summary to disk** → write the summary as a markdown file in the current working directory
- **Open in Proof (web app) — review and comment to iterate with the agent** → load the `ce-proof` skill to open the doc in Every's Proof editor, iterate with the agent via comments, or copy a link to share with others
- **Done** → the conversation was the value, no artifact needed
@@ -0,0 +1,240 @@ package/skills/ce-ideate/references/post-ideation-workflow.md
# Post-Ideation Workflow

Read this file after Phase 2 ideation agents return and the orchestrator has merged and deduped their outputs into a master candidate list. Do not load before Phase 2 completes.

## Phase 3: Adversarial Filtering

Review every candidate idea critically. The orchestrator performs this filtering directly — do not dispatch sub-agents for critique.

Do not generate replacement ideas in this phase unless explicitly refining.

For each rejected idea, write a one-line reason.

Rejection criteria:
- too vague
- not actionable
- duplicates a stronger idea
- not grounded in the stated context
- too expensive relative to likely value
- already covered by existing workflows or docs
- interesting but better handled as a brainstorm variant, not a product improvement
- **unjustified — no articulated warrant** (sub-agent failed to provide `direct:`, `external:`, or `reasoned:` justification, or the stated warrant does not actually support the claimed move)
- **below ambition floor** (fails the meeting-test: would not warrant team discussion — except when Phase 0.5 detected tactical focus signals, in which case this criterion is waived)
- **subject-replacement** (abandons or replaces the subject of ideation rather than operating on it — e.g., "pivot to an unrelated domain," "become a different organization")

Score survivors using a consistent rubric weighing: groundedness in stated context, **warrant strength** (`direct:` > `external:` > `reasoned:`; none excluded, but direct-evidence ideas score higher all else equal), expected value, novelty, pragmatism, leverage on future work, implementation burden, and overlap with stronger ideas.

Target output:
- keep 5-7 survivors by default
- if too many survive, run a second stricter pass
- if fewer than 5 survive, report that honestly rather than lowering the bar
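The rubric above can be made concrete as a scoring function. This is a minimal sketch only: the skill specifies the dimensions and the warrant ordering, not numeric weights, so the weights and multipliers below are illustrative assumptions.

```python
# Warrant strength ordering from the rubric: direct > external > reasoned.
# The multiplier values are assumptions, not specified by the skill.
WARRANT_STRENGTH = {"direct": 1.0, "external": 0.8, "reasoned": 0.6}

def score_survivor(idea):
    """Score one surviving idea; each dimension is assumed rated 0-5.

    Positive dimensions add, implementation burden and overlap subtract,
    and warrant strength scales the total.
    """
    weighted = (
        2.0 * idea["groundedness"]
        + 2.0 * idea["expected_value"]
        + 1.0 * idea["novelty"]
        + 1.0 * idea["pragmatism"]
        + 1.0 * idea["leverage"]
        - 1.0 * idea["implementation_burden"]
        - 1.0 * idea["overlap"]
    )
    return weighted * WARRANT_STRENGTH[idea["warrant"]]
```

The key property to preserve from the text is ordinal: all else equal, a `direct:`-warranted idea outscores an `external:` one, which outscores a `reasoned:` one.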

## Phase 4: Present the Survivors

**Checkpoint B (V17).** Before presenting, write `<scratch-dir>/survivors.md` (using the absolute path captured in Phase 1) containing the survivor list plus key context (focus hint, grounding summary, rejection summary). This protects the post-critique state before the user reaches the persistence menu. Best-effort: if the write fails (disk full, permissions), log a warning and proceed; the checkpoint is not load-bearing. Reuses the same `<run-id>` and `<scratch-dir>` generated in Phase 1; not cleaned up at the end of the run (the run directory is preserved so the V15 cache remains reusable across run-ids in the same session — see Phase 6).
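The best-effort checkpoint write can be sketched as below. The function name and logger are hypothetical; only the never-fail-the-run behavior (log a warning and proceed) comes from the text above.

```python
import logging
from pathlib import Path

log = logging.getLogger("ideation")

def write_checkpoint_b(scratch_dir, survivors_md):
    """Best-effort write of <scratch-dir>/survivors.md; never raises.

    The checkpoint is not load-bearing: on failure (disk full,
    permissions) we log a warning and let the run proceed.
    """
    try:
        path = Path(scratch_dir) / "survivors.md"
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(survivors_md, encoding="utf-8")
        return True
    except OSError as exc:
        log.warning("Checkpoint B write failed (non-fatal): %s", exc)
        return False
```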

Present the surviving ideas to the user. The terminal review loop is a complete ideation cycle in itself — persistence is opt-in (Phase 5), and refinement happens in conversation with no file or network cost (Phase 6).

Present only the surviving ideas in structured form:

- title
- description
- **warrant** (tagged `direct:` / `external:` / `reasoned:`, with the quoted evidence, cited source, or written-out argument)
- rationale (how the warrant connects to the move's significance)
- downsides
- confidence score
- estimated complexity

Then include a brief rejection summary so the user can see what was considered and cut.

Keep the presentation concise. Allow brief follow-up questions and lightweight clarification.

## Phase 5: Persistence (Opt-In, Mode-Aware)

Persistence is opt-in. The terminal review loop is a complete ideation cycle. Refinement loops happen in conversation with no file or network cost. Persistence triggers only when the user explicitly chooses to save, share, or hand off (selected in Phase 6).

When the user picks an option in Phase 6 that requires a durable record (Open and iterate in Proof, Brainstorm, Save and end), ensure a record exists first. When the user chooses to keep refining, no record is needed unless the user asks.

**Mode-determined defaults:**

| Action | Repo mode default | Elsewhere mode default |
|---|---|---|
| Save | `docs/ideation/YYYY-MM-DD-<topic>-ideation.md` | Proof |
| Share | Proof (additional) | Proof (primary) |
| Brainstorm handoff | `ce-brainstorm` | `ce-brainstorm` (universal-brainstorming) |
| End | Conversation only is fine | Conversation only is fine |

Either mode can also use the other destination on explicit request ("save to Proof even though this is repo mode", "save to a local file even though this is elsewhere"). Honor such overrides directly.

### 5.1 File Save (default for repo mode; on request for elsewhere mode)

1. Ensure `docs/ideation/` exists
2. Choose the file path:
   - `docs/ideation/YYYY-MM-DD-<topic>-ideation.md`
   - `docs/ideation/YYYY-MM-DD-open-ideation.md` when no focus exists
3. Write or update the ideation document
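Step 2's path choice can be sketched as a small helper. The path shape comes from the table above; the kebab-case slug logic is an assumption (the skill does not specify how a topic becomes a slug).

```python
import re
from datetime import date

def ideation_path(topic=None, today=None):
    """Build the repo-mode save path.

    docs/ideation/YYYY-MM-DD-<topic>-ideation.md, or the
    open-ideation fallback when no focus exists.
    """
    d = (today or date.today()).isoformat()
    if not topic:
        return f"docs/ideation/{d}-open-ideation.md"
    # Assumed slug rule: lowercase, non-alphanumeric runs collapse to "-".
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return f"docs/ideation/{d}-{slug}-ideation.md"
```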

Use this structure, omitting clearly irrelevant fields only when necessary:

```markdown
---
date: YYYY-MM-DD
topic: <kebab-case-topic>
focus: <optional focus hint>
mode: <repo-grounded | elsewhere-software | elsewhere-non-software>
---

# Ideation: <Title>

## Grounding Context
[Grounding summary from Phase 1 — labeled "Codebase Context" in repo mode, "Topic Context" in elsewhere mode]

## Ranked Ideas

### 1. <Idea Title>
**Description:** [Concrete explanation]
**Warrant:** [`direct:` / `external:` / `reasoned:` — the actual basis, quoted or cited]
**Rationale:** [How the warrant connects to the move's significance]
**Downsides:** [Tradeoffs or costs]
**Confidence:** [0-100%]
**Complexity:** [Low / Medium / High]
**Status:** [Unexplored / Explored]

## Rejection Summary

| # | Idea | Reason Rejected |
|---|------|-----------------|
| 1 | <Idea> | <Reason rejected> |
```

If resuming:
- update the existing file in place
- preserve explored markers

### 5.2 Proof Save (default for elsewhere mode; on request for repo mode)

Hand off the ideation content to the `ce-proof` skill in HITL review mode. This uploads the doc, runs an iterative review loop (user annotates in Proof, agent ingests feedback and applies tracked edits), and (in repo mode) syncs the reviewed markdown back to `docs/ideation/`.

Load the `ce-proof` skill in HITL-review mode with:

- **source content:** the survivors and rejection summary from Phase 4 (in repo mode, this is the file written in 5.1; in elsewhere mode, render to a temp file as the source for upload)
- **doc title:** `Ideation: <topic>` or the H1 of the ideation doc
- **identity:** `ai:systematic` / `Systematic`
- **recommended next step:** `/ce-brainstorm` (shown in the proof skill's final terminal output)

The Proof failure ladder in Phase 6.5 governs what happens when this hand-off fails.

**Caller-aware return.** The return-rule bullets below describe the default control flow, but the next step depends on which Phase 6 option invoked the Proof save. Apply the right branch for the caller:

- **§6.2 Open and iterate in Proof.** Behavior is mode-aware:
  - *Repo mode:* return to the Phase 6 menu on every status. The Proof-reviewed content is now synced locally, and the user typically has a follow-up action in the repo (brainstorm toward a plan, save and end, or keep refining).
  - *Elsewhere mode:* on a successful Proof return (`proceeded` or `done_for_now`), exit cleanly — narrate that the artifact lives at `docUrl` (including any stale-local note if applicable) and stop. Proof iteration is often the terminal act in elsewhere mode; forcing another menu choice after the user already got what they came for produces decision fatigue. Only the `aborted` branch returns to the Phase 6 menu so the user can retry or pick another path.
- **§6.3 Brainstorm a selected idea.** On a successful Proof return (`proceeded` or `done_for_now`), do **not** stop at the Phase 6 menu — after applying the per-status handling below (including any stale-local pull offer), continue into §6.3's remaining bullets (mark the chosen idea as `Explored`, then load `ce-brainstorm`). Only the `aborted` branch returns to the Phase 6 menu, since no durable record was written.
- **§6.4 Save and end.** On a successful Proof return (`proceeded` or `done_for_now`), exit cleanly: narrate that the ideation was saved, surface the `docUrl` (and the local-path note if applicable), and stop. Do **not** re-ask the Phase 6 question — the user already chose to end. Only the `aborted` branch returns to the Phase 6 menu so the user can retry or pick a different path.

When the proof skill returns control:

- `status: proceeded` with `localSynced: true` → the ideation doc on disk now reflects the review. Apply the caller-aware return rule above for the invoking branch.
- `status: proceeded` with `localSynced: false` → the reviewed version lives in Proof at `docUrl` but the local copy is stale. Offer to pull the Proof doc to `localPath` using the proof skill's Pull workflow. Apply the caller-aware return rule above; if the pull was declined, include a one-line note that `<localPath>` is stale vs. Proof so the next handoff (or final exit narration) doesn't read the old content silently. Placement: above the Phase 6 menu when the caller-aware rule returns to it, in the handoff preamble to `ce-brainstorm` for §6.3, or alongside the final save/exit narration for §6.2 elsewhere / §6.4.
- `status: done_for_now` → the doc on disk may be stale if the user edited in Proof before leaving. Offer to pull the Proof doc to `localPath` so the local ideation artifact stays in sync, then apply the caller-aware return rule above. `done_for_now` means the user stopped the HITL loop — it does not mean they ended the whole ideation session unless the caller-aware rule exits (§6.2 elsewhere mode or §6.4). If the pull was declined, include the stale-local note at the placement described in the previous bullet.
- `status: aborted` → fall back to the Phase 6 menu without changes, regardless of caller. No durable record was written, so §6.3 must not proceed with the brainstorm handoff and §6.4 must not end — the menu lets the user retry or pick another path.
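The caller-aware return rule above reduces to a small dispatch. This sketch mirrors the bullets; the function and action names are assumptions introduced for illustration, not part of any skill contract.

```python
def next_step(caller, mode, status):
    """Resolve the caller-aware return rule after a Proof save.

    caller: "6.2" (open and iterate), "6.3" (brainstorm), "6.4" (save and end)
    mode:   "repo" or "elsewhere"
    status: "proceeded", "done_for_now", or "aborted"
    """
    if status == "aborted":
        # Every caller: no durable record, return to the menu.
        return "phase6_menu"
    if caller == "6.2":
        # Mode-aware: repo returns to the menu, elsewhere exits cleanly.
        return "phase6_menu" if mode == "repo" else "exit_with_docurl"
    if caller == "6.3":
        # Continue into the brainstorm handoff, not back to the menu.
        return "mark_explored_then_brainstorm"
    if caller == "6.4":
        # The user already chose to end.
        return "exit_with_docurl"
    raise ValueError(f"unknown caller: {caller}")
```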

## Phase 6: Refine or Hand Off

Ask what should happen next using the platform's blocking question tool: `question` in OpenCode (call `ToolSearch` with `select:question` first if its schema isn't loaded), `request_user_input` in Codex, `ask_user` in Gemini, `ask_user` in Pi (requires the `pi-ask-user` extension). Fall back to numbered options in chat only when no blocking tool exists in the harness or the call errors (e.g., Codex edit modes) — not because a schema load is required. Never silently skip the question.

**Question:** "What should the agent do next?"

Offer these four options (labels are self-contained with the distinguishing word front-loaded so options stay distinct when truncated):

1. **Refine the ideation in conversation (or stop here — no save)** — add ideas, re-evaluate, or deepen analysis. No file or network side effects; ending the conversation at any point after this pick is a valid no-save exit.
2. **Open and iterate in Proof** — save the ideation to Proof and enter the proof skill's HITL review loop: iterate via comments in the Proof editor; reviewed edits sync back to `docs/ideation/` in repo mode.
3. **Brainstorm a selected idea** — load `ce-brainstorm` with the chosen idea as the seed. The orchestrator first writes a durable record using the mode default in Phase 5.
4. **Save and end** — persist the ideation using the mode default (file in repo mode, Proof in elsewhere mode), then end.

No-save exit is supported without a dedicated menu option. Pick option 1 and stop the conversation, or use the question tool's free-text escape to say so directly — persistence is opt-in and the terminal review loop is already a complete ideation cycle.

Do not delete the run's scratch directory (`<scratch-dir>` resolved in Phase 1) on completion. The V15 web-research cache is session-scoped and reused across run-ids by later ideation invocations in the same session (see `references/web-research-cache.md`); per-run cleanup would defeat that reuse. Checkpoint A (`raw-candidates.md`) and Checkpoint B (`survivors.md`) are cheap to leave behind and follow the repo's Scratch Space cross-invocation-reusable convention — OS handles eventual cleanup.

### 6.1 Refine the Ideation in Conversation

Route refinement by intent:

- `add more ideas` or `explore new angles` -> return to Phase 2
- `re-evaluate` or `raise the bar` -> return to Phase 3
- `dig deeper on idea #N` -> expand only that idea's analysis

No persistence triggers during refinement. The user can choose Save and end (or Brainstorm, or Open and iterate in Proof) when they are ready to persist.

Ending after refinement — or without any refinement at all — is a valid no-save exit. There is no required next step; stopping the conversation here leaves no durable artifact, which matches the opt-in persistence contract.

### 6.2 Open and Iterate in Proof

Invoke the Proof HITL review path via §5.2 with §6.2 as the caller. In repo mode, ensure the local file exists first (run §5.1) so the HITL sync-back has a target; in elsewhere mode, §5.2 renders to a temp file as usual. Honor Phase 5's "ensure a record exists first" contract either way.

Apply §5.2's caller-aware return rule for the §6.2 branch — behavior is mode-aware. In repo mode, return to the Phase 6 menu on every status so the user can pick a follow-up (brainstorm toward a plan, save-and-end, or keep refining) now that the Proof review is reflected in the local file. In elsewhere mode, exit cleanly on a successful Proof return since Proof iteration is often the terminal act — the artifact lives at `docUrl` and is the canonical record; only the `aborted` status returns to the menu.

If the Proof handoff fails, the §6.5 Proof Failure Ladder governs recovery.

### 6.3 Brainstorm a Selected Idea

- Write or update the durable record per the mode default in Phase 5 (file in repo mode, Proof in elsewhere mode). When this routes through §5.2 Proof Save, apply §5.2's caller-aware return rule: continue into the next bullet on a successful Proof return instead of bouncing back to the Phase 6 menu. If Proof returned `aborted` (no durable record written), go back to the Phase 6 menu and do **not** proceed with the brainstorm handoff.
- Mark the chosen idea as `Explored` in the saved record
- Load the `ce-brainstorm` skill with the chosen idea as the seed

**Repo mode only:** do **not** skip brainstorming and go straight to `ce-plan` from ideation output — `ce-plan` wants brainstorm-grounded requirements. In elsewhere modes, ideation (or ideation + Proof iteration) is a legitimate terminal state; brainstorming is optional deeper development of one idea, not a required next rung on an implementation ladder that does not exist in these modes.

### 6.4 Save and End

Persist via the mode default (5.1 in repo mode, 5.2 in elsewhere mode), then end. If the user instead asked to use the non-default destination, honor that explicit request.

When the path lands in a Proof save (5.2), apply §5.2's caller-aware return rule for the §6.4 branch: on a successful Proof return, exit cleanly — narrate the save, surface the `docUrl` (and any stale-local note if the pull was declined), and stop. Do **not** loop back to the Phase 6 menu; the user already chose to end. Only a `status: aborted` from Proof returns to the menu so the user can retry or pick another path (file save, custom path, or keep refining). The §6.5 Proof Failure Ladder still governs persistent Proof failures and ends at the Phase 6 menu — that failure-recovery path is distinct from the successful-save exit described here.

When the path lands in a file save (5.1):

- offer to commit only the ideation doc
- do not create a branch
- do not push
- if the user declines, leave the file uncommitted

After the file save (and optional commit), end the session — do not return to the Phase 6 menu.

### 6.5 Proof Failure Ladder

The `ce-proof` skill retries once internally on transient failures (`STALE_BASE`, `BASE_TOKEN_REQUIRED`) before surfacing failure. The proof skill's return contract does not expose typed error classes to callers — the orchestrator cannot distinguish retryable vs terminal failures from outside.

**Orchestrator-side retry harness (intentionally minimal):** wrap the proof skill invocation in **one** additional best-effort retry with a short pause (~2 seconds). The proof skill already retried internally, so this catches transient races at the orchestrator boundary without compounding latency. Do not classify error types from outside the skill — no detection mechanism exists.

Distinguish create-failure from ops-failure by inspecting whether the proof skill returned a `docUrl` before failing:

- **Create-failure** (no `docUrl` returned): retry the create.
- **Ops-failure** (a `docUrl` was returned, but a later operation failed): retry only the failing operation. **Do not recreate** the document.
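The single-retry harness with the create-vs-ops distinction can be sketched as below. The `create_doc`/`run_ops` split is a hypothetical decomposition for illustration (the real proof skill exposes no such callables or typed errors to the orchestrator); the one-retry, ~2-second pause, and do-not-recreate behaviors come from the text above.

```python
import time

def proof_with_retry(create_doc, run_ops, narrate):
    """One best-effort retry at the orchestrator boundary.

    create_doc() -> dict with "docUrl" on success, or raises.
    run_ops(doc_url) -> result dict, or raises.
    """
    doc_url = None
    for attempt in (1, 2):
        try:
            if doc_url is None:
                doc_url = create_doc()["docUrl"]  # create-failure: retried
            return run_ops(doc_url)               # ops-failure: only this retried
        except Exception:
            if attempt == 2:
                narrate("Proof retry exhausted; showing fallback menu.")
                # Surface any partial docUrl alongside the fallback options.
                return {"status": "failed", "docUrl": doc_url}
            narrate("Retrying Proof... attempt 2/2")
            time.sleep(2)
```

Because `doc_url` survives the first failure, a second attempt after an ops-failure skips `create_doc` entirely and never recreates the document.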

**Failure narration.** Narrate the single retry to the terminal so the pause does not look like a hang ("Retrying Proof... attempt 2/2"). On persistent failure, narrate that the retry was exhausted before showing the fallback menu.

**Fallback menu after persistent failure.** Use the platform's blocking question tool. Present these options (omit the first option if no repo exists at CWD):

- "Save to `docs/ideation/` instead" (repo-mode default destination, available when CWD is inside a git repo)
- "Save to a custom path the user provides" (validate writable; create parent dirs)
- "Skip save and keep the ideation in conversation" (no persistence)

If proof returned a partial `docUrl` before failing, surface that URL alongside the fallback options so the user can recover or share the partial record.

After the fallback completes (any path), continue back to the Phase 6 menu so the user can still refine, iterate in Proof, brainstorm, or save and end.

## Quality Bar

Before finishing, check:

- the idea set is grounded in the stated context (codebase in repo mode; user-supplied context in elsewhere mode)
- **every surviving idea has articulated warrant** (`direct:`, `external:`, or `reasoned:`) that actually supports the claimed move — speculation dressed as ambition was rejected, with reasons
- **every surviving idea passes the meeting-test** unless Phase 0.5 detected tactical focus signals that waived the floor
- **no surviving idea replaces the subject** rather than operating on it
- the candidate list was generated before filtering
- the original many-ideas -> critique -> survivors mechanism was preserved
- if sub-agents were used, they improved diversity without replacing the core workflow
- every rejected idea has a reason
- survivors are materially better than a naive "give me ideas" list
- persistence followed user choice — terminal-only sessions did not write a file or call Proof
- when persistence did trigger, the mode default was respected unless the user explicitly overrode it
- acting on an idea routes to `ce-brainstorm`, not directly to implementation
@@ -0,0 +1,249 @@ package/skills/ce-plan/references/deepening-workflow.md
# Deepening Workflow

This file contains the confidence-check execution path (5.3.3-5.3.7). Load it only when the deepening gate at 5.3.2 determines that deepening is warranted.

## 5.3.3 Score Confidence Gaps

Use a checklist-first, risk-weighted scoring pass.

For each section, compute:
- **Trigger count** — number of checklist problems that apply
- **Risk bonus** — add 1 if the topic is high-risk and this section is materially relevant to that risk
- **Critical-section bonus** — add 1 for `Key Technical Decisions`, `Implementation Units`, `System-Wide Impact`, `Risks & Dependencies`, or `Open Questions` in `Standard` or `Deep` plans

Treat a section as a candidate if:
- it hits **2+ total points**, or
- it hits **1+ point** in a high-risk domain and the section is materially important

Choose only the top **2-5** sections by score. If deepening a lightweight plan (high-risk exception), cap at **1-2** sections.
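The scoring rule above can be sketched as a small function. The section names and thresholds come from the text; the function signature and boolean flags are illustrative assumptions.

```python
CRITICAL_SECTIONS = {
    "Key Technical Decisions", "Implementation Units",
    "System-Wide Impact", "Risks & Dependencies", "Open Questions",
}

def score_section(section, triggers, high_risk, materially_relevant,
                  depth="Standard"):
    """Return (score, is_candidate) for one plan section.

    triggers: number of checklist problems that apply.
    """
    score = triggers
    if high_risk and materially_relevant:
        score += 1  # risk bonus
    if section in CRITICAL_SECTIONS and depth in ("Standard", "Deep"):
        score += 1  # critical-section bonus
    candidate = score >= 2 or (
        score >= 1 and high_risk and materially_relevant
    )
    return score, candidate
```

After scoring every section, sort by score and keep the top 2-5 (or 1-2 for a lightweight-plan exception).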
19
+
20
+ If the plan already has a `deepened:` date:
21
+ - Prefer sections that have not yet been substantially strengthened, if their scores are comparable
22
+ - Revisit an already-deepened section only when it still scores clearly higher than alternatives

**Section Checklists:**

**Requirements**
- Requirements are vague or disconnected from implementation units
- Success criteria are missing or not reflected downstream
- Units do not clearly advance the traced requirements
- Origin requirements are not clearly carried forward
- Origin A/F/AE IDs (when supplied by the upstream brainstorm) are not preserved where planning decisions touch them, or are referenced inconsistently across Requirements, units, and test scenarios

**Context & Research / Sources & References**
- Relevant repo patterns are named but never used in decisions or implementation units
- Cited learnings or references do not materially shape the plan
- High-risk work lacks appropriate external or internal grounding
- Research is generic instead of tied to this repo or this plan

**Key Technical Decisions**
- A decision is stated without rationale
- Rationale does not explain tradeoffs or rejected alternatives
- The decision does not connect back to scope, requirements, or origin context
- An obvious design fork exists but the plan never addresses why one path won

**Open Questions**
- Product blockers are hidden as assumptions
- Planning-owned questions are incorrectly deferred to implementation
- Resolved questions have no clear basis in repo context, research, or origin decisions
- Deferred items are too vague to be useful later

**High-Level Technical Design (when present)**
- The sketch uses the wrong medium for the work
- The sketch contains implementation code rather than pseudo-code
- The non-prescriptive framing is missing or weak
- The sketch does not connect to the key technical decisions or implementation units

**High-Level Technical Design (when absent)** *(Standard or Deep plans only)*
- The work involves DSL design, API surface design, multi-component integration, complex data flow, or state-heavy lifecycle
- Key technical decisions would be easier to validate with a visual or pseudo-code representation
- The approach section of implementation units is thin and a higher-level technical design would provide context

**Implementation Units**
- Dependency order is unclear or likely wrong
- File paths or test file paths are missing where they should be explicit
- Units are too large, too vague, or broken into micro-steps
- Approach notes are thin or do not name the pattern to follow
- Test scenarios are vague (don't name inputs and expected outcomes), skip applicable categories (e.g., no error paths for a unit with failure modes, no integration scenarios for a unit crossing layers), or are disproportionate to the unit's complexity
- Feature-bearing units have blank or missing test scenarios (feature-bearing units require actual test scenarios; the `Test expectation: none` annotation is only valid for non-feature-bearing units)
- Verification outcomes are vague or not expressed as observable results
- Existing U-IDs were renumbered after a unit was reordered, split, or deleted (U-IDs are stable: never renumber existing IDs; gaps from deletions are preserved; new units take the next unused number)
- A unit realizing an origin Key Flow does not cite the F-ID, or a unit enforcing an origin Acceptance Example does not cite the AE-ID, when origin supplies them

**System-Wide Impact**
- Affected interfaces, callbacks, middleware, entry points, or parity surfaces are missing
- Failure propagation is underexplored
- State lifecycle, caching, or data integrity risks are absent where relevant
- Integration coverage is weak for cross-layer work

**Risks & Dependencies / Documentation / Operational Notes**
- Risks are listed without mitigation
- Rollout, monitoring, migration, or support implications are missing when warranted
- External dependency assumptions are weak or unstated
- Security, privacy, performance, or data risks are absent where they obviously apply

Use the plan's own `Context & Research` and `Sources & References` as evidence. If those sections cite a pattern, learning, or risk that never affects decisions, implementation units, or verification, treat that as a confidence gap.

## 5.3.4 Report and Dispatch Targeted Research

Before dispatching agents, report which sections are being strengthened and why:

```text
Strengthening [section names] — [brief reason for each, e.g., "decision rationale is thin", "cross-boundary effects aren't mapped"]
```

For each selected section, choose the smallest useful agent set. Do **not** run every agent. Use at most **1-3 agents per section** and usually no more than **8 agents total**.

Use fully-qualified agent names inside Task calls.

**Deterministic Section-to-Agent Mapping:**

**Requirements / Open Questions classification**
- `ce-spec-flow-analyzer` for missing user flows, edge cases, and handoff gaps
- `ce-repo-research-analyst` (Scope: `architecture, patterns`) for repo-grounded patterns, conventions, and implementation reality checks

**Context & Research / Sources & References gaps**
- `ce-learnings-researcher` for institutional knowledge and past solved problems
- `ce-framework-docs-researcher` for official framework or library behavior
- `ce-best-practices-researcher` for current external patterns and industry guidance
- Add `ce-git-history-analyzer` only when historical rationale or prior art is materially missing

**Key Technical Decisions**
- `ce-architecture-strategist` for design integrity, boundaries, and architectural tradeoffs
- Add `ce-framework-docs-researcher` or `ce-best-practices-researcher` when the decision needs external grounding beyond repo evidence

**High-Level Technical Design**
- `ce-architecture-strategist` for validating that the technical design accurately represents the intended approach and identifying gaps
- `ce-repo-research-analyst` (Scope: `architecture, patterns`) for grounding the technical design in existing repo patterns and conventions
- Add `ce-best-practices-researcher` when the technical design involves a DSL, API surface, or pattern that benefits from external validation

**Implementation Units / Verification**
- `ce-repo-research-analyst` (Scope: `patterns`) for concrete file targets, patterns to follow, and repo-specific sequencing clues
- `ce-pattern-recognition-specialist` for consistency, duplication risks, and alignment with existing patterns
- Add `ce-spec-flow-analyzer` when sequencing depends on user flow or handoff completeness

**System-Wide Impact**
- `ce-architecture-strategist` for cross-boundary effects, interface surfaces, and architectural knock-on impact
- Add the specific specialist that matches the risk:
  - `ce-performance-oracle` for scalability, latency, throughput, and resource-risk analysis
  - `ce-security-sentinel` for auth, validation, exploit surfaces, and security boundary review
  - `ce-data-integrity-guardian` for migrations, persistent state safety, consistency, and data lifecycle risks

**Risks & Dependencies / Operational Notes**
- Use the specialist that matches the actual risk:
  - `ce-security-sentinel` for security, auth, privacy, and exploit risk
  - `ce-data-integrity-guardian` for persistent data safety, constraints, and transaction boundaries
  - `ce-data-migration-expert` for migration realism, backfills, and production data transformation risk
  - `ce-deployment-verification-agent` for rollout checklists, rollback planning, and launch verification
  - `ce-performance-oracle` for capacity, latency, and scaling concerns

**Agent Prompt Shape:**

For each selected section, pass:
- The scope prefix from the mapping above when the agent supports scoped invocation
- A short plan summary
- The exact section text
- Why the section was selected, including which checklist triggers fired
- The plan depth and risk profile
- A specific question to answer

Instruct the agent to return:
- findings that change planning quality
- stronger rationale, sequencing, verification, risk treatment, or references
- no implementation code
- no shell commands
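Put together, a dispatch prompt following this shape might read as follows. Every detail here (the plan, section, triggers, and question) is invented purely for illustration:

```text
Scope: architecture, patterns
Plan summary: add a bulk CSV import pipeline for customer records.
Section: Key Technical Decisions (full text pasted below).
Why selected: checklist triggers "decision stated without rationale" and
"obvious design fork never addressed" fired for the streaming-vs-batch choice.
Depth/risk: Standard plan, high-risk (writes to persistent customer data).
Question: does row-by-row streaming fit this repo's existing import
patterns, or is batch insert the established convention?

Return findings that change planning quality, with stronger rationale or
references. Return no implementation code and no shell commands.
```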

## 5.3.5 Choose Research Execution Mode

Use the lightest mode that will work:

- **Direct mode** - Default. Use when the selected section set is small and the parent can safely read the agent outputs inline.
- **Artifact-backed mode** - Use only when the selected research scope is large enough that inline returns would create unnecessary context pressure.

Signals that justify artifact-backed mode:
- More than 5 agents are likely to return meaningful findings
- The selected section excerpts are long enough that repeating them in multiple agent outputs would be wasteful
- The topic is high-risk and likely to attract bulky source-backed analysis

If artifact-backed mode is not clearly warranted, stay in direct mode.

Artifact-backed mode uses a per-run OS-temp scratch directory. Create it once before dispatching sub-agents and capture its **absolute path** — pass that absolute path to each sub-agent so they write to it directly. Do not use `.context/`; the artifacts are per-run throwaways that are cleaned up when deepening ends (see 5.3.6b), matching the repo Scratch Space convention for one-shot artifacts. Do not pass unresolved shell-variable strings to sub-agents; they need the resolved absolute path.

```bash
SCRATCH_DIR="$(mktemp -d -t ce-plan-deepen-XXXXXX)"
echo "$SCRATCH_DIR"
```

Refer to the echoed absolute path as `<scratch-dir>` throughout the rest of this workflow.

## 5.3.6 Run Targeted Research

Launch the selected agents in parallel using the execution mode chosen above. If the current platform does not support parallel dispatch, run them sequentially instead. Omit the `mode` parameter when dispatching so the user's configured permission settings apply.

Prefer local repo and institutional evidence first. Use external research only when the gap cannot be closed responsibly from repo context or already-cited sources.

If a selected section can be improved by reading the origin document more carefully, do that before dispatching external agents.

**Direct mode:** Have each selected agent return its findings directly to the parent. Keep the return payload focused: strongest findings only, the evidence or sources that matter, the concrete planning improvement implied by the finding.

**Artifact-backed mode:** For each selected agent, pass the absolute `<scratch-dir>` path captured earlier and instruct the agent to write one compact artifact file inside that directory, then return only a short completion summary. Each artifact should contain: target section, why selected, 3-7 findings, source-backed rationale, the specific plan change implied by each finding. No implementation code, no shell commands.
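A compact artifact following that shape might look like this. The section, findings, file paths, and sources shown are all invented for illustration:

```text
# Artifact: Key Technical Decisions
Why selected: decision rationale is thin; no rejected alternatives recorded.

Findings:
1. The repo already batches inserts through an ImportBatcher service;
   streaming row-by-row would diverge from the established pattern.
   Plan change: cite the batching pattern in the decision rationale and
   record streaming as the rejected alternative.
2. The framework's bulk-insert API caps payloads per call.
   Plan change: note the chunking constraint in the affected unit's approach.

Sources: app/services/import_batcher.rb, framework bulk-insert guide
```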

If an artifact is missing or clearly malformed, re-run that agent or fall back to direct-mode reasoning for that section.

If agent outputs conflict:
- Prefer repo-grounded and origin-grounded evidence over generic advice
- Prefer official framework documentation over secondary best-practice summaries when the conflict is about library behavior
- If a real tradeoff remains, record it explicitly in the plan

## 5.3.6b Interactive Finding Review (Interactive Mode Only)

Skip this step in auto mode — proceed directly to 5.3.7.

In interactive mode, present each agent's findings to the user before integration. For each agent that returned findings:

1. **Summarize the agent and its target section** — e.g., "The ce-architecture-strategist reviewed Key Technical Decisions and found:"
2. **Present the findings concisely** — bullet the key points, not the raw agent output. Include enough context for the user to evaluate: what the agent found, what evidence supports it, and what plan change it implies.
3. **Ask the user** using the platform's blocking question tool when available (see Interaction Method):
   - **Accept** — integrate these findings into the plan
   - **Reject** — discard these findings entirely
   - **Discuss** — the user wants to talk through the findings before deciding
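Concretely, a single agent's review might be presented like this (the agent choice and findings are illustrative only):

```text
The ce-security-sentinel reviewed Risks & Dependencies and found:
- The planned upload endpoint has no authentication treatment
  (evidence: no auth middleware is named anywhere in the plan;
  implied change: add an auth check to the relevant unit and list
  the exposure under Risks)

Accept these findings, reject them, or discuss them first?
```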

If the user chooses "Discuss", engage in brief dialogue about the findings and then re-ask with only accept/reject (no discuss option on the second ask). The user makes a deliberate choice either way.

When presenting findings from multiple agents targeting the same section, present them one agent at a time so the user can make independent decisions. Do not merge findings from different agents before showing them.

After all agents have been reviewed, carry only the accepted findings forward to 5.3.7.

If the user accepted no findings, report "No findings accepted — plan unchanged." Then proceed directly to Phase 5.4 (skip document-review and synthesis — the plan was not modified). This interactive-mode-only skip does not apply in auto mode; auto mode always proceeds through 5.3.7 and 5.3.8. No explicit scratch cleanup is needed — `$SCRATCH_DIR` lives in OS temp and the OS will clean it up; leaving it in place preserves the rejected agent artifacts for debugging.

If findings were accepted and the plan was modified, proceed through 5.3.7 and 5.3.8 as normal — document-review acts as a quality gate on the changes.

## 5.3.7 Synthesize and Update the Plan

Strengthen only the selected sections. Keep the plan coherent and preserve its overall structure.

**In interactive mode:** Only integrate findings the user accepted in 5.3.6b. If findings from different agents touch the same section, reconcile them coherently but do not reintroduce rejected findings.

Allowed changes:
- Clarify or strengthen decision rationale
- Tighten requirements trace or origin fidelity
- Reorder or split implementation units when sequencing is weak — but **never renumber existing U-IDs**. Reordering preserves U-IDs in their new order (e.g., U1, U3, U5 reordered is correct; renumbering to U1, U2, U3 is not). Splitting keeps the original U-ID on the original concept and assigns the next unused number to the new unit. Renumbering breaks ce-work blocker and verification references that were written against the original IDs
- Add missing pattern references, file/test paths, or verification outcomes
- Expand system-wide impact, risks, or rollout treatment where justified
- Reclassify open questions between `Resolved During Planning` and `Deferred to Implementation` when evidence supports the change
- Strengthen, replace, or add a High-Level Technical Design section when the work warrants it and the current representation is weak
- Strengthen or add per-unit technical design fields where the unit's approach is non-obvious
- Add or update `deepened: YYYY-MM-DD` in frontmatter when the plan was substantively improved
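As a concrete illustration of the U-ID stability rule (unit names invented): splitting the middle unit of a three-unit plan leaves every existing ID untouched:

```text
Before split:  U1 schema migration -> U2 build importer -> U3 wire UI
After split:   U1 schema migration -> U2 parse upload -> U4 persist rows -> U3 wire UI
```

U2 keeps its ID on the original concept, the new unit takes U4 (the next unused number), and U3 is never renumbered even though the sequence now reads U1, U2, U4, U3.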

Do **not**:
- Add implementation code — no imports, exact method signatures, or framework-specific syntax. Pseudo-code sketches and DSL grammars are allowed
- Add git commands, commit choreography, or exact test command recipes
- Add generic `Research Insights` subsections everywhere
- Rewrite the entire plan from scratch
- Invent new product requirements, scope changes, or success criteria without surfacing them explicitly
- Renumber existing U-IDs as part of reordering, splitting, deletion, or "tidying" the unit list. Deepening is the most likely accidental-renumber vector — preserve U-IDs even when the new order would look cleaner with sequential numbering

If research reveals a product-level ambiguity that should change behavior or scope:
- Do not silently decide it here
- Record it under `Open Questions`
- Recommend `ce-brainstorm` if the gap is truly product-defining