npm - claude-mem - Versions diffs - 13.2.0 → 13.3.0 - Mend

claude-mem 13.2.0 → 13.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (18)

package/.codex-plugin/plugin.json +2 -2
package/dist/npx-cli/index.js +307 -278
package/openclaw/openclaw.plugin.json +1 -1
package/package.json +1 -2
package/plugin/.claude-plugin/plugin.json +1 -1
package/plugin/.codex-plugin/plugin.json +1 -1
package/plugin/package.json +1 -1
package/plugin/scripts/context-generator.cjs +4 -4
package/plugin/scripts/mcp-server.cjs +13 -13
package/plugin/scripts/server-beta-service.cjs +147 -122
package/plugin/scripts/worker-service.cjs +362 -337
package/plugin/skills/design-is/SKILL.md +312 -0
package/plugin/skills/make-plan/SKILL.md +4 -0
package/plugin/skills/oh-my-issues/SKILL.md +226 -0
package/plugin/skills/weekly-digests/SKILL.md +262 -0
package/plugin/skills/wowerpoint/SKILL.md +62 -0
package/plugin/ui/viewer-bundle.js +14 -13
package/.mcp.json +0 -12

package/plugin/skills/design-is/SKILL.md ADDED

@@ -0,0 +1,312 @@
+---
+name: design-is
+description: Audit a design against Dieter Rams' ten "Good design is..." principles, then hand off a /make-plan prompt for one of three outcomes — new design, refine design, or redesign. Use when the user says "audit this design", "design review", "check this UI against Rams", "is this UI good", "critique this design", "design audit", or asks for a critique that should lead to a plan.
+---
+# Design Is
+## Do not use for
+- Routine UI code reviews → use `/review`
+- Pure copy edits → use a separate copy pass
+- Pre-design ideation with no artifact yet → start with `/make-plan` directly
+You are an ORCHESTRATOR. Audit a design against Dieter Rams' ten principles, score each principle with evidence, decide the outcome verdict (NEW / REFINE / REDESIGN), and hand off to `/make-plan` with a ready-to-run prompt.
+You do not write implementation code. You produce: evidence-cited scores, a verdict, and a `/make-plan` handoff prompt.
+## The Ten Principles (Dieter Rams)
+Audit each principle in this exact order. Each gets a score 0–3 and ≥1 piece of evidence (`file:line`, screenshot region, copy excerpt, or measured value).
+1. **Good design is innovative** — Does it advance the form, or imitate? Innovation rides on technology; never an end in itself.
+2. **Good design makes a product useful** — Does it serve the primary task? Emphasizes usefulness; disregards anything that detracts.
+3. **Good design is aesthetic** — Is it beautiful? Only well-executed objects can be beautiful; aesthetic quality affects well-being.
+4. **Good design makes a product understandable** — Does the structure clarify function? At best, is it self-explanatory?
+5. **Good design is unobtrusive** — Does it stay out of the way? Neither decorative objects nor works of art — leave room for self-expression.
+6. **Good design is honest** — Does it claim only what it is? No false promises, no manipulation, no inflated value.
+7. **Good design is long-lasting** — Will it age well? Avoids being fashionable; never appears antiquated.
+8. **Good design is thorough down to the last detail** — Are edges, empty states, errors, focus rings, motion curves all considered? Care and accuracy express respect for the user.
+9. **Good design is environmentally friendly** — Does it conserve resources? Minimizes pollution — in software: bundle weight, energy, attention, cognitive load.
+10. **Good design is as little design as possible** — Less, but better. Concentrates on essentials; back to purity, back to simplicity.
+> The user wrote "Dieter Braun" — they mean Dieter Rams. Don't correct them inline; just use the right principles.
+## Delegation Model
+Use subagents for *evidence gathering* (reading components, measuring contrast, counting elements, inspecting tokens, screenshotting via agent-browser). Keep *scoring and verdict synthesis* with the orchestrator. Reject subagent reports that score without citing evidence and redeploy.
+### Subagent Reporting Contract (MANDATORY)
+Each evidence subagent response must include:
+1. Sources consulted — exact file paths and line ranges, or screenshot regions
+2. Concrete findings — what is present, what is missing, with quotes/values
+3. Per-principle facts (not opinions) — leave scoring to the orchestrator
+4. Known gaps — what could not be inspected and why
+## Output Artifacts
+All artifacts go in `DESIGN-IS-<YYYY-MM-DD>/` at repo root (or the project the user points at):
+- `00-scope.md` — what was audited (URL, component paths, screens), input materials
+- `01-evidence.md` — per-principle evidence collected by subagents
+- `02-scorecard.md` — per-principle 0–3 score with one-line justification + total
+- `03-verdict.md` — NEW / REFINE / REDESIGN with reasoning
+- `04-handoff-prompt.md` — copy-pasteable `/make-plan` prompt for the chosen outcome
+## Phases
+### Phase 0: Scope Lock (ALWAYS FIRST)
+Ask the user (or infer from the request) and write `00-scope.md`:
+- What is being audited? (live URL, repo path, Figma frame, component name)
+- Who is the primary user, and what is the primary task?
+- Constraints (brand, stack, deadline)
+- Reference designs or competitors, if any
+If the user is asking about a design that doesn't exist yet, skip Phases 1–2 and go straight to Phase 3 with verdict = **NEW**.
+### Phase 1: Evidence Gathering (FAN OUT)
+Deploy subagents in parallel. Each must return ONLY the required fields below — no prose paragraphs, no scoring.
+**1. Structural Evidence** subagent (always deploy)
+Required fields returned:
+- Total interactive-element count on audited surface
+- Max nesting depth of the primary component tree
+- Repeated-pattern count (same affordance appearing >1 place with the same purpose)
+- Dead-prop / unused-import count
+- File:line citations for every count
+**2. Visual Evidence** subagent (always deploy)
+Mode: if target is a reachable URL or running dev server → use the `agent-browser` skill for screenshots and computed-style inspection. If target is a static repo with no running instance → read source CSS / tokens / component files and report inferred facts only (mark these "INFERRED").
+Required fields returned:
+- Spacing scale observed (px array)
+- Type scale observed (px array)
+- Distinct color count (count of unique hex/oklch tokens actually rendered or referenced)
+- Lowest contrast ratio observed across primary text
+- States present checklist: empty / loading / error / success / focus / disabled — present or missing for each
+**3. Copy & Honesty** subagent (always deploy)
+Required fields returned:
+- List of every user-facing string with file:line
+- Flagged inflations (marketing superlatives without backing)
+- Flagged dark patterns (forced continuity, hidden cost, fake scarcity, confirmshaming)
+- Flagged jargon / unclear labels with proposed plain replacement
+- Label→behavior mismatches with file:line of both
+**4. Weight & Friction** subagent (always deploy)
+Required fields returned:
+- Initial JS bytes (number)
+- Network request count for primary view (number)
+- Time-to-interactive ms (number, measured or estimated with method noted)
+- Animation count on idle screen (number)
+- Notification / badge / modal count on initial load (number)
+**5. Accessibility Evidence** subagent (OPTIONAL — deploy only if target has a meaningful interactive UI surface; skip for static landing pages without interaction)
+Required fields returned:
+- WCAG contrast pass/fail per text token
+- Focus order list across primary controls
+- Keyboard reachability of every primary action (yes/no per action)
+- ARIA landmark count
+- Skip-link present (yes/no)
+**Principle → subagent mapping** (orchestrator uses this when scoring):
+| Principle | Fed by |
+|-----------|--------|
+| #1 innovative | orchestrator-only (judgment using all evidence) |
+| #2 useful | Structural, Accessibility |
+| #3 aesthetic | Visual |
+| #4 understandable | Structural, Copy & Honesty, Accessibility |
+| #5 unobtrusive | Structural, Visual |
+| #6 honest | Copy & Honesty |
+| #7 long-lasting | orchestrator-only (judgment using all evidence) |
+| #8 thorough | Visual |
+| #9 environmentally friendly | Weight & Friction |
+| #10 as little design as possible | Structural |
+The orchestrator writes `01-evidence.md` consolidating all subagent reports. Reject any finding without a source citation. Subagents are explicitly forbidden from scoring — only the orchestrator scores, using the rubric in Phase 2.
+### Phase 2: Scorecard (ORCHESTRATOR)
+The orchestrator scores each of the ten principles itself — do NOT delegate scoring.
+For each principle, write to `02-scorecard.md`:
+```
+N. Good design is <principle> — Score: X/3
+   Evidence: <one-line summary citing 01-evidence.md anchors>
+   Justification: <one sentence on why this score, not the one above or below>
+```
+Per-principle scoring anchors (apply verbatim — pick the level whose signal best matches the audited surface):
+#1 innovative — 3: introduces a pattern not seen in 5+ peer products and ships it with restraint. 2: refreshes an existing pattern with a clear improvement. 1: imitates competitors with minor variation. 0: copies a competitor's flow wholesale.
+#2 useful — 3: primary task completes in fewest possible steps; no decoy actions. 2: primary task completes but adjacent surface adds steps. 1: primary task requires unnecessary detours. 0: primary task is not directly supported on the screen audited.
+#3 aesthetic — 3: spacing/type/color obey a single visible system; no orphan styles. 2: ≤2 minor inconsistencies across audited surface. 1: 3–5 inconsistencies OR one jarring violation. 0: no visible system OR active visual noise.
+#4 understandable — 3: a first-time user names every primary control correctly. 2: 1 control needs a tooltip. 1: 2–3 controls unclear; jargon present. 0: primary action is not identifiable without help.
+#5 unobtrusive — 3: chrome recedes; content is the figure, UI the ground. 2: chrome visible but quiet. 1: decoration competes with content. 0: chrome dominates content.
+#6 honest — 3: every claim, badge, and label maps 1:1 to actual behavior. 2: ≤1 minor inflation (e.g. "powerful" once). 1: 2+ inflations OR one dark pattern. 0: any deceptive flow (forced continuity, hidden cost, fake scarcity).
+#7 long-lasting — 3: visual language has no dated trend markers; would read as current 3 years from now. 2: 1 dated marker. 1: 2–3 dated markers (skeuomorph residue, fad gradients, trend typography). 0: design reads as a specific year's trend.
+#8 thorough — 3: empty / loading / error / success / focus / disabled all present and considered. 2: 1 state missing or rough. 1: 2–3 states missing. 0: 4+ states missing or default-browser.
+#9 environmentally friendly — 3: initial JS <100KB, no idle animation, dark mode honored, prefers-reduced-motion respected. 2: <500KB, motion gated. 1: 500KB–2MB, motion always on. 0: >2MB OR autoplay video OR dark mode ignored.
+#10 as little design as possible — 3: every element earns its place; removing any one breaks the task. 2: ≤2 removable elements. 1: 3–5 removable elements. 0: page is dominated by decoration or duplicated affordances.
+Scoring rules:
+- **Tie-breaker rule**: When uncertain between two scores, pick the lower one. Convergence > generosity.
+- **Score worst, not mean**: When a principle has multiple representative instances on the audited surface, score the worst instance — not the average.
+- **No bonuses, no weights**: Scores stay 0–3 integer. Principles are equally weighted. Total is sum of ten scores, max 30.
+### Phase 3: Verdict (ORCHESTRATOR)
+Write `03-verdict.md` with one of three verdicts, chosen by these rules:
+- **NEW DESIGN** — No design exists yet, OR the existing artifact is a stub/wireframe with no real decisions to preserve.
+- **REFINE** — Total score ≥ 20 AND no individual principle scored 0. The bones are good; iterate.
+- **REDESIGN** — Total score < 20, OR any principle scored 0 on a load-bearing dimension (typically #2 useful, #4 understandable, or #6 honest). Start over from purpose.
+State the verdict in one sentence. Then list the 3–5 highest-leverage moves — each tied to a specific principle and evidence anchor. These become the spine of the next phase's plan.
+**Anti-patterns to reject in your own verdict:**
+- Recommending REFINE because the codebase is large (sunk cost is not a design principle)
+- Recommending REDESIGN because a single screen is ugly (scope it)
+- Recommending NEW when an honest REDESIGN is warranted (don't dodge the critique)
+### Phase 4: /make-plan Handoff
+Write `04-handoff-prompt.md` containing exactly ONE fenced `/make-plan` prompt matching the verdict. The prompt must be self-contained — the next session won't see this audit unless it's quoted in.
+Use the matching template below. Fill every `<bracket>`. Include the top 3–5 moves from Phase 3 verbatim, each with its evidence anchor.
+**Quote-in step (mandatory, applies to all three templates below):** Before emitting the handoff, replace EVERY `<bracket>` placeholder with concrete content from the audit. Inline the verdict paragraph from `03-verdict.md` and the top 3–5 moves verbatim into the template. Do NOT leave bare references like "see DESIGN-IS-.../03-verdict.md" — the next session won't have file access to the audit. The emitted handoff must be readable and actionable with zero external lookups.
+#### Template: NEW DESIGN
+````
+/make-plan Design <product/screen/component name> from scratch.
+Primary user: <who>
+Primary task: <one sentence>
+Constraints: <brand, stack, deadline, accessibility floor>
+Non-goals (do not design these now):
+- <explicit out-of-scope item 1>
+- <explicit out-of-scope item 2>
+- <explicit out-of-scope item 3>
+Reference principles to optimize for, in order:
+1. Useful (#2) — <what useful looks like here>
+2. Understandable (#4) — <what clarity looks like here>
+3. As little design as possible (#10) — <what restraint looks like here>
+Deliverables for the plan:
+- Information architecture (one screen map or component tree)
+- Primary flow wireframe (low-fi, labeled)
+- Token decisions (type scale, spacing scale, color count cap)
+- States checklist (empty, loading, error, success, focus, disabled)
+- Honesty audit on every user-facing string before ship
+Anti-patterns to guard against (specific to NEW):
+- Decoration without function
+- Novel interactions without precedent
+- Copy that overpromises
+- Designing for screens the Non-goals list excluded
+````
+#### Template: REFINE DESIGN
+````
+/make-plan Refine <product/screen/component name> based on a Dieter Rams audit (total <X>/30).
+Verdict paragraph (quoted from 03-verdict.md):
+> <paste the one-sentence verdict here>
+Keep (already strong, do NOT touch in this pass):
+- Principle #<N> (<name>) scored 3 — Evidence: <file:line or anchor>. Regression check: <what to grep / re-test to confirm it still scores 3 after the refine>.
+- <repeat for every principle that scored 3>
+Fix in priority order (top 3–5 moves from the audit, verbatim):
+1. <Principle # — short name>: <specific move>. Evidence: <file:line or anchor>.
+2. <Principle # — short name>: <specific move>. Evidence: <file:line or anchor>.
+3. <Principle # — short name>: <specific move>. Evidence: <file:line or anchor>.
+4. <optional 4th>
+5. <optional 5th>
+Out of scope for this refine pass: <explicit list — what NOT to touch>
+Deliverables for the plan:
+- Per-fix: target files, exact change, verification step
+- Token/spec changes consolidated in one place
+- Regression checklist for every "Keep" item above
+Anti-patterns to guard against (specific to REFINE):
+- Adding new abstractions where a direct change suffices
+- Restyling areas that already scored 3
+- Scope creep into structural redesign (if structure must change, this should be REDESIGN, not REFINE)
+- Letting fixes mutate principles outside the priority list
+````
+#### Template: REDESIGN
+````
+/make-plan Redesign <product/screen/component name>. Current design failed audit at <X>/30 with critical gaps in principles <comma-separated list of 0-scored or 1-scored load-bearing principles>.
+Verdict paragraph (quoted from 03-verdict.md):
+> <paste the one-sentence verdict here>
+Why redesign and not refine: <one sentence — usually a load-bearing principle (#2, #4, or #6) scored 0, or total is below threshold>
+Preserve from current design (MUST be non-empty — at minimum, name the brand tokens):
+- <specific element 1, with file:line>
+- <specific element 2, with file:line>
+- (if structurally nothing survives, write: "Brand tokens only — color palette and logo. Discard everything else.")
+Discard (MUST be non-empty — name the structural patterns causing the failures):
+- <pattern 1>. Evidence: <file:line>. Caused failure on principle #<N>.
+- <pattern 2>. Evidence: <file:line>. Caused failure on principle #<N>.
+Top 3–5 moves from the audit (verbatim):
+1. <Principle # — short name>: <specific move>. Evidence: <file:line>.
+2. <Principle # — short name>: <specific move>. Evidence: <file:line>.
+3. <Principle # — short name>: <specific move>. Evidence: <file:line>.
+Redesign principles in priority order:
+1. <Principle # — name> — <what success looks like>
+2. <Principle # — name> — <what success looks like>
+3. <Principle # — name> — <what success looks like>
+Deliverables for the plan:
+- New information architecture (not derived from old)
+- New primary flow (low-fi, labeled, compared side-by-side to current)
+- States checklist (empty, loading, error, success, focus, disabled)
+- Migration path for users currently on the old design
+- Cutover criteria (when is the old design retired)
+Anti-patterns to guard against (specific to REDESIGN):
+- Porting old structure under new styling
+- Keeping both designs behind a flag indefinitely
+- Redesigning to follow a trend rather than the principles above
+- Treating the Preserve list as optional — it must be filled before this handoff is valid
+````
+## Key Principles (for the auditor)
+- **Evidence over taste** — every score cites a source; "feels wrong" is not a finding
+- **Score what is, not what was intended** — design is what ships, not what was drawn
+- **Honesty applies to the audit too** — if total is 28/30, say REFINE even if the user wanted a redesign; if it's 12/30, say REDESIGN even if the user wanted a refine
+- **One verdict, not three** — pick NEW or REFINE or REDESIGN; do not hedge
+- **Handoff, don't implement** — `design-is` ends at the `/make-plan` prompt; `/make-plan` and `/do` take it from there
+- **Verdict commitment** — Once `02-scorecard.md` is written, the verdict follows the Phase 3 rule mechanically. Never re-score to back into a preferred verdict; if the scorecard says REDESIGN, the handoff is REDESIGN.
+## Failure Modes to Prevent
+- Scoring from screenshots alone without reading the code — redeploy with structural subagent
+- Scoring the codebase instead of the design — re-anchor on user-facing evidence
+- Awarding 3s generously to soften the verdict — recalibrate against the per-principle anchors in Phase 2
+- Producing a handoff prompt that doesn't quote the verdict and top moves — the next session is blind without them
+- Skipping Phase 0 scope lock — auditing the wrong surface wastes Phase 1
+- **Sunk-cost reasoning** — recommending REFINE because the codebase is large; sunk cost is not a design principle
+- **Hedging across verdicts** — "could be REFINE or REDESIGN depending on..." — pick one
+- **Score inflation to match a desired verdict** — score the evidence, then read the verdict off the rule
+- **Letting Phase 0 user preference override Phase 3 evidence** — the user can disagree with the verdict, but the audit reports what the evidence says
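Reviewer note on the `design-is` hunk above: the skill asks for the verdict to be read off the Phase 3 rule mechanically once the scorecard exists. A minimal sketch of that rule follows, assuming the ten scores arrive as integers in principle order. The function name `verdict`, the fixed load-bearing set (#2, #4, #6; the skill only says "typically"), and the `UNDECIDED` label are illustrative assumptions, not part of the package. `UNDECIDED` marks the one case the written rules do not cover: a total of 20 or more with a 0 on a principle outside the load-bearing set. The NEW verdict is decided in Phase 0 and is not modeled here.

```bash
#!/usr/bin/env bash
# Sketch of the Phase 3 verdict rule (hypothetical helper, not shipped in the package).
# Arguments: ten integer scores (0-3) for principles #1..#10, in order.
verdict() {
  if [ "$#" -ne 10 ]; then
    echo "usage: verdict s1 s2 ... s10" >&2
    return 2
  fi
  local total=0 zeros=0 load_bearing_zero=0 i=0 s
  for s in "$@"; do
    i=$((i + 1))
    case "$s" in
      0|1|2|3) ;;
      *) echo "score #$i must be an integer 0-3, got: $s" >&2; return 2 ;;
    esac
    total=$((total + s))
    if [ "$s" -eq 0 ]; then
      zeros=$((zeros + 1))
      # Assumption: "load-bearing" means exactly #2, #4 and #6.
      case "$i" in 2|4|6) load_bearing_zero=1 ;; esac
    fi
  done
  if [ "$total" -lt 20 ] || [ "$load_bearing_zero" -eq 1 ]; then
    echo "REDESIGN $total/30"
  elif [ "$zeros" -eq 0 ]; then
    echo "REFINE $total/30"
  else
    # Total >= 20 with a 0 on a non-load-bearing principle: neither rule fires as written.
    echo "UNDECIDED $total/30"
  fi
}
```

For example, `verdict 3 3 3 3 2 2 2 2 2 2` prints `REFINE 24/30`, while a single 0 on #2 yields REDESIGN even at 27/30.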

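Reviewer note on the Phase 2 anchors in the `design-is` hunk: the #8 (thorough) anchor is the most directly countable one, since it maps the number of missing states to a score. The sketch below assumes the Visual Evidence subagent reports the states it found present as plain words. The helper name `thorough_score` is hypothetical. Counting alone cannot see a state that is present but "rough" or left at browser defaults, so the orchestrator would still need to lower the result by hand in those cases.

```bash
#!/usr/bin/env bash
# Sketch: map the "states present" checklist to the #8 anchor by counting
# missing states (hypothetical helper, not shipped in the package).
# Arguments: the states found present, e.g. "empty loading error".
thorough_score() {
  local required="empty loading error success focus disabled"
  local missing=0 state present found
  for state in $required; do
    found=0
    for present in "$@"; do
      if [ "$present" = "$state" ]; then
        found=1
      fi
    done
    if [ "$found" -eq 0 ]; then
      missing=$((missing + 1))
    fi
  done
  # Anchor: 0 missing -> 3, 1 missing -> 2, 2-3 missing -> 1, 4+ missing -> 0.
  case "$missing" in
    0) echo 3 ;;
    1) echo 2 ;;
    2|3) echo 1 ;;
    *) echo 0 ;;
  esac
}
```

Called as `thorough_score empty loading error success focus`, the sketch reports 2, because only `disabled` is missing.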
package/plugin/skills/make-plan/SKILL.md CHANGED

@@ -61,3 +61,7 @@ The orchestrator consolidates findings into a single Phase 0 output.
 - Adding parameters not in documentation
 - Skipping verification steps
 - Assuming structure without checking examples
+## See Also
+- `oh-my-issues` — the issue-side sibling. When the plan you're being asked to make is rooted in a bug or feature backlog rather than a fresh idea, route through `oh-my-issues` first to cluster issues by root cause into plan masters and `plans/0X-*.md` design docs. `make-plan` then operates on the design doc for one plan slice.

package/plugin/skills/oh-my-issues/SKILL.md ADDED

@@ -0,0 +1,226 @@
+---
+name: oh-my-issues
+description: Cluster a GitHub issue backlog by root cause into a small set of plan-master issues, redirect children with a standardized comment, and bundle architectural-fix PRs that close clusters atomically. Use when an issue tracker has accumulated dozens of reports that share underlying defects, when asked to triage / consolidate / cluster / dedupe issues, when asked to build a plan series or roadmap from open issues, or when routing a new incoming bug into an existing plan.
+---
+# oh-my-issues
+Turn an issue backlog into a roadmap. Issues are symptom data, not units of work — the unit of work is the architectural defect that produces them. The end state is `open issues == open plans`, 1:1.
+## Core principle
+Stop closing issues one at a time. Group symptoms that share a single architectural fix into a cluster, give the cluster one canonical home (a plan-master issue + a `plans/0X-*.md` design doc), close every child with a standardized redirect, and ship one PR per cluster that closes all children atomically. New incoming bugs get appended to the matching master as a "Round N" comment, not opened as new tracked issues.
+This compounds three ways: architectural fixes retire whole symptom families, the plan's test matrix institutionalizes prevention in CI, and standardized triage makes residual inflow cheap.
+## When to use
+- The repo has 20+ open issues and many feel like duplicates or platform-specific symptoms of the same defect.
+- The user asks to "triage", "consolidate", "cluster", "dedupe", "group", or "make a plan from" the issue list.
+- A new bug is filed and the user wants to know whether it belongs to existing work.
+- The user wants to ship a focused PR that resolves a cluster of related issues.
+## When NOT to use
+- Fewer than ~15 open issues: just close them.
+- Issues are genuinely independent (no shared root causes): one fix per issue is correct.
+- The repo lacks `plans/` discipline and the user does not want to introduce one — propose first, do not impose.
+## Three modes
+### Mode 1: Cluster pass (initial reduction)
+Use when the backlog has never been consolidated. Goal: go from N issues to N_plans masters in one operation.
+1. **Read everything in full.** Fetch every open issue's body *and* its comment thread — not just titles. Surface-level grouping fails without full text, and reproduction steps, linked duplicates, and diagnostic output often live in comments rather than the original body. See "GitHub CLI primitives" below for the correct paginated listing + per-issue comment fetch (a single `gh issue list` call does **not** return comment bodies).
+2. **Cluster by root cause, not by surface.** The clustering question is *would one architectural change retire all of these?* — not *do these mention the same word?*. "Windows" is a surface; "spawn contract violated by host shells" is a root cause. Two issues with different surfaces can share a cluster (e.g. an env-var leak in two different code paths sharing one missing env-isolation boundary).
+3. **Name each cluster as an architectural problem.** Title format: `[plan-XX] <Architectural Defect> — <one-line scope>`. Example: `[plan-02] Spawn-Contract Templating — canonical ${CLAUDE_PLUGIN_ROOT} resolution across all hosts`. The title must imply a fix, not a topic.
+4. **Open one master issue per cluster** with a body that lists: the architectural defect, the children (by issue number), the fix sequence, and a required test matrix (host × IDE × shell, etc.) that prevents regression.
+5. **Mirror each master as `plans/0X-<slug>.md`** in the repo. The issue is the public tracker; the doc is the design. They reference each other.
+6. **Close every child** with the standardized redirect comment (see below) and state `not planned`.
+7. **Verify end state:** `gh issue list --state open` returns exactly the masters and nothing else.
+Target shape for ~100 issues: 4–8 masters. More than 10 means you're clustering by surface; fewer than 3 means clusters are too broad to ship as one PR each.
+### Mode 2: Triage (new incoming bug, steady state)
+Use when a new issue is filed after consolidation is in place. Goal: never let the issue list re-accumulate.
+1. **Read the new issue's body in full.**
+2. **Pattern-match the symptom against existing plan masters.** For each open master, ask: *would the fix described here also fix this new bug?* If yes → it belongs to that plan.
+3. **If a match exists**, post a "Round N" comment on the master that:
+   - Names the new child by number
+   - Describes the symptom in one line
+   - Sketches the concrete fix (1–3 lines, e.g. "guard with `case "$_SH" in /*.exe|"") _SH=bash ;; esac`")
+   - Adds any new test-matrix cell the bug exposes
+4. **Close the child** with the standardized redirect comment, `not planned`.
+5. **If no match exists** and the bug is genuinely novel: open a new plan master + `plans/0X-*.md`. Resist this. Most bugs are children of existing plans.
+### Mode 3: Bundle (ship the cluster)
+Use when a plan slice is ready to ship. Goal: one PR closes N children atomically.
+1. **List the master's children.** From the master body and consolidation comments, collect every child issue number routed to this plan.
+2. **Verify each child's symptom is covered** by the architectural fix in the PR. If a child is not covered, the PR is not ready or that child belongs in a different plan.
+3. **Generate the PR description**: title is the plan slice (e.g. "fix(spawn): canonical ${CLAUDE_PLUGIN_ROOT} resolution"); body lists every child with `Closes #N` so GitHub auto-closes them on merge.
+4. **Add the test matrix from the plan** to CI in the same PR. Without the matrix, the cluster will re-emerge.
+5. **After merge**, the master issue can be closed only if every child was covered. If the plan has remaining scope, leave the master open and link the PR as a partial-shipping checkpoint.
+## Naming a plan master
+A plan-master title must imply its fix.
+| Bad (surface) | Good (architectural) |
+|---|---|
+| Windows bugs | Spawn-Contract Templating across hosts |
+| Worker crashes | Worker / Daemon Lifecycle Hardening — supervision, health, retry |
+| Auth issues | Worker Env Isolation — strip host CLI env from the SDK subprocess |
+| Install failures | Installer Failure Transparency — cross-IDE error taxonomy + 12×4 test matrix |
+If you cannot write a one-line architectural scope, the cluster is wrong.
+## The standardized redirect comment
+Use this exact phrasing on every child closure. Consistency lets contributors recognize the pattern at a glance and keeps the audit trail searchable.
+```text
+Consolidating into #<MASTER> (plan-XX). The root cause and fix sequencing are tracked there alongside the rest of the cluster — please follow that issue for progress.
+```
+Close as `not planned` (not `completed`) — the child was a symptom, not a unit of work.
+## GitHub CLI primitives
+Resolve repo:
+```bash
+repo_json=$(gh repo view --json owner,name)
+owner=$(jq -r '.owner.login // .owner.name' <<<"$repo_json")
+repo=$(jq -r '.name' <<<"$repo_json")
+```
+List all open issues (the read-everything pass). Two gotchas:
+- `gh issue list --json comments` returns only a count placeholder, not the comment bodies. You must fetch comments per issue with `gh issue view <N> --json comments`.
+- Any explicit `--limit` silently truncates if the backlog is larger. Always check the total open count first.
+```bash
+# 1. Confirm total — never trust an arbitrary --limit.
+# Note: GitHub's REST API treats PRs as issues, so .open_issues_count
+# from /repos/{owner}/{repo} is actually issues + PRs. Use the search
+# API to get the issue-only count.
+total=$(gh api "search/issues?q=repo:$owner/$repo+is:issue+is:open" --jq '.total_count')
+echo "Open issues: $total"
+# 2. List bodies (set --limit at or above the true total)
+gh issue list --state open --limit "$total" \
+  --json number,title,body,labels,author,createdAt
+# 3. For each issue, fetch its full comment thread
+for n in $(gh issue list --state open --limit "$total" --json number --jq '.[].number'); do
+  echo "=== Issue #$n ==="
+  gh issue view "$n" --json comments \
+    --jq '.comments[] | "\(.author.login) (\(.createdAt)): \(.body)"'
+done
+```
+If `total > 1000`, paginate via the REST API: `gh api "repos/$owner/$repo/issues?state=open&per_page=100&page=N"` looped until the result array is empty (note this includes PRs, so filter `select(.pull_request|not)`).
+Open a plan master:
+```bash
+gh issue create \
+  --title "[plan-02] Spawn-Contract Templating — canonical \${CLAUDE_PLUGIN_ROOT} resolution across all hosts" \
+  --body-file plans/02-spawn-contract-templating.md \
+  --label plan,plan-02
+```
+Post the consolidation comment + close the child:
+```bash
+gh issue comment <CHILD> --body "Consolidating into #<MASTER> (plan-XX). The root cause and fix sequencing are tracked there alongside the rest of the cluster — please follow that issue for progress."
+gh issue close <CHILD> --reason "not planned"
+```
+Append a "Round N" triage comment to a master:
+```bash
+gh issue comment <MASTER> --body "$(cat <<'EOF'
+**Round N consolidation**
+- #<CHILD> (<one-line symptom>) folded into this plan as <classification>.
+Proposed fix: <1–3 line sketch>.
+Adds matrix cell: <host/IDE/shell combination>.
+EOF
+)"
+```
+Verify final state:
+```bash
+gh issue list --state open --json number,title \
+  | jq -r '.[] | "\(.number)\t\(.title)"'
+```
+Output should be exactly the plan masters.
+## Plan master body template
+Save as `plans/0X-<slug>.md` and use as `--body-file` for the master issue.
+```markdown
+# [plan-XX] <Architectural Defect> — <one-line scope>
+## Defect
+<One paragraph: what is structurally broken, why it produces the observed family of symptoms.>
+## Children
+- #N — <symptom one-liner>
+- #N — <symptom one-liner>
+- ...
+## Fix sequence
+1. <First architectural change — bounded, reviewable>
+2. <Second>
+3. ...
+## Test matrix
+| Axis A | Axis B | Required behavior |
+|---|---|---|
+| ... | ... | ... |
+The matrix lives in CI. A future regression must fail CI before a user can file.
+## Out of scope
+<What this plan deliberately does not cover, with pointers to other plan masters.>
+```
+## Health checks
+Run periodically against the plan masters to catch the failure modes.
+- **Graveyard master:** master issue has accumulated 5+ "Round N" comments without a shipping PR. The plan needs a forcing PR or it must be split.
+- **Over-broad master:** the children's fixes cannot fit one PR. Split into two plans with narrower scope.
+- **Surface-clustered master:** the children share a topic but not a fix. Re-cluster by root cause; some children belong to different plans.
+- **Drift between issue and doc:** the plan master body and `plans/0X-*.md` disagree. Pick one as canonical (the doc) and regenerate the issue body from it.
+## Stop conditions
+For a cluster pass: stop when `gh issue list --state open` returns exactly the masters.
+For a triage: stop when the new child is closed and the master has a Round-N entry.
+For a bundle: stop when the PR is merged and every listed child is auto-closed by `Closes #N`.
+## Failure modes worth refusing
+- **Premature clustering** before reading every issue body in full. Don't.
+- **Closing children before the master is open.** Children must always have a redirect target.
+- **Using the redirect comment for issues that aren't symptoms** (e.g. genuine feature requests with no shared root cause). Those stay open or get their own track.
+- **Closing a master before every listed child is shipped.** The master is the contract; closing it early breaks the audit trail.
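Reviewer note on the `oh-my-issues` hunk above: the cluster-pass stop condition ("open issues are exactly the masters") can be checked by script rather than by eye. The sketch below assumes input in the `number<TAB>title` shape printed by the skill's "Verify final state" command, and assumes every master title starts with a `[plan-NN]` tag as in the skill's examples. The function name `non_masters` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: list open issues that are not plan masters (hypothetical helper,
# not shipped in the package). Reads "number<TAB>title" lines on stdin.
# Exit status 0 means only masters remain; 1 means stray issues were printed.
non_masters() {
  local re='^\[plan-[0-9]+\]'
  local number title stray=0
  # The "|| [ -n ... ]" clause keeps a final line that lacks a trailing newline.
  while IFS=$'\t' read -r number title || [ -n "$number" ]; do
    if [ -z "$number" ]; then
      continue
    fi
    if [[ ! "$title" =~ $re ]]; then
      printf '#%s\t%s\n' "$number" "$title"
      stray=1
    fi
  done
  return "$stray"
}
```

Piping the output of the "Verify final state" command into `non_masters` prints any issue still open outside the plan series, so an empty result with exit status 0 corresponds to the stop condition.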