mindforge-cc 11.5.1 → 11.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (214)
  1. package/.agent/mindforge/skill-tdd.md +53 -0
  2. package/.agent/mindforge/skills-index.md +118 -0
  3. package/.agent/mindforge/systematic-debug.md +60 -0
  4. package/.agent/mindforge/wf-catalog.md +37 -0
  5. package/.agent/mindforge/wf-code-audit.md +31 -0
  6. package/.agent/mindforge/wf-competitive-analysis.md +31 -0
  7. package/.agent/mindforge/wf-deep-research.md +32 -0
  8. package/.agent/mindforge/wf-feature-planner.md +31 -0
  9. package/.agent/mindforge/wf-incident-response.md +31 -0
  10. package/.agent/mindforge/wf-onboard-codebase.md +31 -0
  11. package/.agent/mindforge/wf-perf-optimize.md +31 -0
  12. package/.agent/mindforge/wf-pr-review.md +31 -0
  13. package/.agent/mindforge/wf-refactor-plan.md +31 -0
  14. package/.agent/mindforge/wf-release-prep.md +31 -0
  15. package/.agent/mindforge/wf-tdd-sprint.md +31 -0
  16. package/.agent/mindforge/wf-tech-evaluation.md +31 -0
  17. package/.agent/skills/1password-skill/SKILL.md +156 -0
  18. package/.agent/skills/1password-skill/references/cli-examples.md +31 -0
  19. package/.agent/skills/1password-skill/references/get-started.md +21 -0
  20. package/.agent/skills/article-illustrator/SKILL.md +199 -0
  21. package/.agent/skills/article-illustrator/references/prompt-construction.md +426 -0
  22. package/.agent/skills/article-illustrator/references/style-presets.md +80 -0
  23. package/.agent/skills/article-illustrator/references/styles.md +224 -0
  24. package/.agent/skills/article-illustrator/references/usage.md +50 -0
  25. package/.agent/skills/article-illustrator/references/workflow.md +332 -0
  26. package/.agent/skills/arxiv/SKILL.md +275 -0
  27. package/.agent/skills/blogwatcher/SKILL.md +130 -0
  28. package/.agent/skills/code-wiki/SKILL.md +438 -0
  29. package/.agent/skills/code-wiki/templates/README.md +31 -0
  30. package/.agent/skills/code-wiki/templates/architecture.md +30 -0
  31. package/.agent/skills/code-wiki/templates/getting-started.md +47 -0
  32. package/.agent/skills/code-wiki/templates/module.md +38 -0
  33. package/.agent/skills/codebase-inspection/SKILL.md +109 -0
  34. package/.agent/skills/comic-creator/SKILL.md +240 -0
  35. package/.agent/skills/comic-creator/references/analysis-framework.md +176 -0
  36. package/.agent/skills/comic-creator/references/auto-selection.md +71 -0
  37. package/.agent/skills/comic-creator/references/base-prompt.md +98 -0
  38. package/.agent/skills/comic-creator/references/character-template.md +180 -0
  39. package/.agent/skills/comic-creator/references/ohmsha-guide.md +85 -0
  40. package/.agent/skills/comic-creator/references/partial-workflows.md +106 -0
  41. package/.agent/skills/comic-creator/references/storyboard-template.md +143 -0
  42. package/.agent/skills/comic-creator/references/workflow.md +401 -0
  43. package/.agent/skills/concept-diagrams/SKILL.md +355 -0
  44. package/.agent/skills/concept-diagrams/references/dashboard-patterns.md +43 -0
  45. package/.agent/skills/concept-diagrams/references/infrastructure-patterns.md +144 -0
  46. package/.agent/skills/concept-diagrams/references/physical-shape-cookbook.md +42 -0
  47. package/.agent/skills/creative-ideation/SKILL.md +144 -0
  48. package/.agent/skills/creative-ideation/references/full-prompt-library.md +110 -0
  49. package/.agent/skills/devops-cli/SKILL.md +149 -0
  50. package/.agent/skills/devops-cli/references/app-discovery.md +112 -0
  51. package/.agent/skills/devops-cli/references/authentication.md +59 -0
  52. package/.agent/skills/devops-cli/references/cli-reference.md +104 -0
  53. package/.agent/skills/devops-cli/references/running-apps.md +171 -0
  54. package/.agent/skills/devops-watchers/SKILL.md +103 -0
  55. package/.agent/skills/docker-management/SKILL.md +273 -0
  56. package/.agent/skills/domain-intel/SKILL.md +96 -0
  57. package/.agent/skills/duckduckgo-search/SKILL.md +230 -0
  58. package/.agent/skills/github-auth/SKILL.md +240 -0
  59. package/.agent/skills/github-code-review/SKILL.md +474 -0
  60. package/.agent/skills/github-code-review/references/review-output-template.md +74 -0
  61. package/.agent/skills/github-issues/SKILL.md +363 -0
  62. package/.agent/skills/github-issues/templates/bug-report.md +35 -0
  63. package/.agent/skills/github-issues/templates/feature-request.md +31 -0
  64. package/.agent/skills/github-pr-workflow/SKILL.md +360 -0
  65. package/.agent/skills/github-pr-workflow/references/ci-troubleshooting.md +183 -0
  66. package/.agent/skills/github-pr-workflow/references/conventional-commits.md +71 -0
  67. package/.agent/skills/github-pr-workflow/templates/pr-body-bugfix.md +35 -0
  68. package/.agent/skills/github-pr-workflow/templates/pr-body-feature.md +33 -0
  69. package/.agent/skills/github-repo-management/SKILL.md +509 -0
  70. package/.agent/skills/github-repo-management/references/github-api-cheatsheet.md +161 -0
  71. package/.agent/skills/godmode/SKILL.md +396 -0
  72. package/.agent/skills/godmode/references/jailbreak-templates.md +128 -0
  73. package/.agent/skills/godmode/references/refusal-detection.md +142 -0
  74. package/.agent/skills/hyperframes/SKILL.md +182 -0
  75. package/.agent/skills/hyperframes/references/cli.md +185 -0
  76. package/.agent/skills/hyperframes/references/composition.md +129 -0
  77. package/.agent/skills/hyperframes/references/features.md +289 -0
  78. package/.agent/skills/hyperframes/references/gsap.md +136 -0
  79. package/.agent/skills/hyperframes/references/troubleshooting.md +137 -0
  80. package/.agent/skills/hyperframes/references/website-to-video.md +145 -0
  81. package/.agent/skills/jupyter-live-kernel/SKILL.md +160 -0
  82. package/.agent/skills/kanban-orchestrator/SKILL.md +209 -0
  83. package/.agent/skills/kanban-worker/SKILL.md +188 -0
  84. package/.agent/skills/llm-wiki/SKILL.md +499 -0
  85. package/.agent/skills/meme-generation/SKILL.md +122 -0
  86. package/.agent/skills/node-inspect-debugger/SKILL.md +312 -0
  87. package/.agent/skills/obsidian/SKILL.md +60 -0
  88. package/.agent/skills/osint-investigation/SKILL.md +269 -0
  89. package/.agent/skills/osint-investigation/templates/source-template.md +59 -0
  90. package/.agent/skills/oss-forensics/SKILL.md +422 -0
  91. package/.agent/skills/oss-forensics/references/evidence-types.md +89 -0
  92. package/.agent/skills/oss-forensics/references/github-archive-guide.md +184 -0
  93. package/.agent/skills/oss-forensics/references/investigation-templates.md +131 -0
  94. package/.agent/skills/oss-forensics/references/recovery-techniques.md +164 -0
  95. package/.agent/skills/oss-forensics/templates/forensic-report.md +151 -0
  96. package/.agent/skills/oss-forensics/templates/malicious-package-report.md +43 -0
  97. package/.agent/skills/parallel-cli/SKILL.md +384 -0
  98. package/.agent/skills/pinggy-tunnel/SKILL.md +302 -0
  99. package/.agent/skills/pixel-art/SKILL.md +209 -0
  100. package/.agent/skills/pixel-art/references/palettes.md +49 -0
  101. package/.agent/skills/plan/SKILL.md +331 -0
  102. package/.agent/skills/polymarket/SKILL.md +75 -0
  103. package/.agent/skills/polymarket/references/api-endpoints.md +220 -0
  104. package/.agent/skills/python-debugpy/SKILL.md +368 -0
  105. package/.agent/skills/requesting-code-review/SKILL.md +273 -0
  106. package/.agent/skills/research-paper-writing/SKILL.md +2367 -0
  107. package/.agent/skills/research-paper-writing/references/autoreason-methodology.md +394 -0
  108. package/.agent/skills/research-paper-writing/references/checklists.md +434 -0
  109. package/.agent/skills/research-paper-writing/references/citation-workflow.md +563 -0
  110. package/.agent/skills/research-paper-writing/references/experiment-patterns.md +728 -0
  111. package/.agent/skills/research-paper-writing/references/human-evaluation.md +476 -0
  112. package/.agent/skills/research-paper-writing/references/paper-types.md +481 -0
  113. package/.agent/skills/research-paper-writing/references/reviewer-guidelines.md +433 -0
  114. package/.agent/skills/research-paper-writing/references/sources.md +191 -0
  115. package/.agent/skills/research-paper-writing/references/writing-guide.md +474 -0
  116. package/.agent/skills/research-paper-writing/templates/README.md +251 -0
  117. package/.agent/skills/rest-graphql-debug/SKILL.md +507 -0
  118. package/.agent/skills/s6-container-supervision/SKILL.md +171 -0
  119. package/.agent/skills/scrapling/SKILL.md +328 -0
  120. package/.agent/skills/sherlock/SKILL.md +186 -0
  121. package/.agent/skills/simplify-code/SKILL.md +168 -0
  122. package/.agent/skills/skill-authoring/SKILL.md +158 -0
  123. package/.agent/skills/spike/SKILL.md +190 -0
  124. package/.agent/skills/subagent-driven-development/SKILL.md +345 -0
  125. package/.agent/skills/subagent-driven-development/references/context-budget-discipline.md +53 -0
  126. package/.agent/skills/subagent-driven-development/references/gates-taxonomy.md +93 -0
  127. package/.agent/skills/systematic-debugging/SKILL.md +360 -0
  128. package/.agent/skills/test-driven-development/SKILL.md +336 -0
  129. package/.agent/skills/video-orchestrator/SKILL.md +194 -0
  130. package/.agent/skills/video-orchestrator/references/examples.md +227 -0
  131. package/.agent/skills/video-orchestrator/references/intake.md +166 -0
  132. package/.agent/skills/video-orchestrator/references/kanban-setup.md +278 -0
  133. package/.agent/skills/video-orchestrator/references/monitoring.md +180 -0
  134. package/.agent/skills/video-orchestrator/references/role-archetypes.md +298 -0
  135. package/.agent/skills/video-orchestrator/references/tool-matrix.md +317 -0
  136. package/.agent/skills/web-pentest/SKILL.md +332 -0
  137. package/.agent/skills/web-pentest/references/bypass-techniques.md +133 -0
  138. package/.agent/skills/web-pentest/references/exploitation-techniques.md +204 -0
  139. package/.agent/skills/web-pentest/references/scope-enforcement.md +110 -0
  140. package/.agent/skills/web-pentest/references/vuln-taxonomy.md +81 -0
  141. package/.agent/skills/web-pentest/templates/authorization.md +69 -0
  142. package/.agent/skills/web-pentest/templates/pentest-report.md +178 -0
  143. package/.claude/commands/mindforge/skill-tdd.md +53 -0
  144. package/.claude/commands/mindforge/skills-index.md +118 -0
  145. package/.claude/commands/mindforge/systematic-debug.md +60 -0
  146. package/.claude/commands/mindforge/wf-catalog.md +37 -0
  147. package/.claude/commands/mindforge/wf-code-audit.md +31 -0
  148. package/.claude/commands/mindforge/wf-competitive-analysis.md +31 -0
  149. package/.claude/commands/mindforge/wf-deep-research.md +32 -0
  150. package/.claude/commands/mindforge/wf-feature-planner.md +31 -0
  151. package/.claude/commands/mindforge/wf-incident-response.md +31 -0
  152. package/.claude/commands/mindforge/wf-onboard-codebase.md +31 -0
  153. package/.claude/commands/mindforge/wf-perf-optimize.md +31 -0
  154. package/.claude/commands/mindforge/wf-pr-review.md +31 -0
  155. package/.claude/commands/mindforge/wf-refactor-plan.md +31 -0
  156. package/.claude/commands/mindforge/wf-release-prep.md +31 -0
  157. package/.claude/commands/mindforge/wf-tdd-sprint.md +31 -0
  158. package/.claude/commands/mindforge/wf-tech-evaluation.md +31 -0
  159. package/.mindforge/config.json +2 -2
  160. package/.mindforge/dynamic-workflows/REGISTRY.md +65 -0
  161. package/.mindforge/dynamic-workflows/index.json +171 -0
  162. package/.mindforge/dynamic-workflows/scripts/code-audit.js +103 -0
  163. package/.mindforge/dynamic-workflows/scripts/competitive-analysis.js +85 -0
  164. package/.mindforge/dynamic-workflows/scripts/deep-research.js +151 -0
  165. package/.mindforge/dynamic-workflows/scripts/feature-planner.js +104 -0
  166. package/.mindforge/dynamic-workflows/scripts/incident-response.js +106 -0
  167. package/.mindforge/dynamic-workflows/scripts/onboard-codebase.js +102 -0
  168. package/.mindforge/dynamic-workflows/scripts/perf-optimize.js +128 -0
  169. package/.mindforge/dynamic-workflows/scripts/pr-review.js +87 -0
  170. package/.mindforge/dynamic-workflows/scripts/refactor-plan.js +121 -0
  171. package/.mindforge/dynamic-workflows/scripts/release-prep.js +102 -0
  172. package/.mindforge/dynamic-workflows/scripts/tdd-sprint.js +103 -0
  173. package/.mindforge/dynamic-workflows/scripts/tech-evaluation.js +72 -0
  174. package/.mindforge/memory/sync-manifest.json +1 -1
  175. package/.mindforge/skills/arxiv/SKILL.md +294 -0
  176. package/.mindforge/skills/blogwatcher/SKILL.md +147 -0
  177. package/.mindforge/skills/code-wiki/SKILL.md +457 -0
  178. package/.mindforge/skills/codebase-inspection/SKILL.md +126 -0
  179. package/.mindforge/skills/concept-diagrams/SKILL.md +373 -0
  180. package/.mindforge/skills/creative-ideation/SKILL.md +162 -0
  181. package/.mindforge/skills/domain-intel/SKILL.md +116 -0
  182. package/.mindforge/skills/duckduckgo-search/SKILL.md +249 -0
  183. package/.mindforge/skills/github-code-review/SKILL.md +493 -0
  184. package/.mindforge/skills/github-issues/SKILL.md +382 -0
  185. package/.mindforge/skills/github-pr-workflow/SKILL.md +379 -0
  186. package/.mindforge/skills/jupyter-live-kernel/SKILL.md +179 -0
  187. package/.mindforge/skills/kanban-orchestrator/SKILL.md +227 -0
  188. package/.mindforge/skills/kanban-worker/SKILL.md +206 -0
  189. package/.mindforge/skills/meme-generation/SKILL.md +141 -0
  190. package/.mindforge/skills/obsidian/SKILL.md +80 -0
  191. package/.mindforge/skills/osint-investigation/SKILL.md +288 -0
  192. package/.mindforge/skills/oss-forensics/SKILL.md +421 -0
  193. package/.mindforge/skills/pixel-art/SKILL.md +228 -0
  194. package/.mindforge/skills/plan/SKILL.md +350 -0
  195. package/.mindforge/skills/requesting-code-review/SKILL.md +292 -0
  196. package/.mindforge/skills/research-paper-writing/SKILL.md +2384 -0
  197. package/.mindforge/skills/scrapling/SKILL.md +345 -0
  198. package/.mindforge/skills/sherlock/SKILL.md +203 -0
  199. package/.mindforge/skills/simplify-code/SKILL.md +187 -0
  200. package/.mindforge/skills/spike/SKILL.md +209 -0
  201. package/.mindforge/skills/subagent-driven-development/SKILL.md +364 -0
  202. package/.mindforge/skills/systematic-debugging/SKILL.md +379 -0
  203. package/.mindforge/skills/test-driven-development/SKILL.md +355 -0
  204. package/.mindforge/skills/web-pentest/SKILL.md +327 -0
  205. package/CHANGELOG.md +71 -0
  206. package/MINDFORGE.md +2 -2
  207. package/README.md +72 -3
  208. package/RELEASENOTES.md +109 -0
  209. package/bin/installer-core.js +6 -2
  210. package/bin/mindforge-cli.js +7 -0
  211. package/bin/workflows/workflow-runner.js +110 -0
  212. package/docs/commands-reference.md +25 -0
  213. package/docs/getting-started.md +42 -5
  214. package/package.json +2 -1
@@ -0,0 +1,53 @@
1
+ # Context Budget Discipline
2
+
3
+ Practical rules for keeping orchestrator context lean when spawning subagents or reading large artifacts. Use these whenever you're running a multi-step agent loop that will consume significant context — plan execution, subagent orchestration, review pipelines, multi-file refactors.
4
+
5
+ Adapted from the GSD (Get Shit Done) project's context-budget reference — MIT © 2025 Lex Christopherson ([gsd-build/get-shit-done](https://github.com/gsd-build/get-shit-done)).
6
+
7
+ ## Universal rules
8
+
9
+ Every workflow that spawns agents or reads significant content must follow these:
10
+
11
+ 1. **Never read agent definition files.** `delegate_task` auto-loads them — you reading them too just doubles the cost.
12
+ 2. **Never inline large files into subagent prompts.** Tell the agent to read the file from disk with `read_file` instead. The subagent gets full content; your context stays lean.
13
+ 3. **Read depth scales with context window.** See the table below.
14
+ 4. **Delegate heavy work to subagents.** The orchestrator routes; it doesn't execute.
15
+ 5. **Proactively warn** the user when you've consumed significant context ("Context is getting heavy — consider checkpointing progress before we continue").
16
+
17
+ ## Read depth by context window
18
+
19
+ Check the model's actual context window (not "it's Claude so 200K"). Some Sonnet deployments are 1M, some are 200K. If you don't know, assume the smaller one — err toward leanness.
20
+
21
+ | Context window | Subagent output reading | Summary files | Verification files | Plans for other phases |
22
+ |----------------|-------------------------|---------------|--------------------|-----------------------|
23
+ | < 500k (e.g. 200k) | Frontmatter only | Frontmatter only | Frontmatter only | Current phase only |
24
+ | >= 500k (1M models) | Full body permitted | Full body permitted | Full body permitted | Current phase only |
25
+
26
+ "Frontmatter only" means: read enough to see the final status/verdict/conclusion. If the subagent wrote a 3000-line debug log, read the summary section it produced, not the log.
27
+
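As an illustration that is not part of the package diff, the read-depth table above reduces to a single threshold check. This is a minimal Python sketch: the function name `read_depth`, the dictionary keys, and the 200,000-token fallback for an unknown window are assumptions made for the example. Only the 500k threshold and the "assume the smaller one" rule come from the text.

```python
from typing import Dict, Optional

FULL_BODY_THRESHOLD = 500_000   # tokens; threshold taken from the table above
ASSUMED_SMALL_WINDOW = 200_000  # assumption: fallback when the window is unknown


def read_depth(context_window: Optional[int]) -> Dict[str, str]:
    """Return how deeply each artifact type may be read for a given window."""
    # Unknown window: err toward leanness by assuming the smaller deployment.
    window = ASSUMED_SMALL_WINDOW if context_window is None else context_window
    depth = "full_body" if window >= FULL_BODY_THRESHOLD else "frontmatter_only"
    return {
        "subagent_output": depth,
        "summary_files": depth,
        "verification_files": depth,
        # Plans for other phases are never read, regardless of window size.
        "other_phase_plans": "current_phase_only",
    }
```

Calling `read_depth(None)` yields frontmatter-only reads, which matches the instruction to assume the smaller window when the real size is unknown.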
28
+ ## Four-tier degradation model
29
+
30
+ Monitor your context usage and shift behavior as you climb the tiers. The point is to notice *before* you hit the wall, not when responses start truncating.
31
+
32
+ | Tier | Usage | Behavior |
33
+ |------|-------|----------|
34
+ | **PEAK** | 0 – 30% | Full operations. Read bodies, spawn multiple agents in parallel, inline results freely. |
35
+ | **GOOD** | 30 – 50% | Normal operations. Prefer frontmatter reads. Delegate aggressively. |
36
+ | **DEGRADING** | 50 – 70% | Economize. Frontmatter-only reads, minimal inlining, **warn the user** about budget. |
37
+ | **POOR** | 70%+ | Emergency mode. **Checkpoint progress immediately.** No new reads unless critical. Finish the current task and stop cleanly. |
38
+
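The tier table above can be sketched as a small classifier. This example is illustrative and not part of the package: `Tier` and `classify_tier` are hypothetical names. Because the table's ranges share their endpoints (30%, 50%, 70%), the sketch assumes a shared boundary belongs to the higher tier.

```python
from enum import Enum


class Tier(Enum):
    PEAK = "PEAK"
    GOOD = "GOOD"
    DEGRADING = "DEGRADING"
    POOR = "POOR"


def classify_tier(used_tokens: int, window_tokens: int) -> Tier:
    """Map context usage onto the four tiers from the table above."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    if used_tokens < 0:
        raise ValueError("used_tokens must not be negative")
    # Integer comparison avoids floating-point surprises at the boundaries.
    # Assumption: exactly 30%, 50%, or 70% counts as the higher tier.
    scaled = used_tokens * 100
    if scaled < 30 * window_tokens:
        return Tier.PEAK
    if scaled < 50 * window_tokens:
        return Tier.GOOD
    if scaled < 70 * window_tokens:
        return Tier.DEGRADING
    return Tier.POOR
```

An orchestrator could call this before each read or delegation and change behavior when the returned tier changes, rather than waiting for truncated responses.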
39
+ ## Early warning signs (before panic thresholds fire)
40
+
41
+ Quality degrades *gradually* before hard limits hit. Watch for these:
42
+
43
+ - **Silent partial completion.** Subagent claims done but implementation is incomplete. Self-checks catch file existence, not semantic completeness. Always verify subagent output against the plan's must-haves, not just "did a file appear?"
44
+ - **Increasing vagueness.** Agent starts using phrases like "appropriate handling" or "standard patterns" instead of specific code. This is context pressure showing up before budget warnings fire.
45
+ - **Skipped protocol steps.** Agent omits steps it would normally follow. If the success criteria have 8 items and the report covers 5, suspect context pressure, not "the agent decided 5 was enough."
46
+
47
+ When these signs appear, checkpoint the work and either reset context or hand off to a fresh subagent.
48
+
49
+ ## Fundamental limitation
50
+
51
+ When you orchestrate, you cannot verify semantic correctness of subagent output — only structural completeness ("did the file appear?", "does the test pass?"). Semantic verification requires either running the code yourself or delegating a review pass to another fresh subagent.
52
+
53
+ **Mitigation:** in every task you delegate, include explicit "must-have" truths the subagent must confirm in its response (e.g., "confirm your test actually tests X, not just that X was imported"). The subagent re-asserting concrete facts is evidence; vague summaries are not.
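Universal rule 2 and the mitigation above can be combined in the prompt an orchestrator sends. The following is a minimal sketch and not part of the package: `build_delegation_prompt` is a hypothetical helper. `read_file` is the tool name used in the text, and the prompt wording is an assumption.

```python
from typing import Sequence


def build_delegation_prompt(
    task: str,
    file_paths: Sequence[str],
    must_haves: Sequence[str],
) -> str:
    """Build a subagent prompt that references files and lists must-haves."""
    if not must_haves:
        raise ValueError("every delegated task needs at least one must-have")
    lines = [f"Task: {task}", ""]
    if file_paths:
        # Pass paths, not contents, so the orchestrator's context stays lean.
        lines.append("Read these files from disk with read_file:")
        lines.extend(f"- {path}" for path in file_paths)
        lines.append("")
    lines.append("In your response, explicitly confirm each of the following:")
    lines.extend(f"{i}. {item}" for i, item in enumerate(must_haves, start=1))
    return "\n".join(lines)
```

The orchestrator can then check the subagent's reply against the same numbered list, treating a missing confirmation as a reason to re-verify rather than as success.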
@@ -0,0 +1,93 @@
1
+ # Gates Taxonomy
2
+
3
+ Canonical gate types for validation checkpoints across any workflow that spawns subagents, runs review loops, or has human-approval pauses. Every validation checkpoint maps to one of these four types — naming them explicitly makes the workflow legible and prevents "what happens when this check fails?" confusion.
4
+
5
+ Adapted from the GSD (Get Shit Done) project's gates reference — MIT © 2025 Lex Christopherson ([gsd-build/get-shit-done](https://github.com/gsd-build/get-shit-done)).
6
+
7
+ ## The four gate types
8
+
9
+ ### 1. Pre-flight gate
10
+
11
+ **Purpose:** Validates preconditions before starting an operation.
12
+
13
+ **Behavior:** Blocks entry if conditions unmet. No partial work created — bail before anything changes.
14
+
15
+ **Recovery:** Fix the missing precondition, then retry.
16
+
17
+ **Examples:**
18
+ - Implementation phase checks that the plan file exists before it starts writing code.
19
+ - Delegated subagent checks that required env vars are set before making API calls.
20
+ - Commit checks that tests passed before pushing.
21
+
22
+ ### 2. Revision gate
23
+
24
+ **Purpose:** Evaluates output quality and routes to revision if insufficient.
25
+
26
+ **Behavior:** Loops back to the producer with specific feedback. Bounded by an iteration cap (typically 3).
27
+
28
+ **Recovery:** Producer addresses feedback; checker re-evaluates. The loop escalates early if issue count does not decrease between consecutive iterations (stall detection). After max iterations, escalates to the user unconditionally — never loop forever.
29
+
30
+ **Examples:**
31
+ - Plan reviewer reads a draft plan, returns specific issues, planner revises, reviewer re-reads (max 3 cycles).
32
+ - Code reviewer checks subagent-produced code against must-haves; dispatches fixes back to the implementer if any must-have failed.
33
+ - Test coverage checker validates new tests exercise the new paths; if not, sends back to author.
34
+
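The revision gate's loop, cap, and stall detection can be sketched in a few lines. This example is illustrative and not part of the package: `run_revision_gate`, `GateResult`, and the `review`/`revise` callables are hypothetical. The sketch assumes one iteration means one review pass, so a cap of 3 allows at most three reviews before escalating.

```python
from typing import Callable, List, NamedTuple


class GateResult(NamedTuple):
    verdict: str        # "pass" or "escalate"
    draft: str
    issues: List[str]
    reviews_used: int


def run_revision_gate(
    draft: str,
    review: Callable[[str], List[str]],
    revise: Callable[[str, List[str]], str],
    max_iterations: int = 3,
) -> GateResult:
    """Loop producer and checker until pass, stall, or the iteration cap."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    previous_count = None
    for iteration in range(1, max_iterations + 1):
        issues = review(draft)
        if not issues:
            return GateResult("pass", draft, [], iteration)
        # Stall detection: the issue count did not decrease since last review.
        stalled = previous_count is not None and len(issues) >= previous_count
        if stalled or iteration == max_iterations:
            return GateResult("escalate", draft, issues, iteration)
        previous_count = len(issues)
        draft = revise(draft, issues)
    raise AssertionError("unreachable: the loop always returns")
```

An "escalate" verdict is where an escalation gate would take over and present the remaining issues to the human; the function itself never picks a default.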
35
+ ### 3. Escalation gate
36
+
37
+ **Purpose:** Surfaces unresolvable issues to the human for a decision.
38
+
39
+ **Behavior:** Pauses workflow, presents options, waits for human input. Never guesses, never picks a default.
40
+
41
+ **Recovery:** Human chooses action; workflow resumes on the selected path.
42
+
43
+ **Examples:**
44
+ - Revision loop exhausted after 3 iterations.
45
+ - Merge conflict during automated worktree cleanup.
46
+ - Ambiguous requirement — two reasonable interpretations and the choice changes the approach.
47
+ - Subagent reports "the plan says X but the codebase actually does Y" — human decides which is right.
48
+
49
+ ### 4. Abort gate
50
+
51
+ **Purpose:** Terminates the operation to prevent damage or waste.
52
+
53
+ **Behavior:** Stops immediately, preserves state (checkpoint current progress), reports the specific reason.
54
+
55
+ **Recovery:** Human investigates root cause, fixes, restarts from checkpoint.
56
+
57
+ **Examples:**
58
+ - Context window critically low during execution (POOR tier, >70%) — abort cleanly rather than produce truncated output.
59
+ - Critical dependency unavailable mid-run (network down, API key revoked).
60
+ - Unrecoverable filesystem state (disk full, permissions lost).
61
+ - Safety invariant violated (agent attempted an irreversible destructive action outside approved scope).
62
+
63
+ ## How to use this in a skill
64
+
65
+ When you write an orchestration skill that has validation checkpoints, **name each checkpoint by its gate type explicitly** and answer three questions:
66
+
67
+ 1. **What condition triggers this gate?** (e.g., "plan file missing", "issue count didn't decrease", "context >70%")
68
+ 2. **What happens when it fails?** (block / loop back / ask human / abort)
69
+ 3. **Who resumes, and from where?** (fix precondition + retry, revise + re-check, human decision, restart from checkpoint)
70
+
71
+ Answering these three up front means your skill never hits "what do we do now?" at runtime.
72
+
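One way to make the three questions hard to skip is to require their answers when a gate is declared. This is a sketch and not part of the package: `GateType`, `Gate`, and `ON_FAIL` are hypothetical names, and the failure behaviors are copied from question 2 above.

```python
from dataclasses import dataclass
from enum import Enum


class GateType(Enum):
    PRE_FLIGHT = "pre-flight"
    REVISION = "revision"
    ESCALATION = "escalation"
    ABORT = "abort"


# Question 2: what happens when the gate fails, per gate type.
ON_FAIL = {
    GateType.PRE_FLIGHT: "block",
    GateType.REVISION: "loop back",
    GateType.ESCALATION: "ask human",
    GateType.ABORT: "abort",
}


@dataclass(frozen=True)
class Gate:
    name: str
    gate_type: GateType
    trigger: str  # question 1: what condition triggers this gate
    resume: str   # question 3: who resumes, and from where

    def __post_init__(self) -> None:
        # Refuse gates that leave question 1 or question 3 unanswered.
        if not self.trigger.strip() or not self.resume.strip():
            raise ValueError(f"gate {self.name!r} must state trigger and resume")

    @property
    def on_fail(self) -> str:
        return ON_FAIL[self.gate_type]
```

For example, `Gate("plan-exists", GateType.PRE_FLIGHT, "plan file missing", "fix precondition + retry")` describes the first checkpoint of a review loop, and its `on_fail` value follows from the gate type rather than being chosen ad hoc.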
73
+ ## Example — a review loop with all four gate types
74
+
75
+ ```
76
+ [Pre-flight] plan.md exists and is non-empty? → no: bail, ask user to write a plan first
77
+ ↓ yes
78
+ [Execute] subagent implements task
79
+
80
+ [Revision] reviewer checks against must-haves → fail: loop back to subagent (max 3)
81
+ ↓ pass
82
+ [Pre-flight] tests pass? → no: bail, report failing tests
83
+ ↓ yes
84
+ [Commit]
85
+
86
+ (on revision loop exhaustion)
87
+ [Escalation] "3 review cycles failed to converge on issue X — pick: force-merge, rewrite task, abandon"
88
+ ↓ user picks
89
+ (on any tier-POOR context pressure during loop)
90
+ [Abort] "context at 73%, checkpointing and stopping"
91
+ ```
92
+
93
+ The vocabulary is small on purpose. Every gate in every workflow should fit one of these four. If you find yourself inventing a fifth, it's probably a revision gate with extra branching, or an escalation gate in disguise.
@@ -0,0 +1,360 @@
1
+ ---
2
+ name: systematic-debugging
3
+ description: "4-phase root cause debugging: understand bugs before fixing."
4
+ version: 1.1.0
5
+ ---
6
+
7
+ # Systematic Debugging
8
+
9
+ ## Overview
10
+
11
+ Random fixes waste time and create new bugs. Quick patches mask underlying issues.
12
+
13
+ **Core principle:** ALWAYS find root cause before attempting fixes. Symptom fixes are failure.
14
+
15
+ **Violating the letter of this process is violating the spirit of debugging.**
16
+
17
+ ## The Iron Law
18
+
19
+ ```
20
+ NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
21
+ ```
22
+
23
+ If you haven't completed Phase 1, you cannot propose fixes.
24
+
25
+ ## When to Use
26
+
27
+ Use for ANY technical issue:
28
+ - Test failures
29
+ - Bugs in production
30
+ - Unexpected behavior
31
+ - Performance problems
32
+ - Build failures
33
+ - Integration issues
34
+
35
+ **Use this ESPECIALLY when:**
36
+ - Under time pressure (emergencies make guessing tempting)
37
+ - "Just one quick fix" seems obvious
38
+ - You've already tried multiple fixes
39
+ - Previous fix didn't work
40
+ - You don't fully understand the issue
41
+
42
+ **Don't skip when:**
43
+ - Issue seems simple (simple bugs have root causes too)
44
+ - You're in a hurry (rushing guarantees rework)
45
+ - Someone wants it fixed NOW (systematic is faster than thrashing)
46
+
47
+ ## The Four Phases
48
+
49
+ You MUST complete each phase before proceeding to the next.
50
+
51
+ ---
52
+
53
+ ## Phase 1: Root Cause Investigation
54
+
55
+ **BEFORE attempting ANY fix:**
56
+
57
+ ### 1. Read Error Messages Carefully
58
+
59
+ - Don't skip past errors or warnings
60
+ - They often contain the exact solution
61
+ - Read stack traces completely
62
+ - Note line numbers, file paths, error codes
63
+
64
+ **Action:** Use `read_file` on the relevant source files. Use `search_files` to find the error string in the codebase.
65
+
66
+ ### 2. Reproduce Consistently
67
+
68
+ - Can you trigger it reliably?
69
+ - What are the exact steps?
70
+ - Does it happen every time?
71
+ - If not reproducible → gather more data, don't guess
72
+
73
+ **Action:** Use the `terminal` tool to run the failing test or trigger the bug:
74
+
75
+ ```bash
76
+ # Run specific failing test
77
+ pytest tests/test_module.py::test_name -v
78
+
79
+ # Run with verbose output
80
+ pytest tests/test_module.py -v --tb=long
81
+ ```
82
+
83
+ ### 3. Check Recent Changes
84
+
85
+ - What changed that could cause this?
86
+ - Git diff, recent commits
87
+ - New dependencies, config changes
88
+
89
+ **Action:**
90
+
91
+ ```bash
92
+ # Recent commits
93
+ git log --oneline -10
94
+
95
+ # Uncommitted changes
96
+ git diff
97
+
98
+ # Changes in specific file
99
+ git log -p --follow src/problematic_file.py | head -100
100
+ ```
101
+
102
+ ### 4. Gather Evidence in Multi-Component Systems
103
+
104
+ **WHEN system has multiple components (API → service → database, CI → build → deploy):**
105
+
106
+ **BEFORE proposing fixes, add diagnostic instrumentation:**
107
+
108
+ For EACH component boundary:
109
+ - Log what data enters the component
110
+ - Log what data exits the component
111
+ - Verify environment/config propagation
112
+ - Check state at each layer
113
+
114
+ Run once to gather evidence showing WHERE it breaks.
115
+ THEN analyze evidence to identify the failing component.
116
+ THEN investigate that specific component.
117
+
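The boundary instrumentation described above can be sketched as a decorator that logs what enters and exits a component. This example is illustrative and not part of the package: `log_boundary` and the logger name `boundary` are hypothetical, and the sketch uses only Python's standard `logging` and `functools` modules.

```python
import functools
import logging

logger = logging.getLogger("boundary")


def log_boundary(component: str):
    """Log the inputs and outputs of a function that forms a component boundary."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Data entering the component.
            logger.info("%s IN args=%r kwargs=%r", component, args, kwargs)
            try:
                result = func(*args, **kwargs)
            except Exception:
                # Record which boundary failed, then let the error propagate.
                logger.exception("%s RAISED", component)
                raise
            # Data leaving the component.
            logger.info("%s OUT %r", component, result)
            return result
        return wrapper
    return decorator
```

With `logging.basicConfig(level=logging.INFO)` enabled, decorating one function per layer and running the failing scenario once shows the last boundary whose output was still correct, which is the component to investigate next.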
118
+ ### 5. Trace Data Flow
119
+
120
+ **WHEN error is deep in the call stack:**
121
+
122
+ - Where does the bad value originate?
123
+ - What called this function with the bad value?
124
+ - Keep tracing upstream until you find the source
125
+ - Fix at the source, not at the symptom
126
+
127
+ **Action:** Use `search_files` to trace references:
128
+
129
+ ```python
130
+ # Find where the function is called
131
+ search_files("function_name(", path="src/", file_glob="*.py")
132
+
133
+ # Find where the variable is set
134
+ search_files("variable_name\\s*=", path="src/", file_glob="*.py")
135
+ ```
136
+
137
+ ### Phase 1 Completion Checklist
138
+
139
+ - [ ] Error messages fully read and understood
140
+ - [ ] Issue reproduced consistently
141
+ - [ ] Recent changes identified and reviewed
142
+ - [ ] Evidence gathered (logs, state, data flow)
143
+ - [ ] Problem isolated to specific component/code
144
+ - [ ] Root cause hypothesis formed
145
+
146
+ **STOP:** Do not proceed to Phase 2 until you understand WHY it's happening.
147
+
148
+ ---
149
+
150
+ ## Phase 2: Pattern Analysis
151
+
152
+ **Find the pattern before fixing:**
153
+
154
+ ### 1. Find Working Examples
155
+
156
+ - Locate similar working code in the same codebase
157
+ - What works that's similar to what's broken?
158
+
159
+ **Action:** Use `search_files` to find comparable patterns:
160
+
161
+ ```python
162
+ search_files("similar_pattern", path="src/", file_glob="*.py")
163
+ ```
164
+
165
+ ### 2. Compare Against References
166
+
167
+ - If implementing a pattern, read the reference implementation COMPLETELY
168
+ - Don't skim — read every line
169
+ - Understand the pattern fully before applying
170
+
171
+ ### 3. Identify Differences
172
+
173
+ - What's different between working and broken?
174
+ - List every difference, however small
175
+ - Don't assume "that can't matter"
176
+
177
+ ### 4. Understand Dependencies
178
+
179
+ - What other components does this need?
180
+ - What settings, config, environment?
181
+ - What assumptions does it make?
182
+
183
+ ---
184
+
185
+ ## Phase 3: Hypothesis and Testing
186
+
187
+ **Scientific method:**
188
+
189
+ ### 1. Form a Single Hypothesis
190
+
191
+ - State clearly: "I think X is the root cause because Y"
192
+ - Write it down
193
+ - Be specific, not vague
194
+
195
+ ### 2. Test Minimally
196
+
197
+ - Make the SMALLEST possible change to test the hypothesis
198
+ - One variable at a time
199
+ - Don't fix multiple things at once
200
+
201
+ ### 3. Verify Before Continuing
202
+
203
+ - Did it work? → Phase 4
204
+ - Didn't work? → Form NEW hypothesis
205
+ - DON'T add more fixes on top
206
+
207
+ ### 4. When You Don't Know
208
+
209
+ - Say "I don't understand X"
210
+ - Don't pretend to know
211
+ - Ask the user for help
212
+ - Research more
213
+
214
+ ---
215
+
216
+ ## Phase 4: Implementation
217
+
218
+ **Fix the root cause, not the symptom:**
219
+
220
+ ### 1. Create Failing Test Case
221
+
222
+ - Simplest possible reproduction
223
+ - Automated test if possible
224
+ - MUST have before fixing
225
+ - Use the `test-driven-development` skill
226
+
227
+ ### 2. Implement Single Fix
228
+
229
+ - Address the root cause identified
230
+ - ONE change at a time
231
+ - No "while I'm here" improvements
232
+ - No bundled refactoring
233
+
234
+ ### 3. Verify Fix
235
+
236
+ ```bash
237
+ # Run the specific regression test
238
+ pytest tests/test_module.py::test_regression -v
239
+
240
+ # Run full suite — no regressions
241
+ pytest tests/ -q
242
+ ```
243
+
244
+ ### 4. If Fix Doesn't Work — The Rule of Three
245
+
246
+ - **STOP.**
247
+ - Count: How many fixes have you tried?
248
+ - If < 3: Return to Phase 1, re-analyze with new information
249
+ - **If ≥ 3: STOP and question the architecture (step 5 below)**
250
+ - DON'T attempt Fix #4 without architectural discussion
251
+
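The Rule of Three is a simple counter check, sketched below. This example is illustrative and not part of the package: the function name and return labels are hypothetical. Only the threshold of three failed fixes comes from the text.

```python
MAX_FIX_ATTEMPTS = 3  # threshold from the Rule of Three above


def next_step_after_failed_fix(failed_fixes: int) -> str:
    """Decide what to do after a fix attempt has failed."""
    if failed_fixes < 1:
        raise ValueError("call this only after at least one failed fix")
    if failed_fixes < MAX_FIX_ATTEMPTS:
        # Fewer than three failures: re-investigate with the new information.
        return "return_to_phase_1"
    # Three or more failures: stop fixing and discuss the architecture.
    return "question_architecture"
```

Keeping the count explicit makes it harder to rationalize a fourth attempt as "one more quick fix".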
252
+ ### 5. If 3+ Fixes Failed: Question Architecture
253
+
254
+ **Pattern indicating an architectural problem:**
255
+ - Each fix reveals new shared state/coupling in a different place
256
+ - Fixes require "massive refactoring" to implement
257
+ - Each fix creates new symptoms elsewhere
258
+
259
+ **STOP and question fundamentals:**
260
+ - Is this pattern fundamentally sound?
261
+ - Are we "sticking with it through sheer inertia"?
262
+ - Should we refactor the architecture vs. continue fixing symptoms?
263
+
264
+ **Discuss with the user before attempting more fixes.**
265
+
266
+ This is NOT a failed hypothesis — this is a wrong architecture.
267
+
268
+ ---
269
+
270
+ ## Red Flags — STOP and Follow Process
271
+
272
+ If you catch yourself thinking:
273
+ - "Quick fix for now, investigate later"
274
+ - "Just try changing X and see if it works"
275
+ - "Add multiple changes, run tests"
276
+ - "Skip the test, I'll manually verify"
277
+ - "It's probably X, let me fix that"
278
+ - "I don't fully understand but this might work"
279
+ - "Pattern says X but I'll adapt it differently"
280
+ - "Here are the main problems: [lists fixes without investigation]"
281
+ - Proposing solutions before tracing data flow
282
+ - **"One more fix attempt" (when already tried 2+)**
283
+ - **Each fix reveals a new problem in a different place**
284
+
285
+ **ALL of these mean: STOP. Return to Phase 1.**
286
+
287
+ **If 3+ fixes failed:** Question the architecture (Phase 4 step 5).
288
+
+ ## Common Rationalizations
+
+ | Excuse | Reality |
+ |--------|---------|
+ | "Issue is simple, don't need process" | Simple issues have root causes too. Process is fast for simple bugs. |
+ | "Emergency, no time for process" | Systematic debugging is FASTER than guess-and-check thrashing. |
+ | "Just try this first, then investigate" | First fix sets the pattern. Do it right from the start. |
+ | "I'll write test after confirming fix works" | Untested fixes don't stick. Test first proves it. |
+ | "Multiple fixes at once saves time" | Can't isolate what worked. Causes new bugs. |
+ | "Reference too long, I'll adapt the pattern" | Partial understanding guarantees bugs. Read it completely. |
+ | "I see the problem, let me fix it" | Seeing symptoms ≠ understanding root cause. |
+ | "One more fix attempt" (after 2+ failures) | 3+ failures = architectural problem. Question the pattern, don't fix again. |
+
+ ## Quick Reference
+
+ | Phase | Key Activities | Success Criteria |
+ |-------|---------------|------------------|
+ | **1. Root Cause** | Read errors, reproduce, check changes, gather evidence, trace data flow | Understand WHAT and WHY |
+ | **2. Pattern** | Find working examples, compare, identify differences | Know what's different |
+ | **3. Hypothesis** | Form theory, test minimally, one variable at a time | Confirmed or new hypothesis |
+ | **4. Implementation** | Create regression test, fix root cause, verify | Bug resolved, all tests pass |
+
+ ## Integration
+
+ ### Investigation Tools
+
+ Use these tools during Phase 1:
+
+ - **`search_files`** — Find error strings, trace function calls, locate patterns
+ - **`read_file`** — Read source code with line numbers for precise analysis
+ - **`terminal`** — Run tests, check git history, reproduce bugs
+ - **`web_search`/`web_extract`** — Research error messages, library docs
+
+ ### With delegate_task
+
+ For complex multi-component debugging, dispatch investigation subagents:
+
+ ```python
+ delegate_task(
+     goal="Investigate why [specific test/behavior] fails",
+     context="""
+     Follow systematic-debugging skill:
+     1. Read the error message carefully
+     2. Reproduce the issue
+     3. Trace the data flow to find root cause
+     4. Report findings — do NOT fix yet
+
+     Error: [paste full error]
+     File: [path to failing code]
+     Test command: [exact command]
+     """,
+     toolsets=['terminal', 'file']
+ )
+ ```
+
+ ### With test-driven-development
+
+ When fixing bugs:
+ 1. Write a test that reproduces the bug (RED)
+ 2. Debug systematically to find root cause
+ 3. Fix the root cause (GREEN)
+ 4. The test proves the fix and prevents regression
+
+ ## Real-World Impact
+
+ From debugging sessions:
+ - Systematic approach: 15-30 minutes to fix
+ - Random fixes approach: 2-3 hours of thrashing
+ - First-time fix rate: 95% vs 40%
+ - New bugs introduced: Near zero vs common
+
+ **No shortcuts. No guessing. Systematic always wins.**