litclaude-ai 0.2.2

Files changed (156)
  1. package/CHANGELOG.md +155 -0
  2. package/LICENSE +21 -0
  3. package/README.md +369 -0
  4. package/README_ko-KR.md +374 -0
  5. package/RELEASE_CHECKLIST.md +165 -0
  6. package/bin/litclaude-ai.js +643 -0
  7. package/cover.png +0 -0
  8. package/docs/agents.md +67 -0
  9. package/docs/hooks.md +134 -0
  10. package/docs/lsp.md +40 -0
  11. package/docs/migration.md +209 -0
  12. package/docs/workflow-compatibility-audit.md +119 -0
  13. package/generate_cover.py +123 -0
  14. package/package.json +48 -0
  15. package/plugins/litclaude/.claude-plugin/plugin.json +25 -0
  16. package/plugins/litclaude/.lsp.json +13 -0
  17. package/plugins/litclaude/.mcp.json +9 -0
  18. package/plugins/litclaude/agents/boulder-executor.md +12 -0
  19. package/plugins/litclaude/agents/librarian-researcher.md +15 -0
  20. package/plugins/litclaude/agents/oracle-verifier.md +16 -0
  21. package/plugins/litclaude/agents/prometheus-planner.md +13 -0
  22. package/plugins/litclaude/agents/qa-runner.md +16 -0
  23. package/plugins/litclaude/agents/quality-reviewer.md +17 -0
  24. package/plugins/litclaude/bin/litclaude-hook.js +110 -0
  25. package/plugins/litclaude/bin/litclaude-hud.js +271 -0
  26. package/plugins/litclaude/bin/litclaude-lsp-doctor.js +15 -0
  27. package/plugins/litclaude/bin/litclaude-mcp.js +70 -0
  28. package/plugins/litclaude/commands/deep-interview.md +21 -0
  29. package/plugins/litclaude/commands/dynamic-workflow.md +36 -0
  30. package/plugins/litclaude/commands/lit-loop.md +40 -0
  31. package/plugins/litclaude/commands/lit-plan.md +35 -0
  32. package/plugins/litclaude/commands/litgoal.md +30 -0
  33. package/plugins/litclaude/commands/review-work.md +35 -0
  34. package/plugins/litclaude/commands/start-work.md +36 -0
  35. package/plugins/litclaude/hooks/hooks.json +54 -0
  36. package/plugins/litclaude/lib/context-pressure.mjs +25 -0
  37. package/plugins/litclaude/lib/hud-accent-palette.mjs +58 -0
  38. package/plugins/litclaude/lib/litgoal/cli.mjs +266 -0
  39. package/plugins/litclaude/lib/litgoal/ledger.mjs +16 -0
  40. package/plugins/litclaude/lib/litgoal/paths.mjs +7 -0
  41. package/plugins/litclaude/lib/litgoal/state.mjs +67 -0
  42. package/plugins/litclaude/lib/mutated-file-paths.mjs +63 -0
  43. package/plugins/litclaude/lib/start-work-continuation.mjs +99 -0
  44. package/plugins/litclaude/lib/workflow-check.mjs +83 -0
  45. package/plugins/litclaude/skills/ai-slop-remover/SKILL.md +142 -0
  46. package/plugins/litclaude/skills/comment-checker/SKILL.md +55 -0
  47. package/plugins/litclaude/skills/debugging/SKILL.md +70 -0
  48. package/plugins/litclaude/skills/debugging/references/methodology/00-setup.md +108 -0
  49. package/plugins/litclaude/skills/debugging/references/methodology/02-investigate.md +126 -0
  50. package/plugins/litclaude/skills/debugging/references/methodology/04-oracle-triple.md +106 -0
  51. package/plugins/litclaude/skills/debugging/references/methodology/05-escalate.md +69 -0
  52. package/plugins/litclaude/skills/debugging/references/methodology/06-fix.md +116 -0
  53. package/plugins/litclaude/skills/debugging/references/methodology/08-qa.md +94 -0
  54. package/plugins/litclaude/skills/debugging/references/methodology/09-cleanup.md +164 -0
  55. package/plugins/litclaude/skills/debugging/references/methodology/partial-runtime-evidence.md +228 -0
  56. package/plugins/litclaude/skills/debugging/references/runtimes/bundled-js-binary.md +415 -0
  57. package/plugins/litclaude/skills/debugging/references/runtimes/go.md +252 -0
  58. package/plugins/litclaude/skills/debugging/references/runtimes/native-binary.md +484 -0
  59. package/plugins/litclaude/skills/debugging/references/runtimes/node.md +260 -0
  60. package/plugins/litclaude/skills/debugging/references/runtimes/python.md +248 -0
  61. package/plugins/litclaude/skills/debugging/references/runtimes/rust.md +234 -0
  62. package/plugins/litclaude/skills/debugging/references/tools/ghidra.md +212 -0
  63. package/plugins/litclaude/skills/debugging/references/tools/playwright-cli.md +194 -0
  64. package/plugins/litclaude/skills/debugging/references/tools/pwndbg.md +263 -0
  65. package/plugins/litclaude/skills/debugging/references/tools/pwntools.md +265 -0
  66. package/plugins/litclaude/skills/deep-interview/SKILL.md +323 -0
  67. package/plugins/litclaude/skills/deep-interview/scripts/render_progress.py +193 -0
  68. package/plugins/litclaude/skills/frontend-ui-ux/SKILL.md +62 -0
  69. package/plugins/litclaude/skills/lit-loop/SKILL.md +144 -0
  70. package/plugins/litclaude/skills/lit-plan/SKILL.md +125 -0
  71. package/plugins/litclaude/skills/litgoal/SKILL.md +219 -0
  72. package/plugins/litclaude/skills/lsp/SKILL.md +63 -0
  73. package/plugins/litclaude/skills/programming/SKILL.md +106 -0
  74. package/plugins/litclaude/skills/programming/references/go/README.md +90 -0
  75. package/plugins/litclaude/skills/programming/references/go/backend-stack.md +641 -0
  76. package/plugins/litclaude/skills/programming/references/go/bootstrap.md +328 -0
  77. package/plugins/litclaude/skills/programming/references/go/bubbletea-v2.md +360 -0
  78. package/plugins/litclaude/skills/programming/references/go/cobra-stack.md +468 -0
  79. package/plugins/litclaude/skills/programming/references/go/concurrency.md +362 -0
  80. package/plugins/litclaude/skills/programming/references/go/data-modeling.md +329 -0
  81. package/plugins/litclaude/skills/programming/references/go/error-handling.md +359 -0
  82. package/plugins/litclaude/skills/programming/references/go/golangci-strict.md +236 -0
  83. package/plugins/litclaude/skills/programming/references/go/grpc-connect.md +375 -0
  84. package/plugins/litclaude/skills/programming/references/go/libraries.md +337 -0
  85. package/plugins/litclaude/skills/programming/references/go/one-liners.md +202 -0
  86. package/plugins/litclaude/skills/programming/references/go/sqlc-pgx.md +471 -0
  87. package/plugins/litclaude/skills/programming/references/go/testing.md +467 -0
  88. package/plugins/litclaude/skills/programming/references/go/type-patterns.md +298 -0
  89. package/plugins/litclaude/skills/programming/references/python/README.md +314 -0
  90. package/plugins/litclaude/skills/programming/references/python/async-anyio.md +442 -0
  91. package/plugins/litclaude/skills/programming/references/python/data-modeling.md +233 -0
  92. package/plugins/litclaude/skills/programming/references/python/data-processing.md +133 -0
  93. package/plugins/litclaude/skills/programming/references/python/error-handling.md +218 -0
  94. package/plugins/litclaude/skills/programming/references/python/fastapi-stack.md +316 -0
  95. package/plugins/litclaude/skills/programming/references/python/httpx2-optimization.md +360 -0
  96. package/plugins/litclaude/skills/programming/references/python/libraries.md +307 -0
  97. package/plugins/litclaude/skills/programming/references/python/one-liners.md +268 -0
  98. package/plugins/litclaude/skills/programming/references/python/orjson-stack.md +378 -0
  99. package/plugins/litclaude/skills/programming/references/python/pydantic-ai.md +285 -0
  100. package/plugins/litclaude/skills/programming/references/python/pyproject-strict.md +232 -0
  101. package/plugins/litclaude/skills/programming/references/python/textual-tui.md +201 -0
  102. package/plugins/litclaude/skills/programming/references/python/type-patterns.md +176 -0
  103. package/plugins/litclaude/skills/programming/references/rust/README.md +317 -0
  104. package/plugins/litclaude/skills/programming/references/rust/async-tokio.md +299 -0
  105. package/plugins/litclaude/skills/programming/references/rust/axum-stack.md +467 -0
  106. package/plugins/litclaude/skills/programming/references/rust/cargo-strict.md +317 -0
  107. package/plugins/litclaude/skills/programming/references/rust/clap-stack.md +409 -0
  108. package/plugins/litclaude/skills/programming/references/rust/concurrency.md +375 -0
  109. package/plugins/litclaude/skills/programming/references/rust/libraries.md +439 -0
  110. package/plugins/litclaude/skills/programming/references/rust/one-liners.md +291 -0
  111. package/plugins/litclaude/skills/programming/references/rust/proptest-insta.md +429 -0
  112. package/plugins/litclaude/skills/programming/references/rust/type-state.md +354 -0
  113. package/plugins/litclaude/skills/programming/references/rust/unsafe-discipline.md +250 -0
  114. package/plugins/litclaude/skills/programming/references/rust/zero-cost-safety.md +527 -0
  115. package/plugins/litclaude/skills/programming/references/rust-ub/README.md +289 -0
  116. package/plugins/litclaude/skills/programming/references/rust-ub/miri-sanitizers-loom.md +411 -0
  117. package/plugins/litclaude/skills/programming/references/rust-ub/ub-taxonomy.md +269 -0
  118. package/plugins/litclaude/skills/programming/references/typescript/README.md +195 -0
  119. package/plugins/litclaude/skills/programming/references/typescript/backend-hono.md +672 -0
  120. package/plugins/litclaude/skills/programming/references/typescript/bootstrap.md +199 -0
  121. package/plugins/litclaude/skills/programming/references/typescript/data-modeling.md +202 -0
  122. package/plugins/litclaude/skills/programming/references/typescript/error-handling.md +169 -0
  123. package/plugins/litclaude/skills/programming/references/typescript/tsconfig-strict.md +152 -0
  124. package/plugins/litclaude/skills/programming/references/typescript/type-patterns.md +196 -0
  125. package/plugins/litclaude/skills/programming/scripts/go/check-no-excuse-rules.sh +173 -0
  126. package/plugins/litclaude/skills/programming/scripts/go/new-project.py +138 -0
  127. package/plugins/litclaude/skills/programming/scripts/go/templates/.editorconfig +13 -0
  128. package/plugins/litclaude/skills/programming/scripts/go/templates/.golangci.yml +95 -0
  129. package/plugins/litclaude/skills/programming/scripts/go/templates/AGENTS.md.tmpl +24 -0
  130. package/plugins/litclaude/skills/programming/scripts/go/templates/README.md.tmpl +12 -0
  131. package/plugins/litclaude/skills/programming/scripts/go/templates/Taskfile.yml +40 -0
  132. package/plugins/litclaude/skills/programming/scripts/go/templates/ci.yml +37 -0
  133. package/plugins/litclaude/skills/programming/scripts/go/templates/config.go +24 -0
  134. package/plugins/litclaude/skills/programming/scripts/go/templates/gitignore +15 -0
  135. package/plugins/litclaude/skills/programming/scripts/go/templates/main.go.tmpl +22 -0
  136. package/plugins/litclaude/skills/programming/scripts/go/templates/run.go +15 -0
  137. package/plugins/litclaude/skills/programming/scripts/python/check-no-excuse-rules.py +687 -0
  138. package/plugins/litclaude/skills/programming/scripts/python/new-project.py +172 -0
  139. package/plugins/litclaude/skills/programming/scripts/python/new-script.py +116 -0
  140. package/plugins/litclaude/skills/programming/scripts/rust/check-no-excuse-rules.py +296 -0
  141. package/plugins/litclaude/skills/programming/scripts/rust/check-no-excuse-rules.sh +158 -0
  142. package/plugins/litclaude/skills/programming/scripts/rust/new-project.py +175 -0
  143. package/plugins/litclaude/skills/programming/scripts/typescript/check-no-excuse-rules.ts +282 -0
  144. package/plugins/litclaude/skills/programming/scripts/typescript/new-project.ts +177 -0
  145. package/plugins/litclaude/skills/refactor/SKILL.md +73 -0
  146. package/plugins/litclaude/skills/remove-ai-slops/SKILL.md +52 -0
  147. package/plugins/litclaude/skills/review-work/SKILL.md +331 -0
  148. package/plugins/litclaude/skills/rules/SKILL.md +66 -0
  149. package/plugins/litclaude/skills/start-work/SKILL.md +132 -0
  150. package/scripts/audit-plan-checkboxes.mjs +37 -0
  151. package/scripts/doctor.mjs +41 -0
  152. package/scripts/inspect-agent-tools.mjs +27 -0
  153. package/scripts/postinstall.mjs +50 -0
  154. package/scripts/qa-claude-plugin-smoke.sh +60 -0
  155. package/scripts/qa-portable-install.sh +136 -0
  156. package/scripts/validate-plugin.mjs +72 -0
@@ -0,0 +1,142 @@
1
+ ---
2
+ name: ai-slop-remover
3
+ description: "Removes AI-generated code smells from a SINGLE file while preserving functionality. For multiple files, run per file in parallel."
4
+ ---
5
+ You are an expert code refactorer specializing in removing AI-generated "slop" patterns while STRICTLY preserving functionality.
6
+
7
+ **INPUT**: Exactly ONE file path. If multiple paths are provided, REJECT and instruct the caller to run one LitClaude pass per file.
8
+
9
+ ---
10
+
11
+ ## DETECTION CRITERIA (Specific)
12
+
13
+ ### 1. Obvious Comments (EXCLUDE: BDD comments like #given, #when, #then, #when/then)
14
+
15
+ **REMOVE**:
16
+ - Comments restating the code: `x += 1 # increment x`
17
+ - Docstrings on trivial methods: `"""Returns the name."""` for `def get_name(self): return self.name`
18
+ - Section dividers: `# ===== HELPER FUNCTIONS =====`
19
+ - Commented-out code blocks
20
+ - `# TODO: future enhancement` without concrete plan
21
+ - `# Note: this is important` without explaining WHY
22
+
23
+ **KEEP**:
24
+ - Comments explaining WHY (business logic, edge cases, workarounds)
25
+ - Links to issues/tickets: `# See SPR-1234`
26
+ - Non-obvious algorithm explanations
27
+ - Regex explanations
28
+ - Comments that match the existing code style
29
+
30
+ ### 2. Over-Defensive Code
31
+
32
+ **REMOVE**:
33
+ - Null checks for values that CANNOT be None (e.g., Django request in view)
34
+ - `if x is not None and x.attr is not None:` when x is guaranteed
35
+ - Try-except around code that can't raise (e.g., reading a key from a dict literal that defines it)
36
+ - `isinstance()` checks for statically typed parameters
37
+ - Default values for required parameters: `def foo(x: str = "")` when empty string is invalid
38
+ - Backward-compat shims: `_old_name = new_name # deprecated`
39
+ - `# removed` or `# deleted` comments for removed code
40
+ - Re-exports of unused items
41
+ - Verbose, duplicated, or redundant code / test cases
42
+
43
+ **KEEP**:
44
+ - Validation at system boundaries (user input, external API responses)
45
+ - Error handling for I/O operations
46
+ - Null checks for nullable DB fields
47
+ - Assertions in test code that match type expectations
48
+
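The REMOVE and KEEP lists above can be illustrated with a minimal sketch. The `User` type and both `greeting_*` functions are hypothetical names invented for this example; it assumes callers honor the type annotation, so the `None` and `isinstance()` checks never fire.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical type used only for this illustration."""

    name: str


def greeting_defensive(user: User) -> str:
    # Over-defensive: `user` is a required, statically typed parameter,
    # so the None check and the isinstance() check guard nothing.
    if user is not None and isinstance(user, User):
        try:
            # Formatting a str attribute gives the try block nothing to catch.
            return f"Hello, {user.name}"
        except Exception:
            return ""
    return ""


def greeting_clean(user: User) -> str:
    # Same result for every input that satisfies the annotation.
    return f"Hello, {user.name}"
```

The two functions differ only for inputs that violate the annotation. Under Step 2 of the process, that difference is exactly what to check before removing the guards: if any caller can pass `None`, skip the change.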
49
+ ### 3. Spaghetti Nesting (2+ levels deep)
50
+
51
+ **REFACTOR**:
52
+ - Nested if-else chains -> early returns / guard clauses
53
+ - `if x: if y: if z:` -> `if not x: return` / `if not y: return`
54
+ - Nested loops with conditionals -> extract to helper OR use comprehensions
55
+ - Complex ternary `a if b else (c if d else e)` -> explicit if-else
56
+
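The guard-clause refactor above can be sketched as follows. The `order` dict and its keys are assumptions made up for illustration; the point is that each nested branch becomes one early return and the observable result stays the same.

```python
def ship_nested(order: dict) -> str:
    # Before: three levels of nesting, with each failure case far from its check.
    if order.get("paid"):
        if order.get("in_stock"):
            if order.get("address"):
                return "shipped"
            else:
                return "missing address"
        else:
            return "out of stock"
    else:
        return "unpaid"


def ship_flat(order: dict) -> str:
    # After: one guard clause per precondition, in the same order as the nesting.
    if not order.get("paid"):
        return "unpaid"
    if not order.get("in_stock"):
        return "out of stock"
    if not order.get("address"):
        return "missing address"
    return "shipped"
```

Keeping the guards in the original nesting order matters: it preserves which failure is reported when several preconditions fail at once.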
57
+ ---
58
+
59
+ ## PROCESS
60
+
61
+ ### Step 1: Read & Analyze
62
+ Read the file. Identify ALL slop instances with line numbers.
63
+
64
+ ### Step 2: Deep Consideration (CRITICAL)
65
+ For EACH identified issue, think:
66
+ - **Functionality Impact**: Will removing this change behavior? If ANY doubt, SKIP.
67
+ - **Test Coverage**: Are there tests that might break? If uncertain, SKIP.
68
+ - **Context Dependency**: Is this "slop" actually necessary for this specific codebase? (e.g., defensive code for known flaky external API)
69
+ - **Readability Trade-off**: Will removal make code LESS readable? If yes, SKIP.
70
+
71
+ **RULE**: When in doubt, DO NOT CHANGE. False negatives are better than breaking code.
72
+
73
+ ### Step 3: Execute Changes
74
+ Make changes using Claude Code edit tools. One logical change at a time.
75
+
76
+ ### Step 4: Detailed Report
77
+
78
+ **OUTPUT FORMAT**:
79
+
80
+ ```
81
+ ## AI Slop Removed: {filename}
82
+
83
+ ### Analysis Summary
84
+ - Total issues found: N
85
+ - Issues fixed: M
86
+ - Issues skipped (safety): K
87
+
88
+ ### Changes Made
89
+
90
+ #### Change 1: [Category] Line X-Y
91
+ **Before**: [original code snippet]
92
+ **After**: [modified code snippet]
93
+ **Why this is slop**: [Explain why this pattern is problematic]
94
+ **Why safe to remove**: [Explain why functionality is preserved]
95
+ **Impact**: None - purely cosmetic improvement
96
+
97
+ ---
98
+
99
+ ### Skipped Issues (Preserved for Safety)
100
+
101
+ #### Skipped 1: Line X
102
+ **Reason**: [Why you chose not to change this]
103
+
104
+ ### Summary
105
+ - Removed N obvious comments
106
+ - Simplified M defensive patterns
107
+ - Flattened K nested structures
108
+ - Preserved L patterns that looked like slop but serve purpose
109
+ ```
110
+
111
+ ---
112
+
113
+ ## SAFETY RULES
114
+
115
+ 1. **NEVER remove error handling for I/O, network, or file operations**
116
+ 2. **NEVER simplify validation for user input or external data**
117
+ 3. **NEVER change public API signatures**
118
+ 4. **NEVER remove type hints (even redundant-looking ones)**
119
+ 5. **If a pattern appears in multiple places, it might be intentional - ASK before bulk removal**
120
+ 6. **Preserve all BDD test comments (#given, #when, #then)**
121
+
122
+ When finished, your report should be detailed enough that a reviewer can understand EXACTLY what changed and feel confident the changes are safe.
123
+
124
+ ---
125
+
126
+ ## WHEN NO SLOP FOUND
127
+
128
+ If the file is clean, report:
129
+
130
+ ```
131
+ ## AI Slop Analysis: {filename}
132
+
133
+ ### Result: No AI Slop Detected
134
+
135
+ This file is clean. Here's why:
136
+
137
+ **Comments**: N comments found, all explain WHY not WHAT
138
+ **Defensive Code**: Null checks present are appropriate (e.g., checks external API response)
139
+ **Code Structure**: Maximum nesting depth acceptable, early returns used appropriately
140
+
141
+ **Conclusion**: This code appears to be human-written or well-reviewed AI code. No changes needed.
142
+ ```
@@ -0,0 +1,55 @@
1
+ ---
2
+ name: comment-checker
3
+ description: "Comment hygiene workflow adapted for LitClaude: keep comments useful, remove stale narration, and preserve intent-bearing notes."
4
+ ---
5
+
6
+ # Comment Checker
7
+
8
+ Use this skill after code edits or documentation rewrites where comments may
9
+ become stale. The goal is not fewer comments; it is comments that carry intent
10
+ the code or tests cannot express.
11
+
12
+ ## Keep Comments That Explain
13
+
14
+ Keep comments when they explain:
15
+
16
+ - why a compatibility path exists
17
+ - why a safety boundary is conservative
18
+ - why a file is generated or packaged
19
+ - why a fallback exists for a specific runtime behavior
20
+ - why a test asserts a policy rather than exact prose
21
+
22
+ ## Remove Comments That Narrate
23
+
24
+ Remove or rewrite comments that merely repeat code:
25
+
26
+ - "read the file"
27
+ - "set the variable"
28
+ - "loop through items"
29
+ - "return result"
30
+
31
+ If the comment is only true because the current implementation happens to work
32
+ that way, prefer a clearer name or a test.
33
+
34
+ ## LitClaude-Specific Checks
35
+
36
+ - Comments about Claude registry writes must match actual paths.
37
+ - Comments about `/goal` must not imply the plugin can type it silently.
38
+ - Comments about `--dry-run` must distinguish inspection from final proof.
39
+ - Comments about package contents must match `package.json.files`.
40
+ - Comments in tests should explain policy assertions, not every regex.
41
+
42
+ ## Review Flow
43
+
44
+ 1. Search changed files for comments.
45
+ 2. For each comment, ask whether the next maintainer would be worse off without
46
+ it.
47
+ 3. Delete narration.
48
+ 4. Update stale intent comments.
49
+ 5. Run tests touching the changed files.
50
+
51
+ ## Evidence
52
+
53
+ For code comments, evidence is the diff plus tests. For docs comments or
54
+ markdown policy notes, evidence is docs tests and one manual scan of the
55
+ rendered or packaged surface when relevant.
@@ -0,0 +1,70 @@
1
+ ---
2
+ name: debugging
3
+ description: "Systematic Claude Code debugging workflow adapted for LitClaude: reproduce, localize, explain root cause, patch minimally, and verify the real failing surface."
4
+ ---
5
+
6
+ # Debugging
7
+
8
+ Use this skill when behavior is broken, surprising, flaky, or contradicted by
9
+ evidence. Do not patch from vibes. First make the failure reproducible.
10
+
11
+ ## Debugging Loop
12
+
13
+ 1. Reproduce the failure with the smallest command or scenario.
14
+ 2. Capture exact output, exit code, log, or transcript.
15
+ 3. State expected vs actual behavior.
16
+ 4. Localize the failing boundary.
17
+ 5. Patch minimally.
18
+ 6. Re-run the original reproduction.
19
+ 7. Add or update a regression test.
20
+ 8. Run one real-surface QA scenario.
21
+
22
+ ## Reproduction Standards
23
+
24
+ A good reproduction is:
25
+
26
+ - runnable from the project root
27
+ - deterministic or has a bounded flake note
28
+ - small enough for the next agent to rerun
29
+ - tied to one expected observable
30
+
31
+ For LitClaude examples:
32
+
33
+ - `printf ... | node plugins/litclaude/bin/litclaude-hook.js user-prompt-submit`
34
+ - `npm exec --yes --package ./ -- litclaude-ai --dry-run install`
35
+ - `CLAUDE_CONFIG_DIR=$(mktemp -d) node bin/litclaude-ai.js install`
36
+ - `node --test test/hooks.test.mjs --test-name-pattern litwork`
37
+
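Steps 1 to 3 of the debugging loop (reproduce, capture exact output, state expected vs actual) could be scripted along these lines. This is a minimal sketch, not part of LitClaude: `capture_repro` and `matches_expected` are hypothetical helper names, and the command passed in is whatever reproduction you have settled on.

```python
import subprocess
import sys


def capture_repro(command: list[str]) -> dict:
    """Run a reproduction command and record its exact observable output."""
    # check=False: a non-zero exit code is evidence here, not an exception.
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return {
        "command": command,
        "exit_code": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }


def matches_expected(record: dict, exit_code: int, stdout_fragment: str) -> bool:
    """Tie the reproduction to one expected observable."""
    return record["exit_code"] == exit_code and stdout_fragment in record["stdout"]


if __name__ == "__main__":
    # Stand-in failing command; replace with the real reproduction.
    demo = capture_repro([sys.executable, "-c", "import sys; print('boom'); sys.exit(3)"])
    print(demo["exit_code"], demo["stdout"].strip())
```

Storing the record verbatim gives the Evidence section its "failing command and output" entry, and re-running the same call after the patch gives the "passing command and output" entry.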
38
+ ## Root Cause Discipline
39
+
40
+ Explain the mechanism, not just the symptom. "The test failed" is not a root
41
+ cause. "The command markdown mentions `$ARGUMENTS`, which Claude renders empty
42
+ and tests reject as a broken placeholder" is a root cause.
43
+
44
+ ## Common LitClaude Failure Classes
45
+
46
+ - Running `npx litclaude-ai` inside a source checkout with the same package name confuses npm resolution.
47
+ - `npm publish` requires OTP and fails with `EOTP`.
48
+ - CLI smoke test passes the unsupported `--version` flag and receives usage output.
49
+ - Claude Code exposes `/goal` as a slash command but not as a model-facing tool.
50
+ - Plugin manifest version differs from `package.json`.
51
+ - Hook JSON parsing fails with stack traces instead of controlled errors.
52
+ - Package `files` omit a new skill or command.
53
+
54
+ ## Patch Rules
55
+
56
+ - Patch the failing boundary, not an unrelated symptom.
57
+ - Keep compatibility aliases unless a breaking change is intentional.
58
+ - Add tests before production behavior changes.
59
+ - Do not broaden catch blocks without asserting the new error path.
60
+ - Do not call network or publish in tests.
61
+
62
+ ## Evidence
63
+
64
+ Record:
65
+
66
+ - failing command and output
67
+ - changed file
68
+ - regression test
69
+ - passing command and output
70
+ - manual QA artifact and cleanup
@@ -0,0 +1,108 @@
1
+ # Phase 0 + 1 — Environment Assessment & Journal Setup
2
+
3
+ Before a debugger touches anything, you need a map of what's running and a ledger of what you'll touch. Skipping either phase is how debug sessions turn into "why is my repo dirty a week later" sessions.
4
+
5
+ ---
6
+
7
+ ## Phase 0 — Environment Assessment
8
+
9
+ Map the ground truth before you attach. Attaching the wrong way wastes the first hour.
10
+
11
+ ### 1. Identify the runtime
12
+
13
+ Read the actual manifest file, don't guess from extensions:
14
+
15
+ - Python → `pyproject.toml`, `requirements*.txt`, `setup.py`, `uv.lock`, `.python-version`
16
+ - Node → `package.json` (check `scripts`, check `engines`, check `type: module`)
17
+ - Rust → `Cargo.toml`, `rust-toolchain*`
18
+ - Go → `go.mod`, `go.sum`
19
+ - Native / mixed → `Makefile`, `CMakeLists.txt`, the binary itself (`file <path>`)
20
+
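The manifest list above can be turned into a quick first pass. This is a sketch under assumptions: `detect_runtimes` and the `MANIFESTS` table are hypothetical names, and the function only locates candidate manifests. Reading the manifest itself (scripts, engines, toolchain pins) is still required.

```python
from pathlib import Path

# Manifest file names taken from the list above; exact names only.
MANIFESTS = {
    "python": ["pyproject.toml", "setup.py", "uv.lock", ".python-version"],
    "node": ["package.json"],
    "rust": ["Cargo.toml"],
    "go": ["go.mod", "go.sum"],
    "native": ["Makefile", "CMakeLists.txt"],
}

# Wildcard entries from the same list.
GLOBS = {
    "python": "requirements*.txt",
    "rust": "rust-toolchain*",
}


def detect_runtimes(root: Path) -> list[str]:
    """Return every runtime whose manifest exists directly under `root`."""
    found = []
    for runtime, names in MANIFESTS.items():
        exact = any((root / name).exists() for name in names)
        pattern = GLOBS.get(runtime)
        globbed = pattern is not None and any(root.glob(pattern))
        if exact or globbed:
            found.append(runtime)
    return found
```

More than one result is a normal outcome for mixed repositories. In that case, load the runtime reference for each runtime the failing code path actually crosses.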
21
+ ### 2. Load the matching runtime reference
22
+
23
+ The moment you know the runtime, open `references/runtimes/<runtime>.md`. The commands in this phase (and every phase after) are runtime-specific. The shape of the answers is the same; the commands are not.
24
+
25
+ ### 3. Gather observable environment state
26
+
27
+ The shape of the answers you need (commands in the runtime reference):
28
+
29
+ | Question | Why it matters |
30
+ |---|---|
31
+ | What binary/interpreter/runtime actually launches the process? | Determines debugger flag plumbing. Wrappers (`tsx`, `poetry run`, `cargo run`, `bun`, supervisor scripts) change how flags propagate. |
32
+ | Is there already a debug-relevant port in use, or another instance of the service running? | Either attach to it or kill it deliberately — never silently compete. |
33
+ | Are symbols / source maps / debug info present and correct? | This determines whether breakpoints land on the right lines. Compiled-but-not-debug builds, stripped binaries, and incomplete source maps all silently misplace breakpoints. |
34
+ | Does the code path require env vars, config files, or auth tokens to reach the bug? | Missing env often produces early-return paths that masquerade as the bug itself. |
35
+ | Is there an existing failing test or known repro? | Prefer amplifying an existing repro over inventing one. |
36
+ | Are watchers (file watchers, hot reloaders, supervisors) going to restart the process mid-session? | If yes, turn them off before attaching. Restarts drop inspector connections and invalidate breakpoints. |
37
+
38
+ ### 4. Gate check
39
+
40
+ If any answer is "I'm not sure", you are not ready for Phase 1. Investigate until certain. Guessing here cascades into false-positive hypotheses in Phase 2.
41
+
42
+ ---
43
+
44
+ ## Phase 1 — Journal Setup
45
+
46
+ Open **one** journal file at the project root: `.debug-journal.md`. It is the single source of truth for every artifact this skill creates, and the contract with the user that you can undo everything.
47
+
48
+ ### Exclude from git (don't pollute the committed ignore list)
49
+
50
+ ```bash
51
+ grep -qxF '.debug-journal.md' .git/info/exclude || echo '.debug-journal.md' >> .git/info/exclude
52
+ ```
53
+
54
+ `.git/info/exclude` is per-clone and not committed — perfect for local-session artifacts.
55
+
56
+ ### Journal template
57
+
58
+ ```markdown
59
+ # Debug Journal — <short bug name>
60
+ Started: <ISO timestamp>
61
+ Goal: <one-sentence user request>
62
+
63
+ ## Environment snapshot (Phase 0)
64
+ - Runtime: <language + version + launcher>
65
+ - Entry: <command that starts the process>
66
+ - Ports / sockets: <app=..., debugger=..., etc>
67
+ - Git HEAD: <sha>, working tree clean? <yes/no>
68
+ - References read: <list the files from references/ you loaded — proves you did the gate>
69
+
70
+ ## Hypotheses
71
+ 1. [STATUS] <hypothesis> — distinguishing evidence: <what would confirm/refute> — if true, fix is: <two words>
72
+ 2. ...
73
+
74
+ ## Failed hypothesis round counter
75
+ - Round 1: <result>
76
+ - Round 2: <result>
77
+ <!-- At 2 consecutive failures, invoke Oracle Triple (see 04-oracle-triple.md). -->
78
+
79
+ ## Artifacts to revert
80
+ <!-- Every temp edit, tmux session, fixture, env override, saved debugger session goes here
81
+ BEFORE it is created. The rule is journal-then-modify. -->
82
+ - [ ] `src/foo.py` — added `breakpoint()` on 2 lines. Revert: `git checkout src/foo.py`
83
+ - [ ] tmux session `debug-server`. Kill: `tmux kill-session -t debug-server`
84
+ - [ ] `/tmp/debug-payload.json`. Remove: `rm /tmp/debug-payload.json`
85
+ - [ ] env var in current shell: `FOO_BASE_URL=...`. Unset when done.
86
+ - [ ] Ghidra session save: `~/ghidra-projects/scratch.gzf`. Remove if not promoting.
87
+
88
+ ## Findings
89
+ <!-- Append observed values here with timestamp. Verbatim only, no paraphrasing. -->
90
+
91
+ ## Oracle Triple (if invoked)
92
+ <!-- One subsection per Oracle round, with the synthesized new hypothesis set. -->
93
+
94
+ ## Final fix
95
+ <!-- File paths + test path. Filled during Phase 7. -->
96
+ ```
97
+
98
+ ### The journal-then-modify rule
99
+
100
+ Before any modification to the repo, shell, or system state, append to "Artifacts to revert" first. This one discipline is what prevents debug sessions from becoming git cleanup sessions.
101
+
102
+ If you catch yourself about to run a command that creates a file, opens a port, or modifies source — stop, journal the intended artifact with its revert command, then run the command. Not the other way around.
103
+
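The ordering the rule demands (journal first, modify second) can be enforced in code rather than by habit. The helper below is a hypothetical sketch, not something the skill ships. As a simplification, it appends the entry to the end of the journal file instead of locating the "Artifacts to revert" heading.

```python
from pathlib import Path
from typing import Callable


def journal_then_modify(
    journal: Path,
    description: str,
    revert: str,
    action: Callable[[], None],
) -> None:
    """Record an artifact and its revert command, then perform the modification."""
    entry = f"- [ ] {description}. Revert: `{revert}`\n"
    # The journal write is flushed and closed before the action runs,
    # so an interrupted action still leaves a revert instruction behind.
    with journal.open("a", encoding="utf-8") as handle:
        handle.write(entry)
    action()
```

A call might look like `journal_then_modify(Path(".debug-journal.md"), "`/tmp/debug-payload.json`", "rm /tmp/debug-payload.json", write_payload)`, where `write_payload` is whatever function creates the file.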
104
+ ### Why a single journal (not scattered TODO comments)
105
+
106
+ - One `git checkout`, one `rm`, one `tmux kill-session` list — simple Phase 9 walk.
107
+ - Survives interruptions. If you get pulled away mid-session, the next agent (or you later) can continue or revert without guessing.
108
+ - Prevents the most common failure: leaving `console.log`/`print()`/`dbg!` scattered across the tree.
@@ -0,0 +1,126 @@
1
+ # Phase 2 + 3 — Hypothesis Formation & Parallel Investigation
2
+
3
+ One hypothesis is a hunch. Three hypotheses is a decision. Investigation is how you turn the decision into runtime evidence.
4
+
5
+ ---
6
+
7
+ ## Phase 2 — Hypothesis Formation (Minimum Three)
8
+
9
+ ### Why three, not one
10
+
11
+ A single hypothesis creates confirmation bias: you'll read runtime state looking for evidence that confirms it and unconsciously discount contradictions. Three hypotheses force you to design queries that *distinguish* between them, which is the only way runtime evidence becomes decisive.
12
+
13
+ ### Generate across orthogonal axes
14
+
15
+ If your three hypotheses are all variations of "the handler has a bug", you don't actually have three hypotheses. Span the space:
16
+
17
+ | Axis | Example framing |
18
+ |---|---|
19
+ | **User-code logic** | "The handler early-returns because condition X is unexpectedly true" |
20
+ | **Library/SDK behavior** | "The third-party client swallows the error and returns a stub" |
21
+ | **Environment/config** | "The env var is read at module-load time before it gets populated, so it's empty" |
22
+ | **Async/timing** | "The promise rejects (or goroutine panics) after the response is already sent" |
23
+ | **Silent side-effect** | "An earlier turn mutated shared state that the current turn inherits" |
24
+ | **Observability gap** | "The error is raised but suppressed before logging; it only exists as an unawaited rejection / ignored signal" |
25
+ | **Binary-level** (when applicable) | "The function we think is running is actually jumped over by a patched thunk / a different version loaded" |
26
+ | **Build-vs-runtime** | "The code we're reading is not the code that's running — stale build, wrong symlink, cached wheel, or dist/ ahead of src/" |
27
+
28
+ ### For each hypothesis, write in the journal
29
+
30
+ 1. **Claim** — one sentence.
31
+ 2. **Distinguishing evidence** — the exact value or state that confirms or refutes it, AND where to read it (file:line, log source, breakpoint location, memory address).
32
+ 3. **If true, the fix is** — two words. Forces you to think through fix cost before committing to the hunt.
33
+
34
+ ### Collapse rule
35
+
36
+ If two hypotheses have identical distinguishing evidence, they aren't actually different — collapse them and find a real alternative. If you can't come up with a third distinct hypothesis, you don't understand the system well enough yet. Go read a little more code before investigating.
37
+
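The collapse rule can be checked mechanically once hypotheses are written in the three-field form above. This is a minimal sketch with hypothetical names (`Hypothesis`, `collapse`, `ready_to_investigate`). It treats two evidence strings as identical when they match after lowercasing and whitespace normalization, which is a cruder test than a human reading them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    claim: str
    distinguishing_evidence: str
    fix: str


def collapse(hypotheses: list[Hypothesis]) -> list[list[Hypothesis]]:
    """Group hypotheses that share the same distinguishing evidence."""
    groups: dict[str, list[Hypothesis]] = defaultdict(list)
    for hypothesis in hypotheses:
        # Normalize case and whitespace so trivial rewording does not hide a duplicate.
        key = " ".join(hypothesis.distinguishing_evidence.lower().split())
        groups[key].append(hypothesis)
    return list(groups.values())


def ready_to_investigate(hypotheses: list[Hypothesis]) -> bool:
    """Phase 2 asks for at least three genuinely distinct hypotheses."""
    return len(collapse(hypotheses)) >= 3
```

Any group with more than one member is a pair to merge. If `ready_to_investigate` returns `False` after merging, that is the signal to read more code before Phase 3.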
38
+ ---
39
+
40
+ ## Phase 3 — Parallel Investigation
41
+
42
+ Branch depending on what's available.
43
+
44
+ ### Path A: Team mode ENABLED
45
+
46
+ When the `Claude workflow coordination` tools are present, create a **debug-squad** team and split investigation across members working on different evidence sources. This is the right default whenever you have ≥3 hypotheses and any of them would take >10 minutes to investigate single-threaded.
47
+
48
+ **Team spec** — write to `~/.litclaude/teams/debug-squad/config.json`:
49
+
50
+ ```json
51
+ {
52
+ "name": "debug-squad",
53
+ "lead": { "kind": "subagent_type", "subagent_type": "sisyphus" },
54
+ "members": [
55
+ {
56
+ "kind": "category",
57
+ "category": "deep",
58
+ "prompt": "You are the Runtime State Inspector. Your job: attach to the live process, hit breakpoints, read program state (variables, heap, goroutines, stack, registers depending on runtime), and report observed values verbatim. Never guess — if you don't see the value, say so. Report back via Claude workflow message with file:line / address references and captured values. Never edit source code. Never run git commands. If you need an instrumentation statement added (breakpoint(), debugger;, dbg!, etc.), ask the Lead first."
59
+ },
60
+ {
61
+ "kind": "category",
62
+ "category": "deep",
63
+ "prompt": "You are the Log Archaeologist. Your job: grep server logs, stderr streams, SDK-internal debug output (DEBUG env, RUST_LOG, GODEBUG, PYTHONASYNCIODEBUG), and correlate timestamps. Produce a timeline of events with latencies. Flag anything that looks like a silent catch, a swallowed rejection, a panic recovered-and-ignored, a success response that contains failure signals (HTTP 200 with empty body, stopReason=error, exit 0 with error-in-stdout). Never edit source code."
64
+ },
65
+ {
66
+ "kind": "category",
67
+ "category": "deep",
68
+ "prompt": "You are the Reproduction Engineer. Your job: build the smallest reliable repro — a curl command, a vitest/pytest/go test, a tmux script, a Playwright script for browser bugs, a pwntools script for binary targets. It must reproduce on first try and be copy-pasteable by the Lead. Document exact input, expected output, observed output. Save repro artifacts under /tmp/ and tell the Lead to journal them. If the bug is browser-based you MUST use Playwright CLI — do not simulate with curl."
69
+ },
70
+ {
71
+ "kind": "category",
72
+ "category": "deep",
73
+ "prompt": "You are the Trace Correlator. Your job: take findings from the other members and cross-link them. Build a causal chain from symptom to suspected cause. Identify missing evidence. Propose the next single most-decisive runtime query. Never edit source code; only reason across already-captured evidence. If hypotheses diverge sharply after correlation, tell the Lead immediately — that is the signal for the Oracle Triple."
74
+ }
75
+ ]
76
+ }
77
+ ```
78
+
79
+ **Assignment rule**: one hypothesis → one `Claude workflow task`. Give each hypothesis to the member whose evidence source is most likely to confirm or refute it. Broadcast the full hypothesis list once via `Claude workflow message(to="*")` so members know what the others are testing.
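
As a rough illustration of the assignment rule, the sketch below maps each hypothesis to exactly one owner by evidence source. The source tags (`runtime_state`, `logs`, and so on) and the task dictionaries are hypothetical; the real assignment goes through the workflow task tool, which this sketch does not call.

```python
# Hypothetical evidence-source tags mapped to the four roles in the team spec.
ROLE_FOR_SOURCE = {
    "runtime_state": "Runtime State Inspector",
    "logs": "Log Archaeologist",
    "repro": "Reproduction Engineer",
    "correlation": "Trace Correlator",
}


def assign(hypotheses):
    """Return one task per hypothesis, owned by the role matching its evidence source."""
    tasks = []
    for hypothesis_id, source in hypotheses:
        tasks.append({"hypothesis": hypothesis_id, "owner": ROLE_FOR_SOURCE[source]})
    return tasks


tasks = assign([("H1", "runtime_state"), ("H2", "logs"), ("H3", "repro")])
```

An unknown source tag raises `KeyError`, which is deliberate in this sketch: a hypothesis with no identifiable evidence source is not ready to be assigned.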
80
+
81
+ **Lead responsibilities**:
82
+ - Maintain the journal (members do not write to it).
83
+ - Approve any source-code edits (including `debugger;` / `breakpoint()` / `dbg!` statements).
84
+ - Synthesize member reports into updated hypothesis statuses.
85
+ - Decide when to disband: `Claude workflow shutdown request` → `Claude workflow shutdown approval` → `Claude workflow cleanup`.
86
+
87
+ **Team does NOT include Oracle.** Oracle is hard-rejected as a team member type, so do not list it in the team spec. Oracle is used separately in Phase 4 (see `04-oracle-triple.md`).
88
+
89
+ ### Path B: Team mode DISABLED
90
+
91
+ Fan out async Claude Code workflow lanes or subagents instead. Same rule: one
92
+ hypothesis per lane.
93
+
94
+ - Lane 1: runtime state investigation for hypothesis 1, with the bug summary
95
+ and exact state to inspect.
96
+ - Lane 2: log/timing investigation for hypothesis 2.
97
+ - Lane 3: reproduction minimizer for hypothesis 3.
98
+
99
+ End your response, wait for completion notifications, then synthesize.
100
+
101
+ ---
102
+
103
+ ## Evidence capture discipline (both paths)
104
+
105
+ For every piece of runtime state captured, record in the journal:
106
+
107
+ ```markdown
108
+ ### <ISO timestamp> — <what you looked at>
109
+ - Source: <file:line | log source | curl command | breakpoint address>
110
+ - Value: `<verbatim>`
111
+ - Interpretation: <one line — why this matters>
112
+ - Refutes/Confirms: H<n>
113
+ ```
114
+
115
+ **Verbatim values only. No paraphrasing.**
116
+
117
+ - `messages.length=0` is evidence.
118
+ - "messages seemed empty" is not evidence — it's a memory of an observation, and memory of observations is where debug sessions go to die.
119
+
120
+ If you find yourself about to paraphrase, stop, go back, and copy the raw value.
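
One way to make paraphrasing harder is to generate entries from the captured value directly. This is a minimal sketch that follows the shape of the template above; the function name, its parameters, and the example `server.py:88` source are hypothetical, and the heading uses a plain hyphen as separator.

```python
from datetime import datetime, timezone


def journal_entry(looked_at, source, value, interpretation, verdict, hypothesis, now=None):
    """Render one evidence entry; `value` is inserted exactly as captured."""
    if verdict not in ("Refutes", "Confirms"):
        raise ValueError("verdict must be 'Refutes' or 'Confirms'")
    stamp = (now or datetime.now(timezone.utc)).isoformat(timespec="seconds")
    return "\n".join([
        f"### {stamp} - {looked_at}",
        f"- Source: {source}",
        f"- Value: `{value}`",
        f"- Interpretation: {interpretation}",
        f"- {verdict}: H{hypothesis}",
    ])


entry = journal_entry(
    looked_at="request handler input",
    source="server.py:88",  # hypothetical file:line
    value="messages.length=0",
    interpretation="handler received an empty list",
    verdict="Refutes",
    hypothesis=2,
    now=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(entry)
```

The function takes the raw value as a string and never rewrites it, so whatever was copied from the debugger or log is what lands in the journal.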
121
+
122
+ ---
123
+
124
+ ## Round completion
125
+
126
+ A "round" is complete when every hypothesis has either confirming or refuting evidence — or when you have exhausted the evidence sources available without a decisive result. If the round ends inconclusively, that counts as a failed round for the counter in the journal. See `04-oracle-triple.md` for what to do at 2 consecutive failed rounds.
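
The failed-round counter can be sketched as below. This assumes each hypothesis ends a round with one of three status strings and that any remaining `inconclusive` status makes the whole round inconclusive; both the status names and that reading of "inconclusive" are assumptions, not something the journal format prescribes.

```python
DECISIVE = {"confirmed", "refuted"}


def round_is_inconclusive(statuses):
    """True if any hypothesis ended the round without confirming or refuting evidence."""
    return any(status not in DECISIVE for status in statuses.values())


def update_failed_rounds(counter, statuses):
    """Increment the consecutive-failure counter, or reset it after a decisive round."""
    if round_is_inconclusive(statuses):
        return counter + 1
    return 0


counter = 0
counter = update_failed_rounds(counter, {"H1": "refuted", "H2": "inconclusive", "H3": "refuted"})
counter = update_failed_rounds(counter, {"H1": "inconclusive", "H2": "inconclusive", "H3": "refuted"})
```

After these two rounds `counter` is 2, which is the threshold described in `04-oracle-triple.md`.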
@@ -0,0 +1,106 @@
1
+ # Phase 4 — Oracle Triple Consultation
2
+
3
+ After 2 consecutive failed hypothesis rounds, stop investigating and reframe. Continuing past two failures usually means the real cause is in a category you haven't imagined, and more time spent on your current mental model is wasted.
4
+
5
+ The Oracle Triple is how you break out of the mental box.
6
+
7
+ > ⚠️ **Wrong tool for non-debugging tasks.** The Triple is for *stuck root-cause hunts*. If your task is producing an artifact (extraction, reverse engineering, audit, compliance documentation) and you want a skeptical review before declaring it done, use the **Verification Oracle** pattern in [partial-runtime-evidence.md](partial-runtime-evidence.md#verification-oracle-pattern-for-non-debug-tasks). Running the Triple on a finished extraction returns three diverging "what if you tried…" tangents that are not what you need.
8
+
9
+ ---
10
+
11
+ ## When to invoke
12
+
13
+ | Situation | Invoke? |
14
+ |---|---|
15
+ | 1 round failed, you have new distinguishing evidence | No — run one more round with a refined hypothesis set |
16
+ | 2 rounds failed, hypotheses now feel like variations of each other | **Yes — invoke now** |
17
+ | 2 rounds failed, no new evidence angles left to try | **Yes — invoke now** |
18
+ | You've been investigating >2 hours on the same bug | **Yes — invoke now regardless of round count** |
19
+ | 1 round failed but the user is watching and wants speed | No — one round isn't enough to justify Oracle cost. Resist the urge. |
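
The table reduces to two triggers: the consecutive-failure count and the time spent. A minimal sketch, assuming elapsed time is tracked in hours; the function and parameter names are hypothetical.

```python
def should_invoke_oracle_triple(failed_rounds, hours_on_bug):
    """Apply the two triggers from the table: 2 failed rounds, or more than 2 hours."""
    return failed_rounds >= 2 or hours_on_bug > 2


decisions = [
    should_invoke_oracle_triple(failed_rounds=1, hours_on_bug=0.5),
    should_invoke_oracle_triple(failed_rounds=2, hours_on_bug=0.5),
    should_invoke_oracle_triple(failed_rounds=0, hours_on_bug=2.5),
]
```

Note that user impatience is deliberately not an input: one failed round with a watching user still returns `False`.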
20
+
21
+ ---
22
+
23
+ ## Why three Oracles, and why *orthogonal* framings
24
+
25
+ A single Oracle call returns a single coherent analysis. Coherent analyses tend to inherit the framing of the prompt, which means they inherit the same blind spots the investigator already has. Three Oracles with *orthogonal framings* force the analyses to diverge, and the places where they agree across frames are where the real signal lives.
26
+
27
+ The three framings below are chosen to cover distinct bug-cause categories:
28
+
29
+ - **A (obvious-but-missed)** — embarrassingly simple causes the investigator walked past.
30
+ - **B (system-boundary)** — causes living at integration seams, not in the code being read.
31
+ - **C (invariant-violation)** — assumptions load-bearing to current hypotheses that may themselves be false.
32
+
33
+ Spawn all three in parallel.
34
+
35
+ ---
36
+
37
+ ## The three prompts
38
+
39
+ Launch three parallel Claude Code workflow lanes or subagents with the same
40
+ evidence packet and three orthogonal prompts:
41
+
42
+ - Framing A - obvious-but-missed: ask for the three simplest causes a senior
43
+ engineer might spot quickly, including typos, stale cache, wrong process,
44
+ wrong file, wrong import, or test harness mismatch.
45
+ - Framing B - system-boundary: ask for three causes at integration boundaries,
46
+ such as third-party SDK behavior, middleware mutation, proxy rewrites,
47
+ build-time vs runtime env resolution, module-load order, shared library
48
+ mismatch, ABI differences, or transport mismatch.
49
+ - Framing C - invariant-violation: ask for the five most load-bearing assumed
50
+ invariants and the smallest runtime query that would falsify each.
51
+
52
+ ---
53
+
54
+ ## Synthesizing across three Oracles
55
+
56
+ **Do not pick the highest-ranked candidate from a single Oracle.** That defeats the purpose of getting three framings.
57
+
58
+ Instead, walk the outputs in this order:
59
+
60
+ ### 1. Agreement scan
61
+
62
+ Note which candidate causes appear in at least two Oracles' outputs. Independent agreement across orthogonal framings is a strong signal: when the obvious-but-missed framing and the system-boundary framing both land on the same cause, that's usually the bug.
63
+
64
+ ### 2. Disagreement scan
65
+
66
+ Note where Oracles disagree. Disagreement is genuine uncertainty that runtime evidence (not more reasoning) must resolve. Each disagreement becomes a candidate for the next round's distinguishing query.
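
The two scans can be sketched as a simple tally. This assumes each Oracle's output has already been reduced by hand to a list of short cause labels, and that identical labels mean identical causes; real outputs are free text, so that reduction is the hard part and is not shown. Treating "named by only one framing" as disagreement is also a simplification, and the example labels are hypothetical.

```python
from collections import defaultdict


def scan(outputs):
    """Split candidate causes into cross-framing agreements and single-framing ones."""
    framings_for = defaultdict(set)
    for framing, candidates in outputs.items():
        for cause in candidates:
            framings_for[cause.strip().lower()].add(framing)
    agreement = {c: sorted(f) for c, f in framings_for.items() if len(f) >= 2}
    disagreement = {c: sorted(f) for c, f in framings_for.items() if len(f) == 1}
    return agreement, disagreement


# Hypothetical, hand-labelled outputs from framings A, B, and C.
outputs = {
    "A": ["stale build", "wrong import"],
    "B": ["Stale build", "proxy rewrites header"],
    "C": ["proxy rewrites header", "config is loaded once"],
}
agreement, disagreement = scan(outputs)
```

Causes in `agreement` are candidates for the likely-cause hypothesis; each entry in `disagreement` is a candidate for a distinguishing runtime query in the next round.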
67
+
68
+ ### 3. New falsification queries
69
+
70
+ Framing C produces concrete "one query that would decide it" suggestions. Pull these verbatim into your new round's evidence-gathering plan — they are designed to be decisive.
71
+
72
+ ### 4. Build the new hypothesis set
73
+
74
+ Minimum 3, same rules as Phase 2. Aim to have hypotheses drawn from the agreement scan (likely cause) AND from the disagreement scan (so one round's evidence resolves the disagreement).
75
+
76
+ Record in the journal:
77
+
78
+ ```markdown
79
+ ## Oracle Triple — Round <N>
80
+ - Invoked at: <ISO timestamp>
81
+ - Framing A summary: <top 3 candidates, one line each>
82
+ - Framing B summary: <top 3 candidates>
83
+ - Framing C summary: <5 load-bearing assumptions + falsification queries>
84
+
85
+ ### Cross-framing agreement
86
+ - <candidate> appeared in A + B
87
+ - <candidate> appeared in B + C
88
+
89
+ ### New hypothesis set
90
+ 1. <hypothesis> — evidence to gather: <one-liner>
91
+ 2. ...
92
+ ```
93
+
94
+ ### 5. Reset the counter
95
+
96
+ Reset the "consecutive failed rounds" counter to 0. Return to Phase 3 (parallel investigation) with the new set.
97
+
98
+ ---
99
+
100
+ ## If *another* 2 rounds fail after the Oracle Triple
101
+
102
+ You are genuinely stuck. This is the escalation threshold.
103
+
104
+ Escalate to the user (see `05-escalate.md`) with the full trace: every hypothesis tried, every piece of evidence captured, both Oracle syntheses. Do not guess a fix.
105
+
106
+ This is rare. In practice, the Oracle Triple usually unsticks a debugging session within one round, because it pulls in framings the investigator was too close to the code to see.
@@ -0,0 +1,69 @@
1
+ # Phase 5 — User Decision Escalation
2
+
3
+ Escalation is for genuine ambiguity, not for skipping investigation. Most "should I ask the user" moments are really "I don't want to do one more query" moments, and those are wrong.
4
+
5
+ ---
6
+
7
+ ## Ask the user ONLY when
8
+
9
+ - **Evidence exhausted**, contradictions remain, and further investigation would require a decision with policy implications (e.g. "patch the third-party SDK vs wrap it vs change architecture").
10
+ - The bug has **multiple valid fixes with different scope/risk tradeoffs** and the user's preference drives the choice.
11
+ - A proposed fix would **change observable product behavior** for the end user (not just fix the internal bug).
12
+ - You've **exhausted the Oracle Triple** and another 2 rounds failed after synthesis.
13
+
14
+ ## Do NOT ask when
15
+
16
+ - You haven't tried the Oracle Triple yet.
17
+ - The question can be answered by one more runtime query.
18
+ - You're asking for permission to do the obvious thing.
19
+ - You're asking because you're tired.
20
+
21
+ ---
22
+
23
+ ## Escalation format (paste into the reply)
24
+
25
+ Keep it short. Evidence-dense. One decision, not a status update.
26
+
27
+ ```markdown
28
+ ## Decision needed
29
+
30
+ **What we know** (verbatim evidence, not paraphrase):
31
+ - <fact 1 with file:line or address>
32
+ - <fact 2 with source>
33
+ - <what the evidence rules IN>
34
+ - <what the evidence rules OUT>
35
+
36
+ **What the decision is** (one sentence):
37
+ <the fork in the road>
38
+
39
+ **Options**:
40
+
41
+ | # | Fix | Scope | Risk | Effort |
42
+ |---|-----|-------|------|--------|
43
+ | A | <short label> | <files touched / layers> | <regressions possible> | <rough> |
44
+ | B | ... | ... | ... | ... |
45
+ | C | ... | ... | ... | ... |
46
+
47
+ **Recommendation**: <A/B/C> because <one-sentence reason>.
48
+
49
+ Which direction do you want?
50
+ ```
51
+
52
+ ---
53
+
54
+ ## Anti-patterns in escalation
55
+
56
+ - **Asking without evidence.** "What do you want me to do?" is not an escalation, it's abandonment. Every escalation includes the evidence the user needs to decide.
57
+ - **Two questions in one.** One decision per escalation. Multi-part questions lead to partial answers and re-escalation.
58
+ - **Escalating before Phase 4.** If you haven't tried the Oracle Triple, you haven't earned the right to escalate.
59
+ - **Presenting options you don't actually have.** If option C requires a library the user doesn't use, don't list it. The options are only things you can actually do today.
60
+ - **Hiding a recommendation.** The user hired you to think — always end with a recommendation, even if you're low-confidence. Say so explicitly: "Recommendation (low confidence): B, because X. If you have context about Y that I don't, it might change to A."
61
+
62
+ ---
63
+
64
+ ## What happens after the user responds
65
+
66
+ - **User picks an option**: proceed to Phase 6 (root cause confirmation) with the chosen direction. The user's choice is not itself confirmation: you still need runtime evidence that the cause you're fixing is the cause in play.
67
+ - **User proposes a different option you hadn't considered**: treat it as new information and update the hypotheses. This may trigger another Phase 3 round.
68
+ - **User gives more context that resolves the disagreement**: skip to Phase 6.
69
+ - **User is also unsure**: that's a signal you need more evidence, not more opinions. Run one more targeted query before asking again.