devflow-kit 1.4.0 → 1.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (95)
  1. package/CHANGELOG.md +51 -0
  2. package/README.md +7 -3
  3. package/dist/commands/ambient.js +1 -1
  4. package/dist/commands/init.js +31 -2
  5. package/dist/commands/list.d.ts +21 -0
  6. package/dist/commands/list.js +71 -3
  7. package/dist/plugins.js +24 -24
  8. package/dist/utils/manifest.d.ts +45 -0
  9. package/dist/utils/manifest.js +100 -0
  10. package/dist/utils/post-install.js +6 -1
  11. package/package.json +1 -1
  12. package/plugins/devflow-accessibility/.claude-plugin/plugin.json +1 -1
  13. package/plugins/devflow-ambient/.claude-plugin/plugin.json +25 -4
  14. package/plugins/devflow-ambient/README.md +48 -29
  15. package/plugins/devflow-ambient/agents/coder.md +135 -0
  16. package/plugins/devflow-ambient/agents/reviewer.md +165 -0
  17. package/plugins/devflow-ambient/agents/scrutinizer.md +80 -0
  18. package/plugins/devflow-ambient/agents/shepherd.md +94 -0
  19. package/plugins/devflow-ambient/agents/simplifier.md +93 -0
  20. package/plugins/devflow-ambient/agents/skimmer.md +93 -0
  21. package/plugins/devflow-ambient/agents/validator.md +86 -0
  22. package/plugins/devflow-ambient/skills/ambient-router/SKILL.md +72 -28
  23. package/plugins/devflow-ambient/skills/ambient-router/references/skill-catalog.md +40 -34
  24. package/plugins/devflow-ambient/skills/debug-orchestration/SKILL.md +69 -0
  25. package/plugins/devflow-ambient/skills/implementation-orchestration/SKILL.md +92 -0
  26. package/plugins/devflow-ambient/skills/plan-orchestration/SKILL.md +71 -0
  27. package/plugins/devflow-audit-claude/.claude-plugin/plugin.json +10 -1
  28. package/plugins/devflow-audit-claude/commands/audit-claude.md +4 -0
  29. package/plugins/devflow-code-review/.claude-plugin/plugin.json +2 -1
  30. package/plugins/devflow-code-review/agents/reviewer.md +47 -9
  31. package/plugins/devflow-code-review/agents/synthesizer.md +12 -5
  32. package/plugins/devflow-code-review/commands/code-review-teams.md +43 -30
  33. package/plugins/devflow-code-review/commands/code-review.md +14 -2
  34. package/plugins/devflow-code-review/skills/knowledge-persistence/SKILL.md +128 -0
  35. package/plugins/devflow-code-review/skills/knowledge-persistence/references/examples.md +44 -0
  36. package/plugins/devflow-core-skills/.claude-plugin/plugin.json +2 -1
  37. package/plugins/devflow-core-skills/skills/docs-framework/SKILL.md +7 -1
  38. package/plugins/devflow-core-skills/skills/search-first/SKILL.md +133 -0
  39. package/plugins/devflow-core-skills/skills/search-first/references/evaluation-criteria.md +101 -0
  40. package/plugins/devflow-core-skills/skills/test-driven-development/SKILL.md +6 -5
  41. package/plugins/devflow-debug/.claude-plugin/plugin.json +5 -3
  42. package/plugins/devflow-debug/agents/synthesizer.md +211 -0
  43. package/plugins/devflow-debug/commands/debug-teams.md +28 -14
  44. package/plugins/devflow-debug/commands/debug.md +26 -12
  45. package/plugins/devflow-debug/skills/knowledge-persistence/SKILL.md +128 -0
  46. package/plugins/devflow-debug/skills/knowledge-persistence/references/examples.md +44 -0
  47. package/plugins/devflow-frontend-design/.claude-plugin/plugin.json +1 -1
  48. package/plugins/devflow-go/.claude-plugin/plugin.json +1 -1
  49. package/plugins/devflow-implement/.claude-plugin/plugin.json +2 -1
  50. package/plugins/devflow-implement/agents/coder.md +21 -13
  51. package/plugins/devflow-implement/agents/simplifier.md +32 -1
  52. package/plugins/devflow-implement/agents/skimmer.md +5 -0
  53. package/plugins/devflow-implement/agents/synthesizer.md +12 -5
  54. package/plugins/devflow-implement/commands/implement-teams.md +73 -60
  55. package/plugins/devflow-implement/commands/implement.md +45 -40
  56. package/plugins/devflow-implement/skills/knowledge-persistence/SKILL.md +128 -0
  57. package/plugins/devflow-implement/skills/knowledge-persistence/references/examples.md +44 -0
  58. package/plugins/devflow-java/.claude-plugin/plugin.json +1 -1
  59. package/plugins/devflow-python/.claude-plugin/plugin.json +1 -1
  60. package/plugins/devflow-react/.claude-plugin/plugin.json +1 -1
  61. package/plugins/devflow-resolve/.claude-plugin/plugin.json +4 -3
  62. package/plugins/devflow-resolve/agents/simplifier.md +32 -1
  63. package/plugins/devflow-resolve/commands/resolve-teams.md +16 -7
  64. package/plugins/devflow-resolve/commands/resolve.md +16 -7
  65. package/plugins/devflow-resolve/skills/knowledge-persistence/SKILL.md +128 -0
  66. package/plugins/devflow-resolve/skills/knowledge-persistence/references/examples.md +44 -0
  67. package/plugins/devflow-rust/.claude-plugin/plugin.json +1 -1
  68. package/plugins/devflow-self-review/.claude-plugin/plugin.json +10 -1
  69. package/plugins/devflow-self-review/agents/simplifier.md +32 -1
  70. package/plugins/devflow-self-review/commands/self-review.md +10 -4
  71. package/plugins/devflow-specify/.claude-plugin/plugin.json +1 -1
  72. package/plugins/devflow-specify/agents/skimmer.md +5 -0
  73. package/plugins/devflow-specify/agents/synthesizer.md +12 -5
  74. package/plugins/devflow-specify/commands/specify-teams.md +27 -20
  75. package/plugins/devflow-specify/commands/specify.md +26 -19
  76. package/plugins/devflow-typescript/.claude-plugin/plugin.json +1 -1
  77. package/scripts/hooks/ambient-prompt +8 -7
  78. package/scripts/hooks/session-start-memory +33 -3
  79. package/shared/agents/coder.md +21 -13
  80. package/shared/agents/reviewer.md +47 -9
  81. package/shared/agents/simplifier.md +32 -1
  82. package/shared/agents/skimmer.md +5 -0
  83. package/shared/agents/synthesizer.md +12 -5
  84. package/shared/skills/ambient-router/SKILL.md +72 -28
  85. package/shared/skills/ambient-router/references/skill-catalog.md +40 -34
  86. package/shared/skills/debug-orchestration/SKILL.md +69 -0
  87. package/shared/skills/docs-framework/SKILL.md +7 -1
  88. package/shared/skills/implementation-orchestration/SKILL.md +92 -0
  89. package/shared/skills/knowledge-persistence/SKILL.md +128 -0
  90. package/shared/skills/knowledge-persistence/references/examples.md +44 -0
  91. package/shared/skills/plan-orchestration/SKILL.md +71 -0
  92. package/shared/skills/search-first/SKILL.md +133 -0
  93. package/shared/skills/search-first/references/evaluation-criteria.md +101 -0
  94. package/shared/skills/test-driven-development/SKILL.md +6 -5
  95. package/plugins/devflow-ambient/commands/ambient.md +0 -110
@@ -4,7 +4,7 @@
  "author": {
  "name": "Dean0x"
  },
- "version": "1.4.0",
+ "version": "1.6.0",
  "homepage": "https://github.com/dean0x/devflow",
  "repository": "https://github.com/dean0x/devflow",
  "license": "MIT",
@@ -24,6 +24,7 @@
  "git-workflow",
  "github-patterns",
  "input-validation",
+ "search-first",
  "test-driven-development",
  "test-patterns"
  ]
@@ -39,7 +39,10 @@ All generated documentation lives under `.docs/` in the project root:
  .memory/
  ├── WORKING-MEMORY.md # Auto-maintained by Stop hook (overwritten)
  ├── PROJECT-PATTERNS.md # Accumulated patterns (merged across sessions)
- └── backup.json # Pre-compact git state snapshot
+ ├── backup.json # Pre-compact git state snapshot
+ └── knowledge/
+     ├── decisions.md # Architectural decisions (ADR-NNN format)
+     └── pitfalls.md # Known pitfalls (PF-NNN format)
  ```
 
  ---
@@ -97,6 +100,8 @@ source .devflow/scripts/docs-helpers.sh 2>/dev/null || {
  |-------|-----------------|----------|
  | Reviewer | `.docs/reviews/{branch-slug}/{type}-report.{timestamp}.md` | Creates new |
  | Working Memory | `.memory/WORKING-MEMORY.md` | Overwrites (auto-maintained by Stop hook) |
+ | Knowledge (decisions) | `.memory/knowledge/decisions.md` | Append-only (ADR-NNN sequential IDs) |
+ | Knowledge (pitfalls) | `.memory/knowledge/pitfalls.md` | Append-only (PF-NNN sequential IDs) |
 
  ### Agents That Don't Persist
 
@@ -125,6 +130,7 @@ When creating or modifying persisting agents:
  This framework is used by:
  - **Review agents**: Creates review reports
  - **Working Memory hooks**: Auto-maintains `.memory/WORKING-MEMORY.md`
+ - **Command flows**: `/implement` appends ADRs to `decisions.md`; `/code-review`, `/debug`, `/resolve` append PFs to `pitfalls.md`
 
  All persisting agents should load this skill to ensure consistent documentation.
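
The knowledge files are described as append-only with sequential `ADR-NNN` / `PF-NNN` IDs, which implies each new entry takes the next number in its series. A minimal sketch of that numbering follows. It assumes three-digit zero padding (read off the `NNN` placeholder); `nextKnowledgeId` is a hypothetical helper name, and the sample entry titles are made up. The actual procedure lives in the knowledge-persistence skill and may differ.

```typescript
// Hypothetical helper, not part of devflow-kit: derives the next sequential
// ID for an append-only knowledge file such as decisions.md or pitfalls.md.
type KnowledgePrefix = "ADR" | "PF";

function nextKnowledgeId(fileContent: string, prefix: KnowledgePrefix): string {
  // Match every existing ID for this prefix, e.g. "ADR-007".
  const pattern = new RegExp("\\b" + prefix + "-(\\d+)\\b", "g");
  let highest = 0;
  let match: RegExpExecArray | null = pattern.exec(fileContent);
  while (match !== null) {
    highest = Math.max(highest, parseInt(match[1] || "0", 10));
    match = pattern.exec(fileContent);
  }
  const next = highest + 1;
  // Assumption: IDs are zero-padded to three digits (NNN).
  const padded = next < 1000 ? ("000" + next).slice(-3) : String(next);
  return prefix + "-" + padded;
}

// Made-up file content for illustration.
console.log(nextKnowledgeId("## ADR-001: Use SQLite\n## ADR-002: Drop ORM\n", "ADR")); // prints "ADR-003"
```

Taking the highest existing number rather than counting entries keeps the IDs unique even if an earlier entry was removed by hand.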
 
@@ -0,0 +1,133 @@
+ ---
+ name: search-first
+ description: >-
+ This skill should be used when the user asks to "add a utility", "create a helper",
+ "implement parsing", "build a wrapper", or writes infrastructure/utility code that
+ may already exist as a well-maintained package. Enforces research before building.
+ user-invocable: false
+ allowed-tools: Read, Grep, Glob
+ ---
+
+ # Search-First
+
+ Research before building. Check if a battle-tested solution exists before writing custom utility code.
+
+ ## Iron Law
+
+ > **RESEARCH BEFORE BUILDING**
+ >
+ > Never write custom utility code without first checking if a battle-tested solution
+ > exists. The best code is code you don't write. A maintained package with thousands
+ > of users will always beat a hand-rolled utility in reliability, edge cases, and
+ > long-term maintenance.
+
+ ## When This Skill Activates
+
+ **Triggers** — creating or modifying code that:
+ - Parses, formats, or validates data (dates, URLs, emails, UUIDs, etc.)
+ - Implements common algorithms (sorting, diffing, hashing, encoding)
+ - Wraps HTTP clients, retries, rate limiting, caching
+ - Handles file system operations beyond basic read/write
+ - Implements CLI argument parsing, logging, or configuration
+ - Creates test utilities (mocking, fixtures, assertions)
+
+ **Does NOT trigger** for:
+ - Domain-specific business logic unique to this project
+ - Glue code connecting existing components
+ - Trivial operations (< 5 lines, single-use)
+ - Code that intentionally avoids external dependencies (e.g., zero-dep libraries)
+
+ ---
+
+ ## Research Process
+
+ ### Phase 1: Need Analysis
+
+ Before searching, define what you actually need:
+
+ ```
+ Need: {one-sentence description of the capability}
+ Constraints: {runtime, bundle size, license, zero-dep requirement}
+ Must-haves: {non-negotiable requirements}
+ Nice-to-haves: {optional features}
+ ```
+
+ ### Phase 2: Search
+
+ Delegate research to an Explore subagent to keep main session context clean.
+
+ **Spawn an Explore agent** with this prompt template:
+
+ ```
+ Task(subagent_type="Explore"):
+ "Research existing solutions for: {need description}
+
+ Search for:
+ 1. npm/PyPI/crates packages that solve this (check package.json/requirements.txt for ecosystem)
+ 2. Existing utilities in this codebase (grep for related function names)
+ 3. Framework built-ins that already handle this
+
+ For each candidate, find:
+ - Package name and weekly downloads (if applicable)
+ - Last publish date and maintenance status
+ - Bundle size / dependency count
+ - API surface relevant to our need
+ - License compatibility
+
+ Return top 3 candidates with pros/cons, or confirm nothing suitable exists."
+ ```
+
+ ### Phase 3: Evaluate
+
+ Score each candidate against evaluation criteria. See `references/evaluation-criteria.md` for the full matrix.
+
+ Quick checklist:
+ - [ ] Last published within 12 months
+ - [ ] Weekly downloads > 1,000 (npm) or equivalent traction
+ - [ ] No known vulnerabilities (check Snyk/npm audit)
+ - [ ] API fits the use case without heavy wrapping
+ - [ ] License compatible with project (MIT/Apache/BSD preferred)
+ - [ ] Bundle size acceptable for the project context
+
+ ### Phase 4: Decide
+
+ Choose one of four outcomes:
+
+ | Decision | When | Action |
+ |----------|------|--------|
+ | **Adopt** | Exact match, well-maintained, good API | Install and use directly |
+ | **Extend** | Partial match, needs thin wrapper | Install + write minimal adapter |
+ | **Compose** | No single package fits, but 2-3 small ones combine well | Install multiple, write glue code |
+ | **Build** | Nothing fits, or dependency cost exceeds value | Write custom, document why |
+
+ **Document the decision** in a code comment at the usage site:
+
+ ```typescript
+ // search-first: Adopted date-fns for date formatting (2M weekly downloads, 30KB)
+ // search-first: Built custom — no package handles our specific wire format
+ ```
+
+ ---
+
+ ## Anti-Patterns
+
+ | Anti-Pattern | Correct Approach |
+ |-------------|-----------------|
+ | Adding a dependency for 5 lines of trivial code | Build — dependency overhead exceeds value |
+ | Choosing the most popular package without checking fit | Evaluate API fit, not just popularity |
+ | Wrapping a package so heavily it obscures the original | If wrapping > 50% of original API, reconsider |
+ | Skipping research because "I know how to build this" | Research anyway — maintenance cost matters more than initial build |
+ | Installing a massive framework for one utility function | Look for focused, single-purpose packages |
+
+ ## Scope Limiter
+
+ This skill concerns **utility and infrastructure code** only:
+ - Data transformation, validation, formatting
+ - Network operations, retries, caching
+ - CLI tooling, logging, configuration
+ - Test utilities and helpers
+
+ It does NOT apply to **domain-specific business logic** where:
+ - The logic encodes unique business rules
+ - No generic solution could exist
+ - The code is inherently project-specific
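
The Phase 3 quick checklist in the new skill mixes mechanical checks with judgment calls. The sketch below covers only the four items that can be checked from package metadata; API fit and bundle size are left to the reviewer. The `Candidate` shape, the function name, and the SPDX identifiers standing in for "MIT/Apache/BSD" are assumptions for illustration, not anything the skill defines.

```typescript
// Hypothetical metadata for one package candidate (not a devflow-kit type).
interface Candidate {
  name: string;
  lastPublished: Date;
  weeklyDownloads: number;
  knownVulnerabilities: number;
  license: string;
}

// Assumption: SPDX identifiers for the "MIT/Apache/BSD preferred" item.
const PREFERRED_LICENSES = ["MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause"];

// Returns the checklist items the candidate fails; an empty array means it passes.
function quickChecklistFailures(candidate: Candidate, now: Date): string[] {
  const failures: string[] = [];
  const twelveMonthsAgo = new Date(now.getTime());
  twelveMonthsAgo.setFullYear(twelveMonthsAgo.getFullYear() - 1);

  if (candidate.lastPublished.getTime() < twelveMonthsAgo.getTime()) {
    failures.push("last published more than 12 months ago");
  }
  if (candidate.weeklyDownloads <= 1000) {
    failures.push("weekly downloads not above 1,000");
  }
  if (candidate.knownVulnerabilities > 0) {
    failures.push("known vulnerabilities");
  }
  if (PREFERRED_LICENSES.indexOf(candidate.license) === -1) {
    failures.push("license outside the preferred set");
  }
  return failures;
}
```

Returning the list of failed items, rather than a boolean, leaves room for the trade-off discussion the skill asks for: a license outside the preferred set is a flag to review, not necessarily a rejection.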
@@ -0,0 +1,101 @@
+ # Search-First — Evaluation Criteria
+
+ Detailed package evaluation criteria and decision matrix for the 4-outcome model.
+
+ ## Evaluation Matrix
+
+ Score each candidate on these axes (1-5 scale):
+
+ | Criterion | Weight | 1 (Poor) | 3 (Acceptable) | 5 (Excellent) |
+ |-----------|--------|-----------|-----------------|----------------|
+ | **Maintenance** | High | No commits in 2+ years | Active, yearly releases | Regular releases, responsive maintainer |
+ | **Adoption** | Medium | < 100 weekly downloads | 1K-10K weekly downloads | > 100K weekly downloads |
+ | **API Fit** | High | Needs heavy wrapping | Partial fit, thin adapter needed | Direct use, clean API |
+ | **Bundle Size** | Medium | > 500KB | 50-500KB | < 50KB |
+ | **Security** | High | Known vulnerabilities | No known issues, few dependencies | Audited, zero/minimal dependencies |
+ | **License** | Required | GPL/AGPL (restrictive) | LGPL (conditional) | MIT/Apache/BSD (permissive) |
+
+ **Minimum thresholds**: License must be compatible. Security must be ≥ 3. All others are trade-offs.
+
+ ## Decision Matrix
+
+ ### Adopt (score ≥ 20/25, API Fit ≥ 4)
+
+ The package directly solves the problem with minimal integration code.
+
+ **Example**: Using `zod` for schema validation — exact fit, massive adoption, tiny bundle.
+
+ ```
+ ✅ Adopt: zod v3.22
+ - Maintenance: 5 (monthly releases)
+ - Adoption: 5 (4M weekly downloads)
+ - API Fit: 5 (direct use for all validation)
+ - Bundle Size: 4 (57KB)
+ - Security: 5 (zero dependencies)
+ - Total: 24/25
+ ```
+
+ ### Extend (score ≥ 15/25, API Fit ≥ 2)
+
+ The package handles 60-80% of the need. Write a thin adapter for the rest.
+
+ **Example**: Using `got` for HTTP but wrapping it with project-specific retry and auth logic.
+
+ ```
+ ✅ Extend: got v14
+ - Maintenance: 4 (active)
+ - Adoption: 5 (8M weekly downloads)
+ - API Fit: 3 (need custom retry wrapper)
+ - Bundle Size: 3 (150KB)
+ - Security: 4 (minimal deps)
+ - Total: 19/25
+ Adapter: ~30 lines wrapping retry + auth headers
+ ```
+
+ ### Compose (no single package fits, but small packages combine)
+
+ Two or three focused packages together solve the problem better than one large framework.
+
+ **Example**: `ms` (time parsing) + `p-retry` (retry logic) + `quick-lru` (caching) instead of a monolithic HTTP client framework.
+
+ **Rules for Compose**:
+ - Maximum 3 packages in a composition
+ - Each package must be focused (single responsibility)
+ - Total combined bundle < what a monolithic alternative would cost
+ - Glue code should be < 50 lines
+
+ ### Build (nothing fits, or dependency cost > value)
+
+ Write custom code when:
+ - No package scores ≥ 15/25
+ - The code is < 50 lines and trivial
+ - Zero-dependency constraint is explicit
+ - The domain is too specific for generic packages
+
+ **Required**: Document why Build was chosen:
+
+ ```typescript
+ // search-first: Built custom — our wire format uses non-standard
+ // ISO-8601 extensions that no date library handles correctly.
+ // Evaluated: date-fns (no custom format support), luxon (500KB overhead),
+ // dayjs (close but missing timezone edge case).
+ ```
+
+ ## Ecosystem-Specific Hints
+
+ ### Node.js / TypeScript
+ - Check npm: `https://www.npmjs.com/package/{name}`
+ - Bundle size: `https://bundlephobia.com/package/{name}`
+ - Check if Node.js built-ins handle it (`node:crypto`, `node:url`, `node:path`)
+
+ ### Python
+ - Check PyPI: `https://pypi.org/project/{name}`
+ - Check if stdlib handles it (`urllib`, `json`, `pathlib`, `dataclasses`)
+
+ ### Rust
+ - Check crates.io: `https://crates.io/crates/{name}`
+ - Check if std handles it
+
+ ### Go
+ - Check pkg.go.dev
+ - Go standard library is extensive — check stdlib first
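
The score thresholds in the decision matrix can be read as a small per-candidate function. The sketch below is one hedged interpretation: the five scored criteria sum to the `/25` total (License is treated as a pass/fail gate, which matches the two worked examples), and a candidate failing a minimum threshold is marked `Rejected`. The type and function names are hypothetical. Compose and Build are not modeled, since the document chooses them only when no single candidate qualifies.

```typescript
type Score = 1 | 2 | 3 | 4 | 5;

// The five scored criteria from the evaluation matrix (5 x 5 = 25 points).
interface Scores {
  maintenance: Score;
  adoption: Score;
  apiFit: Score;
  bundleSize: Score;
  security: Score;
}

// Per-candidate outcome; "Rejected" is a label introduced for this sketch.
type CandidateOutcome = "Adopt" | "Extend" | "Rejected";

function totalScore(scores: Scores): number {
  return scores.maintenance + scores.adoption + scores.apiFit + scores.bundleSize + scores.security;
}

function decide(scores: Scores, licenseCompatible: boolean): CandidateOutcome {
  // Minimum thresholds: compatible license and Security >= 3.
  if (!licenseCompatible || scores.security < 3) return "Rejected";
  const total = totalScore(scores);
  if (total >= 20 && scores.apiFit >= 4) return "Adopt";
  if (total >= 15 && scores.apiFit >= 2) return "Extend";
  return "Rejected";
}

// The two worked examples from the document.
const zod: Scores = { maintenance: 5, adoption: 5, apiFit: 5, bundleSize: 4, security: 5 };
const got: Scores = { maintenance: 4, adoption: 5, apiFit: 3, bundleSize: 3, security: 4 };
console.log(totalScore(zod), decide(zod, true)); // prints 24 Adopt
console.log(totalScore(got), decide(got, true)); // prints 19 Extend
```

If every candidate comes back `Rejected`, the remaining choice is between Compose and Build, which depends on the rules listed under those headings rather than on a score.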
@@ -91,7 +91,7 @@ See `references/rationalization-prevention.md` for extended examples with code.
 
  ## Process Enforcement
 
- When implementing any feature under ambient BUILD/GUIDED:
+ When implementing any feature under ambient IMPLEMENT/GUIDED or IMPLEMENT/ORCHESTRATED:
 
  1. **Identify the first behavior** — What is the simplest thing this feature must do?
  2. **Write the test** — Describe that behavior as a failing test
@@ -130,7 +130,8 @@ When skipping TDD, never rationalize. State clearly: "Skipping TDD because: [spe
 
  ## Integration with Ambient Mode
 
- - **BUILD/GUIDED** → TDD enforced. Every new function/method gets test-first treatment.
- - **BUILD/QUICK** → TDD skipped (trivial single-file edit).
- - **BUILD/ELEVATE** → TDD mentioned in nudge toward `/implement`.
- - **DEBUG/GUIDED** → TDD applies to the fix: write a test that reproduces the bug first, then fix.
+ - **IMPLEMENT/GUIDED** → TDD enforced in main session. Write the failing test before production code. Skill loaded directly.
+ - **IMPLEMENT/ORCHESTRATED** → TDD enforced via Coder agent (skill in Coder frontmatter). Every implementation gets test-first treatment.
+ - **IMPLEMENT/QUICK** → TDD skipped (trivial single-file edit).
+ - **DEBUG/GUIDED** → TDD applies to the fix in main session: write a test that reproduces the bug first, then fix.
+ - **DEBUG/ORCHESTRATED** → TDD applies to the fix: write a test that reproduces the bug first, then fix.
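
The hunk above renames the `BUILD` intent to `IMPLEMENT`, drops `ELEVATE`, and adds an `ORCHESTRATED` depth. Viewed as data, the new mapping is a five-entry lookup. The sketch below is only a restatement of those five bullets; the table name, function name, and summary strings are illustrative and do not reflect how the ambient router is implemented.

```typescript
// Illustrative lookup restating the five added bullets; keys are "INTENT/DEPTH".
const TDD_BY_MODE: { [mode: string]: string } = {
  "IMPLEMENT/GUIDED": "enforced in main session",
  "IMPLEMENT/ORCHESTRATED": "enforced via Coder agent",
  "IMPLEMENT/QUICK": "skipped",
  "DEBUG/GUIDED": "test reproducing the bug first, in main session",
  "DEBUG/ORCHESTRATED": "test reproducing the bug first",
};

// Returns undefined for any combination the list does not mention.
function tddPolicy(intent: string, depth: string): string | undefined {
  return TDD_BY_MODE[intent + "/" + depth];
}

console.log(tddPolicy("IMPLEMENT", "ORCHESTRATED")); // prints "enforced via Coder agent"
console.log(tddPolicy("BUILD", "GUIDED")); // prints undefined (removed in 1.6.0)
```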
@@ -4,7 +4,7 @@
  "author": {
  "name": "Dean0x"
  },
- "version": "1.4.0",
+ "version": "1.6.0",
  "homepage": "https://github.com/dean0x/devflow",
  "repository": "https://github.com/dean0x/devflow",
  "license": "MIT",
@@ -15,10 +15,12 @@
  "agent-teams"
  ],
  "agents": [
- "git"
+ "git",
+ "synthesizer"
  ],
  "skills": [
  "agent-teams",
- "git-safety"
+ "git-safety",
+ "knowledge-persistence"
  ]
  }
@@ -0,0 +1,211 @@
+ ---
+ name: Synthesizer
+ description: Combines outputs from multiple agents into actionable summaries (modes: exploration, planning, review)
+ model: haiku
+ skills: review-methodology, docs-framework
+ ---
+
+ # Synthesizer Agent
+
+ You are a synthesis specialist. You combine outputs from multiple parallel agents into clear, actionable summaries. You operate in three modes: exploration, planning, and review.
+
+ ## Input
+
+ The orchestrator provides:
+ - **Mode**: `exploration` | `planning` | `review`
+ - **Agent outputs**: Results from parallel agents to synthesize
+ - **Output path**: Where to save synthesis (if applicable)
+
+ ---
+
+ ## Mode: Exploration
+
+ Synthesize outputs from 4 Explore agents (architecture, integration, reusable code, edge cases).
+
+ **Process:**
+ 1. Extract key findings from each explorer
+ 2. Identify patterns that appear across multiple explorations
+ 3. Resolve conflicts if explorers found contradictory patterns
+ 4. Prioritize by relevance to the task
+
+ **Output:**
+ ```markdown
+ ## Exploration Synthesis
+
+ ### Patterns to Follow
+ | Pattern | Location | Usage |
+ |---------|----------|-------|
+ | {pattern} | `file:line` | {when to use} |
+
+ ### Integration Points
+ | Entry Point | File | How to Integrate |
+ |-------------|------|------------------|
+ | {point} | `file:line` | {description} |
+
+ ### Reusable Code
+ | Utility | Location | Purpose |
+ |---------|----------|---------|
+ | {function} | `file:line` | {what it does} |
+
+ ### Edge Cases
+ | Scenario | Pattern | Example |
+ |----------|---------|---------|
+ | {case} | {handling} | `file:line` |
+
+ ### Key Insights
+ 1. {insight}
+ 2. {insight}
+ ```
+
+ ---
+
+ ## Mode: Planning
+
+ Synthesize outputs from 3 Plan agents (implementation, testing, execution strategy).
+
+ **Process:**
+ 1. Merge implementation steps with testing strategy
+ 2. Apply execution strategy analysis to determine Coder deployment
+ 3. Identify dependencies between steps
+ 4. Assess context risk based on file count and module breadth
+
+ **Execution Strategy Decision:**
+
+ Analyze 3 axes to determine strategy:
+
+ | Axis | Signals | Impact |
+ |------|---------|--------|
+ | **Artifact Independence** | Shared contracts? Integration points? Cross-file dependencies? | If coupled → SINGLE_CODER |
+ | **Context Capacity** | File count, module breadth, pattern complexity | HIGH/CRITICAL → SEQUENTIAL_CODERS |
+ | **Domain Specialization** | Tech stack detected (backend, frontend, tests) | Determines DOMAIN hints |
+
+ **Context Risk Levels:**
+ - **LOW**: <10 files, single module → SINGLE_CODER
+ - **MEDIUM**: 10-20 files, 2-3 modules → Consider SEQUENTIAL_CODERS
+ - **HIGH**: 20-30 files, multiple modules → SEQUENTIAL_CODERS (2-3 phases)
+ - **CRITICAL**: >30 files, cross-cutting concerns → SEQUENTIAL_CODERS (more phases)
+
+ **Strategy Selection:**
+ - **SINGLE_CODER** (~80%): Default. Coherent A→Z implementation. Best for consistency in naming, patterns, error handling.
+ - **SEQUENTIAL_CODERS** (~15%): Context overflow risk or layered dependencies. Split into phases with handoff summaries.
+ - **PARALLEL_CODERS** (~5%): True artifact independence - no shared contracts, no integration points. Rare.
+
+ **Output:**
+ ```markdown
+ ## Planning Synthesis
+
+ ### Execution Strategy
+ **Type**: {SINGLE_CODER | SEQUENTIAL_CODERS | PARALLEL_CODERS}
+ **Context Risk**: {LOW | MEDIUM | HIGH | CRITICAL}
+ **File Estimate**: {n} files across {m} modules
+ **Reason**: {why this strategy}
+
+ ### Subtask Breakdown (if not SINGLE_CODER)
+ | Phase | Domain | Description | Files | Depends On |
+ |-------|--------|-------------|-------|------------|
+ | 1 | backend | {description} | `file1`, `file2` | - |
+ | 2 | frontend | {description} | `file3`, `file4` | Phase 1 |
+
+ ### Implementation Plan
+ | Step | Action | Files | Tests | Depends On |
+ |------|--------|-------|-------|------------|
+ | 1 | {action} | `file` | `test_file` | - |
+ | 2 | {action} | `file` | `test_file` | Step 1 |
+
+ ### Risk Assessment
+ | Risk | Mitigation |
+ |------|------------|
+ | {issue} | {approach} |
+
+ ### Complexity
+ {Low | Medium | High} - {reasoning}
+ ```
+
+ ---
+
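
Planning mode's context risk levels and strategy selection can be sketched as two small functions. Several points here are assumptions rather than anything the agent file states: only file count is used (module breadth is ignored), the overlapping boundaries are resolved toward the lower level (20 files is MEDIUM, 30 is HIGH), MEDIUM keeps the default SINGLE_CODER even though the text says to "consider" sequential coders, and artifact independence takes precedence over context risk. The function names are hypothetical.

```typescript
type ContextRisk = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";
type Strategy = "SINGLE_CODER" | "SEQUENTIAL_CODERS" | "PARALLEL_CODERS";

// Assumption: boundaries at 20 and 30 files fall into the lower level.
function classifyContextRisk(fileCount: number): ContextRisk {
  if (fileCount < 10) return "LOW";
  if (fileCount <= 20) return "MEDIUM";
  if (fileCount <= 30) return "HIGH";
  return "CRITICAL";
}

// Assumption: true artifact independence is checked first, then context risk;
// the agent file does not state a precedence between the two axes.
function selectStrategy(fileCount: number, artifactsIndependent: boolean): Strategy {
  if (artifactsIndependent) return "PARALLEL_CODERS";
  const risk = classifyContextRisk(fileCount);
  if (risk === "HIGH" || risk === "CRITICAL") return "SEQUENTIAL_CODERS";
  return "SINGLE_CODER"; // the stated default
}

console.log(classifyContextRisk(25), selectStrategy(25, false)); // prints HIGH SEQUENTIAL_CODERS
```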
+ ## Mode: Review
+
+ Synthesize outputs from multiple Reviewer agents. Apply strict merge rules.
+
+ **Process:**
+ 1. Read all review reports from `${REVIEW_BASE_DIR}/*.md` (exclude your own output `review-summary.*.md`)
+ 2. Extract confidence percentages from each finding
+ 3. Apply confidence-aware aggregation: when multiple reviewers flag the same file:line, boost confidence by 10% per additional reviewer (cap at 100%)
+ <!-- Confidence threshold also in: shared/agents/reviewer.md, plugins/devflow-code-review/commands/code-review.md -->
+ 4. Maintain ≥80% confidence threshold in final output
+ 5. Categorize issues into 3 buckets (from review-methodology)
+ 6. Count by severity (CRITICAL, HIGH, MEDIUM, LOW)
+ 7. Determine merge recommendation based on blocking issues
+
+ **Issue Categories:**
+ - **Blocking** (Category 1): Issues in YOUR changes - CRITICAL/HIGH must block
+ - **Should-Fix** (Category 2): Issues in code you touched - HIGH/MEDIUM
+ - **Pre-existing** (Category 3): Legacy issues - informational only
+
+ **Merge Rules:**
+ | Condition | Recommendation |
+ |-----------|----------------|
+ | Any CRITICAL in blocking | BLOCK MERGE |
+ | Any HIGH in blocking | CHANGES REQUESTED |
+ | Only MEDIUM in blocking | APPROVED WITH COMMENTS |
+ | No blocking issues | APPROVED |
+
+ **Output:**
+ **CRITICAL**: Write the summary to disk using the Write tool:
+ 1. Create directory: `mkdir -p ${REVIEW_BASE_DIR}`
+ 2. Write to `${REVIEW_BASE_DIR}/review-summary.${TIMESTAMP}.md` using Write tool
+ 3. Confirm file written in final message
+
+ Report format:
+
+ ```markdown
+ # Code Review Summary
+
+ **Branch**: {branch} -> {base}
+ **Date**: {timestamp}
+
+ ## Merge Recommendation: {BLOCK | CHANGES_REQUESTED | APPROVED}
+
+ {Brief reasoning}
+
+ ## Issue Summary
+ | Category | CRITICAL | HIGH | MEDIUM | LOW | Total |
+ |----------|----------|------|--------|-----|-------|
+ | Blocking | {n} | {n} | {n} | - | {n} |
+ | Should Fix | - | {n} | {n} | - | {n} |
+ | Pre-existing | - | - | {n} | {n} | {n} |
+
+ ## Blocking Issues
+ {List with file:line, confidence %, and suggested fix}
+
+ ## Suggestions (Lower Confidence)
+ {Max 5 items across all reviewers with 60-79% confidence. Brief descriptions only.}
+
+ ## Action Plan
+ 1. {Priority fix}
+ 2. {Next fix}
+ ```
+
+ ---
+
+ ## Principles
+
+ 1. **No new research** - Only synthesize what agents found
+ 2. **Preserve references** - Keep file:line from source agents
+ 3. **Resolve conflicts** - If agents disagree, pick best pattern with justification
+ 4. **Actionable output** - Results must be executable by next phase
+ 5. **Accurate counts** - Issue counts must match reality (review mode)
+ 6. **Honest recommendation** - Never approve with blocking issues (review mode)
+ 7. **Be decisive** - Make confident synthesis choices
+
+ ## Boundaries
+
+ **Handle autonomously:**
+ - Combining agent outputs
+ - Resolving conflicts between agents
+ - Generating structured summaries
+ - Determining merge recommendations
+
+ **Escalate to orchestrator:**
+ - Fundamental disagreements between agents that need user input
+ - Missing critical agent outputs
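
Review mode's confidence-aware aggregation and merge rules lend themselves to a short sketch. It rests on several assumptions the agent file leaves open: the "10%" boost is read as ten percentage points, a merged finding starts from the highest confidence any reviewer reported and inherits that reviewer's severity, and a blocking list containing only LOW findings is treated as `APPROVED`. The `Finding` shape and function names are hypothetical.

```typescript
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";
type Recommendation = "BLOCK MERGE" | "CHANGES REQUESTED" | "APPROVED WITH COMMENTS" | "APPROVED";

interface Finding {
  file: string;
  line: number;
  severity: Severity;
  confidence: number; // percent, 0 to 100
}

// Merge findings that share a file:line, boost confidence by 10 points per
// additional reviewer (capped at 100), then keep only findings at >= 80.
function aggregateFindings(findings: Finding[]): Finding[] {
  const groups: { [location: string]: Finding[] } = {};
  for (const finding of findings) {
    const key = finding.file + ":" + finding.line;
    const group = groups[key] || [];
    group.push(finding);
    groups[key] = group;
  }
  const merged: Finding[] = [];
  for (const key of Object.keys(groups)) {
    const group = groups[key] || [];
    const first = group[0];
    if (first === undefined) continue;
    let strongest = first;
    for (const candidate of group) {
      if (candidate.confidence > strongest.confidence) strongest = candidate;
    }
    const boosted = Math.min(100, strongest.confidence + 10 * (group.length - 1));
    merged.push({ file: strongest.file, line: strongest.line, severity: strongest.severity, confidence: boosted });
  }
  return merged.filter((finding) => finding.confidence >= 80);
}

// Apply the merge rules table to the blocking category.
function recommend(blocking: Finding[]): Recommendation {
  const has = (severity: Severity) => blocking.some((finding) => finding.severity === severity);
  if (has("CRITICAL")) return "BLOCK MERGE";
  if (has("HIGH")) return "CHANGES REQUESTED";
  if (has("MEDIUM")) return "APPROVED WITH COMMENTS";
  return "APPROVED";
}
```

Under these assumptions, two reviewers reporting the same HIGH issue at 75% and 70% produce one finding at 85%, which clears the threshold and yields `CHANGES REQUESTED`, while a lone 60% finding is dropped from the main list.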
@@ -23,7 +23,11 @@ Investigate bugs by spawning a team of agents, each pursuing a different hypothe
 
  ## Phases
 
- ### Phase 1: Context Gathering
+ ### Phase 1: Load Project Knowledge
+
+ Read `.memory/knowledge/decisions.md` and `.memory/knowledge/pitfalls.md`. Known pitfalls from prior debugging sessions and code reviews can directly inform hypothesis generation — pass their content as context to investigators in Phase 2.
+
+ ### Phase 2: Context Gathering
 
  If `$ARGUMENTS` starts with `#`, fetch the GitHub issue:
 
@@ -39,7 +43,7 @@ Analyze the bug description (from arguments or issue) and identify 3-5 plausible
  - **Testable**: Can be confirmed or disproved by reading code/logs
  - **Distinct**: Does not overlap significantly with other hypotheses
 
- ### Phase 2: Spawn Investigation Team
+ ### Phase 3: Spawn Investigation Team
 
  Create an agent team with one investigator per hypothesis:
 
@@ -99,7 +103,7 @@ Spawn investigator teammates with self-contained prompts:
  (Add more investigators if bug complexity warrants 4-5 hypotheses — same pattern)
  ```
 
- ### Phase 3: Investigation
+ ### Phase 4: Investigation
 
  Teammates investigate in parallel:
  - Read relevant source files
@@ -108,7 +112,7 @@ Teammates investigate in parallel:
  - Look for edge cases and race conditions
  - Build evidence for or against their hypothesis
 
- ### Phase 4: Adversarial Debate
+ ### Phase 5: Adversarial Debate
 
  Lead initiates debate via broadcast:
 
@@ -135,7 +139,7 @@ Teammates challenge each other directly using SendMessage:
  - `SendMessage(type: "message", recipient: "team-lead", summary: "Updated hypothesis")`
  "I've updated my hypothesis based on investigator-b's finding at..."
 
- ### Phase 5: Convergence
+ ### Phase 6: Convergence
 
  After debate (max 2 rounds), lead collects results:
 
@@ -147,7 +151,7 @@ Lead broadcast:
  - PARTIAL: Some aspects confirmed, others not (details)"
  ```
 
- ### Phase 6: Cleanup
+ ### Phase 7: Cleanup
 
  Shut down all investigator teammates explicitly:
 
@@ -160,7 +164,7 @@ TeamDelete
  Verify TeamDelete succeeded. If failed, retry once after 5s. If retry fails, HALT.
  ```
 
- ### Phase 7: Report
+ ### Phase 8: Report
 
  Lead produces final report:
 
@@ -189,30 +193,40 @@ Lead produces final report:
  {HIGH/MEDIUM/LOW based on consensus strength}
  ```
 
+ ### Phase 9: Record Pitfall (if root cause found)
+
+ If root cause was identified with HIGH or MEDIUM confidence:
+ 1. Read `~/.claude/skills/knowledge-persistence/SKILL.md` and follow its extraction procedure to record pitfalls to `.memory/knowledge/pitfalls.md`
+ 2. Source field: `/debug {bug description}`
+
  ## Architecture
 
  ```
  /debug (orchestrator)
 
- ├─ Phase 1: Context gathering
+ ├─ Phase 1: Load Project Knowledge
+
+ ├─ Phase 2: Context gathering
  │ └─ Git agent (fetch issue, if #N provided)
 
- ├─ Phase 2: Spawn investigation team
+ ├─ Phase 3: Spawn investigation team
  │ └─ Create team with 3-5 hypothesis investigators
 
- ├─ Phase 3: Parallel investigation
+ ├─ Phase 4: Parallel investigation
  │ └─ Each teammate investigates independently
 
- ├─ Phase 4: Adversarial debate
+ ├─ Phase 5: Adversarial debate
  │ └─ Teammates challenge each other directly (max 2 rounds)
 
- ├─ Phase 5: Convergence
+ ├─ Phase 6: Convergence
  │ └─ Teammates submit final hypothesis status
 
- ├─ Phase 6: Cleanup
+ ├─ Phase 7: Cleanup
  │ └─ Shut down teammates, release resources
 
- └─ Phase 7: Root cause report with confidence level
+ ├─ Phase 8: Root cause report with confidence level
+
+ └─ Phase 9: Record Pitfall (inline, if root cause found)
  ```
 
  ## Principles