devflow-kit 1.4.0 → 1.6.0
- package/CHANGELOG.md +51 -0
- package/README.md +7 -3
- package/dist/commands/ambient.js +1 -1
- package/dist/commands/init.js +31 -2
- package/dist/commands/list.d.ts +21 -0
- package/dist/commands/list.js +71 -3
- package/dist/plugins.js +24 -24
- package/dist/utils/manifest.d.ts +45 -0
- package/dist/utils/manifest.js +100 -0
- package/dist/utils/post-install.js +6 -1
- package/package.json +1 -1
- package/plugins/devflow-accessibility/.claude-plugin/plugin.json +1 -1
- package/plugins/devflow-ambient/.claude-plugin/plugin.json +25 -4
- package/plugins/devflow-ambient/README.md +48 -29
- package/plugins/devflow-ambient/agents/coder.md +135 -0
- package/plugins/devflow-ambient/agents/reviewer.md +165 -0
- package/plugins/devflow-ambient/agents/scrutinizer.md +80 -0
- package/plugins/devflow-ambient/agents/shepherd.md +94 -0
- package/plugins/devflow-ambient/agents/simplifier.md +93 -0
- package/plugins/devflow-ambient/agents/skimmer.md +93 -0
- package/plugins/devflow-ambient/agents/validator.md +86 -0
- package/plugins/devflow-ambient/skills/ambient-router/SKILL.md +72 -28
- package/plugins/devflow-ambient/skills/ambient-router/references/skill-catalog.md +40 -34
- package/plugins/devflow-ambient/skills/debug-orchestration/SKILL.md +69 -0
- package/plugins/devflow-ambient/skills/implementation-orchestration/SKILL.md +92 -0
- package/plugins/devflow-ambient/skills/plan-orchestration/SKILL.md +71 -0
- package/plugins/devflow-audit-claude/.claude-plugin/plugin.json +10 -1
- package/plugins/devflow-audit-claude/commands/audit-claude.md +4 -0
- package/plugins/devflow-code-review/.claude-plugin/plugin.json +2 -1
- package/plugins/devflow-code-review/agents/reviewer.md +47 -9
- package/plugins/devflow-code-review/agents/synthesizer.md +12 -5
- package/plugins/devflow-code-review/commands/code-review-teams.md +43 -30
- package/plugins/devflow-code-review/commands/code-review.md +14 -2
- package/plugins/devflow-code-review/skills/knowledge-persistence/SKILL.md +128 -0
- package/plugins/devflow-code-review/skills/knowledge-persistence/references/examples.md +44 -0
- package/plugins/devflow-core-skills/.claude-plugin/plugin.json +2 -1
- package/plugins/devflow-core-skills/skills/docs-framework/SKILL.md +7 -1
- package/plugins/devflow-core-skills/skills/search-first/SKILL.md +133 -0
- package/plugins/devflow-core-skills/skills/search-first/references/evaluation-criteria.md +101 -0
- package/plugins/devflow-core-skills/skills/test-driven-development/SKILL.md +6 -5
- package/plugins/devflow-debug/.claude-plugin/plugin.json +5 -3
- package/plugins/devflow-debug/agents/synthesizer.md +211 -0
- package/plugins/devflow-debug/commands/debug-teams.md +28 -14
- package/plugins/devflow-debug/commands/debug.md +26 -12
- package/plugins/devflow-debug/skills/knowledge-persistence/SKILL.md +128 -0
- package/plugins/devflow-debug/skills/knowledge-persistence/references/examples.md +44 -0
- package/plugins/devflow-frontend-design/.claude-plugin/plugin.json +1 -1
- package/plugins/devflow-go/.claude-plugin/plugin.json +1 -1
- package/plugins/devflow-implement/.claude-plugin/plugin.json +2 -1
- package/plugins/devflow-implement/agents/coder.md +21 -13
- package/plugins/devflow-implement/agents/simplifier.md +32 -1
- package/plugins/devflow-implement/agents/skimmer.md +5 -0
- package/plugins/devflow-implement/agents/synthesizer.md +12 -5
- package/plugins/devflow-implement/commands/implement-teams.md +73 -60
- package/plugins/devflow-implement/commands/implement.md +45 -40
- package/plugins/devflow-implement/skills/knowledge-persistence/SKILL.md +128 -0
- package/plugins/devflow-implement/skills/knowledge-persistence/references/examples.md +44 -0
- package/plugins/devflow-java/.claude-plugin/plugin.json +1 -1
- package/plugins/devflow-python/.claude-plugin/plugin.json +1 -1
- package/plugins/devflow-react/.claude-plugin/plugin.json +1 -1
- package/plugins/devflow-resolve/.claude-plugin/plugin.json +4 -3
- package/plugins/devflow-resolve/agents/simplifier.md +32 -1
- package/plugins/devflow-resolve/commands/resolve-teams.md +16 -7
- package/plugins/devflow-resolve/commands/resolve.md +16 -7
- package/plugins/devflow-resolve/skills/knowledge-persistence/SKILL.md +128 -0
- package/plugins/devflow-resolve/skills/knowledge-persistence/references/examples.md +44 -0
- package/plugins/devflow-rust/.claude-plugin/plugin.json +1 -1
- package/plugins/devflow-self-review/.claude-plugin/plugin.json +10 -1
- package/plugins/devflow-self-review/agents/simplifier.md +32 -1
- package/plugins/devflow-self-review/commands/self-review.md +10 -4
- package/plugins/devflow-specify/.claude-plugin/plugin.json +1 -1
- package/plugins/devflow-specify/agents/skimmer.md +5 -0
- package/plugins/devflow-specify/agents/synthesizer.md +12 -5
- package/plugins/devflow-specify/commands/specify-teams.md +27 -20
- package/plugins/devflow-specify/commands/specify.md +26 -19
- package/plugins/devflow-typescript/.claude-plugin/plugin.json +1 -1
- package/scripts/hooks/ambient-prompt +8 -7
- package/scripts/hooks/session-start-memory +33 -3
- package/shared/agents/coder.md +21 -13
- package/shared/agents/reviewer.md +47 -9
- package/shared/agents/simplifier.md +32 -1
- package/shared/agents/skimmer.md +5 -0
- package/shared/agents/synthesizer.md +12 -5
- package/shared/skills/ambient-router/SKILL.md +72 -28
- package/shared/skills/ambient-router/references/skill-catalog.md +40 -34
- package/shared/skills/debug-orchestration/SKILL.md +69 -0
- package/shared/skills/docs-framework/SKILL.md +7 -1
- package/shared/skills/implementation-orchestration/SKILL.md +92 -0
- package/shared/skills/knowledge-persistence/SKILL.md +128 -0
- package/shared/skills/knowledge-persistence/references/examples.md +44 -0
- package/shared/skills/plan-orchestration/SKILL.md +71 -0
- package/shared/skills/search-first/SKILL.md +133 -0
- package/shared/skills/search-first/references/evaluation-criteria.md +101 -0
- package/shared/skills/test-driven-development/SKILL.md +6 -5
- package/plugins/devflow-ambient/commands/ambient.md +0 -110
@@ -4,7 +4,7 @@
 "author": {
 "name": "Dean0x"
 },
-"version": "1.
+"version": "1.6.0",
 "homepage": "https://github.com/dean0x/devflow",
 "repository": "https://github.com/dean0x/devflow",
 "license": "MIT",
@@ -24,6 +24,7 @@
 "git-workflow",
 "github-patterns",
 "input-validation",
+"search-first",
 "test-driven-development",
 "test-patterns"
 ]
@@ -39,7 +39,10 @@ All generated documentation lives under `.docs/` in the project root:
 .memory/
 ├── WORKING-MEMORY.md # Auto-maintained by Stop hook (overwritten)
 ├── PROJECT-PATTERNS.md # Accumulated patterns (merged across sessions)
-
+├── backup.json # Pre-compact git state snapshot
+└── knowledge/
+    ├── decisions.md # Architectural decisions (ADR-NNN format)
+    └── pitfalls.md # Known pitfalls (PF-NNN format)
 ```
 
 ---
@@ -97,6 +100,8 @@ source .devflow/scripts/docs-helpers.sh 2>/dev/null || {
 |-------|-----------------|----------|
 | Reviewer | `.docs/reviews/{branch-slug}/{type}-report.{timestamp}.md` | Creates new |
 | Working Memory | `.memory/WORKING-MEMORY.md` | Overwrites (auto-maintained by Stop hook) |
+| Knowledge (decisions) | `.memory/knowledge/decisions.md` | Append-only (ADR-NNN sequential IDs) |
+| Knowledge (pitfalls) | `.memory/knowledge/pitfalls.md` | Append-only (PF-NNN sequential IDs) |
 
 ### Agents That Don't Persist
 
@@ -125,6 +130,7 @@ When creating or modifying persisting agents:
 This framework is used by:
 - **Review agents**: Creates review reports
 - **Working Memory hooks**: Auto-maintains `.memory/WORKING-MEMORY.md`
+- **Command flows**: `/implement` appends ADRs to `decisions.md`; `/code-review`, `/debug`, `/resolve` append PFs to `pitfalls.md`
 
 All persisting agents should load this skill to ensure consistent documentation.
 
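The append-only convention with sequential `ADR-NNN` and `PF-NNN` IDs described in the hunks above could be sketched roughly as follows. This is an illustration only, not code from the package: the helper names `nextKnowledgeId` and `appendEntry`, the three-digit zero padding, and the `## ID: title` entry heading are all assumptions made for the example.

```typescript
// Hypothetical helpers, not part of devflow-kit. Assumes IDs such as ADR-001
// or PF-012 appear literally in the knowledge file, zero-padded to 3 digits.
type KnowledgePrefix = "ADR" | "PF";

function nextKnowledgeId(existing: string, prefix: KnowledgePrefix): string {
  const pattern = new RegExp("\\b" + prefix + "-(\\d+)\\b", "g");
  let max = 0;
  let match: RegExpExecArray | null;
  // Scan every ID with this prefix and remember the highest number seen.
  while ((match = pattern.exec(existing)) !== null) {
    max = Math.max(max, parseInt(match[1], 10));
  }
  return prefix + "-" + String(max + 1).padStart(3, "0");
}

// Append-only: the new entry is concatenated; existing entries are never rewritten.
// The "## ID: title" heading shape is an assumption for illustration.
function appendEntry(existing: string, prefix: KnowledgePrefix, title: string): string {
  const id = nextKnowledgeId(existing, prefix);
  const separator = existing.length === 0 || existing.endsWith("\n") ? "" : "\n";
  return existing + separator + "## " + id + ": " + title + "\n";
}
```

Deriving the next ID from the highest existing number, rather than from a count of entries, keeps IDs unique even if an entry was removed by hand.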
@@ -0,0 +1,133 @@
+---
+name: search-first
+description: >-
+  This skill should be used when the user asks to "add a utility", "create a helper",
+  "implement parsing", "build a wrapper", or writes infrastructure/utility code that
+  may already exist as a well-maintained package. Enforces research before building.
+user-invocable: false
+allowed-tools: Read, Grep, Glob
+---
+
+# Search-First
+
+Research before building. Check if a battle-tested solution exists before writing custom utility code.
+
+## Iron Law
+
+> **RESEARCH BEFORE BUILDING**
+>
+> Never write custom utility code without first checking if a battle-tested solution
+> exists. The best code is code you don't write. A maintained package with thousands
+> of users will always beat a hand-rolled utility in reliability, edge cases, and
+> long-term maintenance.
+
+## When This Skill Activates
+
+**Triggers** — creating or modifying code that:
+- Parses, formats, or validates data (dates, URLs, emails, UUIDs, etc.)
+- Implements common algorithms (sorting, diffing, hashing, encoding)
+- Wraps HTTP clients, retries, rate limiting, caching
+- Handles file system operations beyond basic read/write
+- Implements CLI argument parsing, logging, or configuration
+- Creates test utilities (mocking, fixtures, assertions)
+
+**Does NOT trigger** for:
+- Domain-specific business logic unique to this project
+- Glue code connecting existing components
+- Trivial operations (< 5 lines, single-use)
+- Code that intentionally avoids external dependencies (e.g., zero-dep libraries)
+
+---
+
+## Research Process
+
+### Phase 1: Need Analysis
+
+Before searching, define what you actually need:
+
+```
+Need: {one-sentence description of the capability}
+Constraints: {runtime, bundle size, license, zero-dep requirement}
+Must-haves: {non-negotiable requirements}
+Nice-to-haves: {optional features}
+```
+
+### Phase 2: Search
+
+Delegate research to an Explore subagent to keep main session context clean.
+
+**Spawn an Explore agent** with this prompt template:
+
+```
+Task(subagent_type="Explore"):
+"Research existing solutions for: {need description}
+
+Search for:
+1. npm/PyPI/crates packages that solve this (check package.json/requirements.txt for ecosystem)
+2. Existing utilities in this codebase (grep for related function names)
+3. Framework built-ins that already handle this
+
+For each candidate, find:
+- Package name and weekly downloads (if applicable)
+- Last publish date and maintenance status
+- Bundle size / dependency count
+- API surface relevant to our need
+- License compatibility
+
+Return top 3 candidates with pros/cons, or confirm nothing suitable exists."
+```
+
+### Phase 3: Evaluate
+
+Score each candidate against evaluation criteria. See `references/evaluation-criteria.md` for the full matrix.
+
+Quick checklist:
+- [ ] Last published within 12 months
+- [ ] Weekly downloads > 1,000 (npm) or equivalent traction
+- [ ] No known vulnerabilities (check Snyk/npm audit)
+- [ ] API fits the use case without heavy wrapping
+- [ ] License compatible with project (MIT/Apache/BSD preferred)
+- [ ] Bundle size acceptable for the project context
+
+### Phase 4: Decide
+
+Choose one of four outcomes:
+
+| Decision | When | Action |
+|----------|------|--------|
+| **Adopt** | Exact match, well-maintained, good API | Install and use directly |
+| **Extend** | Partial match, needs thin wrapper | Install + write minimal adapter |
+| **Compose** | No single package fits, but 2-3 small ones combine well | Install multiple, write glue code |
+| **Build** | Nothing fits, or dependency cost exceeds value | Write custom, document why |
+
+**Document the decision** in a code comment at the usage site:
+
+```typescript
+// search-first: Adopted date-fns for date formatting (2M weekly downloads, 30KB)
+// search-first: Built custom — no package handles our specific wire format
+```
+
+---
+
+## Anti-Patterns
+
+| Anti-Pattern | Correct Approach |
+|-------------|-----------------|
+| Adding a dependency for 5 lines of trivial code | Build — dependency overhead exceeds value |
+| Choosing the most popular package without checking fit | Evaluate API fit, not just popularity |
+| Wrapping a package so heavily it obscures the original | If wrapping > 50% of original API, reconsider |
+| Skipping research because "I know how to build this" | Research anyway — maintenance cost matters more than initial build |
+| Installing a massive framework for one utility function | Look for focused, single-purpose packages |
+
+## Scope Limiter
+
+This skill concerns **utility and infrastructure code** only:
+- Data transformation, validation, formatting
+- Network operations, retries, caching
+- CLI tooling, logging, configuration
+- Test utilities and helpers
+
+It does NOT apply to **domain-specific business logic** where:
+- The logic encodes unique business rules
+- No generic solution could exist
+- The code is inherently project-specific
@@ -0,0 +1,101 @@
+# Search-First — Evaluation Criteria
+
+Detailed package evaluation criteria and decision matrix for the 4-outcome model.
+
+## Evaluation Matrix
+
+Score each candidate on these axes (1-5 scale):
+
+| Criterion | Weight | 1 (Poor) | 3 (Acceptable) | 5 (Excellent) |
+|-----------|--------|-----------|-----------------|----------------|
+| **Maintenance** | High | No commits in 2+ years | Active, yearly releases | Regular releases, responsive maintainer |
+| **Adoption** | Medium | < 100 weekly downloads | 1K-10K weekly downloads | > 100K weekly downloads |
+| **API Fit** | High | Needs heavy wrapping | Partial fit, thin adapter needed | Direct use, clean API |
+| **Bundle Size** | Medium | > 500KB | 50-500KB | < 50KB |
+| **Security** | High | Known vulnerabilities | No known issues, few dependencies | Audited, zero/minimal dependencies |
+| **License** | Required | GPL/AGPL (restrictive) | LGPL (conditional) | MIT/Apache/BSD (permissive) |
+
+**Minimum thresholds**: License must be compatible. Security must be ≥ 3. All others are trade-offs.
+
+## Decision Matrix
+
+### Adopt (score ≥ 20/25, API Fit ≥ 4)
+
+The package directly solves the problem with minimal integration code.
+
+**Example**: Using `zod` for schema validation — exact fit, massive adoption, tiny bundle.
+
+```
+✅ Adopt: zod v3.22
+- Maintenance: 5 (monthly releases)
+- Adoption: 5 (4M weekly downloads)
+- API Fit: 5 (direct use for all validation)
+- Bundle Size: 4 (57KB)
+- Security: 5 (zero dependencies)
+- Total: 24/25
+```
+
+### Extend (score ≥ 15/25, API Fit ≥ 2)
+
+The package handles 60-80% of the need. Write a thin adapter for the rest.
+
+**Example**: Using `got` for HTTP but wrapping it with project-specific retry and auth logic.
+
+```
+✅ Extend: got v14
+- Maintenance: 4 (active)
+- Adoption: 5 (8M weekly downloads)
+- API Fit: 3 (need custom retry wrapper)
+- Bundle Size: 3 (150KB)
+- Security: 4 (minimal deps)
+- Total: 19/25
+Adapter: ~30 lines wrapping retry + auth headers
+```
+
+### Compose (no single package fits, but small packages combine)
+
+Two or three focused packages together solve the problem better than one large framework.
+
+**Example**: `ms` (time parsing) + `p-retry` (retry logic) + `quick-lru` (caching) instead of a monolithic HTTP client framework.
+
+**Rules for Compose**:
+- Maximum 3 packages in a composition
+- Each package must be focused (single responsibility)
+- Total combined bundle < what a monolithic alternative would cost
+- Glue code should be < 50 lines
+
+### Build (nothing fits, or dependency cost > value)
+
+Write custom code when:
+- No package scores ≥ 15/25
+- The code is < 50 lines and trivial
+- Zero-dependency constraint is explicit
+- The domain is too specific for generic packages
+
+**Required**: Document why Build was chosen:
+
+```typescript
+// search-first: Built custom — our wire format uses non-standard
+// ISO-8601 extensions that no date library handles correctly.
+// Evaluated: date-fns (no custom format support), luxon (500KB overhead),
+// dayjs (close but missing timezone edge case).
+```
+
+## Ecosystem-Specific Hints
+
+### Node.js / TypeScript
+- Check npm: `https://www.npmjs.com/package/{name}`
+- Bundle size: `https://bundlephobia.com/package/{name}`
+- Check if Node.js built-ins handle it (`node:crypto`, `node:url`, `node:path`)
+
+### Python
+- Check PyPI: `https://pypi.org/project/{name}`
+- Check if stdlib handles it (`urllib`, `json`, `pathlib`, `dataclasses`)
+
+### Rust
+- Check crates.io: `https://crates.io/crates/{name}`
+- Check if std handles it
+
+### Go
+- Check pkg.go.dev
+- Go standard library is extensive — check stdlib first
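As a rough illustration of how the Adopt and Extend thresholds in the evaluation criteria above might be applied per candidate, here is a minimal sketch. It is not code from the package. The `Scores` shape, the `decide` function, and the `Reject` outcome are hypothetical names, and treating License as a pass/fail gate (rather than a sixth scored axis) is an assumption inferred from the totals being out of 25.

```typescript
// Illustrative sketch only: names and shapes are assumptions, not devflow-kit code.
interface Scores {
  maintenance: number; // each axis is scored 1-5
  adoption: number;
  apiFit: number;
  bundleSize: number;
  security: number;
}

// "Reject" means this candidate is out; Compose vs. Build is then decided
// across the remaining candidates, which this sketch does not model.
type Outcome = "Adopt" | "Extend" | "Reject";

function decide(scores: Scores, licenseCompatible: boolean): Outcome {
  // Minimum thresholds: license must be compatible, security must be >= 3.
  if (!licenseCompatible || scores.security < 3) return "Reject";
  const total =
    scores.maintenance + scores.adoption + scores.apiFit + scores.bundleSize + scores.security;
  if (total >= 20 && scores.apiFit >= 4) return "Adopt";
  if (total >= 15 && scores.apiFit >= 2) return "Extend";
  return "Reject";
}

// The two worked examples from the criteria file:
const zod: Scores = { maintenance: 5, adoption: 5, apiFit: 5, bundleSize: 4, security: 5 };
const got: Scores = { maintenance: 4, adoption: 5, apiFit: 3, bundleSize: 3, security: 4 };
console.log(decide(zod, true)); // Adopt (24/25, API Fit 5)
console.log(decide(got, true)); // Extend (19/25, API Fit 3)
```

Checking the hard gates first keeps a high total from masking a disqualifying license or security score.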
@@ -91,7 +91,7 @@ See `references/rationalization-prevention.md` for extended examples with code.
 
 ## Process Enforcement
 
-When implementing any feature under ambient
+When implementing any feature under ambient IMPLEMENT/GUIDED or IMPLEMENT/ORCHESTRATED:
 
 1. **Identify the first behavior** — What is the simplest thing this feature must do?
 2. **Write the test** — Describe that behavior as a failing test
@@ -130,7 +130,8 @@ When skipping TDD, never rationalize. State clearly: "Skipping TDD because: [spe
 
 ## Integration with Ambient Mode
 
-- **
-- **
-- **
-- **DEBUG/GUIDED** → TDD applies to the fix: write a test that reproduces the bug first, then fix.
+- **IMPLEMENT/GUIDED** → TDD enforced in main session. Write the failing test before production code. Skill loaded directly.
+- **IMPLEMENT/ORCHESTRATED** → TDD enforced via Coder agent (skill in Coder frontmatter). Every implementation gets test-first treatment.
+- **IMPLEMENT/QUICK** → TDD skipped (trivial single-file edit).
+- **DEBUG/GUIDED** → TDD applies to the fix in main session: write a test that reproduces the bug first, then fix.
+- **DEBUG/ORCHESTRATED** → TDD applies to the fix: write a test that reproduces the bug first, then fix.
@@ -4,7 +4,7 @@
 "author": {
 "name": "Dean0x"
 },
-"version": "1.
+"version": "1.6.0",
 "homepage": "https://github.com/dean0x/devflow",
 "repository": "https://github.com/dean0x/devflow",
 "license": "MIT",
@@ -15,10 +15,12 @@
 "agent-teams"
 ],
 "agents": [
-"git"
+"git",
+"synthesizer"
 ],
 "skills": [
 "agent-teams",
-"git-safety"
+"git-safety",
+"knowledge-persistence"
 ]
 }
@@ -0,0 +1,211 @@
+---
+name: Synthesizer
+description: Combines outputs from multiple agents into actionable summaries (modes: exploration, planning, review)
+model: haiku
+skills: review-methodology, docs-framework
+---
+
+# Synthesizer Agent
+
+You are a synthesis specialist. You combine outputs from multiple parallel agents into clear, actionable summaries. You operate in three modes: exploration, planning, and review.
+
+## Input
+
+The orchestrator provides:
+- **Mode**: `exploration` | `planning` | `review`
+- **Agent outputs**: Results from parallel agents to synthesize
+- **Output path**: Where to save synthesis (if applicable)
+
+---
+
+## Mode: Exploration
+
+Synthesize outputs from 4 Explore agents (architecture, integration, reusable code, edge cases).
+
+**Process:**
+1. Extract key findings from each explorer
+2. Identify patterns that appear across multiple explorations
+3. Resolve conflicts if explorers found contradictory patterns
+4. Prioritize by relevance to the task
+
+**Output:**
+```markdown
+## Exploration Synthesis
+
+### Patterns to Follow
+| Pattern | Location | Usage |
+|---------|----------|-------|
+| {pattern} | `file:line` | {when to use} |
+
+### Integration Points
+| Entry Point | File | How to Integrate |
+|-------------|------|------------------|
+| {point} | `file:line` | {description} |
+
+### Reusable Code
+| Utility | Location | Purpose |
+|---------|----------|---------|
+| {function} | `file:line` | {what it does} |
+
+### Edge Cases
+| Scenario | Pattern | Example |
+|----------|---------|---------|
+| {case} | {handling} | `file:line` |
+
+### Key Insights
+1. {insight}
+2. {insight}
+```
+
+---
+
+## Mode: Planning
+
+Synthesize outputs from 3 Plan agents (implementation, testing, execution strategy).
+
+**Process:**
+1. Merge implementation steps with testing strategy
+2. Apply execution strategy analysis to determine Coder deployment
+3. Identify dependencies between steps
+4. Assess context risk based on file count and module breadth
+
+**Execution Strategy Decision:**
+
+Analyze 3 axes to determine strategy:
+
+| Axis | Signals | Impact |
+|------|---------|--------|
+| **Artifact Independence** | Shared contracts? Integration points? Cross-file dependencies? | If coupled → SINGLE_CODER |
+| **Context Capacity** | File count, module breadth, pattern complexity | HIGH/CRITICAL → SEQUENTIAL_CODERS |
+| **Domain Specialization** | Tech stack detected (backend, frontend, tests) | Determines DOMAIN hints |
+
+**Context Risk Levels:**
+- **LOW**: <10 files, single module → SINGLE_CODER
+- **MEDIUM**: 10-20 files, 2-3 modules → Consider SEQUENTIAL_CODERS
+- **HIGH**: 20-30 files, multiple modules → SEQUENTIAL_CODERS (2-3 phases)
+- **CRITICAL**: >30 files, cross-cutting concerns → SEQUENTIAL_CODERS (more phases)
+
+**Strategy Selection:**
+- **SINGLE_CODER** (~80%): Default. Coherent A→Z implementation. Best for consistency in naming, patterns, error handling.
+- **SEQUENTIAL_CODERS** (~15%): Context overflow risk or layered dependencies. Split into phases with handoff summaries.
+- **PARALLEL_CODERS** (~5%): True artifact independence - no shared contracts, no integration points. Rare.
+
+**Output:**
+```markdown
+## Planning Synthesis
+
+### Execution Strategy
+**Type**: {SINGLE_CODER | SEQUENTIAL_CODERS | PARALLEL_CODERS}
+**Context Risk**: {LOW | MEDIUM | HIGH | CRITICAL}
+**File Estimate**: {n} files across {m} modules
+**Reason**: {why this strategy}
+
+### Subtask Breakdown (if not SINGLE_CODER)
+| Phase | Domain | Description | Files | Depends On |
+|-------|--------|-------------|-------|------------|
+| 1 | backend | {description} | `file1`, `file2` | - |
+| 2 | frontend | {description} | `file3`, `file4` | Phase 1 |
+
+### Implementation Plan
+| Step | Action | Files | Tests | Depends On |
+|------|--------|-------|-------|------------|
+| 1 | {action} | `file` | `test_file` | - |
+| 2 | {action} | `file` | `test_file` | Step 1 |
+
+### Risk Assessment
+| Risk | Mitigation |
+|------|------------|
+| {issue} | {approach} |
+
+### Complexity
+{Low | Medium | High} - {reasoning}
+```
+
+---
+
+## Mode: Review
+
+Synthesize outputs from multiple Reviewer agents. Apply strict merge rules.
+
+**Process:**
+1. Read all review reports from `${REVIEW_BASE_DIR}/*.md` (exclude your own output `review-summary.*.md`)
+2. Extract confidence percentages from each finding
+3. Apply confidence-aware aggregation: when multiple reviewers flag the same file:line, boost confidence by 10% per additional reviewer (cap at 100%)
+<!-- Confidence threshold also in: shared/agents/reviewer.md, plugins/devflow-code-review/commands/code-review.md -->
+4. Maintain ≥80% confidence threshold in final output
+5. Categorize issues into 3 buckets (from review-methodology)
+6. Count by severity (CRITICAL, HIGH, MEDIUM, LOW)
+7. Determine merge recommendation based on blocking issues
+
+**Issue Categories:**
+- **Blocking** (Category 1): Issues in YOUR changes - CRITICAL/HIGH must block
+- **Should-Fix** (Category 2): Issues in code you touched - HIGH/MEDIUM
+- **Pre-existing** (Category 3): Legacy issues - informational only
+
+**Merge Rules:**
+| Condition | Recommendation |
+|-----------|----------------|
+| Any CRITICAL in blocking | BLOCK MERGE |
+| Any HIGH in blocking | CHANGES REQUESTED |
+| Only MEDIUM in blocking | APPROVED WITH COMMENTS |
+| No blocking issues | APPROVED |
+
+**Output:**
+**CRITICAL**: Write the summary to disk using the Write tool:
+1. Create directory: `mkdir -p ${REVIEW_BASE_DIR}`
+2. Write to `${REVIEW_BASE_DIR}/review-summary.${TIMESTAMP}.md` using Write tool
+3. Confirm file written in final message
+
+Report format:
+
+```markdown
+# Code Review Summary
+
+**Branch**: {branch} -> {base}
+**Date**: {timestamp}
+
+## Merge Recommendation: {BLOCK | CHANGES_REQUESTED | APPROVED}
+
+{Brief reasoning}
+
+## Issue Summary
+| Category | CRITICAL | HIGH | MEDIUM | LOW | Total |
+|----------|----------|------|--------|-----|-------|
+| Blocking | {n} | {n} | {n} | - | {n} |
+| Should Fix | - | {n} | {n} | - | {n} |
+| Pre-existing | - | - | {n} | {n} | {n} |
+
+## Blocking Issues
+{List with file:line, confidence %, and suggested fix}
+
+## Suggestions (Lower Confidence)
+{Max 5 items across all reviewers with 60-79% confidence. Brief descriptions only.}
+
+## Action Plan
+1. {Priority fix}
+2. {Next fix}
+```
+
+---
+
+## Principles
+
+1. **No new research** - Only synthesize what agents found
+2. **Preserve references** - Keep file:line from source agents
+3. **Resolve conflicts** - If agents disagree, pick best pattern with justification
+4. **Actionable output** - Results must be executable by next phase
+5. **Accurate counts** - Issue counts must match reality (review mode)
+6. **Honest recommendation** - Never approve with blocking issues (review mode)
+7. **Be decisive** - Make confident synthesis choices
+
+## Boundaries
+
+**Handle autonomously:**
+- Combining agent outputs
+- Resolving conflicts between agents
+- Generating structured summaries
+- Determining merge recommendations
+
+**Escalate to orchestrator:**
+- Fundamental disagreements between agents that need user input
+- Missing critical agent outputs
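The Review-mode rules in the Synthesizer agent above (confidence boost for agreeing reviewers, the 80% threshold, and the merge-rule table) can be sketched roughly as follows. This is an illustration under stated assumptions, not the agent's actual logic, which is carried out by a prompted model rather than code. The `Finding` shape and both function names are hypothetical, the "10%" boost is read as 10 percentage points, each finding at a location is assumed to come from a different reviewer, and blocking findings of LOW severity (which the merge-rule table does not cover) are treated like MEDIUM.

```typescript
// Illustrative sketch: types and function names are assumptions, not devflow-kit code.
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

interface Finding {
  location: string;   // "file:line"
  severity: Severity;
  confidence: number; // percent, 0-100
  blocking: boolean;  // true for Category 1 (issue in your changes)
}

// Group findings by file:line, add 10 points per additional reviewer (capped
// at 100), and keep only merged findings at or above the threshold.
function aggregate(findings: Finding[], threshold: number = 80): Finding[] {
  const groups = new Map<string, Finding[]>();
  for (const f of findings) {
    const group = groups.get(f.location) ?? [];
    group.push(f);
    groups.set(f.location, group);
  }
  const merged: Finding[] = [];
  groups.forEach((group) => {
    // Assumption: the highest-confidence report is used as the base finding.
    const base = group.reduce((best, f) => (f.confidence > best.confidence ? f : best));
    const confidence = Math.min(100, base.confidence + 10 * (group.length - 1));
    if (confidence >= threshold) merged.push({ ...base, confidence });
  });
  return merged;
}

// Apply the merge-rule table to the aggregated findings.
function recommend(findings: Finding[]): string {
  const blocking = findings.filter((f) => f.blocking);
  if (blocking.some((f) => f.severity === "CRITICAL")) return "BLOCK MERGE";
  if (blocking.some((f) => f.severity === "HIGH")) return "CHANGES REQUESTED";
  if (blocking.length > 0) return "APPROVED WITH COMMENTS"; // MEDIUM (and, by assumption, LOW)
  return "APPROVED";
}
```

Filtering by threshold before applying the merge rules means a low-confidence CRITICAL finding from a single reviewer does not block a merge on its own, which matches the order of steps 3, 4, and 7 in the process list.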
@@ -23,7 +23,11 @@ Investigate bugs by spawning a team of agents, each pursuing a different hypothe
 
 ## Phases
 
-### Phase 1:
+### Phase 1: Load Project Knowledge
+
+Read `.memory/knowledge/decisions.md` and `.memory/knowledge/pitfalls.md`. Known pitfalls from prior debugging sessions and code reviews can directly inform hypothesis generation — pass their content as context to investigators in Phase 2.
+
+### Phase 2: Context Gathering
 
 If `$ARGUMENTS` starts with `#`, fetch the GitHub issue:
 
@@ -39,7 +43,7 @@ Analyze the bug description (from arguments or issue) and identify 3-5 plausible
 - **Testable**: Can be confirmed or disproved by reading code/logs
 - **Distinct**: Does not overlap significantly with other hypotheses
 
-### Phase
+### Phase 3: Spawn Investigation Team
 
 Create an agent team with one investigator per hypothesis:
 
@@ -99,7 +103,7 @@ Spawn investigator teammates with self-contained prompts:
 (Add more investigators if bug complexity warrants 4-5 hypotheses — same pattern)
 ```
 
-### Phase
+### Phase 4: Investigation
 
 Teammates investigate in parallel:
 - Read relevant source files
@@ -108,7 +112,7 @@ Teammates investigate in parallel:
 - Look for edge cases and race conditions
 - Build evidence for or against their hypothesis
 
-### Phase
+### Phase 5: Adversarial Debate
 
 Lead initiates debate via broadcast:
 
@@ -135,7 +139,7 @@ Teammates challenge each other directly using SendMessage:
 - `SendMessage(type: "message", recipient: "team-lead", summary: "Updated hypothesis")`
 "I've updated my hypothesis based on investigator-b's finding at..."
 
-### Phase
+### Phase 6: Convergence
 
 After debate (max 2 rounds), lead collects results:
 
@@ -147,7 +151,7 @@ Lead broadcast:
 - PARTIAL: Some aspects confirmed, others not (details)"
 ```
 
-### Phase
+### Phase 7: Cleanup
 
 Shut down all investigator teammates explicitly:
 
@@ -160,7 +164,7 @@ TeamDelete
 Verify TeamDelete succeeded. If failed, retry once after 5s. If retry fails, HALT.
 ```
 
-### Phase
+### Phase 8: Report
 
 Lead produces final report:
 
@@ -189,30 +193,40 @@ Lead produces final report:
 {HIGH/MEDIUM/LOW based on consensus strength}
 ```
 
+### Phase 9: Record Pitfall (if root cause found)
+
+If root cause was identified with HIGH or MEDIUM confidence:
+1. Read `~/.claude/skills/knowledge-persistence/SKILL.md` and follow its extraction procedure to record pitfalls to `.memory/knowledge/pitfalls.md`
+2. Source field: `/debug {bug description}`
+
 ## Architecture
 
 ```
 /debug (orchestrator)
 │
-├─ Phase 1:
+├─ Phase 1: Load Project Knowledge
+│
+├─ Phase 2: Context gathering
 │ └─ Git agent (fetch issue, if #N provided)
 │
-├─ Phase
+├─ Phase 3: Spawn investigation team
 │ └─ Create team with 3-5 hypothesis investigators
 │
-├─ Phase
+├─ Phase 4: Parallel investigation
 │ └─ Each teammate investigates independently
 │
-├─ Phase
+├─ Phase 5: Adversarial debate
 │ └─ Teammates challenge each other directly (max 2 rounds)
 │
-├─ Phase
+├─ Phase 6: Convergence
 │ └─ Teammates submit final hypothesis status
 │
-├─ Phase
+├─ Phase 7: Cleanup
 │ └─ Shut down teammates, release resources
 │
-
+├─ Phase 8: Root cause report with confidence level
+│
+└─ Phase 9: Record Pitfall (inline, if root cause found)
 ```
 
 ## Principles