devflow-kit 1.4.0 → 1.5.0

Files changed (45)
  1. package/CHANGELOG.md +14 -0
  2. package/README.md +2 -1
  3. package/dist/commands/init.js +29 -0
  4. package/dist/commands/list.d.ts +21 -0
  5. package/dist/commands/list.js +71 -3
  6. package/dist/plugins.js +1 -1
  7. package/dist/utils/manifest.d.ts +45 -0
  8. package/dist/utils/manifest.js +100 -0
  9. package/package.json +1 -1
  10. package/plugins/devflow-accessibility/.claude-plugin/plugin.json +1 -1
  11. package/plugins/devflow-ambient/.claude-plugin/plugin.json +1 -1
  12. package/plugins/devflow-ambient/skills/ambient-router/SKILL.md +1 -1
  13. package/plugins/devflow-ambient/skills/ambient-router/references/skill-catalog.md +1 -0
  14. package/plugins/devflow-audit-claude/.claude-plugin/plugin.json +1 -1
  15. package/plugins/devflow-code-review/.claude-plugin/plugin.json +1 -1
  16. package/plugins/devflow-code-review/agents/reviewer.md +42 -5
  17. package/plugins/devflow-code-review/agents/synthesizer.md +12 -5
  18. package/plugins/devflow-code-review/commands/code-review.md +4 -1
  19. package/plugins/devflow-core-skills/.claude-plugin/plugin.json +2 -1
  20. package/plugins/devflow-core-skills/skills/search-first/SKILL.md +133 -0
  21. package/plugins/devflow-core-skills/skills/search-first/references/evaluation-criteria.md +101 -0
  22. package/plugins/devflow-debug/.claude-plugin/plugin.json +1 -1
  23. package/plugins/devflow-frontend-design/.claude-plugin/plugin.json +1 -1
  24. package/plugins/devflow-go/.claude-plugin/plugin.json +1 -1
  25. package/plugins/devflow-implement/.claude-plugin/plugin.json +1 -1
  26. package/plugins/devflow-implement/agents/coder.md +16 -13
  27. package/plugins/devflow-implement/agents/synthesizer.md +12 -5
  28. package/plugins/devflow-implement/commands/implement-teams.md +1 -5
  29. package/plugins/devflow-implement/commands/implement.md +1 -5
  30. package/plugins/devflow-java/.claude-plugin/plugin.json +1 -1
  31. package/plugins/devflow-python/.claude-plugin/plugin.json +1 -1
  32. package/plugins/devflow-react/.claude-plugin/plugin.json +1 -1
  33. package/plugins/devflow-resolve/.claude-plugin/plugin.json +1 -1
  34. package/plugins/devflow-rust/.claude-plugin/plugin.json +1 -1
  35. package/plugins/devflow-self-review/.claude-plugin/plugin.json +1 -1
  36. package/plugins/devflow-specify/.claude-plugin/plugin.json +1 -1
  37. package/plugins/devflow-specify/agents/synthesizer.md +12 -5
  38. package/plugins/devflow-typescript/.claude-plugin/plugin.json +1 -1
  39. package/shared/agents/coder.md +16 -13
  40. package/shared/agents/reviewer.md +42 -5
  41. package/shared/agents/synthesizer.md +12 -5
  42. package/shared/skills/ambient-router/SKILL.md +1 -1
  43. package/shared/skills/ambient-router/references/skill-catalog.md +1 -0
  44. package/shared/skills/search-first/SKILL.md +133 -0
  45. package/shared/skills/search-first/references/evaluation-criteria.md +101 -0
@@ -128,10 +128,14 @@ Analyze 3 axes to determine strategy:
  Synthesize outputs from multiple Reviewer agents. Apply strict merge rules.
 
  **Process:**
- 1. Read all review reports from `${REVIEW_BASE_DIR}/*-report.*.md`
- 2. Categorize issues into 3 buckets (from review-methodology)
- 3. Count by severity (CRITICAL, HIGH, MEDIUM, LOW)
- 4. Determine merge recommendation based on blocking issues
+ 1. Read all review reports from `${REVIEW_BASE_DIR}/*.md` (exclude your own output `review-summary.*.md`)
+ 2. Extract confidence percentages from each finding
+ 3. Apply confidence-aware aggregation: when multiple reviewers flag the same file:line, boost confidence by 10% per additional reviewer (cap at 100%)
+ <!-- Confidence threshold also in: shared/agents/reviewer.md, plugins/devflow-code-review/commands/code-review.md -->
+ 4. Maintain ≥80% confidence threshold in final output
+ 5. Categorize issues into 3 buckets (from review-methodology)
+ 6. Count by severity (CRITICAL, HIGH, MEDIUM, LOW)
+ 7. Determine merge recommendation based on blocking issues
 
  **Issue Categories:**
  - **Blocking** (Category 1): Issues in YOUR changes - CRITICAL/HIGH must block
@@ -172,7 +176,10 @@ Report format:
  | Pre-existing | - | - | {n} | {n} | {n} |
 
  ## Blocking Issues
- {List with file:line and suggested fix}
+ {List with file:line, confidence %, and suggested fix}
+
+ ## Suggestions (Lower Confidence)
+ {Max 5 items across all reviewers with 60-79% confidence. Brief descriptions only.}
 
  ## Action Plan
  1. {Priority fix}
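
The confidence-aware aggregation added in the two hunks above could be sketched roughly as follows. This is an illustration under stated assumptions, not the package's implementation: the `Finding` shape and all function and constant names are hypothetical, the 10% boost is read as ten additive percentage points, and the highest individual confidence is taken as the base before boosting. The diff itself does not specify either choice.

```typescript
// Hypothetical shape of one reviewer finding; the field names are assumptions.
interface Finding {
  file: string;
  line: number;
  confidence: number; // percentage, 0-100
  description: string;
}

interface MergedFinding extends Finding {
  reviewers: number; // how many reviewers flagged this file:line
}

const BOOST_PER_EXTRA_REVIEWER = 10; // percentage points (assumed additive)
const REPORT_THRESHOLD = 80; // findings at or above this go in the main report
const SUGGESTION_FLOOR = 60; // 60-79% may appear under "Suggestions"
const MAX_SUGGESTIONS = 5;

function aggregateFindings(reports: Finding[][]): {
  reported: MergedFinding[];
  suggestions: MergedFinding[];
} {
  const merged = new Map<string, MergedFinding>();

  for (const report of reports) {
    const seenInReport = new Set<string>(); // count each reviewer once per location
    for (const finding of report) {
      const key = `${finding.file}:${finding.line}`;
      if (seenInReport.has(key)) continue;
      seenInReport.add(key);

      const existing = merged.get(key);
      if (existing === undefined) {
        merged.set(key, { ...finding, reviewers: 1 });
      } else {
        existing.reviewers += 1;
        // Keep the highest individual confidence as the base (an assumption).
        if (finding.confidence > existing.confidence) {
          existing.confidence = finding.confidence;
          existing.description = finding.description;
        }
      }
    }
  }

  // Apply the boost once all reviewers are counted, capped at 100%.
  const boosted = Array.from(merged.values())
    .map((f) => ({
      ...f,
      confidence: Math.min(
        100,
        f.confidence + BOOST_PER_EXTRA_REVIEWER * (f.reviewers - 1),
      ),
    }))
    .sort((a, b) => b.confidence - a.confidence);

  return {
    reported: boosted.filter((f) => f.confidence >= REPORT_THRESHOLD),
    suggestions: boosted
      .filter(
        (f) => f.confidence >= SUGGESTION_FLOOR && f.confidence < REPORT_THRESHOLD,
      )
      .slice(0, MAX_SUGGESTIONS),
  };
}
```

Under these assumptions, a finding flagged by two reviewers at 75% and 70% would be reported at 85%, while a single 65% finding would land in the suggestions list.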
@@ -54,7 +54,7 @@ Based on classified intent, read the following skills to inform your response.
 
  | Intent | Primary Skills | Secondary (if file type matches) |
  |--------|---------------|----------------------------------|
- | **BUILD** | test-driven-development, implementation-patterns | typescript (.ts), react (.tsx/.jsx), go (.go), java (.java), python (.py), rust (.rs), frontend-design (CSS/UI), input-validation (forms/API), security-patterns (auth/crypto) |
+ | **BUILD** | test-driven-development, implementation-patterns, search-first | typescript (.ts), react (.tsx/.jsx), go (.go), java (.java), python (.py), rust (.rs), frontend-design (CSS/UI), input-validation (forms/API), security-patterns (auth/crypto) |
  | **DEBUG** | test-patterns, core-patterns | git-safety (if git operations involved) |
  | **REVIEW** | self-review, core-patterns | test-patterns |
  | **PLAN** | implementation-patterns | core-patterns |
@@ -12,6 +12,7 @@ These skills may be loaded during GUIDED-depth ambient routing.
  |-------|-------------|---------------|
  | test-driven-development | Always for BUILD | `*.ts`, `*.tsx`, `*.js`, `*.jsx`, `*.py` |
  | implementation-patterns | Always for BUILD | Any code file |
+ | search-first | Always for BUILD | Any code file |
  | typescript | TypeScript files in scope | `*.ts`, `*.tsx` |
  | react | React components in scope | `*.tsx`, `*.jsx` |
  | frontend-design | UI/styling work | `*.css`, `*.scss`, `*.tsx` with styling keywords |
@@ -0,0 +1,133 @@
+ ---
+ name: search-first
+ description: >-
+ This skill should be used when the user asks to "add a utility", "create a helper",
+ "implement parsing", "build a wrapper", or writes infrastructure/utility code that
+ may already exist as a well-maintained package. Enforces research before building.
+ user-invocable: false
+ allowed-tools: Read, Grep, Glob
+ ---
+
+ # Search-First
+
+ Research before building. Check if a battle-tested solution exists before writing custom utility code.
+
+ ## Iron Law
+
+ > **RESEARCH BEFORE BUILDING**
+ >
+ > Never write custom utility code without first checking if a battle-tested solution
+ > exists. The best code is code you don't write. A maintained package with thousands
+ > of users will always beat a hand-rolled utility in reliability, edge cases, and
+ > long-term maintenance.
+
+ ## When This Skill Activates
+
+ **Triggers** — creating or modifying code that:
+ - Parses, formats, or validates data (dates, URLs, emails, UUIDs, etc.)
+ - Implements common algorithms (sorting, diffing, hashing, encoding)
+ - Wraps HTTP clients, retries, rate limiting, caching
+ - Handles file system operations beyond basic read/write
+ - Implements CLI argument parsing, logging, or configuration
+ - Creates test utilities (mocking, fixtures, assertions)
+
+ **Does NOT trigger** for:
+ - Domain-specific business logic unique to this project
+ - Glue code connecting existing components
+ - Trivial operations (< 5 lines, single-use)
+ - Code that intentionally avoids external dependencies (e.g., zero-dep libraries)
+
+ ---
+
+ ## Research Process
+
+ ### Phase 1: Need Analysis
+
+ Before searching, define what you actually need:
+
+ ```
+ Need: {one-sentence description of the capability}
+ Constraints: {runtime, bundle size, license, zero-dep requirement}
+ Must-haves: {non-negotiable requirements}
+ Nice-to-haves: {optional features}
+ ```
+
+ ### Phase 2: Search
+
+ Delegate research to an Explore subagent to keep main session context clean.
+
+ **Spawn an Explore agent** with this prompt template:
+
+ ```
+ Task(subagent_type="Explore"):
+ "Research existing solutions for: {need description}
+
+ Search for:
+ 1. npm/PyPI/crates packages that solve this (check package.json/requirements.txt for ecosystem)
+ 2. Existing utilities in this codebase (grep for related function names)
+ 3. Framework built-ins that already handle this
+
+ For each candidate, find:
+ - Package name and weekly downloads (if applicable)
+ - Last publish date and maintenance status
+ - Bundle size / dependency count
+ - API surface relevant to our need
+ - License compatibility
+
+ Return top 3 candidates with pros/cons, or confirm nothing suitable exists."
+ ```
+
+ ### Phase 3: Evaluate
+
+ Score each candidate against evaluation criteria. See `references/evaluation-criteria.md` for the full matrix.
+
+ Quick checklist:
+ - [ ] Last published within 12 months
+ - [ ] Weekly downloads > 1,000 (npm) or equivalent traction
+ - [ ] No known vulnerabilities (check Snyk/npm audit)
+ - [ ] API fits the use case without heavy wrapping
+ - [ ] License compatible with project (MIT/Apache/BSD preferred)
+ - [ ] Bundle size acceptable for the project context
+
+ ### Phase 4: Decide
+
+ Choose one of four outcomes:
+
+ | Decision | When | Action |
+ |----------|------|--------|
+ | **Adopt** | Exact match, well-maintained, good API | Install and use directly |
+ | **Extend** | Partial match, needs thin wrapper | Install + write minimal adapter |
+ | **Compose** | No single package fits, but 2-3 small ones combine well | Install multiple, write glue code |
+ | **Build** | Nothing fits, or dependency cost exceeds value | Write custom, document why |
+
+ **Document the decision** in a code comment at the usage site:
+
+ ```typescript
+ // search-first: Adopted date-fns for date formatting (2M weekly downloads, 30KB)
+ // search-first: Built custom — no package handles our specific wire format
+ ```
+
+ ---
+
+ ## Anti-Patterns
+
+ | Anti-Pattern | Correct Approach |
+ |-------------|-----------------|
+ | Adding a dependency for 5 lines of trivial code | Build — dependency overhead exceeds value |
+ | Choosing the most popular package without checking fit | Evaluate API fit, not just popularity |
+ | Wrapping a package so heavily it obscures the original | If wrapping > 50% of original API, reconsider |
+ | Skipping research because "I know how to build this" | Research anyway — maintenance cost matters more than initial build |
+ | Installing a massive framework for one utility function | Look for focused, single-purpose packages |
+
+ ## Scope Limiter
+
+ This skill concerns **utility and infrastructure code** only:
+ - Data transformation, validation, formatting
+ - Network operations, retries, caching
+ - CLI tooling, logging, configuration
+ - Test utilities and helpers
+
+ It does NOT apply to **domain-specific business logic** where:
+ - The logic encodes unique business rules
+ - No generic solution could exist
+ - The code is inherently project-specific
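
The mechanical parts of the Phase 3 quick checklist in the file above might be expressed as a small predicate, sketched here under assumptions. The `CandidateInfo` shape, the function name, and the SPDX identifiers used for "MIT/Apache/BSD" are hypothetical and do not come from the package. API fit and bundle-size acceptability are left out because the skill treats them as judgment calls.

```typescript
// Hypothetical metadata collected during the Search phase; names are assumptions.
interface CandidateInfo {
  name: string;
  lastPublished: Date;
  weeklyDownloads: number;
  knownVulnerabilities: number;
  license: string; // SPDX identifier (assumed)
}

// Assumed SPDX spellings of the "MIT/Apache/BSD preferred" licenses.
const PREFERRED_LICENSES: readonly string[] = [
  "MIT",
  "Apache-2.0",
  "BSD-2-Clause",
  "BSD-3-Clause",
];

// Returns the checklist items a candidate fails; an empty array means it passes
// every mechanical check.
function failedQuickChecks(candidate: CandidateInfo, now: Date): string[] {
  const failures: string[] = [];

  // "Last published within 12 months"
  const cutoff = new Date(now.getTime());
  cutoff.setUTCFullYear(cutoff.getUTCFullYear() - 1);
  if (candidate.lastPublished.getTime() < cutoff.getTime()) {
    failures.push("last published more than 12 months ago");
  }

  // "Weekly downloads > 1,000" (strictly greater, as written in the checklist)
  if (!(candidate.weeklyDownloads > 1000)) {
    failures.push("weekly downloads not above 1,000");
  }

  // "No known vulnerabilities"
  if (candidate.knownVulnerabilities > 0) {
    failures.push("has known vulnerabilities");
  }

  // "License compatible with project"
  if (!PREFERRED_LICENSES.includes(candidate.license)) {
    failures.push("license is not in the preferred list");
  }

  return failures;
}
```

A non-preferred license is reported as a failure here only for simplicity; the checklist says "preferred", so a real project would likely review such cases rather than reject them outright.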
@@ -0,0 +1,101 @@
+ # Search-First — Evaluation Criteria
+
+ Detailed package evaluation criteria and decision matrix for the 4-outcome model.
+
+ ## Evaluation Matrix
+
+ Score each candidate on these axes (1-5 scale):
+
+ | Criterion | Weight | 1 (Poor) | 3 (Acceptable) | 5 (Excellent) |
+ |-----------|--------|-----------|-----------------|----------------|
+ | **Maintenance** | High | No commits in 2+ years | Active, yearly releases | Regular releases, responsive maintainer |
+ | **Adoption** | Medium | < 100 weekly downloads | 1K-10K weekly downloads | > 100K weekly downloads |
+ | **API Fit** | High | Needs heavy wrapping | Partial fit, thin adapter needed | Direct use, clean API |
+ | **Bundle Size** | Medium | > 500KB | 50-500KB | < 50KB |
+ | **Security** | High | Known vulnerabilities | No known issues, few dependencies | Audited, zero/minimal dependencies |
+ | **License** | Required | GPL/AGPL (restrictive) | LGPL (conditional) | MIT/Apache/BSD (permissive) |
+
+ **Minimum thresholds**: License must be compatible. Security must be ≥ 3. All others are trade-offs.
+
+ ## Decision Matrix
+
+ ### Adopt (score ≥ 20/25, API Fit ≥ 4)
+
+ The package directly solves the problem with minimal integration code.
+
+ **Example**: Using `zod` for schema validation — exact fit, massive adoption, tiny bundle.
+
+ ```
+ ✅ Adopt: zod v3.22
+ - Maintenance: 5 (monthly releases)
+ - Adoption: 5 (4M weekly downloads)
+ - API Fit: 5 (direct use for all validation)
+ - Bundle Size: 4 (57KB)
+ - Security: 5 (zero dependencies)
+ - Total: 24/25
+ ```
+
+ ### Extend (score ≥ 15/25, API Fit ≥ 2)
+
+ The package handles 60-80% of the need. Write a thin adapter for the rest.
+
+ **Example**: Using `got` for HTTP but wrapping it with project-specific retry and auth logic.
+
+ ```
+ ✅ Extend: got v14
+ - Maintenance: 4 (active)
+ - Adoption: 5 (8M weekly downloads)
+ - API Fit: 3 (need custom retry wrapper)
+ - Bundle Size: 3 (150KB)
+ - Security: 4 (minimal deps)
+ - Total: 19/25
+ Adapter: ~30 lines wrapping retry + auth headers
+ ```
+
+ ### Compose (no single package fits, but small packages combine)
+
+ Two or three focused packages together solve the problem better than one large framework.
+
+ **Example**: `ms` (time parsing) + `p-retry` (retry logic) + `quick-lru` (caching) instead of a monolithic HTTP client framework.
+
+ **Rules for Compose**:
+ - Maximum 3 packages in a composition
+ - Each package must be focused (single responsibility)
+ - Total combined bundle < what a monolithic alternative would cost
+ - Glue code should be < 50 lines
+
+ ### Build (nothing fits, or dependency cost > value)
+
+ Write custom code when:
+ - No package scores ≥ 15/25
+ - The code is < 50 lines and trivial
+ - Zero-dependency constraint is explicit
+ - The domain is too specific for generic packages
+
+ **Required**: Document why Build was chosen:
+
+ ```typescript
+ // search-first: Built custom — our wire format uses non-standard
+ // ISO-8601 extensions that no date library handles correctly.
+ // Evaluated: date-fns (no custom format support), luxon (500KB overhead),
+ // dayjs (close but missing timezone edge case).
+ ```
+
+ ## Ecosystem-Specific Hints
+
+ ### Node.js / TypeScript
+ - Check npm: `https://www.npmjs.com/package/{name}`
+ - Bundle size: `https://bundlephobia.com/package/{name}`
+ - Check if Node.js built-ins handle it (`node:crypto`, `node:url`, `node:path`)
+
+ ### Python
+ - Check PyPI: `https://pypi.org/project/{name}`
+ - Check if stdlib handles it (`urllib`, `json`, `pathlib`, `dataclasses`)
+
+ ### Rust
+ - Check crates.io: `https://crates.io/crates/{name}`
+ - Check if std handles it
+
+ ### Go
+ - Check pkg.go.dev
+ - Go standard library is extensive — check stdlib first
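
As a rough illustration of how the evaluation and decision matrices in the file above fit together, the numeric thresholds could be applied per candidate as sketched below. The type and function names are hypothetical. The sketch assumes that License acts as a pass/fail gate and that the 25-point total sums the other five criteria, which matches the `zod` and `got` examples. Compose is omitted because the file gives it no numeric threshold, and Build would follow only when every candidate is rejected.

```typescript
// Each criterion is scored 1-5, following the evaluation matrix.
interface CandidateScores {
  maintenance: number;
  adoption: number;
  apiFit: number;
  bundleSize: number;
  security: number;
}

type CandidateDecision = "Adopt" | "Extend" | "Reject";

function evaluateCandidate(
  scores: CandidateScores,
  licenseCompatible: boolean,
): { total: number; decision: CandidateDecision } {
  const total =
    scores.maintenance +
    scores.adoption +
    scores.apiFit +
    scores.bundleSize +
    scores.security;

  // Minimum thresholds: compatible license and Security >= 3.
  if (!licenseCompatible || scores.security < 3) {
    return { total, decision: "Reject" };
  }
  // Adopt: score >= 20/25 and API Fit >= 4.
  if (total >= 20 && scores.apiFit >= 4) {
    return { total, decision: "Adopt" };
  }
  // Extend: score >= 15/25 and API Fit >= 2.
  if (total >= 15 && scores.apiFit >= 2) {
    return { total, decision: "Extend" };
  }
  return { total, decision: "Reject" };
}

// The two worked examples from the decision matrix.
const zod = evaluateCandidate(
  { maintenance: 5, adoption: 5, apiFit: 5, bundleSize: 4, security: 5 },
  true,
);
const got = evaluateCandidate(
  { maintenance: 4, adoption: 5, apiFit: 3, bundleSize: 3, security: 4 },
  true,
);
console.log(zod, got);
```

With the scores listed in the file, the `zod` example totals 24 and maps to Adopt, and the `got` example totals 19 and maps to Extend.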