npm - devflow-kit - Versions diffs - 1.4.0 → 1.5.0 - Mend

devflow-kit 1.4.0 → 1.5.0

Files changed (45)

package/shared/agents/synthesizer.md CHANGED

@@ -128,10 +128,14 @@ Analyze 3 axes to determine strategy:
 Synthesize outputs from multiple Reviewer agents. Apply strict merge rules.
 **Process:**
-1. Read all review reports from `${REVIEW_BASE_DIR}/*-report.*.md`
-2. Categorize issues into 3 buckets (from review-methodology)
-3. Count by severity (CRITICAL, HIGH, MEDIUM, LOW)
-4. Determine merge recommendation based on blocking issues
+1. Read all review reports from `${REVIEW_BASE_DIR}/*.md` (exclude your own output `review-summary.*.md`)
+2. Extract confidence percentages from each finding
+3. Apply confidence-aware aggregation: when multiple reviewers flag the same file:line, boost confidence by 10% per additional reviewer (cap at 100%)
+<!-- Confidence threshold also in: shared/agents/reviewer.md, plugins/devflow-code-review/commands/code-review.md -->
+4. Maintain ≥80% confidence threshold in final output
+5. Categorize issues into 3 buckets (from review-methodology)
+6. Count by severity (CRITICAL, HIGH, MEDIUM, LOW)
+7. Determine merge recommendation based on blocking issues
 **Issue Categories:**
 - **Blocking** (Category 1): Issues in YOUR changes - CRITICAL/HIGH must block
@@ -172,7 +176,10 @@ Report format:
 | Pre-existing | - | - | {n} | {n} | {n} |
 ## Blocking Issues
-{List with file:line and suggested fix}
+{List with file:line, confidence %, and suggested fix}
+## Suggestions (Lower Confidence)
+{Max 5 items across all reviewers with 60-79% confidence. Brief descriptions only.}
 ## Action Plan
 1. {Priority fix}
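
The aggregation steps added to the synthesizer (extract confidence, boost on agreement, keep the ≥80% threshold, cap lower-confidence suggestions at 5) can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the `Finding` shape and the `aggregate` and `partition` helpers are hypothetical, and the sketch assumes "10%" means 10 percentage points added to the highest individual confidence for that file:line.

```typescript
// Hypothetical shape of a single reviewer finding; field names are assumptions.
interface Finding {
  reviewer: string;
  file: string;
  line: number;
  confidence: number; // percentage, 0-100
  summary: string;
}

interface AggregatedFinding {
  file: string;
  line: number;
  confidence: number;
  reviewers: string[];
  summary: string;
}

// Merge findings that share a file:line. Assumption: start from the highest
// individual confidence and add 10 points per additional distinct reviewer,
// capped at 100.
function aggregate(findings: Finding[]): AggregatedFinding[] {
  const groups: Record<string, Finding[]> = {};
  for (const f of findings) {
    const key = `${f.file}:${f.line}`;
    (groups[key] = groups[key] || []).push(f);
  }
  return Object.keys(groups).map((key) => {
    const group = groups[key];
    const reviewers = group
      .map((f) => f.reviewer)
      .filter((r, i, all) => all.indexOf(r) === i); // distinct reviewers only
    const best = group.reduce((a, b) => (b.confidence > a.confidence ? b : a));
    return {
      file: best.file,
      line: best.line,
      confidence: Math.min(100, best.confidence + 10 * (reviewers.length - 1)),
      reviewers,
      summary: best.summary,
    };
  });
}

// Split by the thresholds in the report format: >= 80% is reported in full,
// 60-79% goes to "Suggestions" (at most 5 items), anything lower is dropped.
function partition(merged: AggregatedFinding[]) {
  const sorted = merged.slice().sort((a, b) => b.confidence - a.confidence);
  return {
    reported: sorted.filter((f) => f.confidence >= 80),
    suggestions: sorted
      .filter((f) => f.confidence >= 60 && f.confidence < 80)
      .slice(0, 5),
  };
}
```

Under these assumptions, a finding flagged at 75% by one reviewer and 70% by another at the same file:line is promoted to 85% and crosses the reporting threshold, which appears to be the intent of the new step 3.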

package/shared/skills/ambient-router/SKILL.md CHANGED

@@ -54,7 +54,7 @@ Based on classified intent, read the following skills to inform your response.
 | Intent | Primary Skills | Secondary (if file type matches) |
 |--------|---------------|----------------------------------|
-| **BUILD** | test-driven-development, implementation-patterns | typescript (.ts), react (.tsx/.jsx), go (.go), java (.java), python (.py), rust (.rs), frontend-design (CSS/UI), input-validation (forms/API), security-patterns (auth/crypto) |
+| **BUILD** | test-driven-development, implementation-patterns, search-first | typescript (.ts), react (.tsx/.jsx), go (.go), java (.java), python (.py), rust (.rs), frontend-design (CSS/UI), input-validation (forms/API), security-patterns (auth/crypto) |
 | **DEBUG** | test-patterns, core-patterns | git-safety (if git operations involved) |
 | **REVIEW** | self-review, core-patterns | test-patterns |
 | **PLAN** | implementation-patterns | core-patterns |
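
To make the effect of this change concrete, the rows visible in the hunk can be read as a lookup table in which `search-first` now always loads for BUILD. The sketch below is only an illustration of that reading: `PRIMARY_SKILLS`, `BUILD_SKILLS_BY_EXTENSION`, and `skillsFor` are hypothetical names, only the four intents shown in the hunk are covered, and the keyword-driven secondary skills (frontend-design, input-validation, security-patterns, git-safety) are left out because they do not key on a file extension.

```typescript
// Only the intents visible in the diff hunk; the full table may contain more.
type Intent = "BUILD" | "DEBUG" | "REVIEW" | "PLAN";

const PRIMARY_SKILLS: Record<Intent, string[]> = {
  BUILD: ["test-driven-development", "implementation-patterns", "search-first"],
  DEBUG: ["test-patterns", "core-patterns"],
  REVIEW: ["self-review", "core-patterns"],
  PLAN: ["implementation-patterns"],
};

// Extension-matched secondary skills from the BUILD row of the router table.
const BUILD_SKILLS_BY_EXTENSION: Record<string, string> = {
  ".ts": "typescript",
  ".tsx": "react",
  ".jsx": "react",
  ".go": "go",
  ".java": "java",
  ".py": "python",
  ".rs": "rust",
};

// Return primary skills for the intent, plus any extension-matched BUILD
// skills for the files in scope (no duplicates, first match order preserved).
function skillsFor(intent: Intent, files: string[]): string[] {
  const skills = PRIMARY_SKILLS[intent].slice();
  if (intent !== "BUILD") return skills;
  for (const file of files) {
    const dot = file.lastIndexOf(".");
    const skill = dot >= 0 ? BUILD_SKILLS_BY_EXTENSION[file.slice(dot)] : undefined;
    if (skill && skills.indexOf(skill) === -1) skills.push(skill);
  }
  return skills;
}
```

For example, `skillsFor("BUILD", ["src/app.tsx", "src/util.ts"])` would yield the three primary skills followed by `react` and `typescript`.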

package/shared/skills/ambient-router/references/skill-catalog.md CHANGED

@@ -12,6 +12,7 @@ These skills may be loaded during GUIDED-depth ambient routing.
 |-------|-------------|---------------|
 | test-driven-development | Always for BUILD | `*.ts`, `*.tsx`, `*.js`, `*.jsx`, `*.py` |
 | implementation-patterns | Always for BUILD | Any code file |
+| search-first | Always for BUILD | Any code file |
 | typescript | TypeScript files in scope | `*.ts`, `*.tsx` |
 | react | React components in scope | `*.tsx`, `*.jsx` |
 | frontend-design | UI/styling work | `*.css`, `*.scss`, `*.tsx` with styling keywords |

package/shared/skills/search-first/SKILL.md ADDED

@@ -0,0 +1,133 @@
+---
+name: search-first
+description: >-
+  This skill should be used when the user asks to "add a utility", "create a helper",
+  "implement parsing", "build a wrapper", or writes infrastructure/utility code that
+  may already exist as a well-maintained package. Enforces research before building.
+user-invocable: false
+allowed-tools: Read, Grep, Glob
+---
+# Search-First
+Research before building. Check if a battle-tested solution exists before writing custom utility code.
+## Iron Law
+> **RESEARCH BEFORE BUILDING**
+>
+> Never write custom utility code without first checking if a battle-tested solution
+> exists. The best code is code you don't write. A maintained package with thousands
+> of users will always beat a hand-rolled utility in reliability, edge cases, and
+> long-term maintenance.
+## When This Skill Activates
+**Triggers** — creating or modifying code that:
+- Parses, formats, or validates data (dates, URLs, emails, UUIDs, etc.)
+- Implements common algorithms (sorting, diffing, hashing, encoding)
+- Wraps HTTP clients, retries, rate limiting, caching
+- Handles file system operations beyond basic read/write
+- Implements CLI argument parsing, logging, or configuration
+- Creates test utilities (mocking, fixtures, assertions)
+**Does NOT trigger** for:
+- Domain-specific business logic unique to this project
+- Glue code connecting existing components
+- Trivial operations (< 5 lines, single-use)
+- Code that intentionally avoids external dependencies (e.g., zero-dep libraries)
+---
+## Research Process
+### Phase 1: Need Analysis
+Before searching, define what you actually need:
+```
+Need: {one-sentence description of the capability}
+Constraints: {runtime, bundle size, license, zero-dep requirement}
+Must-haves: {non-negotiable requirements}
+Nice-to-haves: {optional features}
+```
+### Phase 2: Search
+Delegate research to an Explore subagent to keep main session context clean.
+**Spawn an Explore agent** with this prompt template:
+```
+Task(subagent_type="Explore"):
+"Research existing solutions for: {need description}
+Search for:
+1. npm/PyPI/crates packages that solve this (check package.json/requirements.txt for ecosystem)
+2. Existing utilities in this codebase (grep for related function names)
+3. Framework built-ins that already handle this
+For each candidate, find:
+- Package name and weekly downloads (if applicable)
+- Last publish date and maintenance status
+- Bundle size / dependency count
+- API surface relevant to our need
+- License compatibility
+Return top 3 candidates with pros/cons, or confirm nothing suitable exists."
+```
+### Phase 3: Evaluate
+Score each candidate against evaluation criteria. See `references/evaluation-criteria.md` for the full matrix.
+Quick checklist:
+- [ ] Last published within 12 months
+- [ ] Weekly downloads > 1,000 (npm) or equivalent traction
+- [ ] No known vulnerabilities (check Snyk/npm audit)
+- [ ] API fits the use case without heavy wrapping
+- [ ] License compatible with project (MIT/Apache/BSD preferred)
+- [ ] Bundle size acceptable for the project context
+### Phase 4: Decide
+Choose one of four outcomes:
+| Decision | When | Action |
+|----------|------|--------|
+| **Adopt** | Exact match, well-maintained, good API | Install and use directly |
+| **Extend** | Partial match, needs thin wrapper | Install + write minimal adapter |
+| **Compose** | No single package fits, but 2-3 small ones combine well | Install multiple, write glue code |
+| **Build** | Nothing fits, or dependency cost exceeds value | Write custom, document why |
+**Document the decision** in a code comment at the usage site:
+```typescript
+// search-first: Adopted date-fns for date formatting (2M weekly downloads, 30KB)
+// search-first: Built custom — no package handles our specific wire format
+```
+---
+## Anti-Patterns
+| Anti-Pattern | Correct Approach |
+|-------------|-----------------|
+| Adding a dependency for 5 lines of trivial code | Build — dependency overhead exceeds value |
+| Choosing the most popular package without checking fit | Evaluate API fit, not just popularity |
+| Wrapping a package so heavily it obscures the original | If wrapping > 50% of original API, reconsider |
+| Skipping research because "I know how to build this" | Research anyway — maintenance cost matters more than initial build |
+| Installing a massive framework for one utility function | Look for focused, single-purpose packages |
+## Scope Limiter
+This skill concerns **utility and infrastructure code** only:
+- Data transformation, validation, formatting
+- Network operations, retries, caching
+- CLI tooling, logging, configuration
+- Test utilities and helpers
+It does NOT apply to **domain-specific business logic** where:
+- The logic encodes unique business rules
+- No generic solution could exist
+- The code is inherently project-specific

package/shared/skills/search-first/references/evaluation-criteria.md ADDED

@@ -0,0 +1,101 @@
+# Search-First — Evaluation Criteria
+Detailed package evaluation criteria and decision matrix for the 4-outcome model.
+## Evaluation Matrix
+Score each candidate on these axes (1-5 scale):
+| Criterion | Weight | 1 (Poor) | 3 (Acceptable) | 5 (Excellent) |
+|-----------|--------|-----------|-----------------|----------------|
+| **Maintenance** | High | No commits in 2+ years | Active, yearly releases | Regular releases, responsive maintainer |
+| **Adoption** | Medium | < 100 weekly downloads | 1K-10K weekly downloads | > 100K weekly downloads |
+| **API Fit** | High | Needs heavy wrapping | Partial fit, thin adapter needed | Direct use, clean API |
+| **Bundle Size** | Medium | > 500KB | 50-500KB | < 50KB |
+| **Security** | High | Known vulnerabilities | No known issues, few dependencies | Audited, zero/minimal dependencies |
+| **License** | Required | GPL/AGPL (restrictive) | LGPL (conditional) | MIT/Apache/BSD (permissive) |
+**Minimum thresholds**: License must be compatible. Security must be ≥ 3. All others are trade-offs.
+## Decision Matrix
+### Adopt (score ≥ 20/25, API Fit ≥ 4)
+The package directly solves the problem with minimal integration code.
+**Example**: Using `zod` for schema validation — exact fit, massive adoption, tiny bundle.
+```
+✅ Adopt: zod v3.22
+- Maintenance: 5 (monthly releases)
+- Adoption: 5 (4M weekly downloads)
+- API Fit: 5 (direct use for all validation)
+- Bundle Size: 4 (57KB)
+- Security: 5 (zero dependencies)
+- Total: 24/25
+```
+### Extend (score ≥ 15/25, API Fit ≥ 2)
+The package handles 60-80% of the need. Write a thin adapter for the rest.
+**Example**: Using `got` for HTTP but wrapping it with project-specific retry and auth logic.
+```
+✅ Extend: got v14
+- Maintenance: 4 (active)
+- Adoption: 5 (8M weekly downloads)
+- API Fit: 3 (need custom retry wrapper)
+- Bundle Size: 3 (150KB)
+- Security: 4 (minimal deps)
+- Total: 19/25
+Adapter: ~30 lines wrapping retry + auth headers
+```
+### Compose (no single package fits, but small packages combine)
+Two or three focused packages together solve the problem better than one large framework.
+**Example**: `ms` (time parsing) + `p-retry` (retry logic) + `quick-lru` (caching) instead of a monolithic HTTP client framework.
+**Rules for Compose**:
+- Maximum 3 packages in a composition
+- Each package must be focused (single responsibility)
+- Total combined bundle < what a monolithic alternative would cost
+- Glue code should be < 50 lines
+### Build (nothing fits, or dependency cost > value)
+Write custom code when:
+- No package scores ≥ 15/25
+- The code is < 50 lines and trivial
+- Zero-dependency constraint is explicit
+- The domain is too specific for generic packages
+**Required**: Document why Build was chosen:
+```typescript
+// search-first: Built custom — our wire format uses non-standard
+// ISO-8601 extensions that no date library handles correctly.
+// Evaluated: date-fns (no custom format support), luxon (500KB overhead),
+// dayjs (close but missing timezone edge case).
+```
+## Ecosystem-Specific Hints
+### Node.js / TypeScript
+- Check npm: `https://www.npmjs.com/package/{name}`
+- Bundle size: `https://bundlephobia.com/package/{name}`
+- Check if Node.js built-ins handle it (`node:crypto`, `node:url`, `node:path`)
+### Python
+- Check PyPI: `https://pypi.org/project/{name}`
+- Check if stdlib handles it (`urllib`, `json`, `pathlib`, `dataclasses`)
+### Rust
+- Check crates.io: `https://crates.io/crates/{name}`
+- Check if std handles it
+### Go
+- Check pkg.go.dev
+- Go standard library is extensive — check stdlib first
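
The score thresholds in the decision matrix can be expressed as a small classification function. This is a sketch only, with hypothetical names (`Scores`, `totalScore`, `classify`). It assumes, as the two worked examples in the file do, that the total is an unweighted sum of the five scored criteria out of 25, with License treated as a pass/fail gate rather than a score. Compose is not score-driven in the file, so it is not modeled here; a `Reject` result for every candidate would point toward Build or Compose.

```typescript
// Scores on the 1-5 scale for the five criteria that are summed to /25.
interface Scores {
  maintenance: number;
  adoption: number;
  apiFit: number;
  bundleSize: number;
  security: number;
}

type Verdict = "Adopt" | "Extend" | "Reject";

// Unweighted sum, matching the Total lines in the zod and got examples.
function totalScore(s: Scores): number {
  return s.maintenance + s.adoption + s.apiFit + s.bundleSize + s.security;
}

// Apply the minimum thresholds first, then the Adopt and Extend cut-offs.
function classify(s: Scores, licenseCompatible: boolean): Verdict {
  if (!licenseCompatible || s.security < 3) return "Reject";
  const sum = totalScore(s);
  if (sum >= 20 && s.apiFit >= 4) return "Adopt";
  if (sum >= 15 && s.apiFit >= 2) return "Extend";
  return "Reject";
}
```

Fed the scores from the file's own examples, this yields Adopt for zod (24/25, API Fit 5) and Extend for got (19/25, API Fit 3), which matches the verdicts given there.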