slash-do 2.0.0 → 2.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -24,7 +24,7 @@
  <p align="center">
  <img src="https://img.shields.io/npm/v/slash-do?style=flat-square&color=blue" alt="npm version" />
  <img src="https://img.shields.io/badge/environments-4-green?style=flat-square" alt="environments" />
- <img src="https://img.shields.io/badge/commands-12-orange?style=flat-square" alt="commands" />
+ <img src="https://img.shields.io/badge/commands-14-orange?style=flat-square" alt="commands" />
  <img src="https://img.shields.io/badge/license-MIT-lightgrey?style=flat-square" alt="license" />
  </p>

@@ -60,8 +60,9 @@ All commands live under the `do:` namespace:
  | `/do:rpr` | Resolve PR review feedback with parallel agents |
  | `/do:release` | Create a release PR with version bump and changelog |
  | `/do:review` | Deep code review against best practices |
- | `/do:better` | Full DevSecOps audit with 7-agent scan and remediation |
+ | `/do:better` | Full DevSecOps audit with 8-agent scan and remediation |
  | `/do:better-swift` | SwiftUI DevSecOps audit with multi-platform coverage |
+ | `/do:depfree` | Audit dependencies, remove unnecessary ones, write replacement code |
  | `/do:goals` | Generate GOALS.md from codebase analysis |
  | `/do:replan` | Review and clean up PLAN.md |
  | `/do:omd` | Audit and optimize markdown files |
@@ -5,7 +5,7 @@ argument-hint: "[--interactive] [--scan-only] [--no-merge] [path filter or focus

  # Better — Unified DevSecOps Pipeline

- Run the full DevSecOps lifecycle: audit the codebase with 7 deduplicated agents, consolidate findings, remediate in an isolated worktree, create **separate PRs per category** with SemVer bump, verify CI, run Copilot review loops, and merge.
+ Run the full DevSecOps lifecycle: audit the codebase with 8 deduplicated agents, consolidate findings, remediate in an isolated worktree, create **separate PRs per category** with SemVer bump, verify CI, run Copilot review loops, and merge.

  **Default mode: fully autonomous.** Uses Balanced model profile, proceeds through all phases without prompting, auto-merges PRs with clean reviews.

@@ -35,7 +35,7 @@ AskUserQuestion([
  header: "Model",
  multiSelect: false,
  options: [
- { label: "Quality", description: "Opus for all agents — fewest false positives, best fixes (highest cost, 7+ Opus agents)" },
+ { label: "Quality", description: "Opus for all agents — fewest false positives, best fixes (highest cost, 8+ Opus agents)" },
  { label: "Balanced (Recommended)", description: "Sonnet for audit and remediation — good quality at moderate cost" },
  { label: "Budget", description: "Haiku for audit, Sonnet for remediation — fastest and cheapest" }
  ]
@@ -47,7 +47,7 @@ Record the selection as `MODEL_PROFILE` and derive agent models from this table:

  | Agent Role | Quality | Balanced | Budget |
  |------------|---------|----------|--------|
- | Audit agents (7 Explore agents, Phase 1) | opus | sonnet | haiku |
+ | Audit agents (8 Explore agents, Phase 1) | opus | sonnet | haiku |
  | Remediation agents (general-purpose, Phase 3) | opus | sonnet | sonnet |

  Derive two variables:
@@ -121,7 +121,7 @@ Record as `BUILD_CMD` and `TEST_CMD`.

  Project conventions are already in your context. Pass relevant conventions to each agent.

- Launch 7 Explore agents in two batches. Each agent must report findings in this format:
+ Launch 8 Explore agents in two batches. Each agent must report findings in this format:
  ```
  - **[CRITICAL/HIGH/MEDIUM/LOW]** `file:line` - Description. Suggested fix: ... Complexity: Simple/Medium/Complex
  ```
@@ -174,7 +174,7 @@ Skip step 4 if steps 1-3 reveal the code is correct.
  Resilience: external calls without timeouts, missing fallback for unavailable downstream services, retry without backoff ceiling/jitter, missing health check endpoints
  Observability: production paths without structured logging, error logs missing reproduction context (request ID, input params), async flows without correlation IDs

- ### Batch 2 (2 agents after Batch 1 completes):
+ ### Batch 2 (3 agents after Batch 1 completes):

  **Model**: Same `AUDIT_MODEL` as Batch 1.

@@ -188,14 +188,27 @@ Skip step 4 if steps 1-3 reveal the code is correct.
  - **Database migrations**: exclusive-lock ALTER TABLE on large tables, CREATE INDEX without CONCURRENTLY, missing down migrations or untested rollback paths
  - General: framework-specific security issues, language-specific gotchas, domain-specific compliance, environment variable hygiene (missing `.env.example`, required env vars not validated at startup, secrets in config files that should be in env)

- 7. **Test Quality & Coverage**
+ 7. **Dependency Freedom**
+ Audit all third-party dependencies for necessity. Every small library is an attack surface — supply chain compromises are real and common.
+ Focus:
+ - Extract the full dependency list from the project manifest (`package.json`, `Cargo.toml`, `pyproject.toml`, `go.mod`, `Gemfile`, etc.)
+ - Classify each dependency into tiers:
+ - **Acceptable**: large, widely audited libraries (react, express, d3, three.js, next, vue, fastify, typescript, eslint, prisma, tailwindcss, tokio, serde, django, flask, pandas, etc.) — skip these
+ - **Suspect**: smaller libraries where we may use only 1-2 functions, wrappers over built-in APIs, single-purpose utilities
+ - **Removable**: libraries where the used functionality takes <50 lines to implement, wraps a now-native API (e.g., `crypto.randomUUID()` replacing uuid, `structuredClone` replacing lodash.cloneDeep, `Array.prototype.flat` replacing array-flatten, `node:fs/promises` replacing fs-extra for most uses), unmaintained with known vulnerabilities, or micro-packages (is-odd, is-number, left-pad tier)
+ - For each suspect/removable dependency: search all source files for imports, list every function/class/type used, count call sites, assess replacement complexity (Trivial <20 lines, Moderate 20-100, Complex 100-300, Infeasible 300+)
+ - Check maintenance status: last publish date, open security issues, known CVEs
+ - Report format: `**[SEVERITY]** {package-name} — {Tier}. Uses: {functions}. Call sites: {N} in {M} files. Replacement: {complexity}. Reason: {why removable}`
+ - Severity mapping: unmaintained with CVEs → CRITICAL, unmaintained without CVEs → HIGH, replaceable single-function usage → MEDIUM, suspect but complex replacement → LOW
+
+ 8. **Test Quality & Coverage**
  Uses Batch 1 findings as context to prioritize.
  Focus areas:

  **Coverage gaps:**
  - Missing test files for critical modules, untested edge cases, tests that only cover happy paths
  - Areas with high complexity (identified by agents 1-5) but no tests
- - Remediation changes from agents 1-6 that lack corresponding test coverage
+ - Remediation changes from agents 1-7 that lack corresponding test coverage

  **Vacuous tests (tests that don't actually test anything):**
  - Tests that assert on mocked return values instead of real behavior (testing the mock, not the code)
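As an editorial aside, the severity mapping in the Dependency Freedom agent above is mechanical enough to express as a small function. This is an illustrative sketch only; the field names (`maintained`, `hasCves`, `singleFunctionUse`) are hypothetical, not part of the command file:

```javascript
// Sketch of the severity mapping above (field names are illustrative):
// unmaintained + CVEs → CRITICAL; unmaintained → HIGH;
// replaceable single-function usage → MEDIUM; everything else → LOW.
function depSeverity(dep) {
  if (!dep.maintained && dep.hasCves) return 'CRITICAL';
  if (!dep.maintained) return 'HIGH';
  if (dep.singleFunctionUse) return 'MEDIUM';
  return 'LOW';
}

// An unmaintained package with a known CVE is the worst case:
console.log(depSeverity({ maintained: false, hasCves: true })); // → "CRITICAL"
```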
@@ -257,6 +270,7 @@ For each file touched by multiple categories, document why it was assigned to on
  ### Architecture & SOLID
  ### Bugs, Performance & Error Handling
  ### Stack-Specific
+ ### Dependency Freedom
  ### Test Quality & Coverage
  ```
@@ -267,6 +281,7 @@ For each file touched by multiple categories, document why it was assigned to on
  - Architecture → Architecture & SOLID → `architecture`
  - Bugs & Perf → Bugs, Performance & Error Handling → `bugs-perf`
  - Stack-Specific → Stack-Specific → `stack-specific`
+ - Dep Freedom → Dependency Freedom → `deps`
  - Tests → Test Quality & Coverage → `tests`

  ```
@@ -278,6 +293,7 @@ For each file touched by multiple categories, document why it was assigned to on
  | Architecture | ... | ... | ... | ... | ... |
  | Bugs & Perf | ... | ... | ... | ... | ... |
  | Stack-Specific | ... | ... | ... | ... | ... |
+ | Dep Freedom | ... | ... | ... | ... | ... |
  | Tests | ... | ... | ... | ... | ... |
  | TOTAL | ... | ... | ... | ... | ... |
  ```
@@ -332,6 +348,7 @@ If no shared utilities were identified, skip this step.
  - Architecture & SOLID
  - Bugs, Performance & Error Handling
  - Stack-Specific
+ - Dependency Freedom
  3. Only create tasks for categories that have actionable findings
  4. Spawn up to 5 general-purpose agents as teammates. **Pass `REMEDIATION_MODEL` as the `model` parameter on each agent.** If `REMEDIATION_MODEL` is `opus`, omit the parameter to inherit from session.
@@ -339,9 +356,13 @@ If no shared utilities were identified, skip this step.

  !`cat ~/.claude/lib/remediation-agent-template.md`

+ ### Dependency Freedom agent — special instructions:
+ The Dependency Freedom remediation agent has a unique task: for each removable dependency, it must (1) write replacement code (a utility function or an inline native API call), (2) update ALL import/require statements across the codebase, (3) remove the package from the manifest, and (4) regenerate the lock file (`npm install --package-lock-only` / `cargo check` / etc., per `/do:depfree` Phase 3c). After all replacements, verify no source file still references the removed package. See `/do:depfree` Phase 3b for the full agent template.
+
  ### Conflict avoidance:
  - Review all findings before task assignment. If two categories touch the same file, assign both sets of findings to the same agent.
  - Security agent gets priority on validation logic; DRY agent gets priority on import consolidation.
+ - Dependency Freedom agent gets priority on files that are solely import/usage sites of a removed package.

  </plan_and_remediate>

@@ -421,7 +442,7 @@ Before creating PRs, run a deep code review on all remediation changes to catch

  ## Phase 4c: Test Enhancement

- After internal code review passes, evaluate and enhance the project's test suite. This phase acts on Agent 7's findings AND ensures all remediation work from Phase 3 has proper test coverage.
+ After internal code review passes, evaluate and enhance the project's test suite. This phase acts on Agent 8's findings AND ensures all remediation work from Phase 3 has proper test coverage.

  ### 4c.0: Record Start SHA

@@ -433,7 +454,7 @@ PHASE_4C_START_SHA="$(git rev-parse HEAD)"

  ### 4c.1: Test Audit Triage

- Review Agent 7 findings from Phase 1 and categorize them:
+ Review Agent 8 (Test Quality & Coverage) findings from Phase 1 and categorize them:

  1. **`[VACUOUS]` findings** — tests that exist but don't test real behavior. These are the highest priority because they create a false sense of safety.
  2. **`[WEAK]` findings** — tests that partially cover behavior but miss important cases. Strengthen with additional assertions and edge cases.
@@ -535,7 +556,7 @@ Initialize `CREATED_CATEGORY_SLUGS=""` (empty space-delimited string). After eac
  For each category that has findings:
  1. Switch to `{DEFAULT_BRANCH}`: `git checkout {DEFAULT_BRANCH}`
  2. Create a category branch: `git checkout -b better/{CATEGORY_SLUG}`
- - Use slugs: `security`, `code-quality`, `dry`, `architecture`, `bugs-perf`, `stack-specific`, `tests`
+ - Use slugs: `security`, `code-quality`, `dry`, `architecture`, `bugs-perf`, `stack-specific`, `deps`, `tests`
  3. For each file assigned to this category in `FILE_OWNER_MAP`:
  - **Modified files**: `git checkout better/{DATE} -- {file_path}`
  - **New files (Added)**: `git checkout better/{DATE} -- {file_path}`
@@ -757,6 +778,7 @@ If merge fails (e.g., branch protection, merge conflicts from a prior PR):
  | Architecture | ... | ... | ... | #number | pass | approved |
  | Bugs & Perf | ... | ... | ... | #number | pass | approved |
  | Stack-Specific | ... | ... | ... | #number | pass | approved |
+ | Dep Freedom | ... | ... | ... | #number | pass | approved |
  | Tests | ... | ... | ... | #number | pass | approved |
  | TOTAL | ... | ... | ... | N PRs | | |

@@ -791,6 +813,7 @@ Test Enhancement Stats:
  - When extracting modules, always add backward-compatible re-exports in the original module to prevent cross-PR breakage
  - Version bump happens exactly once on the first category branch based on aggregate commit analysis
  - Only CRITICAL, HIGH, and MEDIUM findings are auto-remediated for code categories; LOW findings remain tracked in PLAN.md
+ - Dependency Freedom findings replace unnecessary third-party packages with owned code — see `/do:depfree` for standalone usage
  - Test Quality & Coverage findings are remediated in Phase 4c with a dedicated test enhancement agent that verifies tests fail when code is broken
  - GitLab projects skip the Copilot review loop entirely (Phase 6) and stop after MR creation
  - CI must pass on each PR before requesting Copilot review or merging
@@ -0,0 +1,529 @@
+ ---
+ description: Audit third-party dependencies and remove unnecessary ones by writing replacement code
+ argument-hint: "[--interactive] [--scan-only] [--no-merge] [specific packages to evaluate]"
+ ---
+
+ # Depfree — Dependency Freedom Audit
+
+ Audit all third-party dependencies, classify them as acceptable (large, widely audited) or suspect (small, replaceable), analyze the actual usage of suspect dependencies, and replace them with owned code where feasible.
+
+ Every small library is an attack surface. Supply chain compromises are real and common. Large, widely audited libraries (express, react, d3, three.js, next, vue, fastify, lodash-es, etc.) are acceptable. But for smaller libraries, or libraries where only one helper function is used, we should write the code ourselves.
+
+ **Default mode: fully autonomous.** Uses the Balanced model profile and proceeds through all phases without prompting.
+
+ **`--interactive` mode:** Pauses for classification approval, replacement review, and merge confirmation.
+
+ Parse `$ARGUMENTS` for:
+ - **`--interactive`**: pause at each decision point for user approval
+ - **`--scan-only`**: run Phases 0-2 only (audit and plan); skip remediation
+ - **`--no-merge`**: run through PR creation; skip merge
+ - **Specific packages**: limit the audit scope to the named packages (e.g., "chalk dotenv")
+
+ ## Configuration
+
+ ### Default Mode (autonomous)
+
+ Use the **Balanced** model profile automatically (`AUDIT_MODEL=sonnet`, `REMEDIATION_MODEL=sonnet`).
+
+ ### Interactive Mode (`--interactive`)
+
+ Present the user with configuration options using `AskUserQuestion`:
+
+ ```
+ AskUserQuestion([{
+ question: "Which model profile for audit and remediation agents?",
+ header: "Model",
+ multiSelect: false,
+ options: [
+ { label: "Quality", description: "Opus for all agents — fewest false positives, best replacements (highest cost)" },
+ { label: "Balanced (Recommended)", description: "Sonnet for audit and remediation — good quality at moderate cost" },
+ { label: "Budget", description: "Haiku for audit, Sonnet for remediation — fastest and cheapest" }
+ ]
+ }])
+ ```
+
+ Record the selection as `MODEL_PROFILE` and derive:
+ - `AUDIT_MODEL`: `opus` / `sonnet` / `haiku` based on profile
+ - `REMEDIATION_MODEL`: `opus` / `sonnet` / `sonnet` based on profile
+
+ When the resolved model is `opus`, **omit** the `model` parameter on the Agent call so the agent inherits the session's Opus version.
+
+ ## Compaction Guidance
+
+ When compacting during this workflow, always preserve:
+ - The `DEPENDENCY_MAP` (complete classification of all dependencies)
+ - All REMOVABLE findings with package names and usage details
+ - The current phase number and which phases remain
+ - All PR numbers and URLs created so far
+ - `BUILD_CMD`, `TEST_CMD`, `PROJECT_TYPE`, `WORKTREE_DIR`, `REPO_DIR` values
+ - `VCS_HOST`, `CLI_TOOL`, `DEFAULT_BRANCH`, `CURRENT_BRANCH`
+
+
+ ## Phase 0: Discovery & Setup
+
+ ### 0a: VCS Host Detection
+ Run `gh auth status` to check the GitHub CLI. If it fails, run `glab auth status` for GitLab.
+ - Set `VCS_HOST` to `github` or `gitlab`
+ - Set `CLI_TOOL` to `gh` or `glab`
+ - If neither is authenticated, warn the user and halt
+
+ ### 0b: Project Type Detection
+ Check for project manifests to determine the tech stack:
+ - `package.json` → Node.js (check for `next`, `react`, `vue`, `express`, etc.)
+ - `Cargo.toml` → Rust
+ - `pyproject.toml` / `requirements.txt` / `setup.py` → Python
+ - `go.mod` → Go
+ - `pom.xml` / `build.gradle` → Java/Kotlin
+ - `Gemfile` → Ruby
+ - `*.csproj` / `*.sln` → .NET
+
+ Record the detected stack as `PROJECT_TYPE`.
+
+ ### 0c: Build & Test Command Detection
+ Derive build and test commands from the project type:
+ - Node.js: check `package.json` scripts for `build`, `test`, `typecheck`, `lint`
+ - Rust: `cargo build`, `cargo test`
+ - Python: `pytest`, `python -m pytest`
+ - Go: `go build ./...`, `go test ./...`
+ - If ambiguous, check project conventions already in context
+
+ Record as `BUILD_CMD` and `TEST_CMD`.
+
+ ### 0d: State Snapshot
+ - Record `REPO_DIR` via `git rev-parse --show-toplevel`
+ - Record `CURRENT_BRANCH` via `git rev-parse --abbrev-ref HEAD`
+ - Record `DEFAULT_BRANCH` via `gh repo view --json defaultBranchRef --jq '.defaultBranchRef.name'` (or the `glab` equivalent)
+ - Record `IS_DIRTY` via `git status --porcelain`
+
+
+ ## Phase 1: Dependency Inventory
+
+ ### 1a: Extract All Dependencies
+
+ Based on `PROJECT_TYPE`, extract the full dependency list:
+
+ **Node.js:**
+ - Read `package.json` → `dependencies` and `devDependencies`
+ - Note: `devDependencies` used only in build/test are lower priority but still worth auditing
+ - Check for workspace packages (monorepo) in the `workspaces` field
+
+ **Rust:**
+ - Read `Cargo.toml` → `[dependencies]`, `[dev-dependencies]`, `[build-dependencies]`
+
+ **Python:**
+ - Read `pyproject.toml` → `[project.dependencies]`, `[project.optional-dependencies]`
+ - Or `requirements.txt`, `setup.py`
+
+ **Go:**
+ - Read `go.mod` → `require` block
+
+ **Ruby:**
+ - Read `Gemfile`
+
+ ### 1b: Classify Dependencies
+
+ For each dependency, classify it into one of three tiers:
+
+ **Tier 1 — ACCEPTABLE (keep without question):**
+ Large, widely audited, foundational libraries. Examples by ecosystem:
+ - **Node.js**: react, next, vue, express, fastify, hono, typescript, eslint, prettier, webpack, vite, jest, vitest, mocha, d3, three, prisma, drizzle, @types/*, tailwindcss, postcss
+ - **Rust**: tokio, serde, clap, reqwest, hyper, tracing, sqlx, axum, actix-web
+ - **Python**: django, flask, fastapi, sqlalchemy, pandas, numpy, scipy, pytest, requests, httpx, pydantic
+ - **Go**: standard library (no third-party package needed for most things)
+ - **Ruby**: rails, rspec, sidekiq, puma, devise
+ - Any dependency with >10M weekly downloads (npm) or an equivalent popularity metric for the ecosystem
+
+ **Tier 2 — SUSPECT (audit usage):**
+ Smaller libraries that may be doing something we can write ourselves. Indicators:
+ - <1M weekly downloads (npm) or equivalent
+ - Single-purpose utility (does one thing)
+ - We use only 1-2 functions from it
+ - Wrapper libraries that add thin abstractions over built-in APIs
+ - Libraries that replicate functionality available in newer language/runtime versions
+ - Abandoned or unmaintained (no commits in 12+ months, open security issues)
+
+ **Tier 3 — REMOVABLE (strong candidate for replacement):**
+ Libraries where the cost of owning the code is clearly lower than the supply chain risk:
+ - We use a single function that takes <50 lines to implement
+ - The library wraps a built-in API with minimal added value
+ - The library is unmaintained with known vulnerabilities
+ - The library's functionality is now available natively (e.g., `node:fs/promises` replacing `fs-extra` for most use cases, `structuredClone` replacing `lodash.cloneDeep`, `Array.prototype.flat` replacing `array-flatten`)
+ - Color/string utilities where we use 1-2 functions (e.g., using `chalk` just for `chalk.red()` when a 10-line ANSI wrapper suffices)
+ - UUID generation when `crypto.randomUUID()` is available
+ - Deep merge/clone when `structuredClone` suffices
+ - `dotenv` when the runtime supports `--env-file` natively
+ - Micro-packages of the `is-odd`, `is-number`, `left-pad` tier
+
+ Record the full classification as `DEPENDENCY_MAP`.
+
+ ### 1c: Usage Analysis (Tier 2 & 3 only)
+
+ For each Tier 2 and Tier 3 dependency, launch parallel Explore agents (using `AUDIT_MODEL`) to determine actual usage:
+
+ Each agent should:
+ 1. Search all source files for imports/requires of the package
+ 2. List every function, class, constant, or type imported from it
+ 3. Count call sites per imported symbol
+ 4. Assess the complexity of replacement:
+ - **Trivial** (<20 lines): simple wrapper, single utility function, type alias
+ - **Moderate** (20-100 lines): multi-function utility, needs tests, edge cases to handle
+ - **Complex** (100-300 lines): significant logic, crypto, parsing, protocol implementation
+ - **Infeasible** (300+ lines or requires deep domain expertise): keep the dependency
+ 5. Check whether the package has known vulnerabilities: `npm audit`, `cargo audit`, `pip-audit`, etc.
+ 6. Check the last publish date and maintenance status
+
+ Report format:
+ ```
+ - **{package-name}** — Tier {2|3}
+ - Imports: {list of imported symbols}
+ - Call sites: {count} across {N} files
+ - Functions used: {list with a brief description of each}
+ - Replacement complexity: {Trivial|Moderate|Complex|Infeasible}
+ - Maintenance: {last publish date, open issues, known CVEs}
+ - Recommendation: **REMOVE** / **KEEP** / **EVALUATE**
+ - Replacement sketch: {brief description of how to replace, if REMOVE}
+ ```
+
+ Wait for all agents to complete before proceeding.
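Steps 3 and 4 of the checklist above can be sketched as two small helpers. These are illustrative assumptions of mine (a real agent would search files on disk and likely use an AST rather than a crude regex), not part of the command:

```javascript
// Step 3 sketch: count call sites of an imported symbol in a source string.
// Crude token match; breaks on symbols containing regex metacharacters.
function countCallSites(source, symbol) {
  const matches = source.match(new RegExp(`\\b${symbol}\\s*\\(`, 'g'));
  return matches ? matches.length : 0;
}

// Step 4 sketch: the Trivial/Moderate/Complex/Infeasible bands by estimated lines.
function replacementComplexity(estimatedLines) {
  if (estimatedLines < 20) return 'Trivial';
  if (estimatedLines <= 100) return 'Moderate';
  if (estimatedLines <= 300) return 'Complex';
  return 'Infeasible';
}

console.log(countCallSites("red('a'); red('b'); green('c');", 'red')); // → 2
```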
+
+
+ ## Phase 2: Replacement Plan
+
+ 1. Read the existing `PLAN.md` (create it if it doesn't exist)
+ 2. Filter to only the REMOVE recommendations from Phase 1c
+ 3. For EVALUATE recommendations: **Default mode** — treat as KEEP (conservative). **Interactive mode** — present each to the user via `AskUserQuestion`
+ 4. Group removable dependencies by replacement strategy:
+ - **Native replacement**: a built-in API replaces the library (e.g., `crypto.randomUUID()`)
+ - **Inline replacement**: write a small utility function (e.g., an ANSI color wrapper)
+ - **Consolidation**: multiple small deps replaced by one owned utility module
+ 5. Estimate the total lines of replacement code needed
+ 6. Add a new section to PLAN.md:
+
+ ```markdown
+ ## Depfree Audit - {YYYY-MM-DD}
+
+ Summary: {N} total dependencies. {A} acceptable (Tier 1), {B} audited and kept (Tier 2), {C} to remove (Tier 3).
+ Estimated replacement code: ~{lines} lines across {files} new/modified files.
+
+ ### Dependencies to Remove
+ | Package | Tier | Used Functions | Call Sites | Replacement | Complexity | Risk |
+ |---------|------|----------------|------------|-------------|------------|------|
+ | ... | ... | ... | ... | ... | ... | ... |
+
+ ### Dependencies Kept (with rationale)
+ | Package | Tier | Reason Kept |
+ |---------|------|-------------|
+ | ... | ... | ... |
+
+ ### Replacement Tasks
+ For each dependency to remove:
+ - [ ] **{package}** — {strategy}. Replace {N} call sites in {M} files. Write {utility name} ({est. lines} lines). Complexity: {level}.
+ ```
+
+ 7. Print a summary table:
+ ```
+ | Status | Count | Examples |
+ |------------|-------|-----------------------------------|
+ | Acceptable | ... | react, express, typescript, ... |
+ | Kept | ... | {packages kept with reasons} |
+ | Removable | ... | {packages to remove} |
+ | Total | ... | |
+ ```
+
+ **GATE: If `--scan-only` was passed, STOP HERE.** Print the summary and exit.
+
+ **GATE: If no removable dependencies were found, print "All dependencies are justified" and exit.**
+
+ **Interactive mode**: Present the removal plan via `AskUserQuestion`:
+ ```
+ AskUserQuestion([{
+ question: "Dependency removal plan:\n{summary of packages to remove}\n\nProceed with replacement?",
+ options: [
+ { label: "Proceed", description: "Remove all listed dependencies and write replacement code" },
+ { label: "Review individually", description: "Let me approve/reject each removal" },
+ { label: "Abort", description: "Stop here — I'll review the plan manually" }
+ ]
+ }])
+ ```
+
+ If "Review individually": present each dependency with REMOVE/KEEP options, then proceed with only the approved removals.
+
+
+ ## Phase 3: Worktree Remediation
+
+ ### 3a: Setup
+
+ 1. If `IS_DIRTY` is true: `git stash --include-untracked -m "depfree: pre-audit stash"`
+ 2. Set `DATE` to today's date in YYYY-MM-DD format
+ 3. Create the worktree:
+ ```bash
+ git worktree add ../depfree-{DATE} -b depfree/{DATE}
+ ```
+ 4. Set `WORKTREE_DIR` to `../depfree-{DATE}`
+
+ ### 3b: Write Replacement Code
+
+ For each dependency to remove, spawn a general-purpose agent (using `REMEDIATION_MODEL`) with these instructions:
+
+ ```
+ <context>
+ Project type: {PROJECT_TYPE}
+ Build command: {BUILD_CMD}
+ Test command: {TEST_CMD}
+ Working directory: {WORKTREE_DIR} (this is a git worktree — all work happens here)
+ </context>
+
+ <task>
+ Remove the dependency on `{PACKAGE_NAME}` and replace it with owned code.
+
+ Current usage:
+ {USAGE_DETAILS from Phase 1c — imported symbols, call sites, files}
+
+ Replacement strategy: {STRATEGY from Phase 2}
+
+ Steps:
+ 1. Write the replacement code (utility function, inline replacement, or native API call)
+ 2. Update ALL import/require statements across the codebase to use the new code
+ 3. Remove the package from the manifest ({package.json, Cargo.toml, etc.})
+ 4. Run `{BUILD_CMD}` to verify compilation
+ 5. Run `{TEST_CMD}` to verify tests pass
+ 6. If tests reference the removed package directly (mocking it, importing test helpers from it), update those tests too
+ </task>
+
+ <guardrails>
+ - The replacement must preserve behavior for all currently-used call sites and documented invariants
+ - You may omit handling for input shapes or edge cases that are provably unreachable based on {USAGE_DETAILS}, but do not narrow behavior for any actual call site
+ - Do NOT introduce new dependencies to replace old ones
+ - Do NOT use `git add -A` or `git add .` — stage specific files only
+ - Keep replacement code minimal
+ - If the replacement is more complex than estimated (>2x the estimated lines), report back and skip — do not force a bad replacement
+ - Place shared utility replacements in a sensible location (e.g., `src/utils/`, `lib/`, `internal/`) following existing project conventions
+ - Commit each replacement independently: `refactor: replace {package} with owned {utility/code}`
+ </guardrails>
+ ```
+
+ **Parallelization**: Launch up to 5 agents in parallel. If there are >5 dependencies to remove, batch them. Assign each agent a non-overlapping set of dependencies (no two agents should modify the same files — if overlap exists, group those dependencies into one agent).
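As a concrete instance of an "inline replacement" under this template: the `chalk.red()` case mentioned in the Tier 3 list reduces to a wrapper of a few lines. A sketch assuming only plain foreground colors are used at the actual call sites (codes 31/32 set red/green; 39 resets the foreground):

```javascript
// Minimal ANSI color helper replacing a color library that was only
// used for red/green text. Not suitable if styles like bold are needed.
const color = (code) => (s) => `\u001b[${code}m${s}\u001b[39m`;
const red = color(31);
const green = color(32);

console.log(red('removed'), green('kept'));
```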
+
+ ### 3c: Lock File Update
+
+ After all replacement agents complete:
+ 1. Remove all replaced packages from the lock file:
+ ```bash
+ cd {WORKTREE_DIR}
+ # Node.js: refresh the lockfile only, without running lifecycle scripts
+ npm install --package-lock-only --ignore-scripts
+ # Or: yarn install --mode=update-lockfile --ignore-scripts
+ # Or: pnpm install --lockfile-only --ignore-scripts
+ # Rust: let a check refresh Cargo.lock to reflect manifest changes only
+ cargo check
+ # Python: use the project's lock tool to refresh
+ # poetry lock --no-update
+ # pip-compile requirements.in
+ ```
+ 2. Commit the lock file update:
+ ```bash
+ git -C {WORKTREE_DIR} add {lock file}
+ git -C {WORKTREE_DIR} commit -m "chore: update lock file after dependency removal"
+ ```
+
+
+ ## Phase 4: Verification
+
+ ### 4a: Build & Test
+
+ 1. Run the full build:
+ ```bash
+ cd {WORKTREE_DIR} && {BUILD_CMD}
+ ```
+ 2. Run all tests:
+ ```bash
+ cd {WORKTREE_DIR} && {TEST_CMD}
+ ```
+ 3. If the build or tests fail:
+ - Identify which replacement caused the failure
+ - Attempt to fix it in a new commit
+ - If unfixable, revert the replacement commit, which re-adds the dependency by restoring its manifest entry (then re-run the lock file refresh from Phase 3c):
+ ```bash
+ git -C {WORKTREE_DIR} revert <sha>
+ ```
+ Note the reverted package as "kept — replacement failed"
+
+ ### 4b: Internal Code Review
+
+ 1. Generate the diff:
+ ```bash
+ cd {WORKTREE_DIR} && git diff {DEFAULT_BRANCH}...HEAD
+ ```
+ 2. Review all replacement code for:
+ - Functional equivalence (does the replacement handle the same inputs/outputs?)
+ - Missing edge cases that the original library handled
+ - Security regressions (e.g., replacing a sanitization library with a naive regex)
+ - Performance regressions (e.g., replacing an optimized parser with O(n^2) code)
+ - Correct error handling at system boundaries
+ 3. Fix any issues found, committing each fix separately
+
+ ### 4c: Verify No Phantom Dependencies
+
+ Confirm that no source file still references a removed package:
+ ```bash
+ cd {WORKTREE_DIR}
+ for pkg in {REMOVED_PACKAGES}; do
+ grep -r "$pkg" \
+ --include='*.ts' \
+ --include='*.js' \
+ --include='*.tsx' \
+ --include='*.jsx' \
+ --include='*.py' \
+ --include='*.rs' \
+ --include='*.go' \
+ --include='*.rb' \
+ . && echo "WARN: $pkg still referenced"
+ done
+ # NOTE: a bare-name grep can also match comments and unrelated strings,
+ # so review the hits before treating them as phantom dependencies
+ ```
+ Fix any remaining references.
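When a script is preferable to the grep loop, the same phantom-reference check can be expressed in-process. A sketch of my own (the regex targets JS-style `require`/`from` imports only, so it is narrower than the multi-language grep above):

```javascript
// Return the subset of removed packages that a source string still imports.
function findPhantoms(source, removedPackages) {
  return removedPackages.filter((pkg) =>
    new RegExp(`(require\\(|from )['"]${pkg}['"]`).test(source),
  );
}

const src = "const pad = require('left-pad');\nimport x from 'kept-pkg';";
console.log(findPhantoms(src, ['left-pad', 'is-odd'])); // → [ 'left-pad' ]
```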
+
+
+ ## Phase 5: PR Creation
+
+ ### 5a: Push & Create PR
+
+ ```bash
+ cd {WORKTREE_DIR}
+ git push -u origin depfree/{DATE}
+ ```
+
+ Create the PR:
+
+ **GitHub:**
+ ```bash
+ gh pr create --head depfree/{DATE} --base {DEFAULT_BRANCH} \
+   --title "refactor: remove {N} unnecessary dependencies" \
+   --body "$(cat <<'EOF'
+ ## Depfree Audit — Dependency Removal
+
+ ### Summary
+ Removed {N} unnecessary third-party dependencies and replaced them with owned code.
+ Estimated supply chain attack surface reduction: {N} packages ({transitive count} including transitive deps).
+
+ ### Dependencies Removed
+ | Package | Replacement | Lines of Owned Code |
+ |---------|-------------|---------------------|
+ {table of removed packages}
+
+ ### Dependencies Kept (audited)
+ {count} dependencies audited and kept with rationale. See PLAN.md for details.
+
+ ### Replacement Code
+ {bulleted list of new utility files or inline changes}
+
+ ### Verification
+ - [ ] Build passes
+ - [ ] All tests pass
+ - [ ] No phantom references to removed packages
+ - [ ] Lock file updated
+ EOF
+ )"
+ ```
+
+ **GitLab:**
+ ```bash
+ glab mr create --source-branch depfree/{DATE} --target-branch {DEFAULT_BRANCH} \
+   --title "refactor: remove {N} unnecessary dependencies" --description "..."
+ ```
+
+ Record `PR_NUMBER` and `PR_URL`.
+
+ **GATE: If `--no-merge` was passed, STOP HERE.** Print the PR URL and summary.
+
+ ### 5b: CI Verification
+
+ 1. Wait 30 seconds for CI to start
+ 2. Poll CI status:
+ ```bash
+ gh pr checks {PR_NUMBER}
+ ```
+ Poll every 30 seconds, max 10 minutes.
+ 3. If CI fails:
+ - Fetch failure logs, diagnose, fix, commit, push
+ - Max 3 fix attempts before informing the user
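The polling in step 2 can be sketched as a small retry helper. A non-authoritative sketch; `poll_until` is a hypothetical name, and the real check command would be `gh pr checks {PR_NUMBER}` in place of the stub:

```bash
# Hypothetical helper: run a check command every INTERVAL seconds until it
# succeeds or MAX_TRIES is exhausted (20 tries x 30s gives the 10-minute cap).
poll_until() {
  max_tries="$1"; interval="$2"; shift 2
  try=1
  while [ "$try" -le "$max_tries" ]; do
    "$@" && return 0
    try=$((try + 1))
    sleep "$interval"
  done
  return 1
}

# Usage sketch: poll_until 20 30 gh pr checks "{PR_NUMBER}"
```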
+
+ ### 5c: Copilot Review Loop (GitHub only)
+
+ If `VCS_HOST` is `github`, run the Copilot review loop using the shared template:
+
+ !`cat ~/.claude/lib/copilot-review-loop.md`
+
+ Pass: `{PR_NUMBER}`, `{OWNER}/{REPO}`, `depfree/{DATE}`, and `{BUILD_CMD}`.
+
+ ### 5d: Merge
+
+ **Default mode**: Auto-merge if review is clean.
+ **Interactive mode**: Ask user for merge approval.
+
+ ```bash
+ gh pr merge {PR_NUMBER} --merge
+ ```
+
+
+ ## Phase 6: Cleanup
+
+ 1. Remove the worktree:
+ ```bash
+ git worktree remove {WORKTREE_DIR}
+ ```
+ 2. Delete the local and remote branches:
+ ```bash
+ git checkout {DEFAULT_BRANCH}
+ git branch -D depfree/{DATE}
+ if git ls-remote --exit-code --heads origin "depfree/{DATE}" >/dev/null 2>&1; then
+   git push origin --delete "depfree/{DATE}"
+ else
+   echo "warning: remote branch depfree/{DATE} not found or already deleted"
+ fi
+ ```
+ 3. Restore stashed changes if applicable:
+ ```bash
+ git stash pop
+ ```
+ 4. Update PLAN.md:
+ - Mark completed removals with `[x]`
+ - Add PR link
+ - Note any packages that were reverted
+ 5. Print the final summary:
+
+ ```
+ | Package          | Status   | Replacement              | Lines |
+ |------------------|----------|--------------------------|-------|
+ | {package}        | Removed  | {utility/native API}     | {N}   |
+ | {package}        | Kept     | {reason}                 | —     |
+ | {package}        | Reverted | {reason for failure}     | —     |
+
+ Total dependencies before: {before}
+ Total dependencies after: {after}
+ Packages removed: {count}
+ Owned replacement code: ~{lines} lines
+ Transitive deps eliminated: ~{count} (estimated)
+ ```
+
+
+ ## Error Recovery
+
+ - **Agent failure**: continue with remaining agents, note gaps in the summary
+ - **Build failure in worktree**: attempt fix; if unfixable, revert the problematic replacement and re-add the dependency
+ - **Push failure**: `git pull --rebase --autostash` then retry push
+ - **CI failure on PR**: investigate logs, fix, push (max 3 attempts)
+ - **Replacement too complex**: if an agent reports that replacement exceeds 2x estimated complexity, skip that dependency and keep it with a note
+ - **Test failure from replacement**: if tests fail and the fix isn't obvious, revert the replacement — a working dependency is better than broken owned code
+ - **Existing worktree found at startup**: ask user — resume or clean up
+
+ !`cat ~/.claude/lib/graphql-escaping.md`
+
+ ## Notes
+
+ - This command complements `/do:better` — run `depfree` for dependency hygiene, `better` for code quality
+ - All remediation happens in an isolated worktree — the user's working directory is never modified
+ - The threshold for "acceptable" libraries is deliberately generous — the goal is to remove obvious attack surface, not to rewrite everything
+ - Replacement code should be minimal and focused — don't over-engineer utilities that replace single-purpose packages
+ - When in doubt, keep the dependency. A maintained library is better than a buggy reimplementation
+ - devDependencies are lower priority since they don't ship to production, but unmaintained build tools still pose supply chain risk
+ - For monorepos, audit the root manifest and each workspace package manifest
@@ -14,6 +14,7 @@ List all available `/do:*` commands with their descriptions.
  |---|---|
  | `/do:better` | Unified DevSecOps audit, remediation, per-category PRs, CI verification, and Copilot review loop |
  | `/do:better-swift` | SwiftUI-optimized DevSecOps audit with multi-platform coverage (iOS, macOS, watchOS, tvOS, visionOS) |
+ | `/do:depfree` | Audit third-party dependencies and remove unnecessary ones by writing replacement code |
  | `/do:fpr` | Commit, push to fork, and open a PR against the upstream repo |
  | `/do:goals` | Scan codebase to infer project goals, clarify with user, and generate GOALS.md |
  | `/do:help` | List all available slashdo commands |
@@ -97,6 +97,7 @@ Check every file against this checklist. The checklist is organized into tiers
  - If the PR adds a new endpoint, trace where existing endpoints are registered and verify the new one is wired in all runtime adapters (serverless handler map, framework route file, API gateway config, local dev server) — a route registered in one adapter but missing from another will silently 404 in the missing runtime
  - If the PR adds a new call to an external service that has established mock/test infrastructure (mock mode flags, test helpers, dev stubs), verify the new call uses the same patterns — bypassing them makes the new code path untestable in offline/dev environments and inconsistent with existing integrations
  - If the PR adds a new UI component or client-side consumer against an existing API endpoint, read the actual endpoint handler or response shape — verify every field name, nesting level, identifier property, and response envelope path used in the consumer matches what the producer returns. This is the #1 source of "renders empty" bugs in new views built against existing APIs
+ - If the PR adds or modifies a discovery/catalog endpoint that enumerates available capabilities (actions, node types, valid options) for a downstream consumer API, trace the full enumerated set against the consumer's actual supported inputs: verify every advertised item can be consumed without error, every consumer-supported item is discoverable, and any identifier transformations (naming conventions, case conversions, key format changes) between discovery output and consumer input preserve the format the consumer expects — mismatches produce runtime errors that no amount of unit testing will catch because the two sides are tested independently
 
  **Push/real-time event scoping**
  - If the PR adds or modifies WebSocket, SSE, or pub/sub event emission, trace the event scope: does the event reach only the originating session/user, or is it broadcast to all connected clients? Check payloads for sensitive content (user inputs, images, tokens) that should not leak across sessions. If the consumer filters by a correlation ID, verify the producer includes one and that the ID is generated server-side or validated against the session
@@ -143,6 +144,7 @@ Check every file against this checklist. The checklist is organized into tiers
  **Sanitization/validation/normalization coverage**
  - If the PR introduces a new validation or sanitization function for a data field, trace every code path that writes to that field (create, update, import, sync, rename, raw/bulk persist) — verify they all use the same sanitization. Partial application is the #1 way invalid data re-enters through an unguarded path
  - If the PR adds a "raw" or bypass write path (e.g., `raw: true` flag, bulk import, migration backfill), compare the normalization it applies against what the standard read/parse path assumes — ID prefixes, required defaults, shape invariants. Data that passes through the raw path must still be valid when reloaded through the normal path
+ - If the PR adds a new dispatch branch within a multi-type handler (e.g., coercing a new data shape, handling a new entity subtype), trace sibling branches and verify the new one applies equivalent validation, type-checking, and error-handling constraints — new branches commonly bypass validation that existing branches enforce because the author focuses on happy-path behavior
 
  **Bootstrap/initialization ordering**
  - If the PR adds resilience or self-healing code (dependency installers, auto-repair, migration runners), trace the execution order: does the main code path resolve or import the dependencies BEFORE the resilience code runs? If so, the bootstrapper never executes when it's needed most — restructure so verification/installation precedes resolution
@@ -175,7 +177,7 @@ Check every file against this checklist. The checklist is organized into tiers
  - If the PR adds or reorders sequential steps/instructions, verify the ordering matches execution dependencies — readers following steps in order must not perform an action before its prerequisite
 
  **Transactional write integrity**
- - If the PR performs multi-item writes (database transactions, batch operations), verify each write includes condition expressions that prevent stale-read races (TOCTOU) — an unconditioned write after a read can upsert deleted records, double-count aggregates, or drive counters negative. Trace the gap between read and write for each operation
+ - If the PR performs multi-item writes (database transactions, batch operations), verify each write includes condition expressions that prevent stale-read races (TOCTOU) — an unconditioned write after a read can upsert deleted records, double-count aggregates, or drive counters negative. Trace the gap between read and write for each operation. Also verify that update/modify operations won't silently create records when the target key doesn't exist — database update operations often have implicit upsert semantics (e.g., DynamoDB UpdateItem, MongoDB update with upsert) that create partial records for invalid IDs; add existence condition expressions when the operation should only modify existing records
  - If the PR catches transaction/conditional failures, verify the error is translated to a client-appropriate status (409, 404) rather than bubbling as 500 — expected concurrency failures are not server errors
 
  **Batch/paginated API consumption**
@@ -213,6 +215,10 @@ Check every file against this checklist. The checklist is organized into tiers
  **Abstraction layer fidelity**
  - If the PR calls a third-party API through an internal wrapper/abstraction layer, trace whether the wrapper requests and forwards all fields the handler depends on — third-party APIs often have optional response attributes that require explicit opt-in (e.g., cancellation reasons, extended metadata). Code branching on fields the wrapper doesn't forward will silently receive `undefined` and take the wrong path. Also verify that test mocks match what the real wrapper returns, not what the underlying API could theoretically return
  - If the PR passes multiple parameters through a wrapper/abstraction layer to an underlying API, check whether any parameter combinations are mutually exclusive in the underlying API (e.g., projection expressions + count-only select modes) — the wrapper should strip conflicting parameters rather than forwarding all unconditionally, which causes validation errors at the underlying layer
+ - If the PR calls framework or library functions with discriminated input formats (e.g., content paths vs script paths, different loader functions per format), trace each call site to verify the function variant used actually handles the input format being passed — especially fallback/default branches in multi-format dispatchers, where the fallback commonly uses the wrong function. Also verify positional argument order matches the called function's parameter order (not assumed from variable names) and that the object type passed matches what the API expects (e.g., asset object vs class reference, property access vs method call)
+
+ **Parameter consumption tracing**
+ - If the PR adds a function with validated input parameters (schema validation, input decorators, type annotations), trace each validated parameter through to where it's actually consumed in the implementation. Parameters that pass validation but are never read create dead API surface — callers believe they're configuring behavior that's silently ignored. Either wire the parameter through or remove it from the public API
 
  **Summary/aggregation endpoint consistency**
  - If the PR adds a summary or dashboard endpoint that aggregates counts/previews across multiple data sources, trace each category's computation logic against the corresponding detail view it links to — verify they apply the same filters (e.g., orphan exclusion, status filtering), the same ordering guarantees (sort keys that actually exist on the queried index), and that navigation links propagate the aggregated context (e.g., `?status=pending`) so the destination page matches what the summary promised
@@ -240,6 +246,9 @@ Check every file against this checklist. The checklist is organized into tiers
  **Bulk vs single-item operation parity**
  - If the PR modifies a single-item CRUD operation (create, update, delete) to handle new fields or apply new logic, trace the corresponding bulk/batch operation for the same entity — it often has its own independent implementation that won't pick up the change. Verify both paths handle the same fields, apply the same validation, and preserve the same secondary data
 
+ **Bulk operation selection lifecycle**
+ - If the PR adds operations that act on a user-selected subset of items (bulk actions, batch operations), trace the complete lifecycle of the selection state: when is it cleared (data refresh, item deletion), when is it not cleared but should be (filter/sort/page changes), and whether the operation re-validates the selection at execution time (especially after confirmation dialogs where the underlying data may change between display and confirmation)
+
  **Config value provenance for auto-upgrade**
  - If the PR adds auto-upgrade logic that replaces config values with newer defaults (prompt versions, schema migrations, template updates), verify the code can distinguish "user customized this value" from "this is the previous default." Without provenance tracking (version stamps, customization flags, or comparison against known previous defaults), auto-upgrade will overwrite intentional user customizations or skip legitimate upgrades
 
@@ -54,6 +54,8 @@ Address the latest code review feedback on the current branch's pull request usi
 
  9. **Request another Copilot review** (only if `is_fork_pr=false`): After pushing fixes, request a fresh Copilot code review and repeat from step 3 until the review passes clean. **Skip for fork-to-upstream PRs.**
 
+ **Repeated-comment dedup**: When fetching threads after a new Copilot review round, compare each new unresolved thread's comment body and file/line against threads from the previous round that were intentionally left unresolved (replied to as non-issues or disagreements). If all new unresolved threads are repeats of previously-dismissed feedback, treat the review as clean (no new actionable comments) and exit the loop.
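The repeat check can be approximated by fingerprinting threads as `path:line` pairs. A sketch under the assumption that repeated feedback lands on the same file and line; `is_repeat_only` is a hypothetical helper operating on pre-extracted fingerprint files, one fingerprint per line:

```bash
# Hypothetical helper: the review is clean when every new unresolved
# fingerprint ($1) already appears in the previously-dismissed set ($2).
is_repeat_only() {
  sort -u "$1" > "${TMPDIR:-/tmp}/rpr_new.$$"
  sort -u "$2" > "${TMPDIR:-/tmp}/rpr_old.$$"
  new_only=$(comm -23 "${TMPDIR:-/tmp}/rpr_new.$$" "${TMPDIR:-/tmp}/rpr_old.$$")
  rm -f "${TMPDIR:-/tmp}/rpr_new.$$" "${TMPDIR:-/tmp}/rpr_old.$$"
  [ -z "$new_only" ]
}
```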
+
  10. **Report summary**: Print a table of all threads addressed with file, line, and a brief description of the fix. Include a final count line: "Resolved X/Y threads." If any threads remain unresolved, list them with reasons (unclear feedback, disagreement, requires user input).
 
  !`cat ~/.claude/lib/graphql-escaping.md`
@@ -78,7 +80,7 @@ Poll using GraphQL to check for a new review with a `submittedAt` timestamp afte
  gh api graphql -f query='{ repository(owner: "OWNER", name: "REPO") { pullRequest(number: PR_NUM) { reviews(last: 3) { nodes { state body author { login } submittedAt } } reviewThreads(first: 100) { nodes { id isResolved comments(first: 3) { nodes { body path line author { login } } } } } } } }'
  ```
 
- **Dynamic poll timing**: Before your first poll, check how long the most recent Copilot review on this PR took by comparing consecutive Copilot review `submittedAt` timestamps (or PR creation time for the first review). Use that duration as your expected wait. If no prior review exists, default to 5 minutes. Set poll interval to 60 seconds and max wait to **2x the expected duration** (minimum 5 minutes, maximum 20 minutes). Copilot reviews can take **10-15 minutes** for large diffs — do NOT give up early.
+ **Dynamic poll timing**: Before your first poll, check how long the most recent Copilot review on this PR took by comparing consecutive Copilot review `submittedAt` timestamps (or PR creation time for the first review). Use that duration as your expected wait. If no prior review exists, default to 5 minutes. Use **progressive poll intervals**: 15s, 15s, 30s, 30s, then 60s thereafter — small diffs often complete in under a minute, so early frequent checks avoid wasting time. Set max wait to **2x the expected duration** (minimum 5 minutes, maximum 20 minutes). Copilot reviews can take **10-15 minutes** for large diffs — do NOT give up early.
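The progressive schedule above can be sketched as a small lookup; `poll_interval` is a hypothetical helper, not part of the shared template:

```bash
# Hypothetical helper: seconds to wait before poll attempt N.
# 15s for attempts 1-2, 30s for 3-4, 60s thereafter.
poll_interval() {
  case "$1" in
    1|2) echo 15 ;;
    3|4) echo 30 ;;
    *)   echo 60 ;;
  esac
}

# Usage sketch: sleep "$(poll_interval "$attempt")"
```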
 
  The review is complete when a new `copilot-pull-request-reviewer` review node appears. If no review appears after max wait: **Default mode**: auto-skip and continue. **Interactive mode (`--interactive`)**: ask the user whether to continue waiting, re-request, or skip.
 
@@ -16,7 +16,7 @@
  **Runtime correctness**
  - Null/undefined access without guards, off-by-one errors, object spread of potentially-null values (spread of null is `{}`, silently discarding state) or non-object values (spreading a string produces indexed character keys, spreading an array produces numeric keys) — guard with a plain-object check before spreading
  - Data from external/user sources (parsed JSON, API responses, file reads) used without structural validation — guard against parse failures, missing properties, wrong types, and null elements before accessing nested values. When parsed data is optional enrichment, isolate failures so they don't abort the main operation
- - Type coercion edge cases — `Number('')` is `0` not empty, `0` is falsy in truthy checks, `NaN` comparisons are always false; string comparison operators (`<`, `>`, `localeCompare`) do lexicographic, not semantic, ordering (e.g., `"10" < "2"`). Use explicit type checks (`Number.isFinite()`, `!= null`) and dedicated libraries (e.g., semver for versions) instead of truthy guards or lexicographic ordering when zero/empty are valid values or semantic ordering matters. Boolean values round-tripping through text serialization (markdown metadata, query strings, form data, flat-file config) become strings — `"false"` is truthy in JavaScript, so truthiness checks on deserialized booleans silently treat explicit `false` as `true`. Use strict equality (`=== true`, `=== 'true'`) or a dedicated coercion function; ensure the same coercion is applied at every consumption site
+ - Type coercion edge cases — `Number('')` is `0` not empty, `0` is falsy in truthy checks, `NaN` comparisons are always false; string comparison operators (`<`, `>`, `localeCompare`) do lexicographic, not semantic, ordering (e.g., `"10" < "2"`). Use explicit type checks (`Number.isFinite()`, `!= null`) and dedicated libraries (e.g., semver for versions) instead of truthy guards or lexicographic ordering when zero/empty are valid values or semantic ordering matters. Boolean values round-tripping through text serialization (markdown metadata, query strings, form data, flat-file config) become strings — `"false"` is truthy in JavaScript, so truthiness checks on deserialized booleans silently treat explicit `false` as `true`. Use strict equality (`=== true`, `=== 'true'`) or a dedicated coercion function; ensure the same coercion is applied at every consumption site. Language type hierarchies may admit surprising subtypes through standard type-check predicates (`isinstance(x, int)` accepts `bool` in Python, `typeof NaN === 'number'` in JavaScript) — when validating numeric inputs, explicitly exclude known subtypes that would pass the check but produce wrong behavior
  - Functions that index into arrays without guarding empty arrays; aggregate operations (`every`, `some`, `reduce`) on potentially-empty collections returning vacuously true/default values that mask misconfiguration or missing data; state/variables declared but never updated or only partially wired up
  - Parallel arrays or tuples coupled by index position (e.g., a names array, a promises array, and a destructuring assignment that must stay aligned) — insertion or reordering in one silently misaligns all others. Use objects/maps keyed by a stable identifier instead
  - Shared mutable references — module-level defaults passed by reference mutate across calls (use `structuredClone()`/spread); `useCallback`/`useMemo` referencing a later `const` (temporal dead zone); object spread followed by unconditional assignment that clobbers spread values
@@ -41,7 +41,7 @@
 
  **Async & state consistency** _[applies when: code uses async/await, Promises, or UI state]_
  - Optimistic state changes (view switches, navigation, success callbacks) before async completion — if the operation fails or is cancelled, the UI is stuck with no rollback. Check return values/errors before calling success callbacks. Handle both failure and cancellation paths. Watch for `.catch(() => null)` followed by unconditional success code (toast, state update) — the catch silences the error but the success path still runs. Either let errors propagate naturally or check the return value before proceeding
- - Multiple coupled state variables updated independently — actions that change one must update all related fields; debounced/cancelable operations must reset loading state on every exit path (cleared, stale, failed, aborted). Component state initialized from props via `useState(prop)` only captures the initial value — if the prop updates asynchronously (data fetch, parent re-render), the local state goes stale. Sync with an effect when the user is not actively editing, or lift state to avoid the copy
+ - Multiple coupled state variables updated independently — actions that change one must update all related fields; debounced/cancelable operations must reset loading state on every exit path (cleared, stale, failed, aborted). Reference/selection sets that point to items in a data collection must be pruned when items are removed and invalidated when the collection is reloaded, filtered, paginated, or sorted — stale references send nonexistent IDs to downstream operations. Operations triggered from a confirmation dialog must re-validate preconditions (selection non-empty, items still exist) at execution time — the underlying data may change between dialog display and user confirmation. Component state initialized from props via `useState(prop)` only captures the initial value — if the prop updates asynchronously (data fetch, parent re-render), the local state goes stale. Sync with an effect when the user is not actively editing, or lift state to avoid the copy
  - Error notification at multiple layers (shared API client + component-level) — verify exactly one layer owns user-facing error messages. For periodic polling, also check that error notifications are throttled or deduplicated (only fire on state transitions like success→error, not on every failed iteration) and that failure doesn't make the UI section disappear entirely (component returning null when data is null/errored) — render an error or stale-data state instead of absence
  - Optimistic updates using full-collection snapshots for rollback — a second in-flight action gets clobbered. Use per-item rollback and functional state updaters after async gaps; sync optimistic changes to parent via callback or trigger refetch on remount. When appending items to a list optimistically, guard against duplicates (check existence before append) — concurrent or repeated operations can insert the same item multiple times
  - State updates guarded by truthiness of the new value (`if (arr?.length)`) — prevents clearing state when the source legitimately returns empty. Distinguish "no response" from "empty response"
@@ -78,11 +78,13 @@
  - Update/patch endpoints with explicit field allowlists (destructured picks, permitted-key arrays) — when the data model gains new configurable fields, the allowlist must be updated or the new fields are silently dropped on save. Trace from model definition to the update handler's field extraction to verify coverage
  - New endpoints/schemas should match validation patterns of existing similar endpoints — field limits, required fields, types, error handling. If validation exists on one endpoint for a param, the same param on other endpoints needs the same validation. API documentation schemas (OpenAPI, JSON Schema) must be structurally complete — array types require `items` definitions, required fields must be listed, and the documented shape must match what the implementation actually returns
  - Summary/aggregation endpoints that compute counts or previews via a different query path, filter set, or data source than the detail views they link to — users see inconsistent numbers between the dashboard and the destination page. Trace the computation logic in both paths and verify they apply the same filters, exclusions, and ordering guarantees (or document the intentional difference)
- - When a validation/sanitization/normalization function is introduced for a field, trace ALL write paths (create, update, sync, import, raw/bulk persist) partial application means invalid values re-enter through the unguarded path. This includes structural normalization (ID prefixes, required defaults, shape invariants) that the read/parse path depends on a "raw" write path that skips normalization produces data that changes identity or shape on reload
+ - Discovery or catalog endpoints that enumerate available capabilities (actions, supported types, valid options) for a downstream consumer must derive the enumerated set from or validate it against the consumer's actual supported set — advertising items the consumer can't handle produces runtime errors at consumption time, while omitting items the consumer supports makes them undiscoverable. If the catalog transforms identifiers (naming conventions, key formats) between producer and consumer, verify the transformation preserves the format the consumer expects
+ - When a validation/sanitization/normalization function is introduced for a field, trace ALL write paths (create, update, sync, import, raw/bulk persist) — partial application means invalid values re-enter through the unguarded path. This includes structural normalization (ID prefixes, required defaults, shape invariants) that the read/parse path depends on — a "raw" write path that skips normalization produces data that changes identity or shape on reload. Conversely, when a new code branch handles data similar to existing branches within the same function (e.g., a new data format, entity subtype, or input shape), verify it applies the same validation and coercion as its siblings — new branches that bypass established validation are the most common source of type-safety regressions
  - Stored config/settings merged with hardcoded defaults using shallow spread — nested objects in the stored copy entirely replace the default, dropping newly added default keys on upgrade. Use deep merge for nested config objects (while preserving explicit `null` to clear a field), or flatten the config structure so shallow merge suffices
- - Schema fields accepting values downstream code can't handle; Zod/schema stripping fields the service reads (silent `undefined`); config values persisted but silently ignored by the implementation — trace each field through schema → service → consumer. Update schemas derived from create schemas (e.g., `.partial()`) must also make nested object fields optional — shallow partial on a deeply-required schema rejects valid partial updates. Additionally, `.deepPartial()` or `.partial()` on schemas with `.default()` values will apply those defaults on update, silently overwriting existing persisted values with defaults — create explicit update schemas without defaults instead
+ - Schema fields accepting values downstream code can't handle; Zod/schema stripping fields the service reads (silent `undefined`); config values persisted but silently ignored by the implementation — trace each field through schema → service → consumer. Also check for parameters accepted and validated in the schema but never consumed by the implementation — dead API surface that misleads callers into believing they're configuring behavior that's silently ignored; remove unused parameters or wire them through to the implementation. Update schemas derived from create schemas (e.g., `.partial()`) must also make nested object fields optional — shallow partial on a deeply-required schema rejects valid partial updates. Additionally, `.deepPartial()` or `.partial()` on schemas with `.default()` values will apply those defaults on update, silently overwriting existing persisted values with defaults — create explicit update schemas without defaults instead
+ - Multi-part UI features (e.g., table header + rows) whose rendering is gated on different prop/condition subsets — if the header checks prop A while rows check prop B, partial provision causes structural misalignment (column count mismatch, orphaned interactive elements without handlers). Derive a single enablement boolean from the complete prop set and use it consistently across all participating components
  - Entity creation without case-insensitive uniqueness checks — names differing only in case (e.g., "MyAgent" vs "myagent") cause collisions in case-insensitive contexts (file paths, git branches, URLs). Normalize to lowercase before comparing
- - Code reading properties from API responses, framework-provided objects, or internal abstraction layers using field names the source doesn't populate or forward — silent `undefined`. Verify property names and nesting depth match the actual response shape (e.g., `response.items` vs `response.data.items`, `obj.placeId` vs `obj.id`, flat fields vs nested sub-objects). When building a new consumer against an existing API, check the producer's actual response — not assumed conventions. When branching on fields from a wrapped third-party API, confirm the wrapper actually requests and forwards those fields (e.g., optional response attributes that require explicit opt-in)
87
+ - Code reading properties from API responses, framework-provided objects, or internal abstraction layers using field names the source doesn't populate or forward — silent `undefined`. Verify property names and nesting depth match the actual response shape (e.g., `response.items` vs `response.data.items`, `obj.placeId` vs `obj.id`, flat fields vs nested sub-objects). When building a new consumer against an existing API, check the producer's actual response — not assumed conventions. When branching on fields from a wrapped third-party API, confirm the wrapper actually requests and forwards those fields (e.g., optional response attributes that require explicit opt-in). Also verify call sites pass inputs in the format the called function actually accepts — framework constructors with non-obvious positional argument order, loaders with format-specific variants (content paths vs script paths, asset objects vs class references), and accessor APIs with distinct method-vs-property semantics. Fallback branches in multi-format dispatchers commonly use the wrong function for the input type
86
88
  - Data model fields that have different names depending on the creation/write path (e.g., `createdAt` vs `created`) — code referencing only one naming convention silently misses records created through other paths. Trace all write paths to discover the actual field names in use. When new logic (access control, UI display, queries) checks only a newly introduced field, verify it falls back to any legacy field that existing records still use — otherwise records created before the migration are silently excluded or inaccessible. Also check entity identity keys: if code looks up or matches entities using a computed key (e.g., `e.id || e.externalId`), all code paths that perform the same lookup must use the same key computation — one path using `e.id` while another uses `e.id || e.externalId` causes mismatches for entities missing the primary key
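One way to keep every read path consistent is a single accessor that owns the fallback chain. A hedged sketch (the field names mirror the bullet's examples; `Rec`, `createdAtOf`, and `identityKeyOf` are hypothetical helpers):

```typescript
// Hypothetical sketch: centralize the new-field-with-legacy-fallback read
// so queries, access checks, and UI all treat old records identically.
type Rec = { createdAt?: string; created?: string; id?: string; externalId?: string };

function createdAtOf(r: Rec): string | undefined {
  return r.createdAt ?? r.created; // new field first, legacy fallback second
}

function identityKeyOf(r: Rec): string | undefined {
  return r.id ?? r.externalId; // every lookup path must use this same helper
}
```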
87
89
  - Entity type changes without invariant revalidation — when an entity has a discriminator field (type, kind, category) and the user changes it, all type-specific invariants must be enforced on the new type AND type-specific fields from the old type must be cleared or revalidated. A job changing from `shell` to `agent` without clearing `command`, or changing to `shell` without requiring `command`, leaves the entity in an invalid hybrid state that fails at runtime or resurfaces stale data
88
90
  - Invariant relationships between configuration flags (flag A implies flag B) not enforced across all layers — UI toggle handlers, API validation schemas, server default-application functions, and serialization/deserialization must all preserve the invariant. If any layer allows setting A=true with B=false (or vice versa), cascading defaults and toggle logic produce contradictory state. Trace the invariant through: UI state handlers, form submission, route validation, service defaults, and persistence round-trip
@@ -91,7 +93,7 @@
91
93
  - Validation functions that delegate to runtime-behavior computations (next schedule occurrence, URL reachability, resource resolution) — conflating "no result within search window" or "temporarily unavailable" with "invalid input" rejects valid configurations. Validate syntax and structure independently of runtime feasibility
92
94
  - Numeric values from strings used without `NaN`/type guards — `NaN` comparisons silently pass bounds checks. Clamp query params to safe lower bounds
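The NaN failure mode is worth seeing concretely: `NaN > max` and `NaN < min` are both `false`, so a naive bounds check accepts garbage. A minimal sketch (the function name and defaults are illustrative):

```typescript
// Hypothetical sketch: NaN compares false against every bound, so a plain
// `if (n > max)` check never rejects it — test finiteness before clamping.
function parseLimit(raw: string, fallback = 25, min = 1, max = 100): number {
  const n = Number(raw);
  if (!Number.isFinite(n)) return fallback; // catches NaN and ±Infinity
  return Math.min(Math.max(Math.trunc(n), min), max);
}
```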
93
95
  - UI elements hidden from navigation but still accessible via direct URL — enforce restrictions at the route level
94
- - Summary counters/accumulators that miss edge cases (removals, branch coverage, underflow on decrements — guard against going negative with lower-bound conditions); counters incremented before confirming the operation actually changed state — rejected, skipped, or no-op iterations inflate success counts. Silent operations in verbose sequences where all branches should print status
96
+ - Summary counters/accumulators that miss edge cases (removals, branch coverage, underflow on decrements — guard against going negative with lower-bound conditions); counters incremented before confirming the operation actually changed state — rejected, skipped, or no-op iterations inflate success counts. Batch operations that report overall success while silently logging per-item failures — callers see success but partial work was done; collect and return per-item failures in the response. Silent operations in verbose sequences where all branches should print status
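A sketch of the counting discipline described above (the shapes `ItemResult` and `applyBatch` are hypothetical): increment only after the operation confirms it changed state, and return per-item failures instead of a blanket success.

```typescript
// Hypothetical sketch: count only confirmed mutations and surface per-item
// failures in the response so callers see partial success, not "ok".
type ItemResult = { id: string; ok: boolean; error?: string };

function applyBatch(
  ids: string[],
  apply: (id: string) => boolean, // returns true only if state actually changed
): { changed: number; failures: ItemResult[] } {
  let changed = 0;
  const failures: ItemResult[] = [];
  for (const id of ids) {
    try {
      if (apply(id)) changed++; // increment after confirming the change
      else failures.push({ id, ok: false, error: "no-op" });
    } catch (e) {
      failures.push({ id, ok: false, error: String(e) });
    }
  }
  return { changed, failures };
}
```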
95
97
 
96
98
  **Concurrency & data integrity** _[applies when: code has shared state, database writes, or multi-step mutations]_
97
99
  - Shared mutable state accessed by concurrent requests without locking or atomic writes; multi-step read-modify-write cycles that can interleave — use conditional writes/optimistic concurrency (e.g., condition expressions, version checks) to close the gap between read and write; if the conditional write fails, surface a retryable error instead of letting it bubble as a 500
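The conditional-write pattern can be sketched with an in-memory store (everything here, including `VersionConflict`, is an illustrative stand-in for a real database's condition expressions): the write carries the version the reader saw, and a mismatch raises a distinct retryable error rather than bubbling as a 500.

```typescript
// Hypothetical in-memory sketch of optimistic concurrency: a write is
// rejected (retryably) when another writer bumped the version in between.
type Row = { value: string; version: number };
const store = new Map<string, Row>();

class VersionConflict extends Error {} // surface as retryable, not a 500

function conditionalWrite(key: string, value: string, expectedVersion: number): void {
  const current = store.get(key)?.version ?? 0;
  if (current !== expectedVersion) {
    throw new VersionConflict(`expected v${expectedVersion}, found v${current}`);
  }
  store.set(key, { value, version: current + 1 });
}
```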
@@ -103,7 +105,7 @@
103
105
 
104
106
  **Input handling** _[applies when: code accepts user/external input]_
105
107
  - Trimming values where whitespace is significant (API keys, tokens, passwords, base64) — only trim identifiers/names
106
- - Endpoints accepting unbounded arrays/collections without upper limits — enforce max size or move to background jobs. Also check internal operations that fan out unbounded parallel I/O (e.g., `Promise.all(files.map(readFile))`) — large collections risk EMFILE (too many open file descriptors) or memory exhaustion. Use a concurrency limiter or batch processing for collections that can grow without bound
108
+ - Endpoints accepting unbounded arrays/collections without upper limits — enforce max size or move to background jobs. Also validate element-level invariants (types, format, non-empty) and deduplicate — duplicate elements inflate operation counts, repeat side effects, and skew success/failure metrics. Also check internal operations that fan out unbounded parallel I/O (e.g., `Promise.all(files.map(readFile))`) — large collections risk EMFILE (too many open file descriptors) or memory exhaustion. Use a concurrency limiter or batch processing for collections that can grow without bound
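The fan-out fix can be sketched as a small worker-pool mapper (a minimal illustration, not a replacement for a library like `p-limit`): at most `limit` tasks are in flight at once, so a huge collection cannot exhaust file descriptors or memory the way a bare `Promise.all(items.map(...))` can.

```typescript
// Hypothetical sketch of a minimal concurrency limiter: N workers pull
// indices from a shared cursor, keeping at most `limit` tasks in flight.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  task: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++; // claim an index before awaiting (single-threaded, so safe)
      results[i] = await task(items[i]);
    }
  }
  const workers = Array.from({ length: Math.min(limit, items.length) }, worker);
  await Promise.all(workers);
  return results;
}
```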
107
109
  - Security/sanitization functions (redaction, escaping, validation) that only handle one input format — if data can arrive in multiple formats (JSON `"KEY": "value"`, shell `KEY=value`, URL-encoded, headers), the function must cover all formats present in the system or sensitive data leaks through the unhandled format
108
110
 
109
111
  ## Tier 3 — Domain-Specific (Check Only When File Type Matches)
@@ -146,6 +148,7 @@
146
148
  - Subprocess output buffered in memory without size limits — a noisy or stuck child process can cause unbounded memory growth. Cap in-memory buffers and truncate or stream to disk for long-running commands
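A hedged sketch of the buffer cap (the collector shape is hypothetical; a real implementation would sit in a `child_process` `data` handler and might stream the overflow to disk instead of dropping it):

```typescript
// Hypothetical sketch: cap in-memory accumulation of child-process output
// and record that truncation happened instead of growing without bound.
function makeCappedCollector(maxBytes: number) {
  const chunks: Buffer[] = [];
  let size = 0;
  let truncated = false;
  return {
    push(chunk: Buffer): void {
      if (truncated) return; // already at cap; drop further output
      const room = maxBytes - size;
      if (chunk.length <= room) {
        chunks.push(chunk);
        size += chunk.length;
      } else {
        chunks.push(chunk.subarray(0, room)); // keep only what fits
        size = maxBytes;
        truncated = true;
      }
    },
    result(): { output: string; truncated: boolean } {
      return { output: Buffer.concat(chunks).toString("utf8"), truncated };
    },
  };
}
```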
147
149
  - Platform-specific assumptions — hardcoded shell interpreters, `path.join()` backslashes breaking ESM imports. Use `pathToFileURL()` for dynamic imports
148
150
  - Naive whitespace splitting of command strings (`str.split(/\s+/)`) breaks quoted arguments — use a proper argv parser or explicitly disallow quoted/multi-word arguments when validating shell commands
151
+ - Shell expansions (brace `{a,b}`, glob `*`, tilde `~`, variable `$VAR`) suppressed by quoting context — single quotes prevent all expansion, so patterns like `--include='*.{ts,js}'` pass the literal braces to the command instead of expanding. Use multiple flags, unquoted brace expansion (bash-only), or other command-specific syntax when expansion is required
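When the caller controls argv, the sturdiest fix is to sidestep expansion entirely: build one flag per pattern and pass the array to `spawn`/`execFile` with no shell in between. A minimal sketch (the helper name is hypothetical):

```typescript
// Hypothetical sketch: rather than relying on shell brace expansion
// (which quoting suppresses), emit one --include flag per extension and
// pass the argv array directly to spawn/execFile — no shell, no surprises.
function includeFlags(extensions: string[]): string[] {
  return extensions.map((ext) => `--include=*.${ext}`);
}
```

Usage might look like `execFile("grep", ["-r", ...includeFlags(["ts", "js"]), "TODO", "src"])`, assuming the target command accepts repeated flags.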
149
152
 
150
153
  **Search & navigation** _[applies when: code implements search results or deep-linking]_
151
154
  - Search results linking to generic list pages instead of deep-linking to the specific record
@@ -205,7 +208,7 @@
205
208
  **Test coverage**
206
209
  - New logic/schemas/services without corresponding tests when similar existing code has tests
207
210
  - New error paths untestable because services throw generic errors instead of typed ones
208
- - Tests re-implementing logic under test instead of importing real exports — pass even when real code regresses
211
+ - Tests re-implementing logic under test instead of importing real exports — pass even when real code regresses. Includes tests that assert by inspecting function source code (string-matching implementation details) rather than calling the function and checking behavior — they break on harmless refactors while missing actual behavioral changes. Also tests that mutate global state at import time (module registries, sys.modules) without fixture-scoped cleanup — causes ordering-dependent failures across the test session
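The source-inspection anti-pattern is easy to demonstrate side by side (the `slugify` function is a hypothetical export standing in for real code under test): the brittle check string-matches the implementation, while the behavioral check calls the real export and asserts its output.

```typescript
// Hypothetical code under test.
function slugify(name: string): string {
  return name.trim().toLowerCase().replace(/\s+/g, "-");
}

// Brittle: breaks on harmless refactors (rename, restructure) yet keeps
// passing when the actual behavior regresses.
const brittlePass = slugify.toString().includes("toLowerCase");

// Behavioral: asserts observable output of the real export.
const behavioralPass = slugify("  My Agent ") === "my-agent";
```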
209
212
  - Tests depending on real wall-clock time or external dependencies when testing logic — use fake timers and mocks
210
213
  - Missing tests for trust-boundary enforcement — submit tampered values, verify server ignores them
211
214
  - Tests that exercise code paths depending on features the integration layer doesn't expose — they pass against mocks but the behavior can't trigger in production. Verify mocked responses match what the real dependency actually returns
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "slash-do",
3
- "version": "2.0.0",
3
+ "version": "2.2.0",
4
4
  "description": "Curated slash commands for AI coding assistants — Claude Code, OpenCode, Gemini CLI, and Codex",
5
5
  "author": "Adam Eivy <adam@eivy.com>",
6
6
  "license": "MIT",
@@ -56,7 +56,7 @@ const ENVIRONMENTS = {
56
56
  libDir: null,
57
57
  hooksDir: null,
58
58
  versionFile: path.join(HOME, '.codex', '.slashdo-version'),
59
- format: 'skill-md',
59
+ format: 'yaml-frontmatter',
60
60
  ext: null,
61
61
  namespacing: 'directory',
62
62
  libPathPrefix: null,
@@ -65,12 +65,6 @@ function toTomlHeader(fm) {
65
65
  return lines.join('\n');
66
66
  }
67
67
 
68
- function toSkillHeader(fm) {
69
- const lines = [];
70
- if (fm.description) lines.push(`# ${fm.description}`);
71
- return lines.join('\n');
72
- }
73
-
74
68
  function getTargetFilename(relPath, env) {
75
69
  const basename = path.basename(relPath, '.md');
76
70
  const dir = path.dirname(relPath);
@@ -114,9 +108,6 @@ function transformCommand(content, env, sourceLibDir) {
114
108
  case 'toml':
115
109
  header = toTomlHeader(frontmatter);
116
110
  break;
117
- case 'skill-md':
118
- header = toSkillHeader(frontmatter);
119
- break;
120
111
  default:
121
112
  header = toYamlFrontmatter(frontmatter);
122
113
  }