slash-do 2.0.0 → 2.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -24,7 +24,7 @@
  <p align="center">
  <img src="https://img.shields.io/npm/v/slash-do?style=flat-square&color=blue" alt="npm version" />
  <img src="https://img.shields.io/badge/environments-4-green?style=flat-square" alt="environments" />
- <img src="https://img.shields.io/badge/commands-12-orange?style=flat-square" alt="commands" />
+ <img src="https://img.shields.io/badge/commands-14-orange?style=flat-square" alt="commands" />
  <img src="https://img.shields.io/badge/license-MIT-lightgrey?style=flat-square" alt="license" />
  </p>

@@ -60,8 +60,9 @@ All commands live under the `do:` namespace:
  | `/do:rpr` | Resolve PR review feedback with parallel agents |
  | `/do:release` | Create a release PR with version bump and changelog |
  | `/do:review` | Deep code review against best practices |
- | `/do:better` | Full DevSecOps audit with 7-agent scan and remediation |
+ | `/do:better` | Full DevSecOps audit with 8-agent scan and remediation |
  | `/do:better-swift` | SwiftUI DevSecOps audit with multi-platform coverage |
+ | `/do:depfree` | Audit dependencies, remove unnecessary ones, write replacement code |
  | `/do:goals` | Generate GOALS.md from codebase analysis |
  | `/do:replan` | Review and clean up PLAN.md |
  | `/do:omd` | Audit and optimize markdown files |
@@ -5,7 +5,7 @@ argument-hint: "[--interactive] [--scan-only] [--no-merge] [path filter or focus

  # Better — Unified DevSecOps Pipeline

- Run the full DevSecOps lifecycle: audit the codebase with 7 deduplicated agents, consolidate findings, remediate in an isolated worktree, create **separate PRs per category** with SemVer bump, verify CI, run Copilot review loops, and merge.
+ Run the full DevSecOps lifecycle: audit the codebase with 8 deduplicated agents, consolidate findings, remediate in an isolated worktree, create **separate PRs per category** with SemVer bump, verify CI, run Copilot review loops, and merge.

  **Default mode: fully autonomous.** Uses Balanced model profile, proceeds through all phases without prompting, auto-merges PRs with clean reviews.

@@ -35,7 +35,7 @@ AskUserQuestion([
  header: "Model",
  multiSelect: false,
  options: [
- { label: "Quality", description: "Opus for all agents — fewest false positives, best fixes (highest cost, 7+ Opus agents)" },
+ { label: "Quality", description: "Opus for all agents — fewest false positives, best fixes (highest cost, 8+ Opus agents)" },
  { label: "Balanced (Recommended)", description: "Sonnet for audit and remediation — good quality at moderate cost" },
  { label: "Budget", description: "Haiku for audit, Sonnet for remediation — fastest and cheapest" }
  ]
@@ -47,7 +47,7 @@ Record the selection as `MODEL_PROFILE` and derive agent models from this table:

  | Agent Role | Quality | Balanced | Budget |
  |------------|---------|----------|--------|
- | Audit agents (7 Explore agents, Phase 1) | opus | sonnet | haiku |
+ | Audit agents (8 Explore agents, Phase 1) | opus | sonnet | haiku |
  | Remediation agents (general-purpose, Phase 3) | opus | sonnet | sonnet |

  Derive two variables:
@@ -121,7 +121,7 @@ Record as `BUILD_CMD` and `TEST_CMD`.

  Project conventions are already in your context. Pass relevant conventions to each agent.

- Launch 7 Explore agents in two batches. Each agent must report findings in this format:
+ Launch 8 Explore agents in two batches. Each agent must report findings in this format:
  ```
  - **[CRITICAL/HIGH/MEDIUM/LOW]** `file:line` - Description. Suggested fix: ... Complexity: Simple/Medium/Complex
  ```
@@ -174,7 +174,7 @@ Skip step 4 if steps 1-3 reveal the code is correct.
  Resilience: external calls without timeouts, missing fallback for unavailable downstream services, retry without backoff ceiling/jitter, missing health check endpoints
  Observability: production paths without structured logging, error logs missing reproduction context (request ID, input params), async flows without correlation IDs

- ### Batch 2 (2 agents after Batch 1 completes):
+ ### Batch 2 (3 agents after Batch 1 completes):

  **Model**: Same `AUDIT_MODEL` as Batch 1.

@@ -188,14 +188,27 @@ Skip step 4 if steps 1-3 reveal the code is correct.
  - **Database migrations**: exclusive-lock ALTER TABLE on large tables, CREATE INDEX without CONCURRENTLY, missing down migrations or untested rollback paths
  - General: framework-specific security issues, language-specific gotchas, domain-specific compliance, environment variable hygiene (missing `.env.example`, required env vars not validated at startup, secrets in config files that should be in env)

- 7. **Test Quality & Coverage**
+ 7. **Dependency Freedom**
+ Audit all third-party dependencies for necessity. Every small library is an attack surface — supply chain compromises are real and common.
+ Focus:
+ - Extract the full dependency list from the project manifest (`package.json`, `Cargo.toml`, `pyproject.toml`, `go.mod`, `Gemfile`, etc.)
+ - Classify each dependency into tiers:
+ - **Acceptable**: large, widely audited libraries (react, express, d3, three.js, next, vue, fastify, typescript, eslint, prisma, tailwindcss, tokio, serde, django, flask, pandas, etc.) — skip these
+ - **Suspect**: smaller libraries where we may use only 1-2 functions, wrappers over built-in APIs, single-purpose utilities
+ - **Removable**: libraries where the used functionality takes <50 lines to implement, wraps a now-native API (e.g., `crypto.randomUUID()` replacing uuid, `structuredClone` replacing lodash.cloneDeep, `Array.prototype.flat` replacing array-flatten, `node:fs/promises` replacing fs-extra for most uses), unmaintained with known vulnerabilities, or micro-packages (is-odd, is-number, left-pad tier)
+ - For each suspect/removable dependency: search all source files for imports, list every function/class/type used, count call sites, assess replacement complexity (Trivial <20 lines, Moderate 20-100, Complex 100-300, Infeasible 300+)
+ - Check maintenance status: last publish date, open security issues, known CVEs
+ - Report format: `**[SEVERITY]** {package-name} — {Tier}. Uses: {functions}. Call sites: {N} in {M} files. Replacement: {complexity}. Reason: {why removable}`
+ - Severity mapping: unmaintained with CVEs → CRITICAL, unmaintained without CVEs → HIGH, replaceable single-function usage → MEDIUM, suspect but complex replacement → LOW
+
+ 8. **Test Quality & Coverage**
  Uses Batch 1 findings as context to prioritize.
  Focus areas:

  **Coverage gaps:**
  - Missing test files for critical modules, untested edge cases, tests that only cover happy paths
  - Areas with high complexity (identified by agents 1-5) but no tests
- - Remediation changes from agents 1-6 that lack corresponding test coverage
+ - Remediation changes from agents 1-7 that lack corresponding test coverage

  **Vacuous tests (tests that don't actually test anything):**
  - Tests that assert on mocked return values instead of real behavior (testing the mock, not the code)
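As an editorial aside, the severity mapping in the Dependency Freedom agent above is mechanical enough to express as a small function. This is an illustrative sketch only; the field names (`maintained`, `hasCves`, `singleFunctionUse`) are hypothetical, not part of the command file:

```javascript
// Sketch of the severity mapping above (field names are illustrative):
// unmaintained + CVEs → CRITICAL; unmaintained → HIGH;
// replaceable single-function usage → MEDIUM; everything else → LOW.
function depSeverity(dep) {
  if (!dep.maintained && dep.hasCves) return 'CRITICAL';
  if (!dep.maintained) return 'HIGH';
  if (dep.singleFunctionUse) return 'MEDIUM';
  return 'LOW';
}

// An unmaintained package with a known CVE is the worst case:
console.log(depSeverity({ maintained: false, hasCves: true })); // → "CRITICAL"
```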
@@ -257,6 +270,7 @@ For each file touched by multiple categories, document why it was assigned to on
  ### Architecture & SOLID
  ### Bugs, Performance & Error Handling
  ### Stack-Specific
+ ### Dependency Freedom
  ### Test Quality & Coverage
  ```
@@ -267,6 +281,7 @@ For each file touched by multiple categories, document why it was assigned to on
  - Architecture → Architecture & SOLID → `architecture`
  - Bugs & Perf → Bugs, Performance & Error Handling → `bugs-perf`
  - Stack-Specific → Stack-Specific → `stack-specific`
+ - Dep Freedom → Dependency Freedom → `deps`
  - Tests → Test Quality & Coverage → `tests`

  ```
@@ -278,6 +293,7 @@ For each file touched by multiple categories, document why it was assigned to on
  | Architecture | ... | ... | ... | ... | ... |
  | Bugs & Perf | ... | ... | ... | ... | ... |
  | Stack-Specific | ... | ... | ... | ... | ... |
+ | Dep Freedom | ... | ... | ... | ... | ... |
  | Tests | ... | ... | ... | ... | ... |
  | TOTAL | ... | ... | ... | ... | ... |
  ```
@@ -332,6 +348,7 @@ If no shared utilities were identified, skip this step.
  - Architecture & SOLID
  - Bugs, Performance & Error Handling
  - Stack-Specific
+ - Dependency Freedom
  3. Only create tasks for categories that have actionable findings
  4. Spawn up to 5 general-purpose agents as teammates. **Pass `REMEDIATION_MODEL` as the `model` parameter on each agent.** If `REMEDIATION_MODEL` is `opus`, omit the parameter to inherit from session.
@@ -339,9 +356,13 @@ If no shared utilities were identified, skip this step.

  !`cat ~/.claude/lib/remediation-agent-template.md`

+ ### Dependency Freedom agent — special instructions:
+ The Dependency Freedom remediation agent has a unique task: for each removable dependency, it must (1) write replacement code (a utility function or an inline native API call), (2) update ALL import/require statements across the codebase, (3) remove the package from the manifest, and (4) regenerate the lock file (`npm install --package-lock-only` / `cargo check` / etc., per `/do:depfree` Phase 3c). After all replacements, verify no source file still references the removed package. See `/do:depfree` Phase 3b for the full agent template.
+
  ### Conflict avoidance:
  - Review all findings before task assignment. If two categories touch the same file, assign both sets of findings to the same agent.
  - Security agent gets priority on validation logic; DRY agent gets priority on import consolidation.
+ - Dependency Freedom agent gets priority on files that are solely import/usage sites of a removed package.

  </plan_and_remediate>

@@ -421,7 +442,7 @@ Before creating PRs, run a deep code review on all remediation changes to catch

  ## Phase 4c: Test Enhancement

- After internal code review passes, evaluate and enhance the project's test suite. This phase acts on Agent 7's findings AND ensures all remediation work from Phase 3 has proper test coverage.
+ After internal code review passes, evaluate and enhance the project's test suite. This phase acts on Agent 8's findings AND ensures all remediation work from Phase 3 has proper test coverage.

  ### 4c.0: Record Start SHA

@@ -433,7 +454,7 @@ PHASE_4C_START_SHA="$(git rev-parse HEAD)"

  ### 4c.1: Test Audit Triage

- Review Agent 7 findings from Phase 1 and categorize them:
+ Review Agent 8 (Test Quality & Coverage) findings from Phase 1 and categorize them:

  1. **`[VACUOUS]` findings** — tests that exist but don't test real behavior. These are the highest priority because they create a false sense of safety.
  2. **`[WEAK]` findings** — tests that partially cover behavior but miss important cases. Strengthen with additional assertions and edge cases.
@@ -535,7 +556,7 @@ Initialize `CREATED_CATEGORY_SLUGS=""` (empty space-delimited string). After eac
  For each category that has findings:
  1. Switch to `{DEFAULT_BRANCH}`: `git checkout {DEFAULT_BRANCH}`
  2. Create a category branch: `git checkout -b better/{CATEGORY_SLUG}`
- - Use slugs: `security`, `code-quality`, `dry`, `architecture`, `bugs-perf`, `stack-specific`, `tests`
+ - Use slugs: `security`, `code-quality`, `dry`, `architecture`, `bugs-perf`, `stack-specific`, `deps`, `tests`
  3. For each file assigned to this category in `FILE_OWNER_MAP`:
  - **Modified files**: `git checkout better/{DATE} -- {file_path}`
  - **New files (Added)**: `git checkout better/{DATE} -- {file_path}`
@@ -757,6 +778,7 @@ If merge fails (e.g., branch protection, merge conflicts from a prior PR):
  | Architecture | ... | ... | ... | #number | pass | approved |
  | Bugs & Perf | ... | ... | ... | #number | pass | approved |
  | Stack-Specific | ... | ... | ... | #number | pass | approved |
+ | Dep Freedom | ... | ... | ... | #number | pass | approved |
  | Tests | ... | ... | ... | #number | pass | approved |
  | TOTAL | ... | ... | ... | N PRs | | |

@@ -791,6 +813,7 @@ Test Enhancement Stats:
  - When extracting modules, always add backward-compatible re-exports in the original module to prevent cross-PR breakage
  - Version bump happens exactly once on the first category branch based on aggregate commit analysis
  - Only CRITICAL, HIGH, and MEDIUM findings are auto-remediated for code categories; LOW findings remain tracked in PLAN.md
+ - Dependency Freedom findings replace unnecessary third-party packages with owned code — see `/do:depfree` for standalone usage
  - Test Quality & Coverage findings are remediated in Phase 4c with a dedicated test enhancement agent that verifies tests fail when code is broken
  - GitLab projects skip the Copilot review loop entirely (Phase 6) and stop after MR creation
  - CI must pass on each PR before requesting Copilot review or merging
@@ -0,0 +1,529 @@
+ ---
+ description: Audit third-party dependencies and remove unnecessary ones by writing replacement code
+ argument-hint: "[--interactive] [--scan-only] [--no-merge] [specific packages to evaluate]"
+ ---
+
+ # Depfree — Dependency Freedom Audit
+
+ Audit all third-party dependencies, classify them as acceptable (large, widely audited) or suspect (small, replaceable), analyze the actual usage of suspect dependencies, and replace them with owned code where feasible.
+
+ Every small library is an attack surface. Supply chain compromises are real and common. Large, widely audited libraries (express, react, d3, three.js, next, vue, fastify, lodash-es, etc.) are acceptable. But for smaller libraries, or libraries where only one helper function is used, we should write the code ourselves.
+
+ **Default mode: fully autonomous.** Uses the Balanced model profile and proceeds through all phases without prompting.
+
+ **`--interactive` mode:** Pauses for classification approval, replacement review, and merge confirmation.
+
+ Parse `$ARGUMENTS` for:
+ - **`--interactive`**: pause at each decision point for user approval
+ - **`--scan-only`**: run Phases 0-2 only (audit and plan); skip remediation
+ - **`--no-merge`**: run through PR creation; skip merge
+ - **Specific packages**: limit the audit scope to the named packages (e.g., "chalk dotenv")
+
+ ## Configuration
+
+ ### Default Mode (autonomous)
+
+ Use the **Balanced** model profile automatically (`AUDIT_MODEL=sonnet`, `REMEDIATION_MODEL=sonnet`).
+
+ ### Interactive Mode (`--interactive`)
+
+ Present the user with configuration options using `AskUserQuestion`:
+
+ ```
+ AskUserQuestion([{
+ question: "Which model profile for audit and remediation agents?",
+ header: "Model",
+ multiSelect: false,
+ options: [
+ { label: "Quality", description: "Opus for all agents — fewest false positives, best replacements (highest cost)" },
+ { label: "Balanced (Recommended)", description: "Sonnet for audit and remediation — good quality at moderate cost" },
+ { label: "Budget", description: "Haiku for audit, Sonnet for remediation — fastest and cheapest" }
+ ]
+ }])
+ ```
+
+ Record the selection as `MODEL_PROFILE` and derive:
+ - `AUDIT_MODEL`: `opus` / `sonnet` / `haiku` based on profile
+ - `REMEDIATION_MODEL`: `opus` / `sonnet` / `sonnet` based on profile
+
+ When the resolved model is `opus`, **omit** the `model` parameter on the Agent call so the agent inherits the session's Opus version.
+
+ ## Compaction Guidance
+
+ When compacting during this workflow, always preserve:
+ - The `DEPENDENCY_MAP` (complete classification of all dependencies)
+ - All REMOVABLE findings with package names and usage details
+ - The current phase number and which phases remain
+ - All PR numbers and URLs created so far
+ - `BUILD_CMD`, `TEST_CMD`, `PROJECT_TYPE`, `WORKTREE_DIR`, `REPO_DIR` values
+ - `VCS_HOST`, `CLI_TOOL`, `DEFAULT_BRANCH`, `CURRENT_BRANCH`
+
+
+ ## Phase 0: Discovery & Setup
+
+ ### 0a: VCS Host Detection
+ Run `gh auth status` to check the GitHub CLI. If it fails, run `glab auth status` for GitLab.
+ - Set `VCS_HOST` to `github` or `gitlab`
+ - Set `CLI_TOOL` to `gh` or `glab`
+ - If neither is authenticated, warn the user and halt
+
+ ### 0b: Project Type Detection
+ Check for project manifests to determine the tech stack:
+ - `package.json` → Node.js (check for `next`, `react`, `vue`, `express`, etc.)
+ - `Cargo.toml` → Rust
+ - `pyproject.toml` / `requirements.txt` / `setup.py` → Python
+ - `go.mod` → Go
+ - `pom.xml` / `build.gradle` → Java/Kotlin
+ - `Gemfile` → Ruby
+ - `*.csproj` / `*.sln` → .NET
+
+ Record the detected stack as `PROJECT_TYPE`.
+
+ ### 0c: Build & Test Command Detection
+ Derive build and test commands from the project type:
+ - Node.js: check `package.json` scripts for `build`, `test`, `typecheck`, `lint`
+ - Rust: `cargo build`, `cargo test`
+ - Python: `pytest`, `python -m pytest`
+ - Go: `go build ./...`, `go test ./...`
+ - If ambiguous, check project conventions already in context
+
+ Record as `BUILD_CMD` and `TEST_CMD`.
+
+ ### 0d: State Snapshot
+ - Record `REPO_DIR` via `git rev-parse --show-toplevel`
+ - Record `CURRENT_BRANCH` via `git rev-parse --abbrev-ref HEAD`
+ - Record `DEFAULT_BRANCH` via `gh repo view --json defaultBranchRef --jq '.defaultBranchRef.name'` (or the `glab` equivalent)
+ - Record `IS_DIRTY` via `git status --porcelain`
+
+
+ ## Phase 1: Dependency Inventory
+
+ ### 1a: Extract All Dependencies
+
+ Based on `PROJECT_TYPE`, extract the full dependency list:
+
+ **Node.js:**
+ - Read `package.json` → `dependencies` and `devDependencies`
+ - Note: `devDependencies` used only in build/test are lower priority but still worth auditing
+ - Check for workspace packages (monorepo) in the `workspaces` field
+
+ **Rust:**
+ - Read `Cargo.toml` → `[dependencies]`, `[dev-dependencies]`, `[build-dependencies]`
+
+ **Python:**
+ - Read `pyproject.toml` → `[project.dependencies]`, `[project.optional-dependencies]`
+ - Or `requirements.txt`, `setup.py`
+
+ **Go:**
+ - Read `go.mod` → `require` block
+
+ **Ruby:**
+ - Read `Gemfile`
+
+ ### 1b: Classify Dependencies
+
+ For each dependency, classify it into one of three tiers:
+
+ **Tier 1 — ACCEPTABLE (keep without question):**
+ Large, widely audited, foundational libraries. Examples by ecosystem:
+ - **Node.js**: react, next, vue, express, fastify, hono, typescript, eslint, prettier, webpack, vite, jest, vitest, mocha, d3, three, prisma, drizzle, @types/*, tailwindcss, postcss
+ - **Rust**: tokio, serde, clap, reqwest, hyper, tracing, sqlx, axum, actix-web
+ - **Python**: django, flask, fastapi, sqlalchemy, pandas, numpy, scipy, pytest, requests, httpx, pydantic
+ - **Go**: standard library (no third-party package needed for most things)
+ - **Ruby**: rails, rspec, sidekiq, puma, devise
+ - Any dependency with >10M weekly downloads (npm) or an equivalent popularity metric for the ecosystem
+
+ **Tier 2 — SUSPECT (audit usage):**
+ Smaller libraries that may be doing something we can write ourselves. Indicators:
+ - <1M weekly downloads (npm) or equivalent
+ - Single-purpose utility (does one thing)
+ - We use only 1-2 functions from it
+ - Wrapper libraries that add thin abstractions over built-in APIs
+ - Libraries that replicate functionality available in newer language/runtime versions
+ - Abandoned or unmaintained (no commits in 12+ months, open security issues)
+
+ **Tier 3 — REMOVABLE (strong candidate for replacement):**
+ Libraries where the cost of owning the code is clearly lower than the supply chain risk:
+ - We use a single function that takes <50 lines to implement
+ - The library wraps a built-in API with minimal added value
+ - The library is unmaintained with known vulnerabilities
+ - The library's functionality is now available natively (e.g., `node:fs/promises` replacing `fs-extra` for most use cases, `structuredClone` replacing `lodash.cloneDeep`, `Array.prototype.flat` replacing `array-flatten`)
+ - Color/string utilities where we use 1-2 functions (e.g., using `chalk` just for `chalk.red()` when a 10-line ANSI wrapper suffices)
+ - UUID generation when `crypto.randomUUID()` is available
+ - Deep merge/clone when `structuredClone` suffices
+ - `dotenv` when the runtime supports `--env-file` natively
+ - Micro-packages of the `is-odd`, `is-number`, `left-pad` tier
+
+ Record the full classification as `DEPENDENCY_MAP`.
+
+ ### 1c: Usage Analysis (Tier 2 & 3 only)
+
+ For each Tier 2 and Tier 3 dependency, launch parallel Explore agents (using `AUDIT_MODEL`) to determine actual usage:
+
+ Each agent should:
+ 1. Search all source files for imports/requires of the package
+ 2. List every function, class, constant, or type imported from it
+ 3. Count call sites per imported symbol
+ 4. Assess the complexity of replacement:
+ - **Trivial** (<20 lines): simple wrapper, single utility function, type alias
+ - **Moderate** (20-100 lines): multi-function utility, needs tests, edge cases to handle
+ - **Complex** (100-300 lines): significant logic, crypto, parsing, protocol implementation
+ - **Infeasible** (300+ lines or requires deep domain expertise): keep the dependency
+ 5. Check whether the package has known vulnerabilities: `npm audit`, `cargo audit`, `pip-audit`, etc.
+ 6. Check the last publish date and maintenance status
+
+ Report format:
+ ```
+ - **{package-name}** — Tier {2|3}
+ - Imports: {list of imported symbols}
+ - Call sites: {count} across {N} files
+ - Functions used: {list with a brief description of each}
+ - Replacement complexity: {Trivial|Moderate|Complex|Infeasible}
+ - Maintenance: {last publish date, open issues, known CVEs}
+ - Recommendation: **REMOVE** / **KEEP** / **EVALUATE**
+ - Replacement sketch: {brief description of how to replace, if REMOVE}
+ ```
+
+ Wait for all agents to complete before proceeding.
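Steps 3 and 4 of the checklist above can be sketched as two small helpers. These are illustrative assumptions of mine (a real agent would search files on disk and likely use an AST rather than a crude regex), not part of the command:

```javascript
// Step 3 sketch: count call sites of an imported symbol in a source string.
// Crude token match; breaks on symbols containing regex metacharacters.
function countCallSites(source, symbol) {
  const matches = source.match(new RegExp(`\\b${symbol}\\s*\\(`, 'g'));
  return matches ? matches.length : 0;
}

// Step 4 sketch: the Trivial/Moderate/Complex/Infeasible bands by estimated lines.
function replacementComplexity(estimatedLines) {
  if (estimatedLines < 20) return 'Trivial';
  if (estimatedLines <= 100) return 'Moderate';
  if (estimatedLines <= 300) return 'Complex';
  return 'Infeasible';
}

console.log(countCallSites("red('a'); red('b'); green('c');", 'red')); // → 2
```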
+
+
+ ## Phase 2: Replacement Plan
+
+ 1. Read the existing `PLAN.md` (create it if it doesn't exist)
+ 2. Filter to only the REMOVE recommendations from Phase 1c
+ 3. For EVALUATE recommendations: **Default mode** — treat as KEEP (conservative). **Interactive mode** — present each to the user via `AskUserQuestion`
+ 4. Group removable dependencies by replacement strategy:
+ - **Native replacement**: a built-in API replaces the library (e.g., `crypto.randomUUID()`)
+ - **Inline replacement**: write a small utility function (e.g., an ANSI color wrapper)
+ - **Consolidation**: multiple small deps replaced by one owned utility module
+ 5. Estimate the total lines of replacement code needed
+ 6. Add a new section to PLAN.md:
+
+ ```markdown
+ ## Depfree Audit - {YYYY-MM-DD}
+
+ Summary: {N} total dependencies. {A} acceptable (Tier 1), {B} audited and kept (Tier 2), {C} to remove (Tier 3).
+ Estimated replacement code: ~{lines} lines across {files} new/modified files.
+
+ ### Dependencies to Remove
+ | Package | Tier | Used Functions | Call Sites | Replacement | Complexity | Risk |
+ |---------|------|----------------|------------|-------------|------------|------|
+ | ... | ... | ... | ... | ... | ... | ... |
+
+ ### Dependencies Kept (with rationale)
+ | Package | Tier | Reason Kept |
+ |---------|------|-------------|
+ | ... | ... | ... |
+
+ ### Replacement Tasks
+ For each dependency to remove:
+ - [ ] **{package}** — {strategy}. Replace {N} call sites in {M} files. Write {utility name} ({est. lines} lines). Complexity: {level}.
+ ```
+
+ 7. Print a summary table:
+ ```
+ | Status | Count | Examples |
+ |------------|-------|-----------------------------------|
+ | Acceptable | ... | react, express, typescript, ... |
+ | Kept | ... | {packages kept with reasons} |
+ | Removable | ... | {packages to remove} |
+ | Total | ... | |
+ ```
+
+ **GATE: If `--scan-only` was passed, STOP HERE.** Print the summary and exit.
+
+ **GATE: If no removable dependencies were found, print "All dependencies are justified" and exit.**
+
+ **Interactive mode**: Present the removal plan via `AskUserQuestion`:
+ ```
+ AskUserQuestion([{
+ question: "Dependency removal plan:\n{summary of packages to remove}\n\nProceed with replacement?",
+ options: [
+ { label: "Proceed", description: "Remove all listed dependencies and write replacement code" },
+ { label: "Review individually", description: "Let me approve/reject each removal" },
+ { label: "Abort", description: "Stop here — I'll review the plan manually" }
+ ]
+ }])
+ ```
+
+ If "Review individually": present each dependency with REMOVE/KEEP options, then proceed with only the approved removals.
+
+
+ ## Phase 3: Worktree Remediation
+
+ ### 3a: Setup
+
+ 1. If `IS_DIRTY` is true: `git stash --include-untracked -m "depfree: pre-audit stash"`
+ 2. Set `DATE` to today's date in YYYY-MM-DD format
+ 3. Create the worktree:
+ ```bash
+ git worktree add ../depfree-{DATE} -b depfree/{DATE}
+ ```
+ 4. Set `WORKTREE_DIR` to `../depfree-{DATE}`
+
+ ### 3b: Write Replacement Code
+
+ For each dependency to remove, spawn a general-purpose agent (using `REMEDIATION_MODEL`) with these instructions:
+
+ ```
+ <context>
+ Project type: {PROJECT_TYPE}
+ Build command: {BUILD_CMD}
+ Test command: {TEST_CMD}
+ Working directory: {WORKTREE_DIR} (this is a git worktree — all work happens here)
+ </context>
+
+ <task>
+ Remove the dependency on `{PACKAGE_NAME}` and replace it with owned code.
+
+ Current usage:
+ {USAGE_DETAILS from Phase 1c — imported symbols, call sites, files}
+
+ Replacement strategy: {STRATEGY from Phase 2}
+
+ Steps:
+ 1. Write the replacement code (utility function, inline replacement, or native API call)
+ 2. Update ALL import/require statements across the codebase to use the new code
+ 3. Remove the package from the manifest ({package.json, Cargo.toml, etc.})
+ 4. Run `{BUILD_CMD}` to verify compilation
+ 5. Run `{TEST_CMD}` to verify tests pass
+ 6. If tests reference the removed package directly (mocking it, importing test helpers from it), update those tests too
+ </task>
+
+ <guardrails>
+ - The replacement must preserve behavior for all currently-used call sites and documented invariants
+ - You may omit handling for input shapes or edge cases that are provably unreachable based on {USAGE_DETAILS}, but do not narrow behavior for any actual call site
+ - Do NOT introduce new dependencies to replace old ones
+ - Do NOT use `git add -A` or `git add .` — stage specific files only
+ - Keep replacement code minimal
+ - If the replacement is more complex than estimated (>2x the estimated lines), report back and skip — do not force a bad replacement
+ - Place shared utility replacements in a sensible location (e.g., `src/utils/`, `lib/`, `internal/`) following existing project conventions
+ - Commit each replacement independently: `refactor: replace {package} with owned {utility/code}`
+ </guardrails>
+ ```
+
+ **Parallelization**: Launch up to 5 agents in parallel. If there are >5 dependencies to remove, batch them. Assign each agent a non-overlapping set of dependencies (no two agents should modify the same files — if overlap exists, group those dependencies into one agent).
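As a concrete instance of an "inline replacement" under this template: the `chalk.red()` case mentioned in the Tier 3 list reduces to a wrapper of a few lines. A sketch assuming only plain foreground colors are used at the actual call sites (codes 31/32 set red/green; 39 resets the foreground):

```javascript
// Minimal ANSI color helper replacing a color library that was only
// used for red/green text. Not suitable if styles like bold are needed.
const color = (code) => (s) => `\u001b[${code}m${s}\u001b[39m`;
const red = color(31);
const green = color(32);

console.log(red('removed'), green('kept'));
```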
+
+ ### 3c: Lock File Update
+
+ After all replacement agents complete:
+ 1. Remove all replaced packages from the lock file:
+ ```bash
+ cd {WORKTREE_DIR}
+ # Node.js: refresh the lockfile only, without running lifecycle scripts
+ npm install --package-lock-only --ignore-scripts
+ # Or: yarn install --mode=update-lockfile --ignore-scripts
+ # Or: pnpm install --lockfile-only --ignore-scripts
+ # Rust: let a check refresh Cargo.lock to reflect manifest changes only
+ cargo check
+ # Python: use the project's lock tool to refresh
+ # poetry lock --no-update
+ # pip-compile requirements.in
+ ```
+ 2. Commit the lock file update:
+ ```bash
+ git -C {WORKTREE_DIR} add {lock file}
+ git -C {WORKTREE_DIR} commit -m "chore: update lock file after dependency removal"
+ ```
+
+
+ ## Phase 4: Verification
+
+ ### 4a: Build & Test
+
+ 1. Run the full build:
+ ```bash
+ cd {WORKTREE_DIR} && {BUILD_CMD}
+ ```
+ 2. Run all tests:
+ ```bash
+ cd {WORKTREE_DIR} && {TEST_CMD}
+ ```
+ 3. If the build or tests fail:
+ - Identify which replacement caused the failure
+ - Attempt to fix it in a new commit
+ - If unfixable, revert the replacement commit, which re-adds the dependency by restoring its manifest entry (then re-run the lock file refresh from Phase 3c):
+ ```bash
+ git -C {WORKTREE_DIR} revert <sha>
+ ```
+ Note the reverted package as "kept — replacement failed"
+
+ ### 4b: Internal Code Review
+
+ 1. Generate the diff:
+ ```bash
+ cd {WORKTREE_DIR} && git diff {DEFAULT_BRANCH}...HEAD
+ ```
+ 2. Review all replacement code for:
+ - Functional equivalence (does the replacement handle the same inputs/outputs?)
+ - Missing edge cases that the original library handled
+ - Security regressions (e.g., replacing a sanitization library with a naive regex)
+ - Performance regressions (e.g., replacing an optimized parser with O(n^2) code)
+ - Correct error handling at system boundaries
+ 3. Fix any issues found, committing each fix separately
+
+ ### 4c: Verify No Phantom Dependencies
+
+ Confirm that no source file still references a removed package:
+ ```bash
+ cd {WORKTREE_DIR}
+ for pkg in {REMOVED_PACKAGES}; do
+ grep -r "$pkg" \
+ --include='*.ts' \
+ --include='*.js' \
+ --include='*.tsx' \
+ --include='*.jsx' \
+ --include='*.py' \
+ --include='*.rs' \
+ --include='*.go' \
+ --include='*.rb' \
+ . && echo "WARN: $pkg still referenced"
+ done
+ # NOTE: a bare-name grep can also match comments and unrelated strings,
+ # so review the hits before treating them as phantom dependencies
+ ```
+ Fix any remaining references.
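When a script is preferable to the grep loop, the same phantom-reference check can be expressed in-process. A sketch of my own (the regex targets JS-style `require`/`from` imports only, so it is narrower than the multi-language grep above):

```javascript
// Return the subset of removed packages that a source string still imports.
function findPhantoms(source, removedPackages) {
  return removedPackages.filter((pkg) =>
    new RegExp(`(require\\(|from )['"]${pkg}['"]`).test(source),
  );
}

const src = "const pad = require('left-pad');\nimport x from 'kept-pkg';";
console.log(findPhantoms(src, ['left-pad', 'is-odd'])); // → [ 'left-pad' ]
```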
+
+
+ ## Phase 5: PR Creation
+
+ ### 5a: Push & Create PR
+
+ ```bash
+ cd {WORKTREE_DIR}
+ git push -u origin depfree/{DATE}
+ ```
+
+ Create the PR:
+
+ **GitHub:**
+ ```bash
+ gh pr create --head depfree/{DATE} --base {DEFAULT_BRANCH} \
+   --title "refactor: remove {N} unnecessary dependencies" \
+   --body "$(cat <<'EOF'
+ ## Depfree Audit — Dependency Removal
+
+ ### Summary
+ Removed {N} unnecessary third-party dependencies and replaced them with owned code.
+ Estimated supply chain attack surface reduction: {N} packages ({transitive count} including transitive deps).
+
+ ### Dependencies Removed
+ | Package | Replacement | Lines of Owned Code |
+ |---------|-------------|---------------------|
+ {table of removed packages}
+
+ ### Dependencies Kept (audited)
+ {count} dependencies audited and kept with rationale. See PLAN.md for details.
+
+ ### Replacement Code
+ {bulleted list of new utility files or inline changes}
+
+ ### Verification
+ - [ ] Build passes
+ - [ ] All tests pass
+ - [ ] No phantom references to removed packages
+ - [ ] Lock file updated
+ EOF
+ )"
+ ```
+
+ **GitLab:**
+ ```bash
+ glab mr create --source-branch depfree/{DATE} --target-branch {DEFAULT_BRANCH} \
+   --title "refactor: remove {N} unnecessary dependencies" --description "..."
+ ```
+
+ Record `PR_NUMBER` and `PR_URL`.
+
+ **GATE: If `--no-merge` was passed, STOP HERE.** Print the PR URL and summary.
+
+ ### 5b: CI Verification
+
+ 1. Wait 30 seconds for CI to start
+ 2. Poll CI status:
+ ```bash
+ gh pr checks {PR_NUMBER}
+ ```
+ Poll every 30 seconds, max 10 minutes.
+ 3. If CI fails:
+ - Fetch failure logs, diagnose, fix, commit, push
+ - Max 3 fix attempts before informing the user
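The polling in step 2 can be sketched as a small retry helper. A non-authoritative sketch; `poll_until` is a hypothetical name, and the real check command would be `gh pr checks {PR_NUMBER}` in place of the stub:

```bash
# Hypothetical helper: run a check command every INTERVAL seconds until it
# succeeds or MAX_TRIES is exhausted (20 tries x 30s gives the 10-minute cap).
poll_until() {
  max_tries="$1"; interval="$2"; shift 2
  try=1
  while [ "$try" -le "$max_tries" ]; do
    "$@" && return 0
    try=$((try + 1))
    sleep "$interval"
  done
  return 1
}

# Usage sketch: poll_until 20 30 gh pr checks "{PR_NUMBER}"
```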
+
+ ### 5c: Copilot Review Loop (GitHub only)
+
+ If `VCS_HOST` is `github`, run the Copilot review loop using the shared template:
+
+ !`cat ~/.claude/lib/copilot-review-loop.md`
+
+ Pass: `{PR_NUMBER}`, `{OWNER}/{REPO}`, `depfree/{DATE}`, and `{BUILD_CMD}`.
+
+ ### 5d: Merge
+
+ **Default mode**: Auto-merge if review is clean.
+ **Interactive mode**: Ask user for merge approval.
+
+ ```bash
+ gh pr merge {PR_NUMBER} --merge
+ ```
+
+
+ ## Phase 6: Cleanup
+
+ 1. Remove the worktree:
+ ```bash
+ git worktree remove {WORKTREE_DIR}
+ ```
+ 2. Delete the local and remote branches:
+ ```bash
+ git checkout {DEFAULT_BRANCH}
+ git branch -D depfree/{DATE}
+ if git ls-remote --exit-code --heads origin "depfree/{DATE}" >/dev/null 2>&1; then
+   git push origin --delete "depfree/{DATE}"
+ else
+   echo "warning: remote branch depfree/{DATE} not found or already deleted"
+ fi
+ ```
+ 3. Restore stashed changes if applicable:
+ ```bash
+ git stash pop
+ ```
+ 4. Update PLAN.md:
+ - Mark completed removals with `[x]`
+ - Add PR link
+ - Note any packages that were reverted
+ 5. Print the final summary:
+
+ ```
+ | Package          | Status   | Replacement              | Lines |
+ |------------------|----------|--------------------------|-------|
+ | {package}        | Removed  | {utility/native API}     | {N}   |
+ | {package}        | Kept     | {reason}                 | —     |
+ | {package}        | Reverted | {reason for failure}     | —     |
+
+ Total dependencies before: {before}
+ Total dependencies after: {after}
+ Packages removed: {count}
+ Owned replacement code: ~{lines} lines
+ Transitive deps eliminated: ~{count} (estimated)
+ ```
+
+
+ ## Error Recovery
+
+ - **Agent failure**: continue with remaining agents, note gaps in the summary
+ - **Build failure in worktree**: attempt fix; if unfixable, revert the problematic replacement and re-add the dependency
+ - **Push failure**: `git pull --rebase --autostash` then retry push
+ - **CI failure on PR**: investigate logs, fix, push (max 3 attempts)
+ - **Replacement too complex**: if an agent reports that replacement exceeds 2x estimated complexity, skip that dependency and keep it with a note
+ - **Test failure from replacement**: if tests fail and the fix isn't obvious, revert the replacement — a working dependency is better than broken owned code
+ - **Existing worktree found at startup**: ask user — resume or clean up
+
+ !`cat ~/.claude/lib/graphql-escaping.md`
+
+ ## Notes
+
+ - This command complements `/do:better` — run `depfree` for dependency hygiene, `better` for code quality
+ - All remediation happens in an isolated worktree — the user's working directory is never modified
+ - The threshold for "acceptable" libraries is deliberately generous — the goal is to remove obvious attack surface, not to rewrite everything
+ - Replacement code should be minimal and focused — don't over-engineer utilities that replace single-purpose packages
+ - When in doubt, keep the dependency. A maintained library is better than a buggy reimplementation
+ - devDependencies are lower priority since they don't ship to production, but unmaintained build tools still pose supply chain risk
+ - For monorepos, audit the root manifest and each workspace package manifest
@@ -14,6 +14,7 @@ List all available `/do:*` commands with their descriptions.
  |---|---|
  | `/do:better` | Unified DevSecOps audit, remediation, per-category PRs, CI verification, and Copilot review loop |
  | `/do:better-swift` | SwiftUI-optimized DevSecOps audit with multi-platform coverage (iOS, macOS, watchOS, tvOS, visionOS) |
+ | `/do:depfree` | Audit third-party dependencies and remove unnecessary ones by writing replacement code |
  | `/do:fpr` | Commit, push to fork, and open a PR against the upstream repo |
  | `/do:goals` | Scan codebase to infer project goals, clarify with user, and generate GOALS.md |
  | `/do:help` | List all available slashdo commands |
@@ -97,6 +97,7 @@ Check every file against this checklist. The checklist is organized into tiers
  - If the PR adds a new endpoint, trace where existing endpoints are registered and verify the new one is wired in all runtime adapters (serverless handler map, framework route file, API gateway config, local dev server) — a route registered in one adapter but missing from another will silently 404 in the missing runtime
  - If the PR adds a new call to an external service that has established mock/test infrastructure (mock mode flags, test helpers, dev stubs), verify the new call uses the same patterns — bypassing them makes the new code path untestable in offline/dev environments and inconsistent with existing integrations
  - If the PR adds a new UI component or client-side consumer against an existing API endpoint, read the actual endpoint handler or response shape — verify every field name, nesting level, identifier property, and response envelope path used in the consumer matches what the producer returns. This is the #1 source of "renders empty" bugs in new views built against existing APIs
+ - If the PR adds or modifies a discovery/catalog endpoint that enumerates available capabilities (actions, node types, valid options) for a downstream consumer API, trace the full enumerated set against the consumer's actual supported inputs: verify every advertised item can be consumed without error, every consumer-supported item is discoverable, and any identifier transformations (naming conventions, case conversions, key format changes) between discovery output and consumer input preserve the format the consumer expects — mismatches produce runtime errors that no amount of unit testing will catch because the two sides are tested independently
 
  **Push/real-time event scoping**
  - If the PR adds or modifies WebSocket, SSE, or pub/sub event emission, trace the event scope: does the event reach only the originating session/user, or is it broadcast to all connected clients? Check payloads for sensitive content (user inputs, images, tokens) that should not leak across sessions. If the consumer filters by a correlation ID, verify the producer includes one and that the ID is generated server-side or validated against the session
@@ -143,6 +144,7 @@ Check every file against this checklist. The checklist is organized into tiers
  **Sanitization/validation/normalization coverage**
  - If the PR introduces a new validation or sanitization function for a data field, trace every code path that writes to that field (create, update, import, sync, rename, raw/bulk persist) — verify they all use the same sanitization. Partial application is the #1 way invalid data re-enters through an unguarded path
  - If the PR adds a "raw" or bypass write path (e.g., `raw: true` flag, bulk import, migration backfill), compare the normalization it applies against what the standard read/parse path assumes — ID prefixes, required defaults, shape invariants. Data that passes through the raw path must still be valid when reloaded through the normal path
+ - If the PR adds a new dispatch branch within a multi-type handler (e.g., coercing a new data shape, handling a new entity subtype), trace sibling branches and verify the new one applies equivalent validation, type-checking, and error-handling constraints — new branches commonly bypass validation that existing branches enforce because the author focuses on happy-path behavior
 
  **Bootstrap/initialization ordering**
  - If the PR adds resilience or self-healing code (dependency installers, auto-repair, migration runners), trace the execution order: does the main code path resolve or import the dependencies BEFORE the resilience code runs? If so, the bootstrapper never executes when it's needed most — restructure so verification/installation precedes resolution
@@ -175,7 +177,7 @@ Check every file against this checklist. The checklist is organized into tiers
  - If the PR adds or reorders sequential steps/instructions, verify the ordering matches execution dependencies — readers following steps in order must not perform an action before its prerequisite
 
  **Transactional write integrity**
- - If the PR performs multi-item writes (database transactions, batch operations), verify each write includes condition expressions that prevent stale-read races (TOCTOU) — an unconditioned write after a read can upsert deleted records, double-count aggregates, or drive counters negative. Trace the gap between read and write for each operation
+ - If the PR performs multi-item writes (database transactions, batch operations), verify each write includes condition expressions that prevent stale-read races (TOCTOU) — an unconditioned write after a read can upsert deleted records, double-count aggregates, or drive counters negative. Trace the gap between read and write for each operation. Also verify that update/modify operations won't silently create records when the target key doesn't exist — database update operations often have implicit upsert semantics (e.g., DynamoDB UpdateItem, MongoDB update with upsert) that create partial records for invalid IDs; add existence condition expressions when the operation should only modify existing records
  - If the PR catches transaction/conditional failures, verify the error is translated to a client-appropriate status (409, 404) rather than bubbling as 500 — expected concurrency failures are not server errors
 
  **Batch/paginated API consumption**
@@ -213,6 +215,10 @@ Check every file against this checklist. The checklist is organized into tiers
  **Abstraction layer fidelity**
  - If the PR calls a third-party API through an internal wrapper/abstraction layer, trace whether the wrapper requests and forwards all fields the handler depends on — third-party APIs often have optional response attributes that require explicit opt-in (e.g., cancellation reasons, extended metadata). Code branching on fields the wrapper doesn't forward will silently receive `undefined` and take the wrong path. Also verify that test mocks match what the real wrapper returns, not what the underlying API could theoretically return
  - If the PR passes multiple parameters through a wrapper/abstraction layer to an underlying API, check whether any parameter combinations are mutually exclusive in the underlying API (e.g., projection expressions + count-only select modes) — the wrapper should strip conflicting parameters rather than forwarding all unconditionally, which causes validation errors at the underlying layer
+ - If the PR calls framework or library functions with discriminated input formats (e.g., content paths vs script paths, different loader functions per format), trace each call site to verify the function variant used actually handles the input format being passed — especially fallback/default branches in multi-format dispatchers, where the fallback commonly uses the wrong function. Also verify positional argument order matches the called function's parameter order (not assumed from variable names) and that the object type passed matches what the API expects (e.g., asset object vs class reference, property access vs method call)
+
+ **Parameter consumption tracing**
+ - If the PR adds a function with validated input parameters (schema validation, input decorators, type annotations), trace each validated parameter through to where it's actually consumed in the implementation. Parameters that pass validation but are never read create dead API surface — callers believe they're configuring behavior that's silently ignored. Either wire the parameter through or remove it from the public API
 
  **Summary/aggregation endpoint consistency**
  - If the PR adds a summary or dashboard endpoint that aggregates counts/previews across multiple data sources, trace each category's computation logic against the corresponding detail view it links to — verify they apply the same filters (e.g., orphan exclusion, status filtering), the same ordering guarantees (sort keys that actually exist on the queried index), and that navigation links propagate the aggregated context (e.g., `?status=pending`) so the destination page matches what the summary promised
@@ -240,6 +246,9 @@ Check every file against this checklist. The checklist is organized into tiers
  **Bulk vs single-item operation parity**
  - If the PR modifies a single-item CRUD operation (create, update, delete) to handle new fields or apply new logic, trace the corresponding bulk/batch operation for the same entity — it often has its own independent implementation that won't pick up the change. Verify both paths handle the same fields, apply the same validation, and preserve the same secondary data
 
+ **Bulk operation selection lifecycle**
+ - If the PR adds operations that act on a user-selected subset of items (bulk actions, batch operations), trace the complete lifecycle of the selection state: when is it cleared (data refresh, item deletion), when is it not cleared but should be (filter/sort/page changes), and whether the operation re-validates the selection at execution time (especially after confirmation dialogs where the underlying data may change between display and confirmation)
+
  **Config value provenance for auto-upgrade**
  - If the PR adds auto-upgrade logic that replaces config values with newer defaults (prompt versions, schema migrations, template updates), verify the code can distinguish "user customized this value" from "this is the previous default." Without provenance tracking (version stamps, customization flags, or comparison against known previous defaults), auto-upgrade will overwrite intentional user customizations or skip legitimate upgrades
 
@@ -54,6 +54,8 @@ Address the latest code review feedback on the current branch's pull request usi
 
  9. **Request another Copilot review** (only if `is_fork_pr=false`): After pushing fixes, request a fresh Copilot code review and repeat from step 3 until the review passes clean. **Skip for fork-to-upstream PRs.**
 
+ **Repeated-comment dedup**: When fetching threads after a new Copilot review round, compare each new unresolved thread's comment body and file/line against threads from the previous round that were intentionally left unresolved (replied to as non-issues or disagreements). If all new unresolved threads are repeats of previously-dismissed feedback, treat the review as clean (no new actionable comments) and exit the loop.
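The repeat check can be approximated by fingerprinting threads as `path:line` pairs. A sketch under the assumption that repeated feedback lands on the same file and line; `is_repeat_only` is a hypothetical helper operating on pre-extracted fingerprint files, one fingerprint per line:

```bash
# Hypothetical helper: the review is clean when every new unresolved
# fingerprint ($1) already appears in the previously-dismissed set ($2).
is_repeat_only() {
  sort -u "$1" > "${TMPDIR:-/tmp}/rpr_new.$$"
  sort -u "$2" > "${TMPDIR:-/tmp}/rpr_old.$$"
  new_only=$(comm -23 "${TMPDIR:-/tmp}/rpr_new.$$" "${TMPDIR:-/tmp}/rpr_old.$$")
  rm -f "${TMPDIR:-/tmp}/rpr_new.$$" "${TMPDIR:-/tmp}/rpr_old.$$"
  [ -z "$new_only" ]
}
```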
+
  10. **Report summary**: Print a table of all threads addressed with file, line, and a brief description of the fix. Include a final count line: "Resolved X/Y threads." If any threads remain unresolved, list them with reasons (unclear feedback, disagreement, requires user input).
 
  !`cat ~/.claude/lib/graphql-escaping.md`
@@ -78,7 +80,7 @@ Poll using GraphQL to check for a new review with a `submittedAt` timestamp afte
  gh api graphql -f query='{ repository(owner: "OWNER", name: "REPO") { pullRequest(number: PR_NUM) { reviews(last: 3) { nodes { state body author { login } submittedAt } } reviewThreads(first: 100) { nodes { id isResolved comments(first: 3) { nodes { body path line author { login } } } } } } } }'
  ```
 
- **Dynamic poll timing**: Before your first poll, check how long the most recent Copilot review on this PR took by comparing consecutive Copilot review `submittedAt` timestamps (or PR creation time for the first review). Use that duration as your expected wait. If no prior review exists, default to 5 minutes. Set poll interval to 60 seconds and max wait to **2x the expected duration** (minimum 5 minutes, maximum 20 minutes). Copilot reviews can take **10-15 minutes** for large diffs — do NOT give up early.
+ **Dynamic poll timing**: Before your first poll, check how long the most recent Copilot review on this PR took by comparing consecutive Copilot review `submittedAt` timestamps (or PR creation time for the first review). Use that duration as your expected wait. If no prior review exists, default to 5 minutes. Use **progressive poll intervals**: 15s, 15s, 30s, 30s, then 60s thereafter — small diffs often complete in under a minute, so early frequent checks avoid wasting time. Set max wait to **2x the expected duration** (minimum 5 minutes, maximum 20 minutes). Copilot reviews can take **10-15 minutes** for large diffs — do NOT give up early.
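The progressive schedule above can be sketched as a small lookup; `poll_interval` is a hypothetical helper, not part of the shared template:

```bash
# Hypothetical helper: seconds to wait before poll attempt N.
# 15s for attempts 1-2, 30s for 3-4, 60s thereafter.
poll_interval() {
  case "$1" in
    1|2) echo 15 ;;
    3|4) echo 30 ;;
    *)   echo 60 ;;
  esac
}

# Usage sketch: sleep "$(poll_interval "$attempt")"
```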
 
  The review is complete when a new `copilot-pull-request-reviewer` review node appears. If no review appears after max wait: **Default mode**: auto-skip and continue. **Interactive mode (`--interactive`)**: ask the user whether to continue waiting, re-request, or skip.
 
@@ -16,7 +16,7 @@
  **Runtime correctness**
  - Null/undefined access without guards, off-by-one errors, object spread of potentially-null values (spread of null is `{}`, silently discarding state) or non-object values (spreading a string produces indexed character keys, spreading an array produces numeric keys) — guard with a plain-object check before spreading
  - Data from external/user sources (parsed JSON, API responses, file reads) used without structural validation — guard against parse failures, missing properties, wrong types, and null elements before accessing nested values. When parsed data is optional enrichment, isolate failures so they don't abort the main operation
- - Type coercion edge cases — `Number('')` is `0` not empty, `0` is falsy in truthy checks, `NaN` comparisons are always false; string comparison operators (`<`, `>`, `localeCompare`) do lexicographic, not semantic, ordering (e.g., `"10" < "2"`). Use explicit type checks (`Number.isFinite()`, `!= null`) and dedicated libraries (e.g., semver for versions) instead of truthy guards or lexicographic ordering when zero/empty are valid values or semantic ordering matters. Boolean values round-tripping through text serialization (markdown metadata, query strings, form data, flat-file config) become strings — `"false"` is truthy in JavaScript, so truthiness checks on deserialized booleans silently treat explicit `false` as `true`. Use strict equality (`=== true`, `=== 'true'`) or a dedicated coercion function; ensure the same coercion is applied at every consumption site
+ - Type coercion edge cases — `Number('')` is `0` not empty, `0` is falsy in truthy checks, `NaN` comparisons are always false; string comparison operators (`<`, `>`, `localeCompare`) do lexicographic, not semantic, ordering (e.g., `"10" < "2"`). Use explicit type checks (`Number.isFinite()`, `!= null`) and dedicated libraries (e.g., semver for versions) instead of truthy guards or lexicographic ordering when zero/empty are valid values or semantic ordering matters. Boolean values round-tripping through text serialization (markdown metadata, query strings, form data, flat-file config) become strings — `"false"` is truthy in JavaScript, so truthiness checks on deserialized booleans silently treat explicit `false` as `true`. Use strict equality (`=== true`, `=== 'true'`) or a dedicated coercion function; ensure the same coercion is applied at every consumption site. Language type hierarchies may admit surprising subtypes through standard type-check predicates (`isinstance(x, int)` accepts `bool` in Python, `typeof NaN === 'number'` in JavaScript) — when validating numeric inputs, explicitly exclude known subtypes that would pass the check but produce wrong behavior
  - Functions that index into arrays without guarding empty arrays; aggregate operations (`every`, `some`, `reduce`) on potentially-empty collections returning vacuously true/default values that mask misconfiguration or missing data; state/variables declared but never updated or only partially wired up
  - Parallel arrays or tuples coupled by index position (e.g., a names array, a promises array, and a destructuring assignment that must stay aligned) — insertion or reordering in one silently misaligns all others. Use objects/maps keyed by a stable identifier instead
  - Shared mutable references — module-level defaults passed by reference mutate across calls (use `structuredClone()`/spread); `useCallback`/`useMemo` referencing a later `const` (temporal dead zone); object spread followed by unconditional assignment that clobbers spread values
@@ -41,7 +41,7 @@
 
  **Async & state consistency** _[applies when: code uses async/await, Promises, or UI state]_
  - Optimistic state changes (view switches, navigation, success callbacks) before async completion — if the operation fails or is cancelled, the UI is stuck with no rollback. Check return values/errors before calling success callbacks. Handle both failure and cancellation paths. Watch for `.catch(() => null)` followed by unconditional success code (toast, state update) — the catch silences the error but the success path still runs. Either let errors propagate naturally or check the return value before proceeding
- - Multiple coupled state variables updated independently — actions that change one must update all related fields; debounced/cancelable operations must reset loading state on every exit path (cleared, stale, failed, aborted). Component state initialized from props via `useState(prop)` only captures the initial value — if the prop updates asynchronously (data fetch, parent re-render), the local state goes stale. Sync with an effect when the user is not actively editing, or lift state to avoid the copy
+ - Multiple coupled state variables updated independently — actions that change one must update all related fields; debounced/cancelable operations must reset loading state on every exit path (cleared, stale, failed, aborted). Reference/selection sets that point to items in a data collection must be pruned when items are removed and invalidated when the collection is reloaded, filtered, paginated, or sorted — stale references send nonexistent IDs to downstream operations. Operations triggered from a confirmation dialog must re-validate preconditions (selection non-empty, items still exist) at execution time — the underlying data may change between dialog display and user confirmation. Component state initialized from props via `useState(prop)` only captures the initial value — if the prop updates asynchronously (data fetch, parent re-render), the local state goes stale. Sync with an effect when the user is not actively editing, or lift state to avoid the copy
  - Error notification at multiple layers (shared API client + component-level) — verify exactly one layer owns user-facing error messages. For periodic polling, also check that error notifications are throttled or deduplicated (only fire on state transitions like success→error, not on every failed iteration) and that failure doesn't make the UI section disappear entirely (component returning null when data is null/errored) — render an error or stale-data state instead of absence
  - Optimistic updates using full-collection snapshots for rollback — a second in-flight action gets clobbered. Use per-item rollback and functional state updaters after async gaps; sync optimistic changes to parent via callback or trigger refetch on remount. When appending items to a list optimistically, guard against duplicates (check existence before append) — concurrent or repeated operations can insert the same item multiple times
  - State updates guarded by truthiness of the new value (`if (arr?.length)`) — prevents clearing state when the source legitimately returns empty. Distinguish "no response" from "empty response"
@@ -78,11 +78,13 @@
  - Update/patch endpoints with explicit field allowlists (destructured picks, permitted-key arrays) — when the data model gains new configurable fields, the allowlist must be updated or the new fields are silently dropped on save. Trace from model definition to the update handler's field extraction to verify coverage
  - New endpoints/schemas should match validation patterns of existing similar endpoints — field limits, required fields, types, error handling. If validation exists on one endpoint for a param, the same param on other endpoints needs the same validation. API documentation schemas (OpenAPI, JSON Schema) must be structurally complete — array types require `items` definitions, required fields must be listed, and the documented shape must match what the implementation actually returns
  - Summary/aggregation endpoints that compute counts or previews via a different query path, filter set, or data source than the detail views they link to — users see inconsistent numbers between the dashboard and the destination page. Trace the computation logic in both paths and verify they apply the same filters, exclusions, and ordering guarantees (or document the intentional difference)
- - When a validation/sanitization/normalization function is introduced for a field, trace ALL write paths (create, update, sync, import, raw/bulk persist) partial application means invalid values re-enter through the unguarded path. This includes structural normalization (ID prefixes, required defaults, shape invariants) that the read/parse path depends on a "raw" write path that skips normalization produces data that changes identity or shape on reload
+ - Discovery or catalog endpoints that enumerate available capabilities (actions, supported types, valid options) for a downstream consumer must derive the enumerated set from or validate it against the consumer's actual supported set — advertising items the consumer can't handle produces runtime errors at consumption time, while omitting items the consumer supports makes them undiscoverable. If the catalog transforms identifiers (naming conventions, key formats) between producer and consumer, verify the transformation preserves the format the consumer expects
+ - When a validation/sanitization/normalization function is introduced for a field, trace ALL write paths (create, update, sync, import, raw/bulk persist) — partial application means invalid values re-enter through the unguarded path. This includes structural normalization (ID prefixes, required defaults, shape invariants) that the read/parse path depends on — a "raw" write path that skips normalization produces data that changes identity or shape on reload. Conversely, when a new code branch handles data similar to existing branches within the same function (e.g., a new data format, entity subtype, or input shape), verify it applies the same validation and coercion as its siblings — new branches that bypass established validation are the most common source of type-safety regressions
  - Stored config/settings merged with hardcoded defaults using shallow spread — nested objects in the stored copy entirely replace the default, dropping newly added default keys on upgrade. Use deep merge for nested config objects (while preserving explicit `null` to clear a field), or flatten the config structure so shallow merge suffices
- - Schema fields accepting values downstream code can't handle; Zod/schema stripping fields the service reads (silent `undefined`); config values persisted but silently ignored by the implementation — trace each field through schema → service → consumer. Update schemas derived from create schemas (e.g., `.partial()`) must also make nested object fields optional — shallow partial on a deeply-required schema rejects valid partial updates. Additionally, `.deepPartial()` or `.partial()` on schemas with `.default()` values will apply those defaults on update, silently overwriting existing persisted values with defaults — create explicit update schemas without defaults instead
+ - Schema fields accepting values downstream code can't handle; Zod/schema stripping fields the service reads (silent `undefined`); config values persisted but silently ignored by the implementation — trace each field through schema → service → consumer. Also check for parameters accepted and validated in the schema but never consumed by the implementation — dead API surface that misleads callers into believing they're configuring behavior that's silently ignored; remove unused parameters or wire them through to the implementation. Update schemas derived from create schemas (e.g., `.partial()`) must also make nested object fields optional — shallow partial on a deeply-required schema rejects valid partial updates. Additionally, `.deepPartial()` or `.partial()` on schemas with `.default()` values will apply those defaults on update, silently overwriting existing persisted values with defaults — create explicit update schemas without defaults instead
+ - Multi-part UI features (e.g., table header + rows) whose rendering is gated on different prop/condition subsets — if the header checks prop A while rows check prop B, partial provision causes structural misalignment (column count mismatch, orphaned interactive elements without handlers). Derive a single enablement boolean from the complete prop set and use it consistently across all participating components
  - Entity creation without case-insensitive uniqueness checks — names differing only in case (e.g., "MyAgent" vs "myagent") cause collisions in case-insensitive contexts (file paths, git branches, URLs). Normalize to lowercase before comparing
- - Code reading properties from API responses, framework-provided objects, or internal abstraction layers using field names the source doesn't populate or forward — silent `undefined`. Verify property names and nesting depth match the actual response shape (e.g., `response.items` vs `response.data.items`, `obj.placeId` vs `obj.id`, flat fields vs nested sub-objects). When building a new consumer against an existing API, check the producer's actual response — not assumed conventions. When branching on fields from a wrapped third-party API, confirm the wrapper actually requests and forwards those fields (e.g., optional response attributes that require explicit opt-in)
87
+ - Code reading properties from API responses, framework-provided objects, or internal abstraction layers using field names the source doesn't populate or forward — silent `undefined`. Verify property names and nesting depth match the actual response shape (e.g., `response.items` vs `response.data.items`, `obj.placeId` vs `obj.id`, flat fields vs nested sub-objects). When building a new consumer against an existing API, check the producer's actual response — not assumed conventions. When branching on fields from a wrapped third-party API, confirm the wrapper actually requests and forwards those fields (e.g., optional response attributes that require explicit opt-in). Also verify call sites pass inputs in the format the called function actually accepts — framework constructors with non-obvious positional argument order, loaders with format-specific variants (content paths vs script paths, asset objects vs class references), and accessor APIs with distinct method-vs-property semantics. Fallback branches in multi-format dispatchers commonly use the wrong function for the input type
86
88
  - Data model fields that have different names depending on the creation/write path (e.g., `createdAt` vs `created`) — code referencing only one naming convention silently misses records created through other paths. Trace all write paths to discover the actual field names in use. When new logic (access control, UI display, queries) checks only a newly introduced field, verify it falls back to any legacy field that existing records still use — otherwise records created before the migration are silently excluded or inaccessible. Also check entity identity keys: if code looks up or matches entities using a computed key (e.g., `e.id || e.externalId`), all code paths that perform the same lookup must use the same key computation — one path using `e.id` while another uses `e.id || e.externalId` causes mismatches for entities missing the primary key
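One way to keep every read path consistent is a single accessor that owns the fallback chain. A hedged sketch (the field names mirror the bullet's examples; `Rec`, `createdAtOf`, and `identityKeyOf` are hypothetical helpers):

```typescript
// Hypothetical sketch: centralize the new-field-with-legacy-fallback read
// so queries, access checks, and UI all treat old records identically.
type Rec = { createdAt?: string; created?: string; id?: string; externalId?: string };

function createdAtOf(r: Rec): string | undefined {
  return r.createdAt ?? r.created; // new field first, legacy fallback second
}

function identityKeyOf(r: Rec): string | undefined {
  return r.id ?? r.externalId; // every lookup path must use this same helper
}
```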
87
89
  - Entity type changes without invariant revalidation — when an entity has a discriminator field (type, kind, category) and the user changes it, all type-specific invariants must be enforced on the new type AND type-specific fields from the old type must be cleared or revalidated. A job changing from `shell` to `agent` without clearing `command`, or changing to `shell` without requiring `command`, leaves the entity in an invalid hybrid state that fails at runtime or resurfaces stale data
88
90
  - Invariant relationships between configuration flags (flag A implies flag B) not enforced across all layers — UI toggle handlers, API validation schemas, server default-application functions, and serialization/deserialization must all preserve the invariant. If any layer allows setting A=true with B=false (or vice versa), cascading defaults and toggle logic produce contradictory state. Trace the invariant through: UI state handlers, form submission, route validation, service defaults, and persistence round-trip
@@ -91,7 +93,7 @@
91
93
  - Validation functions that delegate to runtime-behavior computations (next schedule occurrence, URL reachability, resource resolution) — conflating "no result within search window" or "temporarily unavailable" with "invalid input" rejects valid configurations. Validate syntax and structure independently of runtime feasibility
92
94
  - Numeric values from strings used without `NaN`/type guards — `NaN` comparisons silently pass bounds checks. Clamp query params to safe lower bounds
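The NaN failure mode is worth seeing concretely: `NaN > max` and `NaN < min` are both `false`, so a naive bounds check accepts garbage. A minimal sketch (the function name and defaults are illustrative):

```typescript
// Hypothetical sketch: NaN compares false against every bound, so a plain
// `if (n > max)` check never rejects it — test finiteness before clamping.
function parseLimit(raw: string, fallback = 25, min = 1, max = 100): number {
  const n = Number(raw);
  if (!Number.isFinite(n)) return fallback; // catches NaN and ±Infinity
  return Math.min(Math.max(Math.trunc(n), min), max);
}
```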
93
95
  - UI elements hidden from navigation but still accessible via direct URL — enforce restrictions at the route level
94
- - Summary counters/accumulators that miss edge cases (removals, branch coverage, underflow on decrements — guard against going negative with lower-bound conditions); counters incremented before confirming the operation actually changed state — rejected, skipped, or no-op iterations inflate success counts. Silent operations in verbose sequences where all branches should print status
96
+ - Summary counters/accumulators that miss edge cases (removals, branch coverage, underflow on decrements — guard against going negative with lower-bound conditions); counters incremented before confirming the operation actually changed state — rejected, skipped, or no-op iterations inflate success counts. Batch operations that report overall success while silently logging per-item failures — callers see success but partial work was done; collect and return per-item failures in the response. Silent operations in verbose sequences where all branches should print status
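A sketch of the counting discipline described above (the shapes `ItemResult` and `applyBatch` are hypothetical): increment only after the operation confirms it changed state, and return per-item failures instead of a blanket success.

```typescript
// Hypothetical sketch: count only confirmed mutations and surface per-item
// failures in the response so callers see partial success, not "ok".
type ItemResult = { id: string; ok: boolean; error?: string };

function applyBatch(
  ids: string[],
  apply: (id: string) => boolean, // returns true only if state actually changed
): { changed: number; failures: ItemResult[] } {
  let changed = 0;
  const failures: ItemResult[] = [];
  for (const id of ids) {
    try {
      if (apply(id)) changed++; // increment after confirming the change
      else failures.push({ id, ok: false, error: "no-op" });
    } catch (e) {
      failures.push({ id, ok: false, error: String(e) });
    }
  }
  return { changed, failures };
}
```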
95
97
 
96
98
  **Concurrency & data integrity** _[applies when: code has shared state, database writes, or multi-step mutations]_
97
99
  - Shared mutable state accessed by concurrent requests without locking or atomic writes; multi-step read-modify-write cycles that can interleave — use conditional writes/optimistic concurrency (e.g., condition expressions, version checks) to close the gap between read and write; if the conditional write fails, surface a retryable error instead of letting it bubble as a 500
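The conditional-write pattern can be sketched with an in-memory store (everything here, including `VersionConflict`, is an illustrative stand-in for a real database's condition expressions): the write carries the version the reader saw, and a mismatch raises a distinct retryable error rather than bubbling as a 500.

```typescript
// Hypothetical in-memory sketch of optimistic concurrency: a write is
// rejected (retryably) when another writer bumped the version in between.
type Row = { value: string; version: number };
const store = new Map<string, Row>();

class VersionConflict extends Error {} // surface as retryable, not a 500

function conditionalWrite(key: string, value: string, expectedVersion: number): void {
  const current = store.get(key)?.version ?? 0;
  if (current !== expectedVersion) {
    throw new VersionConflict(`expected v${expectedVersion}, found v${current}`);
  }
  store.set(key, { value, version: current + 1 });
}
```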
@@ -103,7 +105,7 @@
103
105
 
104
106
  **Input handling** _[applies when: code accepts user/external input]_
105
107
  - Trimming values where whitespace is significant (API keys, tokens, passwords, base64) — only trim identifiers/names
106
- - Endpoints accepting unbounded arrays/collections without upper limits — enforce max size or move to background jobs. Also check internal operations that fan out unbounded parallel I/O (e.g., `Promise.all(files.map(readFile))`) — large collections risk EMFILE (too many open file descriptors) or memory exhaustion. Use a concurrency limiter or batch processing for collections that can grow without bound
108
+ - Endpoints accepting unbounded arrays/collections without upper limits — enforce max size or move to background jobs. Also validate element-level invariants (types, format, non-empty) and deduplicate — duplicate elements inflate operation counts, repeat side effects, and skew success/failure metrics. Also check internal operations that fan out unbounded parallel I/O (e.g., `Promise.all(files.map(readFile))`) — large collections risk EMFILE (too many open file descriptors) or memory exhaustion. Use a concurrency limiter or batch processing for collections that can grow without bound
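The fan-out fix can be sketched as a small worker-pool mapper (a minimal illustration, not a replacement for a library like `p-limit`): at most `limit` tasks are in flight at once, so a huge collection cannot exhaust file descriptors or memory the way a bare `Promise.all(items.map(...))` can.

```typescript
// Hypothetical sketch of a minimal concurrency limiter: N workers pull
// indices from a shared cursor, keeping at most `limit` tasks in flight.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  task: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++; // claim an index before awaiting (single-threaded, so safe)
      results[i] = await task(items[i]);
    }
  }
  const workers = Array.from({ length: Math.min(limit, items.length) }, worker);
  await Promise.all(workers);
  return results;
}
```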
107
109
  - Security/sanitization functions (redaction, escaping, validation) that only handle one input format — if data can arrive in multiple formats (JSON `"KEY": "value"`, shell `KEY=value`, URL-encoded, headers), the function must cover all formats present in the system or sensitive data leaks through the unhandled format
108
110
 
109
111
  ## Tier 3 — Domain-Specific (Check Only When File Type Matches)
@@ -146,6 +148,7 @@
146
148
  - Subprocess output buffered in memory without size limits — a noisy or stuck child process can cause unbounded memory growth. Cap in-memory buffers and truncate or stream to disk for long-running commands
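A hedged sketch of the buffer cap (the collector shape is hypothetical; a real implementation would sit in a `child_process` `data` handler and might stream the overflow to disk instead of dropping it):

```typescript
// Hypothetical sketch: cap in-memory accumulation of child-process output
// and record that truncation happened instead of growing without bound.
function makeCappedCollector(maxBytes: number) {
  const chunks: Buffer[] = [];
  let size = 0;
  let truncated = false;
  return {
    push(chunk: Buffer): void {
      if (truncated) return; // already at cap; drop further output
      const room = maxBytes - size;
      if (chunk.length <= room) {
        chunks.push(chunk);
        size += chunk.length;
      } else {
        chunks.push(chunk.subarray(0, room)); // keep only what fits
        size = maxBytes;
        truncated = true;
      }
    },
    result(): { output: string; truncated: boolean } {
      return { output: Buffer.concat(chunks).toString("utf8"), truncated };
    },
  };
}
```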
147
149
  - Platform-specific assumptions — hardcoded shell interpreters, `path.join()` backslashes breaking ESM imports. Use `pathToFileURL()` for dynamic imports
148
150
  - Naive whitespace splitting of command strings (`str.split(/\s+/)`) breaks quoted arguments — use a proper argv parser or explicitly disallow quoted/multi-word arguments when validating shell commands
151
+ - Shell expansions (brace `{a,b}`, glob `*`, tilde `~`, variable `$VAR`) suppressed by quoting context — single quotes prevent all expansion, so patterns like `--include='*.{ts,js}'` pass the literal braces to the command instead of expanding. Use multiple flags, unquoted brace expansion (bash-only), or other command-specific syntax when expansion is required
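When the caller controls argv, the sturdiest fix is to sidestep expansion entirely: build one flag per pattern and pass the array to `spawn`/`execFile` with no shell in between. A minimal sketch (the helper name is hypothetical):

```typescript
// Hypothetical sketch: rather than relying on shell brace expansion
// (which quoting suppresses), emit one --include flag per extension and
// pass the argv array directly to spawn/execFile — no shell, no surprises.
function includeFlags(extensions: string[]): string[] {
  return extensions.map((ext) => `--include=*.${ext}`);
}
```

Usage might look like `execFile("grep", ["-r", ...includeFlags(["ts", "js"]), "TODO", "src"])`, assuming the target command accepts repeated flags.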
149
152
 
150
153
  **Search & navigation** _[applies when: code implements search results or deep-linking]_
151
154
  - Search results linking to generic list pages instead of deep-linking to the specific record
@@ -205,7 +208,7 @@
205
208
  **Test coverage**
206
209
  - New logic/schemas/services without corresponding tests when similar existing code has tests
207
210
  - New error paths untestable because services throw generic errors instead of typed ones
208
- - Tests re-implementing logic under test instead of importing real exports — pass even when real code regresses
211
+ - Tests re-implementing logic under test instead of importing real exports — pass even when real code regresses. Includes tests that assert by inspecting function source code (string-matching implementation details) rather than calling the function and checking behavior — they break on harmless refactors while missing actual behavioral changes. Also tests that mutate global state at import time (module registries, sys.modules) without fixture-scoped cleanup — causes ordering-dependent failures across the test session
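The source-inspection anti-pattern is easy to demonstrate side by side (the `slugify` function is a hypothetical export standing in for real code under test): the brittle check string-matches the implementation, while the behavioral check calls the real export and asserts its output.

```typescript
// Hypothetical code under test.
function slugify(name: string): string {
  return name.trim().toLowerCase().replace(/\s+/g, "-");
}

// Brittle: breaks on harmless refactors (rename, restructure) yet keeps
// passing when the actual behavior regresses.
const brittlePass = slugify.toString().includes("toLowerCase");

// Behavioral: asserts observable output of the real export.
const behavioralPass = slugify("  My Agent ") === "my-agent";
```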
209
212
  - Tests depending on real wall-clock time or external dependencies when testing logic — use fake timers and mocks
210
213
  - Missing tests for trust-boundary enforcement — submit tampered values, verify server ignores them
211
214
  - Tests that exercise code paths depending on features the integration layer doesn't expose — they pass against mocks but the behavior can't trigger in production. Verify mocked responses match what the real dependency actually returns
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "slash-do",
3
- "version": "2.0.0",
3
+ "version": "2.2.0",
4
4
  "description": "Curated slash commands for AI coding assistants — Claude Code, OpenCode, Gemini CLI, and Codex",
5
5
  "author": "Adam Eivy <adam@eivy.com>",
6
6
  "license": "MIT",
@@ -56,7 +56,7 @@ const ENVIRONMENTS = {
56
56
  libDir: null,
57
57
  hooksDir: null,
58
58
  versionFile: path.join(HOME, '.codex', '.slashdo-version'),
59
- format: 'skill-md',
59
+ format: 'yaml-frontmatter',
60
60
  ext: null,
61
61
  namespacing: 'directory',
62
62
  libPathPrefix: null,
@@ -65,12 +65,6 @@ function toTomlHeader(fm) {
65
65
  return lines.join('\n');
66
66
  }
67
67
 
68
- function toSkillHeader(fm) {
69
- const lines = [];
70
- if (fm.description) lines.push(`# ${fm.description}`);
71
- return lines.join('\n');
72
- }
73
-
74
68
  function getTargetFilename(relPath, env) {
75
69
  const basename = path.basename(relPath, '.md');
76
70
  const dir = path.dirname(relPath);
@@ -114,9 +108,6 @@ function transformCommand(content, env, sourceLibDir) {
114
108
  case 'toml':
115
109
  header = toTomlHeader(frontmatter);
116
110
  break;
117
- case 'skill-md':
118
- header = toSkillHeader(frontmatter);
119
- break;
120
111
  default:
121
112
  header = toYamlFrontmatter(frontmatter);
122
113
  }