slash-do 2.0.0 → 2.1.0
This diff shows the content of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between the two versions as they appear in their public registry.
- package/README.md +3 -2
- package/commands/do/better.md +31 -8
- package/commands/do/depfree.md +525 -0
- package/commands/do/help.md +1 -0
- package/commands/do/review.md +8 -1
- package/commands/do/rpr.md +3 -1
- package/lib/code-review-checklist.md +6 -5
- package/package.json +1 -1
package/README.md
CHANGED

@@ -24,7 +24,7 @@
 <p align="center">
 <img src="https://img.shields.io/npm/v/slash-do?style=flat-square&color=blue" alt="npm version" />
 <img src="https://img.shields.io/badge/environments-4-green?style=flat-square" alt="environments" />
-<img src="https://img.shields.io/badge/commands-
+<img src="https://img.shields.io/badge/commands-14-orange?style=flat-square" alt="commands" />
 <img src="https://img.shields.io/badge/license-MIT-lightgrey?style=flat-square" alt="license" />
 </p>

@@ -60,8 +60,9 @@ All commands live under the `do:` namespace:
 | `/do:rpr` | Resolve PR review feedback with parallel agents |
 | `/do:release` | Create a release PR with version bump and changelog |
 | `/do:review` | Deep code review against best practices |
-| `/do:better` | Full DevSecOps audit with
+| `/do:better` | Full DevSecOps audit with 8-agent scan and remediation |
 | `/do:better-swift` | SwiftUI DevSecOps audit with multi-platform coverage |
+| `/do:depfree` | Audit dependencies, remove unnecessary ones, write replacement code |
 | `/do:goals` | Generate GOALS.md from codebase analysis |
 | `/do:replan` | Review and clean up PLAN.md |
 | `/do:omd` | Audit and optimize markdown files |
package/commands/do/better.md
CHANGED

@@ -5,7 +5,7 @@ argument-hint: "[--interactive] [--scan-only] [--no-merge] [path filter or focus

 # Better — Unified DevSecOps Pipeline

-Run the full DevSecOps lifecycle: audit the codebase with
+Run the full DevSecOps lifecycle: audit the codebase with 8 deduplicated agents, consolidate findings, remediate in an isolated worktree, create **separate PRs per category** with SemVer bump, verify CI, run Copilot review loops, and merge.

 **Default mode: fully autonomous.** Uses Balanced model profile, proceeds through all phases without prompting, auto-merges PRs with clean reviews.

@@ -47,7 +47,7 @@ Record the selection as `MODEL_PROFILE` and derive agent models from this table:

 | Agent Role | Quality | Balanced | Budget |
 |------------|---------|----------|--------|
-| Audit agents (
+| Audit agents (8 Explore agents, Phase 1) | opus | sonnet | haiku |
 | Remediation agents (general-purpose, Phase 3) | opus | sonnet | sonnet |

 Derive two variables:

@@ -121,7 +121,7 @@ Record as `BUILD_CMD` and `TEST_CMD`.

 Project conventions are already in your context. Pass relevant conventions to each agent.

-Launch
+Launch 8 Explore agents in two batches. Each agent must report findings in this format:
 ```
 - **[CRITICAL/HIGH/MEDIUM/LOW]** `file:line` - Description. Suggested fix: ... Complexity: Simple/Medium/Complex
 ```

@@ -174,7 +174,7 @@ Skip step 4 if steps 1-3 reveal the code is correct.
 Resilience: external calls without timeouts, missing fallback for unavailable downstream services, retry without backoff ceiling/jitter, missing health check endpoints
 Observability: production paths without structured logging, error logs missing reproduction context (request ID, input params), async flows without correlation IDs

-### Batch 2 (
+### Batch 2 (3 agents after Batch 1 completes):

 **Model**: Same `AUDIT_MODEL` as Batch 1.
@@ -188,14 +188,27 @@ Skip step 4 if steps 1-3 reveal the code is correct.
 - **Database migrations**: exclusive-lock ALTER TABLE on large tables, CREATE INDEX without CONCURRENTLY, missing down migrations or untested rollback paths
 - General: framework-specific security issues, language-specific gotchas, domain-specific compliance, environment variable hygiene (missing `.env.example`, required env vars not validated at startup, secrets in config files that should be in env)

-7. **
+7. **Dependency Freedom**
+   Audit all third-party dependencies for necessity. Every small library is an attack surface — supply chain compromises are real and common.
+   Focus:
+   - Extract the full dependency list from the project manifest (`package.json`, `Cargo.toml`, `pyproject.toml`, `go.mod`, `Gemfile`, etc.)
+   - Classify each dependency into tiers:
+     - **Acceptable**: large, widely-audited libraries (react, express, d3, three.js, next, vue, fastify, typescript, eslint, prisma, tailwindcss, tokio, serde, django, flask, pandas, etc.) — skip these
+     - **Suspect**: smaller libraries where we may only use 1-2 functions, wrappers over built-in APIs, single-purpose utilities
+     - **Removable**: libraries where the used functionality is <50 lines to implement, wraps a now-native API (e.g., `crypto.randomUUID()` replacing uuid, `structuredClone` replacing lodash.cloneDeep, `Array.prototype.flat` replacing array-flatten, `node:fs/promises` replacing fs-extra for most uses), unmaintained with known vulnerabilities, or micro-packages (is-odd, is-number, left-pad tier)
+   - For each suspect/removable dependency: search all source files for imports, list every function/class/type used, count call sites, assess replacement complexity (Trivial <20 lines, Moderate 20-100, Complex 100-300, Infeasible 300+)
+   - Check maintenance status: last publish date, open security issues, known CVEs
+   - Report format: `**[SEVERITY]** {package-name} — {Tier}. Uses: {functions}. Call sites: {N} in {M} files. Replacement: {complexity}. Reason: {why removable}`
+   - Severity mapping: unmaintained with CVEs → CRITICAL, unmaintained without CVEs → HIGH, replaceable single-function usage → MEDIUM, suspect but complex replacement → LOW
+
+8. **Test Quality & Coverage**
    Uses Batch 1 findings as context to prioritize.
    Focus areas:

    **Coverage gaps:**
    - Missing test files for critical modules, untested edge cases, tests that only cover happy paths
    - Areas with high complexity (identified by agents 1-5) but no tests
-   - Remediation changes from agents 1-
+   - Remediation changes from agents 1-7 that lack corresponding test coverage

    **Vacuous tests (tests that don't actually test anything):**
    - Tests that assert on mocked return values instead of real behavior (testing the mock, not the code)
@@ -257,6 +270,7 @@ For each file touched by multiple categories, document why it was assigned to on
 ### Architecture & SOLID
 ### Bugs, Performance & Error Handling
 ### Stack-Specific
+### Dependency Freedom
 ### Test Quality & Coverage
 ```

@@ -267,6 +281,7 @@ For each file touched by multiple categories, document why it was assigned to on
 - Architecture → Architecture & SOLID → `architecture`
 - Bugs & Perf → Bugs, Performance & Error Handling → `bugs-perf`
 - Stack-Specific → Stack-Specific → `stack-specific`
+- Dep Freedom → Dependency Freedom → `deps`
 - Tests → Test Quality & Coverage → `tests`

 ```

@@ -278,6 +293,7 @@ For each file touched by multiple categories, document why it was assigned to on
 | Architecture | ... | ... | ... | ... | ... |
 | Bugs & Perf | ... | ... | ... | ... | ... |
 | Stack-Specific | ... | ... | ... | ... | ... |
+| Dep Freedom | ... | ... | ... | ... | ... |
 | Tests | ... | ... | ... | ... | ... |
 | TOTAL | ... | ... | ... | ... | ... |
 ```
@@ -332,6 +348,7 @@ If no shared utilities were identified, skip this step.
 - Architecture & SOLID
 - Bugs, Performance & Error Handling
 - Stack-Specific
+- Dependency Freedom
 3. Only create tasks for categories that have actionable findings
 4. Spawn up to 5 general-purpose agents as teammates. **Pass `REMEDIATION_MODEL` as the `model` parameter on each agent.** If `REMEDIATION_MODEL` is `opus`, omit the parameter to inherit from session.

@@ -339,9 +356,13 @@ If no shared utilities were identified, skip this step.

 !`cat ~/.claude/lib/remediation-agent-template.md`

+### Dependency Freedom agent — special instructions:
+The Dependency Freedom remediation agent has a unique task: for each removable dependency, it must (1) write replacement code (utility function or inline native API call), (2) update ALL import/require statements across the codebase, (3) remove the package from the manifest, and (4) regenerate the lock file (`npm install` / `cargo update` / etc.). After all replacements, verify no source file still references the removed package. See `/do:depfree` Phase 3b for the full agent template.
+
 ### Conflict avoidance:
 - Review all findings before task assignment. If two categories touch the same file, assign both sets of findings to the same agent.
 - Security agent gets priority on validation logic; DRY agent gets priority on import consolidation.
+- Dependency Freedom agent gets priority on files that are solely import/usage sites of a removed package.

 </plan_and_remediate>
@@ -433,7 +454,7 @@ PHASE_4C_START_SHA="$(git rev-parse HEAD)"

 ### 4c.1: Test Audit Triage

-Review Agent
+Review Agent 8 (Test Quality & Coverage) findings from Phase 1 and categorize them:

 1. **`[VACUOUS]` findings** — tests that exist but don't test real behavior. These are the highest priority because they create a false sense of safety.
 2. **`[WEAK]` findings** — tests that partially cover behavior but miss important cases. Strengthen with additional assertions and edge cases.

@@ -535,7 +556,7 @@ Initialize `CREATED_CATEGORY_SLUGS=""` (empty space-delimited string). After eac
 For each category that has findings:
 1. Switch to `{DEFAULT_BRANCH}`: `git checkout {DEFAULT_BRANCH}`
 2. Create a category branch: `git checkout -b better/{CATEGORY_SLUG}`
-   - Use slugs: `security`, `code-quality`, `dry`, `architecture`, `bugs-perf`, `stack-specific`, `tests`
+   - Use slugs: `security`, `code-quality`, `dry`, `architecture`, `bugs-perf`, `stack-specific`, `deps`, `tests`
 3. For each file assigned to this category in `FILE_OWNER_MAP`:
    - **Modified files**: `git checkout better/{DATE} -- {file_path}`
    - **New files (Added)**: `git checkout better/{DATE} -- {file_path}`

@@ -757,6 +778,7 @@ If merge fails (e.g., branch protection, merge conflicts from a prior PR):
 | Architecture | ... | ... | ... | #number | pass | approved |
 | Bugs & Perf | ... | ... | ... | #number | pass | approved |
 | Stack-Specific | ... | ... | ... | #number | pass | approved |
+| Dep Freedom | ... | ... | ... | #number | pass | approved |
 | Tests | ... | ... | ... | #number | pass | approved |
 | TOTAL | ... | ... | ... | N PRs | | |

@@ -791,6 +813,7 @@ Test Enhancement Stats:
 - When extracting modules, always add backward-compatible re-exports in the original module to prevent cross-PR breakage
 - Version bump happens exactly once on the first category branch based on aggregate commit analysis
 - Only CRITICAL, HIGH, and MEDIUM findings are auto-remediated for code categories; LOW findings remain tracked in PLAN.md
+- Dependency Freedom findings replace unnecessary third-party packages with owned code — see `/do:depfree` for standalone usage
 - Test Quality & Coverage findings are remediated in Phase 4c with a dedicated test enhancement agent that verifies tests fail when code is broken
 - GitLab projects skip the Copilot review loop entirely (Phase 6) and stop after MR creation
 - CI must pass on each PR before requesting Copilot review or merging
package/commands/do/depfree.md
ADDED

@@ -0,0 +1,525 @@
+---
+description: Audit third-party dependencies and remove unnecessary ones by writing replacement code
+argument-hint: "[--interactive] [--scan-only] [--no-merge] [specific packages to evaluate]"
+---
+
+# Depfree — Dependency Freedom Audit
+
+Audit all third-party dependencies, classify them as acceptable (large, widely-audited) or suspect (small, replaceable), analyze actual usage of suspect dependencies, and replace them with owned code where feasible.
+
+Every small library is an attack surface. Supply chain compromises are real and common. Large, widely-audited libraries (express, react, d3, three.js, next, vue, fastify, lodash-es, etc.) are acceptable. But for smaller libraries, or libraries where only one helper function is used, we should write the code ourselves.
+
+**Default mode: fully autonomous.** Uses Balanced model profile, proceeds through all phases without prompting.
+
+**`--interactive` mode:** Pauses for classification approval, replacement review, and merge confirmation.
+
+Parse `$ARGUMENTS` for:
+- **`--interactive`**: pause at each decision point for user approval
+- **`--scan-only`**: run Phase 0 + 1 + 2 only (audit and plan), skip remediation
+- **`--no-merge`**: run through PR creation, skip merge
+- **Specific packages**: limit audit scope to named packages (e.g., "chalk dotenv")
+
+## Configuration
+
+### Default Mode (autonomous)
+
+Use the **Balanced** model profile automatically (`AUDIT_MODEL=sonnet`, `REMEDIATION_MODEL=sonnet`).
+
+### Interactive Mode (`--interactive`)
+
+Present the user with configuration options using `AskUserQuestion`:
+
+```
+AskUserQuestion([{
+  question: "Which model profile for audit and remediation agents?",
+  header: "Model",
+  multiSelect: false,
+  options: [
+    { label: "Quality", description: "Opus for all agents — fewest false positives, best replacements (highest cost)" },
+    { label: "Balanced (Recommended)", description: "Sonnet for audit and remediation — good quality at moderate cost" },
+    { label: "Budget", description: "Haiku for audit, Sonnet for remediation — fastest and cheapest" }
+  ]
+}])
+```
+
+Record the selection as `MODEL_PROFILE` and derive:
+- `AUDIT_MODEL`: `opus` / `sonnet` / `haiku` based on profile
+- `REMEDIATION_MODEL`: `opus` / `sonnet` / `sonnet` based on profile
+
+When the resolved model is `opus`, **omit** the `model` parameter on the Agent call so the agent inherits the session's Opus version.
+
+## Compaction Guidance
+
+When compacting during this workflow, always preserve:
+- The `DEPENDENCY_MAP` (complete classification of all dependencies)
+- All REMOVABLE findings with package names and usage details
+- The current phase number and what phases remain
+- All PR numbers and URLs created so far
+- `BUILD_CMD`, `TEST_CMD`, `PROJECT_TYPE`, `WORKTREE_DIR`, `REPO_DIR` values
+- `VCS_HOST`, `CLI_TOOL`, `DEFAULT_BRANCH`, `CURRENT_BRANCH`
+
+## Phase 0: Discovery & Setup
+
+### 0a: VCS Host Detection
+Run `gh auth status` to check GitHub CLI. If it fails, run `glab auth status` for GitLab.
+- Set `VCS_HOST` to `github` or `gitlab`
+- Set `CLI_TOOL` to `gh` or `glab`
+- If neither is authenticated, warn the user and halt
+
+### 0b: Project Type Detection
+Check for project manifests to determine the tech stack:
+- `package.json` → Node.js (check for `next`, `react`, `vue`, `express`, etc.)
+- `Cargo.toml` → Rust
+- `pyproject.toml` / `requirements.txt` / `setup.py` → Python
+- `go.mod` → Go
+- `pom.xml` / `build.gradle` → Java/Kotlin
+- `Gemfile` → Ruby
+- `*.csproj` / `*.sln` → .NET
+
+Record the detected stack as `PROJECT_TYPE`.
+
+### 0c: Build & Test Command Detection
+Derive build and test commands from the project type:
+- Node.js: check `package.json` scripts for `build`, `test`, `typecheck`, `lint`
+- Rust: `cargo build`, `cargo test`
+- Python: `pytest`, `python -m pytest`
+- Go: `go build ./...`, `go test ./...`
+- If ambiguous, check project conventions already in context
+
+Record as `BUILD_CMD` and `TEST_CMD`.
+
+### 0d: State Snapshot
+- Record `REPO_DIR` via `git rev-parse --show-toplevel`
+- Record `CURRENT_BRANCH` via `git rev-parse --abbrev-ref HEAD`
+- Record `DEFAULT_BRANCH` via `gh repo view --json defaultBranchRef --jq '.defaultBranchRef.name'` (or `glab` equivalent)
+- Record `IS_DIRTY` via `git status --porcelain`
+
+## Phase 1: Dependency Inventory
+
+### 1a: Extract All Dependencies
+
+Based on `PROJECT_TYPE`, extract the full dependency list:
+
+**Node.js:**
+- Read `package.json` → `dependencies` and `devDependencies`
+- Note: `devDependencies` used only in build/test are lower priority but still worth auditing
+- Check for workspace packages (monorepo) in `workspaces` field
+
+**Rust:**
+- Read `Cargo.toml` → `[dependencies]`, `[dev-dependencies]`, `[build-dependencies]`
+
+**Python:**
+- Read `pyproject.toml` → `[project.dependencies]`, `[project.optional-dependencies]`
+- Or `requirements.txt`, `setup.py`
+
+**Go:**
+- Read `go.mod` → `require` block
+
+**Ruby:**
+- Read `Gemfile`
+
+### 1b: Classify Dependencies
+
+For each dependency, classify it into one of three tiers:
+
+**Tier 1 — ACCEPTABLE (keep without question):**
+Large, widely-audited, foundational libraries. Examples by ecosystem:
+- **Node.js**: react, next, vue, express, fastify, hono, typescript, eslint, prettier, webpack, vite, jest, vitest, mocha, d3, three, prisma, drizzle, @types/*, tailwindcss, postcss
+- **Rust**: tokio, serde, clap, reqwest, hyper, tracing, sqlx, axum, actix-web
+- **Python**: django, flask, fastapi, sqlalchemy, pandas, numpy, scipy, pytest, requests, httpx, pydantic
+- **Go**: standard library (no third-party needed for most things)
+- **Ruby**: rails, rspec, sidekiq, puma, devise
+- Any dependency with >10M weekly downloads (npm) or equivalent popularity metric for the ecosystem
+
+**Tier 2 — SUSPECT (audit usage):**
+Smaller libraries that may be doing something we can write ourselves. Indicators:
+- <1M weekly downloads (npm) or equivalent
+- Single-purpose utility (does one thing)
+- We only use 1-2 functions from it
+- Wrapper libraries that add thin abstractions over built-in APIs
+- Libraries that replicate functionality available in newer language/runtime versions
+- Abandoned or unmaintained (no commits in 12+ months, open security issues)
+
+**Tier 3 — REMOVABLE (strong candidate for replacement):**
+Libraries where the cost of owning the code is clearly lower than the supply chain risk:
+- We use a single function that's <50 lines to implement
+- The library wraps a built-in API with minimal added value
+- The library is unmaintained with known vulnerabilities
+- The library's functionality is now available natively (e.g., `node:fs/promises` replacing `fs-extra` for most use cases, `structuredClone` replacing `lodash.cloneDeep`, `Array.prototype.flat` replacing `array-flatten`)
+- Color/string utilities where we use 1-2 functions (e.g., using `chalk` just for `chalk.red()` when a 10-line ANSI wrapper suffices)
+- UUID generation when `crypto.randomUUID()` is available
+- Deep merge/clone when `structuredClone` suffices
+- `dotenv` when the runtime supports `--env-file` natively
+- `is-odd`, `is-number`, `left-pad` tier micro-packages
+
+Record the full classification as `DEPENDENCY_MAP`.
+
+### 1c: Usage Analysis (Tier 2 & 3 only)
+
+For each Tier 2 and Tier 3 dependency, launch parallel Explore agents (using `AUDIT_MODEL`) to determine actual usage:
+
+Each agent should:
+1. Search all source files for imports/requires of the package
+2. List every function, class, constant, or type imported from it
+3. Count call sites per imported symbol
+4. Assess complexity of replacement:
+   - **Trivial** (<20 lines): simple wrapper, single utility function, type alias
+   - **Moderate** (20-100 lines): multi-function utility, needs tests, edge cases to handle
+   - **Complex** (100-300 lines): significant logic, crypto, parsing, protocol implementation
+   - **Infeasible** (300+ lines or requires deep domain expertise): keep the dependency
+5. Check if the package has known vulnerabilities: `npm audit`, `cargo audit`, `pip-audit`, etc.
+6. Check last publish date and maintenance status
+
+Report format:
+```
+- **{package-name}** — Tier {2|3}
+  - Imports: {list of imported symbols}
+  - Call sites: {count} across {N} files
+  - Functions used: {list with brief description of each}
+  - Replacement complexity: {Trivial|Moderate|Complex|Infeasible}
+  - Maintenance: {last publish date, open issues, known CVEs}
+  - Recommendation: **REMOVE** / **KEEP** / **EVALUATE**
+  - Replacement sketch: {brief description of how to replace, if REMOVE}
+```
+
+Wait for all agents to complete before proceeding.
+
+## Phase 2: Replacement Plan
+
+1. Read the existing `PLAN.md` (create if it doesn't exist)
+2. Filter to only REMOVE recommendations from Phase 1c
+3. For EVALUATE recommendations: **Default mode** — treat as KEEP (conservative). **Interactive mode** — present to user via `AskUserQuestion` for each
+4. Group removable dependencies by replacement strategy:
+   - **Native replacement**: built-in API replaces the library (e.g., `crypto.randomUUID()`)
+   - **Inline replacement**: write a small utility function (e.g., ANSI color wrapper)
+   - **Consolidation**: multiple small deps replaced by one owned utility module
+5. Estimate total lines of replacement code needed
+6. Add a new section to PLAN.md:
+
+```markdown
+## Depfree Audit - {YYYY-MM-DD}
+
+Summary: {N} total dependencies. {A} acceptable (Tier 1), {B} audited and kept (Tier 2), {C} to remove (Tier 3).
+Estimated replacement code: ~{lines} lines across {files} new/modified files.
+
+### Dependencies to Remove
+| Package | Tier | Used Functions | Call Sites | Replacement | Complexity | Risk |
+|---------|------|---------------|------------|-------------|------------|------|
+| ... | ... | ... | ... | ... | ... | ... |
+
+### Dependencies Kept (with rationale)
+| Package | Tier | Reason Kept |
+|---------|------|-------------|
+| ... | ... | ... |
+
+### Replacement Tasks
+For each dependency to remove:
+- [ ] **{package}** — {strategy}. Replace {N} call sites in {M} files. Write {utility name} ({est. lines} lines). Complexity: {level}.
+```
+
+7. Print summary table:
+```
+| Status | Count | Examples |
+|------------|-------|-----------------------------------|
+| Acceptable | ... | react, express, typescript, ... |
+| Kept | ... | {packages kept with reasons} |
+| Removable | ... | {packages to remove} |
+| Total | ... | |
+```
+
+**GATE: If `--scan-only` was passed, STOP HERE.** Print the summary and exit.
+
+**GATE: If no removable dependencies were found, print "All dependencies are justified" and exit.**
+
+**Interactive mode**: Present the removal plan via `AskUserQuestion`:
+```
+AskUserQuestion([{
+  question: "Dependency removal plan:\n{summary of packages to remove}\n\nProceed with replacement?",
+  options: [
+    { label: "Proceed", description: "Remove all listed dependencies and write replacement code" },
+    { label: "Review individually", description: "Let me approve/reject each removal" },
+    { label: "Abort", description: "Stop here — I'll review the plan manually" }
+  ]
+}])
+```
+
+If "Review individually": present each dependency with REMOVE/KEEP options, then proceed with only approved removals.
+
+## Phase 3: Worktree Remediation
+
+### 3a: Setup
+
+1. If `IS_DIRTY` is true: `git stash --include-untracked -m "depfree: pre-audit stash"`
+2. Set `DATE` to today's date in YYYY-MM-DD format
+3. Create the worktree:
+```bash
+git worktree add ../depfree-{DATE} -b depfree/{DATE}
+```
+4. Set `WORKTREE_DIR` to `../depfree-{DATE}`
+
+### 3b: Write Replacement Code
+
+For each dependency to remove, spawn a general-purpose agent (using `REMEDIATION_MODEL`) with these instructions:
+
+```
+<context>
+Project type: {PROJECT_TYPE}
+Build command: {BUILD_CMD}
+Test command: {TEST_CMD}
+Working directory: {WORKTREE_DIR} (this is a git worktree — all work happens here)
+</context>
+
+<task>
+Remove the dependency on `{PACKAGE_NAME}` and replace with owned code.
+
+Current usage:
+{USAGE_DETAILS from Phase 1c — imported symbols, call sites, files}
+
+Replacement strategy: {STRATEGY from Phase 2}
+
+Steps:
+1. Write the replacement code (utility function, inline replacement, or native API call)
+2. Update ALL import/require statements across the codebase to use the new code
+3. Remove the package from the manifest ({package.json, Cargo.toml, etc.})
+4. Run `{BUILD_CMD}` to verify compilation
+5. Run `{TEST_CMD}` to verify tests pass
+6. If tests reference the removed package directly (mocking it, importing test helpers from it), update those tests too
+</task>
+
+<guardrails>
+- The replacement must preserve behavior for all currently-used call sites and documented invariants
+- You may omit handling for input shapes or edge cases that are provably unreachable based on {USAGE_DETAILS}, but do not narrow behavior for any actual call site
+- Do NOT introduce new dependencies to replace old ones
+- Do NOT use `git add -A` or `git add .` — stage specific files only
+- Keep replacement code minimal
+- If replacement is more complex than estimated (>2x the estimated lines), report back and skip — do not force a bad replacement
+- Place shared utility replacements in a sensible location (e.g., `src/utils/`, `lib/`, `internal/`) following existing project conventions
+- Commit each replacement independently: `refactor: replace {package} with owned {utility/code}`
+</guardrails>
+```
+
+**Parallelization**: Launch up to 5 agents in parallel. If >5 dependencies to remove, batch them. Assign each agent a non-overlapping set of dependencies (no two agents should modify the same files — if overlap exists, group those dependencies into one agent).
+
+### 3c: Lock File Update
+
+After all replacement agents complete:
+1. Remove all replaced packages from the lock file:
+```bash
+cd {WORKTREE_DIR}
+# Node.js: refresh lockfile only, without running lifecycle scripts
+npm install --package-lock-only --ignore-scripts
+# Or: yarn install --mode=update-lockfile --ignore-scripts
+# Or: pnpm install --lockfile-only --ignore-scripts
+# Rust: let a check refresh Cargo.lock to reflect manifest changes only
+cargo check
+# Python: use the project's lock tool to refresh
+# poetry lock --no-update
+# pip-compile requirements.in
+```
+2. Commit the lock file update:
+```bash
+git -C {WORKTREE_DIR} add {lock file}
+git -C {WORKTREE_DIR} commit -m "chore: update lock file after dependency removal"
+```
+
## Phase 4: Verification
|
|
331
|
+
|
|
332
|
+
### 4a: Build & Test
|
|
333
|
+
|
|
334
|
+
1. Run the full build:
|
|
335
|
+
```bash
|
|
336
|
+
cd {WORKTREE_DIR} && {BUILD_CMD}
|
|
337
|
+
```
|
|
338
|
+
2. Run all tests:
|
|
339
|
+
```bash
|
|
340
|
+
cd {WORKTREE_DIR} && {TEST_CMD}
|
|
341
|
+
```
|
|
342
|
+
3. If build or tests fail:
|
|
343
|
+
- Identify which replacement caused the failure
|
|
344
|
+
- Attempt to fix in a new commit
|
|
345
|
+
- If unfixable, revert the replacement commit AND re-add the dependency:
|
|
346
|
+
```bash
|
|
347
|
+
git -C {WORKTREE_DIR} revert <sha>
|
|
348
|
+
```
|
|
349
|
+
Note the reverted package as "kept — replacement failed"
|
|
350
|
+
|
|
### 4b: Internal Code Review

1. Generate the diff:
```bash
cd {WORKTREE_DIR} && git diff {DEFAULT_BRANCH}...HEAD
```
2. Review all replacement code for:
- Functional equivalence (does the replacement handle the same inputs/outputs?)
- Missing edge cases that the original library handled
- Security regressions (e.g., replacing a sanitization library with a naive regex)
- Performance regressions (e.g., replacing an optimized parser with O(n^2) code)
- Correct error handling at system boundaries
3. Fix any issues found, commit each fix separately

### 4c: Verify No Phantom Dependencies

Confirm no source file still references a removed package:
```bash
cd {WORKTREE_DIR}
for pkg in {REMOVED_PACKAGES}; do
  grep -r "$pkg" \
    --include='*.ts' \
    --include='*.js' \
    --include='*.tsx' \
    --include='*.jsx' \
    --include='*.py' \
    --include='*.rs' \
    --include='*.go' \
    --include='*.rb' \
    . && echo "WARN: $pkg still referenced"
done
```
Fix any remaining references.
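Note that a bare `grep` for a short package name can false-positive on substrings (e.g. `ms` inside `milliseconds` or `forms`). A stricter check matches actual import/require forms; a minimal TypeScript sketch, assuming JS/TS sources only (side-effect imports like `import "pkg"` are not covered):

```typescript
// Match real references to a package: `import ... from "pkg"`,
// dynamic `import("pkg")`, `require("pkg")`, and subpaths like "pkg/sub".
function referencesPackage(source: string, pkg: string): boolean {
  // Escape regex metacharacters in the package name (e.g. "@scope/name").
  const escaped = pkg.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(
    `(from\\s+|require\\(\\s*|import\\(\\s*)['"]${escaped}(/[^'"]*)?['"]`
  );
  return pattern.test(source);
}
```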


## Phase 5: PR Creation

### 5a: Push & Create PR

```bash
cd {WORKTREE_DIR}
git push -u origin depfree/{DATE}
```

Create the PR:

**GitHub:**
```bash
gh pr create --head depfree/{DATE} --base {DEFAULT_BRANCH} \
  --title "refactor: remove {N} unnecessary dependencies" \
  --body "$(cat <<'EOF'
## Depfree Audit — Dependency Removal

### Summary
Removed {N} unnecessary third-party dependencies and replaced with owned code.
Estimated supply chain attack surface reduction: {N} packages ({transitive count} including transitive deps).

### Dependencies Removed
| Package | Replacement | Lines of Owned Code |
|---------|-------------|---------------------|
{table of removed packages}

### Dependencies Kept (audited)
{count} dependencies audited and kept with rationale. See PLAN.md for details.

### Replacement Code
{bulleted list of new utility files or inline changes}

### Verification
- [ ] Build passes
- [ ] All tests pass
- [ ] No phantom references to removed packages
- [ ] Lock file updated
EOF
)"
```

**GitLab:**
```bash
glab mr create --source-branch depfree/{DATE} --target-branch {DEFAULT_BRANCH} \
  --title "refactor: remove {N} unnecessary dependencies" --description "..."
```

Record `PR_NUMBER` and `PR_URL`.

**GATE: If `--no-merge` was passed, STOP HERE.** Print the PR URL and summary.

### 5b: CI Verification

1. Wait 30 seconds for CI to start
2. Poll CI status:
```bash
gh pr checks {PR_NUMBER}
```
Poll every 30 seconds, max 10 minutes.
3. If CI fails:
- Fetch failure logs, diagnose, fix, commit, push
- Max 3 fix attempts before informing the user

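The CI polling step amounts to a bounded retry loop. A minimal sketch, with a `check` callback standing in for a `gh pr checks` invocation (all names here are hypothetical, not part of the command):

```typescript
// Poll `check` every `intervalMs` until it reports a terminal state or
// `maxWaitMs` elapses. The sleep function is injectable for testing.
async function pollUntil(
  check: () => Promise<"pending" | "passed" | "failed">,
  intervalMs = 30_000,
  maxWaitMs = 600_000, // 10 minutes
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((r) => setTimeout(r, ms))
): Promise<"passed" | "failed" | "timeout"> {
  let waited = 0;
  while (waited < maxWaitMs) {
    const state = await check();
    if (state !== "pending") return state;
    await sleep(intervalMs);
    waited += intervalMs;
  }
  return "timeout";
}
```

Injecting the sleep keeps the loop testable without real waits; the "timeout" result maps to the "max 10 minutes" cutoff above.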
### 5c: Copilot Review Loop (GitHub only)

If `VCS_HOST` is `github`, run the Copilot review loop using the shared template:

!`cat ~/.claude/lib/copilot-review-loop.md`

Pass: `{PR_NUMBER}`, `{OWNER}/{REPO}`, `depfree/{DATE}`, and `{BUILD_CMD}`.

### 5d: Merge

**Default mode**: Auto-merge if review is clean.
**Interactive mode**: Ask user for merge approval.

```bash
gh pr merge {PR_NUMBER} --merge
```


## Phase 6: Cleanup

1. Remove the worktree:
```bash
git worktree remove {WORKTREE_DIR}
```
2. Delete the local branch:
```bash
git checkout {DEFAULT_BRANCH}
git branch -D depfree/{DATE}
git push origin --delete depfree/{DATE}
```
3. Restore stashed changes if applicable:
```bash
git stash pop
```
4. Update PLAN.md:
- Mark completed removals with `[x]`
- Add PR link
- Note any packages that were reverted
5. Print the final summary:

```
| Package          | Status   | Replacement              | Lines |
|------------------|----------|--------------------------|-------|
| {package}        | Removed  | {utility/native API}     | {N}   |
| {package}        | Kept     | {reason}                 | —     |
| {package}        | Reverted | {reason for failure}     | —     |

Total dependencies before: {before}
Total dependencies after:  {after}
Packages removed: {count}
Owned replacement code: ~{lines} lines
Transitive deps eliminated: ~{count} (estimated)
```

## Error Recovery

- **Agent failure**: continue with remaining agents, note gaps in the summary
- **Build failure in worktree**: attempt fix; if unfixable, revert the problematic replacement and re-add the dependency
- **Push failure**: `git pull --rebase --autostash` then retry push
- **CI failure on PR**: investigate logs, fix, push (max 3 attempts)
- **Replacement too complex**: if an agent reports that replacement exceeds 2x estimated complexity, skip that dependency and keep it with a note
- **Test failure from replacement**: if tests fail and the fix isn't obvious, revert the replacement — a working dependency is better than broken owned code
- **Existing worktree found at startup**: ask user — resume or clean up

!`cat ~/.claude/lib/graphql-escaping.md`

## Notes

- This command complements `/do:better` — run `depfree` for dependency hygiene, `better` for code quality
- All remediation happens in an isolated worktree — the user's working directory is never modified
- The threshold for "acceptable" libraries is deliberately generous — the goal is to remove obvious attack surface, not to rewrite everything
- Replacement code should be minimal and focused — don't over-engineer utilities that replace single-purpose packages
- When in doubt, keep the dependency. A maintained library is better than a buggy reimplementation
- devDependencies are lower priority since they don't ship to production, but unmaintained build tools still pose supply chain risk
- For monorepos, audit the root manifest and each workspace package manifest
package/commands/do/help.md
CHANGED
@@ -14,6 +14,7 @@ List all available `/do:*` commands with their descriptions.
 |---|---|
 | `/do:better` | Unified DevSecOps audit, remediation, per-category PRs, CI verification, and Copilot review loop |
 | `/do:better-swift` | SwiftUI-optimized DevSecOps audit with multi-platform coverage (iOS, macOS, watchOS, tvOS, visionOS) |
+| `/do:depfree` | Audit third-party dependencies and remove unnecessary ones by writing replacement code |
 | `/do:fpr` | Commit, push to fork, and open a PR against the upstream repo |
 | `/do:goals` | Scan codebase to infer project goals, clarify with user, and generate GOALS.md |
 | `/do:help` | List all available slashdo commands |
package/commands/do/review.md
CHANGED
@@ -175,7 +175,7 @@ Check every file against this checklist. The checklist is organized into tiers
 - If the PR adds or reorders sequential steps/instructions, verify the ordering matches execution dependencies — readers following steps in order must not perform an action before its prerequisite
 
 **Transactional write integrity**
-- If the PR performs multi-item writes (database transactions, batch operations), verify each write includes condition expressions that prevent stale-read races (TOCTOU) — an unconditioned write after a read can upsert deleted records, double-count aggregates, or drive counters negative. Trace the gap between read and write for each operation
+- If the PR performs multi-item writes (database transactions, batch operations), verify each write includes condition expressions that prevent stale-read races (TOCTOU) — an unconditioned write after a read can upsert deleted records, double-count aggregates, or drive counters negative. Trace the gap between read and write for each operation. Also verify that update/modify operations won't silently create records when the target key doesn't exist — database update operations often have implicit upsert semantics (e.g., DynamoDB UpdateItem, MongoDB update with upsert) that create partial records for invalid IDs; add existence condition expressions when the operation should only modify existing records
 - If the PR catches transaction/conditional failures, verify the error is translated to a client-appropriate status (409, 404) rather than bubbling as 500 — expected concurrency failures are not server errors
 
 **Batch/paginated API consumption**
@@ -213,6 +213,10 @@ Check every file against this checklist. The checklist is organized into tiers
 **Abstraction layer fidelity**
 - If the PR calls a third-party API through an internal wrapper/abstraction layer, trace whether the wrapper requests and forwards all fields the handler depends on — third-party APIs often have optional response attributes that require explicit opt-in (e.g., cancellation reasons, extended metadata). Code branching on fields the wrapper doesn't forward will silently receive `undefined` and take the wrong path. Also verify that test mocks match what the real wrapper returns, not what the underlying API could theoretically return
 - If the PR passes multiple parameters through a wrapper/abstraction layer to an underlying API, check whether any parameter combinations are mutually exclusive in the underlying API (e.g., projection expressions + count-only select modes) — the wrapper should strip conflicting parameters rather than forwarding all unconditionally, which causes validation errors at the underlying layer
+- If the PR calls framework or library functions with discriminated input formats (e.g., content paths vs script paths, different loader functions per format), trace each call site to verify the function variant used actually handles the input format being passed — especially fallback/default branches in multi-format dispatchers, where the fallback commonly uses the wrong function. Also verify positional argument order matches the called function's parameter order (not assumed from variable names) and that the object type passed matches what the API expects (e.g., asset object vs class reference, property access vs method call)
+
+**Parameter consumption tracing**
+- If the PR adds a function with validated input parameters (schema validation, input decorators, type annotations), trace each validated parameter through to where it's actually consumed in the implementation. Parameters that pass validation but are never read create dead API surface — callers believe they're configuring behavior that's silently ignored. Either wire the parameter through or remove it from the public API
 
 **Summary/aggregation endpoint consistency**
 - If the PR adds a summary or dashboard endpoint that aggregates counts/previews across multiple data sources, trace each category's computation logic against the corresponding detail view it links to — verify they apply the same filters (e.g., orphan exclusion, status filtering), the same ordering guarantees (sort keys that actually exist on the queried index), and that navigation links propagate the aggregated context (e.g., `?status=pending`) so the destination page matches what the summary promised
@@ -240,6 +244,9 @@ Check every file against this checklist. The checklist is organized into tiers
 **Bulk vs single-item operation parity**
 - If the PR modifies a single-item CRUD operation (create, update, delete) to handle new fields or apply new logic, trace the corresponding bulk/batch operation for the same entity — it often has its own independent implementation that won't pick up the change. Verify both paths handle the same fields, apply the same validation, and preserve the same secondary data
 
+**Bulk operation selection lifecycle**
+- If the PR adds operations that act on a user-selected subset of items (bulk actions, batch operations), trace the complete lifecycle of the selection state: when is it cleared (data refresh, item deletion), when is it not cleared but should be (filter/sort/page changes), and whether the operation re-validates the selection at execution time (especially after confirmation dialogs where the underlying data may change between display and confirmation)
+
 **Config value provenance for auto-upgrade**
 - If the PR adds auto-upgrade logic that replaces config values with newer defaults (prompt versions, schema migrations, template updates), verify the code can distinguish "user customized this value" from "this is the previous default." Without provenance tracking (version stamps, customization flags, or comparison against known previous defaults), auto-upgrade will overwrite intentional user customizations or skip legitimate upgrades
package/commands/do/rpr.md
CHANGED
@@ -54,6 +54,8 @@ Address the latest code review feedback on the current branch's pull request usi
 
 9. **Request another Copilot review** (only if `is_fork_pr=false`): After pushing fixes, request a fresh Copilot code review and repeat from step 3 until the review passes clean. **Skip for fork-to-upstream PRs.**
 
+**Repeated-comment dedup**: When fetching threads after a new Copilot review round, compare each new unresolved thread's comment body and file/line against threads from the previous round that were intentionally left unresolved (replied to as non-issues or disagreements). If all new unresolved threads are repeats of previously-dismissed feedback, treat the review as clean (no new actionable comments) and exit the loop.
+
 10. **Report summary**: Print a table of all threads addressed with file, line, and a brief description of the fix. Include a final count line: "Resolved X/Y threads." If any threads remain unresolved, list them with reasons (unclear feedback, disagreement, requires user input).
 
 !`cat ~/.claude/lib/graphql-escaping.md`
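The dedup rule added above reduces to a set-membership test over normalized (path, line, body) keys. A minimal sketch; the `Thread` shape is assumed for illustration, not the actual GraphQL schema:

```typescript
type Thread = { path: string; line: number | null; body: string };

// Normalize whitespace so trivially reformatted repeats still match.
const threadKey = (t: Thread): string =>
  `${t.path}:${t.line ?? ""}:${t.body.replace(/\s+/g, " ").trim()}`;

// True when every new unresolved thread repeats previously dismissed
// feedback, meaning the review round added nothing actionable.
function onlyRepeats(newThreads: Thread[], dismissed: Thread[]): boolean {
  const seen = new Set(dismissed.map(threadKey));
  return newThreads.every((t) => seen.has(threadKey(t)));
}
```

An empty set of new threads is vacuously clean, which matches the intended "exit the loop" behavior.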
@@ -78,7 +80,7 @@
 gh api graphql -f query='{ repository(owner: "OWNER", name: "REPO") { pullRequest(number: PR_NUM) { reviews(last: 3) { nodes { state body author { login } submittedAt } } reviewThreads(first: 100) { nodes { id isResolved comments(first: 3) { nodes { body path line author { login } } } } } } } }'
 ```
 
-**Dynamic poll timing**: Before your first poll, check how long the most recent Copilot review on this PR took by comparing consecutive Copilot review `submittedAt` timestamps (or PR creation time for the first review). Use that duration as your expected wait. If no prior review exists, default to 5 minutes.
+**Dynamic poll timing**: Before your first poll, check how long the most recent Copilot review on this PR took by comparing consecutive Copilot review `submittedAt` timestamps (or PR creation time for the first review). Use that duration as your expected wait. If no prior review exists, default to 5 minutes. Use **progressive poll intervals**: 15s, 15s, 30s, 30s, then 60s thereafter — small diffs often complete in under a minute, so early frequent checks avoid wasting time. Set max wait to **2x the expected duration** (minimum 5 minutes, maximum 20 minutes). Copilot reviews can take **10-15 minutes** for large diffs — do NOT give up early.
 
 The review is complete when a new `copilot-pull-request-reviewer` review node appears. If no review appears after max wait: **Default mode**: auto-skip and continue. **Interactive mode (`--interactive`)**: ask the user whether to continue waiting, re-request, or skip.
 
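The timing rules above (progressive intervals, max wait of 2x the expected duration clamped to 5 to 20 minutes) can be sketched as a schedule generator; a hypothetical helper mirroring the described policy:

```typescript
// Build the poll schedule: 15s, 15s, 30s, 30s, then 60s steps until the
// cumulative wait covers maxWait = clamp(2 * expected, 5 min, 20 min).
function pollSchedule(expectedMs: number): number[] {
  const maxWaitMs = Math.min(Math.max(expectedMs * 2, 5 * 60_000), 20 * 60_000);
  const warmup = [15_000, 15_000, 30_000, 30_000];
  const intervals: number[] = [];
  let elapsed = 0;
  for (let i = 0; elapsed < maxWaitMs; i++) {
    const step = i < warmup.length ? warmup[i] : 60_000;
    intervals.push(step);
    elapsed += step;
  }
  return intervals;
}
```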
package/lib/code-review-checklist.md
CHANGED
@@ -41,7 +41,7 @@
 
 **Async & state consistency** _[applies when: code uses async/await, Promises, or UI state]_
 - Optimistic state changes (view switches, navigation, success callbacks) before async completion — if the operation fails or is cancelled, the UI is stuck with no rollback. Check return values/errors before calling success callbacks. Handle both failure and cancellation paths. Watch for `.catch(() => null)` followed by unconditional success code (toast, state update) — the catch silences the error but the success path still runs. Either let errors propagate naturally or check the return value before proceeding
-- Multiple coupled state variables updated independently — actions that change one must update all related fields; debounced/cancelable operations must reset loading state on every exit path (cleared, stale, failed, aborted). Component state initialized from props via `useState(prop)` only captures the initial value — if the prop updates asynchronously (data fetch, parent re-render), the local state goes stale. Sync with an effect when the user is not actively editing, or lift state to avoid the copy
+- Multiple coupled state variables updated independently — actions that change one must update all related fields; debounced/cancelable operations must reset loading state on every exit path (cleared, stale, failed, aborted). Reference/selection sets that point to items in a data collection must be pruned when items are removed and invalidated when the collection is reloaded, filtered, paginated, or sorted — stale references send nonexistent IDs to downstream operations. Operations triggered from a confirmation dialog must re-validate preconditions (selection non-empty, items still exist) at execution time — the underlying data may change between dialog display and user confirmation. Component state initialized from props via `useState(prop)` only captures the initial value — if the prop updates asynchronously (data fetch, parent re-render), the local state goes stale. Sync with an effect when the user is not actively editing, or lift state to avoid the copy
 - Error notification at multiple layers (shared API client + component-level) — verify exactly one layer owns user-facing error messages. For periodic polling, also check that error notifications are throttled or deduplicated (only fire on state transitions like success→error, not on every failed iteration) and that failure doesn't make the UI section disappear entirely (component returning null when data is null/errored) — render an error or stale-data state instead of absence
 - Optimistic updates using full-collection snapshots for rollback — a second in-flight action gets clobbered. Use per-item rollback and functional state updaters after async gaps; sync optimistic changes to parent via callback or trigger refetch on remount. When appending items to a list optimistically, guard against duplicates (check existence before append) — concurrent or repeated operations can insert the same item multiple times
 - State updates guarded by truthiness of the new value (`if (arr?.length)`) — prevents clearing state when the source legitimately returns empty. Distinguish "no response" from "empty response"
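The selection-lifecycle rule in the bullet above can be illustrated with a minimal sketch. The `Selection` helper is hypothetical, assuming items keyed by a string `id`:

```typescript
// Keep a selection of IDs consistent with its backing collection:
// prune IDs that no longer exist, clear wholesale on reload/filter/sort,
// and re-validate at execution time (after any confirmation dialog).
class Selection<T extends { id: string }> {
  private ids = new Set<string>();

  toggle(id: string): void {
    if (this.ids.has(id)) this.ids.delete(id);
    else this.ids.add(id);
  }

  // Call whenever items are removed from the collection in place.
  prune(items: T[]): void {
    const live = new Set(items.map((i) => i.id));
    for (const id of this.ids) if (!live.has(id)) this.ids.delete(id);
  }

  // Call on reload, re-filter, re-sort, or page change.
  clear(): void {
    this.ids.clear();
  }

  // Re-validate against current data at execution time.
  confirmed(items: T[]): string[] {
    this.prune(items);
    return [...this.ids];
  }
}
```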
@@ -80,9 +80,10 @@
 - Summary/aggregation endpoints that compute counts or previews via a different query path, filter set, or data source than the detail views they link to — users see inconsistent numbers between the dashboard and the destination page. Trace the computation logic in both paths and verify they apply the same filters, exclusions, and ordering guarantees (or document the intentional difference)
 - When a validation/sanitization/normalization function is introduced for a field, trace ALL write paths (create, update, sync, import, raw/bulk persist) — partial application means invalid values re-enter through the unguarded path. This includes structural normalization (ID prefixes, required defaults, shape invariants) that the read/parse path depends on — a "raw" write path that skips normalization produces data that changes identity or shape on reload
 - Stored config/settings merged with hardcoded defaults using shallow spread — nested objects in the stored copy entirely replace the default, dropping newly added default keys on upgrade. Use deep merge for nested config objects (while preserving explicit `null` to clear a field), or flatten the config structure so shallow merge suffices
-- Schema fields accepting values downstream code can't handle; Zod/schema stripping fields the service reads (silent `undefined`); config values persisted but silently ignored by the implementation — trace each field through schema → service → consumer. Update schemas derived from create schemas (e.g., `.partial()`) must also make nested object fields optional — shallow partial on a deeply-required schema rejects valid partial updates. Additionally, `.deepPartial()` or `.partial()` on schemas with `.default()` values will apply those defaults on update, silently overwriting existing persisted values with defaults — create explicit update schemas without defaults instead
+- Schema fields accepting values downstream code can't handle; Zod/schema stripping fields the service reads (silent `undefined`); config values persisted but silently ignored by the implementation — trace each field through schema → service → consumer. Also check for parameters accepted and validated in the schema but never consumed by the implementation — dead API surface that misleads callers into believing they're configuring behavior that's silently ignored; remove unused parameters or wire them through to the implementation. Update schemas derived from create schemas (e.g., `.partial()`) must also make nested object fields optional — shallow partial on a deeply-required schema rejects valid partial updates. Additionally, `.deepPartial()` or `.partial()` on schemas with `.default()` values will apply those defaults on update, silently overwriting existing persisted values with defaults — create explicit update schemas without defaults instead
+- Multi-part UI features (e.g., table header + rows) whose rendering is gated on different prop/condition subsets — if the header checks prop A while rows check prop B, partial provision causes structural misalignment (column count mismatch, orphaned interactive elements without handlers). Derive a single enablement boolean from the complete prop set and use it consistently across all participating components
 - Entity creation without case-insensitive uniqueness checks — names differing only in case (e.g., "MyAgent" vs "myagent") cause collisions in case-insensitive contexts (file paths, git branches, URLs). Normalize to lowercase before comparing
-- Code reading properties from API responses, framework-provided objects, or internal abstraction layers using field names the source doesn't populate or forward — silent `undefined`. Verify property names and nesting depth match the actual response shape (e.g., `response.items` vs `response.data.items`, `obj.placeId` vs `obj.id`, flat fields vs nested sub-objects). When building a new consumer against an existing API, check the producer's actual response — not assumed conventions. When branching on fields from a wrapped third-party API, confirm the wrapper actually requests and forwards those fields (e.g., optional response attributes that require explicit opt-in)
+- Code reading properties from API responses, framework-provided objects, or internal abstraction layers using field names the source doesn't populate or forward — silent `undefined`. Verify property names and nesting depth match the actual response shape (e.g., `response.items` vs `response.data.items`, `obj.placeId` vs `obj.id`, flat fields vs nested sub-objects). When building a new consumer against an existing API, check the producer's actual response — not assumed conventions. When branching on fields from a wrapped third-party API, confirm the wrapper actually requests and forwards those fields (e.g., optional response attributes that require explicit opt-in). Also verify call sites pass inputs in the format the called function actually accepts — framework constructors with non-obvious positional argument order, loaders with format-specific variants (content paths vs script paths, asset objects vs class references), and accessor APIs with distinct method-vs-property semantics. Fallback branches in multi-format dispatchers commonly use the wrong function for the input type
 - Data model fields that have different names depending on the creation/write path (e.g., `createdAt` vs `created`) — code referencing only one naming convention silently misses records created through other paths. Trace all write paths to discover the actual field names in use. When new logic (access control, UI display, queries) checks only a newly introduced field, verify it falls back to any legacy field that existing records still use — otherwise records created before the migration are silently excluded or inaccessible. Also check entity identity keys: if code looks up or matches entities using a computed key (e.g., `e.id || e.externalId`), all code paths that perform the same lookup must use the same key computation — one path using `e.id` while another uses `e.id || e.externalId` causes mismatches for entities missing the primary key
 - Entity type changes without invariant revalidation — when an entity has a discriminator field (type, kind, category) and the user changes it, all type-specific invariants must be enforced on the new type AND type-specific fields from the old type must be cleared or revalidated. A job changing from `shell` to `agent` without clearing `command`, or changing to `shell` without requiring `command`, leaves the entity in an invalid hybrid state that fails at runtime or resurfaces stale data
 - Invariant relationships between configuration flags (flag A implies flag B) not enforced across all layers — UI toggle handlers, API validation schemas, server default-application functions, and serialization/deserialization must all preserve the invariant. If any layer allows setting A=true with B=false (or vice versa), cascading defaults and toggle logic produce contradictory state. Trace the invariant through: UI state handlers, form submission, route validation, service defaults, and persistence round-trip
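The shallow-spread pitfall flagged above can be illustrated with a deep merge that keeps newly added default keys while still letting an explicit `null` clear a field. A minimal sketch, not a production merge utility:

```typescript
type Json = null | boolean | number | string | Json[] | { [k: string]: Json };

function isPlainObject(v: Json): v is { [k: string]: Json } {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Merge `stored` over `defaults`. Nested objects merge key-by-key, so
// default keys added in an upgrade survive; any non-object stored value
// (including explicit null) wins outright.
function deepMerge(defaults: Json, stored: Json): Json {
  if (!isPlainObject(defaults) || !isPlainObject(stored)) return stored;
  const out: { [k: string]: Json } = { ...defaults };
  for (const [k, v] of Object.entries(stored)) {
    out[k] = k in out ? deepMerge(out[k], v) : v;
  }
  return out;
}
```

Compare with `{ ...defaults, ...stored }`, where `stored.ui` would replace `defaults.ui` wholesale and drop any default key the stored copy predates.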
@@ -103,7 +104,7 @@
 
 **Input handling** _[applies when: code accepts user/external input]_
 - Trimming values where whitespace is significant (API keys, tokens, passwords, base64) — only trim identifiers/names
-- Endpoints accepting unbounded arrays/collections without upper limits — enforce max size or move to background jobs. Also check internal operations that fan out unbounded parallel I/O (e.g., `Promise.all(files.map(readFile))`) — large collections risk EMFILE (too many open file descriptors) or memory exhaustion. Use a concurrency limiter or batch processing for collections that can grow without bound
+- Endpoints accepting unbounded arrays/collections without upper limits — enforce max size or move to background jobs. Also validate element-level invariants (types, format, non-empty) and deduplicate — duplicate elements inflate operation counts, repeat side effects, and skew success/failure metrics. Also check internal operations that fan out unbounded parallel I/O (e.g., `Promise.all(files.map(readFile))`) — large collections risk EMFILE (too many open file descriptors) or memory exhaustion. Use a concurrency limiter or batch processing for collections that can grow without bound
 - Security/sanitization functions (redaction, escaping, validation) that only handle one input format — if data can arrive in multiple formats (JSON `"KEY": "value"`, shell `KEY=value`, URL-encoded, headers), the function must cover all formats present in the system or sensitive data leaks through the unhandled format
 
 ## Tier 3 — Domain-Specific (Check Only When File Type Matches)
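The concurrency-limiter remedy named in the unbounded-fan-out bullet can be sketched in a few lines; a minimal replacement for `Promise.all(items.map(fn))` that bounds in-flight work:

```typescript
// Run an async mapper with at most `limit` operations in flight,
// preserving result order. Avoids EMFILE/memory blowups from
// unbounded Promise.all fan-out.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++; // safe: no await between check and increment
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker)
  );
  return results;
}
```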
@@ -205,7 +206,7 @@
 **Test coverage**
 - New logic/schemas/services without corresponding tests when similar existing code has tests
 - New error paths untestable because services throw generic errors instead of typed ones
-- Tests re-implementing logic under test instead of importing real exports — pass even when real code regresses
+- Tests re-implementing logic under test instead of importing real exports — pass even when real code regresses. Includes tests that assert by inspecting function source code (string-matching implementation details) rather than calling the function and checking behavior — they break on harmless refactors while missing actual behavioral changes. Also tests that mutate global state at import time (module registries, sys.modules) without fixture-scoped cleanup — causes ordering-dependent failures across the test session
 - Tests depending on real wall-clock time or external dependencies when testing logic — use fake timers and mocks
 - Missing tests for trust-boundary enforcement — submit tampered values, verify server ignores them
 - Tests that exercise code paths depending on features the integration layer doesn't expose — they pass against mocks but the behavior can't trigger in production. Verify mocked responses match what the real dependency actually returns
package/package.json
CHANGED