slash-do 2.11.0 → 2.13.0

package/README.md CHANGED
@@ -63,6 +63,7 @@ All commands live under the `do:` namespace:
  | `/do:review` | Deep code review against best practices |
  | `/do:better` | Full DevSecOps audit with 8-agent scan and remediation |
  | `/do:better-swift` | SwiftUI DevSecOps audit with multi-platform coverage |
+ | `/do:scan` | Read-only safety audit of an unfamiliar directory — flags malware patterns, network calls, and vulnerable deps without executing code |
  | `/do:depfree` | Audit dependencies, remove unnecessary ones, write replacement code |
  | `/do:goals` | Generate GOALS.md from codebase analysis |
  | `/do:replan` | Review and clean up PLAN.md |
@@ -67,6 +67,7 @@ Key behavioral changes when `HEAVY_MODE` is `true`:
 
  When compacting during this workflow, always preserve:
  - The `DEPENDENCY_MAP` (complete classification of all dependencies)
+ - The `PRIOR_DECISIONS` map loaded from `./docs/DEPS.md`
  - All REMOVABLE findings with package names and usage details
  - The current phase number and what phases remain
  - All PR numbers and URLs created so far
@@ -111,6 +112,26 @@ Record as `BUILD_CMD` and `TEST_CMD`.
  - Record `DEFAULT_BRANCH` via `gh repo view --json defaultBranchRef --jq '.defaultBranchRef.name'` (or `glab` equivalent)
  - Record `IS_DIRTY` via `git status --porcelain`
 
+ ### 0e: Load Prior Decisions
+
+ Read `{REPO_DIR}/docs/DEPS.md` if it exists. This file records decisions from prior `/do:depfree` runs and is used to skip re-evaluation of dependencies that have already been audited.
+
+ Parse the file into `PRIOR_DECISIONS` — a map keyed by package name, with values:
+ - `decision`: one of `KEPT_TIER1`, `KEPT_AUDITED`, `KEPT_TRANSITIVE`, `REMOVED`, `REVERTED`, `SKIPPED_INFEASIBLE`
+ - `major_version`: the major version that was evaluated (e.g., `18` for react@18.x)
+ - `mode`: the mode the decision was made under (`default`, `heavy`, or `both`)
+ - `reason`: the rationale recorded
+ - `decision_date`: ISO date the decision was made
+
+ If the file does not exist, set `PRIOR_DECISIONS` to an empty map. The file will be created in Phase 4c only when remediation runs proceed past the scan-only phases (i.e., `--scan-only` was not passed).
+
+ A prior decision is **valid for skipping re-evaluation** when ALL of these are true:
+ 1. The package is still in the manifest at the same major version
+ 2. The recorded mode matches the current run mode (a `default` decision does NOT skip a `heavy` run; `both` skips either; `heavy` skips a `default` run)
+ 3. The decision is not `REMOVED` or `REVERTED` (those packages should not be in the manifest; if they are, treat as new)
+
+ Otherwise, the dependency is re-evaluated in Phase 1 normally.
+
 
  ## Phase 1: Dependency Inventory
 
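The Phase 0e skip rules in the hunk above can be read as a small predicate. The following is a minimal Python sketch under stated assumptions, not code from slash-do: the `PriorDecision` class and the `mode_covers` and `can_skip` helpers are hypothetical names, and parsing `docs/DEPS.md` into the map is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PriorDecision:
    """One entry of PRIOR_DECISIONS (hypothetical shape, fields from Phase 0e)."""
    decision: str       # KEPT_TIER1, KEPT_AUDITED, KEPT_TRANSITIVE, REMOVED, REVERTED, SKIPPED_INFEASIBLE
    major_version: int  # major version that was evaluated
    mode: str           # "default", "heavy", or "both"
    reason: str
    decision_date: str  # ISO date


def mode_covers(recorded: str, current: str) -> bool:
    """Rule 2: `both` and `heavy` cover either run mode; `default` covers only default runs."""
    if recorded in ("both", "heavy"):
        return True
    return recorded == current


def can_skip(prior: Optional[PriorDecision],
             manifest_major: Optional[int],
             current_mode: str) -> bool:
    """Return True only when all three Phase 0e conditions hold."""
    if prior is None or manifest_major is None:
        return False  # no prior record, or package no longer in the manifest
    if prior.major_version != manifest_major:
        return False  # rule 1: major version changed
    if prior.decision in ("REMOVED", "REVERTED"):
        return False  # rule 3: treat as a new dependency
    return mode_covers(prior.mode, current_mode)  # rule 2
```

Under this reading, a `default` decision is re-evaluated on a `heavy` run, while a `heavy` or `both` decision is carried forward in either mode.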
@@ -138,7 +159,15 @@ Based on `PROJECT_TYPE`, extract the full dependency list:
 
  ### 1b: Classify Dependencies
 
- For each dependency, classify it into one of three tiers:
+ For each dependency, first check `PRIOR_DECISIONS` (from Phase 0e). If a valid prior decision exists for the package + major version + mode, carry it forward:
+ - `KEPT_TIER1` → classify as **Tier 1** (skip further audit)
+ - `KEPT_AUDITED` → classify as **Tier 2** with recommendation **KEEP** (skip Phase 1c usage analysis)
+ - `KEPT_TRANSITIVE` → classify as **Tier 2** with recommendation **KEEP (transitive)** (skip Phase 1c usage analysis and Phase 1d transitive check; the prior `Kept Via` chain is recorded)
+ - `SKIPPED_INFEASIBLE` → classify as **Tier 2** with recommendation **KEEP** (skip Phase 1c usage analysis)
+
+ Record carried-forward decisions in `DEPENDENCY_MAP` with a `from_prior: true` flag. Print one line per skipped dependency: `↻ {package}@{major} — carrying forward prior {decision} ({decision_date})`.
+
+ For all other dependencies (no prior decision, major version bump, or mode escalation from default → heavy), classify it into one of three tiers:
 
  **Tier 1 — ACCEPTABLE (keep without question):**
  Large, widely-audited, foundational libraries. Examples by ecosystem:
@@ -217,7 +246,7 @@ Record the full classification as `DEPENDENCY_MAP`.
 
  ### 1c: Usage Analysis (Tier 2 & 3 only)
 
- For each Tier 2 and Tier 3 dependency, launch parallel Explore agents (using `AUDIT_MODEL`) to determine actual usage:
+ Skip any dependency with `from_prior: true` (carried forward from Phase 1b). For all remaining Tier 2 and Tier 3 dependencies, launch parallel Explore agents (using `AUDIT_MODEL`) to determine actual usage:
 
  Each agent should:
  1. Search all source files for imports/requires of the package
@@ -455,7 +484,75 @@ After all replacement agents complete:
  - Correct error handling at system boundaries
  3. Fix any issues found, commit each fix separately
 
- ### 4c: Verify No Phantom Dependencies
+ ### 4c: Update DEPS.md
+
+ Write the consolidated decision record to `{WORKTREE_DIR}/docs/DEPS.md`. Create the `docs/` directory if it does not exist.
+
+ Build the new file from:
+ 1. **All carried-forward entries** from `PRIOR_DECISIONS` whose packages are still in the manifest at the same major version (preserve `decision_date` and `mode`)
+ 2. **New decisions** from this run:
+    - Each Tier 1 package → `KEPT_TIER1`
+    - Each Tier 2 package with KEEP recommendation → `KEPT_AUDITED`
+    - Each Tier 2/3 package downgraded to KEEP (transitive) in Phase 1d → `KEPT_TRANSITIVE`
+    - Each successfully removed package → `REMOVED`
+    - Each reverted package (replacement failed in 4a) → `REVERTED`
+    - Each skipped package (replacement infeasible / >2x estimate / >300 lines in heavy) → `SKIPPED_INFEASIBLE`
+ 3. **Mode merging**: if a prior decision was `default` and this run is `heavy` (or vice versa) and both runs reached the same conclusion for the same package + major version, set `mode` to `both`. Otherwise the new run's mode overwrites.
+
+ Use this layout:
+
+ ```markdown
+ # Dependency Audit Decisions
+
+ Auto-maintained by `/do:depfree`. Records prior audit decisions so repeat runs
+ skip re-evaluation. Re-audit triggers: major version bump, heavy-mode run after
+ default-mode decision, or manual deletion of an entry.
+
+ Last updated: {YYYY-MM-DD}
+
+ ## Kept — Tier 1 (foundational)
+
+ | Package | Major | Mode | Reviewed | Reason |
+ |---------|-------|------|----------|--------|
+ | ... | ... | ... | ... | ... |
+
+ ## Kept — Tier 2 (audited)
+
+ | Package | Major | Mode | Reviewed | Reason |
+ |---------|-------|------|----------|--------|
+
+ ## Kept — Transitive
+
+ | Package | Major | Mode | Reviewed | Kept Via |
+ |---------|-------|------|----------|----------|
+
+ ## Removed
+
+ | Package | Major | Mode | Removed | Replacement |
+ |---------|-------|------|---------|-------------|
+
+ ## Reverted (replacement failed, kept in manifest)
+
+ | Package | Major | Mode | Reviewed | Reason |
+ |---------|-------|------|----------|--------|
+
+ ## Skipped (replacement infeasible)
+
+ | Package | Major | Mode | Reviewed | Reason |
+ |---------|-------|------|----------|--------|
+ ```
+
+ Sort each section alphabetically by package name.
+
+ Commit the change (only if the file actually changed):
+ ```bash
+ git -C {WORKTREE_DIR} add -- docs/DEPS.md
+ if ! git -C {WORKTREE_DIR} diff --cached --quiet -- docs/DEPS.md; then
+   git -C {WORKTREE_DIR} commit -m "docs: update DEPS.md with audit decisions"
+ fi
+ ```
+
+ ### 4d: Verify No Phantom Dependencies
 
  Confirm no source file still references a removed package:
  ```bash
@@ -518,7 +615,9 @@ Estimated supply chain attack surface reduction: {N} packages ({transitive count
  - [ ] Build passes
  - [ ] All tests pass
  - [ ] No phantom references to removed packages
- - [ ] Lock file updated"
+ - [ ] Lock file updated
+ - [ ] \`docs/DEPS.md\` updated with audit decisions
+ "
 
  gh pr create --head depfree/{DATE} --base {DEFAULT_BRANCH} \
  --title "$PR_TITLE" \
@@ -622,6 +721,7 @@ Transitive deps eliminated: ~{count} (estimated)
 
  - This command complements `/do:better` — run `depfree` for dependency hygiene, `better` for code quality
  - All remediation happens in an isolated worktree — the user's working directory is never modified
+ - `docs/DEPS.md` is the persistent decision log. It is read at the start of every run (Phase 0e) to skip re-evaluation of unchanged dependencies, and rewritten at the end of Phase 4c with the merged set of prior + current decisions. Major version bumps and heavy-mode escalations bypass the cache. Manually delete an entry to force re-audit on the next run
  - **Default mode**: the threshold for "acceptable" libraries is deliberately generous — the goal is to remove obvious attack surface, not to rewrite everything
  - **Heavy mode**: the threshold narrows to foundational frameworks only — the goal is to own as much code as feasibly possible, eliminating supply chain risk from individual maintainers and small projects
  - Replacement code should be minimal and focused — don't over-engineer utilities that replace single-purpose packages
@@ -26,6 +26,7 @@ List all available `/do:*` commands with their descriptions.
  | `/do:replan` | Review and clean up PLAN.md, extract docs from completed work |
  | `/do:review` | Deep code review of changed files against best practices |
  | `/do:rpr` | Resolve PR review feedback with parallel agents |
+ | `/do:scan` | Read-only safety audit of an unfamiliar directory — flags malware patterns, network calls, and vulnerable deps without executing code |
  | `/do:update` | Update slashdo commands to the latest version |
 
  2. **Check for updates**: Run `npm view slash-do version` and compare to the installed version in `~/.claude/.slashdo-version`. If an update is available, mention it.
@@ -32,21 +32,31 @@ Before dispatching agents, understand what this change set claims to do:
 
  ## Dispatch Review Agents
 
- Read the three agent instruction files, then spawn **all three in parallel** using the Agent tool with `model: "opus"`. Each agent reviews ALL changed files independently. Opus-class reasoning catches issues that require drawing on broad software engineering principles, not just pattern-matching against checklists.
+ Read the five agent instruction files, then spawn **all five in parallel** using the Agent tool with `model: "opus"`. Each agent reviews ALL changed files independently. Opus-class reasoning catches issues that require drawing on broad software engineering principles, not just pattern-matching against checklists.
 
  <surface_scan_agent>
 
- ### 1. Surface Scan Agent
+ ### 1. Surface Scan Agent (Runtime)
 
- Catches per-file bugs: runtime crashes, hygiene, domain-specific issues, quality, and convention violations.
+ Catches per-file RUNTIME bugs: crashes, type/coercion errors, async/state, error handling, streaming, plus domain-specific runtime patterns (SQL, shell, wire protocols, accessibility).
 
  !`cat ~/.claude/lib/review-surface-scan.md`
 
  </surface_scan_agent>
 
+ <surface_quality_agent>
+
+ ### 2. Surface Quality Agent
+
+ Catches per-file QUALITY issues: intent-vs-implementation drift, AI-generated code patterns, dead config, missing tests, supply chain hygiene, style.
+
+ !`cat ~/.claude/lib/review-surface-quality.md`
+
+ </surface_quality_agent>
+
  <security_agent>
 
- ### 2. Security Audit Agent
+ ### 3. Security Audit Agent
 
  Catches trust boundary violations, injection, SSRF, data exposure, and access control gaps.
 
@@ -54,15 +64,25 @@ Catches trust boundary violations, injection, SSRF, data exposure, and access co
 
  </security_agent>
 
- <cross_file_agent>
+ <cross_file_tracing_agent>
 
- ### 3. Cross-File Tracing Agent
+ ### 4. Cross-File Tracing Agent (State/Lifecycle)
 
- Catches contract mismatches, broken call chains, stale state propagation, lifecycle gaps, and architectural violations.
+ Catches STATE/LIFECYCLE issues across files: stale state propagation, lifecycle gaps (mount/unmount, init/cleanup, started/completed), resource leaks, lock/flag exit paths, concurrent-mutation races.
 
  !`cat ~/.claude/lib/review-cross-file-tracing.md`
 
- </cross_file_agent>
+ </cross_file_tracing_agent>
+
+ <cross_file_contract_agent>
+
+ ### 5. Cross-File Contract Agent
+
+ Catches CONTRACT issues across files: schema/shape agreements, validation parity, error classification, field-set enumerations, intent-vs-implementation claims spanning files, architectural-pattern adherence.
+
+ !`cat ~/.claude/lib/review-cross-file-contract.md`
+
+ </cross_file_contract_agent>
 
  ### How to dispatch
 
@@ -72,7 +92,7 @@ For each agent, construct its prompt by combining:
  3. The list of changed files from the diff stat
  4. Instruction: "Read each changed file in full (not just diff hunks). Apply your checklist. Return structured findings."
 
- Spawn all three agents simultaneously. Each returns its findings independently.
+ Spawn all five agents simultaneously. Each returns its findings independently.
 
  ### Large PR handling
 
@@ -80,10 +100,10 @@ If the diff touches more than 20 files, tell each agent to batch files by direct
 
  ## Collect & Deduplicate
 
- After all three agents return:
+ After all five agents return:
 
  1. **Merge** all findings into a single list, tagged by source agent
- 2. **Deduplicate**: if two agents flagged the same `file:line` with overlapping descriptions, keep the most detailed version and note both agents found it
+ 2. **Deduplicate**: if two agents flagged the same `file:line` with overlapping descriptions, keep the most detailed version and note all agents that found it (overlap between Surface Scan and Surface Quality, or between Cross-File Tracing and Cross-File Contract, is expected for borderline issues — that's signal a finding is real, not noise)
  3. **PR coherence**: verify commits deliver what they claim — flag discrepancies as IMPROVEMENT findings
  4. **CLAUDE.md filter**: remove findings that conflict with explicit project conventions
 
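The merge and deduplicate steps in the hunk above could be sketched roughly as follows. This is an illustrative Python sketch, not the command's implementation: the `Finding` class and `deduplicate` function are hypothetical names, "overlapping descriptions" is approximated by a word-set Jaccard similarity with an arbitrary 0.3 threshold, and "most detailed" is approximated by the longest description.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    """A single review finding (hypothetical shape)."""
    file: str
    line: int
    description: str
    agents: set[str] = field(default_factory=set)  # source agents that reported it


def _overlaps(a: str, b: str, threshold: float = 0.3) -> bool:
    """Assumed overlap test: Jaccard similarity of lowercase word sets."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return False
    return len(words_a & words_b) / len(words_a | words_b) >= threshold


def deduplicate(findings: list[Finding]) -> list[Finding]:
    """Merge findings that share file:line and have overlapping descriptions."""
    merged: list[Finding] = []
    for f in findings:
        for m in merged:
            same_spot = (m.file, m.line) == (f.file, f.line)
            if same_spot and _overlaps(m.description, f.description):
                m.agents |= f.agents  # note every agent that found it
                if len(f.description) > len(m.description):
                    m.description = f.description  # keep the more detailed text
                break
        else:
            # No existing match: copy so the caller's objects are not mutated.
            merged.append(Finding(f.file, f.line, f.description, set(f.agents)))
    return merged
```

A finding reported by several agents keeps one entry with all agent names attached, which matches the note that such overlap is treated as signal.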
@@ -116,13 +136,15 @@ Print a summary table of what was reviewed and found:
 
  | Agent | Files Checked | Issues Found | Fixed |
  |-------|--------------|-------------|-------|
- | Surface Scan | N | N | N |
+ | Surface Scan (Runtime) | N | N | N |
+ | Surface Quality | N | N | N |
  | Security Audit | N | N | N |
- | Cross-File Tracing | N | N | N |
+ | Cross-File Tracing (State) | N | N | N |
+ | Cross-File Contract | N | N | N |
  | **Total** | **N** | **N** | **N** |
 
  ### Issues Fixed
- - file:line — description of fix (agent: Surface/Security/Cross-File)
+ - file:line — description of fix (agent: Surface-Scan / Surface-Quality / Security / Cross-File-Tracing / Cross-File-Contract)
 
  ### Accepted As-Is (with rationale)
  - file:line — description and why it's acceptable
@@ -87,12 +87,12 @@ Verify the request was accepted by checking that `Copilot` appears in the respon
 
  ### Poll for review completion
 
- Poll using GraphQL to check for a new review with a `submittedAt` timestamp after the request:
+ Poll using GraphQL to check for a new review with a `submittedAt` timestamp after the request. Use stdin JSON piping (per the GraphQL escaping guidance) to avoid shell-quoting fragility:
  ```bash
- gh api graphql -f query='{ repository(owner: "OWNER", name: "REPO") { pullRequest(number: PR_NUM) { reviews(last: 3) { nodes { state body author { login } submittedAt } } reviewThreads(first: 100) { nodes { id isResolved comments(first: 3) { nodes { body path line author { login } } } } } } } }'
+ echo '{"query":"{ repository(owner: \"OWNER\", name: \"REPO\") { pullRequest(number: PR_NUM) { reviews(last: 3) { nodes { state body author { login } submittedAt } } reviewThreads(first: 100) { nodes { id isResolved comments(first: 3) { nodes { body path line author { login } } } } } } } }"}' | gh api graphql --input -
  ```
 
- **Dynamic poll timing**: Before your first poll, check how long the most recent Copilot review on this PR took by comparing consecutive Copilot review `submittedAt` timestamps (or PR creation time for the first review). Use that duration as your expected wait. If no prior review exists, default to 2 minutes. Use **progressive poll intervals**: 10s, 10s, 15s, 15s, then 30s thereafter — small diffs often complete in under a minute, so early frequent checks avoid wasting time. Set max wait to **2x the expected duration** (minimum 2 minutes, maximum 10 minutes). Copilot reviews typically complete in **2-5 minutes**; large diffs may take longer do NOT give up early.
+ **Dynamic poll timing**: Before your first poll, check how long the most recent Copilot review on this PR took by comparing consecutive Copilot review `submittedAt` timestamps (or PR creation time for the first review). Use that duration as your expected wait. If no prior review exists, default to **60 seconds**. Use **progressive poll intervals**: 5s, 5s, 10s, 10s, then 15s thereafter — Copilot reviews on small diffs typically land in **30–90 seconds**, so an early first check avoids burning a full minute on a review that's already sitting in the API. Set max wait to **3x the expected duration** (minimum 90 seconds, maximum 5 minutes); only large diffs (200+ changed lines) should ever approach the max. If the review hasn't arrived by then, treat it as stuck rather than slow.
 
  The review is complete when a new `copilot-pull-request-reviewer` review node appears. If no review appears after max wait: **Default mode**: auto-skip and continue. **Interactive mode (`--interactive`)**: ask the user whether to continue waiting, re-request, or skip.
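The revised timing rules (the `+` side of the hunk above) can be expressed as a small schedule calculation. This is a minimal Python sketch under stated assumptions: `max_wait_seconds` and `poll_delays` are hypothetical helper names, the sketch only computes the sleep durations, and the actual `gh api graphql` poll between sleeps is omitted.

```python
from itertools import chain, repeat
from typing import Iterator, Optional


def max_wait_seconds(expected: Optional[float]) -> float:
    """3x the expected duration, clamped to [90 s, 300 s]; 60 s expected if unknown."""
    if expected is None:
        expected = 60.0  # default when no prior Copilot review exists
    return min(max(3 * expected, 90.0), 300.0)


def poll_delays(expected: Optional[float] = None) -> Iterator[float]:
    """Yield sleep durations: 5, 5, 10, 10, then 15 s until the max wait is used up."""
    budget = max_wait_seconds(expected)
    elapsed = 0.0
    for delay in chain([5, 5, 10, 10], repeat(15)):
        if elapsed >= budget:
            return  # max wait reached: treat the review as stuck
        delay = min(delay, budget - elapsed)  # never sleep past the budget
        elapsed += delay
        yield delay
```

A caller would sleep for each yielded delay, poll once, and stop early when a new `copilot-pull-request-reviewer` review node appears. If the generator is exhausted first, the default-mode or interactive-mode fallback described above applies.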