slash-do 2.4.0 → 2.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,13 +1,13 @@
  ---
  description: Audit third-party dependencies and remove unnecessary ones by writing replacement code
- argument-hint: "[--interactive] [--scan-only] [--no-merge] [specific packages to evaluate]"
+ argument-hint: "[--interactive] [--scan-only] [--no-merge] [--heavy] [specific packages to evaluate]"
  ---

  # Depfree — Dependency Freedom Audit

  Audit all third-party dependencies, classify them as acceptable (large, widely-audited) or suspect (small, replaceable), analyze actual usage of suspect dependencies, and replace them with owned code where feasible.

- Every small library is an attack surface. Supply chain compromises are real and common. Large, widely-audited libraries (express, react, d3, three.js, next, vue, fastify, lodash-es, etc.) are acceptable. But for smaller libraries or libraries where only one helper function is used, we should write the code ourselves.
+ Every small library is an attack surface. Supply chain compromises are real and common. In default mode, large, widely-audited libraries (express, react, d3, three.js, next, vue, fastify, lodash-es, etc.) are acceptable. But for smaller libraries or libraries where only one helper function is used, we should write the code ourselves. In heavy mode, the acceptability bar is much higher — see the Heavy Mode section below.

  **Default mode: fully autonomous.** Uses Balanced model profile, proceeds through all phases without prompting.

@@ -17,8 +17,11 @@ Parse `$ARGUMENTS` for:
  - **`--interactive`**: pause at each decision point for user approval
  - **`--scan-only`**: run Phase 0 + 1 + 2 only (audit and plan), skip remediation
  - **`--no-merge`**: run through PR creation, skip merge
+ - **`--heavy`**: aggressive mode — only keep foundational frameworks and language runtimes; replace everything else that is feasibly replaceable (see Heavy Mode below)
  - **Specific packages**: limit audit scope to named packages (e.g., "chalk dotenv")

+ Set `HEAVY_MODE` to `true` if `--heavy` was passed, `false` otherwise.
+
  ## Configuration

  ### Default Mode (autonomous)
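The flag detection this hunk adds could be sketched as follows, assuming `$ARGUMENTS` arrives as a single whitespace-separated string (the `ARGUMENTS` value and variable names here are illustrative, not part of the package):

```shell
# Sketch only: derive HEAVY_MODE from a raw arguments string.
# ARGUMENTS stands in for the real $ARGUMENTS value.
ARGUMENTS="--heavy --no-merge chalk dotenv"

HEAVY_MODE=false
case " $ARGUMENTS " in
  *" --heavy "*) HEAVY_MODE=true ;;
esac

echo "HEAVY_MODE=$HEAVY_MODE"
```

Padding both the haystack and the needle with spaces keeps `--heavy` from matching inside a longer token such as `--heavyweight`.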
@@ -48,6 +51,18 @@ Record the selection as `MODEL_PROFILE` and derive:

  When the resolved model is `opus`, **omit** the `model` parameter on the Agent call so the agent inherits the session's Opus version.

+ ## Heavy Mode (`--heavy`)
+
+ Heavy mode shifts the philosophy from "remove obvious attack surface" to "own everything we feasibly can." The only dependencies that survive are foundational frameworks, core platform tooling, and language-level runtimes — the kind maintained by large teams with dedicated security processes. Everything else is a candidate for replacement.
+
+ Key behavioral changes when `HEAVY_MODE` is `true`:
+
+ 1. **Tier 1 is narrowed** to only foundational frameworks and language runtimes (see Phase 1b overrides below). Libraries like lodash, chalk, dotenv, commander, yargs, uuid, axios, etc. are NOT Tier 1 in heavy mode — they move to Tier 2 or 3.
+ 2. **EVALUATE recommendations become REMOVE** — the bias flips from "when in doubt, keep" to "when in doubt, replace."
+ 3. **Complexity ceiling rises** — replacements up to 300 lines are acceptable (vs the default where agents bail at ~2x estimate). Only truly infeasible replacements (deep domain expertise, crypto primitives, protocol parsers) are skipped.
+ 4. **Maintenance status is irrelevant** — even well-maintained small libraries are candidates. The question is "can we own this code?" not "is this library risky?"
+ 5. **DevDependencies get equal priority** — build tools and test utilities are audited with the same aggression as production dependencies (overriding the default Phase 1a deprioritization of devDependencies).
+
  ## Compaction Guidance

  When compacting during this workflow, always preserve:
@@ -57,6 +72,7 @@ When compacting during this workflow, always preserve:
  - All PR numbers and URLs created so far
  - `BUILD_CMD`, `TEST_CMD`, `PROJECT_TYPE`, `WORKTREE_DIR`, `REPO_DIR` values
  - `VCS_HOST`, `CLI_TOOL`, `DEFAULT_BRANCH`, `CURRENT_BRANCH`
+ - `HEAVY_MODE` flag


  ## Phase 0: Discovery & Setup
@@ -126,6 +142,8 @@ For each dependency, classify it into one of three tiers:

  **Tier 1 — ACCEPTABLE (keep without question):**
  Large, widely-audited, foundational libraries. Examples by ecosystem:
+
+ **Default mode:**
  - **Node.js**: react, next, vue, express, fastify, hono, typescript, eslint, prettier, webpack, vite, jest, vitest, mocha, d3, three, prisma, drizzle, @types/*, tailwindcss, postcss
  - **Rust**: tokio, serde, clap, reqwest, hyper, tracing, sqlx, axum, actix-web
  - **Python**: django, flask, fastapi, sqlalchemy, pandas, numpy, scipy, pytest, requests, httpx, pydantic
@@ -133,8 +151,21 @@ Large, widely-audited, foundational libraries. Examples by ecosystem:
  - **Ruby**: rails, rspec, sidekiq, puma, devise
  - Any dependency with >10M weekly downloads (npm) or equivalent popularity metric for the ecosystem

+ **Heavy mode (`HEAVY_MODE=true`) — Tier 1 is restricted to foundational frameworks, core platform tooling, and runtimes:**
+ - **Node.js**: react, next, vue, express, fastify, typescript, webpack, vite, tailwindcss, postcss, prisma, drizzle
+ - **Rust**: tokio, serde, hyper, sqlx, axum, actix-web
+ - **Python**: django, flask, fastapi, sqlalchemy, pandas, numpy, scipy, pydantic
+ - **Go**: standard library only
+ - **Ruby**: rails, puma
+ - Download count is NOT a factor — popularity does not exempt a library from replacement
+ - Libraries that are wrappers, utilities, CLIs, or single-purpose tools are Tier 2 or 3 regardless of popularity
+ - Linting/formatting tools (eslint, prettier) in heavy mode: remain Tier 1 when required by CI or organization-wide standards (do not attempt replacement); otherwise treat as Tier 2 (audit usage, but do not rewrite their behavior)
+ - Examples of libraries that DROP from Tier 1 in heavy mode: lodash, chalk, commander, yargs, dotenv, uuid, axios, node-fetch, glob, minimatch, semver, debug, winston, morgan, cors, helmet, body-parser, cookie-parser, compression, color, ora, inquirer, boxen, marked, highlight.js, moment, dayjs, date-fns, underscore, ramda, rxjs (if only basic operators used), jest (if vitest is also present — deduplicate), mocha, d3 (unless the visualization requires it), three (unless 3D rendering is core), rspec, sidekiq, devise, requests, httpx, pytest, clap, reqwest, tracing
+
  **Tier 2 — SUSPECT (audit usage):**
- Smaller libraries that may be doing something we can write ourselves. Indicators:
+ Smaller libraries that may be doing something we can write ourselves.
+
+ **Default mode indicators:**
  - <1M weekly downloads (npm) or equivalent
  - Single-purpose utility (does one thing)
  - We only use 1-2 functions from it
@@ -142,8 +173,19 @@ Smaller libraries that may be doing something we can write ourselves. Indicators
  - Libraries that replicate functionality available in newer language/runtime versions
  - Abandoned or unmaintained (no commits in 12+ months, open security issues)

+ **Heavy mode additional indicators** (these move libraries INTO Tier 2 that would otherwise be Tier 1):
+ - Any library maintained by an individual or small team (not a major org/foundation)
+ - Any library where we use <50% of its API surface
+ - Utility collections where we use a handful of functions (lodash, ramda, underscore)
+ - HTTP clients when the runtime has built-in fetch (axios, node-fetch, got, superagent)
+ - Logging libraries (winston, pino, morgan, debug) — evaluate if a thin wrapper over console suffices
+ - CLI argument parsers (commander, yargs, minimist) — evaluate if process.argv parsing is feasible
+ - Test runners if multiple are present — deduplicate to one
+
  **Tier 3 — REMOVABLE (strong candidate for replacement):**
  Libraries where the cost of owning the code is clearly lower than the supply chain risk:
+
+ **Default mode:**
  - We use a single function that's <50 lines to implement
  - The library wraps a built-in API with minimal added value
  - The library is unmaintained with known vulnerabilities
@@ -154,6 +196,23 @@ Libraries where the cost of owning the code is clearly lower than the supply cha
  - `dotenv` when the runtime supports `--env-file` natively
  - `is-odd`, `is-number`, `left-pad` tier micro-packages

+ **Heavy mode — Tier 3 expands significantly:**
+ All of the above, PLUS:
+ - Any library where the replacement is <=300 lines of owned code (up from ~50 in default)
+ - Utility libraries where we use any subset of functions, even if heavily used (write an owned utils module)
+ - HTTP client wrappers — replace with native `fetch` + a thin owned wrapper
+ - Color/terminal libraries regardless of how many functions we use (chalk, colors, kleur, ansi-colors) — write an ANSI utility
+ - Argument parsers for CLIs with <20 flags — write a simple parser
+ - Environment loaders (dotenv, envalid, env-var) — use runtime flags or write a loader
+ - Date libraries if we use <10 functions (moment, dayjs, date-fns) — write owned date helpers
+ - Glob/path matching (glob, minimatch, micromatch) if usage is simple — use native `fs.glob` (Node 22+) or write a matcher
+ - String utilities (camelcase, slugify, pluralize, humanize) — write the specific transformations used
+ - Validation libraries where we use <30% of their schemas (joi, yup, zod) — write focused validators
+ - Retry/backoff libraries (p-retry, async-retry) — write a retry function
+ - Deep equality/diff (deep-equal, fast-deep-equal, deep-diff) — write what's needed for actual use cases
+ - Event emitter libraries (eventemitter3, mitt) — use native EventEmitter or EventTarget
+ - Markdown parsers if only rendering basic markdown — consider native or minimal owned parser
+
  Record the full classification as `DEPENDENCY_MAP`.

  ### 1c: Usage Analysis (Tier 2 & 3 only)
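The mode-dependent Tier 1 split introduced in the hunks above amounts to swapping one allowlist for a narrower one. A minimal sketch (package lists abridged and illustrative, not the package's actual data structures):

```shell
# Sketch only: Tier 1 membership narrows when HEAVY_MODE=true.
HEAVY_MODE=true
pkg="d3"

default_tier1="react next vue express fastify d3 three prisma"  # abridged
heavy_tier1="react next vue express fastify prisma"             # narrowed set

tier1="$default_tier1"
[ "$HEAVY_MODE" = "true" ] && tier1="$heavy_tier1"

case " $tier1 " in
  *" $pkg "*) tier="Tier 1 (keep)" ;;
  *)          tier="Tier 2/3 (audit usage)" ;;
esac
echo "$pkg -> $tier"
```

With `HEAVY_MODE=true`, d3 drops out of the allowlist and is routed to usage analysis instead of being kept without question.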
@@ -191,7 +250,7 @@ Wait for all agents to complete before proceeding.

  1. Read the existing `PLAN.md` (create if it doesn't exist)
  2. Filter to only REMOVE recommendations from Phase 1c
- 3. For EVALUATE recommendations: **Default mode** — treat as KEEP (conservative). **Interactive mode** — present to user via `AskUserQuestion` for each
+ 3. For EVALUATE recommendations: **Default mode** — treat as KEEP (conservative). **Heavy mode** — treat as REMOVE (aggressive). **Interactive mode** — present to user via `AskUserQuestion` for each. If both `--interactive` and `--heavy` are set, still prompt for each EVALUATE item (interactive takes precedence), but present REMOVE as the default suggestion
  4. Group removable dependencies by replacement strategy:
  - **Native replacement**: built-in API replaces the library (e.g., `crypto.randomUUID()`)
  - **Inline replacement**: write a small utility function (e.g., ANSI color wrapper)
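Step 3's revised handling of EVALUATE recommendations reduces to a small precedence check, interactive over heavy over default. A sketch (flag variable names are illustrative):

```shell
# Sketch only: resolve an EVALUATE recommendation by mode precedence.
INTERACTIVE=false
HEAVY_MODE=true

if [ "$INTERACTIVE" = "true" ]; then
  decision="ASK_USER"   # suggest REMOVE as the default when HEAVY_MODE=true
elif [ "$HEAVY_MODE" = "true" ]; then
  decision="REMOVE"     # heavy mode: when in doubt, replace
else
  decision="KEEP"       # default mode: when in doubt, keep
fi
echo "$decision"
```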
@@ -296,7 +355,7 @@ Steps:
  - Do NOT introduce new dependencies to replace old ones
  - Do NOT use `git add -A` or `git add .` — stage specific files only
  - Keep replacement code minimal
- - If replacement is more complex than estimated (>2x the estimated lines), report back and skip — do not force a bad replacement
+ - If replacement is more complex than estimated (>2x the estimated lines), report back and skip — do not force a bad replacement. In `HEAVY_MODE`, the ceiling is 300 lines per replacement — only skip if replacement requires deep domain expertise (crypto primitives, binary protocol parsers, codec implementations) or exceeds 300 lines
  - Place shared utility replacements in a sensible location (e.g., `src/utils/`, `lib/`, `internal/`) following existing project conventions
  - Commit each replacement independently: `refactor: replace {package} with owned {utility/code}`
  </guardrails>
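The guardrail's two ceilings, the default 2x-estimate bail-out and the heavy-mode 300-line cap, can be sketched as a single budget check (the line counts below are illustrative):

```shell
# Sketch only: decide whether a replacement is still within budget.
HEAVY_MODE=true
estimated_lines=120   # estimate recorded in the plan
actual_lines=260      # lines written so far

if [ "$HEAVY_MODE" = "true" ]; then
  ceiling=300                       # heavy mode: fixed 300-line cap
else
  ceiling=$((estimated_lines * 2))  # default: bail at 2x the estimate
fi

if [ "$actual_lines" -le "$ceiling" ]; then
  verdict="proceed"
else
  verdict="skip and report back"
fi
echo "$verdict"
```

At 260 lines this replacement would be skipped in default mode (ceiling 240) but proceeds in heavy mode.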
@@ -396,10 +455,15 @@ Create the PR:

  **GitHub:**
  ```bash
- gh pr create --head depfree/{DATE} --base {DEFAULT_BRANCH} \
- --title "refactor: remove {N} unnecessary dependencies" \
- --body "$(cat <<'EOF'
- ## Depfree Audit — Dependency Removal
+ HEAVY_SUFFIX=""
+ HEAVY_HEADING=""
+ if [ "$HEAVY_MODE" = "true" ]; then
+ HEAVY_SUFFIX=" (heavy mode)"
+ HEAVY_HEADING=" (Heavy Mode)"
+ fi
+
+ PR_TITLE="refactor: remove {N} unnecessary dependencies${HEAVY_SUFFIX}"
+ PR_BODY="## Depfree Audit — Dependency Removal${HEAVY_HEADING}

  ### Summary
  Removed {N} unnecessary third-party dependencies and replaced with owned code.
@@ -420,9 +484,11 @@ Estimated supply chain attack surface reduction: {N} packages ({transitive count
  - [ ] Build passes
  - [ ] All tests pass
  - [ ] No phantom references to removed packages
- - [ ] Lock file updated
- EOF
- )"
+ - [ ] Lock file updated"
+
+ gh pr create --head depfree/{DATE} --base {DEFAULT_BRANCH} \
+ --title "$PR_TITLE" \
+ --body "$PR_BODY"
  ```

  **GitLab:**
@@ -522,8 +588,10 @@ Transitive deps eliminated: ~{count} (estimated)

  - This command complements `/do:better` — run `depfree` for dependency hygiene, `better` for code quality
  - All remediation happens in an isolated worktree — the user's working directory is never modified
- - The threshold for "acceptable" libraries is deliberately generous — the goal is to remove obvious attack surface, not to rewrite everything
+ - **Default mode**: the threshold for "acceptable" libraries is deliberately generous — the goal is to remove obvious attack surface, not to rewrite everything
+ - **Heavy mode**: the threshold narrows to foundational frameworks only — the goal is to own as much code as feasibly possible, eliminating supply chain risk from individual maintainers and small projects
  - Replacement code should be minimal and focused — don't over-engineer utilities that replace single-purpose packages
- - When in doubt, keep the dependency. A maintained library is better than a buggy reimplementation
- - devDependencies are lower priority since they don't ship to production, but unmaintained build tools still pose supply chain risk
+ - **Default mode**: when in doubt, keep the dependency. A maintained library is better than a buggy reimplementation
+ - **Heavy mode**: when in doubt, replace it. Write owned code unless the replacement requires crypto primitives, binary protocol parsing, or deep domain expertise that would be unsafe to reimplement
+ - **Default mode**: devDependencies are lower priority since they don't ship to production. **Heavy mode**: devDependencies are audited on par with production deps — unmaintained build tools still pose supply chain risk
  - For monorepos, audit the root manifest and each workspace package manifest
@@ -14,7 +14,7 @@ List all available `/do:*` commands with their descriptions.
  |---|---|
  | `/do:better` | Unified DevSecOps audit, remediation, per-category PRs, CI verification, and Copilot review loop |
  | `/do:better-swift` | SwiftUI-optimized DevSecOps audit with multi-platform coverage (iOS, macOS, watchOS, tvOS, visionOS) |
- | `/do:depfree` | Audit third-party dependencies and remove unnecessary ones by writing replacement code |
+ | `/do:depfree` | Audit third-party dependencies and remove unnecessary ones by writing replacement code. Use `--heavy` for aggressive mode that targets all non-foundational libraries for replacement where feasible |
  | `/do:fpr` | Commit, push to fork, and open a PR against the upstream repo |
  | `/do:goals` | Scan codebase to infer project goals, clarify with user, and generate GOALS.md |
  | `/do:help` | List all available slashdo commands |
@@ -15,7 +15,16 @@ Address the latest code review feedback on the current branch's pull request usi

  1. **Get the current PR and determine repo ownership**: Use `gh pr view --json number,url,reviewDecision,reviews,headRefName,baseRefName` to find the PR for this branch. Parse owner/name from `gh repo view --json owner,name`. Also check the PR's base repository owner — if the PR targets an upstream repo you don't own (i.e., a fork-to-upstream PR), note this as `is_fork_pr=true`. You can detect this by comparing the PR URL's owner against your authenticated user (`gh api user --jq .login`).

- 2. **Request Copilot code review** (only if `is_fork_pr=false`): Follow the "Requesting GitHub Copilot Code Review" section below to request a review, then poll until the review is complete before proceeding. **Skip this step entirely for fork-to-upstream PRs** — you don't have permission to request reviewers on repos you don't own.
+ 2. **Check for existing code review** (only if `is_fork_pr=false`): Before requesting a new review, check if there's already a completed Copilot review or a pending Copilot review in progress. Query the PR's review requests and recent reviews:
+ ```bash
+ gh api graphql -f query='{ repository(owner: "OWNER", name: "REPO") { pullRequest(number: PR_NUM) { reviewRequests(first: 10) { nodes { requestedReviewer { ... on Bot { login } } } } reviews(last: 20) { nodes { state body author { login } submittedAt } } } } }'
+ ```
+ - **If at least one completed Copilot review exists** (a review in `reviews.nodes` authored by `copilot-pull-request-reviewer`): Skip requesting a new review — proceed directly to step 3 to fetch and address the existing feedback threads.
+ - **If a Copilot review is currently pending** (Copilot appears in `reviewRequests.nodes[].requestedReviewer` as `copilot-pull-request-reviewer`): Treat the review as in progress. Poll for completion using the "Poll for review completion" section below, and consider it complete once a new Copilot review appears in `reviews.nodes` with a `submittedAt` timestamp later than the latest Copilot review timestamp you observed before starting to poll. Then proceed to step 3.
+ - **If no Copilot review exists and no Copilot review is currently requested**: Request a new Copilot review per the "Requesting GitHub Copilot Code Review" section below, poll until complete, then proceed.
+ - **Skip this step entirely for fork-to-upstream PRs** — you don't have permission to request reviewers on repos you don't own.
+
+ **While waiting for review**: During any review polling wait, check CI status in parallel (see "CI Health Check During Review Polling" section below). If CI is failing, fix the failures, commit, and push before the review completes. This avoids wasting a review cycle on code that won't pass CI anyway.

  3. **Fetch review comments**: Use `gh api graphql` with stdin JSON to get all unresolved review threads. **CRITICAL: Do NOT use `$variables` in GraphQL queries — shell expansion consumes `$` signs.** Always inline values and pipe JSON via stdin:
  ```bash
@@ -47,16 +56,18 @@ Address the latest code review feedback on the current branch's pull request usi
  - Stage all changed files and commit with a descriptive message summarizing what was addressed. Do not include co-author info.
  - Push to the branch.

- 8. **Resolve conversations**: For each addressed thread, resolve it via GraphQL mutation using stdin JSON. Track resolution count against the total from step 3. **Never use `$variables` in the query — inline the thread ID directly**:
+ 7. **Resolve conversations**: For each addressed thread, resolve it via GraphQL mutation using stdin JSON. Track resolution count against the total from step 3. **Never use `$variables` in the query — inline the thread ID directly**:
  ```bash
  echo '{"query":"mutation { resolveReviewThread(input: {threadId: \"THREAD_ID_HERE\"}) { thread { id isResolved } } }"}' | gh api graphql --input -
  ```

- 9. **Request another Copilot review** (only if `is_fork_pr=false`): After pushing fixes, request a fresh Copilot code review and repeat from step 3 until the review passes clean. **Skip for fork-to-upstream PRs.**
+ 8. **Request another Copilot review** (only if `is_fork_pr=false`): After pushing fixes, request a fresh Copilot code review and repeat from step 3 until the review passes clean. **Skip for fork-to-upstream PRs.**
+
+ **While waiting for review**: Check CI status in parallel during polling (see "CI Health Check During Review Polling" section below). Fix any CI failures before the review completes.

  **Repeated-comment dedup**: When fetching threads after a new Copilot review round, compare each new unresolved thread's comment body and file/line against threads from the previous round that were intentionally left unresolved (replied to as non-issues or disagreements). If all new unresolved threads are repeats of previously-dismissed feedback, treat the review as clean (no new actionable comments) and exit the loop.

- 10. **Report summary**: Print a table of all threads addressed with file, line, and a brief description of the fix. Include a final count line: "Resolved X/Y threads." If any threads remain unresolved, list them with reasons (unclear feedback, disagreement, requires user input).
+ 9. **Report summary**: Print a table of all threads addressed with file, line, and a brief description of the fix. Include a final count line: "Resolved X/Y threads." If any threads remain unresolved, list them with reasons (unclear feedback, disagreement, requires user input).

  !`cat ~/.claude/lib/graphql-escaping.md`

@@ -86,6 +97,27 @@ The review is complete when a new `copilot-pull-request-reviewer` review node ap

  **Error detection**: After a review appears, check its `body` for error text such as "Copilot encountered an error" or "unable to review this pull request". If found, this is NOT a successful review — log a warning, re-request the review (same API call above), and resume polling. Allow up to 3 error retries. After 3 failures: **Default mode**: auto-skip and continue. **Interactive mode (`--interactive`)**: ask the user whether to continue or skip.

+ ## CI Health Check During Review Polling
+
+ While polling for a Copilot review to complete, use the wait time productively by checking CI status:
+
+ 1. **Check for running/completed checks** on the current commit:
+ ```bash
+ gh pr checks --json name,state,conclusion,detailsUrl
+ ```
+ 2. **If any check has failed**: Extract the run ID from the failed check's `detailsUrl` and fetch logs:
+ ```bash
+ RUN_ID="$(gh pr checks --json name,conclusion,detailsUrl \
+ --jq '.[] | select(.conclusion=="FAILURE") | .detailsUrl | capture("/runs/(?<id>[0-9]+)") | .id' \
+ | head -n1)"
+ gh run view "$RUN_ID" --log-failed
+ ```
+ Fix the failure, run tests locally to confirm, commit, and push. The Copilot review request will automatically apply to the new commit on most repos — if not, re-request after the push.
+ 3. **If checks are still pending**: No action needed — continue polling for the review. Check CI status again on subsequent poll iterations.
+ 4. **If all checks pass**: No action needed — continue polling for the review.
+
+ This ensures CI failures are caught and fixed early rather than discovered after a full review cycle.
+
  ## Notes

  - Only resolve threads where you've actually addressed the feedback
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "slash-do",
- "version": "2.4.0",
+ "version": "2.6.0",
  "description": "Curated slash commands for AI coding assistants — Claude Code, OpenCode, Gemini CLI, and Codex",
  "author": "Adam Eivy <adam@eivy.com>",
  "license": "MIT",