npm - agentscamp - Versions diffs - 0.3.0 → 0.4.0 - Mend

agentscamp 0.3.0 → 0.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (18) hide show

package/README.md +3 -3
package/content/commands/add-caching.md +79 -0
package/content/commands/audit-accessibility.md +101 -0
package/content/commands/clean-branches.md +113 -0
package/content/commands/review-tests.md +98 -0
package/content/commands/scaffold-github-action.md +94 -0
package/content/commands/setup-precommit-hooks.md +72 -0
package/content/commands/write-design-doc.md +78 -0
package/content/manifest.json +214 -3
package/content/skills/connection-pool-tuner.md +46 -0
package/content/skills/dependency-upgrade-planner.md +42 -0
package/content/skills/memory-leak-hunter.md +35 -0
package/content/skills/pagination-designer.md +51 -0
package/content/skills/property-test-designer.md +63 -0
package/content/skills/security-headers-hardener.md +79 -0
package/content/skills/slo-definer.md +38 -0
package/content/skills/structured-logging-designer.md +42 -0
package/package.json +1 -1

package/README.md CHANGED Viewed

@@ -1,6 +1,6 @@
 # agentscamp
-> 153 ready-to-use Claude Code agents, skills, and slash commands — installable in one command.
+> 168 ready-to-use Claude Code agents, skills, and slash commands — installable in one command.
 [AgentsCamp](https://agentscamp.com) is a curated, format-validated directory of AI coding artifacts. This CLI bundles the full catalog and installs items straight into your `.claude/` directory.
@@ -43,8 +43,8 @@ These are Claude Code's standard locations — agents get delegated to automatic
 ## What's inside
 - **58 agents** — specialized subagents for development, data/AI, infra, security, and more → [browse agents](https://agentscamp.com/agents)
-- **52 skills** — on-demand capabilities for testing, databases, refactoring, releases → [browse skills](https://agentscamp.com/skills)
-- **43 commands** — reusable slash commands for planning, review, git, scaffolding → [browse commands](https://agentscamp.com/commands)
+- **60 skills** — on-demand capabilities for testing, databases, refactoring, releases → [browse skills](https://agentscamp.com/skills)
+- **50 commands** — reusable slash commands for planning, review, git, scaffolding → [browse commands](https://agentscamp.com/commands)
 Every item has a full page with docs, examples, and related picks at [agentscamp.com](https://agentscamp.com).

package/content/commands/add-caching.md ADDED Viewed

@@ -0,0 +1,79 @@
+---
+description: "Add a caching layer to one expensive function or endpoint correctly — confirm it's cacheable, design the cache key/TTL/layer/invalidation, handle stampedes, wrap the call in one place, and report the design."
+argument-hint: "<function or endpoint to cache>"
+allowed-tools: "Read, Grep, Glob, Edit"
+---
+## Scope
+Treat `$ARGUMENTS` as the single function or endpoint to add caching to — name it precisely (`getUserDashboard`, `GET /api/products/:id`, `computeRecommendations`). Restate the target in one sentence before touching anything.
+If `$ARGUMENTS` is empty, ask one question: *which function or endpoint is slow, and roughly how slow?* Do not guess and cache the wrong layer.
+> [!WARNING]
+> Caching is the second-best fix. Before adding a cache, check whether the cost is a missing index, an N+1, or an over-fetch — those should be fixed at the source, not papered over. Cache only after the work is genuinely expensive *and* repeated.
+## Step 1 — Confirm it is actually cacheable
+Read the target with `Read`/`Grep` and answer three questions before designing anything. If any answer is "no", stop and tell the user instead of caching:
+- **Deterministic enough?** Same inputs → same (or acceptably-close) output. A function that returns `now()`, a random sample, or live external state is not cacheable as-is.
+- **Read-heavy?** It's called far more than the underlying data changes. Caching a value that's read once per write saves nothing.
+- **Staleness-tolerant?** The caller can accept data that's a few seconds/minutes old. Balances, inventory counts, permissions, and auth checks usually cannot — say so and stop.
+## Step 2 — Locate and size the cost
+Find *what* is expensive inside the target so you cache the right boundary: a DB round-trip, an external API call, a heavy CPU computation, or fan-out. Grep the body for the query/fetch/compute that dominates. State the cost honestly ("one external API call, ~300ms, called per page load") so the TTL and layer choices below are grounded, not arbitrary.
+## Step 3 — Design the cache key
+This is the step that breaks correctness if done wrong. The key must include **every input that changes the result**:
+- the function's own arguments (normalized — sort/canonicalize so `{a,b}` and `{b,a}` collide intentionally, not accidentally);
+- the **identity scope**: user ID, tenant/org ID, or whatever isolation boundary the data belongs to;
+- request-shaping context that changes output: locale/language, feature flags, role/permission tier, currency;
+- a **version token** for the schema or serialization, so a deploy that changes the output shape doesn't serve old-shaped values.
+> [!WARNING]
+> An incomplete cache key is a cross-user data leak, not a perf nuisance. Omit the user/tenant from a per-user result and you will serve one account another account's data. When in doubt, over-scope the key — a too-specific key just lowers the hit rate; a too-broad key leaks.
+## Step 4 — Choose TTL and layer
+**TTL** = how stale the data is allowed to be, not a round number. Tie it to the write cadence: if the source changes every few minutes and 60s of staleness is fine, TTL is ~60s. A short TTL with no invalidation is often the simplest correct design.
+**Layer** — pick deliberately:
+- **In-process (LRU/`Map`):** fastest, zero infra, but **per-node** — caches diverge across a multi-instance fleet, and one node can serve stale data while another is fresh. Fine for single-instance, immutable, or short-TTL data.
+- **Shared (Redis/Memcached):** consistent across the fleet and survives restarts, at the cost of a network hop and a dependency. Use it when correctness across instances matters or the cache must be invalidated fleet-wide.
+> [!NOTE]
+> Don't reflexively reach for Redis. If the service runs as one process, or the data is effectively immutable for the TTL window, an in-process cache is simpler and faster. Reach for shared cache the moment you need explicit invalidation or cross-instance consistency.
+## Step 5 — Decide invalidation
+State exactly how a cached value stops being served:
+- **TTL expiry only** — simplest; acceptable when bounded staleness is fine. No write-path coupling.
+- **Explicit bust on write** — when a write must be visible immediately, delete/overwrite the key in the same code path that mutates the underlying data. The bust must reconstruct the *exact same key* from Step 3, or it deletes nothing. Co-locate the bust with the write so they can't drift apart.
+If the data is mutable and the user can't tolerate staleness, you need explicit invalidation — TTL alone will serve stale results until it expires.
+## Step 6 — Guard against the stampede
+When a hot key expires, many concurrent callers miss at once and all recompute the expensive work simultaneously (thundering herd) — the cache that was protecting the backend now amplifies load. Add one defense:
+- **Single-flight / request coalescing:** the first miss computes; concurrent callers for the same key await that one in-flight computation instead of launching their own.
+- **Jittered TTL:** add a small random spread to each TTL so keys populated together don't all expire on the same tick.
+Pick the one that fits the layer (single-flight for in-process is trivial; jitter is the cheap shared-cache option).
+## Step 7 — Implement at the boundary, not in the callers
+Wrap the expensive call **in one place** — a decorator, a cache-aside helper, or a thin wrapper around the function — so every caller benefits and the key/TTL/invalidation logic lives in exactly one spot. Use `Edit` to add the wrapper around the existing call site; do not sprinkle `cache.get`/`cache.set` through every caller (that's where keys drift and busts get forgotten). Keep the cache check, compute-on-miss, and store in the same function the call already flows through.
+> [!NOTE]
+> Cache-aside is the default shape: on call, look up the key; on hit return it; on miss compute, store with the TTL, return. Failures to reach the cache (e.g. Redis down) must fall through to computing the real value, never error the request.
+## Report
+Deliver, as your message: the **cache design** as a compact spec — **key** (every input included), **TTL** (with the staleness it implies), **layer** (in-process vs shared, and why), **invalidation** (TTL-only or explicit bust + where), and **stampede guard**. Then summarize the **change you made** (which boundary you wrapped, file:line). Close with the one verification step the user should run — confirm the hit rate and that a write is reflected within the expected window.

package/content/commands/audit-accessibility.md ADDED Viewed

@@ -0,0 +1,101 @@
+---
+description: "Audit a component or page for accessibility against WCAG — semantics, names, keyboard, ARIA, contrast, forms, motion."
+argument-hint: "<file, component, or page to audit>"
+allowed-tools: "Read, Grep, Glob"
+---
+Audit `$ARGUMENTS` for accessibility. Read the markup, reason about how a keyboard and screen-reader user would actually experience it, and report concrete WCAG-grounded problems with fixes. Do not modify any files — the findings are the whole deliverable.
+## Scope
+`$ARGUMENTS` is the thing to audit — a component file (`components/Modal.tsx`), a page/route, or a directory of views. Audit the rendered markup and the props that shape it, not the styling system in the abstract.
+If `$ARGUMENTS` is empty, do not guess. Ask one focused question: *"Which file, component, or page should I audit for accessibility?"*
+> [!WARNING]
+> Read-only mode. Use only Read, Grep, and Glob. Do not edit files or "fix" anything inline — propose fixes in the report.
+> [!CAUTION]
+> Automated tools (axe, Lighthouse) catch roughly a third of WCAG issues — mostly contrast and missing-attribute checks. Whether a control is keyboard-operable, whether its accessible *name* matches its visible label, and whether ARIA actually describes the behavior require the manual reasoning this command exists to do. Do not report "axe found nothing" as a pass.
+## Step 1 — Read the target and map the interactive surface
+Open `$ARGUMENTS` and list every interactive element and every image/icon. These are where accessibility breaks.
+```bash
+# Find controls faking buttons, and clickable non-buttons
+rg -n "onClick|onKeyDown|role=|tabIndex|<div|<span|<a " $ARGUMENTS
+```
+- For each control, note: what native element is it, what does it *do*, and what name would a screen reader announce.
+- For each `<img>`/SVG/icon, note whether it is meaningful (needs a name) or decorative (needs `alt=""`/`aria-hidden`).
+## Step 2 — Semantic HTML before anything else
+The single highest-leverage check. A real native element gives you role, focus, keyboard handling, and state for free.
+- **div-soup buttons** — `<div onClick>` / `<span onClick>` acting as a button. It is not focusable, not Enter/Space-operable, and has no role. Fix: use `<button type="button">`, not `<div role="button" tabIndex={0} onKeyDown>`.
+- **Heading order** — headings must descend without skipping (`h1 → h2 → h3`), and there is exactly one `h1` per page. A skipped level (`h1` then `h3`) breaks screen-reader navigation. Styling ≠ level — use CSS for size.
+- **Landmarks** — real `<nav>`, `<main>`, `<header>`, `<footer>` so users can jump by region. A page that is all `<div>` has no landmarks.
+- **Lists / tables** — repeated items should be `<ul>`/`<ol>`; tabular data should be a `<table>` with `<th scope>`, not a CSS grid of divs.
+> [!NOTE]
+> WCAG 1.3.1 (Info and Relationships) and 4.1.2 (Name, Role, Value) are violated by div-soup more than by anything else. Reach for a native element first; only add ARIA when no native element expresses the pattern.
+## Step 3 — Accessible names
+Every interactive element and meaningful image needs a name a screen reader can announce.
+- **Icon-only buttons** — a button whose only child is an SVG announces as "button", unlabeled. Fix: `aria-label="Close"` (or visually-hidden text). Confirm the label matches the visible/intended purpose.
+- **Images** — meaningful `<img>` needs descriptive `alt`; decorative ones need `alt=""` so they are skipped. `alt="image"` or a filename is a failure (1.1.1).
+- **Links** — "click here" / "read more" out of context fails 2.4.4. The link text should name the destination.
+- **Visible label vs accessible name** — if a control shows "Submit" but has `aria-label="Send form"`, voice-control users saying "click Submit" can't activate it (2.5.3). The accessible name must contain the visible text.
+## Step 4 — Keyboard operability
+Everything a mouse can do, a keyboard must do (2.1.1), and the path must be visible and escapable.
+- **Focusable** — every interactive element reachable by Tab. Custom controls built on `<div>` are not (see Step 2).
+- **Visible focus** — there is a focus indicator; flag `outline: none` / `:focus { outline: 0 }` without a replacement (2.4.7).
+- **Logical tab order** — DOM order matches reading order; flag positive `tabIndex` values (`tabIndex={1+}`), which hijack order and almost always cause bugs. `tabIndex={0}`/`{-1}` are fine.
+- **No keyboard trap** — modals/menus must be escapable (Esc) and must not trap Tab outside themselves (2.1.2). A modal should move focus in on open, trap *within* while open, and restore focus to the trigger on close.
+## Step 5 — ARIA correctness (and restraint)
+ARIA only changes how assistive tech perceives an element — it adds no behavior. Wrong ARIA is worse than none.
+- **Redundant ARIA on native elements** — `<button role="button">`, `<nav role="navigation">`, `<a href role="link">` are noise; `<ul role="list">` can even *strip* list semantics in some browsers. Remove it.
+- **State must track behavior** — a toggle needs `aria-expanded` that flips with the panel; a tab needs `aria-selected`; a checkbox-div needs `aria-checked` that updates. Static or stale state lies to the user (4.1.2).
+- **Referenced IDs must exist** — `aria-labelledby` / `aria-describedby` / `aria-controls` pointing at an absent or duplicated `id` resolves to nothing.
+- **`aria-hidden` on focusable content** — hiding an element that still contains a tabbable control creates a "phantom" focus stop announced as nothing.
+> [!WARNING]
+> The first rule of ARIA is don't use ARIA. If you find `role=`/`aria-*` bolted onto an element that has a native equivalent, the fix is almost always to delete the ARIA and switch to the native element, not to "correct" the attributes.
+## Step 6 — Contrast, forms, and motion
+- **Contrast (likely, not measured)** — you cannot compute exact ratios from source, so flag *risk*: light-grey text on white, text over images/gradients with no scrim, placeholder text used as a label, disabled states. Recommend ≥ 4.5:1 for body text, ≥ 3:1 for large text and UI/focus indicators (1.4.3, 1.4.11), and confirm with a contrast checker.
+- **Form labels** — every input needs a programmatic label: `<label htmlFor>` matching the input `id`, or wrapping `<label>`. A placeholder is not a label (it vanishes on input, 1.3.1/3.3.2).
+- **Error association** — validation errors must be tied to the field via `aria-describedby` and signalled with `aria-invalid`, not by color alone (1.4.1/3.3.1).
+- **Motion / autoplay** — auto-playing carousels, looping video, or large parallax/animation must be pausable and should respect `prefers-reduced-motion` (2.2.2, 2.3.3).
+## Report
+Deliver findings as your message, grouped by severity. For each finding give four things: the **WCAG-grounded problem**, the **location** (`file:line` you opened), the **user impact** (who is blocked and how), and the **concrete fix** (prefer a native element over ARIA).
+```markdown
+## Critical (blocks a user from completing a task)
+- [keyboard] `components/Menu.tsx:42` — `<div onClick>` dropdown trigger isn't focusable or Enter/Space-operable.
+  Impact: keyboard-only users cannot open the menu at all.
+  Fix: `<button type="button" aria-expanded={open} aria-controls="menu-list">` — drop the div + manual onKeyDown.
+## Serious (degraded but workable)
+- [name] `components/Header.tsx:18` — icon-only close button has no accessible name.
+  Impact: screen reader announces "button", purpose unknown.
+  Fix: add `aria-label="Close"`.
+## Moderate / Advisory
+- [contrast risk] `components/Card.tsx:60` — `text-gray-400` on white may fall below 4.5:1; verify with a checker.
+```
+Tag each finding (`[semantics]`, `[name]`, `[keyboard]`, `[aria]`, `[contrast]`, `[form]`, `[motion]`) and cite the exact line. End with the single highest-impact fix to make first — or, if the target is clean, say so and name the strongest pattern you saw (e.g. native button + visible focus + labeled inputs).

package/content/commands/clean-branches.md ADDED Viewed

@@ -0,0 +1,113 @@
+---
+description: "Safely prune merged and stale Git branches: drop dead remote-tracking refs, list merged candidates for review, then delete with the safe -d variant."
+allowed-tools: "Bash, Read"
+---
+This command takes no arguments. It prunes branches that are demonstrably safe to remove and hands everything else back for a human decision. The default posture is to delete nothing you cannot prove is merged.
+## Scope
+Ignore `$ARGUMENTS` — this command takes no input. Operate on the current repository only.
+> [!WARNING]
+> Deleting a branch can destroy unmerged commits. Only `git branch -d` (lowercase) is allowed here; it refuses to delete a branch with commits not reachable from its upstream or HEAD. Never run `git branch -D` (force) in this command. If `-d` refuses a branch, that refusal is correct — surface it, do not override it.
+## Step 1 — Establish where you are and what is protected
+You must know the current branch and the main branch before deciding anything.
+```bash
+# Current branch — NEVER a deletion candidate
+git rev-parse --abbrev-ref HEAD
+# The repo's default/main branch (used as the merge target)
+git remote show origin 2>/dev/null | sed -n 's/.*HEAD branch: //p'
+# Fall back to whichever exists locally if there is no remote
+git branch --list main master develop
+```
+Resolve the main branch in this priority order: the remote's `HEAD branch` → `main` → `master`. Build the protected set as: the **current** branch, `main`, `master`, `develop`, plus any release/long-lived branches you can see (`release/*`, `hotfix/*`, anything the user names in `CLAUDE.md` or branch protection). A branch in the protected set is never deleted, even if merged.
+## Step 2 — Prune dead remote-tracking refs
+Drop the local `origin/*` refs whose upstream branch was deleted on the remote. This touches **only** remote-tracking refs, never your local branches or anything on the server.
+```bash
+git fetch --prune
+```
+Report which `origin/*` refs were pruned (the command prints `[deleted]` lines). This is the safest step and never destroys local work.
+> [!NOTE]
+> `--prune` only removes refs that point at the configured remote. It does not delete any local branch, and it does not push deletions to the remote. If a teammate re-pushes a branch, the ref simply comes back on the next fetch.
+## Step 3 — Identify merged candidates (safe to delete)
+List local branches whose tip is already reachable from the resolved main branch — these contain no unique commits relative to main.
+```bash
+# Replace <main> with the branch resolved in Step 1
+git branch --merged <main> --format='%(refname:short)'
+```
+From that output, build the **candidate list** by removing every protected branch from Step 1 (current, main/master/develop, release branches). For each remaining candidate, show what removing it discards so the user can sanity-check before anything is deleted:
+```bash
+# For each candidate <b>: confirm it has no commits ahead of <main> (should print nothing)
+git log --oneline <main>..<b>
+# Last commit on the branch, for context
+git log -1 --format='%h %ci %s' <b>
+```
+Present the candidate list as a table: branch name, last commit date, last commit subject. Do not delete yet.
+> [!WARNING]
+> "Merged" is measured against the branch you check, and only via the default fast-forward reachability test. A branch that was **squash-merged** or **rebase-merged** (e.g. via a squashing PR merge) will NOT appear in `git branch --merged` even though its work shipped — its commits were rewritten, so reachability cannot see them. If a branch you know was squash-merged is missing from the candidate list, that is expected, not a bug: confirm its work landed on `<main>` by content (diff or PR), then treat it as the user's manual call in Step 5 — never auto-delete it just because you believe it merged.
+## Step 4 — Surface unmerged branches (never auto-delete)
+List local branches that are NOT merged into main. These may hold real, un-shipped work.
+```bash
+git branch --no-merged <main> --format='%(refname:short)'
+```
+For each, show how far ahead it is so the user can judge whether it is abandoned or live:
+```bash
+# Commits on <b> not yet in <main>
+git log --oneline <main>..<b>
+```
+Report these separately as **"left for manual review."** Do not delete any of them, do not suggest `-D` to clear them, and flag any whose last commit is recent or whose author is not the current user — those are most likely someone else's active work.
+> [!WARNING]
+> Never delete a branch someone else may still be using, even if it looks merged locally. A remote-tracking branch can lag; another contributor may have unpushed commits on a branch of the same name. When in doubt, leave it for review.
+## Step 5 — Delete merged candidates with the safe variant
+Only now, and only for the Step 3 candidate list, delete using the safe lowercase `-d`:
+```bash
+# Run per branch from the candidate list — <main> already excluded
+git branch -d <candidate>
+```
+If `-d` refuses a branch ("not fully merged"), stop on that branch: it has commits not reachable from main or its upstream. Do not escalate to `-D`. Move it into the manual-review bucket from Step 4 and explain why it was refused.
+> [!NOTE]
+> A `-d` deletion is recoverable for a while: the commit stays in the reflog (`git reflog`) and is reachable by hash until garbage collection runs. A `-D` force-delete of unmerged work has no such safety net once the reflog entry expires — another reason this command refuses it.
+## Report
+Deliver a summary as your message:
+- The main branch you resolved and the full protected set you excluded.
+- Remote-tracking refs pruned in Step 2.
+- Each merged branch deleted in Step 5 (name + last commit).
+- Each unmerged branch left for manual review, with how many commits it is ahead and whether it looks like someone else's active work.
+- Any branch `-d` refused, and why.
+End with the single recommended next action — typically: review the unmerged list and decide explicitly which, if any, to drop.

package/content/commands/review-tests.md ADDED Viewed

@@ -0,0 +1,98 @@
+---
+description: "Review the quality of a test suite, not just whether it passes — find weak assertions, missing edge cases, and tests coupled to implementation."
+argument-hint: "<test file or area to review>"
+allowed-tools: "Read, Grep, Glob"
+---
+Review the **quality** of the tests in `$ARGUMENTS`, not whether they pass. A green suite tells you the tests ran; it does not tell you they verify the right behavior, cover the failure paths, or survive a refactor. Your job is to find the gap between "passes" and "actually protects the contract," then report it. Follow the steps in order — the judgment is in Steps 3–4.
+## Scope
+`$ARGUMENTS` names what to review: a test file, a directory of tests, or an area ("the auth tests", "the cart reducer specs"). Use it to bound which test files you read and which production code you trace them against.
+If `$ARGUMENTS` is empty, ask one focused question: *which test file or area should I review?* Do not review the whole suite by default — a vague review produces vague findings.
+> [!WARNING]
+> Read-only. Use only Read, Grep, and Glob. Do not edit tests, run the suite, or change coverage config. You are diagnosing quality and reporting it, not fixing it.
+## Step 1 — Read the tests and the code under test
+Find the test files in scope, then read each one **alongside the production code it exercises**. You cannot judge whether an assertion is right without knowing the contract it's supposed to enforce.
+```bash
+# Find the tests in scope
+# (Glob) **/*.{test,spec}.{ts,tsx,js,jsx}   or   **/test_*.py   or   **/*_test.go
+```
+For each test, identify: what unit/behavior it claims to cover, what it asserts, what it mocks or stubs, and what setup/teardown it relies on. Then open the corresponding source so you know the real branches, error paths, and edge inputs that *should* be tested.
+> [!NOTE]
+> Map tests to behaviors, not to files. A `cart.test.ts` with twelve tests can still leave the "apply expired coupon" branch completely unverified. Coverage of files is not coverage of behavior.
+## Step 2 — Inventory what's actually asserted
+Before judging, list the concrete claims. For each test, write down the assertion(s) and the branch of production code they pin down. This surfaces two failure modes immediately: tests that assert almost nothing, and large swaths of source with no test pointing at them.
+Use Grep to find the tells fast across the scope:
+```bash
+# Weak / non-assertions: a test that only checks it didn't throw
+grep -rnE 'expect\([^)]*\)\.(toBeDefined|toBeTruthy|not\.toThrow)\(\)|assert\s+\w+\s*$' <scope>
+# Snapshot tests (often assert "it looks like last time", not correct behavior)
+grep -rnE 'toMatchSnapshot|toMatchInlineSnapshot' <scope>
+# Mock-heavy tests (count mocks per file — high counts hint the test verifies wiring, not behavior)
+grep -rcE 'jest\.mock|vi\.mock|mock\.|MagicMock|patch\(|nock\(|when\(' <scope>
+# Determinism hazards inside tests
+grep -rnE 'Date\.now|new Date\(\)|Math\.random|setTimeout|sleep\(|fetch\(|axios|http' <scope>
+```
+## Step 3 — Judge each weakness category
+This is the core of the review. For every test file, decide which of these it suffers from, and back the call with the specific line. State the category explicitly per finding.
+- **Change-detector tests (coupled to implementation).** The test asserts on private internals, call order of mocked collaborators, exact prop trees, or full snapshots — so any refactor that preserves behavior turns it red. *Tell:* the test would break if you renamed a private method or reordered two internal calls without changing output. These punish refactoring and train the team to regenerate snapshots blindly.
+- **Happy-path only.** The test covers the success case and skips the failure paths the code clearly handles — invalid input, empty/null, boundary values, the `catch` block, the early-return guard, concurrent access. *Tell:* the production function has 4 branches; the tests exercise 1.
+- **Weak assertions.** The test runs the code but asserts something trivially true: `toBeDefined`, `toBeTruthy`, "didn't throw", `status === 200` without checking the body, or asserting a mock was called without checking the *arguments* or the *effect*. *Tell:* you could break the real logic and the test stays green.
+- **Non-deterministic / non-isolated.** The test depends on real wall-clock time, unseeded randomness, network or a live DB, the filesystem, or on another test having run first (shared module state, ordering). *Tell:* it would flake under shuffle, in a different timezone, or offline. (Hand these to `/flaky-test-hunt`.)
+- **Over-mocking.** So much is mocked that the test exercises only the mocks — e.g. mocking the function under test, or stubbing every collaborator so the only thing verified is that you wired the stubs together. *Tell:* the assertions check mock call counts, and the real code could be deleted without failing the test.
+- **Coverage theatre.** Lines are executed (driving the coverage number up) but no meaningful assertion checks the result, or branch/edge coverage is missing under a high line-coverage headline. *Tell:* a test that calls a function inside a loop "for coverage" with no assertion on the outputs.
+> [!WARNING]
+> High line-coverage with weak assertions is **false confidence** — it reports that lines ran, not that behavior is correct. Call this out explicitly when you see it. A suite at 90% line coverage whose assertions are mostly `toBeTruthy` protects almost nothing.
+> [!NOTE]
+> Distinguish *coupled to implementation* from *testing behavior*. A good test pins the observable contract (inputs → outputs/side effects) and survives any internal rewrite that keeps that contract. If renaming a private helper would break the test, it's testing the implementation, not the behavior.
+## Step 4 — Find the highest-value missing tests
+The most valuable output is often a test that doesn't exist. From the production code you read in Step 1, list the behaviors and branches with **no** meaningful assertion, then rank them by blast radius: error/failure paths, security-relevant branches, boundary values, and concurrency before cosmetic gaps. Name the specific input and the expected result for each, so the suggestion is directly actionable.
+## Report
+Deliver as your message, ordered by severity. For every finding cite `file:line`, name the category, explain *why* it's weak, and give the concrete fix:
+```markdown
+## Test review — <scope>
+**Overall:** <2-3 sentences: does this suite verify behavior or just execute code? Biggest risk?>
+### High — weak or misleading tests
+- `cart.test.ts:42` — [Weak assertion] only asserts `toBeDefined()`; the discount math is never checked. Assert the exact total for a known cart. Breaking the formula currently keeps this green.
+- `auth.test.ts:88` — [Over-mocking] mocks `verifyToken`, the function under test; the test proves only that the mock returns its stub. Test the real verifier against a known-good and a tampered token.
+### Medium — coupled / non-deterministic
+- `render.test.tsx:15` — [Change-detector] full `toMatchSnapshot`; any markup refactor breaks it. Assert the visible text/role instead.
+- `expiry.test.ts:30` — [Non-deterministic] asserts against `Date.now()`; flakes near boundaries. Inject a fixed clock.
+### Missing tests (highest value first)
+- `refund()` error path — refund exceeding the original charge is never tested. Expect it to reject with `AmountExceedsCharge`.
+- `parseRange()` boundary — empty and single-element inputs untested.
+### Coverage note
+<If line coverage looks high but branches/assertions are thin, say so plainly and name the false-confidence risk.>
+```
+End with the single highest-value change: the one missing test or one weak assertion that, fixed first, removes the most risk.

package/content/commands/scaffold-github-action.md ADDED Viewed

@@ -0,0 +1,94 @@
+---
+description: "Scaffold a hardened GitHub Actions workflow for a stated goal, wired to the project's real test/lint/build commands."
+argument-hint: "<what the workflow should do — e.g. CI test on PR, lint, release/publish, nightly cron>"
+allowed-tools: "Read, Write, Glob, Grep"
+---
+Scaffold a GitHub Actions workflow for this repository. Treat `$ARGUMENTS` as the goal of the workflow — what it should do and when it should run (e.g. `CI test on PR`, `lint + typecheck`, `publish to npm on tag`, `nightly dependency audit`). If `$ARGUMENTS` is empty, ask exactly one question: *"What should this workflow do, and on what event should it run (PR, push to main, tag, schedule)?"* — then proceed.
+## Scope
+Produce one file: `.github/workflows/<name>.yml`, where `<name>` is a short kebab-case slug derived from the goal (`ci`, `lint`, `release`, `nightly-audit`). The workflow must run the project's **real** commands, declare **least-privilege** `permissions`, **pin** every third-party action to a commit SHA, **cache** dependencies, and **cancel** superseded runs via `concurrency`. Reference all credentials through `secrets.*`.
+> [!WARNING]
+> If `.github/workflows/<name>.yml` already exists, do not overwrite it. Read it, then either propose targeted edits in your report or write the new file as `<name>.new.yml` and say so. Never clobber a workflow that may be gating merges or shipping releases.
+## Step 1 — Map the goal to a trigger
+Classify `$ARGUMENTS` into one of these and set `on:` accordingly — do not add triggers the goal does not call for:
+- **CI / test / lint / typecheck** → `on: pull_request` (validate PRs) plus `push:` to the default branch only if post-merge runs are wanted. Gate jobs that touch credentials behind `pull_request`, not `pull_request_target`.
+- **Release / publish** → `on: push: tags: ['v*']` or `on: release: types: [published]`. Publishing on every `main` push is almost never what you want — prefer a tag/release trigger.
+- **Scheduled job** (audit, refresh, backup) → `on: schedule: - cron: '...'`. Cron runs in **UTC**; pick an off-peak minute (avoid `0 * * * *` — top-of-hour is heavily throttled and queued). Add `workflow_dispatch` so it can be run manually too.
+Detect the repo's default branch by `Read`ing `.git/HEAD` or any existing workflow; default to `main` if unknown and note the assumption.
+## Step 2 — Detect the stack and real commands
+Never invent `npm test`. Find what the project actually runs with `Glob`/`Read`/`Grep`:
+- **Node / Bun / Deno** — `package.json`: read `packageManager`, `engines.node`, and `scripts` (`test`, `lint`, `typecheck`, `build`). The lockfile picks the manager and the deterministic install + cache: `package-lock.json` → `npm ci`; `pnpm-lock.yaml` → `pnpm install --frozen-lockfile`; `yarn.lock` → `yarn install --immutable`; `bun.lockb` → `bun install --frozen-lockfile`.
+- **Python** — `pyproject.toml` / `requirements.txt` / `uv.lock` / `poetry.lock`; commands like `pytest`, `ruff check`, `mypy`.
+- **Go** — `go.mod`: `go test ./...`, `go vet ./...`, `go build ./...`; read the `go` directive for the version.
+- **Rust** — `Cargo.toml`: `cargo test`, `cargo clippy -- -D warnings`, `cargo build --release`.
+Record the **language + version**, **package manager + lockfile path**, and the **exact script names** that exist. If the goal asks for a step the project has no script for (e.g. no `lint`), say so in the report rather than fabricating one.
+## Step 3 — Write the hardened workflow
+Use the project's commands and the trigger from Step 1. The snippet below is illustrative for a Node CI workflow — adapt `setup-*`, the cache, and the run steps to the stack from Step 2.
+```yaml
+name: CI
+on:
+  pull_request:
+  push:
+    branches: [main]
+# Least privilege: read-only by default; add scopes per job only as needed.
+permissions:
+  contents: read
+# Cancel superseded runs for the same ref to save minutes.
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    steps:
+      # Pin third-party actions to a full commit SHA, not a moving tag.
+      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
+      - uses: actions/setup-node@39370e3970a6d050c480ffad4ff0ed4d3fdee5af # v4.1.0
+        with:
+          node-version: 22
+          cache: npm # built-in dependency cache keyed on the lockfile
+      - run: npm ci
+      - run: npm run lint --if-present
+      - run: npm test
+```
+Rules for whatever stack and goal you target:
+- **`permissions:` is least-privilege.** Set a top-level `permissions: contents: read` baseline, then grant the minimum each job needs: `pull-requests: write` to comment on PRs, `packages: write` to push images, `id-token: write` for OIDC publishing. Never use a blanket `permissions: write-all`.
+- **Pin every third-party action to a 40-char commit SHA**, with a trailing `# vX.Y.Z` comment for readability. A moving tag like `@v4` lets a compromised or retagged release run arbitrary code with your token. First-party `actions/*` are still safer pinned.
+- **Cache dependencies** — prefer the `cache:` option built into `setup-node`/`setup-go`/`setup-python` (keyed on the lockfile) over a hand-rolled `actions/cache` unless you need a custom path.
+- **Reference secrets only as `${{ secrets.NAME }}`** — never paste a token literal, and never `echo` a secret. Pass them as `env:` on the single step that needs them, not workflow-wide.
+- **Concurrency** — for CI, cancel superseded runs (`cancel-in-progress: true`). For a release/publish workflow, set `cancel-in-progress: false` so an in-flight publish is never killed mid-upload.
+> [!WARNING]
+> Do not use `pull_request_target` to "fix" a workflow that needs secrets on fork PRs. It runs with the base repo's write token **and** the fork's untrusted code/`with:` inputs in the same context — a classic token-exfiltration vector. If a fork PR genuinely needs a secret, split into a privileged `workflow_run` job that never checks out untrusted code.
+> [!NOTE]
+> For npm/PyPI publishing, prefer **OIDC trusted publishing** (`permissions: id-token: write`) over a long-lived `NPM_TOKEN`/`PYPI_TOKEN` secret — it removes the standing credential entirely. Fall back to a `secrets.*` token only if the registry does not support OIDC.
+## Step 4 — Report
+Deliver the result as your message:
+- **File written** — `.github/workflows/<name>.yml` (or `<name>.new.yml` if you avoided overwriting), and the detected stack + package manager it targets.
+- **Triggers** — the exact `on:` events and, for a schedule, the cron expression in plain English ("daily at 07:00 UTC").
+- **Permissions** — the `GITHUB_TOKEN` scopes granted and why each is needed.
+- **Secrets to configure** — every `secrets.*` referenced, where to add it (`Settings → Secrets and variables → Actions`, or an Environment for protected deploys), and whether OIDC could replace it.
+- **Follow-ups** — any missing project script the goal assumed, and how to verify the pinned SHAs (e.g. `gh api repos/actions/checkout/git/refs/tags/v4.2.2` to confirm the SHA matches the tag) and re-pin them later with Dependabot's `package-ecosystem: github-actions`.

package/content/commands/setup-precommit-hooks.md ADDED Viewed

@@ -0,0 +1,72 @@
+---
+description: "Set up fast pre-commit hooks that catch problems before they land — detect the repo's existing stack and hook mechanism, run lint/format/typecheck plus a secret scan on staged files only, keep the slow test suite in CI, and make the setup reproducible for the whole team."
+allowed-tools: "Read, Write, Glob, Grep, Bash"
+---
+## Scope
+No arguments. Your job: leave this repo with pre-commit hooks that run in **seconds**, only on **staged** content, blocking the cheap mistakes (lint errors, unformatted code, type breaks, committed secrets) before they enter history — while the full test suite stays in CI.
+Match the tooling the repo already uses. Do not impose a new framework on a repo that has a working one, and do not introduce a second runner alongside an existing one.
+> [!WARNING]
+> Hooks that run the whole test suite on every commit are slow, so developers learn to type `--no-verify` and the hooks stop protecting anything. Keep the commit path under a few seconds. Slow, comprehensive checks belong in CI.
+## Step 1 — Detect the stack and what already exists
+Before writing anything, read the ground truth:
+- **Existing hook mechanism** — `.pre-commit-config.yaml` (the pre-commit framework), `.husky/` + a `lint-staged` block in `package.json`, `lefthook.yml`, or a hand-rolled `.git/hooks/pre-commit`. Also check `git config core.hooksPath`.
+- **Stack** — `package.json`, `pyproject.toml`/`requirements.txt`, `go.mod`, `Cargo.toml`, `Gemfile`.
+- **Tools the repo already has** — linter (eslint, ruff, golangci-lint, clippy), formatter (prettier, ruff format/black, gofmt, rustfmt), type checker (tsc, mypy, pyright), and how the test suite is invoked.
+Reuse those exact tools and their existing config. The hook should call the same `eslint`/`ruff`/`prettier` the team already runs, not a new one with different rules.
+## Step 2 — Pick the mechanism (match, don't impose)
+- A config already exists → **extend it**. Add missing checks to the current `.pre-commit-config.yaml` / `lint-staged` block.
+- JS/TS repo, nothing yet → **Husky + lint-staged** (`lint-staged` already runs only on staged files — that's the whole point).
+- Python or polyglot repo, nothing yet → **the `pre-commit` framework** (`.pre-commit-config.yaml`); it pins hook versions and handles staged-only runs.
+- Tiny/no package manager → a **native `.git/hooks/pre-commit`** script. Note that native hooks aren't shared by git, so Step 5 must check it into the repo and add an install step.
+State your choice and why in one line.
+## Step 3 — Configure fast, staged-only checks
+Wire these against **staged files only**, fastest-failing first:
+1. **Secret scan** — block committed credentials with `gitleaks protect --staged` or pre-commit's `detect-secrets`. This is the one check worth running first; a leaked key can't be un-pushed.
+2. **Format (auto-fix)** — run the formatter in write mode on staged files, then re-stage them (`prettier --write`, `ruff format`, `gofmt -w`). Auto-fixing beats rejecting the commit over whitespace.
+3. **Lint** — only the staged files (`eslint`, `ruff check`, `golangci-lint run`); enable `--fix` where the linter's fixes are safe.
+4. **Typecheck** — only if it's fast on the changed scope. `tsc` is whole-project and often too slow for the commit path; if so, leave it to CI rather than degrading the commit experience.
+With `lint-staged`, the staged-file list is passed to each command automatically. With the `pre-commit` framework, set `pass_filenames: true` (the default) and scope with `files:`/`types:`.
+> [!WARNING]
+> The hook must operate on staged content only. If a tool reads the working tree instead of the index, a developer can stage a clean version, leave a broken version unstaged, and the hook passes on code that won't be what's committed. `lint-staged` and the `pre-commit` framework stash unstaged changes to avoid exactly this — a raw native hook does not, so handle it explicitly there.
+## Step 4 — Keep the slow stuff in CI
+Do **not** put the full test suite, full-repo typecheck, or a full build in the commit hook. Confirm those run in CI (check `.github/workflows/`); if a needed check is missing there, name the exact job that should run it (lint, typecheck, full tests on push/PR) and flag that it belongs in CI, not the commit path. A `pre-push` hook is the acceptable home for a fast smoke subset — never a substitute for CI.
+## Step 5 — Make it reproducible for the team
+A hook that only works on your machine is worthless. Ensure:
+- The config file is **committed** (`.pre-commit-config.yaml`, the `lint-staged` block, `.husky/` scripts, or the checked-in native hook + an installer like `git config core.hooksPath .githooks`).
+- There is **one install command** a teammate runs after cloning — `npx husky` (wired via the `prepare` script in `package.json` so `npm install` does it), or `pre-commit install`.
+- Hook tool versions are **pinned** (pre-commit `rev:` tags; devDependencies for JS) so everyone runs identical checks.
+Verify it actually fires: stage a deliberately broken file and confirm the commit is rejected, then fix and confirm it passes.
+## Step 6 — Document the escape hatch
+Note in the config or a short README line that `git commit --no-verify` skips hooks for genuine emergencies (hotfix, mid-rebase WIP). Don't hide it — but pair it with the reminder that CI still enforces the same checks, so bypassing locally only defers the failure.
+## Report
+End with:
+- **Files written/changed** — config path(s) and any `package.json` script additions.
+- **One-time install command** teammates run after cloning (exact command).
+- **What runs on commit** vs. **what's left to CI**, and the bypass flag for emergencies.