agentscamp 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/README.md +64 -0
- package/content/agents/accessibility-auditor.md +66 -0
- package/content/agents/agent-architect.md +65 -0
- package/content/agents/agent-reliability-reviewer.md +40 -0
- package/content/agents/agent-tool-integration-engineer.md +38 -0
- package/content/agents/api-architect.md +84 -0
- package/content/agents/backend-developer.md +92 -0
- package/content/agents/browser-agent-engineer.md +37 -0
- package/content/agents/cloud-architect.md +72 -0
- package/content/agents/code-reviewer.md +69 -0
- package/content/agents/data-engineer.md +67 -0
- package/content/agents/data-scientist.md +79 -0
- package/content/agents/debugger.md +89 -0
- package/content/agents/dependency-manager.md +64 -0
- package/content/agents/devops-engineer.md +94 -0
- package/content/agents/documentation-engineer.md +52 -0
- package/content/agents/finetuning-engineer.md +43 -0
- package/content/agents/frontend-developer.md +78 -0
- package/content/agents/git-github-expert.md +66 -0
- package/content/agents/golang-pro.md +72 -0
- package/content/agents/graphql-architect.md +85 -0
- package/content/agents/kubernetes-specialist.md +87 -0
- package/content/agents/llm-cost-optimizer.md +39 -0
- package/content/agents/llm-evaluation-engineer.md +42 -0
- package/content/agents/llm-inference-engineer.md +42 -0
- package/content/agents/llm-integration-engineer.md +39 -0
- package/content/agents/llm-observability-engineer.md +41 -0
- package/content/agents/mcp-server-engineer.md +43 -0
- package/content/agents/ml-engineer.md +67 -0
- package/content/agents/mobile-developer.md +89 -0
- package/content/agents/performance-engineer.md +79 -0
- package/content/agents/postgres-migration-engineer.md +42 -0
- package/content/agents/prompt-engineer.md +58 -0
- package/content/agents/prompt-injection-auditor.md +42 -0
- package/content/agents/python-pro.md +77 -0
- package/content/agents/rag-pipeline-engineer.md +42 -0
- package/content/agents/react-specialist.md +83 -0
- package/content/agents/refactoring-specialist.md +78 -0
- package/content/agents/retrieval-engineer.md +41 -0
- package/content/agents/rust-pro.md +89 -0
- package/content/agents/security-auditor.md +78 -0
- package/content/agents/sql-pro.md +53 -0
- package/content/agents/sre-engineer.md +66 -0
- package/content/agents/system-architect.md +77 -0
- package/content/agents/terraform-specialist.md +73 -0
- package/content/agents/test-engineer.md +79 -0
- package/content/agents/typescript-pro.md +82 -0
- package/content/agents/vector-search-engineer.md +43 -0
- package/content/agents/voice-agent-engineer.md +38 -0
- package/content/agents/workflow-orchestrator.md +70 -0
- package/content/commands/add-docstrings.md +92 -0
- package/content/commands/add-human-approval.md +40 -0
- package/content/commands/add-mcp-server.md +50 -0
- package/content/commands/add-streaming-endpoint.md +34 -0
- package/content/commands/benchmark-rerankers.md +44 -0
- package/content/commands/breakdown-task.md +86 -0
- package/content/commands/commit.md +117 -0
- package/content/commands/create-pr.md +109 -0
- package/content/commands/db-migrate.md +47 -0
- package/content/commands/explain-code.md +71 -0
- package/content/commands/explain-error.md +98 -0
- package/content/commands/extract-function.md +107 -0
- package/content/commands/find-bug.md +93 -0
- package/content/commands/fix-failing-test.md +106 -0
- package/content/commands/new-component.md +119 -0
- package/content/commands/plan-feature.md +71 -0
- package/content/commands/profile-postgres-queries.md +41 -0
- package/content/commands/red-team-llm.md +45 -0
- package/content/commands/refactor.md +82 -0
- package/content/commands/review-pr.md +101 -0
- package/content/commands/run-evals.md +34 -0
- package/content/commands/scaffold-pgvector-schema.md +42 -0
- package/content/commands/scaffold-vllm-config.md +44 -0
- package/content/commands/security-scan.md +129 -0
- package/content/commands/set-perf-budget.md +47 -0
- package/content/commands/setup-claude-ci.md +60 -0
- package/content/commands/sync-branch.md +138 -0
- package/content/commands/update-readme.md +108 -0
- package/content/commands/write-tests.md +81 -0
- package/content/manifest.json +1709 -0
- package/content/skills/adr-writer.md +90 -0
- package/content/skills/branch-rebaser.md +86 -0
- package/content/skills/bundle-analyzer.md +77 -0
- package/content/skills/changelog-from-prs.md +81 -0
- package/content/skills/chunking-strategy-optimizer.md +34 -0
- package/content/skills/claude-settings-auditor.md +38 -0
- package/content/skills/conventional-commits.md +80 -0
- package/content/skills/coverage-gap-finder.md +72 -0
- package/content/skills/dead-code-finder.md +65 -0
- package/content/skills/dependency-audit.md +64 -0
- package/content/skills/embedding-index-tuner.md +34 -0
- package/content/skills/embedding-set-inspector.md +34 -0
- package/content/skills/finetune-dataset-builder.md +33 -0
- package/content/skills/graphrag-scaffolder.md +39 -0
- package/content/skills/hook-writer.md +39 -0
- package/content/skills/human-in-the-loop-gate.md +33 -0
- package/content/skills/llm-as-judge-scorer.md +33 -0
- package/content/skills/llm-eval-suite-scaffolder.md +30 -0
- package/content/skills/llm-guardrails-designer.md +33 -0
- package/content/skills/llm-output-schema-generator.md +32 -0
- package/content/skills/mcp-server-scaffolder.md +33 -0
- package/content/skills/mock-data-factory.md +75 -0
- package/content/skills/multimodal-document-extractor.md +39 -0
- package/content/skills/openapi-doc-writer.md +88 -0
- package/content/skills/plugin-scaffolder.md +38 -0
- package/content/skills/postgres-index-strategist.md +38 -0
- package/content/skills/pr-description.md +87 -0
- package/content/skills/prompt-cache-optimizer.md +34 -0
- package/content/skills/prompt-optimizer.md +40 -0
- package/content/skills/prompt-pii-redactor.md +33 -0
- package/content/skills/provider-fallback-wrapper.md +33 -0
- package/content/skills/qlora-finetune-runner.md +33 -0
- package/content/skills/readme-generator.md +84 -0
- package/content/skills/secret-scanner.md +65 -0
- package/content/skills/sql-optimizer.md +77 -0
- package/content/skills/test-scaffolder.md +74 -0
- package/content/skills/tool-definition-generator.md +33 -0
- package/content/skills/web-research-pipeline.md +39 -0
- package/dist/index.js +384 -0
- package/package.json +44 -0
|
@@ -0,0 +1,107 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: "Extract a code region into a well-named function and update the call site."
|
|
3
|
+
argument-hint: "<file:lines or description>"
|
|
4
|
+
allowed-tools: "Read, Grep, Glob, Edit"
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
Extract a region of code into a single, well-named function and replace the original code with a call to it. The result must behave exactly as before — this is a mechanical refactor, not a redesign.
|
|
8
|
+
|
|
9
|
+
## Scope
|
|
10
|
+
|
|
11
|
+
Interpret `$ARGUMENTS` as the region to extract:
|
|
12
|
+
|
|
13
|
+
- **`path/to/file.ts:40-72`** — extract those line numbers in that file.
|
|
14
|
+
- **A description** like *"the validation block in `createUser`"* — locate the matching region with `Grep`/`Glob` before touching anything.
|
|
15
|
+
|
|
16
|
+
If `$ARGUMENTS` is empty, ask which region to extract. Do not guess — extracting the wrong span produces a function nobody asked for.
|
|
17
|
+
|
|
18
|
+
> [!WARNING]
|
|
19
|
+
> This is behavior-preserving. Do not add features, change return values, fix bugs you notice along the way, or alter side effects. If you spot a real bug, stop and report it instead of folding a fix into the extraction.
|
|
20
|
+
|
|
21
|
+
## Step 1 — Locate and read the region
|
|
22
|
+
|
|
23
|
+
Read the file and pin down the exact span. Read enough surrounding context to see what comes before and after the region, not just the lines themselves.
|
|
24
|
+
|
|
25
|
+
```bash
|
|
26
|
+
# When $ARGUMENTS names a file
|
|
27
|
+
rg -n "createUser" path/to/file.ts
|
|
28
|
+
|
|
29
|
+
# Confirm the enclosing function and its boundaries
|
|
30
|
+
rg -n "^\s*(function|const|def|fn|public|private)" path/to/file.ts
|
|
31
|
+
```
|
|
32
|
+
|
|
33
|
+
Confirm the region is a coherent unit — one job, with a clear top and bottom. If the lines straddle two unrelated concerns, extract the smaller coherent piece and say so.
|
|
34
|
+
|
|
35
|
+
## Step 2 — Determine inputs and outputs
|
|
36
|
+
|
|
37
|
+
This is the part that breaks naive extractions. Work out exactly what the region consumes and what it produces.
|
|
38
|
+
|
|
39
|
+
- **Inputs (parameters):** every variable the region *reads* but does not itself define inside the region. These become parameters.
|
|
40
|
+
- **Outputs (return value):** variables defined inside the region that are *read after* it. One value → return it. Several → return a tuple/object, or keep the extraction smaller.
|
|
41
|
+
- **Mutations:** values the region mutates in place (array pushes, object field writes, mutated arguments). The caller must still observe these — pass the object through and mutate it, or return the new value and reassign at the call site.
|
|
42
|
+
|
|
43
|
+
> [!WARNING]
|
|
44
|
+
> Watch for **captured closure state** and **early returns**. A `return`/`break`/`continue`/`throw` inside the region changes control flow for the *enclosing* function — a plain extracted function cannot reproduce a `break` in the caller's loop, and an early `return` becomes a return from the new function, not the original. If the region contains either, model it explicitly (return a sentinel and branch at the call site, or leave the control-flow line outside the extraction) or report that the region is not cleanly extractable.
|
|
45
|
+
|
|
46
|
+
## Step 3 — Write the function
|
|
47
|
+
|
|
48
|
+
Create the function with a name that states what it does, not how. Place it sensibly: a module-level helper near related functions, or a private method on the same class if it uses instance state.
|
|
49
|
+
|
|
50
|
+
```ts
|
|
51
|
+
// Before — inline region inside createUser
|
|
52
|
+
const trimmed = input.email.trim().toLowerCase();
|
|
53
|
+
if (!trimmed.includes("@") || trimmed.length > 254) {
|
|
54
|
+
throw new ValidationError("invalid email");
|
|
55
|
+
}
|
|
56
|
+
const email = trimmed;
|
|
57
|
+
|
|
58
|
+
// After — extracted, single responsibility, descriptive name
|
|
59
|
+
function normalizeEmail(raw: string): string {
|
|
60
|
+
const trimmed = raw.trim().toLowerCase();
|
|
61
|
+
if (!trimmed.includes("@") || trimmed.length > 254) {
|
|
62
|
+
throw new ValidationError("invalid email");
|
|
63
|
+
}
|
|
64
|
+
return trimmed;
|
|
65
|
+
}
|
|
66
|
+
```
|
|
67
|
+
|
|
68
|
+
Match the file's existing conventions — async/sync, error handling, naming style, and how other helpers in the file are declared and exported.
|
|
69
|
+
|
|
70
|
+
## Step 4 — Replace the original with a call
|
|
71
|
+
|
|
72
|
+
Swap the region for a call that wires up the same inputs and outputs. Keep the surrounding variable names identical so the rest of the function is untouched.
|
|
73
|
+
|
|
74
|
+
```ts
|
|
75
|
+
// Inside createUser
|
|
76
|
+
const email = normalizeEmail(input.email);
|
|
77
|
+
```
|
|
78
|
+
|
|
79
|
+
Preserve order of operations. If the region had side effects (logging, I/O, mutation) sequenced relative to its neighbors, the call must sit at the exact same point.
|
|
80
|
+
|
|
81
|
+
## Step 5 — Verify behavior is unchanged
|
|
82
|
+
|
|
83
|
+
Find and read every caller of the enclosing code path, then confirm the contract still holds.
|
|
84
|
+
|
|
85
|
+
```bash
|
|
86
|
+
# Find callers of the function you edited
|
|
87
|
+
rg -n "createUser\(" --type ts
|
|
88
|
+
|
|
89
|
+
# Run the tests covering this code
|
|
90
|
+
npm test # or: pytest, go test ./..., cargo test
|
|
91
|
+
```
|
|
92
|
+
|
|
93
|
+
- The same arguments must flow in and the same value/mutations must flow out.
|
|
94
|
+
- Run the linter and type checker — a type error here usually means an input or output was mis-classified in Step 2.
|
|
95
|
+
|
|
96
|
+
> [!NOTE]
|
|
97
|
+
> If a test had to change to keep passing, behavior changed. Revert and re-examine the inputs/outputs — a correct extraction never requires touching assertions.
|
|
98
|
+
|
|
99
|
+
## Report
|
|
100
|
+
|
|
101
|
+
Summarize concisely:
|
|
102
|
+
|
|
103
|
+
- **Extracted** — the new function name, its signature, and where it now lives.
|
|
104
|
+
- **Call site** — the file and line where the region was replaced.
|
|
105
|
+
- **Inputs/outputs** — parameters in, value(s) out, and any mutations preserved.
|
|
106
|
+
- **Verification** — that callers, tests, lint, and types still pass.
|
|
107
|
+
- **Caveats** — any early-return or closure-capture handling, or anything you left out of scope.
|
|
@@ -0,0 +1,93 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: "Investigate a reported symptom, form hypotheses, and locate the root cause."
|
|
3
|
+
argument-hint: "[symptom]"
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
You are debugging a reported issue. The symptom to investigate is: **$ARGUMENTS**
|
|
7
|
+
|
|
8
|
+
Your goal is to find the *root cause* — not the first plausible explanation, and not a patch over the symptom. Work methodically through the phases below. Do not skip ahead to a fix until you can explain exactly why the bug happens.
|
|
9
|
+
|
|
10
|
+
## 1. Reproduce and characterize the symptom
|
|
11
|
+
|
|
12
|
+
Before touching the code, pin down what "$ARGUMENTS" actually means in concrete terms.
|
|
13
|
+
|
|
14
|
+
- Identify the **exact** observable failure: error message, stack trace, wrong output, crash, hang, or incorrect state.
|
|
15
|
+
- Determine **when** it happens: every time, intermittently, only with certain inputs, only in certain environments.
|
|
16
|
+
- Establish a reliable reproduction. If you cannot reproduce it, that is your first task — search logs, tests, and recent reports for clues.
|
|
17
|
+
|
|
18
|
+
> [!NOTE]
|
|
19
|
+
> A bug you cannot reproduce is a bug you cannot confidently fix. Invest here first; everything downstream depends on a stable repro.
|
|
20
|
+
|
|
21
|
+
Capture the failing signal directly when possible:
|
|
22
|
+
|
|
23
|
+
```bash
|
|
24
|
+
# Re-run the failing command/test and capture full output
|
|
25
|
+
<your test or repro command> 2>&1 | tee /tmp/find-bug.log
|
|
26
|
+
```
|
|
27
|
+
|
|
28
|
+
## 2. Gather evidence
|
|
29
|
+
|
|
30
|
+
Collect facts before forming theories. Read the actual error, don't paraphrase it.
|
|
31
|
+
|
|
32
|
+
- Read the **full** stack trace top to bottom; the deepest frame in *your* code is usually the most relevant.
|
|
33
|
+
- Grep for the error message, the failing symbol, and surrounding identifiers to locate candidate files.
|
|
34
|
+
- Check recent changes — a regression often points straight at the culprit.
|
|
35
|
+
|
|
36
|
+
```bash
|
|
37
|
+
# Find what changed recently around the suspect area
|
|
38
|
+
git log --oneline -20 -- <suspect_path>
|
|
39
|
+
git blame -L <start>,<end> <file>
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
## 3. Form hypotheses
|
|
43
|
+
|
|
44
|
+
List the **plausible** root causes as explicit, testable statements. Rank them by likelihood given the evidence.
|
|
45
|
+
|
|
46
|
+
Common categories to consider:
|
|
47
|
+
|
|
48
|
+
- **State / lifecycle** — stale cache, race condition, uninitialized or mutated shared state.
|
|
49
|
+
- **Boundary conditions** — null/empty/zero, off-by-one, overflow, timezone, encoding.
|
|
50
|
+
- **Contract mismatch** — API shape changed, wrong type, wrong units, optional treated as required.
|
|
51
|
+
- **Environment** — config, env vars, dependency version, build vs. runtime difference.
|
|
52
|
+
|
|
53
|
+
Write each hypothesis with the prediction it implies, e.g. *"If the cache is stale, then clearing it before the call will make the symptom disappear."*
|
|
54
|
+
|
|
55
|
+
## 4. Test each hypothesis
|
|
56
|
+
|
|
57
|
+
Confirm or eliminate hypotheses one at a time. Change **one** variable per experiment so the result is unambiguous.
|
|
58
|
+
|
|
59
|
+
- Add targeted logging or assertions at the boundary you suspect.
|
|
60
|
+
- Use a debugger or a minimal script to inspect actual values at the point of failure.
|
|
61
|
+
- Bisect when the cause is unclear and you have a known-good past state:
|
|
62
|
+
|
|
63
|
+
```bash
|
|
64
|
+
git bisect start
|
|
65
|
+
git bisect bad # current revision is broken
|
|
66
|
+
git bisect good <known_good> # last revision known to work
|
|
67
|
+
# git replays commits; mark each: git bisect good | git bisect bad
|
|
68
|
+
git bisect reset # when finished
|
|
69
|
+
```
|
|
70
|
+
|
|
71
|
+
> [!WARNING]
|
|
72
|
+
> Resist confirmation bias. Actively try to *disprove* your favorite hypothesis. If an experiment "kind of" supports it, treat that as a no until you have a clean, repeatable result.
|
|
73
|
+
|
|
74
|
+
## 5. Identify the root cause
|
|
75
|
+
|
|
76
|
+
You have found the root cause only when you can state, precisely:
|
|
77
|
+
|
|
78
|
+
- **Where** it is — the specific file, function, and line(s).
|
|
79
|
+
- **Why** it produces the symptom — the exact chain of cause and effect.
|
|
80
|
+
- **When** it triggers — the conditions required, which must match the reproduction from Step 1.
|
|
81
|
+
|
|
82
|
+
If your explanation does not fully account for the observed behavior (including any intermittency), you are not done — return to Step 3.
|
|
83
|
+
|
|
84
|
+
## 6. Report findings
|
|
85
|
+
|
|
86
|
+
Summarize concisely so the fix is obvious and safe:
|
|
87
|
+
|
|
88
|
+
- **Root cause:** one or two sentences naming the exact defect.
|
|
89
|
+
- **Evidence:** the experiments and observations that prove it.
|
|
90
|
+
- **Affected scope:** other call sites or inputs that hit the same defect.
|
|
91
|
+
- **Suggested fix:** the minimal change that addresses the cause, plus a test that would have caught it.
|
|
92
|
+
|
|
93
|
+
Do not apply the fix unless asked — your job here is to locate and explain the root cause. Hand off a clear, verifiable diagnosis.
|
|
@@ -0,0 +1,106 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: "Diagnose and fix a failing test by finding the real root cause."
|
|
3
|
+
argument-hint: "[test name or path]"
|
|
4
|
+
allowed-tools: "Read, Grep, Glob, Edit, Bash"
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
Make a failing test green by fixing the **actual root cause**, not by papering over it. Follow the steps below in order. The hard part is the judgment call in Step 3 — whether the test or the code is wrong — so do not skip it.
|
|
8
|
+
|
|
9
|
+
## Scope
|
|
10
|
+
|
|
11
|
+
`$ARGUMENTS` names the failing test to fix. It may be a test file path (`src/lib/parse.test.ts`), a single test name or pattern (`parses nested config`), or a `file::test` selector your runner understands. Use it to scope the run so you iterate on one failure at a time.
|
|
12
|
+
|
|
13
|
+
If `$ARGUMENTS` is empty, run the full suite first, find the failing test(s), and pick the first failure to work on. If several tests fail, fix them one at a time and re-run between fixes — a single root cause often explains a cluster of failures, and fixing it may turn the rest green for free.
|
|
14
|
+
|
|
15
|
+
## Step 1 — Reproduce and read the real failure
|
|
16
|
+
|
|
17
|
+
Detect the test runner (`jest`, `vitest`, `pytest`, `go test`, `cargo test`, …) from the manifest, then run only the target so the output stays readable.
|
|
18
|
+
|
|
19
|
+
```bash
|
|
20
|
+
# vitest / jest — by name pattern or file
|
|
21
|
+
npx vitest run -t "parses nested config"
|
|
22
|
+
npx jest path/to/file.test.ts -t "parses nested config"
|
|
23
|
+
|
|
24
|
+
# pytest — node id or keyword
|
|
25
|
+
pytest "tests/test_parse.py::test_nested_config" -x -vv
|
|
26
|
+
|
|
27
|
+
# go — single test by regexp
|
|
28
|
+
go test ./pkg/parse -run TestNestedConfig -v
|
|
29
|
+
```
|
|
30
|
+
|
|
31
|
+
Read the *entire* failure block, not just the red summary line:
|
|
32
|
+
|
|
33
|
+
- The assertion that failed, with **expected vs. actual** values.
|
|
34
|
+
- The stack trace — find the first frame inside the project's own source, not framework internals.
|
|
35
|
+
- The exact input the test fed in, so you can replay the path by hand.
|
|
36
|
+
|
|
37
|
+
> [!NOTE]
|
|
38
|
+
> Confirm the test fails for the reason you think it does. A `ReferenceError`, an import that throws at load, a timeout, or a snapshot mismatch are different problems than a logic assertion — don't start fixing math when the real issue is the suite can't even import the module.
|
|
39
|
+
|
|
40
|
+
## Step 2 — Locate the code under test
|
|
41
|
+
|
|
42
|
+
Trace from the failing assertion back to the production code that produced the wrong value.
|
|
43
|
+
|
|
44
|
+
```bash
|
|
45
|
+
# Find the symbol the test imports / calls
|
|
46
|
+
rg -n "parseConfig" src
|
|
47
|
+
|
|
48
|
+
# Open the test and the implementation side by side
|
|
49
|
+
```
|
|
50
|
+
|
|
51
|
+
Read both the test and the implementation. Reconstruct what value the code returns for the test's input and *why*, walking the same branch the failing case takes. Check whether the behavior recently changed.
|
|
52
|
+
|
|
53
|
+
```bash
|
|
54
|
+
# What touched this code lately, and was the test updated alongside it?
|
|
55
|
+
git log --oneline -10 -- src/lib/parse.ts
|
|
56
|
+
git log -p -1 -- src/lib/parse.ts
|
|
57
|
+
```
|
|
58
|
+
|
|
59
|
+
## Step 3 — Decide: is the TEST wrong or the CODE wrong?
|
|
60
|
+
|
|
61
|
+
This is the decision that determines everything else. **State your verdict explicitly before you edit anything**, and back it with the evidence from Steps 1–2. One side is wrong:
|
|
62
|
+
|
|
63
|
+
- **The code is wrong** when the test encodes the genuinely intended behavior (a clear spec, a documented contract, the obvious correct answer) and the implementation produces something else. Fix the implementation.
|
|
64
|
+
- **The test is wrong** when the implementation is correct and the test asserts the wrong thing — a stale expectation after an intentional behavior change, a bad fixture, a flawed mock, a brittle snapshot, or an order/timing assumption that was never guaranteed. Fix the test.
|
|
65
|
+
|
|
66
|
+
When it is genuinely ambiguous (no spec says which behavior is right), do not guess silently. State both interpretations and the user-facing consequence of each, and ask which is intended before changing code.
|
|
67
|
+
|
|
68
|
+
> [!WARNING]
|
|
69
|
+
> Never make a test pass by weakening or deleting the assertion — loosening an exact match to `toBeTruthy()`, widening a tolerance, adding `.skip`/`xit`, or editing the expected value to match buggy output. That hides a real bug behind a green check. If the assertion is correct, fix the code; only relax an assertion when the assertion itself is provably wrong.
|
|
70
|
+
|
|
71
|
+
## Step 4 — Fix the correct side
|
|
72
|
+
|
|
73
|
+
Apply the smallest change that addresses the root cause you identified.
|
|
74
|
+
|
|
75
|
+
- **Fixing the code:** correct the actual defect — the off-by-one, the wrong operator, the unhandled `null`, the bad early return. Don't special-case the one input the test uses; fix the general behavior so related inputs are right too.
|
|
76
|
+
- **Fixing the test:** update the expectation, fixture, or mock to match the genuinely correct behavior, and leave a one-line comment on *why* if it isn't obvious. If the test was brittle (timing, ordering, snapshot churn), make it deterministic rather than just nudging the expected value.
|
|
77
|
+
|
|
78
|
+
Touch only what the diagnosis requires. Leave unrelated cleanup for another change.
|
|
79
|
+
|
|
80
|
+
## Step 5 — Re-run and confirm green
|
|
81
|
+
|
|
82
|
+
Run the scoped target first to confirm the specific failure is gone.
|
|
83
|
+
|
|
84
|
+
```bash
|
|
85
|
+
npx vitest run -t "parses nested config"
|
|
86
|
+
```
|
|
87
|
+
|
|
88
|
+
Then run the surrounding file and the full suite to make sure the fix didn't break a sibling test.
|
|
89
|
+
|
|
90
|
+
```bash
|
|
91
|
+
npx vitest run path/to/file.test.ts # the whole file
|
|
92
|
+
npx vitest run # the full suite
|
|
93
|
+
```
|
|
94
|
+
|
|
95
|
+
> [!NOTE]
|
|
96
|
+
> If the target now passes but a previously-green test fails, your change had a side effect that the broader suite encodes as intended behavior. That regression is a new signal — return to Step 3 and reconcile the two expectations rather than suppressing either one.
|
|
97
|
+
|
|
98
|
+
## Report
|
|
99
|
+
|
|
100
|
+
Summarize for the user, concisely:
|
|
101
|
+
|
|
102
|
+
- **Verdict:** which side was wrong (test or code) and the one-line root cause.
|
|
103
|
+
- **Fix:** the file and what you changed, and why it addresses the cause rather than the symptom.
|
|
104
|
+
- **Result:** the target test and full suite are green (paste the final pass count), or the open question you need answered if the intended behavior was ambiguous.
|
|
105
|
+
|
|
106
|
+
Do not commit or push — leave the change staged for the user to review unless they explicitly ask you to commit.
|
|
@@ -0,0 +1,119 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: "Scaffold a new UI component matching the project conventions."
|
|
3
|
+
argument-hint: "<ComponentName> [props]"
|
|
4
|
+
allowed-tools: "Read, Grep, Glob, Write, Edit"
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
Scaffold a new UI component named in `$ARGUMENTS`, generated to match this repository's existing conventions exactly. Discover the conventions first by reading real neighbor components — never impose a structure the repo does not already use.
|
|
8
|
+
|
|
9
|
+
## Scope
|
|
10
|
+
|
|
11
|
+
Read `$ARGUMENTS` as `<ComponentName> [props]`:
|
|
12
|
+
|
|
13
|
+
- The first token is the component name in the project's casing (e.g. `UserCard`, `user-card`). Normalize it to whatever convention the codebase uses, not your own preference.
|
|
14
|
+
- Any remaining tokens are a rough prop list — `title:string variant?:primary|secondary count:number onSelect:fn`. Treat `?` as optional and `fn` as a callback. If props are vague, infer a minimal sensible interface and note your assumptions.
|
|
15
|
+
|
|
16
|
+
If `$ARGUMENTS` is empty, ask for the component name and its purpose before generating anything. Do not invent a component the user did not request.
|
|
17
|
+
|
|
18
|
+
## Step 1 — Detect the project's conventions
|
|
19
|
+
|
|
20
|
+
Before writing a single line, find a representative existing component and study how it is built. This is the most important step — everything downstream mirrors what you find here.
|
|
21
|
+
|
|
22
|
+
```bash
|
|
23
|
+
# Find existing components to mirror (adapt globs to the repo)
|
|
24
|
+
fd -e tsx -e jsx -e vue -e svelte . src/components src/app 2>/dev/null | head -30
|
|
25
|
+
|
|
26
|
+
# Inspect the manifest for framework, test runner, and styling deps
|
|
27
|
+
cat package.json
|
|
28
|
+
```
|
|
29
|
+
|
|
30
|
+
From a real neighbor file, extract and write down:
|
|
31
|
+
|
|
32
|
+
- **Framework & file type** — React `.tsx`, Vue SFC `.vue`, Svelte `.svelte`, Solid, Angular.
|
|
33
|
+
- **File layout** — one file per component vs. a folder (`Button/index.tsx`, `Button.tsx`, `Button.test.tsx`, `Button.stories.tsx`).
|
|
34
|
+
- **Styling** — Tailwind classes, CSS Modules, `styled-components`, vanilla-extract, plain CSS. Note any `cn()`/`clsx` helper and variant utility (`cva`, `tv`).
|
|
35
|
+
- **Prop typing** — `interface Props` vs. `type Props`, `React.FC` vs. plain function, `forwardRef`, default exports vs. named.
|
|
36
|
+
- **Test / story patterns** — the test framework (`vitest`, `jest`, `@testing-library`), and whether stories use CSF, MDX, or none.
|
|
37
|
+
- **Imports & aliases** — path aliases (`@/components`), import ordering, and how shared primitives are imported.
|
|
38
|
+
|
|
39
|
+
> [!NOTE]
|
|
40
|
+
> Pick the closest neighbor to what you are building (a card if scaffolding a card) and mirror it line for line — directory, naming, export style, and formatting. A component that looks hand-written by the team beats a "correct" one that fights the codebase.
|
|
41
|
+
|
|
42
|
+
## Step 2 — Generate the component
|
|
43
|
+
|
|
44
|
+
Create the component file at the location and with the naming the codebase uses. The block below is illustrative — match the framework and style you found in Step 1, not this snippet.
|
|
45
|
+
|
|
46
|
+
```tsx
|
|
47
|
+
import { cn } from "@/lib/utils";
|
|
48
|
+
|
|
49
|
+
interface UserCardProps {
|
|
50
|
+
title: string;
|
|
51
|
+
variant?: "primary" | "secondary";
|
|
52
|
+
count: number;
|
|
53
|
+
onSelect?: () => void;
|
|
54
|
+
}
|
|
55
|
+
|
|
56
|
+
export function UserCard({
|
|
57
|
+
title,
|
|
58
|
+
variant = "primary",
|
|
59
|
+
count,
|
|
60
|
+
onSelect,
|
|
61
|
+
}: UserCardProps) {
|
|
62
|
+
return (
|
|
63
|
+
<div
|
|
64
|
+
className={cn("rounded-lg border p-4", variant === "primary" && "bg-card")}
|
|
65
|
+
onClick={onSelect}
|
|
66
|
+
>
|
|
67
|
+
<h3>{title}</h3>
|
|
68
|
+
<span>{count}</span>
|
|
69
|
+
</div>
|
|
70
|
+
);
|
|
71
|
+
}
|
|
72
|
+
```
|
|
73
|
+
|
|
74
|
+
- Reuse existing primitives and helpers (the local `cn()`, shared `Button`, design tokens) instead of reintroducing your own.
|
|
75
|
+
- Keep the public prop surface minimal and typed; derive optionality from the `?` markers in `$ARGUMENTS`.
|
|
76
|
+
- Match the neighbor's export style (named vs. default) so existing import patterns keep working.
|
|
77
|
+
|
|
78
|
+
> [!WARNING]
|
|
79
|
+
> Only create the files needed for this component. Do not edit unrelated files, restructure folders, add dependencies, or change shared config. If a missing helper or barrel export is required, flag it rather than silently introducing a new pattern.
|
|
80
|
+
|
|
81
|
+
## Step 3 — Add types, test, and story
|
|
82
|
+
|
|
83
|
+
Generate the supporting files that the neighbor component has — no more, no fewer. If the project keeps types inline, keep them inline; if it ships a `.test.tsx` and a `.stories.tsx` alongside each component, produce both in the same folder.
|
|
84
|
+
|
|
85
|
+
```tsx
|
|
86
|
+
// UserCard.test.tsx — mirror the project's test framework and queries
|
|
87
|
+
import { render, screen } from "@testing-library/react";
|
|
88
|
+
import { UserCard } from "./UserCard";
|
|
89
|
+
|
|
90
|
+
test("renders the title", () => {
|
|
91
|
+
render(<UserCard title="Ada" count={3} />);
|
|
92
|
+
expect(screen.getByText("Ada")).toBeInTheDocument();
|
|
93
|
+
});
|
|
94
|
+
```
|
|
95
|
+
|
|
96
|
+
- Put each file exactly where the codebase puts it, and use its import aliases.
|
|
97
|
+
- Cover the rendered output and one prop-driven branch in the test; do not over-test scaffolding.
|
|
98
|
+
- If the repo has no tests or no stories, skip that artifact — do not introduce a tool the project does not use.
|
|
99
|
+
|
|
100
|
+
> [!NOTE]
|
|
101
|
+
> If you add the component to a barrel file (`index.ts`) or registry, only do so when neighbors are exported the same way. Follow the existing export ordering.
|
|
102
|
+
|
|
103
|
+
## Step 4 — Verify and report
|
|
104
|
+
|
|
105
|
+
Confirm the generated files fit the project before handing back.
|
|
106
|
+
|
|
107
|
+
```bash
|
|
108
|
+
# Adapt to the repo's commands
|
|
109
|
+
npm run lint
|
|
110
|
+
npx tsc --noEmit # or the project's typecheck/build command
|
|
111
|
+
# npm run build # heavier fallback if you need a full bundle
|
|
112
|
+
```
|
|
113
|
+
|
|
114
|
+
Report concisely:
|
|
115
|
+
|
|
116
|
+
- **Files created** — each path, and the neighbor file each one was modeled on.
|
|
117
|
+
- **Props** — the resolved interface and any optionality or types you inferred.
|
|
118
|
+
- **Conventions followed** — framework, styling approach, export style, and test/story pattern matched.
|
|
119
|
+
- **Follow-ups** — anything intentionally skipped (no story because the repo has none) or a missing helper the user should wire up.
|
|
@@ -0,0 +1,71 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: "Explore the codebase and produce an implementation plan for a feature."
|
|
3
|
+
argument-hint: "<feature description>"
|
|
4
|
+
allowed-tools: "Read, Grep, Glob"
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
## Scope
|
|
8
|
+
|
|
9
|
+
Treat `$ARGUMENTS` as the feature request — what the user wants to build (`add CSV export to the reports page`, `support OAuth login`, `rate-limit the public API`). Restate it in one sentence to confirm you understood it before planning.
|
|
10
|
+
|
|
11
|
+
If `$ARGUMENTS` is empty, ask one focused question: *"What feature should I plan?"* Do not guess a feature out of thin air.
|
|
12
|
+
|
|
13
|
+
> [!WARNING]
|
|
14
|
+
> Read-only mode. Do not modify the repository, run migrations, install packages, or scaffold code. Your only output is the written plan. If the user wants you to start building, that is a separate follow-up.
|
|
15
|
+
|
|
16
|
+
> [!NOTE]
|
|
17
|
+
> Where the request is ambiguous (auth provider, storage backend, UI placement, scope boundaries), state your assumptions explicitly and plan against them rather than stalling. Flag each assumption so the user can correct it.
|
|
18
|
+
|
|
19
|
+
## Step 1 — Understand the request
|
|
20
|
+
|
|
21
|
+
Break the feature into concrete capabilities and acceptance criteria before touching the code.
|
|
22
|
+
|
|
23
|
+
- What is the user-facing behavior when this is done? What is the smallest version that ships value?
|
|
24
|
+
- What is explicitly **out of scope**? Name it so the plan stays bounded.
|
|
25
|
+
- What existing behavior must not break?
|
|
26
|
+
|
|
27
|
+
## Step 2 — Explore the code to ground the plan
|
|
28
|
+
|
|
29
|
+
A plan written without reading the code is a guess. Map the feature onto the real structure before proposing anything. You only have `Read`, `Grep`, and `Glob` here — explore with those, not the shell:
|
|
30
|
+
|
|
31
|
+
- **Orient first:** `Read` `README.md`, `package.json`, and `CLAUDE.md` for project type, scripts, conventions, and the test/lint/typecheck commands. Use `Glob` (e.g. `src/**/*.ts`, `**/routes/**`) to see how the tree is laid out.
|
|
32
|
+
- **Find the area the feature touches:** `Grep` for terms drawn from `$ARGUMENTS` (e.g. `export|report|download`) to locate the relevant files. Map the entry points, data flow, and layers the feature crosses (routes, services, models, UI).
|
|
33
|
+
- **Find a pattern to mirror:** `Grep` for the shape of similar existing features (e.g. `router\.|app\.(get|post)|export function`), then `Read` the **closest existing feature** end to end. The cleanest plan usually copies an established pattern rather than inventing one.
|
|
34
|
+
|
|
35
|
+
## Step 3 — Write the plan
|
|
36
|
+
|
|
37
|
+
Output the plan in this structure. Be specific — cite real file paths and symbols you found in Step 2, not placeholders.
|
|
38
|
+
|
|
39
|
+
```markdown
|
|
40
|
+
## Plan — <one-line feature summary>
|
|
41
|
+
|
|
42
|
+
### Assumptions
|
|
43
|
+
- <each ambiguity you resolved, and how>
|
|
44
|
+
|
|
45
|
+
### Affected files & modules
|
|
46
|
+
- `path/to/file.ts` — <what changes and why>
|
|
47
|
+
- `path/to/other.ts` — <new function / modified signature>
|
|
48
|
+
|
|
49
|
+
### Proposed approach
|
|
50
|
+
<2-4 paragraphs describing the design: data flow, where new code lives,
|
|
51
|
+
how it hooks into existing patterns, and the public interface.>
|
|
52
|
+
|
|
53
|
+
### Trade-offs & alternatives
|
|
54
|
+
- **Chosen:** <approach> — <why it wins here>
|
|
55
|
+
- **Alternative:** <other option> — <why rejected / when it would be better>
|
|
56
|
+
|
|
57
|
+
### Risks & unknowns
|
|
58
|
+
- <thing that could break, perf concern, migration risk, or open question>
|
|
59
|
+
|
|
60
|
+
### Implementation steps
|
|
61
|
+
1. <smallest first, each independently reviewable>
|
|
62
|
+
2. ...
|
|
63
|
+
|
|
64
|
+
### Test plan
|
|
65
|
+
- <unit / integration tests to add, key cases & edge cases>
|
|
66
|
+
- <how to verify manually; commands to run>
|
|
67
|
+
```
|
|
68
|
+
|
|
69
|
+
## Report
|
|
70
|
+
|
|
71
|
+
Deliver the plan as your message — it is the whole deliverable. Keep it tight enough to read in one pass, specific enough to start coding from immediately, and end with the single recommended first step. Remember: no files were changed; this is a plan to act on, not the action.
|
|
@@ -0,0 +1,41 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: "Profile a Postgres workload to find the queries actually costing you — rank by total time with pg_stat_statements, EXPLAIN the worst offenders, and recommend the highest-leverage fix."
|
|
3
|
+
argument-hint: "<database/connection details, a slow endpoint, or a description of the workload>"
|
|
4
|
+
allowed-tools: "Read, Grep, Glob, Bash"
|
|
5
|
+
model: sonnet
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
## Scope
|
|
9
|
+
|
|
10
|
+
Treat `$ARGUMENTS` as the workload to profile — a database/connection, a slow endpoint or report, or a description of where the database feels slow. The job here is **triage**: find *which* queries cost the most before optimizing any one of them, so effort goes where it pays.
|
|
11
|
+
|
|
12
|
+
> [!NOTE]
|
|
13
|
+
> This command profiles a workload to rank its worst queries. To then fix a single slow query from its plan, hand off to the [sql-optimizer](/skills/data/sql-optimizer) skill; to choose the right index for it, the [postgres-index-strategist](/skills/database/postgres-index-strategist).
|
|
14
|
+
|
|
15
|
+
## Step 1 — Establish the data source
|
|
16
|
+
|
|
17
|
+
Prefer **`pg_stat_statements`** (the aggregated view of normalized query stats) — confirm the extension is enabled. If it isn't available, fall back to the slow-query log or a representative trace, and say so. Profiling against an empty dev database tells you nothing; use representative data and traffic.
|
|
18
|
+
|
|
19
|
+
## Step 2 — Rank by total cost, not just slowness
|
|
20
|
+
|
|
21
|
+
Pull the top queries by **`total_exec_time`** (total time spent across all calls) — the real cost driver — alongside `calls`, `mean_exec_time`, and `rows`. A fast query run a million times can outweigh a slow one run twice. Report the top offenders by total time and by call count.
|
|
22
|
+
|
|
23
|
+
## Step 3 — EXPLAIN the worst offenders
|
|
24
|
+
|
|
25
|
+
For each top query, run `EXPLAIN (ANALYZE, BUFFERS)` on a representative instance and read for the dominant cost: sequential scans on large filtered tables, estimate-vs-actual row blowups (stale statistics), nested loops over huge intermediates, or sorts spilling to disk.
|
|
26
|
+
|
|
27
|
+
## Step 4 — Classify the fix
|
|
28
|
+
|
|
29
|
+
For each, name the highest-leverage fix and route it:
|
|
30
|
+
|
|
31
|
+
- **Missing/wrong index** → an index recommendation (type matters — B-Tree vs. GIN vs. BRIN; see [postgres-index-strategist](/skills/database/postgres-index-strategist) and [Indexing Postgres at Scale](/guides/database/postgres-indexing-at-scale)).
|
|
32
|
+
- **Stale statistics** → `ANALYZE` the table before anything else.
|
|
33
|
+
- **A single slow query needing a rewrite** → [sql-optimizer](/skills/data/sql-optimizer).
|
|
34
|
+
- **App-side N+1** (same query, huge `calls`) → fix in the application (eager-load / batch), not the database.
|
|
35
|
+
|
|
36
|
+
## Step 5 — Report a prioritized plan
|
|
37
|
+
|
|
38
|
+
Produce a ranked table — query | total time | calls | mean | the diagnosis | the proposed fix — ordered by total cost so the team fixes the biggest win first. Quantify where you can ("this one query is 40% of total DB time").
|
|
39
|
+
|
|
40
|
+
> [!WARNING]
|
|
41
|
+
> Optimize by **total** time, not by the single slowest query. The query that dominates your database's load is often a moderately-fast one executed constantly — chasing the one query with the worst single-run time can spend effort where it barely moves the needle.
|
|
@@ -0,0 +1,45 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: "Red-team an LLM app or agent for prompt injection, jailbreaks, and data leakage — probe the real attack surface (input, RAG, tools, system prompt) with adversarial inputs and report what got through and how to fix it."
|
|
3
|
+
argument-hint: "<the app/endpoint/agent to test, or a description of its inputs, tools, and data>"
|
|
4
|
+
allowed-tools: "Read, Grep, Glob, Bash"
|
|
5
|
+
model: sonnet
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
## Scope
|
|
9
|
+
|
|
10
|
+
Treat `$ARGUMENTS` as the LLM app/agent to red-team — an endpoint, an agent, or a description of its inputs, tools, retrieved sources, and data. Restate the target and its attack surface in one sentence before probing.
|
|
11
|
+
|
|
12
|
+
> [!WARNING]
|
|
13
|
+
> Red-team only systems you are **authorized** to test. This command runs adversarial attacks; confirm you have permission for the target and use a non-production or isolated environment where possible. The aim is to find holes before an attacker does — on your own system.
|
|
14
|
+
|
|
15
|
+
Goal: probe the **real** attack surface with adversarial inputs, record what succeeds and its blast radius, and return prioritized fixes — an active attack campaign, complementary to the design review the [prompt-injection-auditor](/agents/quality-security/prompt-injection-auditor) performs.
|
|
16
|
+
|
|
17
|
+
## Step 1 — Map the attack surface
|
|
18
|
+
|
|
19
|
+
Enumerate every channel that reaches the model: direct user input, **retrieved/RAG content**, **tool outputs**, browsed pages or ingested files, and the system prompt. The indirect channels (content the system reads while working) are the ones most worth attacking.
|
|
20
|
+
|
|
21
|
+
## Step 2 — Choose attack categories
|
|
22
|
+
|
|
23
|
+
Cover the categories that matter for this target:
|
|
24
|
+
|
|
25
|
+
- **Direct prompt injection** — instruction-override in user input.
|
|
26
|
+
- **Indirect injection** — payloads planted in a document, tool result, or page the system ingests.
|
|
27
|
+
- **Jailbreak** — bypassing safety/policy constraints.
|
|
28
|
+
- **System-prompt leakage** — extracting the hidden instructions (LLM07).
|
|
29
|
+
- **Data exfiltration** — making the model reveal data or secrets it shouldn't.
|
|
30
|
+
- **Tool misuse** — inducing a harmful or out-of-scope tool call (for agents).
|
|
31
|
+
|
|
32
|
+
## Step 3 — Run the probes
|
|
33
|
+
|
|
34
|
+
Execute adversarial inputs for each category — automated with a red-teaming tool like [promptfoo](/tools/promptfoo) (injection/jailbreak suites) and/or targeted manual probes, including the **indirect** path (seed a poisoned document/tool result and see if the agent obeys it). Vary phrasings; a single failed attempt proves nothing.
|
|
35
|
+
|
|
36
|
+
## Step 4 — Record what got through and its blast radius
|
|
37
|
+
|
|
38
|
+
For each successful attack, capture the input, what the model did, and — critically — the **impact**: data leaked, action taken, constraint bypassed. Rank by blast radius (what it could actually cause), not by novelty.
|
|
39
|
+
|
|
40
|
+
## Step 5 — Recommend fixes and re-test
|
|
41
|
+
|
|
42
|
+
Map each finding to a mitigation — least privilege, human approval, trust boundaries, input/output guardrails, secrets out of context (see [Defending Against Prompt Injection](/guides/ai-safety/defending-prompt-injection)) — then re-run the successful attacks to confirm the fix contains them. An attack that now achieves nothing is the success criterion, not one you believe you blocked.
|
|
43
|
+
|
|
44
|
+
> [!NOTE]
|
|
45
|
+
> Report negatives honestly: state which attack categories you ran, which you didn't, and that passing today's probes is not proof of safety — red-teaming is continuous, because new bypasses appear. Gate releases on it, don't treat it as a one-time sign-off.
|