agentscamp 0.1.0
- package/LICENSE +21 -0
- package/README.md +64 -0
- package/content/agents/accessibility-auditor.md +66 -0
- package/content/agents/agent-architect.md +65 -0
- package/content/agents/agent-reliability-reviewer.md +40 -0
- package/content/agents/agent-tool-integration-engineer.md +38 -0
- package/content/agents/api-architect.md +84 -0
- package/content/agents/backend-developer.md +92 -0
- package/content/agents/browser-agent-engineer.md +37 -0
- package/content/agents/cloud-architect.md +72 -0
- package/content/agents/code-reviewer.md +69 -0
- package/content/agents/data-engineer.md +67 -0
- package/content/agents/data-scientist.md +79 -0
- package/content/agents/debugger.md +89 -0
- package/content/agents/dependency-manager.md +64 -0
- package/content/agents/devops-engineer.md +94 -0
- package/content/agents/documentation-engineer.md +52 -0
- package/content/agents/finetuning-engineer.md +43 -0
- package/content/agents/frontend-developer.md +78 -0
- package/content/agents/git-github-expert.md +66 -0
- package/content/agents/golang-pro.md +72 -0
- package/content/agents/graphql-architect.md +85 -0
- package/content/agents/kubernetes-specialist.md +87 -0
- package/content/agents/llm-cost-optimizer.md +39 -0
- package/content/agents/llm-evaluation-engineer.md +42 -0
- package/content/agents/llm-inference-engineer.md +42 -0
- package/content/agents/llm-integration-engineer.md +39 -0
- package/content/agents/llm-observability-engineer.md +41 -0
- package/content/agents/mcp-server-engineer.md +43 -0
- package/content/agents/ml-engineer.md +67 -0
- package/content/agents/mobile-developer.md +89 -0
- package/content/agents/performance-engineer.md +79 -0
- package/content/agents/postgres-migration-engineer.md +42 -0
- package/content/agents/prompt-engineer.md +58 -0
- package/content/agents/prompt-injection-auditor.md +42 -0
- package/content/agents/python-pro.md +77 -0
- package/content/agents/rag-pipeline-engineer.md +42 -0
- package/content/agents/react-specialist.md +83 -0
- package/content/agents/refactoring-specialist.md +78 -0
- package/content/agents/retrieval-engineer.md +41 -0
- package/content/agents/rust-pro.md +89 -0
- package/content/agents/security-auditor.md +78 -0
- package/content/agents/sql-pro.md +53 -0
- package/content/agents/sre-engineer.md +66 -0
- package/content/agents/system-architect.md +77 -0
- package/content/agents/terraform-specialist.md +73 -0
- package/content/agents/test-engineer.md +79 -0
- package/content/agents/typescript-pro.md +82 -0
- package/content/agents/vector-search-engineer.md +43 -0
- package/content/agents/voice-agent-engineer.md +38 -0
- package/content/agents/workflow-orchestrator.md +70 -0
- package/content/commands/add-docstrings.md +92 -0
- package/content/commands/add-human-approval.md +40 -0
- package/content/commands/add-mcp-server.md +50 -0
- package/content/commands/add-streaming-endpoint.md +34 -0
- package/content/commands/benchmark-rerankers.md +44 -0
- package/content/commands/breakdown-task.md +86 -0
- package/content/commands/commit.md +117 -0
- package/content/commands/create-pr.md +109 -0
- package/content/commands/db-migrate.md +47 -0
- package/content/commands/explain-code.md +71 -0
- package/content/commands/explain-error.md +98 -0
- package/content/commands/extract-function.md +107 -0
- package/content/commands/find-bug.md +93 -0
- package/content/commands/fix-failing-test.md +106 -0
- package/content/commands/new-component.md +119 -0
- package/content/commands/plan-feature.md +71 -0
- package/content/commands/profile-postgres-queries.md +41 -0
- package/content/commands/red-team-llm.md +45 -0
- package/content/commands/refactor.md +82 -0
- package/content/commands/review-pr.md +101 -0
- package/content/commands/run-evals.md +34 -0
- package/content/commands/scaffold-pgvector-schema.md +42 -0
- package/content/commands/scaffold-vllm-config.md +44 -0
- package/content/commands/security-scan.md +129 -0
- package/content/commands/set-perf-budget.md +47 -0
- package/content/commands/setup-claude-ci.md +60 -0
- package/content/commands/sync-branch.md +138 -0
- package/content/commands/update-readme.md +108 -0
- package/content/commands/write-tests.md +81 -0
- package/content/manifest.json +1709 -0
- package/content/skills/adr-writer.md +90 -0
- package/content/skills/branch-rebaser.md +86 -0
- package/content/skills/bundle-analyzer.md +77 -0
- package/content/skills/changelog-from-prs.md +81 -0
- package/content/skills/chunking-strategy-optimizer.md +34 -0
- package/content/skills/claude-settings-auditor.md +38 -0
- package/content/skills/conventional-commits.md +80 -0
- package/content/skills/coverage-gap-finder.md +72 -0
- package/content/skills/dead-code-finder.md +65 -0
- package/content/skills/dependency-audit.md +64 -0
- package/content/skills/embedding-index-tuner.md +34 -0
- package/content/skills/embedding-set-inspector.md +34 -0
- package/content/skills/finetune-dataset-builder.md +33 -0
- package/content/skills/graphrag-scaffolder.md +39 -0
- package/content/skills/hook-writer.md +39 -0
- package/content/skills/human-in-the-loop-gate.md +33 -0
- package/content/skills/llm-as-judge-scorer.md +33 -0
- package/content/skills/llm-eval-suite-scaffolder.md +30 -0
- package/content/skills/llm-guardrails-designer.md +33 -0
- package/content/skills/llm-output-schema-generator.md +32 -0
- package/content/skills/mcp-server-scaffolder.md +33 -0
- package/content/skills/mock-data-factory.md +75 -0
- package/content/skills/multimodal-document-extractor.md +39 -0
- package/content/skills/openapi-doc-writer.md +88 -0
- package/content/skills/plugin-scaffolder.md +38 -0
- package/content/skills/postgres-index-strategist.md +38 -0
- package/content/skills/pr-description.md +87 -0
- package/content/skills/prompt-cache-optimizer.md +34 -0
- package/content/skills/prompt-optimizer.md +40 -0
- package/content/skills/prompt-pii-redactor.md +33 -0
- package/content/skills/provider-fallback-wrapper.md +33 -0
- package/content/skills/qlora-finetune-runner.md +33 -0
- package/content/skills/readme-generator.md +84 -0
- package/content/skills/secret-scanner.md +65 -0
- package/content/skills/sql-optimizer.md +77 -0
- package/content/skills/test-scaffolder.md +74 -0
- package/content/skills/tool-definition-generator.md +33 -0
- package/content/skills/web-research-pipeline.md +39 -0
- package/dist/index.js +384 -0
- package/package.json +44 -0
|
@@ -0,0 +1,88 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: "openapi-doc-writer"
|
|
3
|
+
description: "Produce and maintain OpenAPI documentation for an HTTP API. Use when documenting endpoints, request/response schemas, or generating API reference docs."
|
|
4
|
+
version: 1.0.0
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
Author and maintain accurate, spec-compliant OpenAPI 3.1 documents that describe an HTTP API end to end — paths, operations, request bodies, responses, and reusable component schemas. This skill produces a single source of truth that drives reference docs, client SDK generation, and contract tests, while keeping the spec in sync with the actual code.
|
|
8
|
+
|
|
9
|
+
## When to use this skill
|
|
10
|
+
|
|
11
|
+
Use this skill when you need to:
|
|
12
|
+
|
|
13
|
+
- Document a new endpoint or a whole service in OpenAPI (YAML or JSON).
|
|
14
|
+
- Add or correct request/response schemas, parameters, headers, or status codes.
|
|
15
|
+
- Reconcile an existing spec with route handlers that have drifted from it.
|
|
16
|
+
- Generate a human-readable API reference or set up client/server code generation from the spec.
|
|
17
|
+
|
|
18
|
+
Skip it for internal RPC, GraphQL, or non-HTTP interfaces — OpenAPI does not model those well.
|
|
19
|
+
|
|
20
|
+
## Instructions
|
|
21
|
+
|
|
22
|
+
Follow these steps in order.
|
|
23
|
+
|
|
24
|
+
1. **Locate or create the spec.** Look for an existing `openapi.yaml`, `openapi.json`, or `swagger.*`. If none exists, create `openapi.yaml` with `openapi: 3.1.0`, an `info` block (`title`, `version`), and a `servers` list. Prefer YAML for readability.
|
|
25
|
+
2. **Inventory the endpoints.** Read the route definitions / controllers to enumerate every method + path, its parameters, request body shape, and all possible responses (including errors). Treat the code as the source of truth when it conflicts with stale docs.
|
|
26
|
+
3. **Model reusable schemas first.** Define shared object shapes under `components/schemas` and reference them with `$ref`. Never inline the same object twice. Mark fields `required` deliberately and express nullability with JSON Schema type arrays (e.g. `type: [string, "null"]`) — the `nullable` keyword was removed in OpenAPI 3.1.
|
|
27
|
+
4. **Write each operation.** Under `paths`, give every operation an `operationId` (unique, camelCase), a one-line `summary`, `tags` for grouping, typed parameters, a `requestBody` where applicable, and a `responses` map covering success and documented error codes (e.g. `400`, `401`, `404`, `422`).
|
|
28
|
+
5. **Add examples.** Provide at least one realistic `example` (or `examples`) per request body and key response. Examples must validate against their schema.
|
|
29
|
+
6. **Validate.** Run a linter such as `redocly lint` or `spectral lint` and fix every error and warning before finishing.
|
|
30
|
+
7. **Render or generate (if requested).** Produce reference HTML or client/server stubs from the validated spec.
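
The per-operation requirements in step 4 can be spot-checked before running a full linter. The sketch below is illustrative only and does not replace `redocly lint` or `spectral lint`. It assumes the spec has already been parsed into a plain dictionary, and the helper name `check_operations` is hypothetical.

```python
# Minimal sketch: flag operations that lack an operationId, summary, or responses.
# Assumes the OpenAPI document is already loaded into a dict (e.g. from YAML or JSON).
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def check_operations(spec: dict) -> list[str]:
    """Return human-readable problems found under `paths` (empty list = none found)."""
    problems = []
    seen_ids = {}
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as `parameters` or `summary`
            where = f"{method.upper()} {path}"
            op_id = op.get("operationId")
            if not op_id:
                problems.append(f"{where}: missing operationId")
            elif op_id in seen_ids:
                problems.append(
                    f"{where}: operationId {op_id!r} already used by {seen_ids[op_id]}"
                )
            else:
                seen_ids[op_id] = where
            if not op.get("summary"):
                problems.append(f"{where}: missing summary")
            if not op.get("responses"):
                problems.append(f"{where}: missing responses")
    return problems


# Illustrative input: the DELETE operation reuses an operationId and is incomplete.
spec = {
    "openapi": "3.1.0",
    "paths": {
        "/users/{id}": {
            "get": {
                "operationId": "getUserById",
                "summary": "Retrieve a single user",
                "responses": {"200": {"description": "The requested user"}},
            },
            "delete": {"operationId": "getUserById", "responses": {}},
        }
    },
}

for problem in check_operations(spec):
    print(problem)
```

A check like this only covers presence and uniqueness. Schema validity and example validation still need a real linter.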
|
|
31
|
+
|
|
32
|
+
> [!NOTE]
|
|
33
|
+
> When you need exact field placement, data-type keywords, or security-scheme syntax, consult the official OpenAPI 3.1 specification rather than guessing.
|
|
34
|
+
|
|
35
|
+
> [!WARNING]
|
|
36
|
+
> Keep `info.version` in step with releases and bump it on any breaking schema change. Downstream SDK generators and contract tests key off it.
|
|
37
|
+
|
|
38
|
+
## Examples
|
|
39
|
+
|
|
40
|
+
Documenting `GET /users/{id}` with a reusable schema and error response:
|
|
41
|
+
|
|
42
|
+
```yaml
paths:
  /users/{id}:
    get:
      operationId: getUserById
      summary: Retrieve a single user
      tags: [Users]
      parameters:
        - name: id
          in: path
          required: true
          schema: { type: string, format: uuid }
      responses:
        "200":
          description: The requested user
          content:
            application/json:
              schema: { $ref: "#/components/schemas/User" }
              example: { id: "9f1c...", email: "ada@example.com", active: true }
        "404":
          description: User not found
          content:
            application/json:
              schema: { $ref: "#/components/schemas/Error" }

components:
  schemas:
    User:
      type: object
      required: [id, email]
      properties:
        id: { type: string, format: uuid }
        email: { type: string, format: email }
        active: { type: boolean, default: true }
    Error:
      type: object
      required: [code, message]
      properties:
        code: { type: integer }
        message: { type: string }
```
|
|
83
|
+
|
|
84
|
+
Validate before committing:
|
|
85
|
+
|
|
86
|
+
```bash
|
|
87
|
+
npx @redocly/cli lint openapi.yaml
|
|
88
|
+
```
|
|
@@ -0,0 +1,38 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: "plugin-scaffolder"
|
|
3
|
+
description: "Scaffold a complete, valid Claude Code plugin from a description — the .claude-plugin/plugin.json manifest, component directories (skills, agents, commands, hooks, MCP config), portable ${CLAUDE_PLUGIN_ROOT} wiring, a local test loop with --plugin-dir, and a marketplace.json for distribution. Use when turning scattered .claude/ customizations into one installable, versioned package."
|
|
4
|
+
allowed-tools: "Read, Grep, Glob, Write, Edit, Bash"
|
|
5
|
+
version: 1.0.0
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
A Claude Code plugin is mostly *layout discipline*: the manifest goes in `.claude-plugin/`, every component directory goes at the plugin root, paths must use `${CLAUDE_PLUGIN_ROOT}` to survive installation, and the marketplace entry has its own schema. This skill encodes that discipline — describe the plugin (or point at the `.claude/` setup you want to package) and get a valid, testable scaffold.
|
|
9
|
+
|
|
10
|
+
## When to use this skill
|
|
11
|
+
|
|
12
|
+
- You're starting a plugin and want the structure, manifest, and one working example of each component generated correctly the first time.
|
|
13
|
+
- You have customizations scattered across `.claude/agents/`, `.claude/skills/`, hooks, and an `.mcp.json`, and want them migrated into one installable package.
|
|
14
|
+
- You're setting up a team or personal **marketplace** repo and need the `marketplace.json` wired so `/plugin install` works.
|
|
15
|
+
|
|
16
|
+
## When NOT to use this skill
|
|
17
|
+
|
|
18
|
+
- You're sharing a *single* procedure — one [skill file](/guides/skills/writing-your-first-skill) needs no plugin around it.
|
|
19
|
+
- The customization is for exactly one repo and travels with it — checked-in `.claude/` files already do that; packaging adds versioning overhead you don't need yet. See [the plugins guide](/guides/configuration/claude-code-plugins) for the dividing line.
|
|
20
|
+
|
|
21
|
+
## Instructions
|
|
22
|
+
|
|
23
|
+
1. **Inventory what the plugin carries.** From the description (or by reading the existing `.claude/` directory), list the components: skills, agents, commands, hooks, MCP servers, LSP config. Confirm the plugin's name (kebab-case, unique — it becomes the namespace prefix users type, e.g. `/my-plugin:release-notes`).
|
|
24
|
+
2. **Scaffold the layout exactly.** Create `.claude-plugin/plugin.json` — **only the manifest lives in that folder** — and component directories at the plugin root: `skills/<name>/SKILL.md`, `agents/*.md`, `commands/*.md`, `hooks/hooks.json`, `.mcp.json`. Manifest gets `name` (required), plus `version`, `description`, `author`, and `repository` so marketplaces render it properly.
|
|
25
|
+
3. **Write working samples, not lorem ipsum.** Each requested component gets a minimal but real implementation derived from the user's description — a skill with actual instructions, a hook with a functioning script. Migrating existing files? Copy them in, then fix what packaging breaks (next step).
|
|
26
|
+
4. **Make paths portable.** Anything referencing files inside the plugin uses `${CLAUDE_PLUGIN_ROOT}` (the install location changes and moves on update); anything writing caches or generated state uses `${CLAUDE_PLUGIN_DATA}` (survives updates); anything touching the user's project uses `${CLAUDE_PROJECT_DIR}`. Hardcoded relative paths are the #1 way plugins break after install.
|
|
27
|
+
5. **Test, then validate.** Run the local loop: `claude --plugin-dir ./<plugin>` to load it for a session, exercise each component (the namespaced command, the hook trigger, the MCP connection), iterate with `/reload-plugins`. Finish with `claude plugin validate ./<plugin> --strict` and fix every warning.
|
|
28
|
+
6. **Wire distribution.** Generate or update the `marketplace.json` (in this repo or the user's marketplace repo) with the plugin's entry and source. State the consumer install path explicitly: `/plugin marketplace add <owner>/<repo>` then `/plugin install <name>@<marketplace>` — and for teams, note the project-scope install that gives every clone the plugin after a trust prompt.
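
The layout rules in steps 1 and 2 can be sketched as a small generator. This is a rough illustration, not the skill's actual implementation. The function name `scaffold_plugin` and the default version are hypothetical, and the manifest carries only fields named above. `author` and `repository` are left out because their exact shape should be taken from the plugin documentation rather than guessed.

```python
import json
import re
import tempfile
from pathlib import Path

# Kebab-case: lowercase alphanumeric words joined by single hyphens.
KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def scaffold_plugin(parent: Path, name: str, description: str, version: str = "0.1.0") -> Path:
    """Create a skeleton plugin directory under `parent` and return its path."""
    if not KEBAB_CASE.match(name):
        raise ValueError(f"plugin name must be kebab-case, got {name!r}")
    root = parent / name
    # Only the manifest lives in .claude-plugin/.
    manifest_dir = root / ".claude-plugin"
    manifest_dir.mkdir(parents=True)
    manifest = {"name": name, "version": version, "description": description}
    (manifest_dir / "plugin.json").write_text(json.dumps(manifest, indent=2) + "\n")
    # Component directories sit at the plugin root, not inside .claude-plugin/.
    for sub in ("skills", "agents", "commands", "hooks"):
        (root / sub).mkdir()
    return root


# Usage: scaffold into a throwaway directory and list what was created.
with tempfile.TemporaryDirectory() as tmp:
    plugin_root = scaffold_plugin(Path(tmp), "release-helper", "Example plugin")
    for created in sorted(p.relative_to(plugin_root).as_posix() for p in plugin_root.rglob("*")):
        print(created)
```

The sketch deliberately writes no `hooks/hooks.json` or `.mcp.json` content, since those files have their own schemas.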
|
|
29
|
+
|
|
30
|
+
> [!WARNING]
|
|
31
|
+
> A plugin executes on other people's machines: its hooks run shell commands and its MCP servers receive credentials. Don't bundle secrets (use env expansion), pin any third-party servers it pulls in, and keep the manifest's `repository` honest so consumers can read the source they're trusting.
|
|
32
|
+
|
|
33
|
+
> [!TIP]
|
|
34
|
+
> Keep version discipline from day one — bump `version` on every behavioral change. Marketplaces surface it, and "which version are you on?" is the first debugging question you'll ask a teammate.
|
|
35
|
+
|
|
36
|
+
## Output
|
|
37
|
+
|
|
38
|
+
A complete plugin directory that passes `claude plugin validate --strict`: manifest, all requested components implemented, portable path variables throughout, plus the `marketplace.json` entry and a short INSTALL note covering the marketplace-add, install, and local `--plugin-dir` test commands.
|
|
@@ -0,0 +1,38 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: "postgres-index-strategist"
|
|
3
|
+
description: "Recommend the right Postgres index for a query or workload — choosing B-Tree vs. GIN vs. BRIN vs. partial/covering/expression, checking for redundant or unused indexes, and verifying the choice against the query plan. Use when a query needs an index, when deciding an index type for jsonb/array/full-text/time-series data, or when auditing an over-indexed table."
|
|
4
|
+
allowed-tools: "Read, Grep, Glob, Bash"
|
|
5
|
+
version: 1.0.0
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
Most Postgres index problems are one of two mistakes: reaching for B-Tree when the column is multi-value (jsonb, array, full-text) and a GIN would be transformative, or piling on speculative indexes that tax every write for reads that never happen. This skill matches the **index type to the query and the data shape**, prunes the indexes that aren't earning their keep, and verifies the choice against the actual plan — so you add the index that helps and skip the one that just costs.
|
|
9
|
+
|
|
10
|
+
## When to use this skill
|
|
11
|
+
|
|
12
|
+
- A query is slow and you suspect a missing or wrong-type index.
|
|
13
|
+
- You're indexing `jsonb`, arrays, full-text (`tsvector`), trigram/`ILIKE`, or a huge time-series table and need to choose between B-Tree, GIN, and BRIN.
|
|
14
|
+
- A table feels over-indexed — slow writes, lots of indexes — and you want to find redundant or unused ones to drop.
|
|
15
|
+
- Designing indexes for a new table's expected query patterns.
|
|
16
|
+
|
|
17
|
+
## Instructions
|
|
18
|
+
|
|
19
|
+
1. **Start from the query, not the column.** Collect the actual `WHERE`, `JOIN`, `ORDER BY`, and the operators used (`=`, range, `@>`, `@@`, `ILIKE`, array membership). The operator and selectivity decide the index type — index the workload, not the schema in the abstract.
|
|
20
|
+
2. **Match type to shape.**
|
|
21
|
+
- **B-Tree** — scalar equality, ranges, sorting, uniqueness (the default; most indexes).
|
|
22
|
+
- **GIN** — `jsonb` containment, array membership, full-text `tsvector`, trigram (`pg_trgm`) for fuzzy/`ILIKE '%x%'`.
|
|
23
|
+
- **BRIN** — very large tables physically ordered by the column (time-series, append-only by `created_at`/monotonic id).
|
|
24
|
+
- **Partial** (`WHERE`) when queries always filter a subset; **covering** (`INCLUDE`) for index-only scans; **expression** index for `lower(col)` / `date(col)` predicates.
|
|
25
|
+
3. **Get multi-column order right.** For composite B-Tree indexes, put equality columns before range/sort columns, and lead with the column queries filter on. If a query does not constrain the leading column, the planner generally cannot use the index efficiently for it.
|
|
26
|
+
4. **Check for redundancy and waste before adding.** Inspect existing indexes (`\d table`, `pg_indexes`) and usage (`pg_stat_user_indexes` — `idx_scan = 0` is unused). Don't add an index whose job a prefix of an existing one already does; flag redundant/unused indexes to drop (with `DROP INDEX CONCURRENTLY`).
|
|
27
|
+
5. **Verify against the plan.** Apply the index (on a copy or with `CONCURRENTLY`) and re-run `EXPLAIN (ANALYZE, BUFFERS)` to confirm the planner uses it and the cost drops. An index the planner ignores — wrong type, non-sargable predicate, poor selectivity — is not a fix; reconsider rather than keep it.
|
|
28
|
+
6. **State the write cost.** Every index slows writes and uses storage. Recommend the smallest set that serves the queries, and name the trade for each index kept.
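
The prefix rule in step 4 (do not add an index whose job a prefix of an existing one already does) can be sketched as a small check. This is a simplified illustration: it assumes plain B-Tree indexes described only by their ordered column lists and ignores uniqueness, partial predicates, expressions, and `INCLUDE` columns, all of which can make a seemingly redundant index worth keeping. The index names are hypothetical.

```python
def redundant_indexes(indexes: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (redundant, covered_by) pairs where one index's columns are a
    leading prefix of another index's columns."""
    pairs = []
    for name, cols in indexes.items():
        for other, other_cols in indexes.items():
            if name == other:
                continue
            is_prefix = other_cols[: len(cols)] == cols
            # For identical column lists, report only one direction.
            if is_prefix and (len(cols) < len(other_cols) or name > other):
                pairs.append((name, other))
    return pairs


# Hypothetical indexes on an `orders` table, as ordered column lists.
indexes = {
    "idx_orders_customer": ["customer_id"],
    "idx_orders_customer_created": ["customer_id", "created_at"],
    "idx_orders_status": ["status"],
}

for redundant, covered_by in redundant_indexes(indexes):
    print(f"{redundant} is a prefix of {covered_by}; candidate for DROP INDEX CONCURRENTLY")
```

A flagged index is only a candidate. Confirm with `pg_stat_user_indexes` and the query plan before dropping anything.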
|
|
29
|
+
|
|
30
|
+
> [!WARNING]
|
|
31
|
+
> An index only helps a **sargable** predicate whose leading column matches. `WHERE date(created_at) = …` or `WHERE email ILIKE '%@acme.com'` can't use a plain B-Tree — fix the predicate or use the right index (expression index, or GIN+trigram) instead of adding one the planner will ignore.
|
|
32
|
+
|
|
33
|
+
> [!NOTE]
|
|
34
|
+
> This skill covers scalar/text indexing. For nearest-neighbour search over embeddings stored in Postgres, the index is HNSW/IVFFlat via [pgvector](/tools/pgvector) — tune those parameters with the [Embedding Index Tuner](/skills/database/embedding-index-tuner) instead.
|
|
35
|
+
|
|
36
|
+
## Output
|
|
37
|
+
|
|
38
|
+
A concrete index recommendation: the index type and definition (with column order), the rationale tied to the query and data shape, any redundant/unused indexes to drop, and an EXPLAIN before/after confirming the planner uses it and the cost fell — plus the write-cost trade-off for each index kept.
|
|
@@ -0,0 +1,87 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: "pr-description"
|
|
3
|
+
description: "Draft a clear pull request description from the branch diff against its base. Use when you have a finished branch and want a reviewer-ready PR body before opening the PR."
|
|
4
|
+
allowed-tools: "Read, Bash"
|
|
5
|
+
version: 1.0.0
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
Turn the diff between your branch and its base into a reviewer-ready pull request description. The skill computes the real changeset with `git diff --merge-base`, reads the touched code and the commit log, and drafts a structured body: a one-line summary, what changed and *why*, notable implementation notes, how it was tested, and risk/rollout. It is strictly read-only — it produces text for you to paste, it does not open or modify the PR.
|
|
9
|
+
|
|
10
|
+
## When to use this skill
|
|
11
|
+
|
|
12
|
+
- You have a finished branch and want a clear PR body before opening the pull request.
|
|
13
|
+
- An existing PR description is thin ("misc fixes") and a reviewer needs the real story.
|
|
14
|
+
- You want the *why* and the test evidence written down, not just a list of file names.
|
|
15
|
+
- You are about to request review and want to front-load the context reviewers always ask for.
|
|
16
|
+
|
|
17
|
+
> [!NOTE]
|
|
18
|
+
> This drafts text only. It never runs `gh pr create`, pushes, or edits the PR — copy the output into your PR yourself (or hand it to the `create-pr` command). The "how it was tested" section reports what the diff and history *show*; confirm the claims match what you actually ran.
|
|
19
|
+
|
|
20
|
+
## Instructions
|
|
21
|
+
|
|
22
|
+
1. **Find the base and the diff.** Determine the branch's merge base and capture the full changeset. Prefer the merge-base form so unrelated changes already on the base branch are excluded. The commands here use `main`; substitute the default branch detected in step 2 if it differs:
   ```bash
   git diff --merge-base origin/main
   ```
   Fall back in order if that fails: `git diff --merge-base main`, then `git merge-base HEAD origin/main` + `git diff <base>..HEAD`, then `git diff main...HEAD`. If you still cannot resolve a base, ask the user which branch to diff against rather than guessing.
|
|
27
|
+
2. **Detect the base branch — do not assume `main`.** Read `git remote show origin | grep "HEAD branch"` (or `git symbolic-ref refs/remotes/origin/HEAD`) to find the real default branch; many repos use `master`, `develop`, or `trunk`. Use that name everywhere below.
|
|
28
|
+
3. **Read the commit narrative.** Run `git log $(git merge-base HEAD origin/<base>)..HEAD --oneline` and `git diff --merge-base origin/<base> --stat` (substituting the real base name from step 2) to see the scope and the author's own framing. Skim the actual hunks of the largest or most behavior-changing files — the summary must describe intent, not just churn.
|
|
29
|
+
4. **Detect existing PR conventions.** Check for a `PULL_REQUEST_TEMPLATE.md` in `.github/`, `docs/`, or the repository root, and mirror its headings, checklists, and required sections exactly. If the repo uses a template, fill it in rather than imposing your own structure.
|
|
30
|
+
5. **Draft the body** with these sections (or the template's equivalents):
|
|
31
|
+
- **Summary** — one imperative line a reviewer could read in the merge log.
|
|
32
|
+
- **What changed & why** — the motivation and the approach, grouped by concern, not a file dump. Explain *why* this approach over the obvious alternative when it is not self-evident.
|
|
33
|
+
- **Implementation notes** — non-obvious decisions, new dependencies, migrations, follow-ups deliberately left out of scope.
|
|
34
|
+
- **Testing** — what was added or run. Cite real signals: new test files in the diff, a CI config, or commands the user can reproduce. Do **not** claim a test ran if the diff shows no test.
|
|
35
|
+
- **Risk & rollout** — blast radius, backward-compat or migration steps, feature flags, and how to roll back.
|
|
36
|
+
6. **Verify the draft against the diff.** Cross-check every claim: does each "added X" map to a real hunk? Are migration/`.env`/breaking changes mentioned if the diff touches schemas, configs, or public signatures? Re-run a focused `git diff --merge-base origin/<base> -- <path>` (using the real base from step 2) to confirm anything you are unsure about.
|
|
37
|
+
7. **Report and flag gaps.** Output the finished markdown body. Below it, flag what you could *not* infer — missing test coverage for changed files, an empty "why", or risky changes (deleted migrations, dependency bumps) the author should address before requesting review.
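
Steps 1 and 2 can be sketched as two small helpers: one that reads the default branch out of `git remote show origin` output, and one that lists the diff commands to try in order. This is an illustrative sketch, not the skill's implementation. The function names and the sample output are hypothetical, and the two-step `git merge-base` fallback is omitted because it needs the first command's output fed into the second.

```python
import re
from typing import Optional


def parse_default_branch(remote_show_output: str) -> Optional[str]:
    """Extract the branch name from the `HEAD branch:` line, if present."""
    match = re.search(r"^\s*HEAD branch:\s*(\S+)\s*$", remote_show_output, re.MULTILINE)
    # Treat an undetermined HEAD (reported as "(unknown)") as no answer.
    if match is None or match.group(1) == "(unknown)":
        return None
    return match.group(1)


def diff_commands(base: str) -> list[list[str]]:
    """Candidate diff commands, in the order they should be tried."""
    return [
        ["git", "diff", "--merge-base", f"origin/{base}"],
        ["git", "diff", "--merge-base", base],
        ["git", "diff", f"{base}...HEAD"],
    ]


# Hypothetical output of `git remote show origin` for a repo whose default is `develop`.
sample = """* remote origin
  Fetch URL: git@example.com:acme/api.git
  Push  URL: git@example.com:acme/api.git
  HEAD branch: develop
  Remote branch:
    develop tracked
"""

base = parse_default_branch(sample)
if base is None:
    print("Could not detect the default branch; ask which branch to diff against.")
else:
    for command in diff_commands(base):
        print(" ".join(command))
```

Each candidate would be passed to `subprocess.run` in turn, stopping at the first one that exits successfully.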
|
|
38
|
+
|
|
39
|
+
> [!WARNING]
|
|
40
|
+
> If the diff is empty, the branch is identical to its base — stop and tell the user (they likely have not committed, or are on the base branch itself). Never invent a changelog for a branch with no changes.
|
|
41
|
+
|
|
42
|
+
## Examples
|
|
43
|
+
|
|
44
|
+
Given a branch `feat/rate-limit` diffed with `git diff --merge-base origin/main --stat`:
|
|
45
|
+
|
|
46
|
+
```text
 src/middleware/rateLimit.ts      | 84 +++++++++++++++++++++
 src/middleware/rateLimit.test.ts | 52 ++++++++++++
 src/server.ts                    |  6 ++
 package.json                     |  1 +
 4 files changed, 143 insertions(+)
```
|
|
53
|
+
|
|
54
|
+
The skill drafts this PR body:
|
|
55
|
+
|
|
56
|
+
```markdown
## Summary
Add a sliding-window rate limiter to the public API to absorb bursty clients.

## What changed & why
- New `rateLimit` middleware (`src/middleware/rateLimit.ts`) enforces 100 req/min
  per API key using a sliding window backed by the existing Redis client.
- Wired into the request pipeline in `src/server.ts`, ahead of route handlers so
  rejected requests never hit business logic.

Chose a sliding window over a fixed-window counter to avoid the double-burst
problem at window boundaries; the Redis sorted-set approach reuses the
connection pool we already provision, so no new infra.

## Implementation notes
- Adds one dependency: `ms` for human-readable window config.
- Limit is read from `RATE_LIMIT_PER_MIN` (defaults to 100); no migration.
- Per-route overrides are intentionally out of scope; tracked in #421.

## Testing
- `src/middleware/rateLimit.test.ts` covers under-limit, at-limit, over-limit,
  and window-expiry cases.
- Run locally: `npx vitest run src/middleware/rateLimit.test.ts`.

## Risk & rollout
- Low blast radius: middleware is fail-open. If Redis is unreachable it logs and
  allows the request, so an outage degrades to today's behavior.
- Rollback: revert this PR; no schema or data changes.
- Heads-up: set `RATE_LIMIT_PER_MIN` in prod before merge if 100 is too low.
```
|
|
86
|
+
|
|
87
|
+
Then it flags any gaps, e.g.: *`src/server.ts` changed but is not covered by a test — confirm the wiring manually, and document the new `RATE_LIMIT_PER_MIN` env var in the README.*
|
|
@@ -0,0 +1,34 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: "prompt-cache-optimizer"
|
|
3
|
+
description: "Restructure an LLM call to maximize prompt-cache hit rate and add response/semantic caching — move the stable prefix (system prompt, instructions, few-shot, context) to the front and variable input to the end, set cache breakpoints, and measure the hit rate and savings. Use when repeated calls share large common context and token cost or latency is too high."
|
|
4
|
+
allowed-tools: "Read, Grep, Glob, Edit, Write, Bash"
|
|
5
|
+
version: 1.0.0
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
Most providers cache the **longest common prefix** of your prompt: send the same opening tokens again within the cache window and you pay a fraction of the price and get a faster first token. The catch is that caching is prefix-based and order-sensitive — one varying token near the top busts the whole cache. This skill restructures calls so the cache actually hits, and adds higher-level caching where it pays.
|
|
9
|
+
|
|
10
|
+
## When to use this skill
|
|
11
|
+
|
|
12
|
+
- Many calls share a large, stable chunk — a long system prompt, a fixed instruction block, few-shot examples, a retrieved document, or a tool schema.
|
|
13
|
+
- Token cost is dominated by **input** tokens repeated across calls.
|
|
14
|
+
- Time-to-first-token is too slow on prompts with a big static preamble.
|
|
15
|
+
- You have repeated or near-duplicate queries that could be served from a response cache instead of the model.
|
|
16
|
+
|
|
17
|
+
## Instructions
|
|
18
|
+
|
|
19
|
+
1. **Confirm how the target provider caches.** Check whether it's automatic prefix caching or requires explicit cache breakpoints/control, the minimum cacheable length, the cache TTL/window, and the discount on cached tokens. The strategy follows from the mechanism — don't assume one provider's rules apply to another.
|
|
20
|
+
2. **Put the stable prefix first.** Order the prompt **static → dynamic**: system prompt, durable instructions, few-shot examples, tool definitions, and long shared context at the top; the per-request user input and anything that changes every call at the **end**. The goal is the longest possible identical prefix across calls.
|
|
21
|
+
3. **Hunt for cache-busters near the top.** A timestamp, a request ID, a per-user name, or shuffled few-shot order in the preamble invalidates the prefix for every call. Move all of it below the cacheable block, or remove it.
|
|
22
|
+
4. **Set cache breakpoints where supported.** On providers with explicit cache control, mark the end of the stable block so the prefix up to that point is cached; keep the marked prefix byte-for-byte identical between requests.
|
|
23
|
+
5. **Add response/semantic caching above the model.** For exact-repeat queries, cache the full response keyed on the normalized request. For near-duplicate queries (FAQs, classification), consider semantic caching at the gateway ([Portkey](/tools/portkey), [Helicone](/tools/helicone)) — with a TTL and invalidation that match how often the underlying answer changes.
|
|
24
|
+
6. **Measure the hit rate and the savings.** Instrument cached vs. uncached tokens (or cache-hit count) and compare cost and time-to-first-token before and after. If you can't see a cache's hit rate, you can't trust it, so report the real numbers, not the theoretical discount.
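
The static-to-dynamic ordering in steps 2 and 3 can be illustrated with a small sketch. It is provider-agnostic and makes two simplifying assumptions: shared characters stand in for shared tokens (providers cache on tokens, so this is only a proxy), and sorting is used as one simple way to keep few-shot order deterministic. The function names and prompt text are hypothetical.

```python
import os


def build_prompt(system: str, instructions: str, few_shot: list[str], user_input: str) -> str:
    """Order the prompt static -> dynamic so repeated calls share a long prefix."""
    # Sorting pins the few-shot order so it cannot vary between calls.
    static_prefix = "\n\n".join([system, instructions, *sorted(few_shot)])
    return f"{static_prefix}\n\n{user_input}"


def shared_prefix_chars(a: str, b: str) -> int:
    """Length in characters of the longest common prefix of two prompts."""
    return len(os.path.commonprefix([a, b]))


system = "You are a support assistant."
instructions = "Answer in two sentences."
shots = ["Q: reset password? A: Use the reset link.", "Q: billing date? A: The 1st."]

# Two calls with different user input and differently ordered few-shot lists.
p1 = build_prompt(system, instructions, shots, "How do I export data?")
p2 = build_prompt(system, instructions, list(reversed(shots)), "Where is my invoice?")

# A cache-buster: a hypothetical per-request ID placed ahead of the stable block.
bad1 = "request-id: 1\n\n" + p1
bad2 = "request-id: 2\n\n" + p2

print("shared prefix, static first:", shared_prefix_chars(p1, p2))
print("shared prefix, request ID on top:", shared_prefix_chars(bad1, bad2))
```

Comparing the two numbers shows why a single varying value near the top matters: everything after it stops being part of the common prefix.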
|
|
25
|
+
|
|
26
|
+
> [!WARNING]
|
|
27
|
+
> Don't cache what shouldn't be reused. Response/semantic caches can serve a stale or wrong answer for an input that *looks* similar but isn't (different user, different entitlements, time-sensitive data). Scope the cache key correctly and set a TTL that matches volatility — a cache bug is a correctness bug, not just a cost one.
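
Scoping the cache key and bounding its lifetime can be sketched as follows. This is a minimal in-memory illustration of exact-match response caching only; it does not cover semantic caching and is not any gateway's API. The names `cache_key` and `TTLCache`, the choice of tenant as the scope, and the normalization rule (lowercase, collapsed whitespace) are all assumptions.

```python
import hashlib
import time


def cache_key(user_scope: str, query: str) -> str:
    """Key on the caller's scope plus the normalized query text."""
    normalized = " ".join(query.lower().split())
    # A NUL separator keeps scope and query from running into each other.
    return hashlib.sha256(f"{user_scope}\x00{normalized}".encode("utf-8")).hexdigest()


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def put(self, key, value):
        self._store[key] = (self.clock() + self.ttl, value)


cache = TTLCache(ttl_seconds=300)
key_a = cache_key("tenant-a", "What is my plan?")
cache.put(key_a, "Tenant A is on the Pro plan.")

# Same text from a different tenant must not reuse tenant A's answer.
key_b = cache_key("tenant-b", "What is my plan?")
print(cache.get(key_a) is not None, cache.get(key_b) is None)
```

Including the scope in the key is what prevents one user's answer from being served to another. The TTL should follow how quickly the underlying answer can change.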
|
|
28
|
+
|
|
29
|
+
> [!NOTE]
|
|
30
|
+
> Prompt caching changes economics but not quality: the model sees the same tokens, just cheaper and faster. Pair this with model right-sizing and prompt trimming (the [llm-cost-optimizer](/agents/data-ai/llm-cost-optimizer)) for the full cost win, and see [LLM Cost and Latency Engineering](/guides/advanced/llm-cost-latency-engineering) for the broader playbook.
|
|
31
|
+
|
|
32
|
+
## Output
|
|
33
|
+
|
|
34
|
+
The restructured prompt (static prefix first, variable input last, cache breakpoints set where supported), any response/semantic caching added with its key and TTL, and a before/after measurement of cache-hit rate, input-token cost, and time-to-first-token — so the change is proven, not assumed.
|
|
@@ -0,0 +1,40 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: "prompt-optimizer"
|
|
3
|
+
description: "Diagnose why a prompt underperforms and rewrite it with the technique that fixes it — clearer structure, few-shot examples, an explicit output contract, or reasoning scaffolding — returning an optimized prompt, the rationale for every change, and what to measure to confirm the lift. Use when a prompt is flaky, verbose, drifting in format, or just not good enough."
|
|
4
|
+
allowed-tools: "Read, Grep, Glob, Edit, Write"
|
|
5
|
+
version: 1.0.0
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
Give this skill an underperforming prompt and it returns an optimized one — with the reasoning. It works the way a good prompt engineer does on a single prompt: figure out *which* failure mode you're hitting, apply the *one* technique that addresses it, and say what to measure so the change is verified rather than assumed. It optimizes the prompt in front of it; it does not invent requirements you didn't state.
|
|
9
|
+
|
|
10
|
+
## When to use this skill
|
|
11
|
+
|
|
12
|
+
- A prompt is flaky — inconsistent output, format drift, occasional hallucination or refusal.
|
|
13
|
+
- Output isn't reliably parseable, or doesn't follow the structure your code expects.
|
|
14
|
+
- A prompt works but is bloated — too many tokens, redundant instructions, over-long examples.
|
|
15
|
+
- You want a stronger first draft of a prompt for a well-defined task before wiring evals around it.
|
|
16
|
+
|
|
17
|
+
## When NOT to use this skill
|
|
18
|
+
|
|
19
|
+
- You need the full lifecycle — build an eval set, baseline, iterate, and gate in CI. That's the **prompt-engineer** agent; this skill optimizes one prompt, it doesn't own the regression suite.
|
|
20
|
+
- You want prompts *compiled* automatically against a metric and dataset across a multi-step pipeline. That's programmatic optimization with [DSPy](/tools/dspy) — see [Programmatic Prompt Optimization with DSPy](/guides/prompting/dspy-prompt-optimization).
|
|
21
|
+
|
|
22
|
+
## Instructions
|
|
23
|
+
|
|
24
|
+
1. **Diagnose the failure mode first.** Read the prompt and any failing outputs and name the specific problem before changing anything: vague/ambiguous instructions, format drift, missing examples, no output contract, weak reasoning on multi-step cases, or simply token bloat. The fix follows from the diagnosis, so don't apply techniques scattershot.
|
|
25
|
+
2. **Fix structure before wording.** Lead with the role and the single job. Separate instructions from data with sections or delimiters (`# Task`, `# Rules`, `<input>…</input>`) so the model can't confuse them. State the output format explicitly and put the most important constraint where it won't get buried. Prefer positive instructions ("respond with only the JSON object") over a wall of "do not."
|
|
26
|
+
3. **Add few-shot examples where they pay.** If the failure is format or convention, add two to five short, varied examples that demonstrate the exact shape — including the edge cases the model gets wrong (empty input, ambiguity, the desired "unknown"/refusal). Don't add examples the failure mode doesn't call for; they cost tokens and can overfit.
|
|
27
|
+
4. **Add an output contract when output is consumed by code.** Specify the exact shape (fields, enums, types) and recommend backing it with the provider's native structured-output/JSON mode plus validate-and-retry, not just a prose "return JSON." See [Few-Shot vs Chain-of-Thought vs Structured Prompting](/guides/prompting/prompting-techniques-2026).
|
|
28
|
+
5. **Add reasoning only where the task needs it.** For genuinely multi-step problems on a non-reasoning model, add chain-of-thought. On reasoning models, don't — they reason internally, and an explicit "think step by step" is often redundant. Match the technique to the model class.
|
|
29
|
+
6. **Cut bloat last.** Once quality is addressed, trim redundant instructions, prune low-value examples, and shorten verbose schemas — without dropping anything that was load-bearing for a failure mode.
|
|
30
|
+
7. **Say what to measure.** Every optimization is a hypothesis. State the single change you made, why, and the concrete check that would confirm it helped (a handful of held-out cases, an exact-match or schema-valid rate). Recommend changing one thing at a time so the lift is attributable.
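To make the output-contract and measurement steps concrete, here is a minimal sketch of validate-and-retry plus a schema-valid rate. The contract itself (a `label` enum and a float `confidence`) is invented for illustration, and `generate` stands in for whatever function calls your model; neither comes from a specific provider SDK.

```python
import json

# Hypothetical output contract: field name -> required Python type.
REQUIRED_FIELDS = {"label": str, "confidence": float}
ALLOWED_LABELS = {"positive", "negative", "unknown"}


def validate(raw: str):
    """Return the parsed object if it satisfies the contract, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            return None
    if data["label"] not in ALLOWED_LABELS:
        return None
    return data


def schema_valid_rate(outputs: list[str]) -> float:
    """Fraction of raw outputs that satisfy the contract."""
    if not outputs:
        return 0.0
    return sum(validate(o) is not None for o in outputs) / len(outputs)


def call_with_retry(generate, prompt: str, max_attempts: int = 3):
    """Call generate(prompt) until the output validates or attempts run out."""
    for _ in range(max_attempts):
        parsed = validate(generate(prompt))
        if parsed is not None:
            return parsed
    return None
```

Running `schema_valid_rate` over the same held-out outputs before and after a prompt change gives a single number to compare, which is the kind of concrete check step 7 asks for.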
|
|
31
|
+
|
|
32
|
+
> [!WARNING]
|
|
33
|
+
> "It looks better" is how regressions ship. This skill produces an *optimized candidate* and the check to validate it — it is not a substitute for an eval set. If output quality matters, run the proposed prompt against held-out cases before trusting it, and graduate to the prompt-engineer agent for a real regression suite.
|
|
34
|
+
|
|
35
|
+
> [!TIP]
|
|
36
|
+
> When output is malformed, fix structure before prose: a strict output spec, structured-output mode, or a one-line format reminder at the end of the prompt usually beats another paragraph of instructions.
|
|
37
|
+
|
|
38
|
+
## Output
|
|
39
|
+
|
|
40
|
+
The optimized prompt, copy-pasteable and ready to drop in, plus: the diagnosed failure mode, a short rationale for each change (which technique and why), any examples or schema added, an estimate of the token-cost delta, and the specific check to run to confirm the change actually helped.
|
|
@@ -0,0 +1,33 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: "prompt-pii-redactor"
|
|
3
|
+
description: "Detect and redact PII and secrets from prompts (and logs/traces) before they reach an LLM provider — mask or tokenize emails, phone numbers, names, IDs, and API keys, reversibly where the response needs the real values back. Use when sending user or document data to a third-party model, or when LLM request logs may capture sensitive data."
|
|
4
|
+
allowed-tools: "Read, Grep, Glob, Bash, Write, Edit"
|
|
5
|
+
version: 1.0.0
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
Every prompt you send to a hosted model leaves your environment, and every request you log may persist sensitive data. This skill puts a redaction layer in front of that boundary: it detects PII and secrets in outgoing prompts (and in traces/logs), masks or tokenizes them before they're sent, and — where the model's answer needs the real values — restores them on the way back. The goal is that third parties and log stores never see data they shouldn't.
|
|
9
|
+
|
|
10
|
+
## When to use this skill
|
|
11
|
+
|
|
12
|
+
- Sending user messages or document content to a third-party LLM API where PII/secrets shouldn't leave your environment.
|
|
13
|
+
- LLM request/response **logging or tracing** that could capture sensitive data in plaintext.
|
|
14
|
+
- A compliance or data-residency requirement to minimize personal data sent to or stored by external services.
|
|
15
|
+
|
|
16
|
+
## Instructions
|
|
17
|
+
|
|
18
|
+
1. **Define what's sensitive here.** Enumerate the categories that matter for this app and jurisdiction: direct identifiers (names, emails, phones, addresses), government/financial IDs (SSN, card numbers), and **secrets** (API keys, tokens, credentials). Don't over-redact data the task genuinely needs — redaction that breaks the use case gets turned off.
|
|
19
|
+
2. **Detect with layered methods.** Combine high-precision pattern/format detection (regex/validators for emails, cards, keys) with NER/model-based detection for free-form PII (names, locations). A library like [LLM Guard](/tools/llm-guard)'s anonymize/secrets scanners covers much of this; match it to your data.
|
|
20
|
+
3. **Choose mask vs. reversible tokenize.** For data the model never needs in the clear, **mask** (irreversible placeholder). For data the response must reference or return, **tokenize reversibly**: replace each value with a stable placeholder, then re-insert the original into the model's output using a vault/map held only in your environment.
|
|
21
|
+
4. **Apply at the boundary, in both directions.** Redact the request before it leaves for the provider, and de-tokenize the response if you tokenized. Apply the same redaction to anything written to **logs/traces**, which are an equally common leak path.
|
|
22
|
+
5. **Verify and measure.** Test against representative data for both misses (sensitive data that slipped through) and over-redaction (broke the task), and log redaction counts (not the values) so coverage is auditable.
|
|
23
|
+
6. **State the residual risk.** Detection is imperfect — novel formats and contextual PII evade detectors. Note what's covered and recommend pairing with least-data-collection and provider data-handling controls (no-retention/zero-retention options) rather than relying on redaction alone.
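The mask-versus-tokenize distinction can be sketched with pattern-based detection only. This is a simplified illustration under stated assumptions: the two regexes are rough approximations (real email and key formats vary), the `Redactor` class and its placeholder format are hypothetical, and NER-based detection for names is deliberately left out.

```python
import re

# Approximate patterns for illustration only; tune them to your own data.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
API_KEY_RE = re.compile(r"\bsk-[A-Za-z0-9_-]{20,}\b")


class Redactor:
    """Masks secrets irreversibly and tokenizes emails reversibly."""

    def __init__(self):
        self._vault: dict[str, str] = {}  # placeholder -> original, kept local
        self._seen: dict[str, str] = {}   # original -> placeholder (stable)
        self.counts = {"EMAIL": 0, "SECRET": 0}  # counts only, never values

    def _tokenize(self, kind: str, value: str) -> str:
        if value not in self._seen:
            placeholder = f"<{kind}_{len(self._seen) + 1}>"
            self._seen[value] = placeholder
            self._vault[placeholder] = value
        return self._seen[value]

    def redact(self, text: str) -> str:
        def mask_secret(match):
            self.counts["SECRET"] += 1
            return "<SECRET>"  # irreversible: nothing is stored

        def tokenize_email(match):
            self.counts["EMAIL"] += 1
            return self._tokenize("EMAIL", match.group(0))

        text = API_KEY_RE.sub(mask_secret, text)
        return EMAIL_RE.sub(tokenize_email, text)

    def restore(self, text: str) -> str:
        """Re-insert originals into the model's response, server-side."""
        for placeholder, original in self._vault.items():
            text = text.replace(placeholder, original)
        return text
```

The vault lives only in the `Redactor` instance, so the provider sees `<EMAIL_1>` and never the mapping. The same address always maps to the same placeholder, which lets the model refer to it consistently.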
|
|
24
|
+
|
|
25
|
+
> [!WARNING]
|
|
26
|
+
> Reversible tokenization means the mapping from placeholder to real value lives in **your** environment and never in the prompt. If you send the model a key to reverse the tokens, you've sent the data — defeating the point. Keep the vault server-side and re-insert originals only after the response returns.
|
|
27
|
+
|
|
28
|
+
> [!NOTE]
|
|
29
|
+
> Don't forget the logs. Teams redact the prompt to the provider but log the raw request for debugging — and the sensitive data lands in the log store anyway. Redact on the way to logs/traces too, or scrub at the logging layer.
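One way to scrub at the logging layer is a filter attached to the log handler, sketched here with Python's standard `logging` module. The single email regex is an approximation chosen for brevity; a real deployment would reuse whatever detectors the request path uses. `RedactingFilter` is a hypothetical name.

```python
import logging
import re

# Approximate email pattern, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class RedactingFilter(logging.Filter):
    """Rewrites each log record so emails never reach the log store."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message with its arguments first, then redact the result,
        # so values passed as %-style args are covered too.
        rendered = record.getMessage()
        record.msg = EMAIL_RE.sub("<EMAIL>", rendered)
        record.args = None
        return True  # keep the record; only its content changes


def attach_redaction(handler: logging.Handler) -> logging.Handler:
    """Attach the filter to a handler so every record it emits is scrubbed."""
    handler.addFilter(RedactingFilter())
    return handler
```

Attaching the filter to handlers rather than to individual loggers means records propagated from child loggers are scrubbed as well.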
|
|
30
|
+
|
|
31
|
+
## Output
|
|
32
|
+
|
|
33
|
+
A redaction layer applied at the LLM boundary: the sensitive-data categories handled, the detection methods, the mask-vs-reversible-tokenize decisions, request/response and logging integration, and a coverage check (misses and over-redaction) — plus a clear statement of residual risk and the complementary controls (data minimization, provider no-retention) it should sit alongside.
|
|
@@ -0,0 +1,33 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: "provider-fallback-wrapper"
|
|
3
|
+
description: "Wrap LLM calls so a provider outage, rate limit, or timeout degrades gracefully — with multi-provider fallback, bounded retries with backoff, and timeouts. Use when an app depends on a single model/provider and needs production resilience."
|
|
4
|
+
allowed-tools: "Read, Grep, Glob, Edit, Write, Bash"
|
|
5
|
+
version: 1.0.0
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
LLM providers have outages, rate limits, and latency spikes. If your feature calls one model directly, every one of those is an incident. This skill wraps LLM calls with the resilience patterns that keep the feature up: timeouts, sensible retries, and fallback to an alternate model or provider.
|
|
9
|
+
|
|
10
|
+
## When to use this skill
|
|
11
|
+
|
|
12
|
+
- A production feature depends on a single model/provider and needs to survive outages and rate limits.
|
|
13
|
+
- You're seeing user-facing failures from transient `429`/`5xx`/timeout errors.
|
|
14
|
+
- You want a cheaper/faster primary model with a stronger fallback (or vice versa).
|
|
15
|
+
|
|
16
|
+
## Instructions
|
|
17
|
+
|
|
18
|
+
1. **Set a timeout.** Every call gets a deadline. A hung provider should fail fast into retry/fallback, not block the request indefinitely.
|
|
19
|
+
2. **Retry only what's retryable.** Retry transient failures — timeouts, rate limits (`429`), and `5xx` — with **exponential backoff and jitter** and a hard attempt cap. Do **not** retry non-retryable errors (`400` bad request, `401` auth, content-policy refusals); retrying those just wastes time and money.
|
|
20
|
+
3. **Fall back across providers/models.** On exhausting retries (or on specific errors), route to an alternate model or provider. Decide the order by cost/quality and keep the request/response shape stable so callers don't care which served it. A gateway like [LiteLLM](/tools/litellm) or [OpenRouter](/tools/openrouter) can do fallback for you; otherwise implement it explicitly.
|
|
21
|
+
4. **Mind semantic differences.** Fallback models may differ in format adherence and quality — re-apply structured-output validation after fallback, and don't silently downgrade a critical response without noting it.
|
|
22
|
+
5. **Make it observable.** Log which provider served each request, retry counts, and fallback events, and emit metrics so you can see when you're leaning on the fallback (a signal the primary is degraded).
|
|
23
|
+
6. **Guard cost.** Fallbacks and retries cost tokens; cap attempts and consider a circuit breaker that stops hammering a provider that's clearly down.
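The retry and fallback steps above might be combined roughly as follows. This is a sketch under several assumptions: `ProviderError`, `Result`, and `call_with_fallback` are hypothetical names, each provider callable is assumed to enforce its own timeout and raise `TimeoutError`, and a non-retryable error is re-raised immediately rather than tried on the next provider (a design choice, not the only reasonable one).

```python
import random
import time
from dataclasses import dataclass
from typing import Callable, Sequence

RETRYABLE_STATUS = {429, 500, 502, 503, 504}


class ProviderError(Exception):
    """Hypothetical error type carrying the provider's HTTP status."""

    def __init__(self, status: int, message: str = ""):
        super().__init__(message or f"provider returned {status}")
        self.status = status


@dataclass
class Result:
    text: str
    provider: str   # which provider served the request (for logs/metrics)
    attempts: int   # total attempts across all providers


def call_with_fallback(
    providers: Sequence[tuple[str, Callable[[str], str]]],
    prompt: str,
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Result:
    attempts = 0
    last_error = None
    for name, call in providers:
        for attempt in range(max_attempts):
            attempts += 1
            try:
                return Result(call(prompt), name, attempts)
            except TimeoutError as exc:
                last_error = exc
            except ProviderError as exc:
                last_error = exc
                if exc.status not in RETRYABLE_STATUS:
                    raise  # 400/401 and similar: retrying will not help
            if attempt < max_attempts - 1:
                # Exponential backoff with full jitter.
                sleep(random.uniform(0, base_delay * 2 ** attempt))
    raise RuntimeError("all providers failed") from last_error
```

Returning the provider name and attempt count with every result gives the observability step something to log without the caller needing to know which backend answered.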
|
|
24
|
+
|
|
25
|
+
> [!WARNING]
|
|
26
|
+
> Don't retry non-idempotent, side-effecting calls blindly — for tool-executing agents, a naive retry can repeat an action. Retry the model call, but make any side effects idempotent (see the agent tool-calling guidance).
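A small sketch of what making a side effect idempotent can mean in practice: each action carries a key, and a repeated key returns the recorded result instead of running the action again. `IdempotentExecutor` is a hypothetical in-memory helper; a real system would persist the keys somewhere durable.

```python
from typing import Any, Callable


class IdempotentExecutor:
    """Runs each keyed action at most once and replays the stored result."""

    def __init__(self):
        self._results: dict[str, Any] = {}

    def run(self, idempotency_key: str, action: Callable[[], Any]) -> Any:
        if idempotency_key not in self._results:
            # First time this key is seen: perform the side effect.
            self._results[idempotency_key] = action()
        # Retries with the same key get the original result back.
        return self._results[idempotency_key]
```

With this in place, a retried model call that re-emits the same tool invocation (same key) does not send the email or charge the card a second time.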
|
|
27
|
+
|
|
28
|
+
> [!NOTE]
|
|
29
|
+
> Fallback adds resilience, not correctness. A degraded fallback model can still produce worse output — validate it, and surface when you're running on the backup.
|
|
30
|
+
|
|
31
|
+
## Output
|
|
32
|
+
|
|
33
|
+
A wrapper around the app's LLM calls implementing timeouts, retryable-only backoff retries, multi-provider/model fallback, validation after fallback, and logging/metrics — with attempt and cost caps.
|
|
@@ -0,0 +1,33 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: "qlora-finetune-runner"
|
|
3
|
+
description: "Run a QLoRA (4-bit LoRA) fine-tune of an open-weight model from a prepared dataset — set up the config, train memory-efficiently (e.g. with Unsloth/PEFT), watch for overfitting, save the adapter, and run a quick eval against the prepared split. Use when you have a clean dataset and want to execute a parameter-efficient fine-tune on a single GPU."
|
|
4
|
+
allowed-tools: "Read, Grep, Glob, Bash, Write, Edit"
|
|
5
|
+
version: 1.0.0
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
QLoRA makes fine-tuning a real option on one modest GPU: it quantizes the base model to 4-bit and trains only small LoRA adapters on top, so a model that wouldn't fit in memory for full fine-tuning trains comfortably. This skill executes that run from a **prepared** dataset — it sets a sensible config, trains, watches for the failure modes, saves the adapter, and sanity-checks the result, reproducibly.
|
|
9
|
+
|
|
10
|
+
## When to use this skill
|
|
11
|
+
|
|
12
|
+
- You have a cleaned, split dataset (see the [Fine-Tune Dataset Builder](/skills/data/finetune-dataset-builder)) and want to run a parameter-efficient fine-tune.
|
|
13
|
+
- Fine-tuning on a single GPU where full fine-tuning won't fit — QLoRA's 4-bit base makes it possible.
|
|
14
|
+
- Iterating on a fine-tune: adjusting LoRA rank, learning rate, or epochs and re-running cleanly.
|
|
15
|
+
|
|
16
|
+
## Instructions
|
|
17
|
+
|
|
18
|
+
1. **Verify the dataset is ready.** Confirm it's in the trainer's format (typically JSONL chat/instruction records), deduped, and has a held-out eval split that does not overlap training. If it isn't prepared, stop and prepare it first — the run is only as good as the data. (See [Preparing a Fine-Tuning Dataset](/guides/mlops/finetune-dataset-prep).)
|
|
19
|
+
2. **Detect the environment.** Check the GPU/VRAM, framework, and whether [Unsloth](/tools/unsloth), TRL/PEFT, or another trainer is in use; match the project's existing setup rather than introducing a new stack.
|
|
20
|
+
3. **Set a sane QLoRA config.** 4-bit base (NF4), LoRA on the attention (and often MLP) projection modules, a modest rank (e.g. 8–32) and matching alpha, a conservative learning rate (values around 1e-4 to 2e-4 are a common starting point for LoRA adapters), and a small number of epochs (1–3; more usually overfits). State each choice; they're the knobs you'll tune.
|
|
21
|
+
4. **Train with the eval split wired in.** Run training with periodic evaluation on the held-out split so you can see validation loss, not just training loss. Keep it reproducible: fixed seed, logged config, recorded dataset version.
|
|
22
|
+
5. **Watch for the failure modes.** Stop or adjust if validation loss climbs while training loss falls (**overfitting**), or if outputs lose general ability (**catastrophic forgetting**). The fix is usually fewer epochs or better data, not a bigger rank.
|
|
23
|
+
6. **Save the adapter and sanity-check.** Save the LoRA adapter (and a merged model if you'll serve it merged), then run a quick eval on held-out examples to confirm the behavior changed in the intended direction before handing off to a full evaluation.
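The overfitting signal described in step 5 can be checked mechanically from the logged loss curves. The function below is a framework-independent sketch: `should_stop` and its `patience` parameter are hypothetical, and it assumes one training-loss and one validation-loss value were recorded at each evaluation point.

```python
def should_stop(train_losses: list[float], val_losses: list[float], patience: int = 2) -> bool:
    """Return True when validation loss has risen for `patience` consecutive
    evaluations while training loss kept falling over the same window."""
    if len(train_losses) != len(val_losses) or len(val_losses) <= patience:
        return False
    recent_val = val_losses[-(patience + 1):]
    recent_train = train_losses[-(patience + 1):]
    val_rising = all(later > earlier for earlier, later in zip(recent_val, recent_val[1:]))
    train_falling = all(later < earlier for earlier, later in zip(recent_train, recent_train[1:]))
    return val_rising and train_falling
```

Most trainers ship their own early-stopping hooks; this only shows the comparison itself, which is useful when reading a loss log after the fact.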
|
|
24
|
+
|
|
25
|
+
> [!WARNING]
|
|
26
|
+
> Falling training loss only shows that the model is fitting the training data; it may be memorizing rather than generalizing. Always evaluate on the **held-out split**. If the run was never checked against a real eval set, it can't be trusted regardless of how clean the loss curve looks.
|
|
27
|
+
|
|
28
|
+
> [!NOTE]
|
|
29
|
+
> QLoRA's 4-bit quantization is for fitting the *base* model in memory during training; it's separate from quantizing the final model for serving. Note which you mean, and re-check quality after any serving-time quantization.
|
|
30
|
+
|
|
31
|
+
## Output
|
|
32
|
+
|
|
33
|
+
A trained LoRA adapter plus the run's record: the QLoRA config (quantization, rank/alpha, target modules, LR, epochs, seed), the training/validation loss curves, the dataset version, and a quick held-out eval confirming the intended behavior change — reproducible enough to re-run or tune. Full evaluation and the ship/no-ship call belong to the [finetuning-engineer](/agents/data-ai/finetuning-engineer).
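A run record like the one described could be captured as a small serializable structure. The sketch below uses only the standard library; `QLoRARunConfig` is a hypothetical container, and every default value (rank 16, the four projection-module names, learning rate 2e-4, seed 42) is an illustrative assumption rather than a recommendation from this skill. Module names in particular depend on the model architecture.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class QLoRARunConfig:
    """Everything needed to reproduce or tune a run. Defaults are illustrative."""
    base_model: str
    dataset_version: str
    quantization: str = "nf4"
    lora_rank: int = 16
    lora_alpha: int = 16  # "matching alpha": same value as the rank
    # Assumed attention projection names; check the actual model's modules.
    target_modules: tuple[str, ...] = ("q_proj", "k_proj", "v_proj", "o_proj")
    learning_rate: float = 2e-4
    epochs: int = 2
    seed: int = 42


def to_json(config: QLoRARunConfig) -> str:
    """Serialize the config with stable key order so diffs between runs are clean."""
    return json.dumps(asdict(config), indent=2, sort_keys=True)
```

Writing this JSON next to the saved adapter keeps the config, seed, and dataset version tied to the artifact they produced.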
|
|
@@ -0,0 +1,84 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: "readme-generator"
|
|
3
|
+
description: "Generate or refresh a project README grounded in the actual repository. Use when a project has no README, a stale one, or you want install/usage/scripts/structure sections that match the real code."
|
|
4
|
+
allowed-tools: "Read, Grep, Glob, Write, Bash"
|
|
5
|
+
version: 1.0.0
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
Produce a `README.md` that reflects what the repository actually contains — not a generic template. The skill detects the stack, build tooling, runnable scripts, entry points, and directory layout by reading real manifest files, then assembles a title, a one-line plus short description, and install / usage / scripts / project-structure sections. Every command it prints is one the project can actually run, so a new contributor can clone, install, and start without guessing.
|
|
9
|
+
|
|
10
|
+
## When to use this skill
|
|
11
|
+
|
|
12
|
+
- A project has no README, or an outdated one that no longer matches the code.
|
|
13
|
+
- You want install and usage instructions derived from the real `package.json` / `Makefile` / `pyproject.toml`, not boilerplate.
|
|
14
|
+
- You need a consistent, scannable README with the standard sections (install, usage, scripts, structure) in one pass.
|
|
15
|
+
|
|
16
|
+
> [!WARNING]
|
|
17
|
+
> Never invent features, flags, or commands. If a script, entry point, or env var is not in the repo, it does not go in the README. When something is genuinely unknown (license, deploy target), insert a clearly marked `<!-- TODO -->` rather than fabricating it.
|
|
18
|
+
|
|
19
|
+
## Instructions
|
|
20
|
+
|
|
21
|
+
1. **Locate the project root and existing README.** Glob for `README*` at the root. If one exists, read it — preserve hand-written prose (project purpose, badges, screenshots, license) and only regenerate the mechanical sections. Treat the code as the source of truth where they disagree.
|
|
22
|
+
2. **Detect the stack — do not guess.** Read the manifest that exists rather than assuming:
|
|
23
|
+
- Node/TS: `package.json` (name, description, `scripts`, `bin`, `type`, `engines`), plus `tsconfig.json`, lockfile (`package-lock.json` / `pnpm-lock.yaml` / `yarn.lock` / `bun.lock` / `bun.lockb`) to pick the right package manager.
|
|
24
|
+
- Python: `pyproject.toml` / `setup.py` / `requirements.txt`.
|
|
25
|
+
- Go: `go.mod`. Rust: `Cargo.toml`. Make-driven: `Makefile` targets.
|
|
26
|
+
Frameworks: infer from dependencies (`next`, `react`, `fastapi`, `express`) — do not claim a framework that isn't a dependency.
|
|
27
|
+
3. **Extract install and usage facts.** Map the detected manager to the install command (`npm install`, `pnpm install`, `pip install -e .`, `cargo build`). Find the entry point (`main`/`bin` in `package.json`, `cmd/` in Go, `__main__.py`). Pull the dev/start/build commands straight from `scripts` or `Makefile` targets — quote them verbatim.
|
|
28
|
+
4. **Map the structure.** Glob the top-level directories and a shallow level below, ignoring `node_modules`, `.git`, `dist`, `build`, and `.next`. Annotate each meaningful directory with one short phrase describing what lives there, based on what you actually find.
|
|
29
|
+
5. **Assemble the README.** Write `README.md` with: an `#` H1 title (from manifest `name`), a one-line tagline, a short paragraph, then `## Installation`, `## Usage`, `## Scripts` (a table of every script + its command), and `## Project structure` (a fenced tree). Keep it scannable; prefer fenced code blocks over prose for commands.
|
|
30
|
+
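6. **Verify against the repo.** Re-check that every script in the table exists in the manifest and every path in the tree exists on disk. Where available, run `npm run` with no arguments to list the defined scripts and confirm the table matches. For a Makefile, compare against the targets in the file or use a dry run (`make -n <target>`); a bare `make` builds the default target rather than listing anything.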
|
|
31
|
+
7. **Report and flag gaps.** Summarize what was detected and list what you could not determine (license, badges, env-var docs, deployment) so the user can fill those `<!-- TODO -->` markers.
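The lockfile-based package-manager detection from step 2 can be sketched as a simple lookup. `detect_package_manager` is a hypothetical helper, and the precedence order used when several lockfiles coexist is an assumption; a repo with more than one lockfile is worth flagging to the user rather than silently resolving.

```python
from pathlib import Path
from typing import Optional

# (lockfile name, package manager). Order is an assumed precedence.
LOCKFILES = [
    ("pnpm-lock.yaml", "pnpm"),
    ("yarn.lock", "yarn"),
    ("bun.lock", "bun"),
    ("bun.lockb", "bun"),
    ("package-lock.json", "npm"),
]


def detect_package_manager(root: Path) -> Optional[str]:
    """Return the manager implied by the first lockfile found, or None."""
    for filename, manager in LOCKFILES:
        if (root / filename).is_file():
            return manager
    return None  # no lockfile: do not guess
```

Returning `None` instead of defaulting to `npm` keeps the README from printing an install command the project may not support.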
|
|
32
|
+
|
|
33
|
+
> [!TIP]
|
|
34
|
+
> Generate the scripts table directly from the `scripts` object so it never drifts. If one script is an obvious hook or wrapper for another (for example, a `prebuild` hook that npm runs automatically before `build`), document the public one and mention the dependency in a single line rather than listing internals.
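Generating the table from the manifest could look like the sketch below. `scripts_table` is a hypothetical helper that emits only the Script and Command columns, since descriptions cannot be derived from `package.json` and would have to be written by hand or left out.

```python
import json
from pathlib import Path


def scripts_table(package_json: Path) -> str:
    """Render a Markdown table of every script defined in package.json."""
    manifest = json.loads(package_json.read_text(encoding="utf-8"))
    scripts = manifest.get("scripts", {})
    rows = ["| Script | Command |", "| --- | --- |"]
    for name, command in scripts.items():
        # Escape pipes so a command like `a | b` does not break the table.
        escaped = command.replace("|", "\\|")
        rows.append(f"| `{name}` | `{escaped}` |")
    return "\n".join(rows)
```

Because the commands are copied verbatim from the manifest, the table cannot list a script that does not exist.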
|
|
35
|
+
|
|
36
|
+
## Examples
|
|
37
|
+
|
|
38
|
+
For a detected Node/TypeScript project (`package.json` with `name: "taskflow"`, a `next dev` style `scripts` block, and `src/` + `public/`), the skill emits:
|
|
39
|
+
|
|
40
|
+
````md
|
|
41
|
+
# taskflow
|
|
42
|
+
|
|
43
|
+
A task-board API and dashboard built with Next.js and TypeScript.
|
|
44
|
+
|
|
45
|
+
TaskFlow exposes a REST API for boards, lists, and cards, with a server-rendered
|
|
46
|
+
dashboard. State is persisted to Postgres via Prisma.
|
|
47
|
+
|
|
48
|
+
## Installation
|
|
49
|
+
|
|
50
|
+
```bash
|
|
51
|
+
pnpm install # lockfile detected: pnpm-lock.yaml
|
|
52
|
+
```
|
|
53
|
+
|
|
54
|
+
## Usage
|
|
55
|
+
|
|
56
|
+
```bash
|
|
57
|
+
pnpm dev # start the dev server on http://localhost:3000
|
|
58
|
+
```
|
|
59
|
+
|
|
60
|
+
## Scripts
|
|
61
|
+
|
|
62
|
+
| Script | Command | Description |
|
|
63
|
+
| ------- | ---------------- | --------------------------------- |
|
|
64
|
+
| `dev` | `next dev` | Run the dev server with HMR |
|
|
65
|
+
| `build` | `next build` | Production build |
|
|
66
|
+
| `start` | `next start` | Serve the production build |
|
|
67
|
+
| `lint` | `eslint .` | Lint with the flat ESLint config |
|
|
68
|
+
| `test` | `vitest run` | Run the test suite once |
|
|
69
|
+
|
|
70
|
+
## Project structure
|
|
71
|
+
|
|
72
|
+
```text
|
|
73
|
+
src/
|
|
74
|
+
  app/          Next.js App Router routes and layouts
|
|
75
|
+
  lib/          data access and shared utilities
|
|
76
|
+
  components/   shared UI components
|
|
77
|
+
public/         static assets served as-is
|
|
78
|
+
prisma/         schema and migrations
|
|
79
|
+
```
|
|
80
|
+
|
|
81
|
+
<!-- TODO: add license, CI badges, and DATABASE_URL setup notes -->
|
|
82
|
+
````
|
|
83
|
+
|
|
84
|
+
Every command above came from the project's real `scripts`; the tree lists only directories that exist. Fill the `TODO` marker before publishing.
|
|
@@ -0,0 +1,65 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: "secret-scanner"
|
|
3
|
+
description: "Scan a repo or a diff for committed secrets — API keys, tokens, private keys, .env files, and high-entropy strings — then triage real leaks from fixtures. Use before pushing, in review, or when a credential may have leaked."
|
|
4
|
+
allowed-tools: "Read, Grep, Glob, Bash"
|
|
5
|
+
version: 1.0.0
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
Find credentials that should never be in version control (provider API keys, OAuth tokens, private keys, database URLs, and `.env` files) across a whole repo or a single diff. The skill greps for known key shapes, flags high-entropy strings, then triages each hit: real leak vs. example/test fixture vs. placeholder. For confirmed leaks it tells you the only safe remediation, **rotate the credential and scrub history**, because a secret pushed to a `git` remote must be treated as compromised from the moment of the push.
|
|
9
|
+
|
|
10
|
+
## When to use this skill
|
|
11
|
+
|
|
12
|
+
- Before pushing a branch or opening a PR, to catch a credential that slipped into a commit.
|
|
13
|
+
- During review of a diff that touches config, CI, infrastructure, or `.env*` files.
|
|
14
|
+
- After a suspected leak, to find every place a key appears across the working tree and history.
|
|
15
|
+
- When onboarding a repo and you want a baseline audit of what secrets may already be committed.
|
|
16
|
+
|
|
17
|
+
> [!WARNING]
|
|
18
|
+
> Deleting a secret from the latest commit does **not** remove it from history — it stays in every prior commit, every clone, and every fork. Any matched real key must be treated as compromised: **rotate it first**, then scrub history. Deletion alone is not remediation.
|
|
19
|
+
|
|
20
|
+
## Instructions
|
|
21
|
+
|
|
22
|
+
1. **Define the scan target.** Decide between the working tree (`git ls-files`), a specific diff (`git diff main...HEAD`), or full history (`git log -p` / a dedicated history scanner). Diff scans are fast for PRs; full-tree scans catch leaks that are still present in tracked files; history scans catch leaks that were committed and later deleted. Make the scope explicit in your report.
|
|
23
|
+
2. **Detect existing tooling and ignore rules — do not guess.** Check for `.gitleaks.toml`, `.trufflehog*`, `detect-secrets` baselines, or a `pre-commit` config. If a scanner is already configured, run it (`gitleaks detect`, `trufflehog filesystem .`) and honor its allowlist. Read `.gitignore` to see what *should* have been excluded but wasn't.
|
|
24
|
+
3. **Grep for known secret shapes.** Search for provider-specific prefixes and structural patterns rather than generic words: `AKIA`/`ASIA` (AWS), `ghp_`/`gho_`/`github_pat_` (GitHub), `sk-`/`sk-proj-` (OpenAI), `xox[baprs]-` (Slack), `AIza` (Google), `-----BEGIN .* PRIVATE KEY-----`, JWTs (`eyJ`), and connection strings (`postgres://`, `mongodb+srv://` with embedded credentials). Also glob for committed `.env`, `.env.*`, `*.pem`, `*.p12`, `id_rsa`, and `*.keystore` files.
|
|
25
|
+
4. **Flag high-entropy strings.** For assignments like `token = "..."`, `secret: ...`, `password=...`, score the value's Shannon entropy; long base64/hex strings with high entropy near a secret-ish identifier are candidates even without a known prefix.
|
|
26
|
+
5. **Triage every hit.** This is the core of the skill — separate true positives from noise: a value in `*.example`, `*.sample`, `fixtures/`, `test/`, or a docs snippet, or an obvious placeholder (`xxx`, `your-key-here`, `changeme`, `dummy`, all-zeros) is a **false positive**. A live-looking value in real config, source, or CI is a **true positive**. When unsure, mark it `review` rather than dismissing it.
|
|
27
|
+
6. **Verify the finding set.** Re-run your matches with `git grep -n` to attach exact `file:line` locations, and confirm each true positive is reachable in a tracked file (not just an ignored local file). For history claims, verify with `git log -p -S '<fragment>'`.
|
|
28
|
+
7. **Report and remediate.** Output a triaged findings table (file, line, type, verdict). For every true positive, give the two-step fix in order: **(1) rotate** the credential at the provider and invalidate the old one; **(2) scrub history** with `git filter-repo --replace-text` or BFG, then force-push and have collaborators re-clone. Flag any `review` items needing human judgment and recommend adding a pre-commit secret scanner to prevent recurrence.
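The detection steps (known shapes plus entropy) can be sketched with the standard library alone. All patterns below are approximations written for illustration: real token lengths and character sets vary by provider, and a dedicated scanner such as gitleaks maintains far more complete rules. `scan_line` and the 3.5-bit threshold are assumptions, not values taken from any tool.

```python
import math
import re
from collections import Counter

# Approximate shapes only; lengths and alphabets are assumptions.
PATTERNS = {
    "aws_access_key": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\b(?:ghp_|gho_|github_pat_)[A-Za-z0-9_]{20,}"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}

# A secret-ish identifier assigned a long base64/hex-looking value.
ASSIGNMENT = re.compile(
    r"(?i)\b(?:token|secret|password|api_key)\b\s*[:=]\s*[\"']?([A-Za-z0-9+/=_-]{20,})"
)


def shannon_entropy(value: str) -> float:
    """Bits of entropy per character, from the value's own character counts."""
    counts = Counter(value)
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def scan_line(line: str, entropy_threshold: float = 3.5) -> list[str]:
    """Return the names of every rule the line trips."""
    hits = [name for name, pattern in PATTERNS.items() if pattern.search(line)]
    match = ASSIGNMENT.search(line)
    if match and shannon_entropy(match.group(1)) >= entropy_threshold:
        hits.append("high_entropy_assignment")
    return hits
```

Hits from a scan like this are candidates only; each one still needs the triage in step 5 before it is reported as a leak.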
|
|
29
|
+
|
|
30
|
+
> [!NOTE]
|
|
31
|
+
> Rotation comes before scrubbing. Scrubbing hides the secret going forward but cannot un-leak what was already pushed; only rotation makes the exposed value worthless.
|
|
32
|
+
|
|
33
|
+
## Examples
|
|
34
|
+
|
|
35
|
+
Triaged output for a branch diff:
|
|
36
|
+
|
|
37
|
+
```text
|
|
38
|
+
$ git diff main...HEAD | secret-scanner
|
|
39
|
+
|
|
40
|
+
Findings (4 matches, scope: diff main...HEAD)
|
|
41
|
+
|
|
42
|
+
| File                      | Line | Type                       | Verdict        |
|
|
43
|
+
|---------------------------|------|----------------------------|----------------|
|
|
44
|
+
| src/config/aws.ts         | 12   | AWS access key (AKIA)      | TRUE POSITIVE  |
|
|
45
|
+
| .env                      | 1    | committed .env file        | TRUE POSITIVE  |
|
|
46
|
+
| test/fixtures/stripe.json | 8    | Stripe TEST key (sk_test_) | false positive |
|
|
47
|
+
| README.md                 | 44   | placeholder API key        | false positive |
|
|
48
|
+
|
|
49
|
+
2 true positives. ACTION REQUIRED.
|
|
50
|
+
|
|
51
|
+
src/config/aws.ts:12 AKIAIOSFODNN7EXAMPLE...
|
|
52
|
+
-> ROTATE: deactivate this access key in IAM and issue a new one.
|
|
53
|
+
-> SCRUB: git filter-repo --replace-text <(echo 'AKIAIOSFODNN7EXAMPLE==>REMOVED')
|
|
54
|
+
then force-push; ask collaborators to re-clone.
|
|
55
|
+
|
|
56
|
+
.env:1 contains DATABASE_URL with embedded password
|
|
57
|
+
-> ROTATE: change the database password now.
|
|
58
|
+
-> SCRUB: git rm --cached .env && add `.env` to .gitignore, then filter-repo
|
|
59
|
+
to purge it from history.
|
|
60
|
+
|
|
61
|
+
Recommendation: add gitleaks as a pre-commit hook to block future leaks.
|
|
62
|
+
```
|
|
63
|
+
|
|
64
|
+
> [!WARNING]
|
|
65
|
+
> The `sk_test_` Stripe key and the README placeholder are intentionally inert — flagging them as incidents wastes responder time and erodes trust in the scanner. Triage before you alarm.
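The triage rules can be expressed as a small classifier, sketched below. The directory names, suffixes, and placeholder list mirror the ones named in the instructions; the `triage` function itself and its length-based `review` rule are hypothetical simplifications. A path-based rule like this will also dismiss a real key that someone pasted into a test directory, so treat its verdicts as a first pass.

```python
import re
from pathlib import PurePosixPath

FIXTURE_DIRS = {"fixtures", "test", "tests", "docs"}
FIXTURE_SUFFIXES = (".example", ".sample")
PLACEHOLDER_RE = re.compile(r"(?i)^(x+|0+|changeme|dummy|your[-_]?key[-_]?here)$")


def triage(path: str, value: str) -> str:
    """Classify a hit as 'false positive', 'review', or 'true positive'."""
    p = PurePosixPath(path)
    # Obvious placeholders and provider test keys are inert by design.
    if PLACEHOLDER_RE.match(value) or value.startswith("sk_test_"):
        return "false positive"
    # Example files and fixture/test/docs directories are expected to hold samples.
    if p.name.endswith(FIXTURE_SUFFIXES) or FIXTURE_DIRS & set(p.parts[:-1]):
        return "false positive"
    # Arbitrary heuristic: very short values are ambiguous, so ask a human.
    if len(value) < 16:
        return "review"
    return "true positive"
```

Keeping `review` as a separate verdict follows the rule above: when unsure, escalate to a person rather than dismiss the hit.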
|