@ai-agent-lead/skills 1.0.0

Files changed (48)
  1. package/README.md +37 -0
  2. package/bin/install.js +272 -0
  3. package/package.json +34 -0
  4. package/skills/LANGUAGE.md +72 -0
  5. package/skills/README.md +156 -0
  6. package/skills/SKILL-TEMPLATE.md +120 -0
  7. package/skills/TRIGGERS.md +64 -0
  8. package/skills/WORKFLOWS.md +369 -0
  9. package/skills/bench/SKILL.md +40 -0
  10. package/skills/bench/templates/benchmark-report.md +26 -0
  11. package/skills/bootstrap/BOOTSTRAP.md +13 -0
  12. package/skills/bootstrap/SKILL.md +47 -0
  13. package/skills/code-hygiene/SKILL.md +92 -0
  14. package/skills/debug/SKILL.md +122 -0
  15. package/skills/design/DEEP-MODULES.md +76 -0
  16. package/skills/design/FUNCTIONAL-CORE.md +121 -0
  17. package/skills/design/ILLEGAL-STATES.md +102 -0
  18. package/skills/design/OBSERVABILITY.md +49 -0
  19. package/skills/design/PERSONAS.md +41 -0
  20. package/skills/design/SKILL.md +139 -0
  21. package/skills/design/TESTABILITY.md +84 -0
  22. package/skills/feature-doc/SKILL.md +113 -0
  23. package/skills/feature-doc/templates/feature-template.md +52 -0
  24. package/skills/formats/ADR-FORMAT.md +51 -0
  25. package/skills/formats/CONTEXT-FORMAT.md +109 -0
  26. package/skills/formats/CONTEXT-MAP-FORMAT.md +6 -0
  27. package/skills/grill-plan/SKILL.md +112 -0
  28. package/skills/improve-codebase-architecture/DEEPENING.md +37 -0
  29. package/skills/improve-codebase-architecture/INTERFACE-DESIGN.md +41 -0
  30. package/skills/improve-codebase-architecture/SKILL.md +115 -0
  31. package/skills/investigate/SKILL.md +97 -0
  32. package/skills/investigate/templates/research-note.md +84 -0
  33. package/skills/pr-review/SKILL.md +197 -0
  34. package/skills/prod-ready/SKILL.md +88 -0
  35. package/skills/security-review/SKILL.md +145 -0
  36. package/skills/simplify/SKILL.md +105 -0
  37. package/skills/sync-check/SKILL.md +69 -0
  38. package/skills/system-design/SKILL.md +160 -0
  39. package/skills/tdd/SKILL.md +121 -0
  40. package/skills/tdd/TESTS.md +93 -0
  41. package/skills/tdd-rounds/COMMITS.md +122 -0
  42. package/skills/tdd-rounds/SKILL.md +96 -0
  43. package/skills/tdd-rounds/templates/builder-brief.md +73 -0
  44. package/skills/tdd-rounds/templates/builder-report.md +21 -0
  45. package/skills/verify-real-deps/MOTIVATION.md +18 -0
  46. package/skills/verify-real-deps/SKILL.md +118 -0
  47. package/skills/verify-real-deps/templates/known-issues.md +45 -0
  48. package/skills/zoom-out/SKILL.md +104 -0
@@ -0,0 +1,118 @@
1
+ ---
2
+ name: verify-real-deps
3
+ description: Pre-tag smoke test against real third-party APIs. Use after `prod-ready` is clean, before tagging vN.0 — the gate that catches wire-shape mismatches that fakes accept but real upstreams reject. Triggered when the user mentions "smoke test", "real API", "live verify", "before tag", or "end-to-end against actual <vendor>". Pairs with `prod-ready` (which catches ops/infra issues tests miss) and `tdd-rounds` (the orchestration that feeds into this gate).
4
+ complexity: high
5
+ expected_duration: 45 minutes
6
+ ---
7
+
8
+ # Verify Against Real Dependencies
9
+
10
+ Tests verify the contract you wrote into the fake server. The fake accepts what you told it to accept. **Real third-party APIs enforce the contract Google / Stripe / OpenAI / etc. actually ship — and that contract drifts from any reverse-engineered model.**
11
+
12
+ This skill is the explicit step between "all tests green" and "tag the release". It catches the class of bug where the fake said yes but production says no.
13
+
14
+ ## Why this skill exists
15
+
16
+ A fake server is a *hypothesis* about the contract; the real upstream **is** the contract. Until you run code against the real upstream at least once, every test that uses the fake is testing your hypothesis. **`verify-real-deps` is the discipline that finds wire-shape bugs before users do.**
17
+
18
+ See [MOTIVATION.md](./MOTIVATION.md) for a worked example in which eight such bugs surfaced after a v1.0 shipped with a 100% green test suite.
19
+
20
+ ## When to use
21
+
22
+ - After `tdd-rounds` is complete (every AC ticked).
23
+ - After `prod-ready` checklist is filled and clean.
24
+ - Before `git tag vN.0` and before any public announcement.
25
+ - Any time you wonder "have we actually run this against the real upstream?" — the answer should never be "no" for a v1.0.
26
+
27
+ ## When to skip
28
+
29
+ - The system has no third-party API integration. For purely internal services with database-only state, `prod-ready` is enough.
29
+ - The "real API" is your own service in a staging environment that you fully control, so there is no reverse-engineered contract to verify.
30
+ - The change is a fix-round that doesn't touch the code path that talks to the upstream.
32
+
33
+ ## Workflow
34
+
35
+ ### 1. Set up a real-credential test environment
36
+
37
+ Use real credentials in a sandbox-equivalent context (a dev project, a low-quota account, a test API key). Document what you used so a future contributor can repeat the verification.
38
+
39
+ ```
40
+ Real credentials used: <handle / sandbox project / dev API key>
41
+ Source-of-truth API host: <e.g., cloudcode-pa.googleapis.com>
42
+ Expected limits: <so unexpected throttling stands out>
43
+ ```
44
+
45
+ ### 2. Run a representative end-to-end flow
46
+
47
+ Pick the smallest set of operations that exercise every wire-level interaction. For an API proxy, that means at least:
48
+
49
+ - Authenticate / load tier metadata.
50
+ - Read state (quota, account info).
51
+ - Write state if any (credential rotation, OAuth refresh).
52
+ - The hot-path operation (the one users will hit most).
53
+ - An error path that exercises your error-handling (a forced 429, a malformed input, a permission failure).
54
+
55
+ Capture the **raw upstream response** for each — headers and body. Don't paraphrase. The body shapes are the contract.
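One way to capture responses without paraphrasing them is sketched below. It is a minimal illustration, not part of the skill: `save_capture`, the file naming, and the step name are hypothetical, and the headers are assumed to arrive as (name, value) pairs, such as the list returned by `http.client.HTTPResponse.getheaders()`.

```python
import json
from pathlib import Path


def save_capture(out_dir, step, status, headers, body):
    """Store one upstream response on disk without reinterpreting it.

    `headers` is an iterable of (name, value) pairs; `body` is the raw bytes
    read from the wire. Both helper name and layout are hypothetical.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # The body gets its own file so no JSON re-encoding can alter it.
    body_path = out / f"{step}.body"
    body_path.write_bytes(body)

    # Headers stay a list of pairs so repeated headers and their order survive.
    meta = {
        "step": step,
        "status": status,
        "headers": [[name, value] for name, value in headers],
    }
    (out / f"{step}.meta.json").write_text(json.dumps(meta, indent=2))
    return body_path
```

Keeping the body as bytes means a later diff against the fake's response compares exactly what the upstream sent, including key casing and whitespace.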
56
+
57
+ ### 3. Capture every surfaced surprise
58
+
59
+ Anything that worked in tests but fails against the real upstream is a bug. For each, write an entry in `docs/known-issues.md` (create the file lazily on first use, copying the format from [`templates/known-issues.md`](templates/known-issues.md)). The fields:
60
+
61
+ - **Severity** (high / medium / low — based on user impact, not difficulty).
62
+ - **Status** (Open / Closed in R<N> with commit hash).
63
+ - **Reproduction** (numbered steps a stranger could follow).
64
+ - **Root cause** (one paragraph explaining why the test passed but the real upstream call failed).
65
+ - **Files** (which packages own the fix).
66
+ - **Fix sketch** (one paragraph; specific enough to scope a Builder brief).
67
+
68
+ The bug ledger doubles as the **canonical brief** for the fix-rounds — each entry is the Round N+M brief once the parent dispatches a fix-Builder.
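The fields above could be modelled and rendered roughly as follows. This is a sketch under assumptions: `LedgerEntry` and `render_entry` are hypothetical names, and the output only approximates the layout of `templates/known-issues.md`.

```python
from dataclasses import dataclass

SEP = "\u2014"  # the dash character the template uses after "#N" and the severity


@dataclass
class LedgerEntry:
    number: int
    title: str
    severity: str         # "High", "Medium", or "Low"
    severity_reason: str  # one line, based on user impact
    status: str           # "Open" or "Closed in R<N> (commit <hash>)"
    reproduction: list    # steps a stranger could follow
    root_cause: str
    files: list
    fix_sketch: str


def render_entry(e):
    """Render one ledger entry as Markdown."""
    if e.severity not in ("High", "Medium", "Low"):
        raise ValueError(f"unknown severity: {e.severity!r}")
    steps = [f"{i}. {step}" for i, step in enumerate(e.reproduction, start=1)]
    files = [f"- `{path}`" for path in e.files]
    lines = [
        f"## #{e.number} {SEP} {e.title}",
        "",
        f"**Severity:** {e.severity} {SEP} {e.severity_reason}.",
        f"**Status:** {e.status}.",
        "",
        "**Reproduction:**",
        *steps,
        "",
        f"**Root cause:** {e.root_cause}",
        "",
        "**Files:**",
        *files,
        "",
        "**Fix sketch:**",
        e.fix_sketch,
    ]
    return "\n".join(lines) + "\n"
```

Rejecting unknown severities at render time keeps the ledger consistent enough to be parsed later, for example by a pre-tag check.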
69
+
70
+ ### 4. Iterate fix-rounds
71
+
72
+ For each issue, dispatch a fix-Builder per `tdd-rounds`. Brief them with the issue's entry in the ledger. They:
73
+
74
+ 1. Read the issue.
75
+ 2. Reproduce.
76
+ 3. Write a failing test (often unit-test the parser / handler that mishandled the real shape).
77
+ 4. Fix.
78
+ 5. Update the upstreamtest fake / mock to match the real shape — this prevents regression in the test layer.
79
+ 6. Mark the issue **Closed** in `known-issues.md` with the commit hash and test names.
80
+
81
+ After each fix-round, **re-run the same end-to-end flow** to confirm the fix works AND no other regression appeared. Issues compound — a wire-shape fix sometimes reveals a cooldown / retry / cache bug behind it.
82
+
83
+ ### 5. Defer or close
84
+
85
+ Either the bug is fixed, or it's explicitly deferred to vN.1 with a recorded reason. No silent deferrals. If a bug is "intermittent / can't reproduce / probably the vendor", document it that way — the next contributor benefits from knowing the failure mode existed.
86
+
87
+ ### 6. Update the bug-prevention layer
88
+
89
+ Every bug found here is also a test that should have existed. After fixing, ask: **"What test would have caught this if it had existed?"** Then add it. The fake harness should accept the **same shapes the real API does** — if the fake was too permissive, tighten it.
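Tightening a fake usually means making it reject what the real API rejects. The sketch below shows one possible shape check a fake handler could run on incoming payloads; `check_request_shape` and the field names in the usage note are hypothetical, and the actual rules must come from the captured real responses.

```python
def check_request_shape(payload, required, optional=None):
    """Raise ValueError for any payload shape the real API is known to reject.

    `required` and `optional` map field names to their expected Python types.
    Missing required fields, unknown fields, and wrongly typed values all fail.
    """
    optional = optional or {}
    expected_types = {**optional, **required}

    missing = sorted(set(required) - set(payload))
    if missing:
        raise ValueError(f"missing fields: {missing}")

    # A permissive fake would ignore these; a strict one fails loudly.
    unknown = sorted(set(payload) - set(expected_types))
    if unknown:
        raise ValueError(f"unknown fields: {unknown}")

    for name, value in payload.items():
        expected = expected_types[name]
        if not isinstance(value, expected):
            raise ValueError(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
```

For example, with `required={"model": str, "max_tokens": int}`, a payload that sends `maxTokens` instead of `max_tokens` fails in the fake the same way it would fail upstream, so the regression shows up in the test layer.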
90
+
91
+ ### 7. Tag
92
+
93
+ Only tag when the bug ledger has zero `Open` entries (or all `Open` entries are explicitly deferred-by-design with a tracking line in the feature doc's Non-Goals). Then:
94
+
95
+ ```bash
96
+ git tag vN.0
97
+ git push --tags
98
+ ```
99
+
100
+ ## Rules
101
+
102
+ - **Use real credentials, not mocked ones.** The whole point is to find what the mock didn't.
103
+ - **Don't fix issues silently.** Every bug gets a ledger entry, even one-liners. The discipline IS the value.
104
+ - **Fix the fake too.** When you patch a wire shape, the fake must also reject the wrong shape; otherwise the tests will accept a regression.
105
+ - **Don't extend scope.** This is a verification phase, not a feature phase. If you feel a "while I'm here, let me also..." impulse, record it as a `vN.1` note and resist it.
106
+ - **Don't skip the ledger entry for "small" bugs.** A one-line fix still gets a ledger entry. The ledger is the post-mortem record.
107
+
108
+ ## Templates
109
+
110
+ - [`templates/known-issues.md`](templates/known-issues.md) — the bug-ledger format.
111
+
112
+ ## Handoff
113
+
114
+ When the bug ledger is clean and the same end-to-end flow runs without surprise:
115
+
116
+ - Append a verification entry to `docs/STATE.md`: `## End-to-end smoke test (DONE YYYY-MM-DD)` with what was run and what was found.
117
+ - Tag the release.
118
+ - The bug ledger stays in `docs/known-issues.md` as a permanent post-mortem record. Future contributors read it to understand "what surprises lurk in this codebase that the test suite doesn't show."
@@ -0,0 +1,45 @@
1
+ # Known issues: pre-tag vN.0 smoke test findings
2
+
3
+ **Discovered:** YYYY-MM-DD (first end-to-end run against real <upstream>).
4
+ **Status:** <N> open / <M> closed.
5
+
6
+ These bugs slipped past unit/integration tests because the test harness accepted whatever wire shape the code sent. The real upstream enforces a stricter contract, which surfaced them. Each entry doubles as the canonical brief for its fix-round.
7
+
8
+ ---
9
+
10
+ ## #N — <one-line title>
11
+
12
+ **Severity:** High | Medium | Low — <one-line reason based on user impact>.
13
+ **Status:** Open | Closed in R<N> (commit <hash>). Tests:
14
+ `TestNameOne`,
15
+ `TestNameTwo`.
16
+
17
+ **Reproduction:**
18
+ 1. <numbered step a stranger could follow>
19
+ 2. <step>
20
+ 3. <expected vs observed>
21
+
22
+ **Root cause:** <one paragraph — why the test passed but the real API didn't. Cite the test harness's permissive behavior, the real API's documented or observed contract, and the gap between them.>
23
+
24
+ **Files:**
25
+ - `<package>/<file>.go` (the function that mishandled the shape)
26
+ - `<test-harness>/<fake>.go` (the fake that accepted the wrong shape — fix this too to prevent regression)
27
+
28
+ **Fix sketch:**
29
+ <one paragraph specific enough to scope a Builder brief. Mention the precise wire shape change, the function signature change if any, and the new test that pins the real-API contract.>
30
+
31
+ ---
32
+
33
+ (repeat per issue)
34
+
35
+ ---
36
+
37
+ ## Fix plan
38
+
39
+ **R<N> round (DONE YYYY-MM-DD):** issues #X-#Y closed in <K> dedicated commits + a simplify pass.
40
+ **R<N+1> round (PENDING):** issue #Z.
41
+
42
+ After all open issues are closed:
43
+ - Append an end-to-end verification entry to `docs/STATE.md`.
44
+ - Tag vN.0.
45
+ - Leave this file in place as a permanent post-mortem record.
@@ -0,0 +1,104 @@
1
+ ---
2
+ name: zoom-out
3
+ description: User-invoked utility — pulls the agent up an abstraction layer when the user is lost in unfamiliar code. Produces a map of relevant modules, callers, and seams in `docs/CONTEXT.md` vocabulary. Use when the user says "I'm lost", "zoom out", "give me higher-level context", "I don't know this area", "what depends on what here", or invokes the slash command. Does not change which workflow the user is in — interrupts to orient, then hands back. Skip when the user already has the map and just needs to read code.
4
+ disable-model-invocation: true
5
+ complexity: low
6
+ expected_duration: 5 minutes
7
+ ---
8
+
9
+ # Zoom Out
10
+
11
+ A user-invoked interrupt: the user is mid-task in an unfamiliar area and needs the topology before they keep going. The skill produces a map — what modules exist, which depend on which, what the seams look like, what each is responsible for — so the user can resume their original workflow with context.
12
+
13
+ This skill **does not change the workflow** the user is in. It runs once, produces the map, and hands back. The user is still in their bug-fix / feature / refactor.
14
+
15
+ ## When to use
16
+
17
+ - User says "I'm lost", "zoom out", "give me higher-level context", "what depends on what here", "I don't know this area".
18
+ - User invokes the slash command (`/zoom-out`).
19
+ - A related skill (`debug`, `improve-codebase-architecture`, Workflow 4 / 5b) suggests zooming out before continuing.
20
+
21
+ ## When to skip
22
+
23
+ - User already has the map and just needs to read code.
24
+ - Single-file scope — there's nothing to zoom out from.
25
+ - User is asking for a code review or a specific implementation question; zoom-out is *prelude*, not the answer.
26
+
27
+ ## Process
28
+
29
+ ### 1. Identify the area
30
+
31
+ Ask the user (or infer from context) what they're trying to do. The map is scoped to *the area relevant to that task*, not the whole codebase.
32
+
33
+ ### 2. Read the project's vocabulary first
34
+
35
+ - [`docs/CONTEXT.md`](../../docs/CONTEXT.md) — the domain glossary. Module names should come from here.
36
+ - [`docs/architecture.md`](../../docs/architecture.md) — the system map, if it exists.
37
+ - Any ADRs in [`docs/adr/`](../../docs/adr/) that constrain the area.
38
+
39
+ If `CONTEXT.md` doesn't exist yet (greenfield repo), name modules by their file paths and flag the missing glossary as a gap.
40
+
41
+ ### 3. Walk the dependency graph for the area
42
+
43
+ Use the Agent tool with `subagent_type=Explore` if the area is broad. Capture:
44
+
45
+ - **Modules involved** — which directories / packages / files implement the responsibility.
46
+ - **Callers** — what calls into this area, from where.
47
+ - **Callees** — what this area calls into.
48
+ - **Seams** — function calls, queue events, HTTP boundaries, DB tables shared across the area.
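For in-process dependencies, callers and callees can be derived from an import graph once the area is fixed. The sketch below assumes such a graph is already available as a dict; `callers_and_callees` and the paths in the usage note are hypothetical. Imports are only a proxy for calls, and queue, HTTP, and shared-table seams will not appear in them.

```python
def callers_and_callees(imports, area):
    """Find the edges that cross the boundary of the area being mapped.

    `imports` maps each module path to the set of module paths it imports.
    `area` is the set of module paths that implement the responsibility.
    """
    callers = set()
    callees = set()
    for module, deps in imports.items():
        if module in area:
            # Anything the area imports from outside itself is a callee.
            callees.update(dep for dep in deps if dep not in area)
        elif any(dep in area for dep in deps):
            # An outside module that imports into the area is a caller.
            callers.add(module)
    return sorted(callers), sorted(callees)
```

With `area = {"orders/intake.py", "orders/pricing.py"}`, a module such as `api/http.py` that imports `orders/intake.py` is reported as a caller, while modules that never touch the area are left off the map.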
49
+
50
+ ### 4. Produce the map
51
+
52
+ A short artifact, in chat (not on disk unless the user asks). Use the format below — it's what callers expect.
53
+
54
+ ```md
55
+ ## Map: <area>
56
+
57
+ **Responsibility**: <one sentence — what this area does>
58
+
59
+ **Modules**:
60
+ - `<path>` — <one-line responsibility, in CONTEXT.md vocabulary>
61
+ - ...
62
+
63
+ **Callers** (who depends on this):
64
+ - `<path>` — <how it uses the area>
65
+ - ...
66
+
67
+ **Callees** (what this depends on):
68
+ - `<path>` — <what it provides>
69
+ - ...
70
+
71
+ **Seams**:
72
+ - <Module A> ↔ <Module B>: <in-process | queue | HTTP | DB-shared>; <where the adapter lives>
73
+ - ...
74
+
75
+ **Decisions worth knowing**:
76
+ - ADR-NNN: <one line — why this constraint matters here>
77
+ - ...
78
+
79
+ **Gaps spotted while mapping** (optional):
80
+ - <e.g. "shallow module at <path> — possibly worth deepening; not blocking your task">
81
+ ```
82
+
83
+ Keep it on one screen. If it doesn't fit, you over-zoomed — narrow the area.
84
+
85
+ ## Anti-patterns
86
+
87
+ - **Drawing the whole codebase.** The map is scoped to the user's task. A full codebase tour wastes the user's attention.
88
+ - **Listing files without responsibilities.** A list of paths isn't a map. Each entry needs a one-line "what it does", in domain vocabulary.
89
+ - **Inventing names.** If `CONTEXT.md` calls it "Order intake", don't call it "OrderHandler" or "the order service". Use the canonical name.
90
+ - **Persisting the map without being asked.** This skill outputs to chat. If the user wants a durable artifact, that's `system-design` (greenfield) or `improve-codebase-architecture` (brownfield) — different skills with different outputs.
91
+ - **Making decisions during zoom-out.** This is a mapping skill, not a decision skill. Surface gaps; don't propose solutions.
92
+
93
+ ## Pairing with other skills
94
+
95
+ - **`debug`** — runs before debug when the area is unfamiliar. Easier to bisect when you know the topology. ([`debug/SKILL.md`](../debug/SKILL.md) calls this out.)
96
+ - **`improve-codebase-architecture`** — runs before. Map first, then propose deepening candidates.
97
+ - **`feature-doc`** — runs before, when a feature touches an unfamiliar area and the writer needs vocabulary before drafting ACs.
98
+ - **Any workflow** can pause for `zoom-out` and resume — the skill does not change which workflow the user is in.
99
+
100
+ ## Done when
101
+
102
+ - The user has a one-screen map of the area, in `CONTEXT.md` vocabulary.
103
+ - Modules, callers, callees, seams, and load-bearing ADRs are named with citable paths.
104
+ - The user explicitly resumes the original task or asks a specific follow-up — don't keep mapping past their patience.