@event4u/agent-config 1.28.0 → 1.29.0

@@ -0,0 +1,147 @@
1
+ ---
2
+ name: async-python-patterns
3
+ description: "Use when writing Python asyncio code — picking between gather / TaskGroup / wait, structured concurrency, timeouts, cancellation, sync-bridging — decision framework only, cookbook externalized."
4
+ source: package
5
+ status: active
6
+ refresh_trigger: "Python ships a new structured-concurrency primitive (post-TaskGroup), OR ≥30% of cited upstream cookbook examples become deprecated, OR the cited libraries (aiohttp, httpx, anyio, trio) cut a major version with breaking async surface changes."
7
+ sunset_criterion: "When `https://docs.python.org/3/library/asyncio.html` ships an in-tree decision framework AND consumer projects no longer cite this skill in PR reviews for two consecutive review cycles."
8
+ ---
9
+
10
+ # async-python-patterns
11
+
12
+ Decision framework for picking the right Python asyncio primitive. **The pattern cookbook lives upstream** (links in § Provenance) — this skill is the predicate, not the recipe library. Sunset-policy compliant: the 600+ lines of language-specific cookbook stay in authoritative Python docs.
13
+
14
+ ## When to use
15
+
16
+ - Designing a new async I/O-bound service (FastAPI, aiohttp, async DB client).
17
+ - Reviewing a diff that introduces `asyncio.gather`, `asyncio.create_task`, `TaskGroup`, `as_completed`, or `wait_for`.
18
+ - Mixing sync and async code (calling sync libs from async context, or vice versa).
19
+ - Diagnosing event-loop blocking, never-awaited warnings, or cancellation leaks.
20
+
21
+ Do NOT use when:
22
+
23
+ - The work is CPU-bound: async will not help; route to `multiprocessing` or a `ProcessPoolExecutor` (a thread pool does not parallelize CPU-bound Python code on a standard GIL build).
24
+ - The runtime is not Python — read the host runtime's concurrency guide.
25
+ - The fix is a single missing `await` — read the upstream tutorial directly.
26
+
27
+ ## Decision framework
28
+
29
+ ### Step 1 — Verify async is the right tool
30
+
31
+ ```
32
+ Workload is:
33
+ I/O-bound, many concurrent waits → async fits (network, disk, IPC).
34
+ CPU-bound (parsing, math, crypto) → async is wrong; use ProcessPoolExecutor.
35
+ Mixed → async shell + run_in_executor for CPU bursts.
36
+ Single sequential call → don't introduce async; sync is simpler.
37
+ ```
38
+
39
+ ### Step 2 — Pick the concurrency primitive
40
+
41
+ ```
42
+ Run N independent coroutines, ALL must complete:
43
+ Same trust level, exceptions cancel siblings → asyncio.TaskGroup (3.11+; preferred).
44
+ Pre-3.11 OR exceptions must NOT cancel peers → asyncio.gather(*aws, return_exceptions=True).
45
+
46
+ Run N coroutines, react to results as they finish:
47
+ → asyncio.as_completed (yields awaitables in finish order; await each one to get its result).
48
+
49
+ Run N coroutines, race to first success / failure:
50
+ → asyncio.wait(..., return_when=FIRST_COMPLETED) + cancel pending.
51
+
52
+ Schedule fire-and-forget background work:
53
+ → asyncio.create_task + keep a strong reference (else GC eats it).
54
+ Forgetting the reference is the #1 silent-failure source.
55
+
56
+ Bound the wait time:
57
+ → asyncio.wait_for(coro, timeout=...) → raises TimeoutError on expiry.
58
+ → asyncio.timeout(...) context manager (3.11+; preferred when many awaits share a deadline).
59
+
60
+ Bound concurrency (rate-limit, connection pool):
61
+ → asyncio.Semaphore(n); acquire around the awaitable.
62
+ ```
63
+
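As a rough illustration of two rows above (`gather` with `return_exceptions=True`, plus a `Semaphore` to bound concurrency), the sketch below runs several coroutines, lets one fail without disturbing its peers, and then separates results from errors. `fetch`, `run_all`, and the simulated failure are hypothetical names invented for this example; `asyncio.sleep` stands in for real I/O.

```python
import asyncio


async def fetch(i: int, limit: asyncio.Semaphore) -> int:
    # Hypothetical unit of I/O-bound work; the sleep stands in for a network call.
    async with limit:  # at most `max_concurrency` coroutines pass this point at once
        await asyncio.sleep(0.01)
        if i == 2:
            raise ValueError(f"item {i} failed")
        return i * 10


async def run_all(n: int, max_concurrency: int = 2):
    limit = asyncio.Semaphore(max_concurrency)
    # return_exceptions=True: failures come back as values, peers are not disturbed.
    results = await asyncio.gather(
        *(fetch(i, limit) for i in range(n)), return_exceptions=True
    )
    ok = [r for r in results if not isinstance(r, BaseException)]
    failed = [r for r in results if isinstance(r, BaseException)]
    return ok, failed


if __name__ == "__main__":
    print(asyncio.run(run_all(4)))
```

`gather` returns results in argument order, so the caller can still map each failure back to its input.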
64
+ ### Step 3 — Bridge sync ↔ async correctly
65
+
66
+ ```
67
+ Async code calls a sync, blocking function:
68
+ Short pure-CPU → fine, accept the block (microseconds).
69
+ Long, blocking, or I/O-sync → await loop.run_in_executor(None, fn, *args).
70
+ Library has async sibling → switch the library (httpx vs requests, aiosqlite vs sqlite3).
71
+
72
+ Sync code calls async function:
73
+ Top-level entrypoint → asyncio.run(coro()).
74
+ Inside running loop → never asyncio.run; make the caller async and await the coroutine (or create_task + await it).
75
+ Test suite → pytest-asyncio fixture; never raw run() in tests.
76
+ ```
77
+
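A minimal sketch of the `run_in_executor` bridge from the first table, assuming a hypothetical blocking function `blocking_read`. The `heartbeat` task exists only to show that the event loop keeps running while the blocking call sits on the default thread pool.

```python
import asyncio
import contextlib
import time


def blocking_read(delay: float) -> str:
    # Hypothetical sync function; time.sleep stands in for blocking I/O.
    time.sleep(delay)
    return "done"


async def main():
    loop = asyncio.get_running_loop()
    ticks = 0

    async def heartbeat() -> None:
        # Keeps counting only while the event loop stays responsive.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    hb = asyncio.create_task(heartbeat())
    # The blocking call runs on the default thread pool, not on the loop thread.
    result = await loop.run_in_executor(None, blocking_read, 0.1)
    hb.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await hb
    return result, ticks


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Calling `blocking_read(0.1)` directly inside `main` would leave `ticks` at zero, since the heartbeat could not run during the block.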
78
+ ### Step 4 — Cancellation discipline
79
+
80
+ Every long-running coroutine MUST be cancellation-safe:
81
+
82
+ - Catch `asyncio.CancelledError`, perform cleanup, **re-raise**. Swallowing it silently breaks the propagation chain.
83
+ - Use `try / finally` (or `async with`) around resource acquisition so cancellation cannot leak file handles, DB connections, locks.
84
+ - A detached `create_task` without a strong reference can be garbage-collected mid-execution, because the event loop keeps only weak references to tasks; either store the task or use a TaskGroup.
85
+
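The rules above might look like this in practice. This is a sketch under assumptions: `spawn`, `worker`, and the `log` list are hypothetical names, and the module-level set is just one way to hold a strong reference.

```python
import asyncio

# Strong references to background tasks; the loop itself only holds weak ones.
background_tasks = set()


def spawn(coro) -> asyncio.Task:
    task = asyncio.create_task(coro)
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)  # drop the reference when done
    return task


async def worker(log: list) -> None:
    log.append("acquired")  # stands in for acquiring a resource
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError:
        log.append("cancelled")  # cancellation-specific cleanup
        raise  # re-raise so the cancellation keeps propagating
    finally:
        log.append("released")  # runs on success, error, and cancellation


async def demo() -> list:
    log = []
    task = spawn(worker(log))
    await asyncio.sleep(0)  # yield once so the worker starts
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass  # expected: we cancelled the task ourselves
    return log


if __name__ == "__main__":
    print(asyncio.run(demo()))
```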
86
+ ### Step 5 — Don't block the event loop
87
+
88
+ A single blocking call (sync I/O, time.sleep, CPU-heavy parse, large JSON load) freezes every coroutine. Audit every leaf function under `async def`:
89
+
90
+ - Sleep → `await asyncio.sleep`, never `time.sleep`.
91
+ - HTTP → `httpx.AsyncClient` / `aiohttp`, never `requests`.
92
+ - DB → `asyncpg` / `aiosqlite` / `motor`, never the sync driver.
93
+ - File → `aiofiles` for hot-paths, or `run_in_executor` for one-shots.
94
+
95
+ ## Procedure: Apply to a new async feature
96
+
97
+ 1. Run Step 1; reject if work is CPU-bound.
98
+ 2. Sketch the call graph; tag each `await` site with its primitive (Step 2).
99
+ 3. Mark every sync↔async boundary; pick the bridge per Step 3.
100
+ 4. For each long-running coroutine, write the cancel-safety contract (Step 4).
101
+ 5. Grep the leaf calls for blocking sins (Step 5); replace or push to executor.
102
+ 6. Hand the sketch to a reviewer **before** coding; cite this skill.
103
+
104
+ ## Output format
105
+
106
+ 1. Call-graph table: coroutine · concurrency primitive · timeout · cancel-safety note.
107
+ 2. Sync↔async boundary list: site · bridge · justification.
108
+ 3. Blocking-call audit: leaf function · status (async / executor / accepted-block + reason).
109
+ 4. Cancel-safety contract for each background task.
110
+
111
+ ## Gotcha
112
+
113
+ - "It works in my REPL" — `asyncio.run` inside an already-running loop (Jupyter, FastAPI startup) raises `RuntimeError`. Use `await` directly or `nest_asyncio` (last resort).
114
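+ - With the default `return_exceptions=False`, `asyncio.gather` propagates only the first exception; the other awaitables keep running and any later exception is never surfaced to the caller. Use `return_exceptions=True` and inspect the results, or use `TaskGroup` (cancels the remaining tasks on the first error and raises an `ExceptionGroup`).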
115
+ - `create_task` results that nobody awaits look fine until the program exits and Python prints `Task was destroyed but it is pending!`. Always `await` or use a TaskGroup.
116
+ - `wait_for` on a non-cancellation-safe coroutine leaks resources; the timeout cancels the task but cleanup never runs.
117
+ - Libraries that "support async" via thread pools (e.g. `requests-async`) often re-block the loop under load; verify with the cited upstream library docs, not the README.
118
+
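To make the `wait_for` gotcha concrete, here is a small sketch of a coroutine that stays safe under a timeout because its cleanup lives in `finally`. `slow_query`, `bounded`, and the `events` list are hypothetical; `asyncio.TimeoutError` is caught because it works on both older and newer Python versions.

```python
import asyncio


async def slow_query(events: list) -> str:
    events.append("open")  # stands in for opening a connection
    try:
        await asyncio.sleep(10)
        return "rows"
    finally:
        events.append("close")  # runs even when wait_for cancels this coroutine


async def bounded(events: list):
    try:
        return await asyncio.wait_for(slow_query(events), timeout=0.05)
    except asyncio.TimeoutError:
        return None  # the caller decides how to degrade


if __name__ == "__main__":
    seen = []
    print(asyncio.run(bounded(seen)), seen)
```

Without the `try / finally`, the `"close"` step would simply never run when the timeout fires.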
119
+ ## Do NOT
120
+
121
+ - Do NOT call `asyncio.run` from a running loop.
122
+ - Do NOT swallow `CancelledError` without re-raising.
123
+ - Do NOT call sync blocking I/O from async paths without `run_in_executor`.
124
+ - Do NOT spawn `create_task` without storing the reference (or using TaskGroup).
125
+ - Do NOT inline the asyncio cookbook into this skill — externalize per Sunset Policy.
126
+
127
+ ## Auto-trigger keywords
128
+
129
+ - asyncio
130
+ - async / await
131
+ - gather / TaskGroup / wait_for
132
+ - event loop blocking
133
+ - cancellation
134
+ - sync to async bridge
135
+
136
+ ## Provenance
137
+
138
+ - Adopted from: `Microck/ordinary-claude-skills@8f5c83174f7aa683b4ddc7433150471983b93131:skills_all/async-python-patterns/SKILL.md` (MIT, © 2025 Microck) — **Sunset Policy applied**: 694-line cookbook source reduced to a ~140-line decision framework; pattern catalogues externalized to upstream docs below.
139
+ - Externalized cookbook:
140
+ - asyncio core: https://docs.python.org/3/library/asyncio.html · https://docs.python.org/3/library/asyncio-task.html
141
+ - TaskGroup (3.11+): https://docs.python.org/3/library/asyncio-task.html#task-groups
142
+ - Structured concurrency: https://anyio.readthedocs.io · https://trio.readthedocs.io
143
+ - Async HTTP: https://www.python-httpx.org/async/ · https://docs.aiohttp.org/en/stable/
144
+ - Async DB: https://magicstack.github.io/asyncpg/ · https://aiosqlite.omnilib.dev/
145
+ - Cross-linked: [`error-handling-patterns`](../error-handling-patterns/SKILL.md), [`mcp-builder`](../mcp-builder/SKILL.md), [`api-design`](../api-design/SKILL.md), [`performance`](../performance/SKILL.md).
146
+ - Provenance registry: `agents/contexts/skills-provenance.yml` (entry: `async-python-patterns`).
147
+ - Iron-Law floor: `verify-before-complete`, `skill-quality`, `non-destructive-by-default`.
@@ -0,0 +1,152 @@
1
+ ---
2
+ name: defense-in-depth
3
+ description: "Use when validation needs entry, business-logic, environment, and instrumentation guards so a bad value cannot reach the failure point — turns a local bug fix into a structural one."
4
+ source: package
5
+ ---
6
+
7
+ # defense-in-depth
8
+
9
+ Validate at every layer the value passes through. Fixing the bug at one layer is locally sufficient and globally fragile — the next refactor, code path, mock, or platform edge case will rediscover it. Four-layer validation makes the bug *structurally* impossible.
10
+
11
+ ## When to use
12
+
13
+ - Bug fix where invalid data caused failure several frames deep.
14
+ - New entry point that funnels external input into existing internals.
15
+ - Refactor that adds a second caller to a previously single-caller routine.
16
+ - Test setup that shortcuts production guards (mocks bypassing entry validation).
17
+
18
+ Do NOT use when:
19
+
20
+ - Pure formatting / style change — no data flow, no layers to defend.
21
+ - Boundary validation alone is correct (e.g. immutable value object with constructor invariant) — route to [`laravel-validation`](../laravel-validation/SKILL.md).
22
+ - The fix belongs at a single architectural seam — adding three more guards is over-engineering. Use the gate function below to stop early.
23
+
24
+ ## Procedure: Apply the four-layer pattern
25
+
26
+ ### Step 0: Analyze the data flow before adding guards
27
+
28
+ 1. Identify where the bad value originates (test fixture, request body, env var, config).
29
+ 2. List every function that receives the value before the failure point.
30
+ 3. Mark which functions are reachable from production paths and which only from tests.
31
+
32
+ ### Step 1: Layer 1 — Entry-point validation
33
+
34
+ Reject obviously invalid input at the API / route / command boundary. In Laravel this is FormRequest rules; in pure PHP services it is the public method on the service.
35
+
36
+ ```php
37
+ public function createProject(string $name, string $workingDirectory): Project
38
+ {
39
+ if (trim($workingDirectory) === '') {
40
+ throw new InvalidArgumentException('workingDirectory cannot be empty');
41
+ }
42
+ if (! is_dir($workingDirectory)) {
43
+ throw new InvalidArgumentException("workingDirectory does not exist: {$workingDirectory}");
44
+ }
45
+ if (! is_writable($workingDirectory)) {
46
+ throw new InvalidArgumentException("workingDirectory is not writable: {$workingDirectory}");
47
+ }
48
+ // ... proceed
49
+ }
50
+ ```
51
+
52
+ ### Step 2: Layer 2 — Business-logic validation
53
+
54
+ Verify the value still makes sense for the operation that consumes it. Different code paths can reach the same internal — re-check rather than trust the caller.
55
+
56
+ ```php
57
+ public function initializeWorkspace(string $projectDir, string $sessionId): Workspace
58
+ {
59
+ if ($projectDir === '') {
60
+ throw new RuntimeException('projectDir required for workspace initialization');
61
+ }
62
+ // ... proceed
63
+ }
64
+ ```
65
+
66
+ ### Step 3: Layer 3 — Environment guards
67
+
68
+ Refuse dangerous operations in the wrong context — most often: running a destructive command outside a test temp dir while the test suite is active.
69
+
70
+ ```php
71
+ public function gitInit(string $directory): void
72
+ {
73
+ if (app()->environment('testing')) {
74
+ $normalized = realpath($directory) ?: $directory;
75
+ $tmp = realpath(sys_get_temp_dir());
76
+
77
+ if ($tmp === false || ! str_starts_with($normalized . DIRECTORY_SEPARATOR, $tmp . DIRECTORY_SEPARATOR)) {
78
+ throw new RuntimeException("refusing git init outside tmp during tests: {$directory}");
79
+ }
80
+ }
81
+ // ... proceed
82
+ }
83
+ ```
84
+
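One subtlety in any "is this path inside the temp dir" guard: a bare string-prefix comparison also accepts sibling paths such as `/tmp-evil` when the root is `/tmp`. The Python sketch below (not the skill's PHP; `is_inside` is a hypothetical helper) shows a containment check based on path components.

```python
import os
import tempfile


def is_inside(directory: str, root: str) -> bool:
    # Resolve symlinks and ".." first, then compare whole path components.
    root_real = os.path.realpath(root)
    dir_real = os.path.realpath(directory)
    try:
        return os.path.commonpath([root_real, dir_real]) == root_real
    except ValueError:
        return False  # e.g. paths on different drives on Windows


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    print(is_inside(os.path.join(root, "repo"), root))  # inside the root
    print(is_inside(root + "-evil", root))              # sibling sharing the prefix
```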
85
+ ### Step 4: Layer 4 — Debug instrumentation
86
+
87
+ Capture context for forensics so the next failure surfaces *why*, not just *that*. Log only when the call is about to hit an irreversible side effect.
88
+
89
+ ```php
90
+ public function gitInit(string $directory): void
91
+ {
92
+ Log::debug('about to git init', [
93
+ 'directory' => $directory,
94
+ 'cwd' => getcwd(),
95
+ 'trace' => (new Exception)->getTraceAsString(),
96
+ ]);
97
+ // ... proceed
98
+ }
99
+ ```
100
+
101
+ ### Step 5: Verify each layer in isolation
102
+
103
+ Try to bypass Layer 1 (call the internal directly) and confirm Layer 2 catches it. Mock the production guard and confirm Layer 3 still refuses. The pattern only earns its name when each layer is independently provable.
104
+
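A bypass test of this kind could be sketched as follows, using a Python analogue of the first two guards. The function names mirror the PHP examples but are hypothetical, and `error_type` is a helper invented for the demonstration.

```python
import os


def create_project(name: str, working_directory: str) -> str:
    # Layer 1 (entry point): reject obviously invalid input.
    if not working_directory.strip():
        raise ValueError("working_directory cannot be empty")
    if not os.path.isdir(working_directory):
        raise ValueError(f"working_directory does not exist: {working_directory}")
    return initialize_workspace(working_directory)


def initialize_workspace(project_dir: str) -> str:
    # Layer 2 (business logic): re-check instead of trusting the caller.
    if project_dir == "":
        raise RuntimeError("project_dir required for workspace initialization")
    return os.path.abspath(project_dir)


def error_type(fn, *args) -> str:
    # Returns the name of the exception class raised, or "" if none was raised.
    try:
        fn(*args)
    except Exception as exc:
        return type(exc).__name__
    return ""


if __name__ == "__main__":
    print(error_type(create_project, "demo", "  "))  # Layer 1 rejects
    print(error_type(initialize_workspace, ""))      # Layer 1 bypassed, Layer 2 rejects
```

The two layers raise different exception types, so a test can tell which guard fired.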
105
+ ## Gate function — when to stop adding layers
106
+
107
+ ```
108
+ BEFORE adding the 5th guard:
109
+ STOP — re-check the data flow.
110
+
111
+ IF the value crosses ≤ 1 module boundary:
112
+ Use a single boundary check + a value-object invariant. Two layers max.
113
+
114
+ IF every layer would re-implement the same predicate:
115
+ Hoist the predicate into a value object / type and inject. One check is enough.
116
+
117
+ Layers are for distinct concerns: input shape vs operation invariant
118
+ vs environment risk vs forensic visibility. Same concern repeated is duplication, not depth.
119
+ ```
120
+
121
+ ## Output format
122
+
123
+ 1. The four guards (or a documented subset, with the gate-function justification).
124
+ 2. Tests that bypass each layer to prove the next layer catches the failure.
125
+ 3. One-line note on the data flow that motivated the layering.
126
+
127
+ ## Gotcha
128
+
129
+ - Layers 1 and 2 must reject with **distinct** errors — same error string makes the second guard look like a duplicate.
130
+ - Layer 3 environment checks should fail closed: unknown environment treated as production.
131
+ - Layer 4 instrumentation must not change behavior — no early returns, no mutated state.
132
+ - Test bypasses (in-process mocks) often skip Layer 1 — Layer 2 catches them; do not weaken Layer 2 to silence the test.
133
+
134
+ ## Do NOT
135
+
136
+ - Do NOT replicate Layer 1 inside private methods that only Layer 1 can reach.
137
+ - Do NOT log secrets in Layer 4 — sanitize before `Log::debug`.
138
+ - Do NOT use Layer 3 to gate business logic — environments change, business rules do not.
139
+ - Do NOT add a layer without a failing test that proves the layer was needed.
140
+
141
+ ## Auto-trigger keywords
142
+
143
+ - defense in depth
144
+ - multiple validation layers
145
+ - bug deep in execution
146
+ - structurally impossible
147
+
148
+ ## Provenance
149
+
150
+ - Adopted from: `Microck/ordinary-claude-skills@8f5c83174f7aa683b4ddc7433150471983b93131:skills_all/defense-in-depth/SKILL.md` (MIT, © 2025 Microck).
151
+ - Provenance registry: `agents/contexts/skills-provenance.yml` (entry: `defense-in-depth`).
152
+ - Iron-Law floor: `non-destructive-by-default`, `verify-before-complete`, `skill-quality`.
@@ -0,0 +1,134 @@
1
+ ---
2
+ name: error-handling-patterns
3
+ description: "Use when picking a failure-reporting strategy — exceptions vs Result types, recoverable vs not, retry / circuit-breaker / graceful degradation — decision framework only, catalogues externalized."
4
+ source: package
5
+ status: active
6
+ refresh_trigger: "≥30% of cited upstream pattern catalogues become deprecated, OR a new top-2 ecosystem (Python/JS/PHP/Go/Rust) ships a paradigm-shifting standard error model"
7
+ sunset_criterion: "When the upstream framework docs (Laravel, FastAPI, Express, Axum, Effect-TS) all carry an equivalent in-tree decision framework AND consumer projects no longer cite this skill in PR reviews for two consecutive review cycles."
8
+ ---
9
+
10
+ # error-handling-patterns
11
+
12
+ Decision framework for picking an error-handling strategy. **Catalogues of language-specific code live upstream** (links in § Provenance) — this skill is the predicate, not the pattern library. Sunset-policy compliant: large language-specific catalogues stay in authoritative upstream docs.
13
+
14
+ ## When to use
15
+
16
+ - Designing how a new feature, API, or service reports failure.
17
+ - Reviewing a diff that introduces a new exception class, `Result<T, E>`, or sentinel return.
18
+ - Debugging production noise that traces back to inconsistent error semantics.
19
+ - Choosing between retry, circuit-breaker, fallback, and fail-fast for an external dependency.
20
+
21
+ Do NOT use when:
22
+
23
+ - You only need the syntax for a `try/catch` in language X — read the upstream language guide directly.
24
+ - The failure is a single-call Laravel validation error — route to [`laravel-validation`](../laravel-validation/SKILL.md).
25
+ - The fix is a one-line null check in existing code — route to [`bug-analyzer`](../bug-analyzer/SKILL.md).
26
+
27
+ ## Decision framework
28
+
29
+ ### Step 1 — Classify the failure
30
+
31
+ ```
32
+ Failure is:
33
+ caller's fault (bad input, missing auth) → reject at boundary, structured error
34
+ expected operational (timeout, 404, rate-limit) → Result-type / typed return; retry-aware
35
+ unexpected operational (DB down, OOM, deadlock) → exception; observability + alert
36
+ programmer bug (null deref, off-by-one) → crash early; do not catch
37
+ ```
38
+
39
+ ### Step 2 — Pick the reporting mechanism
40
+
41
+ ```
42
+ IF failure is an EXPECTED, branchable outcome the caller will route on
43
+ → Result type / tagged union / typed error return.
44
+ Forces the caller to handle it; the type system is the proof.
45
+
46
+ IF failure is UNEXPECTED and most callers cannot do anything useful
47
+ → exception, propagated to a single boundary handler.
48
+ One layer (HTTP, queue, CLI) translates exceptions to user-facing errors.
49
+
50
+ IF failure is UNRECOVERABLE (invariant violated, data corruption)
51
+ → fail loud, fail fast. No catch-and-continue.
52
+ Log structured context, exit / panic / 500.
53
+
54
+ IF the language idiom forces one choice (Go: errors are values; Rust: Result;
55
+ Python/PHP/JS: exceptions)
56
+ → follow the idiom. Inventing a foreign mechanism costs more than the
57
+ correctness it buys.
58
+ ```
59
+
60
+ ### Step 3 — Pick the resilience strategy
61
+
62
+ ```
63
+ External call?
64
+ Idempotent + transient failure mode → retry with exponential backoff + jitter, cap.
65
+ Non-idempotent → no blind retry; require an idempotency key.
66
+ Repeated failure across instances → circuit breaker; open → half-open probe → close.
67
+ Optional functionality → graceful degradation (cached / default / null result).
68
+ Required functionality → propagate; surface to user with a recovery hint.
69
+ ```
70
+
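The first row (retry with exponential backoff, jitter, and a cap) can be sketched roughly as below. `TransientError`, `retry`, and the default numbers are assumptions made for illustration; `sleep` and `rng` are injectable only so the behavior can be tested deterministically.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are safe to retry."""


def retry(fn, attempts=5, base=0.1, cap=2.0, sleep=time.sleep, rng=random.random):
    for attempt in range(attempts):
        try:
            return fn()
        except TransientError:
            if attempt == attempts - 1:
                raise  # attempts exhausted: propagate to the caller
            # Exponential backoff capped at `cap`, scaled by jitter in [0, 1).
            sleep(min(cap, base * 2 ** attempt) * rng())


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_call():
        state["calls"] += 1
        if state["calls"] < 3:
            raise TransientError("try again")
        return "ok"

    print(retry(flaky_call))
```

Only `TransientError` is retried; any other exception propagates immediately, which keeps non-idempotent or unexpected failures out of the retry loop.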
71
+ ### Step 4 — Shape the error payload
72
+
73
+ Every produced error must carry: `code` (stable string), `message` (human-readable), `cause` (chained), `context` (sanitized inputs), `correlation_id` (request / trace).
74
+
75
+ Forbidden: secrets, raw SQL, full stack traces in user-facing surfaces, internal class names leaked through API boundaries.
76
+
77
+ ### Step 5 — Define the boundary
78
+
79
+ Exactly **one** layer translates internal errors to the egress format (HTTP status + body, queue requeue policy, CLI exit code). Any other layer that performs this translation is duplicating the boundary, and that duplication is the bug.
80
+
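Steps 4 and 5 together might be sketched like this: one error type carrying the payload fields, and one function that translates to an HTTP-style response. `AppError`, `to_http`, and `load_user` are hypothetical names, and the status codes are illustrative assumptions.

```python
class AppError(Exception):
    """Hypothetical internal error carrying a stable code and sanitized context."""

    def __init__(self, code, message, *, status=500, context=None):
        super().__init__(message)
        self.code = code
        self.message = message
        self.status = status
        self.context = context or {}


def to_http(err: BaseException, correlation_id: str):
    # The single boundary: the only place internal errors become egress format.
    if isinstance(err, AppError):
        return err.status, {
            "code": err.code,
            "message": err.message,
            "context": err.context,
            "correlation_id": correlation_id,
        }
    # Unknown errors: no internals leak, only a generic payload.
    return 500, {
        "code": "internal_error",
        "message": "Internal error",
        "correlation_id": correlation_id,
    }


def load_user(user_id: str) -> str:
    try:
        return {"1": "ada"}[user_id]
    except KeyError as exc:
        # `from exc` keeps the cause chained for logs, not for the response body.
        raise AppError(
            "user_not_found", "User not found", status=404, context={"user_id": user_id}
        ) from exc


if __name__ == "__main__":
    try:
        load_user("2")
    except AppError as err:
        print(to_http(err, "req-1"))
```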
81
+ ## Procedure: Apply the framework to a new feature
82
+
83
+ 1. List failure modes (each external call, each invariant, each user input class).
84
+ 2. Run Step 1 against each, write the classification next to it.
85
+ 3. Pick reporting mechanism per Step 2; reject combinations the language idiom rejects.
86
+ 4. For each external call, run Step 3 and write down the chosen resilience strategy.
87
+ 5. Sketch the error payload shape (Step 4) and the single boundary (Step 5).
88
+ 6. Hand the sketch to a reviewer **before** coding; cite this skill.
89
+
90
+ ## Output format
91
+
92
+ 1. The failure-mode table (mode · classification · mechanism · resilience strategy).
93
+ 2. The shared error payload definition (code, message, cause, context, correlation_id).
94
+ 3. The single boundary handler (file:line) where internal → egress translation happens.
95
+ 4. The retry / circuit-breaker config (attempts, base, jitter, breaker thresholds), if any.
96
+
97
+ ## Gotcha
98
+
99
+ - "Catch everything, log it, return null" silently destroys signal — every catch must either rethrow, translate, or recover with a written reason.
100
+ - Retries on non-idempotent calls are the second-most-common production incident; insist on idempotency keys before allowing retry.
101
+ - Circuit breakers without a half-open probe never close — they degrade to permanent failure.
102
+ - Mixing Result types and exceptions in the same module is worse than picking the wrong one — pick one per module and stay in it.
103
+ - Upstream pattern catalogues drift; trust the link, not memory. Refresh per `refresh_trigger` above.
104
+
105
+ ## Do NOT
106
+
107
+ - Do NOT introduce a custom error mechanism that fights the language idiom.
108
+ - Do NOT swallow exceptions — every catch has a written purpose.
109
+ - Do NOT leak stack traces, secrets, or internal class names across the boundary.
110
+ - Do NOT retry without backoff + jitter + cap.
111
+ - Do NOT inline language-specific code catalogues into this skill — externalize per Sunset Policy.
112
+
113
+ ## Auto-trigger keywords
114
+
115
+ - error handling strategy
116
+ - exceptions vs result
117
+ - retry pattern
118
+ - circuit breaker
119
+ - graceful degradation
120
+ - error payload shape
121
+
122
+ ## Provenance
123
+
124
+ - Adopted from: `Microck/ordinary-claude-skills@8f5c83174f7aa683b4ddc7433150471983b93131:skills_all/error-handling-patterns/SKILL.md` (MIT, © 2025 Microck) — **Sunset Policy applied**: 636-line source reduced to a ~150-line decision framework; language catalogues externalized to the upstream resources below.
125
+ - Externalized catalogues:
126
+ - Python: https://docs.python.org/3/tutorial/errors.html · https://docs.python.org/3/library/exceptions.html
127
+ - PHP / Laravel: https://laravel.com/docs/errors · https://www.php.net/manual/en/language.exceptions.php
128
+ - JS / TS: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Control_flow_and_error_handling · https://www.typescriptlang.org/docs/handbook/2/narrowing.html
129
+ - Go: https://go.dev/blog/error-handling-and-go · https://pkg.go.dev/errors
130
+ - Rust: https://doc.rust-lang.org/book/ch09-00-error-handling.html
131
+ - Resilience patterns: https://martinfowler.com/bliki/CircuitBreaker.html · https://aws.amazon.com/builders-library/timeouts-retries-and-backoff-with-jitter/
132
+ - Cross-linked: [`defense-in-depth`](../defense-in-depth/SKILL.md), [`laravel-validation`](../laravel-validation/SKILL.md), [`bug-analyzer`](../bug-analyzer/SKILL.md), [`api-design`](../api-design/SKILL.md).
133
+ - Provenance registry: `agents/contexts/skills-provenance.yml` (entry: `error-handling-patterns`).
134
+ - Iron-Law floor: `verify-before-complete`, `skill-quality`, `non-destructive-by-default`.
@@ -0,0 +1,108 @@
1
+ ---
2
+ name: mcp-builder
3
+ description: "Use when building an MCP server in Python (FastMCP) or Node/TypeScript (MCP SDK) — agent-centric tool design, input schemas, error handling, and the 10-question evaluation harness."
4
+ source: package
5
+ ---
6
+
7
+ # mcp-builder
8
+
9
+ Author MCP servers that LLMs can drive end-to-end. The quality bar is *can the agent finish the workflow*, not *does the endpoint return 200*. This skill is the **server-author** counterpart to the existing [`mcp`](../mcp/SKILL.md) consumer skill.
10
+
11
+ ## When to use
12
+
13
+ - Wrapping an external API or service as MCP tools for an LLM client.
14
+ - Adding tools to an existing MCP server (Python FastMCP or TypeScript SDK).
15
+ - Reviewing an MCP server before shipping — Phase 4 evaluation gate below.
16
+
17
+ Do NOT use when:
18
+
19
+ - You only need to *call* an MCP server — route to [`mcp`](../mcp/SKILL.md).
20
+ - The integration belongs in the host process — write a regular service, not an MCP server.
21
+ - The "server" wraps one endpoint with no workflow — a CLI wrapper is enough.
22
+
23
+ ## Procedure: Four phases, one tool at a time
24
+
25
+ ### Phase 1 — Research & plan
26
+
27
+ 1. **Agent-centric design**. Tools encode *workflows*, not raw endpoints. Consolidate (`schedule_event` checks availability **and** creates the event). Default to human-readable names over IDs. Errors are educational, not just diagnostic ("retry with `filter='active_only'` to reduce results").
28
+ 2. **Load the protocol**. Fetch `https://modelcontextprotocol.io/llms-full.txt` once into context — the canonical spec.
29
+ 3. **Load the SDK README** for the chosen language:
30
+ - Python: `https://raw.githubusercontent.com/modelcontextprotocol/python-sdk/main/README.md`
31
+ - TypeScript: `https://raw.githubusercontent.com/modelcontextprotocol/typescript-sdk/main/README.md`
32
+ 4. **Read the target service's API docs in full** — auth, rate limits, pagination, error codes, schemas. Skipping this produces incomplete mocks (see [`testing-anti-patterns`](../testing-anti-patterns/SKILL.md) § Anti-Pattern 4).
33
+ 5. **Write the plan**: tool list with priority, shared utilities (request helper, pagination, formatter), input/output schemas, error strategy, response-detail levels (concise vs detailed), character limits (default 25 000 tokens).
34
+
35
+ ### Phase 2 — Implement
36
+
37
+ 1. **Project layout**. Python: single `.py` or modular package; Pydantic v2 with `model_config`. TypeScript: standard `package.json` + `tsconfig.json` strict mode; Zod schemas with `.strict()`.
38
+ 2. **Shared utilities first**. API request helper with retry/timeout, error formatter, JSON-vs-Markdown response builder, pagination cursor handling, auth/token cache.
39
+ 3. **Per tool**:
40
+ - Input schema (Pydantic / Zod) with constraints, descriptions, and *examples*.
41
+ - One-line summary + detailed docstring covering purpose, parameters, return shape, when-to-use, when-NOT-to-use, error handling.
42
+ - Tool annotations: `readOnlyHint`, `destructiveHint`, `idempotentHint`, `openWorldHint`.
43
+ - Async/await for all I/O. Honor pagination. Truncate to the character limit and signal truncation in the response.
44
+
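The truncation rule in the last bullet could look roughly like the helper below. This is a plain-Python sketch, not SDK code: `truncate_response` and the `hint` wording are hypothetical, and the limit is counted in characters here as an assumption, since the plan step states the default as 25 000 without a consistent unit.

```python
CHARACTER_LIMIT = 25_000  # assumption: counted in characters, not tokens


def truncate_response(text: str, limit: int = CHARACTER_LIMIT) -> dict:
    if len(text) <= limit:
        return {"text": text, "truncated": False}
    return {
        "text": text[:limit],
        "truncated": True,  # explicit signal so the agent knows data is missing
        "hint": (
            f"Output cut at {limit} of {len(text)} characters; "
            "narrow the query or request the next page."
        ),
    }


if __name__ == "__main__":
    print(truncate_response("a" * 30, limit=10))
```

Returning a hint alongside the flag follows the "errors are educational" idea from Phase 1: the agent learns what to do next, not just that output was cut.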
45
+ ### Phase 3 — Review & test
46
+
47
+ 1. **Code-quality pass**: DRY across tools, shared helpers extracted, consistent response shapes, all external calls have error handling, full type coverage.
48
+ 2. **Build & syntax**:
49
+ - Python: `python -m py_compile server.py`.
50
+ - TypeScript: `npm run build`; verify `dist/index.js`.
51
+ 3. **Run the server safely**. MCP servers block on stdio. Either run inside `tmux` and drive from the harness, or wrap with `timeout 5s python server.py` for a smoke check. Do NOT block your own session by running it in-process.
52
+
53
+ ### Phase 4 — Evaluations (10-question harness)
54
+
55
+ Each evaluation is a question the agent must answer using only the new tools.
56
+
57
+ Requirements per question — **independent**, **read-only**, **complex** (multiple tool calls), **realistic**, **verifiable** (string-comparable answer), **stable** (answer does not drift over time).
58
+
59
+ ```xml
60
+ <evaluation>
61
+ <qa_pair>
62
+ <question>...</question>
63
+ <answer>...</answer>
64
+ </qa_pair>
65
+ <!-- 9 more -->
66
+ </evaluation>
67
+ ```
68
+
69
+ Process: enumerate the tools, explore READ-ONLY data, draft 10 questions, **solve each yourself first** to confirm the answer is reachable and stable.
70
+
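A minimal scoring loop over an evaluation file in the XML shape above might look like this. It is a standard-library sketch; `load_qa_pairs`, `score`, and the `answer_fn` callback (which would wrap the agent under test) are hypothetical names.

```python
import xml.etree.ElementTree as ET


def load_qa_pairs(xml_text: str) -> list:
    # Extract (question, answer) tuples from every <qa_pair> element.
    root = ET.fromstring(xml_text)
    return [
        ((qa.findtext("question") or "").strip(), (qa.findtext("answer") or "").strip())
        for qa in root.iter("qa_pair")
    ]


def score(pairs: list, answer_fn) -> float:
    # String comparison only, matching the "verifiable" requirement.
    if not pairs:
        return 0.0
    correct = sum(1 for question, answer in pairs if answer_fn(question).strip() == answer)
    return correct / len(pairs)


if __name__ == "__main__":
    sample = (
        "<evaluation>"
        "<qa_pair><question>Q1</question><answer>A1</answer></qa_pair>"
        "</evaluation>"
    )
    print(score(load_qa_pairs(sample), lambda q: "A1"))
```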
71
+ ## Output format
72
+
73
+ 1. The server source plus the 10-question evaluation XML.
74
+ 2. A README with: install, env vars, transport mode (stdio / sse / http), example tool call.
75
+ 3. A line in `agents/contexts/skills-provenance.yml` if the server was forked from an upstream, or a note that it was authored from scratch.
76
+
77
+ ## Gotcha
78
+
79
+ - "Wrap every endpoint" is the failure mode — agents cannot orchestrate 60 thin tools as well as 12 workflow tools.
80
+ - Returning the full upstream payload blows the agent's context. Default to a *concise* shape with an opt-in *detailed* mode.
81
+ - Pydantic / Zod descriptions are the *only* documentation the LLM sees at runtime — write them like usage docs, not comments.
82
+ - A server that hangs your session usually means stdio transport ran in the main process — move it under `tmux` or use a `timeout`.
83
+ - Inflated token claims are not credible without an evaluation harness — Phase 4 is the validation gate, not optional.
84
+
85
+ ## Do NOT
86
+
87
+ - Do NOT mirror REST routes 1:1.
88
+ - Do NOT use `any` (TypeScript) or untyped `dict` (Python) in tool I/O.
89
+ - Do NOT skip the 10-question evaluation — Phase 4 IS the quality bar.
90
+ - Do NOT run the MCP server in your main process during testing — it will block.
91
+ - Do NOT log tokens, API keys, or full request bodies — sanitize before logging.
92
+
93
+ ## Auto-trigger keywords
94
+
95
+ - mcp server
96
+ - model context protocol
97
+ - fastmcp
98
+ - mcp builder
99
+ - agent-centric tools
100
+
101
+ ## Provenance
102
+
103
+ - Upstream protocol: https://modelcontextprotocol.io
104
+ - Upstream SDKs: https://github.com/modelcontextprotocol/python-sdk · https://github.com/modelcontextprotocol/typescript-sdk
105
+ - Adopted from: `Microck/ordinary-claude-skills@8f5c83174f7aa683b4ddc7433150471983b93131:skills_all/mcp-builder/SKILL.md` (MIT, © 2025 Microck) — external `./reference/*.md` file links replaced with inline guidance + upstream URLs.
106
+ - Cross-linked: [`mcp`](../mcp/SKILL.md), [`testing-anti-patterns`](../testing-anti-patterns/SKILL.md), [`api-design`](../api-design/SKILL.md).
107
+ - Provenance registry: `agents/contexts/skills-provenance.yml` (entry: `mcp-builder`).
108
+ - Iron-Law floor: `verify-before-complete`, `tool-safety`, `skill-quality`.