npm - @event4u/agent-config - Versions diffs - 1.27.0 → 1.29.0 - Mend

@event4u/agent-config 1.27.0 → 1.29.0

Files changed (34)

package/.agent-src/commands/research.md +142 -0
package/.agent-src/contexts/contracts/frugality-charter.md +4 -3
package/.agent-src/contexts/contracts/research-schema.md +117 -0
package/.agent-src/rules/domain-adoption-policy.md +1 -1
package/.agent-src/rules/no-roadmap-references.md +1 -1
package/.agent-src/rules/no-unsolicited-rebase.md +1 -1
package/.agent-src/rules/scope-control.md +6 -8
package/.agent-src/skills/async-python-patterns/SKILL.md +147 -0
package/.agent-src/skills/deep-reading-analyst/SKILL.md +192 -0
package/.agent-src/skills/defense-in-depth/SKILL.md +152 -0
package/.agent-src/skills/error-handling-patterns/SKILL.md +134 -0
package/.agent-src/skills/mcp-builder/SKILL.md +108 -0
package/.agent-src/skills/prompt-engineering-patterns/SKILL.md +145 -0
package/.agent-src/skills/repomix/SKILL.md +135 -0
package/.agent-src/skills/roadmap-writing/SKILL.md +3 -3
package/.agent-src/skills/secrets-management/SKILL.md +142 -0
package/.agent-src/skills/testing-anti-patterns/SKILL.md +145 -0
package/.agent-src/templates/agent-settings.md +1 -1
package/.claude-plugin/marketplace.json +11 -1
package/CHANGELOG.md +57 -0
package/README.md +3 -3
package/docs/architecture.md +3 -3
package/docs/catalog.md +20 -7
package/docs/contracts/command-clusters.md +1 -0
package/docs/contracts/file-ownership-matrix.json +1644 -165
package/docs/contracts/package-self-orientation.md +1 -1
package/docs/decisions/ADR-004-rule-governance-pruning.md +3 -3
package/docs/getting-started.md +1 -1
package/docs/guidelines/agent-infra/inversion-thinking.md +388 -0
package/docs/guidelines/agent-infra/mcp-request-signing.md +11 -14
package/docs/guidelines/agent-infra/mental-models.md +314 -0
package/docs/guidelines/agent-infra/scqa-framework.md +526 -0
package/package.json +1 -1
package/scripts/schemas/skill.schema.json +15 -0

package/.agent-src/skills/deep-reading-analyst/SKILL.md ADDED

@@ -0,0 +1,192 @@
+---
+name: deep-reading-analyst
+description: "Deep analysis of articles/long-form via thinking frameworks (SCQA, mental models, inversion) — 'analyze article', 'deep dive', 'extract insights', URL/text wanting depth not summary."
+status: active
+source: package
+external_source: "https://github.com/ginobefun/deep-reading-analyst-skill/tree/26cd7dc9920e025d39751e396e707399022e49ef/src/deep-reading-analyst"
+refresh_trigger: "Upstream `ginobefun/deep-reading-analyst-skill` major rewrite (new framework added, dispatch table reshaped, or SHA pin invalidated by reference rename)."
+sunset_criterion: "Replace with a 50-line pointer skill if (a) all referenced modules are adopted as project-local guidelines (`docs/guidelines/agent-infra/{framework}.md`) AND (b) the dispatch logic moves into a project-native router."
+---
+> **Pinned upstream:** `ginobefun/deep-reading-analyst-skill` @ SHA `26cd7dc9` (MIT). Re-verify after any major upstream rewrite. Reference modules below link to the same SHA.
+# deep-reading-analyst
+Wing-1 deep-thinking skill for articles, papers, opinion pieces, case studies, and long-form decision documents. Routes the user's content through 8 thinking frameworks at four depth levels (Quick / Standard / Deep / Research) and returns insight tied to the user's **goal**, not framework completion.
+## When to use
+- User pastes an article URL, paper, or long text and wants depth ("analyze", "deep dive", "extract insights", "help me understand").
+- User asks for a specific framework ("apply SCQA to this", "use inversion thinking", "give me the mental models lens").
+- User is making a decision and wants pre-mortem / multi-lens analysis on a written proposal.
+- User is studying or note-taking on dense material (research papers, strategy memos, books).
+Do NOT use when:
+- User wants a 3-bullet TL;DR — use `agent-docs-writing` or write a direct summary.
+- Content is code or a diff — route to `judge-bug-hunter`, `judge-code-quality`, or `adversarial-review`.
+- User wants risk analysis on a code change — route to `adversarial-review` (diff-bound) or `threat-modeling`.
+- User wants debugging or incident analysis — route to `bug-analyzer` or `systematic-debugging`.
+## Framework Arsenal
+| Depth | Time | Frameworks | Reference module |
+|---|---|---|---|
+| **L1 — Quick** | ~15 min | SCQA, 5W2H | [`scqa-framework`](../../../docs/guidelines/agent-infra/scqa-framework.md), `5w2h_analysis.md` (upstream) |
+| **L2 — Standard** | ~30 min | L1 + Critical Thinking, Inversion | + [`inversion-thinking`](../../../docs/guidelines/agent-infra/inversion-thinking.md), `critical_thinking.md` (upstream) |
+| **L3 — Deep** | ~60 min | L2 + Mental Models, First Principles, Systems Thinking, Six Hats | + [`mental-models`](../../../docs/guidelines/agent-infra/mental-models.md), `first_principles.md`, `systems_thinking.md`, `six_hats.md` (upstream) |
+| **L4 — Research** | 120 min+ | L3 + Cross-source comparison via web search | + `comparison_matrix.md` (upstream) |
+Modules tagged `(upstream)` link to SHA-pinned files at the URL in `external_source` above; project-local modules are adopted as guidelines.
+## Procedure: deep-reading-analyst
+### Step 0: Inspect
+1. **Detect content type** — article, paper, opinion piece, case study, how-to, strategy memo. Drives auto-suggested frameworks (Step 1).
+2. **Detect goal signal** — problem-solving, learning, writing reference, decision-making, curiosity. Drives Step 4 output shape.
+3. **Skip if mismatched** — see "Do NOT use when" above; route to the named skill.
+### Step 1: Initialize Analysis
+Ask the user three things in **one** message, not three turns. The `ask-when-uncertain` Iron Law allows one question per turn, so bundle all three into a single numbered-options block:
+1. **Goal** — problem-solving · learning · writing · decision-making · curiosity.
+2. **Depth** — L1 Quick (15 min) · L2 Standard (30 min) · L3 Deep (60 min) · L4 Research (120 min+).
+3. **Framework override** — defaults are auto-suggested by content type (table below); user may name specific frameworks.
+If the user does not answer, default to L2 Standard with auto-selected frameworks.
+**Auto-suggest matrix:**
+| Content type | Default frameworks |
+|---|---|
+| Strategy / business article | SCQA + Mental Models + Inversion |
+| Research paper | 5W2H + Critical Thinking + Systems Thinking |
+| How-to guide | SCQA + 5W2H + First Principles |
+| Opinion piece | Critical Thinking + Inversion + Six Hats |
+| Case study | SCQA + Mental Models + Systems Thinking |
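The auto-suggest matrix and the depth levels from the Framework Arsenal table can be combined into a single lookup. The following is a minimal Python sketch, not code shipped with the package: `select_frameworks`, the dictionary names, the content-type keys, and the rule that a suggested framework is dropped when the chosen depth does not include it are all assumptions made for illustration.

```python
# Hypothetical sketch: names and the filtering rule are illustrative assumptions.

# Frameworks unlocked at each depth, cumulative as in the Framework Arsenal table.
DEPTH_FRAMEWORKS = {
    "L1": ["SCQA", "5W2H"],
    "L2": ["SCQA", "5W2H", "Critical Thinking", "Inversion"],
    "L3": ["SCQA", "5W2H", "Critical Thinking", "Inversion",
           "Mental Models", "First Principles", "Systems Thinking", "Six Hats"],
}
# L4 adds cross-source comparison on top of the L3 frameworks.
DEPTH_FRAMEWORKS["L4"] = DEPTH_FRAMEWORKS["L3"] + ["Cross-source comparison"]

# Default frameworks per content type, copied from the auto-suggest matrix.
AUTO_SUGGEST = {
    "strategy": ["SCQA", "Mental Models", "Inversion"],
    "research_paper": ["5W2H", "Critical Thinking", "Systems Thinking"],
    "how_to": ["SCQA", "5W2H", "First Principles"],
    "opinion": ["Critical Thinking", "Inversion", "Six Hats"],
    "case_study": ["SCQA", "Mental Models", "Systems Thinking"],
}


def select_frameworks(content_type, depth="L2", override=None):
    """Return the frameworks to run, keeping only those the depth allows."""
    allowed = DEPTH_FRAMEWORKS[depth]
    requested = override if override else AUTO_SUGGEST[content_type]
    return [name for name in requested if name in allowed]


print(select_frameworks("strategy"))              # default depth is L2
print(select_frameworks("strategy", depth="L3"))
```

Under this assumed rule, a strategy article at the default L2 depth runs SCQA and Inversion only, because Mental Models belongs to L3.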
+### Step 2: Structural Understanding (always run)
+Regardless of depth, open with three short blocks:
+**2A — Basic structure.**
+```markdown
+📄 Type: [article/paper/report]
+🎯 Core thesis: [one sentence]
+Structure:
+├─ Argument 1 → [key support]
+├─ Argument 2 → [key support]
+└─ Argument 3 → [key support]
+Key concepts: [3–5 terms with one-line definitions]
+```
+**2B — SCQA breakdown.** Apply the four-element decomposition from [`scqa-framework`](../../../docs/guidelines/agent-infra/scqa-framework.md):
+```markdown
+**S (Situation)**: [context the article establishes]
+**C (Complication)**: [problem identified]
+**Q (Question)**: [core question]
+**A (Answer)**: [main solution / conclusion]
+Structure quality — clarity / logic / completeness: [★★★☆☆]
+```
+**2C — 5W2H gap check** (L1+). Quick scan: which of *What, Why, Who, When, Where, How, How much* are well-covered, partial, or missing? Flag the 1–2 most critical gaps.
+### Step 3: Apply Frameworks (depth-gated)
+Load only the frameworks the user's depth bought. Each framework follows the same pattern: **load reference module → apply lens → produce one fixed-shape block**.
+**L2 additions:**
+- **Critical Thinking** (`critical_thinking.md`, upstream) — argument strength score (X/10), strengths, weaknesses, logical fallacies detected.
+- **Inversion** ([`inversion-thinking`](../../../docs/guidelines/agent-infra/inversion-thinking.md)) — pre-mortem on the article's recommendation: "how would this fail?", missing risk factors, mitigations.
+**L3 additions:**
+- **Mental Models** ([`mental-models`](../../../docs/guidelines/agent-infra/mental-models.md)) — pick 3–5 models from different disciplines (physics, biology, psychology, economics, math), apply each lens, surface cross-model patterns.
+- **First Principles** (`first_principles.md`, upstream) — strip to fundamental truths, validate each core assumption, rebuild from base.
+- **Systems Thinking** (`systems_thinking.md`, upstream) — map relationships, feedback loops, leverage points.
+- **Six Hats** (`six_hats.md`, upstream) — White (facts), Red (feelings), Black (cautions), Yellow (benefits), Green (creativity), Blue (process).
+**L4 addition:**
+- **Cross-source comparison** (`comparison_matrix.md`, upstream) — web-search 2–3 related sources, compare SCQA across sources, identify consensus vs divergence, synthesize integrated perspective.
+### Step 4: Synthesize by Goal
+Output shape is **driven by Step 0 goal**, not by frameworks applied. Pick exactly one of the four blocks below.
+**For problem-solving:** Applicable solutions (2–3 from content) → Application plan with timed action steps → Success metrics → Risk mitigations from Inversion.
+**For learning:** Core concepts (definition + example) → Mental models gained → Connections to prior knowledge → First Principles fundamental question → 3 verification questions (understanding / application / evaluation).
+**For writing reference:** Key arguments + evidence (with paragraph citations) → Quotable insights with context → Critical analysis (strengths for citing, limitations for balanced discussion) → Alternative perspectives from Mental Models → Gaps and counterfactuals.
+**For decision-making:** Options presented → Multi-model evaluation (economic + risk + systems lens) → Six Hats decision analysis → Scenario analysis (best / worst / most likely) → Synthesized recommendation.
+### Step 5: Knowledge Activation (always end here)
+Regardless of goal, close with three fixed blocks:
+```markdown
+## 🎯 Top 3 takeaways
+1. **[Insight]** — Why it matters: [...] · One action: [specific, time-bound]
+2. **[Insight]** — Why it matters: [...] · One action: [specific, time-bound]
+3. **[Insight]** — Why it matters: [...] · One action: [specific, time-bound]
+## 💡 Quick win — one tiny, specific action for the next 24 hours.
+## 🧭 Frameworks used
+✅ SCQA  ✅ 5W2H  ✅ Critical  ✅ Inversion
+□ Mental Models  □ First Principles  □ Systems  □ Six Hats
+```
+### Step 6: Validate
+1. Every claim is faithful to the source — no misrepresentation, facts distinguished from opinions.
+2. Frameworks applied **purposefully**, not force-fit — drop a framework that adds no insight rather than padding the output.
+3. Output ends with concrete, actionable steps — no analysis-without-application.
+4. Specific citations (paragraph numbers, quotes) where the source supports them.
+## Output format
+1. **Structural block (Step 2)** — type / thesis / structure tree / key concepts / SCQA / 5W2H gaps.
+2. **Framework blocks (Step 3)** — one fixed-shape block per framework the depth bought.
+3. **Goal-shaped synthesis (Step 4)** — problem-solving / learning / writing / decision-making.
+4. **Knowledge activation (Step 5)** — top 3 takeaways · quick win · frameworks-used checkboxes.
+## Gotcha
+- The model tends to **apply every framework** even at L1 — respect the depth budget; skip frameworks the user did not buy.
+- The model tends to **summarize** instead of analyze when the user pastes long text — go deep on 1–3 points, not shallow on all of them.
+- Inversion drifts into adversarial code review — this skill targets **decisions and arguments**, not diffs. Route diff stress-tests to `adversarial-review` / `judge-bug-hunter`.
+- Mental Models drifts into name-dropping — pick 3–5, apply each lens *concretely* to the article's claims, drop models that yield no new insight.
+- L4 cross-source comparison drifts into a literature review — keep it to 2–3 sources, focus on consensus / divergence / unique value.
+- Output without action steps is a failure mode — every Step-4 synthesis must end with timed, concrete actions tied to the user's goal.
+## Do NOT
+- Do NOT force-apply all frameworks at the user's chosen depth — drop ones that add no insight.
+- Do NOT copy text verbatim from the source — always reword for the user's understanding.
+- Do NOT use academic jargon without one-line definitions in the "key concepts" block.
+- Do NOT skip Step 5 — the takeaways + quick win are the load-bearing output, not optional decoration.
+- Do NOT route code reviews, diff stress-tests, or incident debugging through this skill.
+## Reference modules
+Project-local guidelines (full text adopted under the Reference-Guideline Sunset Policy):
+- [`scqa-framework`](../../../docs/guidelines/agent-infra/scqa-framework.md) — full 499-line authoritative-link adopt.
+- [`mental-models`](../../../docs/guidelines/agent-infra/mental-models.md) — pure adopt of Munger's multi-discipline toolkit.
+- [`inversion-thinking`](../../../docs/guidelines/agent-infra/inversion-thinking.md) — pre-mortem on decisions, distinct from `adversarial-review`.
+Upstream modules (loaded on demand from the SHA-pinned URL in `external_source`):
+- `5w2h_analysis.md` — completeness check (7 questions).
+- `critical_thinking.md` — argument quality / fallacy detection.
+- `first_principles.md` — fundamental-truth extraction.
+- `systems_thinking.md` — feedback loops + leverage points.
+- `six_hats.md` — White / Red / Black / Yellow / Green / Blue protocol.
+- `comparison_matrix.md` — cross-source synthesis (L4 only).

package/.agent-src/skills/defense-in-depth/SKILL.md ADDED

@@ -0,0 +1,152 @@
+---
+name: defense-in-depth
+description: "Use when validation needs entry, business-logic, environment, and instrumentation guards so a bad value cannot reach the failure point — turns a local bug fix into a structural one."
+source: package
+---
+# defense-in-depth
+Validate at every layer the value passes through. Fixing the bug at one layer is locally sufficient and globally fragile — the next refactor, code path, mock, or platform edge case will rediscover it. Four-layer validation makes the bug *structurally* impossible.
+## When to use
+- Bug fix where invalid data caused failure several frames deep.
+- New entry point that funnels external input into existing internals.
+- Refactor that adds a second caller to a previously single-caller routine.
+- Test setup that shortcuts production guards (mocks bypassing entry validation).
+Do NOT use when:
+- Pure formatting / style change — no data flow, no layers to defend.
+- Boundary validation alone is correct (e.g. immutable value object with constructor invariant) — route to [`laravel-validation`](../laravel-validation/SKILL.md).
+- The fix belongs at a single architectural seam — adding three more guards is over-engineering. Use the gate function below to stop early.
+## Procedure: Apply the four-layer pattern
+### Step 0: Analyze the data flow before adding guards
+1. Identify where the bad value originates (test fixture, request body, env var, config).
+2. List every function that receives the value before the failure point.
+3. Mark which functions are reachable from production paths and which only from tests.
+### Step 1: Layer 1 — Entry-point validation
+Reject obviously invalid input at the API / route / command boundary. In Laravel this is FormRequest rules; in pure PHP services it is the public method on the service.
+```php
+public function createProject(string $name, string $workingDirectory): Project
+{
+    if (trim($workingDirectory) === '') {
+        throw new InvalidArgumentException('workingDirectory cannot be empty');
+    }
+    if (! is_dir($workingDirectory)) {
+        throw new InvalidArgumentException("workingDirectory does not exist: {$workingDirectory}");
+    }
+    if (! is_writable($workingDirectory)) {
+        throw new InvalidArgumentException("workingDirectory is not writable: {$workingDirectory}");
+    }
+    // ... proceed
+}
+```
+### Step 2: Layer 2 — Business-logic validation
+Verify the value still makes sense for the operation that consumes it. Different code paths can reach the same internal — re-check rather than trust the caller.
+```php
+public function initializeWorkspace(string $projectDir, string $sessionId): Workspace
+{
+    if ($projectDir === '') {
+        throw new RuntimeException('projectDir required for workspace initialization');
+    }
+    // ... proceed
+}
+```
+### Step 3: Layer 3 — Environment guards
+Refuse dangerous operations in the wrong context — most often: running a destructive command outside a test temp dir while the test suite is active.
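+```php
+public function gitInit(string $directory): void
+{
+    if (app()->environment('testing')) {
+        // realpath() returns false for paths that do not exist yet; fall back to the raw input.
+        $normalized = realpath($directory) ?: $directory;
+        $tmp = realpath(sys_get_temp_dir());
+        // Compare with a trailing separator so "/tmpfoo" does not pass as being inside "/tmp".
+        $inTmp = $tmp !== false
+            && str_starts_with($normalized . DIRECTORY_SEPARATOR, $tmp . DIRECTORY_SEPARATOR);
+        if (! $inTmp) {
+            throw new RuntimeException("refusing git init outside tmp during tests: {$directory}");
+        }
+    }
+    // ... proceed
+}
+```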
+### Step 4: Layer 4 — Debug instrumentation
+Capture context for forensics so the next failure surfaces *why*, not just *that*. Log only when the call is about to hit an irreversible side effect.
+```php
+public function gitInit(string $directory): void
+{
+    Log::debug('about to git init', [
+        'directory' => $directory,
+        'cwd' => getcwd(),
+        'trace' => (new Exception)->getTraceAsString(),
+    ]);
+    // ... proceed
+}
+```
+### Step 5: Verify each layer in isolation
+Try to bypass Layer 1 (call the internal directly) and confirm Layer 2 catches it. Mock the production guard and confirm Layer 3 still refuses. The pattern only earns its name when each layer is independently provable.
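The bypass check described here can be sketched as follows. This is a Python analogue of the PHP guards above, written only to illustrate the idea; every name in it (`EntryValidationError`, `WorkspaceInvariantError`, `create_project`, `initialize_workspace`, `caught_by`) is hypothetical.

```python
# Hypothetical Python analogue of Layers 1 and 2; all names are illustrative.
import os
import tempfile


class EntryValidationError(ValueError):
    """Layer 1: raised at the public boundary."""


class WorkspaceInvariantError(RuntimeError):
    """Layer 2: raised by the internal routine that consumes the value."""


def initialize_workspace(project_dir):
    # Layer 2: re-check instead of trusting the caller.
    if project_dir == "":
        raise WorkspaceInvariantError("project_dir required for workspace initialization")
    return {"project_dir": project_dir}


def create_project(name, working_directory):
    # Layer 1: reject invalid input at the entry point.
    if working_directory.strip() == "":
        raise EntryValidationError("working_directory cannot be empty")
    if not os.path.isdir(working_directory):
        raise EntryValidationError(f"working_directory does not exist: {working_directory}")
    return initialize_workspace(working_directory)


def caught_by(func, *args):
    """Return the exception class a call raises, or None if it succeeds."""
    try:
        func(*args)
    except Exception as exc:  # broad on purpose: this helper only reports the type
        return type(exc)
    return None


# Normal path: Layer 1 rejects the empty value.
print(caught_by(create_project, "demo", ""))
# Bypass Layer 1 by calling the internal directly: Layer 2 must still reject it.
print(caught_by(initialize_workspace, ""))
# A valid directory passes both layers.
print(caught_by(create_project, "demo", tempfile.gettempdir()))
```

The two layers raise different exception types, so a test can tell which guard fired.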
+## Gate function — when to stop adding layers
+```
+BEFORE adding the 5th guard:
+  STOP — re-check the data flow.
+  IF the value crosses ≤ 1 module boundary:
+    Use a single boundary check + a value-object invariant. Two layers max.
+  IF every layer would re-implement the same predicate:
+    Hoist the predicate into a value object / type and inject. One check is enough.
+  Layers are for distinct concerns: input shape vs operation invariant
+  vs environment risk vs forensic visibility. Same concern repeated is duplication, not depth.
+```
+## Output format
+1. The four guards (or a documented subset, with the gate-function justification).
+2. Tests that bypass each layer to prove the next layer catches the failure.
+3. One-line note on the data flow that motivated the layering.
+## Gotcha
+- Layers 1 and 2 must reject with **distinct** errors — same error string makes the second guard look like a duplicate.
+- Layer 3 environment checks should fail closed: unknown environment treated as production.
+- Layer 4 instrumentation must not change behavior — no early returns, no mutated state.
+- Test bypasses (in-process mocks) often skip Layer 1 — Layer 2 catches them; do not weaken Layer 2 to silence the test.
+## Do NOT
+- Do NOT replicate Layer 1 inside private methods that only Layer 1 can reach.
+- Do NOT log secrets in Layer 4 — sanitize before `Log::debug`.
+- Do NOT use Layer 3 to gate business logic — environments change, business rules do not.
+- Do NOT add a layer without a failing test that proves the layer was needed.
+## Auto-trigger keywords
+- defense in depth
+- multiple validation layers
+- bug deep in execution
+- structurally impossible
+## Provenance
+- Adopted from: `Microck/ordinary-claude-skills@8f5c83174f7aa683b4ddc7433150471983b93131:skills_all/defense-in-depth/SKILL.md` (MIT, © 2025 Microck).
+- Provenance registry: `agents/contexts/skills-provenance.yml` (entry: `defense-in-depth`).
+- Iron-Law floor: `non-destructive-by-default`, `verify-before-complete`, `skill-quality`.

package/.agent-src/skills/error-handling-patterns/SKILL.md ADDED

@@ -0,0 +1,134 @@
+---
+name: error-handling-patterns
+description: "Use when picking a failure-reporting strategy — exceptions vs Result types, recoverable vs not, retry / circuit-breaker / graceful degradation — decision framework only, catalogues externalized."
+source: package
+status: active
+refresh_trigger: "≥30% of cited upstream pattern catalogues become deprecated, OR a new top-2 ecosystem (Python/JS/PHP/Go/Rust) ships a paradigm-shifting standard error model"
+sunset_criterion: "When the upstream framework docs (Laravel, FastAPI, Express, Axum, Effect-TS) all carry an equivalent in-tree decision framework AND consumer projects no longer cite this skill in PR reviews for two consecutive review cycles."
+---
+# error-handling-patterns
+Decision framework for picking an error-handling strategy. **Catalogues of language-specific code live upstream** (links in § Provenance) — this skill is the predicate, not the pattern library. Sunset-policy compliant: large language-specific catalogues stay in authoritative upstream docs.
+## When to use
+- Designing how a new feature, API, or service reports failure.
+- Reviewing a diff that introduces a new exception class, `Result<T, E>`, or sentinel return.
+- Debugging production noise that traces back to inconsistent error semantics.
+- Choosing between retry, circuit-breaker, fallback, and fail-fast for an external dependency.
+Do NOT use when:
+- You only need the syntax for a `try/catch` in language X — read the upstream language guide directly.
+- The failure is a single-call Laravel validation error — route to [`laravel-validation`](../laravel-validation/SKILL.md).
+- The fix is a one-line null check in existing code — route to [`bug-analyzer`](../bug-analyzer/SKILL.md).
+## Decision framework
+### Step 1 — Classify the failure
+```
+Failure is:
+  caller's fault (bad input, missing auth)         → reject at boundary, structured error
+  expected operational (timeout, 404, rate-limit)  → Result-type / typed return; retry-aware
+  unexpected operational (DB down, OOM, deadlock)  → exception; observability + alert
+  programmer bug (null deref, off-by-one)          → crash early; do not catch
+```
+### Step 2 — Pick the reporting mechanism
+```
+IF failure is an EXPECTED, branchable outcome the caller will route on
+  → Result type / tagged union / typed error return.
+  Forces the caller to handle it; the type system is the proof.
+IF failure is UNEXPECTED and most callers cannot do anything useful
+  → exception, propagated to a single boundary handler.
+  One layer (HTTP, queue, CLI) translates exceptions to user-facing errors.
+IF failure is UNRECOVERABLE (invariant violated, data corruption)
+  → fail loud, fail fast. No catch-and-continue.
+  Log structured context, exit / panic / 500.
+IF the language idiom forces one choice (Go: errors are values; Rust: Result;
+   Python/PHP/JS: exceptions)
+  → follow the idiom. Inventing a foreign mechanism costs more than the
+    correctness it buys.
+```
+### Step 3 — Pick the resilience strategy
+```
+External call?
+  Idempotent + transient failure mode  → retry with exponential backoff + jitter, cap.
+  Non-idempotent                       → no blind retry; require an idempotency key.
+  Repeated failure across instances    → circuit breaker; open → half-open probe → close.
+  Optional functionality               → graceful degradation (cached / default / null result).
+  Required functionality               → propagate; surface to user with a recovery hint.
+```
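The first row of the table (retry with exponential backoff, jitter, and a cap) can be sketched in a few lines. This is a minimal Python illustration under stated assumptions: `TransientError`, `backoff_delay`, `retry`, and the default values are hypothetical, and the jitter shown is the "full jitter" variant, where the delay is drawn uniformly between zero and the capped exponential value.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, rate limits)."""


def backoff_delay(attempt, base=0.5, cap=30.0):
    """Full-jitter delay: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def retry(operation, max_attempts=5, base=0.5, cap=30.0, sleep=time.sleep):
    """Call an idempotent operation, retrying only on TransientError."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # attempts exhausted: propagate to the caller
            sleep(backoff_delay(attempt, base, cap))
```

Only `TransientError` is retried; any other exception propagates immediately, which keeps non-transient failures out of the retry loop. The `sleep` parameter exists so tests can replace the real delay.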
+### Step 4 — Shape the error payload
+Every produced error must carry: `code` (stable string), `message` (human-readable), `cause` (chained), `context` (sanitized inputs), `correlation_id` (request / trace).
+Forbidden: secrets, raw SQL, full stack traces in user-facing surfaces, internal class names leaked through API boundaries.
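One possible shape for such a payload is sketched below in Python. The five field names come from the list above; the class name `AppError`, the `sanitize` helper, the `SENSITIVE_KEYS` set, and the decision to omit `cause` from the user-facing view are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical list of keys to redact; a real project would maintain its own.
SENSITIVE_KEYS = {"password", "token", "api_key", "authorization"}


def sanitize(context):
    """Replace values of sensitive keys so they never reach logs or responses."""
    return {
        key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else value
        for key, value in context.items()
    }


@dataclass(frozen=True)
class AppError:
    code: str                                     # stable, machine-readable identifier
    message: str                                  # human-readable explanation
    correlation_id: str                           # request / trace id
    context: dict = field(default_factory=dict)   # inputs, sanitized on output
    cause: Optional[BaseException] = None         # chained underlying error

    def to_public(self):
        """User-facing view: no cause, no stack trace, no internal class names."""
        return {
            "code": self.code,
            "message": self.message,
            "context": sanitize(self.context),
            "correlation_id": self.correlation_id,
        }


err = AppError(
    code="payment.declined",
    message="The card was declined.",
    correlation_id="req-123",
    context={"order_id": 42, "token": "secret-value"},
    cause=TimeoutError("gateway timed out"),
)
print(err.to_public())
```

The `cause` stays available for internal logging while the public view carries only the sanitized fields.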
+### Step 5 — Define the boundary
+Exactly **one** layer translates internal errors to the egress format (HTTP status + body, queue requeue policy, CLI exit code). Any other layer that performs this translation is a bug.
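A single translation point for the HTTP case might look like the Python sketch below. The exception classes, the `STATUS_BY_TYPE` mapping, and `handle_request` are hypothetical names; logging and alerting for the unexpected branch are omitted to keep the sketch short.

```python
# Hypothetical exception classes and status mapping, for illustration only.
class ValidationFailed(Exception):
    code = "input.invalid"


class UpstreamTimeout(Exception):
    code = "upstream.timeout"


STATUS_BY_TYPE = {ValidationFailed: 400, UpstreamTimeout: 504}


def handle_request(handler, request, correlation_id):
    """The only place where internal errors become an HTTP status and body."""
    try:
        return 200, handler(request)
    except Exception as exc:
        for exc_type, status in STATUS_BY_TYPE.items():
            if isinstance(exc, exc_type):
                return status, {
                    "code": exc.code,
                    "message": str(exc),
                    "correlation_id": correlation_id,
                }
        # Unexpected failure: hide internals from the caller.
        return 500, {
            "code": "internal.error",
            "message": "Internal error",
            "correlation_id": correlation_id,
        }
```

Handlers raise domain exceptions and never build status codes themselves, so the mapping can change in one place.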
+## Procedure: Apply the framework to a new feature
+1. List failure modes (each external call, each invariant, each user input class).
+2. Run Step 1 against each, write the classification next to it.
+3. Pick reporting mechanism per Step 2; reject combinations the language idiom rejects.
+4. For each external call, run Step 3 and write down the chosen resilience strategy.
+5. Sketch the error payload shape (Step 4) and the single boundary (Step 5).
+6. Hand the sketch to a reviewer **before** coding; cite this skill.
+## Output format
+1. The failure-mode table (mode · classification · mechanism · resilience strategy).
+2. The shared error payload definition (code, message, cause, context, correlation_id).
+3. The single boundary handler (file:line) where internal → egress translation happens.
+4. The retry / circuit-breaker config (attempts, base, jitter, breaker thresholds), if any.
+## Gotcha
+- "Catch everything, log it, return null" silently destroys signal — every catch must either rethrow, translate, or recover with a written reason.
+- Retries on non-idempotent calls are a common source of production incidents; insist on idempotency keys before allowing retry.
+- Circuit breakers without a half-open probe never close — they degrade to permanent failure.
+- Mixing Result types and exceptions in the same module is worse than picking the wrong one — pick one per module and stay in it.
+- Upstream pattern catalogues drift; trust the link, not memory. Refresh per `refresh_trigger` above.
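The gotcha about breakers that never close can be made concrete with a small state machine. The following Python sketch is single-threaded and illustrative only: `CircuitBreaker`, `CircuitOpenError`, and the threshold and timeout defaults are hypothetical, and a production breaker would also need to limit concurrent probes.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, operation):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; call rejected")
            # Timeout elapsed: half-open, let this one call through as a probe.
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed probe
            raise
        # Success closes the circuit and resets the counter.
        self.failures = 0
        self.opened_at = None
        return result
```

Without the half-open branch, `opened_at` would never be cleared and every later call would be rejected, which is the permanent-failure mode the gotcha describes.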
+## Do NOT
+- Do NOT introduce a custom error mechanism that fights the language idiom.
+- Do NOT swallow exceptions — every catch has a written purpose.
+- Do NOT leak stack traces, secrets, or internal class names across the boundary.
+- Do NOT retry without backoff + jitter + cap.
+- Do NOT inline language-specific code catalogues into this skill — externalize per Sunset Policy.
+## Auto-trigger keywords
+- error handling strategy
+- exceptions vs result
+- retry pattern
+- circuit breaker
+- graceful degradation
+- error payload shape
+## Provenance
+- Adopted from: `Microck/ordinary-claude-skills@8f5c83174f7aa683b4ddc7433150471983b93131:skills_all/error-handling-patterns/SKILL.md` (MIT, © 2025 Microck) — **Sunset Policy applied**: 636-line source reduced to a ~150-line decision framework; language catalogues externalized to the upstream resources below.
+- Externalized catalogues:
+  - Python: https://docs.python.org/3/tutorial/errors.html · https://docs.python.org/3/library/exceptions.html
+  - PHP / Laravel: https://laravel.com/docs/errors · https://www.php.net/manual/en/language.exceptions.php
+  - JS / TS: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Control_flow_and_error_handling · https://www.typescriptlang.org/docs/handbook/2/narrowing.html
+  - Go: https://go.dev/blog/error-handling-and-go · https://pkg.go.dev/errors
+  - Rust: https://doc.rust-lang.org/book/ch09-00-error-handling.html
+  - Resilience patterns: https://martinfowler.com/bliki/CircuitBreaker.html · https://aws.amazon.com/builders-library/timeouts-retries-and-backoff-with-jitter/
+- Cross-linked: [`defense-in-depth`](../defense-in-depth/SKILL.md), [`laravel-validation`](../laravel-validation/SKILL.md), [`bug-analyzer`](../bug-analyzer/SKILL.md), [`api-design`](../api-design/SKILL.md).
+- Provenance registry: `agents/contexts/skills-provenance.yml` (entry: `error-handling-patterns`).
+- Iron-Law floor: `verify-before-complete`, `skill-quality`, `non-destructive-by-default`.

package/.agent-src/skills/mcp-builder/SKILL.md ADDED

@@ -0,0 +1,108 @@
+---
+name: mcp-builder
+description: "Use when building an MCP server in Python (FastMCP) or Node/TypeScript (MCP SDK) — agent-centric tool design, input schemas, error handling, and the 10-question evaluation harness."
+source: package
+---
+# mcp-builder
+Author MCP servers that LLMs can drive end-to-end. The quality bar is *can the agent finish the workflow*, not *does the endpoint return 200*. This skill is the **server-author** counterpart to the existing [`mcp`](../mcp/SKILL.md) consumer skill.
+## When to use
+- Wrapping an external API or service as MCP tools for an LLM client.
+- Adding tools to an existing MCP server (Python FastMCP or TypeScript SDK).
+- Reviewing an MCP server before shipping — Phase 4 evaluation gate below.
+Do NOT use when:
+- You only need to *call* an MCP server — route to [`mcp`](../mcp/SKILL.md).
+- The integration belongs in the host process — write a regular service, not an MCP server.
+- The "server" wraps one endpoint with no workflow — a CLI wrapper is enough.
+## Procedure: Four phases, one tool at a time
+### Phase 1 — Research & plan
+1. **Agent-centric design**. Tools encode *workflows*, not raw endpoints. Consolidate (`schedule_event` checks availability **and** creates the event). Default to human-readable names over IDs. Errors are educational, not just diagnostic ("retry with `filter='active_only'` to reduce results").
+2. **Load the protocol**. Fetch `https://modelcontextprotocol.io/llms-full.txt` once into context — the canonical spec.
+3. **Load the SDK README** for the chosen language:
+   - Python: `https://raw.githubusercontent.com/modelcontextprotocol/python-sdk/main/README.md`
+   - TypeScript: `https://raw.githubusercontent.com/modelcontextprotocol/typescript-sdk/main/README.md`
+4. **Read the target service's API docs in full** — auth, rate limits, pagination, error codes, schemas. Skipping this produces incomplete mocks (see [`testing-anti-patterns`](../testing-anti-patterns/SKILL.md) § Anti-Pattern 4).
+5. **Write the plan**: tool list with priority, shared utilities (request helper, pagination, formatter), input/output schemas, error strategy, response-detail levels (concise vs detailed), character limits (default 25 000 tokens).
+### Phase 2 — Implement
+1. **Project layout**. Python: single `.py` or modular package; Pydantic v2 with `model_config`. TypeScript: standard `package.json` + `tsconfig.json` strict mode; Zod schemas with `.strict()`.
+2. **Shared utilities first**. API request helper with retry/timeout, error formatter, JSON-vs-Markdown response builder, pagination cursor handling, auth/token cache.
+3. **Per tool**:
+   - Input schema (Pydantic / Zod) with constraints, descriptions, and *examples*.
+   - One-line summary + detailed docstring covering purpose, parameters, return shape, when-to-use, when-NOT-to-use, error handling.
+   - Tool annotations: `readOnlyHint`, `destructiveHint`, `idempotentHint`, `openWorldHint`.
+   - Async/await for all I/O. Honor pagination. Truncate to the character limit and signal truncation in the response.
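The last bullet (truncate and signal truncation) can be sketched without any MCP SDK. This Python helper is illustrative only: `truncate_response`, the response keys, and the notice text are hypothetical, and it treats the limit as a character count, which is an assumption because the plan step mentions both characters and tokens. It also assumes the limit is larger than the notice itself.

```python
CHARACTER_LIMIT = 25_000  # assumed default; the Phase 1 plan sets the real value


def truncate_response(text, limit=CHARACTER_LIMIT):
    """Cut text to at most `limit` characters and tell the agent what happened."""
    if len(text) <= limit:
        return {"content": text, "truncated": False}
    notice = "\n[truncated: narrow the query or request the next page]"
    # Reserve room for the notice so the final content still fits the limit.
    keep = max(0, limit - len(notice))
    return {"content": text[:keep] + notice, "truncated": True}
```

The notice doubles as an educational hint: it tells the agent how to get the rest of the data instead of only reporting that data is missing.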
+### Phase 3 — Review & test
+1. **Code-quality pass**: DRY across tools, shared helpers extracted, consistent response shapes, all external calls have error handling, full type coverage.
+2. **Build & syntax**:
+   - Python: `python -m py_compile server.py`.
+   - TypeScript: `npm run build`; verify `dist/index.js`.
+3. **Run the server safely**. MCP servers block on stdio. Either run inside `tmux` and drive from the harness, or wrap with `timeout 5s python server.py` for a smoke check. Do NOT block your own session by running it in-process.
+### Phase 4 — Evaluations (10-question harness)
+Each evaluation is a question the agent must answer using only the new tools.
+Requirements per question — **independent**, **read-only**, **complex** (multiple tool calls), **realistic**, **verifiable** (string-comparable answer), **stable** (answer does not drift over time).
+```xml
+<evaluation>
+  <qa_pair>
+    <question>...</question>
+    <answer>...</answer>
+  </qa_pair>
+  <!-- 9 more -->
+</evaluation>
+```
+Process: enumerate the tools, explore READ-ONLY data, draft 10 questions, **solve each yourself first** to confirm the answer is reachable and stable.
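A harness for the evaluation file could load and score it roughly as follows. This is a Python sketch using only the standard library; `load_evaluation`, `score`, and the exact-match rule after trimming whitespace are assumptions, chosen to match the "string-comparable answer" requirement above.

```python
import xml.etree.ElementTree as ET


def load_evaluation(xml_text, expected_count=10):
    """Parse <evaluation> XML into (question, answer) pairs and check the count."""
    root = ET.fromstring(xml_text)
    pairs = []
    for qa in root.findall("qa_pair"):
        question = (qa.findtext("question") or "").strip()
        answer = (qa.findtext("answer") or "").strip()
        if not question or not answer:
            raise ValueError("every qa_pair needs a non-empty question and answer")
        pairs.append((question, answer))
    if len(pairs) != expected_count:
        raise ValueError(f"expected {expected_count} qa_pairs, found {len(pairs)}")
    return pairs


def score(pairs, agent_answers):
    """Fraction of questions whose agent answer matches exactly after trimming."""
    correct = sum(
        1 for (_, expected), got in zip(pairs, agent_answers)
        if got.strip() == expected
    )
    return correct / len(pairs)
```

Rejecting files with the wrong number of pairs keeps a half-written evaluation from passing as a complete one.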
+## Output format
+1. The server source plus the 10-question evaluation XML.
+2. A README with: install, env vars, transport mode (stdio / sse / http), example tool call.
+3. A line in `agents/contexts/skills-provenance.yml` if the server was forked from an upstream, or a note that it was authored from scratch.
+## Gotcha
+- "Wrap every endpoint" is the failure mode — agents cannot orchestrate 60 thin tools as well as 12 workflow tools.
+- Returning the full upstream payload blows the agent's context. Default to a *concise* shape with an opt-in *detailed* mode.
+- Pydantic / Zod descriptions are the *only* documentation the LLM sees at runtime — write them like usage docs, not comments.
+- A server that hangs your session usually means stdio transport ran in the main process — move it under `tmux` or use a `timeout`.
+- Inflated token claims are not credible without an evaluation harness — Phase 4 is the validation gate, not optional.
+## Do NOT
+- Do NOT mirror REST routes 1:1.
+- Do NOT use `any` (TypeScript) or untyped `dict` (Python) in tool I/O.
+- Do NOT skip the 10-question evaluation — Phase 4 IS the quality bar.
+- Do NOT run the MCP server in your main process during testing — it will block.
+- Do NOT log tokens, API keys, or full request bodies — sanitize before logging.
+## Auto-trigger keywords
+- mcp server
+- model context protocol
+- fastmcp
+- mcp builder
+- agent-centric tools
+## Provenance
+- Upstream protocol: https://modelcontextprotocol.io
+- Upstream SDKs: https://github.com/modelcontextprotocol/python-sdk · https://github.com/modelcontextprotocol/typescript-sdk
+- Adopted from: `Microck/ordinary-claude-skills@8f5c83174f7aa683b4ddc7433150471983b93131:skills_all/mcp-builder/SKILL.md` (MIT, © 2025 Microck) — external `./reference/*.md` file links replaced with inline guidance + upstream URLs.
+- Cross-linked: [`mcp`](../mcp/SKILL.md), [`testing-anti-patterns`](../testing-anti-patterns/SKILL.md), [`api-design`](../api-design/SKILL.md).
+- Provenance registry: `agents/contexts/skills-provenance.yml` (entry: `mcp-builder`).
+- Iron-Law floor: `verify-before-complete`, `tool-safety`, `skill-quality`.