npm - @cyanheads/mcp-ts-core - Versions diffs - 0.9.10 → 0.9.12 - Mend

@cyanheads/mcp-ts-core 0.9.10 → 0.9.12

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (39)

package/CLAUDE.md +2 -1
package/README.md +1 -1
package/biome.json +1 -1
package/changelog/0.9.x/0.9.11.md +38 -0
package/changelog/0.9.x/0.9.12.md +26 -0
package/dist/cli/init.js +16 -3
package/dist/cli/init.js.map +1 -1
package/dist/config/index.d.ts +6 -0
package/dist/config/index.d.ts.map +1 -1
package/dist/config/index.js +10 -0
package/dist/config/index.js.map +1 -1
package/dist/core/serverManifest.d.ts +1 -0
package/dist/core/serverManifest.d.ts.map +1 -1
package/dist/core/serverManifest.js +1 -0
package/dist/core/serverManifest.js.map +1 -1
package/dist/mcp-server/transports/http/httpTransport.d.ts.map +1 -1
package/dist/mcp-server/transports/http/httpTransport.js +1 -0
package/dist/mcp-server/transports/http/httpTransport.js.map +1 -1
package/package.json +4 -4
package/skills/code-simplifier/SKILL.md +130 -0
package/skills/git-wrapup/SKILL.md +1 -3
package/skills/orchestrations/SKILL.md +223 -0
package/skills/orchestrations/workflows/field-test-fix.md +206 -0
package/skills/orchestrations/workflows/fix-wrapup-release.md +175 -0
package/skills/orchestrations/workflows/greenfield-build.md +143 -0
package/skills/orchestrations/workflows/maintenance-release.md +173 -0
package/skills/polish-docs-meta/SKILL.md +2 -1
package/skills/report-issue-framework/SKILL.md +4 -1
package/templates/AGENTS.md +6 -0
package/templates/CLAUDE.md +6 -0
package/dist/logs/combined.log +0 -4
package/dist/logs/error.log +0 -2
package/dist/logs/interactions.log +0 -0
package/skills/migrate-mcp-ts-template/SKILL.md +0 -162
package/skills/multi-server-orchestration/SKILL.md +0 -137
package/skills/multi-server-orchestration/references/greenfield-buildout.md +0 -246
package/skills/multi-server-orchestration/references/maintenance-pass.md +0 -148
package/skills/multi-server-orchestration/references/release-and-publish-pass.md +0 -184
package/skills/multi-server-orchestration/references/wrapup-pass.md +0 -150

package/skills/code-simplifier/SKILL.md ADDED

@@ -0,0 +1,130 @@
+---
+name: code-simplifier
+description: >
+  Post-session code review and cleanup against a working tree of changes. Analyzes `git diff` to simplify, consolidate, and align changed code with the existing codebase — modernize syntax, remove unnecessary complexity, consolidate duplicated logic, catch efficiency issues. Use after a substantive working session, or when asked to clean up, simplify, reduce slop, consolidate, modernize, tighten up, or de-slop code. For `@cyanheads/mcp-ts-core` projects, includes specific transformations for tool/resource/prompt definitions, the ctx pattern, error factories, and framework idioms.
+metadata:
+  author: cyanheads
+  version: "1.0"
+  audience: external
+  type: workflow
+---
+# Code Simplifier
+Post-session cleanup pass. Reviews what changed, understands how it fits the existing codebase, and makes targeted improvements — modernizing syntax, removing unnecessary complexity, consolidating duplicated logic, catching efficiency issues. Prioritizes codebase cohesion over local perfection.
+## Core philosophy
+**Every change must earn its keep.** A simplification that doesn't meaningfully improve clarity, correctness, or cohesion is noise. Don't refactor for refactoring's sake. Don't create new files, abstractions, or utilities unless they solve a demonstrated problem. If the existing code works and is readable, leave it alone. The goal is a cohesive codebase, not a pristine one.
+## Procedure
+### Phase 1: Identify changes
+Run `git diff` to see unstaged changes, or `git diff HEAD` to include staged changes as well. If there are no git changes, review the most recently modified files from the current session.
+### Phase 2: Understand the surrounding codebase
+Don't review changes in isolation. Before any modifications:
+1. **Read the full files** containing changes — not just the diff hunks. Understand imports, surrounding logic, module structure.
+2. **Identify the project language(s)** and select the relevant transformation rules. Discard inapplicable rules.
+3. **Survey adjacent code** — shared utilities, sibling modules, common patterns. You need to know what already exists before deciding something is missing. For mcp-ts-core projects, check `src/utils/` for project utilities, `src/errors/` for error handling, and `node_modules/@cyanheads/mcp-ts-core/` for framework exports.
+### Phase 3: Review
+Evaluate the changes across these dimensions. Not every dimension applies to every diff — skip what's irrelevant.
+#### Codebase cohesion
+- **Reuse** — Search for existing utilities, helpers, and patterns that could replace newly-written code. For mcp-ts-core projects, prefer `import from '@cyanheads/mcp-ts-core/utils'` over hand-rolled equivalents — pagination helpers, schema builders, retry primitives, and OTel attribute constants are framework-provided.
+- **Consolidation** — Flag copy-paste-with-variation: near-duplicate code blocks that should be unified. Only unify if the shared abstraction is genuinely simpler than the duplicated code.
+- **Consistency** — Check that new code follows the same patterns as the rest of the codebase: naming conventions, error handling style, import patterns, type annotation style. Normalize toward the better variant when the project is inconsistent.
+- **Stringly-typed code** — Flag raw strings where constants, string-union types, branded types, or framework attribute constants already exist. For mcp-ts-core projects, the `ATTR_*` constants in `@cyanheads/mcp-ts-core/utils` should replace raw OTel attribute keys.
+#### Code quality
+- **Redundant state** — State that duplicates existing state, cached values that could be derived.
+- **Unnecessary complexity** — Deep nesting that could be guard clauses, premature abstractions, over-engineered solutions to simple problems.
+- **Dead code** — Unreachable branches, unused variables, commented-out code, exports that nothing imports.
+- **Defensive code for impossible states** — Guards for cases the type system or framework already prevents. Drop them.
+- **Outdated patterns** — Verbose or legacy syntax where modern equivalents exist. See the transformation tables below.
+#### Efficiency
+- **Redundant work** — Repeated computations, duplicate file reads, duplicate network/API calls, N+1 query patterns.
+- **Missed concurrency** — Independent async operations run sequentially that could run in parallel with `Promise.all` / `Promise.allSettled`.
+- **No-op updates** — State/store updates inside loops or event handlers that fire unconditionally. Add change-detection so downstream consumers aren't notified when nothing changed.
+- **TOCTOU** — Pre-checking file/resource existence before operating on it. Operate directly and handle the error instead.
+- **Overly broad operations** — Reading entire files when only a portion is needed, loading all items when filtering for one.
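Reviewer note on the TOCTOU and missed-concurrency bullets above: a minimal sketch, assuming Node's `node:fs/promises` module. The helper names `readIfPresent` and `readBoth` are hypothetical and are not part of the package. It attempts the read and classifies the error instead of checking for existence first, then runs two independent reads in parallel.

```typescript
import { readFile } from "node:fs/promises";

// Hypothetical helper: returns the file contents, or null when the file is absent.
// No existence pre-check is made, so there is no gap between check and use.
async function readIfPresent(path: string): Promise<string | null> {
  try {
    return await readFile(path, "utf8");
  } catch (err: unknown) {
    const code =
      typeof err === "object" && err !== null
        ? (err as { code?: unknown }).code
        : undefined;
    if (code === "ENOENT") {
      return null; // the file did not exist at the moment of the read
    }
    throw err; // permission errors and the like still propagate
  }
}

// The two reads do not depend on each other, so they run in parallel.
async function readBoth(
  a: string,
  b: string,
): Promise<[string | null, string | null]> {
  return Promise.all([readIfPresent(a), readIfPresent(b)]);
}
```

Only `ENOENT` is treated as "absent" here; which error codes count as expected is a choice the caller has to make.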
+#### mcp-ts-core-specific
+- **Error throwing patterns** — Prefer framework error factories (`McpError`, `validationError`, `notFound`, `httpErrorFromResponse`) over raw `throw new Error()`. Tool handlers should throw — the framework catches, classifies, and instruments.
+- **Error codes** — `InvalidParams` only for malformed JSON-RPC params shape. `ValidationError` for domain validation. `NotFound` for missing entities. Don't conflate them.
+- **Ctx usage** — Use `ctx.log`, `ctx.state`, `ctx.elicit`, `ctx.sample` — don't reach for global loggers, request-scoped storage, or sampling APIs directly. The `ctx` pattern carries tenant scope and OTel context.
+- **Zod schemas** — Every tool input/output field needs `.describe()`. Zod 4 requires `z.record(z.string(), z.string())` not `z.record(z.string())`. Use `.optional()` rather than `.nullish()` unless null is semantically distinct from absent.
+- **Tool annotations** — `readOnlyHint`, `idempotentHint`, `openWorldHint` should reflect reality. A read-only tool with `readOnlyHint: false` gives clients the wrong picture.
+- **`exactOptionalPropertyTypes` boundaries** — If a downstream type insists on the field being present-or-not-present (not present-as-undefined), use a mapped widening type at the boundary. The pattern is documented in the framework.
+- **`format()` ↔ `structuredContent` parity** — Different MCP clients forward different surfaces. Tests should assert both surfaces carry equivalent data.
+### Phase 4: Apply transformations
+1. **Filter findings ruthlessly.** If a finding is a false positive or not worth the churn, skip it. Don't argue with yourself about borderline cases — move on.
+2. **Transform incrementally** — one category of change at a time (modernize syntax, then reduce nesting, then consolidate).
+3. **Verify equivalence** — all functionality, types, and public interfaces must remain unchanged.
+4. **Keep the diff minimal.** Only touch lines that have a real reason to change. Don't reformat untouched code, add comments to code you didn't modify, or "improve" things that are already fine.
+When done, briefly summarize what was fixed or confirm the code was already clean.
+## Common transformations
+The tables below cover TypeScript and Python. For other languages, apply analogous principles: prefer modern idioms, reduce nesting, eliminate dead code, follow project conventions.
+### TypeScript (modern ESM, TS 5.x+)
+| Before | After | Why |
+| --- | --- | --- |
+| `const x: Foo = { ... } as Foo` | `const x = { ... } satisfies Foo` | Type-checked without assertion |
+| `let resource = acquire(); try { ... } finally { release(resource) }` | `using resource = acquire()` | Explicit resource management (TS 5.2+); the resource must implement `Symbol.dispose` |
+| `if (x !== null && x !== undefined)` | `if (x != null)` | Idiomatic null/undefined check |
+| `arr.filter(x => x !== null) as T[]` | `arr.filter((x): x is T => x != null)` | Type-safe filtering, no cast |
+| `export { foo } from './foo/index.js'` | Direct imports at call sites | Avoid barrel re-exports inside the package; barrel exports are for public APIs only |
+| `async function f() { const a = await x(); const b = await y(); }` | `const [a, b] = await Promise.all([x(), y()])` | Parallel when independent |
+| `obj.x !== undefined ? obj.x : fallback` | `obj.x ?? fallback` | Nullish coalescing |
+| `if (a) { if (b) { if (c) { ... } } }` | Guard clauses with early returns | Reduce nesting |
+| `try { risky() } catch (e: any) { ... }` | `try { risky() } catch (e: unknown) { ... }` | Type-safe error handling |
+| `enum Status { A, B, C }` | `const Status = { A: 'A', B: 'B', C: 'C' } as const` | Prefer const objects over numeric enums; string enums are acceptable |
+| `function f(a: string, b: string, c: string, d?: string)` | `function f(opts: FnOptions)` | Options object when >3 params |
+| `throw new Error('Bad input')` (in a tool handler) | `throw validationError('Bad input', { field: 'x' })` | Use framework error factories so the framework can classify and instrument |
+| `const ATTR_KEY = 'mcp.tool.name'` | `import { ATTR_MCP_TOOL_NAME } from '@cyanheads/mcp-ts-core/utils'` | Use framework attribute constants |
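Reviewer note on the TypeScript table above: a small sketch that puts four of the rows together (`satisfies`, nullish coalescing, a type-guard filter, and guard clauses). All names (`Config`, `loadPort`, `compact`, `describeUser`) are illustrative and do not come from the package.

```typescript
interface Config {
  host: string;
  port?: number;
}

// `satisfies` checks the literal against Config without a type assertion,
// and keeps the narrower inferred type (port is a number here, not optional).
const defaults = { host: "localhost", port: 3000 } satisfies Config;

// Nullish coalescing keeps a legitimate 0 and falls back only on null/undefined.
function loadPort(cfg: Config): number {
  return cfg.port ?? defaults.port;
}

// A type-guard predicate narrows the element type, so no `as T[]` cast is needed.
function compact<T>(items: Array<T | null | undefined>): T[] {
  return items.filter((x): x is T => x != null);
}

// Guard clauses replace nested ifs with early returns.
function describeUser(
  user: { name?: string; active?: boolean } | null,
): string {
  if (user == null) return "no user";
  if (!user.active) return "inactive";
  if (!user.name) return "unnamed";
  return `active: ${user.name}`;
}
```

`loadPort({ host: "a", port: 0 })` returns `0`, which is the case a `||` fallback would get wrong.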
+### Python (3.12+)
+| Before | After | Why |
+| --- | --- | --- |
+| `Optional[str]` | `str \| None` | Modern union syntax (3.10+) |
+| `List[str]`, `Dict[str, int]` | `list[str]`, `dict[str, int]` | Built-in generics (3.9+) |
+| `if x == 0: ... elif x == 1: ... elif x == 2: ...` | `match x: case 0: ... case 1: ...` | Structural pattern matching (3.10+) |
+| `class Config: def __init__(self, a, b, c): self.a = a ...` | `@dataclass class Config: a: str; b: int; c: float` | Less boilerplate, built-in eq/repr |
+| `results = []; for item in items: results.append(transform(item))` | `results = [transform(item) for item in items]` | Idiomatic comprehension |
+| `f = open('x'); try: ... finally: f.close()` | `with open('x') as f: ...` | Context manager for resources |
+| `line = f.readline(); while line: process(line); line = f.readline()` | `while (line := f.readline()): process(line)` | Walrus operator where it reduces duplication |
+| `"Hello " + name + "!"` | `f"Hello {name}!"` | f-string over concatenation |
+| `except Exception as e: pass` | `except SpecificError as e: log(e)` | Catch specific, never bare except/pass |
+| `from module import *` | `from module import specific_name` | Explicit imports only |
+| `ABC: TypeAlias = Union[A, B, C]` | `type ABC = A \| B \| C` | `type` statement (3.12+) |
+| Sequential `await` for independent I/O | `await asyncio.gather(a(), b())` | Parallel when independent |
+## When NOT to simplify
+Leave code alone when:
+- **It works and is readable.** "I would have written it differently" is not a reason to change it.
+- **The change is cosmetic.** Renaming a variable from `data` to `result` isn't worth the churn.
+- **Intentional verbosity for debugging.** Verbose code may exist to make stack traces or logging clearer.
+- **Performance-critical paths.** A less readable version may exist for measured performance reasons — check before simplifying.
+- **API compatibility.** Don't change public function signatures, export shapes, or return types that callers depend on. For mcp-ts-core projects, the public surface includes tool input/output schemas exposed via MCP — changing them is a breaking change to the server's MCP surface.
+- **Tests.** Don't DRY up test code aggressively — test readability and isolation matter more than deduplication.
+- **Type workarounds.** Sometimes an `as` cast or `# type: ignore` exists because of a genuine type system limitation — verify before removing.
+- **The abstraction isn't proven.** Don't create a shared utility for two similar blocks of code. Wait until there are three, and even then only if the abstraction is genuinely simpler than the duplication.

package/skills/git-wrapup/SKILL.md CHANGED

@@ -4,7 +4,7 @@ description: >
   Land working-tree changes as a versioned release commit with an annotated tag — version bump, changelog, regenerate derived artifacts, verify, commit, tag. Stops at "committed and tagged locally" — no push, no publish. The release-and-publish skill picks up from here. Distilled from the git_wrapup_instructions protocol.
 metadata:
   author: cyanheads
-  version: "1.0"
+  version: "1.1"
   audience: external
   type: workflow
 ---
@@ -35,8 +35,6 @@ Every item must be true before starting wrapup. Committing means releasing — a
 If any gate is red, fix it before proceeding. This skill re-verifies build + tests in step 6, but starting wrapup on a broken tree wastes the version number and creates a revert-or-amend situation.
-After all gates pass, spawn dedicated agents to handle the wrapup (this skill) and publish (`release-and-publish`). These are separate agents from the ones that did the editing work.
 ## Steps
 ### 1. Review the diff

package/skills/orchestrations/SKILL.md ADDED

@@ -0,0 +1,223 @@
+---
+name: orchestrations
+description: >
+  Pick and run a multi-phase workflow that chains foundational task skills (`git-wrapup`, `release-and-publish`, `maintenance`, `field-test`, `setup`, etc.) end-to-end. Routes user intent to a workflow file under `workflows/` — greenfield builds, maintenance + release, field-test + fix, or known-work + release. Single source for the universal rules (no commits without authorization, no destructive git, no marketing language), the orchestrator posture (own the goal, ground sub-agents in primary sources, verify against the goal), and the sub-agent strategy (orient block, parallel fanout, isolation, normalization) that apply across every workflow. Sub-agents are an optional capability — workflows run linearly when fanout isn't available.
+metadata:
+  author: cyanheads
+  version: "1.1"
+  audience: internal
+  type: workflow
+---
+## When to Use
+Multi-phase work that chains several foundational skills against one or more MCP server projects. Typical triggers:
+- "Build N new servers" / "scaffold and ship X, Y, Z" → `workflows/greenfield-build.md`
+- "Update and release these servers" / "run maintenance and ship" → `workflows/maintenance-release.md`
+- "QA / field-test / find-and-fix bugs in these servers" → `workflows/field-test-fix.md`
+- "Fix these issues and ship" / handoff document with findings to act on → `workflows/fix-wrapup-release.md`
+Single-skill work — running just `maintenance`, just `git-wrapup`, just `release-and-publish` — invokes the foundational skill directly. Use this orchestrations skill when at least two phases need to chain.
+## Mental Model — Three Tiers
+| Tier | Layer | Examples | Who reads it |
+|:---|:---|:---|:---|
+| **1** | Foundational task skills | `git-wrapup`, `release-and-publish`, `maintenance`, `field-test`, `setup`, `design-mcp-server`, `polish-docs-meta`, `code-simplifier`, `add-tool`, `add-resource`, `add-service`, `add-test`, etc. | Orchestrator AND sub-agents (by direct path reference) |
+| **2** | Orchestration workflows | The four files under `workflows/` | Orchestrator only |
+| **3** | Router | This `SKILL.md` | Orchestrator only |
+Workflows in Tier 2 sequence Tier 1 skills with gates and verification. They never duplicate Tier 1 content — they direct to it. A workflow file says "Phase N: agent reads and runs `skills/git-wrapup/SKILL.md`," not "here's how to wrap up a release."
+The orchestrator is the agent driving the workflow — the one reading this SKILL.md. Sub-agents the orchestrator spawns receive prompts pointing at Tier 1 skills directly; they do not receive this skill or the workflow file. That boundary prevents recursive sub-agent spawning.
+## Pick a Workflow
+Identify the workflow from user intent first, then sanity-check against project state if intent is ambiguous.
+| User intent / state signal | Workflow |
+|:---|:---|
+| New scaffold(s) from `bunx @cyanheads/mcp-ts-core init`, no implementation yet (echo definitions still present, no released changelog) | `workflows/greenfield-build.md` |
+| Existing server(s), `bun outdated` shows updates, want to land them and ship | `workflows/maintenance-release.md` |
+| Existing server(s), want to find bugs via live testing and fix them, optionally ship | `workflows/field-test-fix.md` |
+| Existing server(s) with known issues (GH issues, handoff document, observed gap), want to fix and ship | `workflows/fix-wrapup-release.md` |
+If intent is ambiguous (no clear signal), surface the candidate workflows to the user and confirm. Don't pick silently.
+A workflow file is the orchestrator's playbook for one run. Read it end-to-end before kicking off the first phase.
+## Universal Rules
+These apply to every workflow. Workflow files don't restate them; the orchestrator carries them forward and restates them in sub-agent prompts where applicable.
+1. **No commits, pushes, tags, branch creation, or destructive ops without explicit user authorization.** Work phases leave the working tree dirty for orchestrator review. Wrap-up and release phases run only after the user authorizes — though once authorized, the authorization is durable through the workflow's end (no re-asking at each phase boundary).
+2. **No `git stash`, no `git reset --hard`, no `git restore .`, no `git clean -f`, no `git checkout -- .`.** These bypass safety and risk silent data loss. Read-only git (`status`, `diff`, `log`, `show`, `blame`) is always safe.
+3. **No `--no-verify`, no `--no-gpg-sign`, no bypassing commit hooks.** If a hook fails, investigate the underlying issue.
+4. **`bun run devcheck` is the handoff gate between phases.** Work phases must hand back a green devcheck. If a phase can't reach green, halt and report the failing step verbatim rather than carrying broken state forward.
+5. **No marketing adjectives** in commits, tags, READMEs, or changelog entries — no "comprehensive", "robust", "enhanced", "seamless", "improved". State the change, not its quality.
+6. **One workflow per orchestration run.** Don't interleave two workflows in the same session. If a target needs both (e.g., maintenance surfaces a bug fix that needs field-testing first), sequence them as two workflow runs with a clean handoff in between.
+7. **`gh release create --notes-from-tag` is incompatible with `--repo`.** Always `cd` into the target repo directory for `gh release` commands.
+8. **Annotated tags only** (`git tag -a`), never lightweight. Tag annotation subject omits the version number — GitHub prepends `v<VERSION>:` to release titles when using `--notes-from-tag`, so including the version in the subject creates stutter.
+9. **Conventional Commits subjects** (`feat|fix|refactor|chore|docs|test|build(scope): message`). One logical concern per commit. The release commit (version bump + changelog + regenerated artifacts) lands on top of a stack of feature/fix commits, never collapsed alongside them.
+10. **Email on any artifact is the user's domain email**, never a personal address that might appear in git config.
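Reviewer note on rules 5 and 9 above: a sketch of how the two rules could be checked mechanically. The helper `checkSubject` is hypothetical. The type list and adjective list are copied from the rules, while the regex is only one possible reading of the pattern. It assumes the scope is optional and limited to lowercase letters, digits, and hyphens, which the skill text does not state.

```typescript
// Commit types from rule 9; the scope rules are an assumption, see the note above.
const SUBJECT =
  /^(feat|fix|refactor|chore|docs|test|build)(\([a-z0-9-]+\))?: \S.*$/;

// Adjectives listed in rule 5.
const MARKETING = ["comprehensive", "robust", "enhanced", "seamless", "improved"];

// Returns a list of problems; an empty list means the subject passes both checks.
function checkSubject(subject: string): string[] {
  const problems: string[] = [];
  if (!SUBJECT.test(subject)) {
    problems.push("subject does not match type(scope): message");
  }
  const lower = subject.toLowerCase();
  for (const word of MARKETING) {
    // Whole-word match, so unrelated words that merely contain one are not flagged.
    if (new RegExp(`\\b${word}\\b`).test(lower)) {
      problems.push(`marketing adjective: ${word}`);
    }
  }
  return problems;
}
```

Rule 5 gives its five adjectives as examples, so a real check would likely need a longer word list.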
+## Orchestrator Posture
+The orchestrator owns the goals. Workflow phases are not "run skill X" — they are "achieve goal Y, using skill X as the path." Sub-agents (when used) are instruments for hitting the goals, not the work itself. The same posture applies in linear mode — the orchestrator runs the phase directly, but the goal is still the contract.
+Before running a phase (or spawning a sub-agent for it), write down four things:
+1. **Goal** — the verifiable end state this phase must produce. Concrete and testable: "v0.5.2 tag exists at HEAD with structured-markdown annotation; `bun run devcheck` green; `npm view <pkg>@0.5.2` resolves." Not fuzzy: "ran the release-and-publish skill."
+2. **Primary sources** — the specific files, GH issues, and reference docs the sub-agent must read directly. Inlining content into the prompt is a paraphrase that loses nuance; agents grounded in the source catch details the orchestrator's summary missed. For GH issues, instruct `gh issue view N --comments` — body alone misses thread clarifications. The orchestrator reads these sources too (to construct the prompt), but that's prompt construction, not a substitute for the sub-agent reading them.
+3. **Path** — the Tier 1 skill(s) and steps that get to the goal. This is what gets handed to the sub-agent.
+4. **Verification** — the read-only checks that confirm the goal was hit. Defined upfront, not as an afterthought.
+Why the framing matters:
+- **Verification follows from goal definition.** If the goal is concrete, the verification is obvious — check that exact state. If the goal is fuzzy, verification degrades to "did the sub-agent say it worked?"
+- **Sub-agent self-reports describe intent, not always reality.** A goal you wrote down beforehand is the falsification target — the sub-agent's report is a hypothesis to verify against it.
+- **Replanning is local.** When verification fails, the goal is unchanged; the orchestrator picks a different path (re-spawn with the failure context, re-slice the work, intervene directly). Phase rework doesn't cascade.
+**Inform without inlining.** A well-grounded sub-agent prompt names the specific primary sources and the goal; it does NOT paraphrase them. "Review GH issue #123 (read it via `gh issue view 123 --comments`); the goal is X; verify with Y" is the right shape. Pasting the issue body into the prompt forces the sub-agent to work from a paraphrase. Let the sub-agent read the source and explore for additional context as needed.
+## Sub-Agent Strategy (if available)
+Sub-agents are optional. Match the mechanism to your platform's capability. There are three capability tiers (separate from the skill tiers in the Mental Model table), in increasing order of capability:
+1. **No fanout** — run phases linearly. The phase structure is the value; parallelism is the optimization.
+2. **Parallel sub-agents** — compose N prompts, launch concurrently, collect, verify (the pattern below). Single-target workflows usually run linearly anyway; multi-target workflows get one sub-agent per target.
+3. **Programmatic orchestration** — if your platform offers deterministic multi-agent control flow, use its primitives: schema-validated sub-agent returns, automatic concurrency management, resumable/journaled runs, and barrier-free pipelining across phases.
+Phases, gates, goals, and constraints are identical across all three tiers — only the fanout mechanism changes. Use the most capable tier available, and don't hand-roll what the platform does natively (e.g., rolling concurrency). Choose by scope and capability, not by default.
+The decision tree below is orthogonal to tier — it governs *whether* a given phase fans out, by target count and conflict risk:
+| Situation | Strategy |
+|:---|:---|
+| Single target, small change | Linear, orchestrator runs the phases itself |
+| Single target, large change likely to exhaust orchestrator context | Sub-agent per phase; orchestrator gates between phases |
+| N > 1 targets, independent work per target | One sub-agent per target per phase (parallel fanout) |
+| N > 1 targets, work that conflicts across targets (e.g., all editing the same file) | Linear or serial — the parallel model assumes target independence |
+| Sub-agents not available | Linear, regardless of N — same phases, just sequential |
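Reviewer note on the decision table above: one way to read the table as a function, with the "sub-agents not available" row taking precedence. The type, field names, and strategy labels are invented for illustration. The table allows "linear or serial" for conflicting targets, and this sketch arbitrarily returns `serial`.

```typescript
type Strategy =
  | "linear"
  | "sub-agent-per-phase"
  | "parallel-per-target"
  | "serial";

interface PhaseSituation {
  targets: number; // N in the table
  largeChange: boolean; // likely to exhaust orchestrator context
  crossTargetConflicts: boolean; // e.g. all targets editing the same file
  subAgentsAvailable: boolean;
}

function pickStrategy(s: PhaseSituation): Strategy {
  // Without sub-agents every situation collapses to linear, regardless of N.
  if (!s.subAgentsAvailable) return "linear";
  if (s.targets > 1) {
    // Parallel fanout assumes the targets are independent.
    return s.crossTargetConflicts ? "serial" : "parallel-per-target";
  }
  return s.largeChange ? "sub-agent-per-phase" : "linear";
}
```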
+### Orient block
+Every sub-agent prompt opens with this block. Sub-agents do not inherit the orchestrator's `CLAUDE.md`/`AGENTS.md` chain or skill registry — both must be reconstructed in the prompt. Substitute the bracketed values per target.
+```text
+You are working on `[project name]` at `[project absolute path]`.
+Orient first. These steps are required before any task work — do them in
+order. If any file does not exist, note it and continue.
+1. Read the global agent protocol at `~/.claude/CLAUDE.md` (or your agent's equivalent — `~/.codex/AGENTS.md`, etc.).
+2. Read the workspace-level protocol if one exists at `[workspace agent protocol path]`
+   — skip this step if no workspace-tier protocol applies.
+3. Read the project protocol at `[project absolute path]/CLAUDE.md` (or `AGENTS.md`, whichever the project keeps).
+4. Run `cd [project absolute path] && bun run list-skills` to see the project's
+   available skills with descriptions and locations.
+5. Read the skill file(s) for this task: `[Tier 1 skill paths]`.
+6. Read the primary sources for this task directly — design docs (`docs/design.md`),
+   GH issues (use `gh issue view <N> --comments` to capture the full thread, not
+   just the body), handoff documents, reference/gold-standard files. List each
+   one explicitly: `[primary source paths and gh commands]`. Skip this step only
+   if no primary source applies (rare).
+Only after that, begin the task below.
+**Goal:** [the verifiable end state this phase must produce — concrete, testable]
+**Path:** [Tier 1 skill(s) and steps the sub-agent should follow]
+**Constraints:** [no-go list — restate git/commit rules and other invariants verbatim]
+**Expected outputs:** [report shape you want back — e.g., "Step 8 numbered summary", "list of files touched with one-line rationale per fix"]
+```
+The sub-agent reads the primary sources directly during orient (step 6) — do not paste their contents into the prompt. The orchestrator names them; the sub-agent reads them.
+### Isolation rules
+1. **Bash `git` only in parallel sub-agents.** Do not let parallel sub-agents call `mcp__git-mcp-server__*` tools — session state (`set_working_dir`) leaks across parallel calls in the same orchestrator session, causing silent no-ops, wrong-directory operations, and false "tag already exists" errors. Bash `git` in the agent's CWD is reliable. The orchestrator may still use `git-mcp-server` itself in serial.
+2. **Sub-agents do not receive this orchestrations skill or workflow files.** Their prompts include Tier 1 skill paths only. This prevents recursive sub-agent spawning — if a sub-agent decides it needs to fan out work, that's a signal the orchestrator sliced the work too wide. Re-slice; don't let the sub-agent recurse.
+3. **Sub-agent prompts must restate the no-git-write and no-`stash` rules verbatim.** The orchestrator's `CLAUDE.md`/`AGENTS.md` rules aren't visible to sub-agents at prompt time.
+4. **Narrow scope per fanout.** A sub-agent doing "implement everything, write tests, run devcheck, polish, commit, tag" will exhaust its context window before finishing — the work lands on disk but the agent can't continue. Split phases so each sub-agent finishes well under the context limit. Plan a follow-up "finish" phase as a normal backstop, not a fallback for failure.
+### Parallel fanout pattern
+For N targets in a phase:
+1. Compose N sub-agent prompts (one per target) with the orient block + task body + workflow's phase-specific constraints
+2. Launch them as parallel sub-agents in a single orchestrator action
+3. Collect their reports
+4. Verify with a read-only orchestrator check before advancing to the next phase
+**Barriers only where gates sit.** Step 4's "advance to the next phase" implies a barrier — collect every target's phase-N result before any target starts phase N+1. That barrier is only required when a gate sits between the phases: a human decision (authorization, version-bump intent) or cross-target synthesis (the roll-up). Where no gate intervenes, a target may flow through consecutive phases independently — tier-3 platforms pipeline this for wall-clock, and even hand-spawned runs can let one sub-agent carry a target across adjacent gate-free phases. Keep the barrier at gate boundaries; drop it elsewhere.
+### Editor / wrap-up separation
+Editing phases and wrap-up phases never go in the same sub-agent. Editing sub-agents make file changes and run devcheck — they do not commit, tag, or push. Wrap-up sub-agents read the working tree, commit, tag, and (when releasing) push and publish — they do not edit source. This separation lets the orchestrator review diffs before they become permanent and keeps the commit graph clean.
+### Normalization
+Independent sub-agents diverge on incidental choices — scoped vs. unscoped package names, script invocation form, README hero structure, badge ordering. When choices should be uniform across targets, plan an explicit normalization step after the fanout — don't expect alignment for free.
+For small N or small diffs, the orchestrator normalizes directly. For large N or non-trivial fixes, spawn a narrow-scope fanout with an explicit rule list.
+### Rolling concurrency
+If your platform manages sub-agent concurrency automatically (tier 3), rely on it rather than hand-rolling the below. Otherwise: rate limits on parallel sub-agent spawning are intermittent — sometimes 15 concurrent agents work fine, sometimes 3 get throttled. Don't hard-cap; use rolling concurrency. Launch an initial batch, then as each agent completes, kick off the next in line. If a wave gets rate-limited, shrink the window for the next wave.
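Reviewer note on the rolling concurrency paragraph above: a minimal sketch of the "launch a batch, then start the next as each completes" idea. `runRolling` is hypothetical and not a framework API. Each task is a thunk standing in for one sub-agent run. Shrinking the window after a rate-limited wave is not implemented here.

```typescript
// Runs `tasks` with at most `windowSize` in flight at any time.
// Results are reported per task, in input order, whether they succeeded or failed.
async function runRolling<T>(
  tasks: Array<() => Promise<T>>,
  windowSize: number,
): Promise<PromiseSettledResult<T>[]> {
  const results: PromiseSettledResult<T>[] = [];
  const queue = tasks.map((task, index) => ({ task, index }));

  // Each worker pulls the next unstarted task as soon as its current one settles.
  async function worker(): Promise<void> {
    for (let job = queue.shift(); job !== undefined; job = queue.shift()) {
      try {
        results[job.index] = { status: "fulfilled", value: await job.task() };
      } catch (reason) {
        results[job.index] = { status: "rejected", reason };
      }
    }
  }

  const workerCount = Math.max(1, Math.min(windowSize, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

A caller could inspect the rejected entries after a run and retry them with a smaller `windowSize`, which is roughly the "shrink the window for the next wave" step.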
+### Cross-project naming hygiene
+When N targets share a phase, never name other targets in a sub-agent's prompt — even as examples. Sub-agents pattern-match on everything in their prompt, and cross-project names leak into commits, messages, and variable names. Each sub-agent's prompt references its own target only.
+## Verification (orchestrator)
+Verification runs against the goal *you* defined for the phase — not against the sub-agent's self-report. A sub-agent that reports "done" without producing the goal state is not done. The artifact checks below are the *means* of confirming the goal; pick the ones that exercise your specific goal definition.
+Sub-agent self-reports describe intent, not always reality. After every phase that touched the filesystem or remote services, run a read-only check against the goal:
+- **Files** — `ls`, `git status`, `git diff --stat`
+- **Commits** — `git log --oneline -5`
+- **Tags** — `git tag --points-at HEAD`, `git ls-remote --tags origin`
+- **GitHub** — `gh repo view --json visibility`, `gh release view v<VERSION>`, `gh issue list`, `gh issue view <N> --comments` to confirm the fix comment landed
+- **npm / registries** — `npm view <pkg>@<version>`, registry-specific checks
+- **Build state** — re-run `bun run devcheck` if the previous phase was supposed to land green
+- **Quality** — tag annotation reads as structured markdown (not a flat string), subject omits the version number, no marketing adjectives, dep arrows present where applicable, issue backlinks where applicable
+If verification disagrees with the sub-agent's report, that's the signal to re-spawn with the actual state and the unmet goal in the prompt — not to trust the report. The goal hasn't changed; only the path needs to.
+## Authorization Flow
+| Phase type | Authorization required |
+|:---|:---|
+| Reads, analysis, file edits (working tree only) | Implicit — initial workflow approval covers these |
+| Local commits, annotated tags | Explicit at workflow start; durable through workflow end |
+| Push to remote, npm / registry publish, GH release create, Docker push | Explicit at workflow start; durable through workflow end |
+| Destructive ops (force push, tag delete, remote branch delete, etc.) | Always re-confirm, never assume |
+Pipeline authorization is durable through to completion. Once the user authorizes a workflow run, don't re-ask at each phase boundary — proceed automatically through gates that pass. Conditions that always require a fresh check-in: destructive ops on shared resources, external actions without sign-off, errors that need human judgment.
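The authorization table can be read as a small lookup. The sketch below is one possible encoding; the `PhaseKind` labels and function names are hypothetical, and it assumes the user has already approved the workflow as a whole.

```typescript
// Hypothetical labels for the rows of the authorization table.
type PhaseKind = "read" | "edit" | "commit" | "tag" | "push" | "publish" | "destructive";
type Authorization = "implicit" | "explicit-durable" | "always-reconfirm";

function authorizationFor(kind: PhaseKind): Authorization {
  switch (kind) {
    case "read":
    case "edit":
      return "implicit"; // covered by the initial workflow approval
    case "commit":
    case "tag":
    case "push":
    case "publish":
      return "explicit-durable"; // asked once at workflow start, then durable
    case "destructive":
      return "always-reconfirm"; // force push, tag delete, remote branch delete, etc.
  }
}

// True when the orchestrator must stop and ask before running this phase.
function needsCheckIn(kind: PhaseKind, explicitlyAuthorizedAtStart: boolean): boolean {
  const level = authorizationFor(kind);
  if (level === "always-reconfirm") return true;
  if (level === "explicit-durable") return !explicitlyAuthorizedAtStart;
  return false;
}
```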
+## Workflow File Discipline
+Workflow files are thin by design. Each phase row in a workflow's phases table maps to a Tier 1 skill or a thin orchestration step. **Phase notes are for orchestration overrides only** — sequencing rules, fanout-specific constraints, non-obvious instructions, decisions the foundational skill leaves to the caller. Never paraphrase what a foundational skill already documents. A phase that runs a Tier 1 skill end-to-end with no orchestration override needs no phase note — just the row in the table.
+The same discipline applies to gotchas: workflow-specific gotchas are about the orchestration pattern itself (e.g., parallel sub-agent context exhaustion, normalization gaps). Gotchas about a Tier 1 skill's internals belong in that skill, not the workflow.
+## When the Workflow List Doesn't Fit
+For scenarios that don't map cleanly to one of the four workflow files — security audits across N servers, framework-wide migrations, design-only extensions, ad-hoc multi-step work — the universal rules and sub-agent strategy above still apply. Author a new workflow file at `workflows/<scenario>.md` when the pattern is repeatable enough to codify. Follow the shape of the existing workflow files. Open with the back-pointer every workflow carries — a "Read `../SKILL.md` first" tail on the frontmatter `description`, plus a "Use after reading `../SKILL.md`." line under the H1 — so an orchestrator that opens the file directly (not routed through this skill) still picks up the universal rules and sub-agent strategy. Then, when applicable: Tier 1 skills referenced, pre-flight, phases table, phase notes, workflow-specific gotchas, checklist. Apply the workflow file discipline above.
+## Pre-flight Checklist (every workflow)
+Verify before kicking off the first phase. Workflow files add their own pre-flight items on top of these.
+- [ ] Target list captured with absolute paths
+- [ ] Intent and state signals point to a single workflow (or confirmed with user if ambiguous)
+- [ ] Selected workflow file read end-to-end
+- [ ] Phase objectives understood (the Objective column of the phases table is the goal contract — verification runs against these)
+- [ ] Plan surfaced to user: workflow, targets, phase objectives, applicable universal rules
+- [ ] User authorization captured for the workflow's commit/push/publish phases (if any apply)
+- [ ] Sub-agent capability confirmed (or fallback to linear execution noted)

package/skills/orchestrations/workflows/field-test-fix.md ADDED

@@ -0,0 +1,206 @@
+---
+name: field-test-fix
+description: >
+  Workflow: field-test one or more existing MCP server projects against the live upstream API, file GH issues for valid findings, deploy fix sub-agents per server, optionally loop until clean, then wrap up and release. Chains the `field-test`, `report-issue-local`, `tool-defs-analysis`, `code-simplifier`, `git-wrapup`, and `release-and-publish` skills. Read `../SKILL.md` first for the universal rules and sub-agent strategy.
+metadata:
+  author: cyanheads
+  version: "1.0"
+  audience: internal
+  type: workflow
+---
+# Field-Test + Fix Workflow
+Use after reading `../SKILL.md`. Drives field-testing, issue filing, fix application, verification, and (optional) release across N MCP server projects.
+## When applicable
+- One or more existing servers need a QA pass against the live upstream API — quality gate before launch, post-release smoke test, or "find and fix bugs" instruction
+- The user wants observed bugs filed as GH issues and then fixed
+- Optionally ends in a release; can also stop at "fixed, committed locally" if release isn't authorized yet
+For known work (issues already tracked, handoff documents) where the discovery phase isn't needed, use `fix-wrapup-release.md` instead.
+## Tier 1 skills referenced
+| Phase | Tier 1 skill(s) |
+|:---|:---|
+| Field-test | `skills/field-test/SKILL.md` |
+| Issue filing | `skills/report-issue-local/SKILL.md` + `.github/ISSUE_TEMPLATE/` |
+| Tool definition quality (informs field-test framing) | `skills/tool-defs-analysis/SKILL.md` |
+| Fix | (No single skill — sub-agent reads issues, validates, fixes) |
+| Code simplify (optional) | `skills/code-simplifier/SKILL.md` |
+| Wrap-up | `skills/git-wrapup/SKILL.md` |
+| Release | `skills/release-and-publish/SKILL.md` |
+## Pre-flight
+Per target:
+1. **Clean working tree** — `git status --short` must be empty
+2. **Current version** — `git describe --tags --abbrev=0`, `grep '"version"' package.json`
+3. **API keys** — `.env` files exist for servers requiring them. If missing, surface the list with registration URLs before proceeding.
+4. **Issue template + `report-issue-local` skill present** per target
+5. **`list-skills` script present** — `test -f scripts/list-skills.ts && grep -q '"list-skills"' package.json`
+6. **Repo visibility** — `gh repo view --json visibility -q '.visibility'` per target. Determines wrap-up scope.
+7. **Build** — `bun run rebuild` per target, parallel. All must pass before Phase 1.
+## Phases
+Each phase's Objective column is the goal state per target — the verifiable end state the phase must produce.
+| # | Phase | Objective | Sub-agent mode |
+|:--|:---|:---|:---|
+| 1 | Field-test | Per target: live tool/resource/prompt surface exercised across happy/error/edge paths; valid findings filed as GH issues against the server's own repo; noise filtered | parallel fanout per target; within a target, 1 or 3 sub-agents (see below) |
+| 2 | Issue triage | Per-target GH issue count + severity breakdown reconciled against actual GH state | orchestrator (serial) |
+| 3 | Fix | Per target: priority issues fixed in source, tests updated, `devcheck` + `test` green, each issue commented with fix details, working tree dirty for review | parallel fanout (one sub-agent per target — hard constraint) |
+| 4 | Verify | Per target: full diff cold-reviewed; simplified if warranted; each fix re-exercised against the running server with actual tool output in the summary | parallel fanout |
+| 5 | Loop decision | Orchestrator decision recorded — proceed to release, loop another field-test cycle, or pause/surface to user. Evidence-based | orchestrator (serial) |
+| 6 | Wrap-up + release | (Optional) Per target: fixes split into per-file commits with a release commit on top; annotated tag; published per repo visibility; tag annotation is structured markdown with issue backlinks | parallel fanout (Bash git only) |
+| 7 | Issue cleanup | Every GH issue that shipped a fix closed with "Fixed in v\<version\>" comment; skipped issues remain open | orchestrator (serial) |
+Phase 6 is optional — stop earlier if release isn't authorized. Phase 7 only runs if Phase 6 ran.
+## Phase notes
+### Phase 1: Field-test
+**Default: one comprehensive sub-agent per target** that covers happy paths, error paths, and edge cases in sequence. Use three separate sub-agents only when the server has 8+ tools and a single agent would exhaust context.
+| Category | What to test |
+|:---|:---|
+| Happy paths | Every tool/resource/prompt with realistic input — output shape, `content[]` readability, `structuredContent` parity, field selection |
+| Error paths | Invalid inputs, missing fields, wrong types — error contract verification (code/reason), error text actionability |
+| Edge cases | Boundaries, empty results, pagination limits, special characters, domain-specific oddities — crash resistance, 0-result messaging, date boundaries |
+**Sub-agent isolation.** Each sub-agent gets a unique field-test ID in its helper file path: `/tmp/<project-name>-field-test-<ID>.sh`. Convention: `<SERVER-PREFIX>-<HP|ER|EC>-<5CHAR>`. The helper script is stateless — every function takes IDs as positional args.
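A sketch of how an orchestrator might mint IDs in the `<SERVER-PREFIX>-<HP|ER|EC>-<5CHAR>` shape and derive the helper path. The function names and the suffix alphabet are assumptions; the source fixes only the suffix length. Reading `HP`, `ER`, and `EC` as the happy-path, error-path, and edge-case categories is also an inference from the table above.

```typescript
// Presumably happy paths, error paths, edge cases (inferred from the category table).
type TestCategory = "HP" | "ER" | "EC";

// Assumed alphabet for the 5-character suffix; only its length is specified.
const SUFFIX_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

function makeFieldTestId(serverPrefix: string, category: TestCategory): string {
  let suffix = "";
  for (let i = 0; i < 5; i++) {
    suffix += SUFFIX_ALPHABET.charAt(Math.floor(Math.random() * SUFFIX_ALPHABET.length));
  }
  return `${serverPrefix}-${category}-${suffix}`;
}

// Helper script path in the documented form /tmp/<project-name>-field-test-<ID>.sh
function helperPath(projectName: string, fieldTestId: string): string {
  return `/tmp/${projectName}-field-test-${fieldTestId}.sh`;
}
```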
+**Build skip.** Pre-flight built the project. Tell sub-agents to modify their `mcp_start` helper to skip `bun run rebuild` — just start the server. This avoids concurrent builds racing on `dist/`. Each agent starts its own server instance; ports auto-increment.
+**Issue filing.** Sub-agents file GH issues against the server's own repo using `report-issue-local` patterns. Constraints:
+- **Noise filter** — before filing, the sub-agent asks: "Would a maintainer coming to this cold say 'yes, this needs fixing'?" If not, skip.
+- **`gh issue create` with `--title` and `--body`** (not `--web`) — include server version, framework version, runtime, transport, repro steps, actual vs expected behavior
+- **Do NOT file against `@cyanheads/mcp-ts-core`** unless the bug is clearly in the framework — file against the server's own repo
+- **Redact secrets** — API keys, tokens, etc.
+Sub-agent reads `skills/tool-defs-analysis/SKILL.md` as a primer — field-testing evaluates the agent-facing surface during live use, not just statically.
+### Phase 2: Issue triage
+Orchestrator verifies filed issues exist via `gh issue list -R <owner>/<repo>` per target. Reconciles sub-agent reports against actual GH state (sub-agents sometimes report filing but hit errors). Produces a per-target issue count and severity breakdown. If all sub-agents found 0 issues, skip to Phase 6 (or end the workflow if no release is authorized).
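The reconciliation step amounts to a set comparison. In the sketch below, `reported` holds the issue numbers a sub-agent claims to have filed and `actual` holds the numbers observed on GitHub (for example, parsed from `gh issue list -R <owner>/<repo> --json number`). The type and function names are hypothetical.

```typescript
interface Reconciliation {
  confirmed: number[];  // reported by the sub-agent and present on GitHub
  missing: number[];    // reported but not found, so filing likely failed
  unreported: number[]; // present on GitHub but absent from the report
}

function reconcileIssues(reported: number[], actual: number[]): Reconciliation {
  const actualSet = new Set(actual);
  const reportedSet = new Set(reported);
  return {
    confirmed: reported.filter((n) => actualSet.has(n)),
    missing: reported.filter((n) => !actualSet.has(n)),
    unreported: actual.filter((n) => !reportedSet.has(n)),
  };
}
```

A non-empty `missing` list is the cue to re-spawn the filing step with the actual state rather than trusting the report.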
+### Phase 3: Fix
+**One sub-agent per target — hard constraint.** No file-locking system exists for concurrent edits; multiple agents touching the same server's `src/` will conflict.
+Each sub-agent:
+1. Reads all open issues for its target via `gh issue list` + `gh issue view N --comments` (full thread — body alone misses clarifications)
+2. **Validates each issue against source code** — if the source doesn't confirm the reported behavior, the issue was misdiagnosed, and "fixing" it fixes nothing
+3. Implements fixes in priority order: security → bugs → UX
+4. Rebuilds after each fix or group of related fixes
+5. Field-tests each fix live (starts server, runs repro steps from the issue)
+6. Runs `bun run devcheck` and `bun run test` — exit gate
+7. Comments on each GH issue with a concise fix summary
+8. Leaves everything uncommitted
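The priority order in step 3 (security, then bugs, then UX) could be applied as a simple sort. The `priority` field and its values are hypothetical; in practice they would be derived from issue labels or from the triage breakdown.

```typescript
type FixPriority = "security" | "bug" | "ux";

interface TriagedIssue {
  number: number;
  priority: FixPriority;
}

const PRIORITY_RANK: Record<FixPriority, number> = { security: 0, bug: 1, ux: 2 };

// Returns a new array: security first, then bugs, then UX; ties broken by issue number.
function inFixOrder(issues: TriagedIssue[]): TriagedIssue[] {
  return [...issues].sort(
    (a, b) => PRIORITY_RANK[a.priority] - PRIORITY_RANK[b.priority] || a.number - b.number,
  );
}
```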
+**Constraints to restate:**
+- Surgical fixes only — don't refactor surrounding code unless the fix requires it
+- If a fix is disproportionate (major architecture change), note it on the issue and skip
+- Every fix verified live, not just compiled — include actual tool call output in the summary
+### Phase 4: Verify
+Fresh sub-agent per target, reads the full `git diff` cold. Two passes in one sub-agent:
+1. **Code-simplify** — read `code-simplifier`, review the full diff through that lens, apply cleanup if warranted. Skip if changes are minimal — don't run as ceremony.
+2. **Re-field-test** — spin up the server, run the repro steps from each fixed issue, include actual tool call output in the summary.
+Exit gate: `bun run devcheck && bun run rebuild && bun run test`.
+### Phase 5: Loop decision
+| Signal | Action |
+|:---|:---|
+| All fixes validated, devcheck + tests green | Proceed to Phase 6 (or end if release not authorized) |
+| Fix sub-agent reported skipped issues (disproportionate) | Note; proceed unless critical |
+| Fix sub-agent couldn't reach green gates | Respawn fix sub-agent with specific failure context |
+| Major architectural issues surfaced | Pause, surface to user |
+The orchestrator makes this call based on evidence — don't defer when the data is clear.
+If looping: respawn Phase 1 + Phase 3 for targets that had fixes applied; skip targets that passed clean. Diminishing returns after 2 cycles.
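The signal table can be expressed as a decision function. This is a sketch with assumptions: the field names are hypothetical, the order in which signals are checked is a choice made here, and the table does not say what to do when a skipped issue is critical, so this version pauses for the user in that case.

```typescript
type LoopAction = "proceed-to-release" | "end-workflow" | "respawn-fix" | "pause-for-user";

interface LoopEvidence {
  gatesGreen: boolean;             // all fixes validated, devcheck + tests green
  skippedIssueIsCritical: boolean; // a skipped (disproportionate) issue is critical
  majorArchitecturalIssue: boolean;
  releaseAuthorized: boolean;
}

function decideNextStep(evidence: LoopEvidence): LoopAction {
  if (evidence.majorArchitecturalIssue) return "pause-for-user";
  if (!evidence.gatesGreen) return "respawn-fix";
  // Assumption: "proceed unless critical" is read as "pause when critical".
  if (evidence.skippedIssueIsCritical) return "pause-for-user";
  return evidence.releaseAuthorized ? "proceed-to-release" : "end-workflow";
}
```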
+### Phase 6: Wrap-up + release (optional)
+Each sub-agent reads both `skills/git-wrapup/SKILL.md` and `skills/release-and-publish/SKILL.md`.
+**Commit structure.** Fixes are NOT collapsed into a single commit. Per the universal git rules:
+1. Analyze the diff (`git diff --stat`, then spot-check actual changes)
+2. Group by file boundaries — fixes sharing a file ship in the same commit
+3. Commit each group: `fix(scope): description` (Conventional Commits)
+4. Release commit on top — version bump + changelog + regenerated artifacts as `chore(release): v<version>`
+5. Tag the release commit
+The tag annotation and changelog cover ALL fixes — the commit split is about git history, not release notes.
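One way to read "group by file boundaries" is as connected components: two fixes belong to the same commit if they touch a common file, directly or through a chain of shared files. The sketch below implements that reading with a small union-find. The transitive interpretation and all names are assumptions, not something the skill specifies.

```typescript
interface AppliedFix {
  issue: number;
  files: string[]; // paths touched by this fix
}

// Groups fixes so that any two fixes sharing a file land in the same group.
function groupFixesByFile(fixes: AppliedFix[]): AppliedFix[][] {
  const parent = fixes.map((_, i) => i);
  const find = (i: number): number => {
    let root = i;
    while (parent[root] !== root) root = parent[root]!;
    return root;
  };

  const firstFixForFile = new Map<string, number>();
  fixes.forEach((fix, i) => {
    for (const file of fix.files) {
      const owner = firstFixForFile.get(file);
      if (owner === undefined) firstFixForFile.set(file, i);
      else parent[find(i)] = find(owner); // merge with the group that already owns the file
    }
  });

  const groups = new Map<number, AppliedFix[]>();
  fixes.forEach((fix, i) => {
    const root = find(i);
    const group = groups.get(root) ?? [];
    group.push(fix);
    groups.set(root, group);
  });
  return Array.from(groups.values());
}
```

Each returned group would become one `fix(scope): description` commit, with the release commit added on top.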
+**Version bump.** Default **patch** for field-test fix releases. **Minor** when enhancements are bundled in.
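The bump rule is small enough to state as code. A minimal sketch, assuming plain `MAJOR.MINOR.PATCH` versions with no pre-release suffix; the function name is hypothetical.

```typescript
// Patch by default; minor when enhancements are bundled into the release.
function bumpForFieldTestRelease(version: string, hasEnhancements: boolean): string {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) throw new Error(`Not a plain MAJOR.MINOR.PATCH version: ${version}`);
  const major = Number(match[1]);
  const minor = Number(match[2]);
  const patch = Number(match[3]);
  return hasEnhancements ? `${major}.${minor + 1}.0` : `${major}.${minor}.${patch + 1}`;
}
```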
+**Tag annotation format.** Tag subject omits the version number. Structured markdown:
+```
+Field-test bug fixes across N tools
+Fixed:
+- <tool_name>: <one-line fix description> (#<issue>)
+- <tool_name>: <one-line fix description> (#<issue>)
+<test count>; `bun run devcheck` clean.
+```
+Add a `Security:` section when the changelog frontmatter sets `security: true`.
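The annotation format above can be generated from the list of shipped fixes. In this sketch the names are hypothetical, `N` is taken to be the number of distinct tools, the blank lines between sections are an assumption, and the layout of the `Security:` section is a guess modeled on `Fixed:`.

```typescript
interface ShippedFix {
  tool: string;
  summary: string; // one-line fix description
  issue: number;
}

function buildTagAnnotation(
  fixes: ShippedFix[],
  testSummary: string,
  securityNotes: string[] = [],
): string {
  const toolCount = new Set(fixes.map((f) => f.tool)).size;
  const lines = [
    `Field-test bug fixes across ${toolCount} tools`, // subject: no version number
    "",
    "Fixed:",
    ...fixes.map((f) => `- ${f.tool}: ${f.summary} (#${f.issue})`),
  ];
  if (securityNotes.length > 0) {
    lines.push("", "Security:", ...securityNotes.map((note) => `- ${note}`));
  }
  lines.push("", testSummary + "; `bun run devcheck` clean.");
  return lines.join("\n");
}
```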
+**Wrap-up scope.** Determined by repo visibility:
+| Status | Scope |
+|:---|:---|
+| Private / in-development | Version bump → changelog → commit → tag → mcpb bundle → push → `gh release create`. Skip `bun publish`, Docker, MCP Registry. |
+| Public / launched | Full `release-and-publish`: push + `bun publish` + `publish-mcp` + bundle + GH release + Docker (if Dockerfile). |
+### Phase 7: Issue cleanup
+Close issues that shipped fixes — only those. Skipped issues stay open.
+```bash
+for n in <fixed-issue-numbers-from-phase-3>; do
+  gh issue close "$n" -R "<owner>/<repo>" --reason completed --comment "Fixed in v<version>."
+done
+```
+Collect specific issue numbers from Phase 3 sub-agent summaries — do not close all open issues indiscriminately.
+## Workflow-specific gotchas
+| # | Gotcha | Mitigation |
+|:--|:-------|:-----------|
+| 1 | 3 sub-agents per server racing on `bun run rebuild` corrupt `dist/` | Pre-flight builds once; sub-agents skip rebuild in `mcp_start` |
+| 2 | Tmp file collisions between concurrent field-test sub-agents | Unique IDs per agent in helper path: `/tmp/<project>-field-test-<ID>.sh` |
+| 3 | Sub-agents file issues against `@cyanheads/mcp-ts-core` instead of the server | Restate explicitly in every Phase 1 prompt: "Do NOT file issues against mcp-ts-core" |
+| 4 | Multiple fix sub-agents editing the same server's files | Hard constraint: 1 sub-agent per server in Phase 3 |
+| 5 | Fix sub-agent can't live-verify due to API quota exhaustion | Accept fixes where the root cause is code-evident (wrong field path, missing guard); note that live verification was blocked by quota |
+| 6 | Sub-agents file noise issues (nits, style preferences, bikeshedding) | Noise filter instruction in every Phase 1 prompt; sub-agents self-filter before filing |
+| 7 | Field-test sub-agent reports success but didn't actually exercise the tool | Sub-agent must include actual tool call output in the summary; orchestrator spot-checks |
+| 8 | Wrap-up sub-agent collapses multi-fix diff into a single commit | Phase 6 prompt enumerates the commit structure — group by file, release commit on top |
+| 9 | Wrap-up sub-agent makes unplanned intermediate commits outside the planned structure | Prompt defines the exact commit shape; sub-agents must not invent extras |
+| 10 | Loop decision deferred to user when orchestrator has enough data | Orchestrator decides on evidence |
+| 11 | MCP Registry returns 502 transiently during publish | Retry up to 2x with backoff. First attempt may fail; second usually succeeds |
+| 12 | Private repos need upstream set before first push | Agents should use `git push -u origin main` if upstream is unset |
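Gotcha 11 (retry up to 2x with backoff) could look roughly like the wrapper below. It is a generic sketch: `withRetry` is a hypothetical helper, the doubling delay is an assumed backoff policy, and it retries on any error, whereas a real publish step would probably retry only on transient failures such as a 502.

```typescript
// Runs `attempt`, retrying up to `maxRetries` more times with a doubling delay between tries.
async function withRetry<T>(
  attempt: () => Promise<T>,
  maxRetries = 2,
  baseDelayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let tryIndex = 0; tryIndex <= maxRetries; tryIndex++) {
    try {
      return await attempt();
    } catch (error) {
      lastError = error;
      if (tryIndex < maxRetries) {
        const delayMs = baseDelayMs * Math.pow(2, tryIndex);
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError; // every attempt failed
}
```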
+## Checklist
+- [ ] Pre-flight: working trees clean, API keys present, issue templates exist, all targets build clean
+- [ ] Phase 1: field-test sub-agents launched with unique IDs, build-skip, orient blocks
+- [ ] Phase 1: sub-agents tore down servers and cleaned tmp files before reporting
+- [ ] Phase 2: issue counts verified against GH state
+- [ ] Phase 3: 1 sub-agent per server (hard constraint), priority order followed, exit gate (devcheck + test) green
+- [ ] Phase 3: GH issues commented with fix details
+- [ ] Phase 4: verify pass — code-simplify (if applicable) + re-field-test, actual outputs in summary
+- [ ] Phase 5: loop decision made on evidence
+- [ ] Phase 6 (if releasing): version bumped, fix commits + release commit, annotated tag, scope matches private/public status
+- [ ] Phase 7 (if releasing): fixed issues closed; skipped issues remain open
+- [ ] Post-workflow verification: `git ls-remote --tags origin`, `npm view <pkg>@<version>` if public, GH release artifacts attached
+- [ ] Tag/release quality review: tag subject omits version number, structured markdown, no marketing adjectives, issue backlinks present