npm - @jgamaraalv/ts-dev-kit - Versions diffs - 2.0.0 → 2.2.0 - Mend

@jgamaraalv/ts-dev-kit 2.0.0 → 2.2.0


Files changed (7)

package/.claude-plugin/marketplace.json +1 -1
package/.claude-plugin/plugin.json +1 -1
package/README.md +36 -0
package/package.json +1 -1
package/skills/task/SKILL.md +73 -31
package/skills/task/references/output-templates.md +9 -7
package/skills/task/references/verification-protocol.md +101 -0

package/.claude-plugin/marketplace.json CHANGED

@@ -12,7 +12,7 @@
       "name": "ts-dev-kit",
       "source": "./",
       "description": "13 specialized agents and 16 skills for TypeScript fullstack development",
-      "version": "2.0.0",
+      "version": "2.2.0",
       "author": {
         "name": "jgamaraalv"
       },

package/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "ts-dev-kit",
-  "version": "2.0.0",
+  "version": "2.2.0",
   "description": "13 specialized agents and 16 skills for TypeScript fullstack development with Fastify, Next.js, PostgreSQL, Redis, and more.",
   "author": {
     "name": "jgamaraalv",

package/README.md CHANGED

@@ -100,6 +100,42 @@ claude --plugin-dir ./ts-dev-kit
 ---
+## Recommended MCP Servers
+Some agents and skills reference external MCP tools for documentation lookup, browser debugging, E2E testing, and web fetching. These are **optional** — skills degrade gracefully without them — but installing them unlocks the full experience.
+| MCP Server    | Used By                                  | Purpose                              |
+| ------------- | ---------------------------------------- | ------------------------------------ |
+| context7      | Most skills (doc lookup)                 | Query up-to-date library docs        |
+| playwright    | playwright-expert, debugger, test-generator | Browser automation and E2E testing |
+| chrome-devtools | debugger                               | Frontend debugging, screenshots      |
+| firecrawl     | task skill                               | Web fetching and scraping            |
+### Installing as Claude Code plugins
+```bash
+claude plugin add context7
+claude plugin add playwright
+claude plugin add firecrawl
+```
+### Installing as standalone MCP servers
+```bash
+# context7 — no API key required
+claude mcp add context7 -- npx -y @upstash/context7-mcp@latest
+# playwright — no API key required
+claude mcp add playwright -- npx -y @playwright/mcp@latest
+# firecrawl — requires FIRECRAWL_API_KEY
+claude mcp add firecrawl --env FIRECRAWL_API_KEY=your-key -- npx -y firecrawl-mcp
+```
+> **chrome-devtools** requires Chrome running with remote debugging enabled (`--remote-debugging-port=9222`). Refer to the [Chrome DevTools MCP docs](https://github.com/anthropics/mcp-chrome-devtools) for setup instructions.
+---
 ## Customizing for Your Project
 This kit ships with a project orchestration template at `docs/rules/orchestration.md.template`. It defines quality gates, workspace commands, and dependency ordering that you can adapt to your own monorepo or project.

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@jgamaraalv/ts-dev-kit",
-  "version": "2.0.0",
+  "version": "2.2.0",
   "description": "Claude Code plugin: 13 agents + 16 skills for TypeScript fullstack development",
   "author": "jgamaraalv",
   "license": "MIT",

package/skills/task/SKILL.md CHANGED

@@ -108,8 +108,7 @@ Match skills to the sub-area identified in domain_areas:
 **Cross-cutting** → combine skills from each sub-area involved.
 </skill_map>
-In SINGLE-ROLE mode: call each skill yourself before starting phase 4.
-In MULTI-ROLE mode: include explicit Skill() call instructions in each subagent prompt (see references/agent-dispatch.md).
+In ALL execution modes: include explicit Skill() call instructions in each subagent prompt (see references/agent-dispatch.md). The orchestrator does not need to load skills itself — agents load them before writing code.
 </required_skills>
 <available_mcps>
@@ -235,24 +234,28 @@ Constraints:
 <execution_mode_decision>
 At the end of phase 2, make an explicit execution mode decision and state it to the user:
-> **EXECUTION MODE: SINGLE-ROLE** — I will implement all changes directly.
+> **EXECUTION MODE: SINGLE-ROLE** — Single domain. I will dispatch 1-2 focused agents for implementation.
 OR
-> **EXECUTION MODE: MULTI-ROLE** — I will act as orchestrator, dispatching specialized agents via the Task tool.
+> **EXECUTION MODE: MULTI-ROLE** — Multiple domains. I will dispatch specialized agents across domains via the Task tool.
 OR
 > **EXECUTION MODE: PLAN** — The task is highly complex. I will enter plan mode to design a structured implementation plan before executing.
+**CRITICAL: In ALL execution modes, the orchestrator (main session) NEVER writes application code directly.** All implementation — components, hooks, pages, routes, services, tests — is delegated to agents via the Task tool. The orchestrator's role is: context gathering, agent dispatch, output review, integration glue (under 15 lines), and quality gates.
+The difference between SINGLE-ROLE and MULTI-ROLE is decomposition complexity, NOT whether agents are used:
+- **SINGLE-ROLE**: simpler decomposition (1-2 agents, same domain). Use when the task is contained within one package or domain.
+- **MULTI-ROLE**: complex decomposition (3+ agents, multiple domains). Use when the task spans packages or skill sets.
 Use PLAN mode when:
 - The task has 4+ distinct roles or implementation phases.
 - The scope is large enough that context window management becomes a concern.
 - The task benefits from upfront architectural planning before any code is written.
-In PLAN mode: use EnterPlanMode to design the full plan. Once the user approves it, exit plan mode and execute phases sequentially as MULTI-ROLE orchestrator, with context cleanup between major phases when needed.
-Follow this decision in phase 4. In MULTI-ROLE and PLAN modes, delegate application code to agents — your job is dispatch, review, integration, and quality gates.
+In PLAN mode: use EnterPlanMode to design the full plan. Once the user approves it, exit plan mode and execute phases sequentially as orchestrator, with context cleanup between major phases when needed.
 </execution_mode_decision>
 </phase_2b_multi_role_decomposition>
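Review note: the mode rules added in this hunk can be read as a small decision function. The sketch below is only an illustration; the skill leaves the decision to the orchestrator's judgment. `TaskShape`, its fields, and `chooseExecutionMode` are hypothetical names, and treating any task that spans more than one domain as MULTI-ROLE is an assumption used to break ties the text leaves open.

```typescript
type ExecutionMode = "SINGLE-ROLE" | "MULTI-ROLE" | "PLAN";

// Hypothetical summary of a task; not part of the package.
interface TaskShape {
  roles: number;             // distinct roles or implementation phases
  domains: number;           // packages or domains the task touches
  needsUpfrontPlan: boolean; // large scope or architecture-first work
}

function chooseExecutionMode(task: TaskShape): ExecutionMode {
  // PLAN: 4+ distinct roles/phases, or upfront planning is warranted.
  if (task.roles >= 4 || task.needsUpfrontPlan) {
    return "PLAN";
  }
  // MULTI-ROLE: 3+ agents or more than one domain (tie-break is an assumption).
  if (task.roles >= 3 || task.domains > 1) {
    return "MULTI-ROLE";
  }
  // SINGLE-ROLE: 1-2 agents within a single package or domain.
  return "SINGLE-ROLE";
}
```

In every branch the result only changes how the work is decomposed; agents are dispatched in all three modes.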
@@ -278,51 +281,74 @@ Follow this decision in phase 4. In MULTI-ROLE and PLAN modes, delegate applicat
 4. For questions about project libraries, use Context7 (`mcp__context7__resolve-library-id` → `mcp__context7__query-docs`) to query up-to-date documentation. If anything is ambiguous, ask the user before proceeding.
 5. Check for helpful MCPs — does the task involve browser testing, external docs?
 6. Plan the implementation order — determine which changes must happen first (e.g., types before lib before hooks before components before pages).
+7. Generate the verification plan — build a before/after test plan combining task-defined criteria with automatic checks based on domain and available MCPs. See references/verification-protocol.md for the full protocol.
+   - Detect available testing MCPs (playwright, chrome-devtools, or none).
+   - Map the task domain to checks: frontend → visual + performance, backend → API responses, database → schema state.
+   - Always include standard quality gates (lint, build, test) as baseline.
+   - Present the plan:
+     > **Verification plan:**
+     > - Baseline checks: [list]
+     > - MCPs for verification: [list or "none available — shell-only checks"]
+     > - Post-change checks: [list]
 </phase_3_task_analysis>
-<phase_4_execution>
-Before writing any code, check the execution mode decision from phase 2.
+<phase_3b_baseline_capture>
+**MANDATORY.** Run the verification plan before writing any code to establish the baseline for comparison. Do NOT skip this phase.
+**Step 1: Standard quality gates** — run and record results (pass/fail, counts, bundle sizes).
+**Step 2: MCP-based checks** — follow this decision tree in order:
+1. Use ToolSearch to confirm which browser MCPs are available (playwright, chrome-devtools, or neither).
+2. **If browser MCPs are available AND the task touches frontend pages:**
+   a. Check if the dev server is running (attempt to navigate to `localhost` or the configured URL).
+   b. **If dev server is accessible:** navigate to each affected page, capture screenshots of key states, and measure performance (LCP, load time). Use Chrome DevTools traces or Playwright screenshots as appropriate.
+   c. **If dev server is NOT accessible:** ask the user whether to start it or skip visual checks. Do NOT silently skip — the user must confirm.
+3. **If no browser MCPs are available:** note it explicitly and proceed with shell-only checks (build output, bundle sizes).
+4. **Backend tasks:** execute requests to affected endpoints (via curl or available API MCPs) and record response status, payload shape, and timing.
+5. **Database tasks:** record current schema state for affected tables.
+**Step 3: Store baseline** — all values captured here are compared against post-change results in phase 5b.
-**MULTI-ROLE → Follow <multi_role_orchestration> below.**
-**SINGLE-ROLE → Follow <single_role_implementation> below.**
+When visual/performance checks are skipped, state the reason:
+> **Baseline captured.** MCP-based visual checks skipped — [reason: no browser MCPs available | dev server not running (user confirmed skip) | no frontend pages affected].
-<multi_role_orchestration>
-As orchestrator, dispatch agents, review their output, and verify integration. Do not implement application code yourself.
+The orchestrator ALWAYS runs baseline capture before dispatching any agents, regardless of execution mode.
+</phase_3b_baseline_capture>
-You may write code directly only for trivial glue (under 15 lines total):
-- Adding an export line to a barrel file
-- Adding a small schema to the shared package that multiple agents need
-- Wiring an import in a top-level file after agents complete
+<phase_4_execution>
+**CRITICAL: The orchestrator (main session) NEVER writes application code.** All implementation is dispatched to agents via the Task tool. The orchestrator may only write trivial glue (under 15 lines total): barrel file exports, small wiring imports, or config one-liners.
+Before dispatching, check the execution mode decision from phase 2 to determine decomposition complexity.
-Everything else should be delegated to an agent. For the agent prompt template and dispatch details, see references/agent-dispatch.md.
+<agent_dispatch_protocol>
+This protocol applies to ALL execution modes (SINGLE-ROLE, MULTI-ROLE, and PLAN).
-Dispatch steps:
-1. Create TaskCreate entries for each role.
+As orchestrator, your responsibilities are: context gathering, agent dispatch, output review, integration glue, and quality gates. You do NOT write application code (components, hooks, pages, routes, services, tests).
+**Dispatch steps:**
+1. Create TaskCreate entries for each role to track progress.
 2. For each role, dispatch a specialized agent via the Task tool with a self-contained prompt. Set the `model` parameter according to rule_4_model_selection.
 3. Launch independent agents in parallel. Launch dependent agents sequentially.
 4. Each agent runs its own quality gates before reporting completion. Review the agent's output and gate results before dispatching dependents.
 5. After all agents complete, proceed to phase 5 for the final cross-package quality gates.
-If you find yourself creating application files (routes, components, services, hooks, tests) while in MULTI-ROLE mode, delegate to an agent instead.
-</multi_role_orchestration>
+For the agent prompt template and dispatch details, see references/agent-dispatch.md.
-<single_role_implementation>
-Think through each step before acting. Share your reasoning at key decision points.
+**Self-check:** If you find yourself creating application files (routes, components, services, hooks, tests, pages), STOP and delegate to an agent instead.
+</agent_dispatch_protocol>
 <build_order>
-Work from micro to macro — build dependencies before dependents:
-1. Shared code first — new constants, types, schemas, or enums needed by multiple packages go in the shared/common package (discover its location from the project structure).
+Instruct agents to work from micro to macro — build dependencies before dependents:
+1. Shared code first — new constants, types, schemas, or enums needed by multiple packages go in the shared/common package.
 2. Check for reuse — before creating a helper, hook, component, or utility, search the codebase for existing code that can be used or extended.
 3. Implement the core change — build the feature/fix in the target package.
 4. Wire it together — connect the pieces across packages if needed.
-Decision tree:
+Decision tree (include in agent prompts when relevant):
 - Is this code used by multiple modules? YES → Create in the shared/common package.
 - Is this code used by multiple modules? NO → Is this component multi-file? YES → Create folder with index.tsx + related files. NO → Single file, co-located with usage.
 </build_order>
-If you created any temporary files or scripts for iteration, remove them at the end.
-</single_role_implementation>
 </phase_4_execution>
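Review note: dispatch step 3 ("launch independent agents in parallel, dependent agents sequentially") amounts to grouping roles into waves by dependency. A minimal sketch under stated assumptions follows: `AgentRole`, `planDispatchWaves`, and the role names in the usage note are hypothetical, and the actual dispatch through the Task tool is not modelled here.

```typescript
// Hypothetical description of a role and the roles it must wait for.
interface AgentRole {
  name: string;
  dependsOn: string[];
}

// Groups roles into waves: every role in a wave can run in parallel,
// and each wave starts only after all earlier waves have completed.
function planDispatchWaves(roles: AgentRole[]): string[][] {
  const done: string[] = [];
  const waves: string[][] = [];
  let pending = roles.slice();

  while (pending.length > 0) {
    // A role is ready once all of its dependencies are already done.
    const ready = pending.filter((role) =>
      role.dependsOn.every((dep) => done.indexOf(dep) !== -1)
    );
    if (ready.length === 0) {
      throw new Error("Circular or unknown dependency among roles");
    }
    const names = ready.map((role) => role.name);
    waves.push(names);
    for (const name of names) {
      done.push(name);
    }
    pending = pending.filter((role) => names.indexOf(role.name) === -1);
  }
  return waves;
}
```

For roles `shared-types`, `api` and `ui` (both depending on `shared-types`), and `e2e` (depending on `api` and `ui`), this yields three waves: `["shared-types"]`, `["api", "ui"]`, `["e2e"]`.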
 <phase_5_quality_gates>
@@ -345,6 +371,20 @@ If a gate fails:
 - Repeat until all gates pass cleanly.
 </phase_5_quality_gates>
+<phase_5b_post_change_verification>
+After all quality gates pass, re-run the verification plan from phase 3b and compare against baseline.
+1. Re-run every check from phase 3b with identical parameters.
+2. Compare each result against baseline:
+   - **Quality gates**: must remain passing. New failures = regression.
+   - **Visual checks** (if MCPs available): compare screenshots for unintended changes.
+   - **Performance** (if MCPs available): compare metrics. Regressions > 10% must be investigated.
+   - **API responses**: compare status codes and payload shapes. Breaking changes = regression.
+3. Build the comparison table (see references/output-templates.md for format).
+If any regression is found, fix it, re-run phase 5 quality gates, then re-run this phase. Repeat until clean.
+</phase_5b_post_change_verification>
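Review note: the "Regressions > 10% must be investigated" rule can be checked mechanically for metrics where lower is better (LCP, load time, response time, bundle size). The sketch below is an illustration only; `relativeChange` and `needsInvestigation` are hypothetical helpers, and the handling of a zero baseline is an assumption the skill does not address.

```typescript
// Relative change of a lower-is-better metric; positive means it got worse.
function relativeChange(baseline: number, after: number): number {
  if (baseline === 0) {
    // Assumption: any growth from a zero baseline counts as unbounded.
    return after > 0 ? Infinity : 0;
  }
  return (after - baseline) / baseline;
}

// True when the metric degraded by more than the threshold (10% by default).
function needsInvestigation(
  baseline: number,
  after: number,
  threshold: number = 0.1
): boolean {
  return relativeChange(baseline, after) > threshold;
}
```

With these definitions, an LCP moving from 1.2s to 1.4s (about 17% slower) is flagged, while a response time moving from 45ms to 48ms (about 7% slower) is not.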
 <phase_6_documentation>
 After all quality gates pass, review whether the changes require documentation updates:
@@ -356,7 +396,9 @@ Only update documentation directly affected by the changes. Do not create new do
 </workflow>
 <output>
-When complete, produce the completion report. See references/output-templates.md for the exact format.
+When complete, produce the completion report including the baseline vs post-change comparison table. See references/output-templates.md for the exact format.
+If the task document specifies a results file path, also create the comparison report at that path.
 Do not add explanations, caveats, or follow-up suggestions unless the user explicitly asks. The report is the final output.
 </output>

package/skills/task/references/output-templates.md CHANGED

@@ -35,13 +35,15 @@ List every file created/modified/deleted across all roles.
 - test: pass/fail (count)
 - build: pass/fail (per package)
-### Task-defined test results
-If the task document defined tests, report results here:
-| Test | Baseline (before) | Result (after) | Delta |
-|------|-------------------|----------------|-------|
-| [test name] | [baseline value] | [post-change value] | [improvement or regression] |
-If no task-defined tests existed, omit this section.
+### Verification results (baseline vs post-change)
+| Check | Baseline (before) | Result (after) | Delta | Status |
+|-------|-------------------|----------------|-------|--------|
+| lint | [pass/fail] | [pass/fail] | — | [ok/regression] |
+| build | [pass/fail (size)] | [pass/fail (size)] | [delta] | [ok/improved/regression] |
+| tests | [count] | [count] | [delta] | [ok/improved/regression] |
+| [domain-specific check] | [baseline value] | [post-change value] | [delta] | [ok/improved/regression] |
+Include every check from the verification plan. This section is always present.
 ### Skills loaded
 List every skill called across all roles.
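Review note: the new five-column verification table is regular enough to generate from structured results. A minimal sketch, assuming results are held as plain strings; `CheckRow` and `renderVerificationTable` are hypothetical names and not part of the package.

```typescript
type CheckStatus = "ok" | "improved" | "regression";

// Hypothetical record for one row of the verification table.
interface CheckRow {
  check: string;
  baseline: string;
  after: string;
  delta: string;
  status: CheckStatus;
}

// Renders the header from the template followed by one line per check.
function renderVerificationTable(rows: CheckRow[]): string {
  const lines: string[] = [
    "| Check | Baseline (before) | Result (after) | Delta | Status |",
    "|-------|-------------------|----------------|-------|--------|",
  ];
  for (const row of rows) {
    const cells = [row.check, row.baseline, row.after, row.delta, row.status];
    lines.push("| " + cells.join(" | ") + " |");
  }
  return lines.join("\n");
}
```

Because the template says this section is always present, the function still emits the header when `rows` is empty.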

package/skills/task/references/verification-protocol.md ADDED

@@ -0,0 +1,101 @@
+# Verification Protocol
+Every task runs a before/after verification cycle: capture baseline → make changes → verify no regressions and improvements achieved.
+## MCP detection
+At the start of verification planning, detect which testing MCPs are available by checking for their tool prefixes:
+| MCP | Tool prefix | Use for |
+|-----|-------------|---------|
+| Playwright | `mcp__plugin_playwright_playwright__*` | Visual verification, interaction testing, screenshots |
+| Chrome DevTools | `mcp__chrome-devtools__*` | Performance traces, network analysis, console checks |
+| Neither | — | Shell-only checks (curl, CLI commands, build output) |
+Use ToolSearch to confirm availability before including MCP-based checks in the plan.
+## Domain-specific test catalog
+### Frontend tasks
+**With browser MCPs (playwright or chrome-devtools):**
+1. Navigate to each affected page/route.
+2. Capture screenshots of key states (idle, loading, error, success).
+3. Measure performance:
+   - Chrome DevTools: `performance_start_trace` → interact → `performance_stop_trace` → `performance_analyze_insight`.
+   - Playwright: `browser_navigate` with timing, `browser_take_screenshot`.
+4. Verify interactive elements: click buttons, fill forms, check navigation flows.
+5. Check browser console for errors or warnings.
+**Without browser MCPs:**
+1. Run build and record bundle size (`ls -la` on build output).
+2. Run lint and type checking.
+3. Run existing test suite and record pass/fail counts.
+### Backend tasks
+**API endpoint testing:**
+1. Identify affected endpoints from the task scope.
+2. Record baseline responses via curl:
+   ```bash
+   curl -s -w "\nHTTP %{http_code} | %{time_total}s" <endpoint>
+   ```
+3. After changes, re-run identical requests and compare status codes, response shapes, and timing.
+**With Postman/API MCPs (if available):**
+- Use the MCP to send structured requests and validate response schemas.
+### Database tasks
+1. Record current schema state for affected tables (e.g., `\d+ table_name` in psql).
+2. After migration, verify schema matches expectations.
+3. Run sample queries to verify data integrity.
+### Shared/library tasks
+1. Run type checking across all dependent packages.
+2. Run tests in all packages that import the changed code.
+3. Verify no new type errors introduced downstream.
+## Baseline capture protocol
+1. Run standard quality gates first: lint, build, test — record pass/fail and counts.
+2. Run domain-specific checks from the catalog above.
+3. Store results in a structured format for comparison:
+```
+Baseline:
+- lint: pass
+- build: pass (bundle: 245KB)
+- tests: 42 passed, 0 failed
+- page /receive: screenshot captured, LCP 1.2s
+- GET /api/resource: 200, 45ms
+```
+## Post-change verification protocol
+1. Re-run every baseline check with identical parameters.
+2. Build the comparison table:
+| Check | Baseline | After | Delta | Status |
+|-------|----------|-------|-------|--------|
+| lint | pass | pass | — | ok |
+| build | pass (245KB) | pass (238KB) | -7KB | improved |
+| tests | 42 pass | 45 pass | +3 | improved |
+| LCP /receive | 1.2s | 0.9s | -0.3s | improved |
+| GET /api/resource | 200, 45ms | 200, 42ms | -3ms | ok |
+3. Status classification:
+   - **ok**: no change or negligible change.
+   - **improved**: measurable improvement.
+   - **regression**: measurable degradation — must be fixed before task completion.
+4. If regressions exist, fix and re-verify until clean.
+## Results file
+When the task document specifies a results file path, create a markdown file at that path containing:
+1. Summary of changes made.
+2. Full comparison table.
+3. Key improvements highlighted.
+4. Any regressions that were found and how they were resolved.
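Review note: the curl command in the backend section appends a trailer line of the form `HTTP <status> | <seconds>s` after the response body, which makes the baseline easy to record in a structured way. The sketch below parses that output; `CurlSample` and `parseCurlSample` are hypothetical names, and the parser assumes curl prints the time with a dot as the decimal separator.

```typescript
// Hypothetical structured form of one recorded request.
interface CurlSample {
  body: string;
  status: number;
  seconds: number;
}

// Splits the output of: curl -s -w "\nHTTP %{http_code} | %{time_total}s" <endpoint>
// into the response body and the status/timing trailer on the last line.
function parseCurlSample(output: string): CurlSample {
  const match = /^([\s\S]*)\nHTTP (\d{3}) \| (\d+(?:\.\d+)?)s\s*$/.exec(output);
  if (match === null) {
    throw new Error("Output does not end with the expected HTTP trailer");
  }
  return {
    body: match[1] || "",
    status: Number(match[2] || "0"),
    seconds: Number(match[3] || "0"),
  };
}
```

Storing `status` and `seconds` per endpoint before and after the change gives the values needed for the API rows of the comparison table.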