npm - opencastle - Versions diffs - 0.32.4 → 0.32.6 - Mend

opencastle 0.32.4 → 0.32.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (108)

package/README.md +13 -3
package/bin/cli.mjs +2 -0
package/dist/cli/bootstrap.js +1 -1
package/dist/cli/bootstrap.js.map +1 -1
package/dist/cli/bootstrap.test.js +16 -0
package/dist/cli/bootstrap.test.js.map +1 -1
package/dist/cli/init.test.js +38 -0
package/dist/cli/init.test.js.map +1 -1
package/dist/cli/stack-config-update.test.js +18 -0
package/dist/cli/stack-config-update.test.js.map +1 -1
package/dist/cli/stack-config.d.ts.map +1 -1
package/dist/cli/stack-config.js +1 -0
package/dist/cli/stack-config.js.map +1 -1
package/dist/cli/types.d.ts +1 -1
package/dist/cli/types.d.ts.map +1 -1
package/dist/orchestrator/plugins/index.d.ts.map +1 -1
package/dist/orchestrator/plugins/index.js +4 -0
package/dist/orchestrator/plugins/index.js.map +1 -1
package/dist/orchestrator/plugins/notion/config.d.ts +3 -0
package/dist/orchestrator/plugins/notion/config.d.ts.map +1 -0
package/dist/orchestrator/plugins/notion/config.js +46 -0
package/dist/orchestrator/plugins/notion/config.js.map +1 -0
package/dist/orchestrator/plugins/trello/config.d.ts +3 -0
package/dist/orchestrator/plugins/trello/config.d.ts.map +1 -0
package/dist/orchestrator/plugins/trello/config.js +43 -0
package/dist/orchestrator/plugins/trello/config.js.map +1 -0
package/dist/orchestrator/plugins/types.d.ts +1 -1
package/dist/orchestrator/plugins/types.d.ts.map +1 -1
package/package.json +1 -1
package/src/cli/bootstrap.test.ts +21 -0
package/src/cli/bootstrap.ts +1 -1
package/src/cli/init.test.ts +46 -0
package/src/cli/stack-config-update.test.ts +20 -0
package/src/cli/stack-config.ts +1 -0
package/src/cli/types.ts +1 -1
package/src/dashboard/node_modules/.vite/deps/_metadata.json +6 -6
package/src/orchestrator/agents/api-designer.agent.md +25 -34
package/src/orchestrator/agents/architect.agent.md +40 -84
package/src/orchestrator/agents/content-engineer.agent.md +29 -31
package/src/orchestrator/agents/copywriter.agent.md +35 -60
package/src/orchestrator/agents/data-expert.agent.md +24 -30
package/src/orchestrator/agents/database-engineer.agent.md +26 -31
package/src/orchestrator/agents/developer.agent.md +32 -34
package/src/orchestrator/agents/devops-expert.agent.md +31 -26
package/src/orchestrator/agents/documentation-writer.agent.md +29 -29
package/src/orchestrator/agents/performance-expert.agent.md +36 -33
package/src/orchestrator/agents/release-manager.agent.md +25 -34
package/src/orchestrator/agents/researcher.agent.md +41 -95
package/src/orchestrator/agents/reviewer.agent.md +24 -34
package/src/orchestrator/agents/security-expert.agent.md +35 -39
package/src/orchestrator/agents/seo-specialist.agent.md +25 -32
package/src/orchestrator/agents/session-guard.agent.md +20 -79
package/src/orchestrator/agents/team-lead.agent.md +50 -254
package/src/orchestrator/agents/testing-expert.agent.md +37 -49
package/src/orchestrator/agents/ui-ux-expert.agent.md +33 -39
package/src/orchestrator/customizations/KNOWN-ISSUES.md +0 -1
package/src/orchestrator/customizations/agents/skill-matrix.json +20 -4
package/src/orchestrator/customizations/agents/skill-matrix.md +20 -0
package/src/orchestrator/instructions/general.instructions.md +24 -84
package/src/orchestrator/plugins/astro/SKILL.md +23 -179
package/src/orchestrator/plugins/convex/SKILL.md +38 -12
package/src/orchestrator/plugins/index.ts +4 -0
package/src/orchestrator/plugins/netlify/SKILL.md +17 -13
package/src/orchestrator/plugins/nextjs/SKILL.md +55 -261
package/src/orchestrator/plugins/notion/SKILL.md +205 -0
package/src/orchestrator/plugins/notion/config.ts +47 -0
package/src/orchestrator/plugins/nx/SKILL.md +20 -72
package/src/orchestrator/plugins/playwright/SKILL.md +5 -17
package/src/orchestrator/plugins/slack/SKILL.md +28 -190
package/src/orchestrator/plugins/teams/SKILL.md +10 -140
package/src/orchestrator/plugins/trello/SKILL.md +151 -0
package/src/orchestrator/plugins/trello/config.ts +44 -0
package/src/orchestrator/plugins/types.ts +1 -1
package/src/orchestrator/plugins/vitest/SKILL.md +2 -2
package/src/orchestrator/prompts/bug-fix.prompt.md +25 -63
package/src/orchestrator/prompts/implement-feature.prompt.md +29 -66
package/src/orchestrator/prompts/quick-refinement.prompt.md +31 -66
package/src/orchestrator/skills/accessibility-standards/SKILL.md +50 -105
package/src/orchestrator/skills/agent-hooks/SKILL.md +60 -110
package/src/orchestrator/skills/agent-memory/SKILL.md +44 -93
package/src/orchestrator/skills/api-patterns/SKILL.md +20 -68
package/src/orchestrator/skills/code-commenting/SKILL.md +49 -101
package/src/orchestrator/skills/context-map/SKILL.md +47 -88
package/src/orchestrator/skills/data-engineering/SKILL.md +27 -74
package/src/orchestrator/skills/decomposition/SKILL.md +50 -98
package/src/orchestrator/skills/deployment-infrastructure/SKILL.md +44 -107
package/src/orchestrator/skills/documentation-standards/SKILL.md +28 -89
package/src/orchestrator/skills/fast-review/SKILL.md +51 -276
package/src/orchestrator/skills/frontend-design/SKILL.md +53 -163
package/src/orchestrator/skills/git-workflow/SKILL.md +18 -54
package/src/orchestrator/skills/memory-merger/SKILL.md +51 -88
package/src/orchestrator/skills/observability-logging/SKILL.md +29 -75
package/src/orchestrator/skills/orchestration-protocols/SKILL.md +58 -117
package/src/orchestrator/skills/panel-majority-vote/SKILL.md +65 -140
package/src/orchestrator/skills/performance-optimization/SKILL.md +21 -85
package/src/orchestrator/skills/project-consistency/SKILL.md +62 -281
package/src/orchestrator/skills/react-development/SKILL.md +38 -86
package/src/orchestrator/skills/security-hardening/SKILL.md +40 -84
package/src/orchestrator/skills/self-improvement/SKILL.md +26 -60
package/src/orchestrator/skills/seo-patterns/SKILL.md +40 -105
package/src/orchestrator/skills/session-checkpoints/SKILL.md +26 -68
package/src/orchestrator/skills/team-lead-reference/SKILL.md +66 -206
package/src/orchestrator/skills/testing-workflow/SKILL.md +42 -112
package/src/orchestrator/skills/validation-gates/SKILL.md +39 -170
package/src/orchestrator/snippets/base-output-contract.md +14 -0
package/src/orchestrator/snippets/discovered-issues-policy.md +15 -0
package/src/orchestrator/snippets/logging-mandatory.md +11 -0
package/src/orchestrator/snippets/never-expose-secrets.md +22 -0

package/src/orchestrator/skills/orchestration-protocols/SKILL.md CHANGED

@@ -9,173 +9,114 @@ Runtime patterns for managing delegated agents. **Load at:** Execution phase (St
 ## Active Steering
-Monitor agent sessions during execution. Intervene early when you spot:
+Intervene early when you spot:
-- **Failing tests/builds** — the agent can't resolve a dependency or breaks existing code
-- **Unexpected file changes** — files outside the agent's partition appear in the diff
-- **Scope creep** — the agent starts refactoring code you didn't ask about
-- **Circular behavior** — the agent retries the same failing approach without adjusting
-- **Intent misunderstanding** — session log shows the agent interpreted the prompt differently
+| Signal | Action |
+|--------|--------|
+| Failing tests/builds | Can't resolve dependency or breaks existing code |
+| Unexpected file changes | Files outside partition in diff |
+| Scope creep | Refactors code not in scope |
+| Circular behavior | Same failing approach retried without change |
+| Intent misunderstanding | Session log shows wrong prompt interpretation |
-**When redirecting, be specific.** Explain *why* you're redirecting and *how* to proceed:
+When redirecting, explain *why* and *how*:
-> "Don't modify `libs/data/src/lib/product.ts` — that file is shared across features. Instead, add the new query in `libs/data/src/lib/reviews.ts`. This keeps the change isolated."
+> "Don't modify `libs/data/src/lib/product.ts` — shared across features. Add the new query in `libs/data/src/lib/reviews.ts`."
-**Timing matters.** Catching a problem 5 minutes in can save an hour. Don't wait until the agent finishes.
-**Background agent caveat:** The drift signals above apply only to **sub-agents** (inline) where you see results in real-time. Background agents run autonomously — you cannot inspect their intermediate state or redirect mid-execution. For background agents, steering is **post-hoc**: invest more effort in prompt specificity and file partition constraints upfront, then review thoroughly when the agent returns its output.
+**Sub-agents:** Catch problems early (5 min in can save an hour). **Background agents:** Steer post-hoc — invest in prompt specificity and partition constraints upfront.
 ## Background Agents
-Background agents run autonomously in isolated Git worktrees. Use for well-scoped subtasks with clear acceptance criteria.
+Run autonomously in isolated Git worktrees. Reserve for well-scoped tasks >5 min with clear acceptance criteria.
 - **Spawn:** Delegate Session → Background → Select agent → Enter prompt
-- **Auto-compaction:** At 95% token limit, context is automatically compressed
-- **Resume:** Use `--resume` for previous sessions
-- **Duration threshold:** Reserve for tasks expected to take >5 minutes
-- **No real-time monitoring:** You cannot inspect intermediate state. Drift detection happens only at completion review. Mitigate with: (a) highly specific prompts, (b) strict file partition constraints, (c) acceptance criteria checklists in the prompt
+- **Auto-compaction:** At 95% token limit; use `--resume` to continue
+- **No real-time monitoring:** Invest in specific prompts, strict partition constraints, and acceptance criteria checklists upfront
 ## Parallel Research Protocol
-When a task requires broad exploration before implementation, spawn multiple research sub-agents in parallel to gather context efficiently.
-### When to Use
-- 3+ independent research questions need answering before implementation can begin
-- Broad codebase exploration across multiple libraries or domains
-- Multi-area analysis (e.g., "How do we handle X in the frontend, backend, and CMS?")
+Spawn multiple research sub-agents in parallel when 3+ independent questions must be answered before implementation. **Use when:** 3+ independent research questions, broad codebase exploration, or multi-area analysis (frontend/backend/CMS). **Skip when:** single-file investigation, answer in one known location, sequential results, or fewer than 3 questions.
 ### Spawn Strategy
-- **Divide by topic/area**, not by file count — each researcher should own a coherent domain
-- **Max 3-5 parallel researchers** — more than 5 creates diminishing returns and token waste
-- **Each researcher gets a focused scope** — explicit directories, file patterns, or questions
-- **Use Economy/Standard tier** for research sub-agents to manage cost
-### Research Sub-Agent Prompt Template
+| Rule | Detail |
+|------|--------|
+| Divide by topic/area | Each researcher owns a coherent domain |
+| Max 3–5 researchers | More creates diminishing returns and token waste |
+| Focused scope per agent | Explicit dirs, file patterns, or questions |
+| Economy/Standard tier | Manage cost for research sub-agents |
+**Prompt template:**
 ```
 Research: [specific question]
 Scope: [files/directories to search]
-Return: A structured summary with:
-- Key findings (bullet list)
-- Relevant file paths (with line numbers)
-- Patterns observed
-- Unanswered questions
+Return: key findings, relevant file paths (with line numbers), patterns, unanswered questions
 ```
 ### Result Merge Protocol
-After all research sub-agents return:
-1. **Collect** all sub-agent results into a single context
-2. **Deduplicate** findings — same file/pattern reported by multiple agents counts once
-3. **Resolve conflicts** — if agents report contradictory information, trust the one with more specific evidence (exact file paths + line numbers > general observations)
-4. **Synthesize** into a single context block for the next phase — distill the combined findings into a concise summary that can be included in implementation delegation prompts
-### When NOT to Use
-- Single-file investigation — just read the file directly
-- When the answer is in one known location — a single sub-agent or direct read is faster
-- When results must be sequential (e.g., "find X, then based on X find Y")
-- For fewer than 3 questions — overhead of parallel coordination exceeds time saved
+1. Collect all results into single context
+2. Deduplicate (same file/pattern counts once)
+3. Resolve conflicts — specific evidence beats general observations
+4. Synthesize into concise context block for implementation prompts
 ## Batch Reviews
-When multiple background agents complete work simultaneously, batch similar reviews to save time:
-- Group reviews by domain (e.g., all UI changes together, all data changes together)
-- Run fast reviews in parallel for independent outputs
-- If multiple outputs share the same file partition boundary, review them sequentially to catch integration issues
-- For panel reviews, combine related artifacts into a single panel question when they share acceptance criteria
+- Group by domain (UI, data); run fast reviews in parallel for independent outputs
+- Review sequentially when outputs share the same partition boundary
+- Combine related artifacts into one panel question when they share acceptance criteria
 ## Context Compaction
-Between phases, summarize prior agent output before passing it to the next agent. Never paste raw sub-agent results into a downstream prompt.
-**When:** Multi-phase chains where the next agent only needs outcomes, not full reasoning traces. Skip for single-phase work or when raw detail is needed (e.g., code review).
-**How:** After a sub-agent returns, extract only: files changed, key decisions, verification results (pass/fail), and blockers. Discard raw tool output, reasoning traces, and failed attempts.
-**Template for delegation prompts:**
+Summarize prior phase output before passing to the next agent. **Extract:** files changed, key decisions, verification (pass/fail), blockers. **Discard:** raw tool output, reasoning traces, failed attempts.
+**Template:**
 ```
 ### Prior Phase Output
 **Phase [N] — [Agent Name] — [Task Title]**
-- Files changed: [list with one-line descriptions]
-- Decisions: [key decisions that affect downstream work]
+- Files changed: [list]
+- Decisions: [key decisions affecting downstream work]
 - Verification: [lint ✅ | types ✅ | tests ✅]
 - Blockers: [none | list]
 ```
-## Agent Health-Check Protocol
-Monitor delegated agents for failure signals. Intervene early rather than waiting for completion.
+## Agent Health Monitoring
 ### Health Signals
-| Signal | Detection | Threshold | Recovery |
-|--------|-----------|-----------|----------|
-| **Stuck** | No new terminal output or file changes | Sub-agent: 5 min / Background: 15 min | Check terminal output. If idle, nudge with clarification. If frozen, abort and re-delegate with simpler scope. |
-| **Looping** | Same error message repeated 3+ times | 3 consecutive identical failures | Abort immediately. Analyze the error, add context the agent is missing, re-delegate with explicit fix path. |
-| **Scope creep** | Files outside assigned partition appear in diff | Any file outside partition | Redirect: "Only modify files in [partition]. Revert changes to [file]." |
-| **Context exhaustion** | Responses become repetitive, confused, or lose earlier instructions | Visible confusion or instruction amnesia | Checkpoint immediately. End session. Resume in fresh context. |
-| **Permission loop** | Agent repeatedly asks for confirmation or waits for input | 2+ consecutive prompts without progress | Auto-approve if safe, or abort and re-delegate with `--dangerously-skip-permissions` flag or equivalent. |
-### Health-Check Cadence
+| Signal | Threshold | Recovery |
+|--------|-----------|----------|
+| **Stuck** — no output/changes | Sub: 5 min / BG: 15 min | Nudge; if frozen, abort + re-delegate with simpler scope |
+| **Looping** — same error repeated | 3 consecutive failures | Abort; add context; re-delegate with explicit fix path |
+| **Scope creep** — files outside partition | Any | Redirect: "Only modify files in [partition]. Revert [file]." |
+| **Context exhaustion** — confused/repetitive | Visible instruction amnesia | Checkpoint, end session, resume in fresh context |
+| **Permission loop** — waiting for input | 2+ prompts without progress | Auto-approve if safe; abort + re-delegate |
-- **Sub-agents (inline):** Monitor continuously — you see output in real-time
-- **Background agents:** Check terminal output after 10 minutes, then every 10 minutes
-- **After completion:** Always review the full diff before accepting output
+**Cadence:** Sub-agents — continuous (real-time). Background agents — check at 10 min, then every 10 min. Always review full diff before accepting.
 ### Escalation Path
-1. **First failure:** Re-delegate with more specific prompt + error context
-2. **Second failure:** Downscope the task (split into smaller pieces) and re-delegate
-3. **Third failure:** Log to Dead Letter Queue (`.opencastle/AGENT-FAILURES.md`), escalate to Architect for root cause analysis. If the failure involves a panel 3x BLOCK or unresolvable agent/reviewer conflict, create a **dispute record** in `.opencastle/DISPUTES.md` instead (see **team-lead-reference** skill § Dispute Protocol).
+1. **Failure 1:** Re-delegate with more specific prompt + error context
+2. **Failure 2:** Downscope (split into smaller pieces), re-delegate
+3. **Failure 3:** Log to `.opencastle/AGENT-FAILURES.md`; if 3× panel BLOCK or conflict, create dispute in `.opencastle/DISPUTES.md` (see **team-lead-reference** § Dispute Protocol)
 ## Error Recovery Playbook
-Common failure modes and how to recover:
-### Agent Stuck in Retry Loop
-**Symptom:** Agent retries the same failing command 3+ times without changing approach.
-**Recovery:** Intervene immediately. Read the error output, identify the root cause, and re-delegate with explicit fix instructions. Use the **self-improvement** skill to add a lesson.
-### MCP Tool Unavailable
-**Symptom:** Tool calls fail with connection or timeout errors.
-**Recovery:** (1) Check if the MCP server is running. (2) If transient, retry once. (3) If persistent, work around: use CLI tools as alternatives. Log to DLQ if critical.
-### Background Agent Produces Broken Output
-**Symptom:** Background agent returns, but files have lint/type/test errors.
-**Recovery:** (1) Review the diff to understand intent. (2) If fixable with small edits, fix inline. (3) If fundamentally wrong, discard the worktree changes and re-delegate with a more specific prompt. (4) Log to DLQ after 2 failed attempts.
-### Merge Conflict from Parallel Agents
-**Symptom:** Two background agents modified overlapping files.
-**Recovery:** (1) This should never happen if file partitioning was followed. (2) Accept one agent's changes first (the one with more complex work). (3) Re-delegate the simpler changes to adapt to the new state. (4) Use the **self-improvement** skill to add a lesson about the conflict.
-### Context Window Exhausted
-**Symptom:** Agent responses become confused, repetitive, or lose track of earlier instructions.
-**Recovery:** (1) Save a session checkpoint immediately. (2) End the current session. (3) Resume in a new session, loading the checkpoint. (4) Reduce parallel work in the next session.
-### Test Failures After Merge
-**Symptom:** Tests pass individually but fail when multiple agent outputs are merged.
-**Recovery:** (1) Run affected tests to identify which projects break. (2) Check for import conflicts, duplicate definitions, or state pollution. (3) Delegate fix to the agent whose changes are most likely the cause.
+| Failure | Symptom | Recovery |
+|---------|---------|----------|
+| **Retry loop** | Same command fails 3+ times | Abort; identify root cause; re-delegate with explicit fix; log lesson |
+| **MCP unavailable** | Tool connection/timeout errors | Check server; retry once; fall back to CLI; log to DLQ if critical |
+| **Broken BG output** | Lint/type/test errors on return | Fix inline if small; discard + re-delegate if fundamental; DLQ after 2 fails |
+| **Parallel merge conflict** | Two agents modified overlapping files | Accept complex side first; re-delegate simple side to adapt; log lesson |
+| **Context exhausted** | Confused/repetitive responses | Checkpoint; end session; resume with checkpoint; reduce parallel work |
+| **Post-merge test failure** | Tests pass alone but fail merged | Run affected tests; check import/state conflicts; delegate fix to likely cause |
 ## Agent Circuit Breaker
-Track per-agent failure counts across the session (not just per-task). If the same agent keeps failing, the problem is likely systemic.
 | Threshold | Action |
 |-----------|--------|
-| **2 failures** | Warning — investigate: same error class? Model endpoint healthy? Prompt pattern issue? |
-| **3 failures** | Open circuit — stop delegating to that agent. Reassign tasks to an overlapping agent, try a different model tier, or checkpoint and escalate to the user. |
-| **Next session** | Half-open — circuit resets. If the agent fails again immediately, re-open and add a lesson via **self-improvement**. |
+| **2 failures** | Investigate: same error class? Model healthy? Prompt pattern? |
+| **3 failures** | Open circuit — stop delegating; reassign or escalate to user |
+| **Next session** | Half-open — resets; re-open + add lesson if fails again |
-This is a judgment-based pattern, not a hard gate. 3 failures on similar tasks with the same error is more concerning than 3 unrelated failures.
+Judgment-based, not a hard gate. 3 similar failures with the same error is more concerning than 3 unrelated failures.
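
Reviewer note: the circuit breaker thresholds in the table above (2 failures: investigate, 3 failures: open circuit, next session: half-open) can be sketched as a small state tracker. This is only an illustrative sketch, not code from the opencastle package; `AgentCircuitBreaker`, `CircuitState`, and every method name are hypothetical. The skill also describes the pattern as judgment-based, so a real orchestrator would likely weigh whether failures share an error class rather than rely on a raw count as this sketch does.

```typescript
// Hypothetical sketch of the "Agent Circuit Breaker" table.
// None of these names exist in opencastle; they are assumptions for illustration.
type CircuitState = "closed" | "warning" | "open" | "half-open";

class AgentCircuitBreaker {
  // Per-agent failure counts for the current session.
  // Object.create(null) avoids collisions with Object.prototype keys.
  private failures: Record<string, number> = Object.create(null);
  // Agents whose circuit was open when the previous session ended.
  private halfOpen: Record<string, boolean> = Object.create(null);

  // Record one failed delegation and return the resulting state.
  recordFailure(agent: string): CircuitState {
    if (this.halfOpen[agent]) {
      // Failed again right after the reset: re-open the circuit.
      delete this.halfOpen[agent];
      this.failures[agent] = 3;
      return "open";
    }
    this.failures[agent] = (this.failures[agent] ?? 0) + 1;
    return this.state(agent);
  }

  // A success in the half-open state closes the circuit again.
  recordSuccess(agent: string): void {
    delete this.halfOpen[agent];
  }

  state(agent: string): CircuitState {
    if (this.halfOpen[agent]) return "half-open";
    const count = this.failures[agent] ?? 0;
    if (count >= 3) return "open"; // stop delegating; reassign or escalate
    if (count === 2) return "warning"; // investigate before delegating again
    return "closed";
  }

  // "Next session": counts reset, previously open circuits become half-open.
  startNewSession(): void {
    for (const agent of Object.keys(this.failures)) {
      if (this.failures[agent] >= 3) this.halfOpen[agent] = true;
    }
    this.failures = Object.create(null);
  }
}
```

Under these assumptions, a team lead would call `recordFailure` after each failed delegation and check `state` before delegating again. When a half-open agent fails again, that re-open is the point at which the skill asks for a lesson to be added.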

package/src/orchestrator/skills/panel-majority-vote/SKILL.md CHANGED

@@ -3,168 +3,98 @@ name: panel-majority-vote
 description: "Run 3 isolated reviewer sub-agents against the same question and decide PASS/BLOCK by majority vote (2/3 wins). Use when deterministic verification is insufficient."
 ---
-<!-- ⚠️ This file is managed by OpenCastle. Edits will be overwritten on update. Customize in the .opencastle/ directory instead. -->
+# Skill: Panel majority vote
-# Skill: Panel majority vote (3 reviewers)
+## Contract
-Use this skill when deterministic verification is unavailable and you need a panel to decide PASS/BLOCK for a single question against a declared artifact scope.
+| Rule | Detail |
+|------|--------|
+| Scope | One run root, one panel key |
+| Artifacts | Reviewers use only declared in-scope artifacts |
+| Runners | Exactly 3 isolated reviewer runs |
+| Verdict | Majority (2/3 wins) |
+| On BLOCK | Consolidated report must include retry summary |
-## Contract
-- Scope is exactly one run root and one panel key.
-- Reviewers must only use the declared in-scope artifacts.
-- Exactly 3 isolated reviewer runs.
-- Majority vote decides overall verdict (2/3 wins).
-- Consolidated panel report must include a short retry summary when BLOCK.
-## Inputs
-- Run root: `<runRoot>`
-- Panel key: `<panelKey>` (a filesystem-safe identifier used to name output files)
-- Exact question text (single question)
-- Explicit in-scope artifact list (all under the same run root)
-Optional (defaults shown):
-- Panel output directory: `<panelDir>` (default: `<runRoot>/panel/`)
-## Outputs (files)
-- (Optional) Prompt payload: `<panelDir>/<panelKey>-panel-prompt.md`
-- Raw reviewer outputs: `<panelDir>/<panelKey>-reviewer-outputs.md`
-- Consolidated report: `<panelDir>/<panelKey>.md`
-## Procedure (required: run in isolation)
-Run this skill in an isolated subagent (using `runSubagent`) so the panel cannot accidentally consult unrelated workspace context.
-The isolated runner subagent must:
-1. Validate scope
-  - Ensure every in-scope artifact path is under `<runRoot>`.
-  - Ensure the in-scope list is sufficient to answer the question.
-2. Spawn exactly 3 reviewers (in parallel)
-  - Launch 3 isolated reviewer subagents (using `runSubagent`) with the exact same prompt payload.
-  - The prompt payload may be passed directly to the reviewer subagents (no file required).
-  - If you want an explicit artifact of the prompt payload, optionally write it to `<panelDir>/<panelKey>-panel-prompt.md`.
-  - Reviewer prompt must require this strict output format:
-    1) VERDICT: PASS | BLOCK
-    2) MUST-FIX:
-    - ...
-    3) SHOULD-FIX:
-    - ...
-    4) QUESTIONS:
-    - ...
-    5) TEST IDEAS:
-    - ...
-    6) CONFIDENCE: low | med | high
-  - Reviewers must not include any other sections.
-3. Persist reviewer outputs (required audit trail)
-  - Create/overwrite `<panelDir>/<panelKey>-reviewer-outputs.md`.
-  - Include at the top:
-    - Run root
-    - Panel key
-    - Question text
-    - In-scope artifact list
-    - (Optional) The exact prompt payload text provided to reviewers
-  - Then include each reviewer output verbatim, clearly separated.
-4. Consolidate by majority vote (2/3 wins)
-  - Compute:
-    - PASS count
-    - BLOCK count
-    - Overall = PASS if PASS >= 2 else BLOCK
-  - Deduplicate MUST-FIX and SHOULD-FIX items; annotate how many reviewers flagged each.
-  - Record disagreements (items flagged by only 1 reviewer; or materially conflicting assessments).
-  - Include determinize-next recommendations.
-  - If Overall = BLOCK, include a short Retry summary:
-    - top changes required before retrying
-5. Write the consolidated panel report
- - Create `<panelDir>/<panelKey>.md` using the template in `panel-report.template.md` (in this directory).
-6. Print a concise summary to chat
-  - Overall verdict + vote tally + path to `<panelDir>/<panelKey>.md`.
-7. Log the panel result **(⛔ hard gate — do NOT return the verdict or proceed until logged)**
-  - Log the panel result using the **observability-logging** skill's panel record command. An unlogged panel is a failed panel.
-  - Include: `panel_key`, `verdict`, `pass_count`, `block_count`, `must_fix`, `should_fix`, `reviewer_model`, `weighted`, `attempt`, `tracker_issue`, `artifacts_count`, `report_path`.
-  - The skill's panel record command includes a verify step.
-Finally: ensure whatever produced the claim being verified links the consolidated panel report as verification evidence.
+## Inputs / Outputs
-## Notes
-- If the panel output is BLOCK, prefer to change the underlying work and re-run the same panel question over re-wording the question.
-- After 3 consecutive BLOCKs on the same panel key, create a **dispute record** in `.opencastle/DISPUTES.md` instead of retrying further. The dispute packages the agent's position, all reviewer feedback, attempt history, and resolution options for human decision-making. See the **team-lead-reference** skill § Dispute Protocol for the full procedure.
+**Inputs:** `<runRoot>`, `<panelKey>` (filesystem-safe), question text, artifact list. Panel dir default: `<runRoot>/panel/`.
-## Model Selection for Reviewers
+| File | Path |
+|------|------|
+| Prompt payload (optional) | `<panelDir>/<panelKey>-panel-prompt.md` |
+| Raw reviewer outputs | `<panelDir>/<panelKey>-reviewer-outputs.md` |
+| Consolidated report | `<panelDir>/<panelKey>.md` |
-Choose reviewer models based on the domain being reviewed:
-- **Security, architecture, complex logic** → Quality (Claude Sonnet 4.6) for all 3 reviewers
-- **Feature implementation, UI, queries** → Standard (Gemini 3.1 Pro) for all 3 reviewers
-- **Mixed-domain review** → Use Quality for at least 1 reviewer, Standard for the other 2
+## Procedure
-All 3 reviewers should use the same model to ensure comparable verdicts. Mixing models can lead to inconsistent review depth and confusing disagreements.
+1. **Validate scope** — every artifact path is under `<runRoot>`; list is sufficient to answer the question.
+2. **Spawn 3 reviewers in parallel** — identical prompt to 3 isolated subagents. Optionally write payload to `<panelDir>/<panelKey>-panel-prompt.md`. Required output sections (no others): `VERDICT: PASS | BLOCK`, `MUST-FIX:`, `SHOULD-FIX:`, `QUESTIONS:`, `TEST IDEAS:`, `CONFIDENCE: low | med | high`.
+3. **Persist outputs** — write `<panelDir>/<panelKey>-reviewer-outputs.md` with header (run root, panel key, question, artifacts) and each reviewer output verbatim, separated.
+4. **Consolidate** — count PASS/BLOCK; overall PASS if ≥ 2. Deduplicate MUST-FIX/SHOULD-FIX with reviewer counts. Record disagreements. Include determinize-next recs. If BLOCK, add retry summary.
+5. **Write report** — create `<panelDir>/<panelKey>.md` using `panel-report.template.md`.
+6. **Print summary** — overall verdict + vote tally + report path.
+7. **Log (⛔ hard gate)** — use **observability-logging** skill panel command. Fields: `panel_key`, `verdict`, `pass_count`, `block_count`, `must_fix`, `should_fix`, `reviewer_model`, `weighted`, `attempt`, `tracker_issue`, `artifacts_count`, `report_path`. Link report as verification evidence.
-## Weighted Consensus Variant
+## Notes
+- On BLOCK: change the underlying work and re-run; do not re-word the question.
+- After 3 consecutive BLOCKs on the same panel key: create a dispute record per **team-lead-reference** § Dispute Protocol.
-Extends the panel system for subjective decisions where domain expertise should weight more heavily than a simple head-count.
+## Model Selection
-### When to Use Weighted Consensus
+| Domain | Model |
+|--------|-------|
+| Security, architecture, complex logic | Quality (Claude Sonnet 4.6) × 3 |
+| Feature implementation, UI, queries | Standard (Gemini 3.1 Pro) × 3 |
+| Mixed-domain | Quality × 1, Standard × 2 |
-| Decision Type | Use Simple Majority | Use Weighted Consensus |
-|--------------|--------------------|-----------------------|
-| Security vulnerability present? | ✅ | — |
-| Code correctness | ✅ | — |
-| Best UI approach for user experience | — | ✅ |
-| Architecture tradeoff (performance vs maintainability) | — | ✅ |
-| Data model design choices | — | ✅ |
-| Naming conventions / code style disputes | — | ✅ |
+Use same model for all 3 reviewers.
-### Weight Assignment Rules
+## Weighted Consensus Variant
-Each reviewer gets a weight based on 3 factors:
+For subjective decisions where domain expertise should weight more than head-count.
-| Factor | Weight Bonus | Example |
-|--------|-------------|---------|
-| **Domain expertise** | +2 | Security Expert reviewing auth code |
-| **Confidence level** | +1 (high) / 0 (med) / -1 (low) | Self-reported by reviewer |
-| **Prior success** | +1 | Agent has >80% success rate for similar reviews (from AGENT-PERFORMANCE.md) |
+### When to Use
-**Base weight:** 1 for all reviewers. Add bonuses to get final weight.
+| Decision Type | Mode |
+|--------------|------|
+| Security vulnerability, code correctness | Simple majority |
+| UI/UX, architecture tradeoffs, data model, naming | Weighted |
-**Example:**
+### Weight Assignment
-```text
-Reviewer 1 (Security Expert, reviewing auth): base 1 + domain 2 + confidence 1 = weight 4
-Reviewer 2 (Frontend Dev, reviewing auth):    base 1 + domain 0 + confidence 1 = weight 2
-Reviewer 3 (Architect, reviewing auth):        base 1 + domain 1 + confidence 0 = weight 2
-```
+Base weight: 1. Add bonuses:
-### Weighted Voting Protocol
+| Factor | Bonus |
+|--------|-------|
+| Domain expertise (relevant to review) | +2 |
+| Confidence high / med / low | +1 / 0 / -1 |
+| Prior success rate >80% (AGENT-PERFORMANCE.md) | +1 |
-1. **Assign weights** to each reviewer before spawning them (based on their role relative to the review domain)
-2. **Spawn reviewers** with the same prompt as simple majority (use the existing procedure)
-3. **Collect verdicts** — each reviewer submits PASS/BLOCK with confidence level
-4. **Calculate weighted score:**
-   - Sum weights of PASS reviewers → PASS score
-   - Sum weights of BLOCK reviewers → BLOCK score
-   - Overall = PASS if PASS score > BLOCK score, else BLOCK
-5. **Tie-breaking:** If scores are equal, the reviewer with the highest individual weight breaks the tie. If weights are also equal, default to BLOCK (conservative).
+Example: Security Expert + high = **4**; Architect + med = **2**.
-### Conflict Resolution
+### Voting Protocol
-- If a low-weight reviewer BLOCKs but high-weight reviewers PASS: note the BLOCK concerns in the report but overall PASS. Include the low-weight MUST-FIX items as SHOULD-FIX instead.
-- If the domain expert BLOCKs but generalists PASS: overall BLOCK. Domain expertise overrides general opinion.
-- If all reviewers have equal weight: falls back to simple majority vote (2/3 wins).
+1. Assign weights before spawning.
+2. Spawn with same prompt; collect PASS/BLOCK + confidence.
+3. Score: sum weights by verdict; PASS if PASS score > BLOCK score.
+4. Tie: highest individual weight breaks tie; if equal, default BLOCK.
-### Weighted Panel Report Extension
+### Conflict Resolution
+| Scenario | Outcome |
+|----------|---------|
+| Low-weight BLOCKs, high-weight PASSes | PASS; move BLOCK's MUST-FIX → SHOULD-FIX |
+| Domain expert BLOCKs, generalists PASS | BLOCK |
+| All equal weight | Simple majority (2/3 wins) |
-Add these fields to the consolidated panel report template when using weighted consensus:
+### Report Extension
 ```markdown
 ### Weighting
 | Reviewer | Role | Domain | Confidence | Prior Success | Final Weight |
 |----------|------|--------|------------|---------------|-------------|
 | 1 | [Agent] | +X | +X | +X | X |
-| 2 | [Agent] | +X | +X | +X | X |
-| 3 | [Agent] | +X | +X | +X | X |
 ### Weighted Score
 - PASS: X (reviewers: 1, 3)
@@ -172,12 +102,7 @@ Add these fields to the consolidated panel report template when using weighted c
 - **Overall: PASS/BLOCK** (weighted)
 ```
-### Integration with Existing Panel Workflow
-The weighted consensus variant follows the SAME procedure steps (1-6) from the main panel protocol. The only differences are:
-1. Weight assignment happens in step 2 (before spawning reviewers)
-2. Step 4 uses weighted calculation instead of simple count
-3. The consolidated report includes the weighting table
+### Integration
-The Team Lead decides whether to use simple majority or weighted consensus when scheduling the panel review. Include the decision rationale in the delegation prompt.
+Same steps 1–7 as standard panel. Differences: assign weights in step 2; use weighted calculation in step 4; add weighting table to report. Team Lead decides simple vs. weighted; include rationale in delegation prompt.
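
As a rough illustration of the added Weight Assignment and Voting Protocol text, the scoring could be sketched as below. The function and field names (`reviewerWeight`, `weightedVerdict`, `domainExpert`, `priorSuccess`) are hypothetical and do not come from the package; only the bonus values and the tie rule are taken from the diff. The tie-break is one possible reading of "highest individual weight breaks tie; if equal, default BLOCK", and the Conflict Resolution table is not modeled.

```js
// Minimal sketch of the weighted-consensus scoring; all names are assumptions.
const CONFIDENCE_BONUS = { high: 1, med: 0, low: -1 };

// Base weight 1, plus the bonuses listed in the weight table.
function reviewerWeight({ domainExpert = false, confidence = 'med', priorSuccess = false }) {
  return 1 + (domainExpert ? 2 : 0) + CONFIDENCE_BONUS[confidence] + (priorSuccess ? 1 : 0);
}

// Sum weights by verdict; PASS only if the PASS score is strictly higher.
function weightedVerdict(reviewers) {
  const scored = reviewers.map((r) => ({ ...r, weight: reviewerWeight(r) }));
  const sum = (verdict) =>
    scored.filter((r) => r.verdict === verdict).reduce((total, r) => total + r.weight, 0);
  const pass = sum('PASS');
  const block = sum('BLOCK');

  let overall;
  if (pass !== block) {
    overall = pass > block ? 'PASS' : 'BLOCK';
  } else {
    // Tie: the single highest-weight side decides; anything ambiguous defaults to BLOCK.
    const top = Math.max(...scored.map((r) => r.weight));
    const topVerdicts = new Set(scored.filter((r) => r.weight === top).map((r) => r.verdict));
    overall = topVerdicts.size === 1 && topVerdicts.has('PASS') ? 'PASS' : 'BLOCK';
  }
  return { pass, block, overall };
}

// Example panel: the domain expert (weight 4) blocks, two generalists (weight 2 each) pass.
const panel = [
  { name: 'Security Expert', domainExpert: true, confidence: 'high', verdict: 'BLOCK' },
  { name: 'Frontend Dev', confidence: 'high', verdict: 'PASS' },
  { name: 'Architect', confidence: 'high', verdict: 'PASS' },
];
console.log(weightedVerdict(panel).overall); // 'BLOCK' (4 vs 4 tie, expert has the top weight)
```

In this example the tie rule happens to produce the same outcome as the "Domain expert BLOCKs, generalists PASS" row of the Conflict Resolution table, but the two rules would need separate handling in general.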

package/src/orchestrator/skills/performance-optimization/SKILL.md CHANGED

@@ -3,101 +3,37 @@ name: performance-optimization
 description: "Frontend and backend performance optimization patterns including rendering, asset optimization, JavaScript performance, caching, profiling, and code review checklist. Use when optimizing components, reviewing code for performance, or analyzing bundle size and Core Web Vitals."
 ---
-<!-- ⚠️ This file is managed by OpenCastle. Edits will be overwritten on update. Customize in the .opencastle/ directory instead. -->
 # Performance Optimization
-## General Principles
-- **Measure first, optimize second** — profile before optimizing. Use Chrome DevTools, Lighthouse, Datadog.
-- **Optimize for the common case** — focus on frequently executed code paths.
-- **Avoid premature optimization** — write clear code first, optimize when necessary.
-- **Minimize resource usage** — CPU, memory, network, disk.
-- **Prefer simplicity** — simple algorithms are often faster and easier to optimize.
-- **Document performance assumptions** — comment performance-critical code.
-- **Automate performance testing** — integrate into CI/CD.
-- **Set performance budgets** — define limits for load time, memory, API latency.
-## Rendering and DOM
-- **Memoization**: Use `React.memo`, `useMemo`, `useCallback` judiciously — only when profiling shows unnecessary re-renders. Don't pre-optimize.
-- Stable `key` props in lists (avoid array indices unless static).
-- Avoid inline styles (can trigger layout thrashing). Prefer CSS classes.
-- CSS transitions/animations over JavaScript for GPU-accelerated effects.
-- `requestIdleCallback` for deferring non-critical rendering.
-## Asset Optimization
-- Modern image formats (WebP, AVIF). Tools: ImageOptim, Squoosh.
-- SVGs for icons.
-- Bundle and minify JS/CSS (Webpack, Rollup, esbuild). Tree-shaking.
-- Long-lived cache headers for static assets. Cache busting for updates.
-- `loading="lazy"` for images. Dynamic imports for JS.
-- Font subsetting. `font-display: swap`.
-## JavaScript Performance
+**Rule:** Measure first (`Chrome DevTools`, `Lighthouse`, `Datadog`), optimize second. Set budgets (load time, memory, API latency). Automate in CI/CD.
-- Offload heavy computation to Web Workers.
-- Debounce/throttle scroll, resize, input events.
-- Clean up event listeners, intervals, DOM references (prevent memory leaks).
-- Maps/Sets for lookups. TypedArrays for numeric data.
-- Avoid global variables.
-- Avoid deep object cloning unless necessary.
+## Patterns by Domain
-## Node.js
+| Domain | Key patterns |
+|--------|-------------|
+| **Rendering** | `React.memo`/`useMemo`/`useCallback` only after profiling; stable `key` props; CSS classes over inline styles; CSS animations (GPU); `requestIdleCallback` for non-critical work |
+| **Assets** | WebP/AVIF images; SVG icons; bundle+minify+tree-shake (esbuild/Rollup); `loading="lazy"`; dynamic imports; long-lived cache headers + cache-busting; font subsetting + `font-display: swap` |
+| **JS** | Web Workers for heavy computation; debounce/throttle events; clean up listeners/intervals; `Map`/`Set` for lookups; `TypedArray` for numeric data |
+| **Node.js** | Async APIs only (never `readFileSync` in prod); clustering/worker threads for CPU; streams for large I/O; profile with `clinic.js` / `node --inspect` |
-- Async APIs only — never `fs.readFileSync` in production.
-- Clustering or worker threads for CPU-bound tasks.
-- Streams for large file/network processing.
-- Profile with `clinic.js`, `node --inspect`.
+## Debounce Example
-## Code Review Checklist
-- [ ] No obvious algorithmic inefficiencies (O(n²) or worse)?
-- [ ] Appropriate data structures?
-- [ ] No unnecessary computations or repeated work?
-- [ ] Caching used where appropriate with correct invalidation?
-- [ ] Database queries optimized, indexed, no N+1?
-- [ ] Large payloads paginated, streamed, or chunked?
-- [ ] No memory leaks or unbounded resource usage?
-- [ ] Network requests minimized, batched, retried on failure?
-- [ ] Assets optimized, compressed, served efficiently?
-- [ ] No blocking operations in hot paths?
-- [ ] Logging in hot paths minimized and structured?
-- [ ] Performance-critical paths documented and tested?
-- [ ] Automated benchmarks for performance-sensitive code?
-- [ ] Alerts for performance regressions?
-- [ ] No anti-patterns (SELECT *, blocking I/O, globals)?
-- [ ] Memoization used judiciously — only where profiling shows benefit?
-## Practical Examples
-### Debouncing User Input
-```javascript
-// BAD: API call on every keystroke
+```js
+// BAD: fetch on every keystroke
 input.addEventListener('input', (e) => fetch(`/search?q=${e.target.value}`));
-// GOOD: Debounced
-let timeout;
-input.addEventListener('input', (e) => {
-  clearTimeout(timeout);
-  timeout = setTimeout(() => fetch(`/search?q=${e.target.value}`), 300);
-});
+// GOOD: debounced 300 ms
+let t; input.addEventListener('input', (e) => { clearTimeout(t); t = setTimeout(() => fetch(`/search?q=${e.target.value}`), 300); });
 ```
-### Lazy Loading Images
-```html
-<!-- BAD -->
-<img src="large-image.jpg" />
+## Review Checklist
-<!-- GOOD -->
-<img src="large-image.jpg" loading="lazy" />
-```
+- [ ] No O(n²)+ algorithms; appropriate data structures
+- [ ] Caching with correct invalidation; no N+1 DB queries
+- [ ] Large payloads paginated/streamed; network requests batched
+- [ ] No memory leaks or blocking ops in hot paths
+- [ ] Assets optimized; memoization only where profiling shows benefit
+- [ ] Benchmarks for perf-sensitive code; alerts for regressions
 ## References
-- [Google Web Fundamentals: Performance](https://web.dev/performance/)
-- [MDN: Performance](https://developer.mozilla.org/en-US/docs/Web/Performance)
-- [Lighthouse](https://developers.google.com/web/tools/lighthouse)
+- [web.dev/performance](https://web.dev/performance/) · [MDN Performance](https://developer.mozilla.org/en-US/docs/Web/Performance) · [Lighthouse](https://developers.google.com/web/tools/lighthouse)
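
The debounced handler in the Debounce Example above keeps its timer in a shared variable. One way to make the pattern reusable is a small helper, sketched here under stated assumptions: `debounce` and its injectable `timers` parameter are hypothetical names, not part of the package, and the `encodeURIComponent` call in the usage comment is an addition the diffed example does not include.

```js
// Hypothetical reusable helper. `timers` is injectable so the delay can be faked in tests;
// by default it wraps the standard setTimeout / clearTimeout.
function debounce(fn, waitMs, timers = {
  set: (cb, ms) => setTimeout(cb, ms),
  clear: (handle) => clearTimeout(handle),
}) {
  let handle;
  return (...args) => {
    // Cancel the pending call (if any) and schedule a new one with the latest arguments.
    timers.clear(handle);
    handle = timers.set(() => fn(...args), waitMs);
  };
}

// Usage in a browser; `input` is assumed to be an <input> element as in the example above.
// const search = debounce((q) => fetch(`/search?q=${encodeURIComponent(q)}`), 300);
// input.addEventListener('input', (e) => search(e.target.value));
```

Passing the value rather than the event into the debounced function means the fetch uses the text as it was when the last keystroke fired, and each call site gets its own timer instead of sharing a module-level variable.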