npm - opencastle - Versions diffs - 0.8.1 → 0.9.0 - Mend

opencastle 0.8.1 → 0.9.0

Files changed (25)

package/dist/cli/detect.test.d.ts +2 -0
package/dist/cli/detect.test.d.ts.map +1 -0
package/dist/cli/detect.test.js +255 -0
package/dist/cli/detect.test.js.map +1 -0
package/dist/cli/run/executor.test.d.ts +2 -0
package/dist/cli/run/executor.test.d.ts.map +1 -0
package/dist/cli/run/executor.test.js +297 -0
package/dist/cli/run/executor.test.js.map +1 -0
package/dist/cli/run/schema.js +1 -1
package/dist/cli/run/schema.js.map +1 -1
package/dist/cli/run/schema.test.d.ts +2 -0
package/dist/cli/run/schema.test.d.ts.map +1 -0
package/dist/cli/run/schema.test.js +294 -0
package/dist/cli/run/schema.test.js.map +1 -0
package/package.json +2 -1
package/src/cli/detect.test.ts +337 -0
package/src/cli/run/executor.test.ts +338 -0
package/src/cli/run/schema.test.ts +343 -0
package/src/cli/run/schema.ts +1 -1
package/src/dashboard/node_modules/.vite/deps/_metadata.json +6 -6
package/src/orchestrator/agents/session-guard.agent.md +113 -0
package/src/orchestrator/agents/team-lead.agent.md +172 -569
package/src/orchestrator/skills/agent-hooks/SKILL.md +14 -13
package/src/orchestrator/skills/decomposition/SKILL.md +122 -0
package/src/orchestrator/skills/orchestration-protocols/SKILL.md +116 -0

package/src/orchestrator/skills/agent-hooks/SKILL.md CHANGED

@@ -58,17 +58,11 @@ Load relevant skills before writing code.
 ### Actions
-1. **Run the Pre-Response Quality Gate** — This is the single exit gate. Verify ALL items from the checklist in `general.instructions.md` § Pre-Response Quality Gate:
-   - [ ] Lessons read at session start
-   - [ ] Lessons captured for any retries
-   - [ ] Discovered issues tracked (not ignored)
-   - [ ] Lint/type/test pass (no new errors)
-   - [ ] Session logged to `sessions.ndjson` (ALWAYS)
-   - [ ] Delegations logged (Team Lead only)
-   - [ ] Reviews/panels/disputes logged (if applicable)
-2. **Save checkpoint** (Team Lead only) — If work is incomplete, write `.github/customizations/SESSION-CHECKPOINT.md` with current state so the next session can resume. Load **session-checkpoints** skill for format.
-3. **Memory merge check** — If `LESSONS-LEARNED.md` has grown significantly (5+ new entries this session), flag for memory merge consideration.
-4. **Clean up** — Remove any temporary files created during the session (e.g., test fixtures, debug outputs).
+1. **Call Session Guard** (Team Lead only) — Delegate to the **Session Guard** agent with a session summary (delegations, retries, discoveries, files changed). Execute any fix commands it returns. This replaces the manual Pre-Response Quality Gate checklist — the guard runs it automatically with a fresh context window.
+2. **For specialist agents** (not Team Lead) — Run the Pre-Response Quality Gate checklist from `general.instructions.md` manually. Specialist agents don't have access to the Session Guard.
+3. **Save checkpoint** (Team Lead only) — If work is incomplete, write `.github/customizations/SESSION-CHECKPOINT.md` with current state so the next session can resume. Load **session-checkpoints** skill for format.
+4. **Memory merge check** — If `LESSONS-LEARNED.md` has grown significantly (5+ new entries this session), flag for memory merge consideration.
+5. **Clean up** — Remove any temporary files created during the session (e.g., test fixtures, debug outputs).
 ### Template for Delegation Prompts
@@ -80,6 +74,8 @@ Load relevant skills before writing code.
 - Clean up temp files
 ```
+> **Note for Team Lead:** You do NOT use this template yourself. Instead, call the **Session Guard** agent (step 10 in your role). This template is only for specialist agents you delegate to.
 ---
 ## Hook: on-pre-delegate
@@ -115,8 +111,12 @@ Pre-Delegate:
 ### Actions
-0. **Fast review (mandatory)** — Run the `fast-review` skill against the agent's output. This is a **non-skippable gate**. See the fast-review skill for the full procedure (single reviewer sub-agent, automatic retry, escalation). Only after the fast review passes do you proceed to the remaining post-delegate actions below.
-1. **Verify output** — Read changed files. Check that changes stay within the agent's file partition.
+0. **Log the delegation NOW** — Append a record to `.github/customizations/logs/delegations.ndjson` immediately. Do this BEFORE review or verification — logging must not depend on review passing.
+   ```bash
+   echo '{"timestamp":"...","session_id":"<branch>","agent":"...","model":"...","tier":"...","mechanism":"sub-agent","outcome":"...","retries":0,"phase":N,"file_partition":["..."]}' >> .github/customizations/logs/delegations.ndjson
+   ```
+1. **Fast review (mandatory)** — Run the `fast-review` skill against the agent's output. This is a **non-skippable gate**. See the fast-review skill for the full procedure (single reviewer sub-agent, automatic retry, escalation). Only after the fast review passes do you proceed to the remaining post-delegate actions below.
+2. **Verify output** — Read changed files. Check that changes stay within the agent's file partition.
 2. **Run verification** — Execute appropriate checks: lint, type-check, tests, or visual inspection.
 3. **Check acceptance criteria** — Compare output against the tracker issue's acceptance criteria. Each criterion must be independently verified.
 4. **Discovered issues tracked** — Verify the agent followed the Discovered Issues Policy. If they found issues, check that they're in KNOWN-ISSUES.md or a new tracker ticket.
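The delegation record appended in step 0 of this hunk can also be built programmatically. The sketch below is a minimal illustration, not code from the package: the `DelegationRecord` interface and `toNdjsonLine` helper are hypothetical names, the field types are guessed from the keys in the `echo` command, and every value in the example is a placeholder.

```typescript
// Field names mirror the keys in the echo command above; the types are assumptions.
interface DelegationRecord {
  timestamp: string;
  session_id: string;
  agent: string;
  model: string;
  tier: string;
  mechanism: string;
  outcome: string;
  retries: number;
  phase: number;
  file_partition: string[];
}

// JSON.stringify escapes embedded newlines, so each record stays on one line.
function toNdjsonLine(record: DelegationRecord): string {
  return JSON.stringify(record) + "\n";
}

// All values below are placeholders for illustration only.
const exampleLine = toNdjsonLine({
  timestamp: new Date(0).toISOString(),
  session_id: "example-branch",
  agent: "example-agent",
  model: "example-model",
  tier: "example-tier",
  mechanism: "sub-agent",
  outcome: "success",
  retries: 0,
  phase: 1,
  file_partition: ["libs/example/"],
});

console.log(exampleLine);
```

In Node.js the resulting line could be appended to `.github/customizations/logs/delegations.ndjson` with `fs.appendFileSync`, which gives the same append-only behavior as the `>>` redirect in the shell version.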
@@ -127,6 +127,7 @@ Pre-Delegate:
 ```
 Post-Delegate:
+☐ Delegation logged to delegations.ndjson (FIRST — before anything else)
 ☐ Changed files reviewed
 ☐ Files within partition
 ☐ Lint/test/build passes

package/src/orchestrator/skills/decomposition/SKILL.md ADDED

@@ -0,0 +1,122 @@
+---
+name: decomposition
+description: "Task decomposition patterns for the Team Lead: dependency resolution, phase assignment, delegation spec templates, prompt quality examples, and orchestration patterns."
+---
+# Task Decomposition
+Detailed decomposition and delegation patterns for the Team Lead. **Load at:** Decompose & Partition phase (Step 2) or when writing delegation prompts (Step 3).
+## Dependency Resolution
+Declare dependencies between subtasks using arrow notation: `TaskB → TaskA` means B depends on A (A must finish first).
+**Topological sort rules:**
+1. Tasks with no dependencies go in Phase 1 (can run in parallel)
+2. Tasks depending only on Phase 1 tasks go in Phase 2
+3. Continue until all tasks are assigned to phases
+4. Tasks in the same phase with no mutual dependencies run in parallel
+**Cycle detection:** If A → B → C → A, break the cycle by: (a) finding a task that can partially complete independently, (b) splitting that task into an independent part and a dependent part.
+**Visual example:**
+```
+Dependency Graph:        Execution Plan:
+E → C → A               Phase 1: A, B (parallel)
+D → B                   Phase 2: C, D (parallel, depend on Phase 1)
+F → C, D                Phase 3: E, F (parallel, depend on Phase 2)
+```
+Always draw the dependency graph before assigning phases. Missed dependencies cause agents to block on missing inputs; redundant sequencing wastes time.
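The phase-assignment rules in the added Dependency Resolution section can be sketched as a layered topological sort. This is an illustrative sketch only: `DependencyGraph` and `assignPhases` are hypothetical names, and the graph is assumed to be a map from each task to the tasks it depends on. A round in which no task is ready is reported as a cycle; breaking the cycle is left to the caller.

```typescript
// Map of task -> tasks it depends on ("E → C" becomes E: ["C"]).
type DependencyGraph = Record<string, string[]>;

function assignPhases(graph: DependencyGraph): string[][] {
  // Collect every task, including ones that only appear as a dependency.
  const remaining = new Set<string>(Object.keys(graph));
  for (const task of Object.keys(graph)) {
    for (const dep of graph[task] ?? []) remaining.add(dep);
  }

  const done = new Set<string>();
  const phases: string[][] = [];

  while (remaining.size > 0) {
    // A task is ready once all of its dependencies finished in an earlier phase.
    const ready = Array.from(remaining)
      .filter((task) => (graph[task] ?? []).every((dep) => done.has(dep)))
      .sort();

    if (ready.length === 0) {
      const stuck = Array.from(remaining).sort().join(", ");
      throw new Error(`Dependency cycle among: ${stuck}`);
    }

    // Mark the whole phase as done only after it has been selected,
    // so tasks in the same phase never depend on each other.
    for (const task of ready) {
      remaining.delete(task);
      done.add(task);
    }
    phases.push(ready);
  }
  return phases;
}

// The graph from the visual example above.
const examplePlan = assignPhases({
  A: [],
  B: [],
  C: ["A"],
  D: ["B"],
  E: ["C"],
  F: ["C", "D"],
});
console.log(JSON.stringify(examplePlan)); // [["A","B"],["C","D"],["E","F"]]
```

Each inner array is one phase whose tasks can run in parallel, matching the three-phase execution plan shown in the visual example.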
+## Delegation Spec Template
+For complex tasks (score 5+), generate a structured spec rather than a free-form prompt:
+```
+## Delegation Spec: [Task Title]
+**Tracker Issue:** TAS-XX — [Title]
+**Complexity:** [score]/13 → [tier] tier
+**Agent:** [Agent Name]
+### Objective
+What to build/change and why. 1-3 sentences max.
+### Context
+- Key files to read first: [list]
+- Related patterns to follow: [file:line references]
+- Relevant lessons: [LES-XXX references from LESSONS-LEARNED.md]
+### Constraints
+- File partition: Only modify files under [paths]
+- Do NOT modify: [explicit exclusions]
+- Dependencies: Requires [TAS-XX] to be Done first
+### Acceptance Criteria
+- [ ] Criterion 1 (copied from tracker issue)
+- [ ] Criterion 2
+- [ ] Criterion 3
+### Expected Output
+Return a structured summary with:
+- Files changed (path + one-line description)
+- Verification results (lint/test/build pass/fail)
+- Acceptance criteria status (each item ✅/❌)
+- Discovered issues (if any)
+- Lessons applied or added
+**Note:** Follow the Structured Output Contract from the team-lead-reference skill. Include all standard fields plus agent-specific extensions.
+### Self-Improvement
+Read `.github/customizations/LESSONS-LEARNED.md` before starting. If you retry any command/tool with a different approach that works, immediately add a lesson to that file.
+```
+For simpler tasks (score 1-3), the existing prompt format (objective + files + criteria) is sufficient. Don't over-engineer delegation for trivial work.
+**For sub-agents** — also specify what information to return in the result message.
+**For background agents** — include full self-contained context since they cannot ask follow-up questions.
+## Prompt Quality Examples
+**Strong prompt (simple task, score 2):**
+> "**Tracker issue:** TAS-42 — [Auth] Fix token refresh logic
+> Users report 'Invalid token' errors after 30 minutes. JWT tokens are configured with 1-hour expiration in `libs/auth/src/server.ts`. Investigate why tokens expire early and fix the refresh logic. Only modify files under `libs/auth/`. Run the auth library tests to verify."
+**Strong prompt (complex task, score 8):**
+> Use the Delegation Spec Template above. Fill in all sections for tasks scoring 5+.
+**Weak prompt:**
+> "Fix the authentication bug."
+## Delegation Mechanism Selection
+```
+                         Need result immediately?
+                        /                        \
+                      YES                         NO
+                       |                           |
+              Is it a dependency              Expected duration
+              for the next step?              > 5 minutes?
+                /           \                  /          \
+              YES            NO              YES           NO
+               |              |               |             |
+          Sub-Agent      Sub-Agent       Background     Sub-Agent
+          (inline)    (if small enough,   Agent        (sequential)
+                       else Background)
+```
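The decision tree above can be read as a small pure function. The sketch below is one possible encoding, not part of the package: `TaskProfile`, `Mechanism`, and `selectMechanism` are hypothetical names, and `smallEnoughForInline` is an assumed flag because the tree says "if small enough" without defining a size limit.

```typescript
type Mechanism = "sub-agent" | "background";

interface TaskProfile {
  needsResultImmediately: boolean;
  blocksNextStep: boolean;
  // Assumption: the caller decides what "small enough" means.
  smallEnoughForInline: boolean;
  expectedMinutes: number;
}

function selectMechanism(task: TaskProfile): Mechanism {
  if (task.needsResultImmediately) {
    // Left branch: a dependency for the next step always runs inline.
    if (task.blocksNextStep) return "sub-agent";
    return task.smallEnoughForInline ? "sub-agent" : "background";
  }
  // Right branch: only work expected to exceed 5 minutes goes to the background.
  return task.expectedMinutes > 5 ? "background" : "sub-agent";
}

const exampleChoice = selectMechanism({
  needsResultImmediately: false,
  blocksNextStep: false,
  smallEnoughForInline: true,
  expectedMinutes: 20,
});
console.log(exampleChoice); // background
```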
+## Mixed Delegation Orchestration
+Combine sub-agents and background agents for maximum efficiency:
+```
+Phase 1 (sub-agent):     Research — gather context, identify patterns, map files
+Phase 2 (background):    Foundation — DB migration + Component scaffolding (parallel)
+Phase 3 (sub-agent):     Integration — wire components to data (needs Phase 2 results)
+Phase 4 (background):    Validation — Security audit + Tests + Docs (parallel)
+Phase 5 (sub-agent):     QA gate — verify all phases, run builds, self-review
+Phase 6 (sub-agent):     Panel review — load panel-majority-vote skill for high-stakes validation
+```

package/src/orchestrator/skills/orchestration-protocols/SKILL.md ADDED

@@ -0,0 +1,116 @@
+---
+name: orchestration-protocols
+description: "Runtime orchestration patterns for the Team Lead: parallel research spawning, agent health monitoring, active steering, background agent management, and escalation paths."
+---
+# Orchestration Protocols
+Runtime patterns for managing delegated agents. **Load at:** Execution phase (Step 4+), when monitoring active agents or spawning parallel work.
+## Active Steering
+Monitor agent sessions during execution. Intervene early when you spot:
+- **Failing tests/builds** — the agent can't resolve a dependency or breaks existing code
+- **Unexpected file changes** — files outside the agent's partition appear in the diff
+- **Scope creep** — the agent starts refactoring code you didn't ask about
+- **Circular behavior** — the agent retries the same failing approach without adjusting
+- **Intent misunderstanding** — session log shows the agent interpreted the prompt differently
+**When redirecting, be specific.** Explain *why* you're redirecting and *how* to proceed:
+> "Don't modify `libs/data/src/lib/product.ts` — that file is shared across features. Instead, add the new query in `libs/data/src/lib/reviews.ts`. This keeps the change isolated."
+**Timing matters.** Catching a problem 5 minutes in can save an hour. Don't wait until the agent finishes.
+**Background agent caveat:** The drift signals above apply only to **sub-agents** (inline) where you see results in real-time. Background agents run autonomously — you cannot inspect their intermediate state or redirect mid-execution. For background agents, steering is **post-hoc**: invest more effort in prompt specificity and file partition constraints upfront, then review thoroughly when the agent returns its output.
+## Background Agents
+Background agents run autonomously in isolated Git worktrees. Use for well-scoped subtasks with clear acceptance criteria.
+- **Spawn:** Delegate Session → Background → Select agent → Enter prompt
+- **Auto-compaction:** At 95% token limit, context is automatically compressed
+- **Resume:** Use `--resume` for previous sessions
+- **Duration threshold:** Reserve for tasks expected to take >5 minutes
+- **No real-time monitoring:** You cannot inspect intermediate state. Drift detection happens only at completion review. Mitigate with: (a) highly specific prompts, (b) strict file partition constraints, (c) acceptance criteria checklists in the prompt
+## Parallel Research Protocol
+When a task requires broad exploration before implementation, spawn multiple research sub-agents in parallel to gather context efficiently.
+### When to Use
+- 3+ independent research questions need answering before implementation can begin
+- Broad codebase exploration across multiple libraries or domains
+- Multi-area analysis (e.g., "How do we handle X in the frontend, backend, and CMS?")
+### Spawn Strategy
+- **Divide by topic/area**, not by file count — each researcher should own a coherent domain
+- **Max 3-5 parallel researchers** — more than 5 creates diminishing returns and token waste
+- **Each researcher gets a focused scope** — explicit directories, file patterns, or questions
+- **Use Economy/Standard tier** for research sub-agents to manage cost
+### Research Sub-Agent Prompt Template
+```
+Research: [specific question]
+Scope: [files/directories to search]
+Return: A structured summary with:
+- Key findings (bullet list)
+- Relevant file paths (with line numbers)
+- Patterns observed
+- Unanswered questions
+```
+### Result Merge Protocol
+After all research sub-agents return:
+1. **Collect** all sub-agent results into a single context
+2. **Deduplicate** findings — same file/pattern reported by multiple agents counts once
+3. **Resolve conflicts** — if agents report contradictory information, trust the one with more specific evidence (exact file paths + line numbers > general observations)
+4. **Synthesize** into a single context block for the next phase — distill the combined findings into a concise summary that can be included in implementation delegation prompts
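The collect, deduplicate, and resolve-conflicts steps above can be sketched as a single merge pass. This is a simplified illustration under stated assumptions: the `Finding` shape, the `topic` grouping key, and the `specificity` scoring are all hypothetical, and "more specific evidence" is reduced to counting whether a file path and a line number are present.

```typescript
interface Finding {
  topic: string; // hypothetical key identifying what the finding is about
  claim: string;
  filePath?: string;
  line?: number;
}

// Exact file path + line number outranks a path alone, which outranks neither.
function specificity(finding: Finding): number {
  return (finding.filePath !== undefined ? 1 : 0) + (finding.line !== undefined ? 1 : 0);
}

function mergeFindings(results: Finding[][]): Finding[] {
  const bestByTopic = new Map<string, Finding>();
  for (const findings of results) {
    for (const finding of findings) {
      const current = bestByTopic.get(finding.topic);
      // Keep one finding per topic; on conflict the more specific one wins.
      if (current === undefined || specificity(finding) > specificity(current)) {
        bestByTopic.set(finding.topic, finding);
      }
    }
  }
  return Array.from(bestByTopic.values());
}

// Placeholder findings from two hypothetical research sub-agents.
const exampleMerged = mergeFindings([
  [{ topic: "token-refresh", claim: "Refresh seems to happen on the client" }],
  [{ topic: "token-refresh", claim: "Refresh happens on the server", filePath: "libs/auth/src/server.ts", line: 10 }],
]);
console.log(exampleMerged.length); // 1
```

The synthesis step is deliberately left out: turning the merged list into a concise summary for delegation prompts is a writing task rather than a mechanical one.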
+### When NOT to Use
+- Single-file investigation — just read the file directly
+- When the answer is in one known location — a single sub-agent or direct read is faster
+- When results must be sequential (e.g., "find X, then based on X find Y")
+- For fewer than 3 questions — overhead of parallel coordination exceeds time saved
+## Batch Reviews
+When multiple background agents complete work simultaneously, batch similar reviews to save time:
+- Group reviews by domain (e.g., all UI changes together, all data changes together)
+- Run fast reviews in parallel for independent outputs
+- If multiple outputs share the same file partition boundary, review them sequentially to catch integration issues
+- For panel reviews, combine related artifacts into a single panel question when they share acceptance criteria
+## Agent Health-Check Protocol
+Monitor delegated agents for failure signals. Intervene early rather than waiting for completion.
+### Health Signals
+| Signal | Detection | Threshold | Recovery |
+|--------|-----------|-----------|----------|
+| **Stuck** | No new terminal output or file changes | Sub-agent: 5 min / Background: 15 min | Check terminal output. If idle, nudge with clarification. If frozen, abort and re-delegate with simpler scope. |
+| **Looping** | Same error message repeated 3+ times | 3 consecutive identical failures | Abort immediately. Analyze the error, add context the agent is missing, re-delegate with explicit fix path. |
+| **Scope creep** | Files outside assigned partition appear in diff | Any file outside partition | Redirect: "Only modify files in [partition]. Revert changes to [file]." |
+| **Context exhaustion** | Responses become repetitive, confused, or lose earlier instructions | Visible confusion or instruction amnesia | Checkpoint immediately. End session. Resume in fresh context. |
+| **Permission loop** | Agent repeatedly asks for confirmation or waits for input | 2+ consecutive prompts without progress | Auto-approve if safe, or abort and re-delegate with `--dangerously-skip-permissions` flag or equivalent. |
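Three of the signals in the table (Stuck, Looping, Scope creep) have thresholds concrete enough to check mechanically. The helpers below are a hedged sketch with hypothetical names; how idle time, error messages, and changed files are actually collected is not described in the diff and is assumed to be supplied by the caller.

```typescript
type AgentKind = "sub-agent" | "background";

// Idle thresholds from the Stuck row of the table, in minutes.
const STUCK_THRESHOLD_MINUTES: Record<AgentKind, number> = {
  "sub-agent": 5,
  background: 15,
};

function isStuck(kind: AgentKind, idleMinutes: number): boolean {
  return idleMinutes >= STUCK_THRESHOLD_MINUTES[kind];
}

// Looping: the last three failures carry the same error message.
function isLooping(errorMessages: string[]): boolean {
  if (errorMessages.length < 3) return false;
  const lastThree = errorMessages.slice(-3);
  return lastThree.every((message) => message === lastThree[0]);
}

// Scope creep: any changed file that is not under an assigned directory.
function filesOutsidePartition(changedFiles: string[], partition: string[]): string[] {
  return changedFiles.filter((file) =>
    !partition.some((entry) => {
      // Treat each entry as a directory so "libs/auth" does not match "libs/authz".
      const dir = entry.endsWith("/") ? entry : entry + "/";
      return file === entry || file.startsWith(dir);
    }),
  );
}

const exampleStrays = filesOutsidePartition(
  ["libs/auth/src/server.ts", "libs/authz/index.ts"],
  ["libs/auth"],
);
console.log(exampleStrays); // [ 'libs/authz/index.ts' ]
```

Context exhaustion and permission loops are left out because the table describes them in terms of observed behavior rather than a countable threshold on data the caller would obviously have.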
+### Health-Check Cadence
+- **Sub-agents (inline):** Monitor continuously — you see output in real-time
+- **Background agents:** Check terminal output after 10 minutes, then every 10 minutes
+- **After completion:** Always review the full diff before accepting output
+### Escalation Path
+1. **First failure:** Re-delegate with more specific prompt + error context
+2. **Second failure:** Downscope the task (split into smaller pieces) and re-delegate
+3. **Third failure:** Log to Dead Letter Queue (`.github/customizations/AGENT-FAILURES.md`), escalate to Architect for root cause analysis. If the failure involves a panel 3x BLOCK or unresolvable agent/reviewer conflict, create a **dispute record** in `.github/customizations/DISPUTES.md` instead (see **team-lead-reference** skill § Dispute Protocol).
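The escalation path above maps a failure count to a next action. The sketch below encodes that mapping; `EscalationStep` and `escalationStep` are hypothetical names, the `isDispute` flag stands in for "panel 3x BLOCK or unresolvable agent/reviewer conflict", and treating any failure beyond the third like the third is an assumption the diff does not state.

```typescript
type EscalationStep =
  | "re-delegate-with-error-context"
  | "downscope-and-re-delegate"
  | "log-to-dead-letter-queue"
  | "create-dispute-record";

function escalationStep(failureCount: number, isDispute: boolean = false): EscalationStep {
  if (!Number.isInteger(failureCount) || failureCount < 1) {
    throw new RangeError("failureCount must be a positive integer");
  }
  if (failureCount === 1) return "re-delegate-with-error-context";
  if (failureCount === 2) return "downscope-and-re-delegate";
  // Third failure (and, by assumption, any later one): a dispute record
  // replaces the Dead Letter Queue entry when the failure is a dispute.
  return isDispute ? "create-dispute-record" : "log-to-dead-letter-queue";
}

console.log(escalationStep(3)); // log-to-dead-letter-queue
```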