@codyswann/lisa 1.80.0 → 1.81.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/package.json +1 -1
- package/plugins/lisa/.claude-plugin/plugin.json +14 -10
- package/plugins/lisa/agents/bug-fixer.md +1 -1
- package/plugins/lisa/agents/builder.md +1 -1
- package/plugins/lisa/agents/jira-agent.md +10 -9
- package/plugins/lisa/commands/build.md +3 -3
- package/plugins/lisa/commands/fix.md +3 -3
- package/plugins/lisa/commands/improve.md +8 -8
- package/plugins/lisa/commands/investigate.md +1 -1
- package/plugins/lisa/commands/monitor.md +2 -2
- package/plugins/lisa/commands/plan/create.md +3 -1
- package/plugins/lisa/commands/plan/execute.md +1 -1
- package/plugins/lisa/commands/plan.md +3 -1
- package/plugins/lisa/commands/research.md +8 -0
- package/plugins/lisa/commands/review.md +2 -2
- package/plugins/lisa/commands/ship.md +2 -4
- package/plugins/lisa/commands/verify.md +10 -0
- package/plugins/lisa/hooks/inject-flow-context.sh +12 -0
- package/plugins/lisa/rules/base-rules.md +4 -0
- package/plugins/lisa/rules/intent-routing.md +204 -82
- package/plugins/lisa/rules/verification.md +11 -0
- package/plugins/lisa/skills/plan-execute/SKILL.md +36 -19
- package/plugins/lisa-cdk/.claude-plugin/plugin.json +1 -1
- package/plugins/lisa-expo/.claude-plugin/plugin.json +1 -1
- package/plugins/lisa-nestjs/.claude-plugin/plugin.json +1 -1
- package/plugins/lisa-rails/.claude-plugin/plugin.json +1 -1
- package/plugins/lisa-typescript/.claude-plugin/plugin.json +1 -1
- package/plugins/src/base/.claude-plugin/plugin.json +6 -10
- package/plugins/src/base/agents/bug-fixer.md +1 -1
- package/plugins/src/base/agents/builder.md +1 -1
- package/plugins/src/base/agents/jira-agent.md +10 -9
- package/plugins/src/base/commands/build.md +3 -3
- package/plugins/src/base/commands/fix.md +3 -3
- package/plugins/src/base/commands/improve.md +8 -8
- package/plugins/src/base/commands/investigate.md +1 -1
- package/plugins/src/base/commands/monitor.md +2 -2
- package/plugins/src/base/commands/plan/create.md +3 -1
- package/plugins/src/base/commands/plan/execute.md +1 -1
- package/plugins/src/base/commands/plan.md +3 -1
- package/plugins/src/base/commands/research.md +8 -0
- package/plugins/src/base/commands/review.md +2 -2
- package/plugins/src/base/commands/ship.md +2 -4
- package/plugins/src/base/commands/verify.md +10 -0
- package/plugins/src/base/hooks/inject-flow-context.sh +12 -0
- package/plugins/src/base/rules/base-rules.md +4 -0
- package/plugins/src/base/rules/intent-routing.md +204 -82
- package/plugins/src/base/rules/verification.md +11 -0
- package/plugins/src/base/skills/plan-execute/SKILL.md +36 -19
- package/scripts/test-intent-routing.sh +221 -0
|
@@ -1,114 +1,211 @@
|
|
|
1
1
|
# Intent Routing
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
MANDATORY: Before starting any work in a session, classify the user's initial request using the Flow Classification Protocol below. Do not respond, do not start work, do not ask questions until you have determined which flow applies. Once a flow is established, all subsequent messages operate within that flow — do not re-classify. This is not optional — skipping classification leads to unstructured responses that bypass readiness gates.
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
Each flow has a readiness gate that MUST pass before work begins. If the gate fails, stop and ask for what is missing.
|
|
6
6
|
|
|
7
|
-
|
|
8
|
-
When: Bug reports, broken behavior, error messages, JIRA bug tickets.
|
|
7
|
+
## Flow Classification Protocol
|
|
9
8
|
|
|
10
|
-
|
|
11
|
-
1. `git-history-analyzer` — understand why affected code exists, find related past fixes/reverts
|
|
12
|
-
2. `debug-specialist` — reproduce the bug, prove root cause with evidence
|
|
13
|
-
3. `architecture-specialist` — assess fix risk, identify files to change, check for ripple effects
|
|
14
|
-
4. `test-specialist` — design regression test strategy
|
|
15
|
-
5. `bug-fixer` — implement fix via TDD (reproduction becomes failing test)
|
|
16
|
-
6. **Verify sub-flow**
|
|
17
|
-
7. **Ship sub-flow**
|
|
18
|
-
8. `learner` — capture discoveries for future sessions
|
|
19
|
-
|
|
20
|
-
### Build
|
|
21
|
-
When: New features, stories, tasks, JIRA story/task tickets.
|
|
9
|
+
A `UserPromptSubmit` prompt hook uses a fast model to pre-classify the user's request and injects the result as `additionalContext`. Use this classification as a strong hint but verify it against the flow definitions below.
|
|
22
10
|
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
|
|
26
|
-
|
|
27
|
-
|
|
28
|
-
|
|
29
|
-
|
|
30
|
-
7. **Ship sub-flow**
|
|
31
|
-
8. `learner` — capture discoveries
|
|
11
|
+
1. If the user invoked a slash command (`/fix`, `/build`, `/plan`, etc.), the flow is already determined -- skip classification.
|
|
12
|
+
2. If a flow hint was injected by the hook, verify it matches the request. If it does, proceed with that flow.
|
|
13
|
+
3. If the classification is "None" or you disagree with the hint:
|
|
14
|
+
- **Interactive session** (user is present): present a multiple choice using AskUserQuestion with options: Research, Plan, Implement, Verify, No flow.
|
|
15
|
+
- **Headless/non-interactive session** (running with `-p` flag, in a CI pipeline, or as a scheduled agent): do NOT ask the user. Classify to the best of your ability from available context (ticket content, prompt text, current branch state). If you truly cannot classify, default to "No flow" and proceed with the request as-is.
|
|
16
|
+
4. Once a flow is selected, check its readiness gate before proceeding.
|
|
17
|
+
5. If you are a subagent: your parent agent has already determined the flow -- do NOT ask the user to choose a flow. Execute your assigned work within the established flow context.
|
|
32
18
|
|
|
33
|
-
|
|
34
|
-
|
|
19
|
+
## Readiness Gate Protocol
|
|
20
|
+
|
|
21
|
+
Every flow begins with a gate check. The gate defines what information must be present before the flow can begin.
|
|
22
|
+
|
|
23
|
+
If the gate fails:
|
|
24
|
+
- **Interactive session** (user is present):
|
|
25
|
+
1. Identify exactly what is missing
|
|
26
|
+
2. Before asking the user, attempt to answer the questions yourself from available context (source code, docs, git history, project structure, config files). Only ask the user about information you genuinely cannot determine.
|
|
27
|
+
3. When you do ask the user, provide recommended answers to choose from based on what you found in the codebase. Do not ask open-ended questions when you can offer specific options.
|
|
28
|
+
4. Tell the user what is needed and why
|
|
29
|
+
5. Do NOT proceed until the missing information is provided or resolved
|
|
30
|
+
6. If the missing information can be obtained by running a preceding flow (e.g., Research before Plan), suggest that instead
|
|
31
|
+
- **Headless/non-interactive session** (running with `-p` flag, in a CI pipeline, or as a scheduled agent): do NOT block on missing information. Infer what you can from available context (ticket content, prompt text, codebase state, git history). Proceed with best effort using what is available. If critical information is truly unobtainable, fail with a clear error message explaining what was missing.
|
|
32
|
+
|
|
33
|
+
## Main Flows
|
|
34
|
+
|
|
35
|
+
### Research
|
|
36
|
+
|
|
37
|
+
When: "I need a PRD", "What should we build?", product discovery, requirements gathering, feature exploration, understanding a problem space, open-ended feature ideas.
|
|
38
|
+
|
|
39
|
+
Gate:
|
|
40
|
+
- A problem statement, feature idea, or business objective must be provided
|
|
41
|
+
- If none is provided, ask: "What problem are you trying to solve or what capability are you trying to add?"
|
|
35
42
|
|
|
36
43
|
Sequence:
|
|
37
|
-
1.
|
|
38
|
-
2. `
|
|
39
|
-
3. `
|
|
40
|
-
4.
|
|
44
|
+
1. **Investigate sub-flow** -- gather context from codebase, git history, existing behavior, and external sources
|
|
45
|
+
2. `product-specialist` -- define user goals, user flows (Gherkin), acceptance criteria, error states, UX concerns, and out-of-scope items
|
|
46
|
+
3. `architecture-specialist` -- assess technical feasibility, identify constraints, map existing system boundaries
|
|
47
|
+
4. Synthesize findings into a PRD document containing: problem statement, user stories, acceptance criteria, technical constraints, open questions, and proposed scope
|
|
48
|
+
5. `learner` -- capture discoveries for future sessions
|
|
49
|
+
|
|
50
|
+
Output: A PRD document. If there is not enough context to produce a complete PRD, stop and report what is missing rather than producing an incomplete one.
|
|
41
51
|
|
|
42
52
|
### Plan
|
|
43
|
-
|
|
53
|
+
|
|
54
|
+
When: "Break this down", "Create tickets", epic planning, large scope work, JIRA epic tickets, "Turn this PRD into work items".
|
|
55
|
+
|
|
56
|
+
Gate:
|
|
57
|
+
- A PRD, specification document, or equivalent detailed description must be provided
|
|
58
|
+
- The specification must contain: clear scope, acceptance criteria, and enough detail to decompose into work items
|
|
59
|
+
- If no specification exists, stop and suggest running the **Research** flow first
|
|
60
|
+
- If the specification has unresolved ambiguities, stop and list them
|
|
44
61
|
|
|
45
62
|
Sequence:
|
|
46
|
-
1.
|
|
47
|
-
2. `
|
|
48
|
-
3.
|
|
63
|
+
1. **Investigate sub-flow** -- explore codebase for architecture, patterns, dependencies relevant to the spec
|
|
64
|
+
2. `product-specialist` -- validate and refine acceptance criteria for the whole scope
|
|
65
|
+
3. `architecture-specialist` -- map dependencies, identify cross-cutting concerns, determine execution order
|
|
66
|
+
4. Decompose into ordered work items (epics, stories, tasks, spikes, bugs), each with:
|
|
67
|
+
- Type (epic, story, task, spike, bug)
|
|
68
|
+
- Acceptance criteria
|
|
69
|
+
- Verification method
|
|
70
|
+
- Dependencies
|
|
71
|
+
- Skills required
|
|
72
|
+
5. Create work items in the tracker (JIRA, Linear, GitHub) with acceptance criteria and dependencies
|
|
73
|
+
6. `learner` -- capture discoveries for future sessions
|
|
74
|
+
|
|
75
|
+
Output: Work items in a tracker with acceptance criteria, ordered by dependency. If the specification cannot be decomposed without further clarification, stop and report what is missing.
|
|
76
|
+
|
|
77
|
+
### Implement
|
|
78
|
+
|
|
79
|
+
When: Working on a specific ticket (story, task, bug, spike), implementing a well-defined piece of work.
|
|
80
|
+
|
|
81
|
+
Gate:
|
|
82
|
+
- A well-defined work item with acceptance criteria must be provided (ticket URL, file spec, or detailed description)
|
|
83
|
+
- The work item must have clear scope, expected behavior, and verification method
|
|
84
|
+
- If acceptance criteria are missing or ambiguous, stop and ask before proceeding
|
|
85
|
+
- If the work item is too large (epic-level), stop and suggest running the **Plan** flow first
|
|
86
|
+
|
|
87
|
+
Determine the work type and execute the matching variant:
|
|
88
|
+
|
|
89
|
+
#### Build (features, stories, tasks)
|
|
90
|
+
|
|
91
|
+
1. **Investigate sub-flow** -- explore codebase for related code, patterns, dependencies
|
|
92
|
+
2. `product-specialist` -- define acceptance criteria, user flows, error states
|
|
93
|
+
3. `architecture-specialist` -- design approach, map files to modify, identify reusable code
|
|
94
|
+
4. `test-specialist` -- design test strategy (coverage, edge cases, TDD sequence)
|
|
95
|
+
5. `builder` -- implement via TDD (acceptance criteria become tests)
|
|
96
|
+
6. Run validation: lint, typecheck, tests
|
|
97
|
+
7. `verification-specialist` -- verify locally (run the software, observe behavior)
|
|
98
|
+
8. Write e2e test encoding the verification
|
|
99
|
+
9. **Review sub-flow**
|
|
100
|
+
10. `learner` -- capture discoveries
|
|
101
|
+
|
|
102
|
+
#### Fix (bugs)
|
|
103
|
+
|
|
104
|
+
1. **Reproduce sub-flow** -- write failing test or script that demonstrates the bug (MANDATORY before any fix is attempted)
|
|
105
|
+
2. **Investigate sub-flow** -- git history, root cause analysis
|
|
106
|
+
3. `debug-specialist` -- prove root cause with evidence
|
|
107
|
+
4. `architecture-specialist` -- assess fix risk, identify files to change, check for ripple effects
|
|
108
|
+
5. `test-specialist` -- design regression test strategy
|
|
109
|
+
6. `bug-fixer` -- implement fix via TDD (reproduction becomes failing test)
|
|
110
|
+
7. Run validation: lint, typecheck, tests
|
|
111
|
+
8. `verification-specialist` -- verify locally (prove the bug is fixed)
|
|
112
|
+
9. Write e2e test encoding the verification
|
|
113
|
+
10. **Review sub-flow**
|
|
114
|
+
11. `learner` -- capture discoveries
|
|
115
|
+
|
|
116
|
+
#### Improve (refactoring, optimization, coverage improvement)
|
|
117
|
+
|
|
118
|
+
1. **Investigate sub-flow** -- understand current state, measure baseline
|
|
119
|
+
2. `architecture-specialist` -- identify target, plan approach
|
|
120
|
+
3. `test-specialist` -- ensure existing test coverage before refactoring (safety net)
|
|
121
|
+
4. `builder` -- implement improvements via TDD
|
|
122
|
+
5. Run validation: lint, typecheck, tests
|
|
123
|
+
6. `verification-specialist` -- measure again, prove improvement over baseline
|
|
124
|
+
7. Write e2e test encoding the verification (if applicable)
|
|
125
|
+
8. **Review sub-flow**
|
|
126
|
+
9. `learner` -- capture discoveries
|
|
127
|
+
|
|
128
|
+
#### Investigate Only (spikes)
|
|
129
|
+
|
|
130
|
+
1. **Investigate sub-flow** -- full investigation
|
|
131
|
+
2. Report findings with evidence
|
|
132
|
+
3. Recommend next action (Research, Plan, Implement, or escalate)
|
|
133
|
+
4. `learner` -- capture discoveries
|
|
134
|
+
|
|
135
|
+
Output: Code passing all validation + local empirical verification + e2e test (except for spikes, which produce findings only).
|
|
49
136
|
|
|
50
137
|
### Verify
|
|
51
|
-
When: Pre-ship quality gate. Used as a sub-flow by Fix and Build.
|
|
52
138
|
|
|
53
|
-
|
|
54
|
-
1. Run full test suite — all tests must pass before proceeding
|
|
55
|
-
2. Run quality checks — lint, typecheck, and format
|
|
56
|
-
3. `verification-specialist` — verify acceptance criteria are met empirically
|
|
139
|
+
When: Code is ready to ship. All local validation passes. Moving from "works on my machine" to "works in production".
|
|
57
140
|
|
|
58
|
-
|
|
59
|
-
|
|
141
|
+
Gate:
|
|
142
|
+
- Code must pass local validation (lint, typecheck, tests)
|
|
143
|
+
- Local empirical verification must be complete
|
|
144
|
+
- If local validation fails, go back to **Implement**
|
|
145
|
+
- If no code changes exist, there is nothing to verify
|
|
60
146
|
|
|
61
147
|
Sequence:
|
|
62
|
-
1. Commit
|
|
63
|
-
2. PR
|
|
64
|
-
3.
|
|
65
|
-
|
|
66
|
-
- If
|
|
67
|
-
- If merge conflicts → resolve and push
|
|
148
|
+
1. Commit -- atomic conventional commits via `git-commit` skill
|
|
149
|
+
2. PR -- create/update pull request via `git-submit-pr` skill
|
|
150
|
+
3. PR Watch Loop (repeat until mergeable):
|
|
151
|
+
- If status checks fail -- fix and push
|
|
152
|
+
- If merge conflicts -- resolve and push
|
|
68
153
|
- If bot review feedback (CodeRabbit, etc.):
|
|
69
|
-
- Valid feedback
|
|
70
|
-
- Invalid feedback
|
|
154
|
+
- Valid feedback -- implement fix, push, resolve comment
|
|
155
|
+
- Invalid feedback -- reply explaining why, resolve comment
|
|
71
156
|
- Repeat until all checks pass and all comments are resolved
|
|
72
|
-
|
|
73
|
-
|
|
74
|
-
|
|
75
|
-
|
|
157
|
+
4. Merge the PR
|
|
158
|
+
5. Monitor deploy (watch the deployment action triggered by merge):
|
|
159
|
+
- If deploy fails -- fix, open new PR, return to step 3
|
|
160
|
+
6. Remote verification:
|
|
161
|
+
- `verification-specialist` -- verify in target environment (same checks as local verification, but on remote)
|
|
162
|
+
- `ops-specialist` -- post-deploy health check, smoke test, monitor for errors in first minutes
|
|
163
|
+
- If remote verification fails -- fix, open new PR, return to step 3
|
|
76
164
|
|
|
77
|
-
|
|
78
|
-
|
|
165
|
+
Output: Merged PR, successful deploy, remote verification passing.
|
|
166
|
+
|
|
167
|
+
## Sub-flows
|
|
168
|
+
|
|
169
|
+
Sub-flows are reusable sequences invoked by main flows. When a flow says "Investigate sub-flow", execute the full Investigate sequence.
|
|
170
|
+
|
|
171
|
+
### Investigate
|
|
172
|
+
|
|
173
|
+
Purpose: Gather context and evidence about code, behavior, or systems.
|
|
79
174
|
|
|
80
175
|
Sequence:
|
|
81
|
-
1.
|
|
82
|
-
2. `
|
|
83
|
-
3. `
|
|
84
|
-
4.
|
|
176
|
+
1. `git-history-analyzer` -- understand why affected code exists, find related past changes
|
|
177
|
+
2. `debug-specialist` -- reproduce if applicable, trace execution, prove findings with evidence
|
|
178
|
+
3. `ops-specialist` -- check logs, errors, health (if runtime issue)
|
|
179
|
+
4. Report findings with evidence
|
|
180
|
+
|
|
181
|
+
### Reproduce
|
|
85
182
|
|
|
86
|
-
|
|
87
|
-
When: Refactoring, optimization, coverage improvement, complexity reduction.
|
|
183
|
+
Purpose: Create a reliable reproduction that demonstrates a bug before fixing it.
|
|
88
184
|
|
|
89
185
|
Sequence:
|
|
90
|
-
1.
|
|
91
|
-
2.
|
|
92
|
-
3.
|
|
93
|
-
4.
|
|
94
|
-
5. **Ship sub-flow**
|
|
95
|
-
6. `learner` — capture discoveries
|
|
186
|
+
1. Execute the exact scenario that triggers the bug
|
|
187
|
+
2. Capture complete error output and evidence
|
|
188
|
+
3. Write a failing test that captures the bug (preferred) or a minimal reproduction script
|
|
189
|
+
4. Verify the reproduction is reliable (runs multiple times, consistently fails)
|
|
96
190
|
|
|
97
|
-
|
|
98
|
-
|
|
191
|
+
A bug MUST be reproduced before any fix is attempted. If reproduction fails, report what was tried and stop.
|
|
192
|
+
|
|
193
|
+
### Review
|
|
194
|
+
|
|
195
|
+
Purpose: Multi-dimensional code review before shipping.
|
|
99
196
|
|
|
100
197
|
Sequence:
|
|
101
|
-
1. `
|
|
102
|
-
2. `
|
|
103
|
-
3.
|
|
104
|
-
4.
|
|
105
|
-
5. `learner` — capture discoveries
|
|
198
|
+
1. Run in parallel: `quality-specialist`, `security-specialist`, `performance-specialist`
|
|
199
|
+
2. `product-specialist` -- verify acceptance criteria are met empirically
|
|
200
|
+
3. `test-specialist` -- verify test coverage and quality
|
|
201
|
+
4. Consolidate findings, ranked by severity
|
|
106
202
|
|
|
107
203
|
### Monitor
|
|
108
|
-
|
|
204
|
+
|
|
205
|
+
Purpose: Check application health and operational status. Can be invoked standalone or as part of Verify.
|
|
109
206
|
|
|
110
207
|
Sequence:
|
|
111
|
-
1. `ops-specialist`
|
|
208
|
+
1. `ops-specialist` -- health checks, log inspection, error monitoring, performance analysis
|
|
112
209
|
2. Report findings, escalate if action needed
|
|
113
210
|
|
|
114
211
|
## JIRA Entry Point
|
|
@@ -116,11 +213,36 @@ Sequence:
|
|
|
116
213
|
When the request references a JIRA ticket (ticket ID like PROJ-123 or a JIRA URL):
|
|
117
214
|
|
|
118
215
|
1. Hand off to `jira-agent`
|
|
119
|
-
2. `jira-agent` reads the ticket
|
|
120
|
-
3.
|
|
121
|
-
4. `jira-agent`
|
|
122
|
-
5. `jira-agent`
|
|
216
|
+
2. `jira-agent` reads the ticket fully (description, comments, attachments, linked issues)
|
|
217
|
+
3. `jira-agent` validates ticket quality via `jira-verify` skill
|
|
218
|
+
4. `jira-agent` runs analytical triage via `ticket-triage` skill
|
|
219
|
+
5. If triage finds unresolved ambiguities (`BLOCKED` verdict), `jira-agent` posts findings and STOPS -- no work begins
|
|
220
|
+
6. `jira-agent` determines intent and delegates to the appropriate flow:
|
|
221
|
+
|
|
222
|
+
| Ticket Type | Flow | Work Type |
|
|
223
|
+
|-------------|------|-----------|
|
|
224
|
+
| Epic | Plan | -- |
|
|
225
|
+
| Story | Implement | Build |
|
|
226
|
+
| Task | Implement | Build |
|
|
227
|
+
| Bug | Implement | Fix |
|
|
228
|
+
| Spike | Implement | Investigate Only |
|
|
229
|
+
| Improvement | Implement | Improve |
|
|
230
|
+
|
|
231
|
+
If the ticket type is ambiguous, read the description to classify. A "Task" that describes broken behavior is a Fix. A "Bug" that requests new functionality is a Build.
|
|
232
|
+
|
|
233
|
+
7. `jira-agent` syncs progress at milestones via `jira-sync` skill
|
|
234
|
+
8. `jira-agent` posts evidence at completion via `jira-evidence` skill
|
|
235
|
+
|
|
236
|
+
## Flow Chaining
|
|
237
|
+
|
|
238
|
+
Flows can chain naturally:
|
|
239
|
+
- Research produces a PRD -- hand it to Plan
|
|
240
|
+
- Plan produces work items -- hand each to Implement
|
|
241
|
+
- Implement produces verified code -- hand to Verify
|
|
242
|
+
- If any flow discovers it lacks what it needs, it stops and suggests the preceding flow
|
|
243
|
+
|
|
244
|
+
The full lifecycle for a large initiative: Research -> Plan -> Implement (per item) -> Verify (per item).
|
|
123
245
|
|
|
124
246
|
## Sub-flow Usage
|
|
125
247
|
|
|
126
|
-
Flows reference sub-flows by name. When a flow says "
|
|
248
|
+
Flows reference sub-flows by name. When a flow says "Investigate sub-flow", execute the full Investigate sub-flow sequence. When it says "Review sub-flow", execute the full Review sequence. Sub-flows can be invoked by any main flow.
|
|
@@ -90,4 +90,15 @@ Every change requires one or more verification types. Classify the change first,
|
|
|
90
90
|
|
|
91
91
|
---
|
|
92
92
|
|
|
93
|
+
## Local vs Remote Verification
|
|
94
|
+
|
|
95
|
+
Verification happens at two stages in the workflow:
|
|
96
|
+
|
|
97
|
+
- **Local verification** (part of the Implement flow): Run the full test suite, typecheck, lint, and empirically verify the change in a local or preview environment. This proves the change works before shipping. After local verification succeeds, encode it as an e2e test.
|
|
98
|
+
- **Remote verification** (part of the Verify flow): After the PR is merged and deployed, repeat the same empirical verification against the target environment. This proves the change works in production, not just locally. If remote verification fails, fix and re-deploy.
|
|
99
|
+
|
|
100
|
+
Both levels use the same verification types table above. The difference is the environment, not the rigor.
|
|
101
|
+
|
|
102
|
+
---
|
|
103
|
+
|
|
93
104
|
For the full verification lifecycle (classify, check tooling, plan, execute, loop), surfaces, escalation protocol, and proof artifact requirements, see the `verification-lifecycle` skill loaded by the `verification-specialist` agent.
|
|
@@ -35,17 +35,28 @@ Using the general-purpose agent in Team Lead session, Determine what branch to u
|
|
|
35
35
|
2. Are we on a feature branch without an open pull request? Use the branch, but ask the human what branch to target for the PR
|
|
36
36
|
3. Are we on an environment branch (dev, staging, main, prod, production)? Check out a feature branch named for this plan and set the target branch of the PR to the environment branch
|
|
37
37
|
|
|
38
|
-
Using the general-purpose agent in Team Lead session, Determine
|
|
39
|
-
1.
|
|
40
|
-
2.
|
|
41
|
-
3.
|
|
42
|
-
4.
|
|
43
|
-
|
|
44
|
-
|
|
45
|
-
|
|
46
|
-
|
|
38
|
+
Using the general-purpose agent in Team Lead session, Determine which flow applies:
|
|
39
|
+
1. Research -- needs a PRD (no specification exists)
|
|
40
|
+
2. Plan -- needs decomposition (specification exists but no work items)
|
|
41
|
+
3. Implement -- has a well-defined work item
|
|
42
|
+
4. Verify -- has code ready to ship
|
|
43
|
+
|
|
44
|
+
If Implement, determine the work type:
|
|
45
|
+
1. Build (feature, story, task)
|
|
46
|
+
2. Fix (bug -- mandatory Reproduce sub-flow before investigation)
|
|
47
|
+
3. Improve (refactoring, optimization, coverage improvement)
|
|
48
|
+
4. Investigate Only (spike -- no code changes, just findings)
|
|
49
|
+
|
|
50
|
+
Run the readiness gate check for the selected flow as defined in `.claude/rules/intent-routing.md`. If the gate fails, stop and report what is missing.
|
|
51
|
+
|
|
52
|
+
IF it is a Fix (bug), execute the Reproduce sub-flow FIRST:
|
|
53
|
+
1. Write a failing test that demonstrates the bug (preferred)
|
|
54
|
+
2. If a failing test is not possible, write a minimal reproduction script
|
|
55
|
+
3. Verify the reproduction is reliable (consistent failure)
|
|
56
|
+
4. The reproduction MUST succeed before any investigation or fix attempt begins
|
|
57
|
+
5. Examples of reproduction methods:
|
|
47
58
|
1. Write a simple API client and call the offending API
|
|
48
|
-
2. Start the server on localhost and
|
|
59
|
+
2. Start the server on localhost and use the Playwright CLI or Chrome DevTools
|
|
49
60
|
|
|
50
61
|
Using the general-purpose agent in Team Lead session, determine how you will know that the task is fully complete
|
|
51
62
|
1. Examples
|
|
@@ -78,12 +89,18 @@ Before any task is implemented, the agent team must explore the codebase for rel
|
|
|
78
89
|
Each task must be reviewed by the team to make sure their verification passes.
|
|
79
90
|
Each task must have their learnings reviewed by the learner subagent.
|
|
80
91
|
|
|
81
|
-
Before shutting down the team:
|
|
82
|
-
|
|
83
|
-
1.
|
|
84
|
-
2.
|
|
85
|
-
3.
|
|
86
|
-
4.
|
|
87
|
-
5.
|
|
88
|
-
6.
|
|
89
|
-
7.
|
|
92
|
+
Before shutting down the team, execute the Verify flow:
|
|
93
|
+
|
|
94
|
+
1. Run local validation: lint, typecheck, tests — all must pass
|
|
95
|
+
2. `verification-specialist`: verify locally (empirical proof that the change works)
|
|
96
|
+
3. Write e2e test encoding the verification
|
|
97
|
+
4. Commit ALL outstanding changes in logical batches on the branch (minus sensitive data/information) — not just changes made by the agent team. This includes pre-existing uncommitted changes that were on the branch before the plan started. Do NOT filter commits to only "task-related" files. If it shows up in git status, it gets committed (unless it contains secrets).
|
|
98
|
+
5. Push the changes - if any pre-push hook blocks you, create a task for the agent team to fix the error/problem whether it was pre-existing or not
|
|
99
|
+
6. Open a pull request with auto-merge on
|
|
100
|
+
7. PR Watch Loop: Monitor the PR. Create a task for the agent team to resolve any code review comments by either implementing the suggestions or commenting why they should not be implemented and close the comment. Fix any failing checks and repush. Continue until all checks pass.
|
|
101
|
+
8. Merge the PR
|
|
102
|
+
9. Monitor the deploy action that triggers automatically from the successful merge
|
|
103
|
+
10. If deploy fails, create a task for the agent team to fix the failure, open a new PR and then go back to step 7
|
|
104
|
+
11. Remote verification: `verification-specialist` verifies in target environment (same checks as local verification, but on remote)
|
|
105
|
+
12. `ops-specialist`: post-deploy health check, monitor for errors in first minutes
|
|
106
|
+
13. If remote verification fails, create a task for the agent team to find out why it failed, fix it and return to step 4 (repeat until you get all the way through)
|
|
@@ -0,0 +1,221 @@
|
|
|
1
|
+
#!/usr/bin/env bash
# Validates the intent-routing system wiring:
# - All commands reference valid flows
# - All agents referenced in intent-routing exist
# - All skills referenced by agents exist
# - Hooks produce valid JSON
# - plugin.json is valid and references existing hook files
# - No stale flow names remain
set -euo pipefail

# jq is used by the hook and plugin.json checks below. Bail out early with a
# clear message; otherwise a missing jq masquerades as validation failures
# (e.g. "plugin.json is not valid JSON").
if ! command -v jq >/dev/null 2>&1; then
  echo "ERROR: jq is required by this script but was not found in PATH" >&2
  exit 1
fi

# Plugin source tree and the built plugin it is compiled into.
PLUGIN_SRC="plugins/src/base"
PLUGIN_BUILT="plugins/lisa"
ERRORS=0
WARNINGS=0

# Reporting helpers. fail/warn also bump the counters that the summary at the
# bottom of the script reads; printf avoids echo's flag/backslash pitfalls.
pass() { printf ' PASS: %s\n' "$1"; }
fail() { printf ' FAIL: %s\n' "$1"; ERRORS=$((ERRORS + 1)); }
warn() { printf ' WARN: %s\n' "$1"; WARNINGS=$((WARNINGS + 1)); }

echo "=== Intent Routing Validation ==="
echo ""
|
|
22
|
+
|
|
23
|
+
# 1. Check intent-routing.md exists and has all 4 main flows
echo "--- 1. Flow Definitions ---"
ROUTING="$PLUGIN_SRC/rules/intent-routing.md"

# check_sections LABEL SECTION...
# Verify each literal "### Name" heading exists in intent-routing.md.
# -F: headings are literal strings, not regexes; -- guards odd patterns.
check_sections() {
  local label="$1"
  shift
  local section
  for section in "$@"; do
    if grep -qF -- "$section" "$ROUTING"; then
      pass "$label section '$section' exists"
    else
      fail "$label section '$section' missing from intent-routing.md"
    fi
  done
}

if [ ! -f "$ROUTING" ]; then
  fail "intent-routing.md not found at $ROUTING"
else
  check_sections "Flow" "### Research" "### Plan" "### Implement" "### Verify"
  check_sections "Sub-flow" "### Investigate" "### Reproduce" "### Review" "### Monitor"
fi
echo ""
|
|
45
|
+
|
|
46
|
+
# 2. Check no stale flow names remain in source
echo "--- 2. Stale Flow Names ---"
# Old flow names were renamed away; any source file still mentioning one is stale.
STALE_FILES=$(grep -rl "Fix flow\|Build flow\|Ship flow\|Improve flow\|Investigate flow\|Monitor flow\|Review flow" "$PLUGIN_SRC" 2>/dev/null || true)
if [ -n "$STALE_FILES" ]; then
  fail "Stale flow names found in: $STALE_FILES"
else
  pass "No stale flow names (Fix flow, Build flow, etc.) in $PLUGIN_SRC"
fi
echo ""
|
|
55
|
+
|
|
56
|
+
# 3. Check all commands exist
echo "--- 3. Commands ---"
# Every user-facing command must have a markdown definition under commands/.
for cmd in research verify fix build improve investigate plan ship review monitor; do
  cmd_file="$PLUGIN_SRC/commands/$cmd.md"
  if [ -f "$cmd_file" ]; then
    pass "Command /$cmd exists"
  else
    fail "Command /$cmd missing at $cmd_file"
  fi
done
echo ""
|
|
66
|
+
|
|
67
|
+
# 4. Check commands reference the correct flows
echo "--- 4. Command -> Flow References ---"
# check_cmd_flow CMD FLOW
# The command's markdown doc must mention the flow it routes to.
# -F: flow names are literal strings, not regexes; -- guards odd patterns.
check_cmd_flow() {
  local cmd="$1" expected="$2"
  if grep -qF -- "$expected" "$PLUGIN_SRC/commands/$cmd.md" 2>/dev/null; then
    pass "/$cmd references '$expected'"
  else
    fail "/$cmd does not reference '$expected'"
  fi
}
check_cmd_flow "research" "Research"
check_cmd_flow "plan" "Plan"
check_cmd_flow "fix" "Implement"
check_cmd_flow "build" "Implement"
check_cmd_flow "improve" "Implement"
check_cmd_flow "investigate" "Implement"
check_cmd_flow "verify" "Verify"
check_cmd_flow "ship" "Verify"
check_cmd_flow "review" "Review"
check_cmd_flow "monitor" "Monitor"
echo ""
|
|
88
|
+
|
|
89
|
+
# 5. Check all agents referenced in intent-routing exist
echo "--- 5. Agent References ---"
AGENTS_DIR="$PLUGIN_SRC/agents"
for agent in product-specialist architecture-specialist test-specialist builder bug-fixer \
             debug-specialist git-history-analyzer ops-specialist verification-specialist \
             quality-specialist security-specialist performance-specialist learner jira-agent; do
  if [ -f "$AGENTS_DIR/$agent.md" ]; then
    pass "Agent '$agent' exists"
    continue
  fi
  # Not in base. ops-specialist is stack-specific, so check expo/rails;
  # anything else missing is a hard failure.
  if [ "$agent" != "ops-specialist" ]; then
    fail "Agent '$agent' referenced in intent-routing but not found"
  elif [ -f "plugins/src/expo/agents/$agent.md" ] || [ -f "plugins/src/rails/agents/$agent.md" ]; then
    pass "Agent '$agent' exists (stack-specific)"
  else
    warn "Agent '$agent' not found in base (expected in stack-specific plugins)"
  fi
done
echo ""
|
|
111
|
+
|
|
112
|
+
# 6. Check hooks
echo "--- 6. Hooks ---"
HOOK_FILE="$PLUGIN_SRC/hooks/inject-flow-context.sh"
if [ -f "$HOOK_FILE" ]; then
  pass "inject-flow-context.sh exists"
  if [ -x "$HOOK_FILE" ]; then
    pass "inject-flow-context.sh is executable"
  else
    fail "inject-flow-context.sh is not executable"
  fi
  # Test it produces valid JSON. The '|| true' keeps a crashing hook from
  # aborting the whole validator via set -e/pipefail; a crash then surfaces
  # as a FAIL in the JSON checks below instead.
  HOOK_OUTPUT=$(echo '{}' | bash "$HOOK_FILE" 2>/dev/null || true)
  if echo "$HOOK_OUTPUT" | jq . >/dev/null 2>&1; then
    pass "inject-flow-context.sh produces valid JSON"
    # Check it has the right structure
    if echo "$HOOK_OUTPUT" | jq -e '.hookSpecificOutput.additionalContext' >/dev/null 2>&1; then
      pass "inject-flow-context.sh has correct JSON structure"
    else
      fail "inject-flow-context.sh missing hookSpecificOutput.additionalContext"
    fi
  else
    fail "inject-flow-context.sh does not produce valid JSON"
  fi
else
  fail "inject-flow-context.sh not found"
fi
echo ""
|
|
139
|
+
|
|
140
|
+
# 7. Check plugin.json
echo "--- 7. Plugin Configuration ---"
PLUGIN_JSON="$PLUGIN_SRC/.claude-plugin/plugin.json"
if jq . "$PLUGIN_JSON" >/dev/null 2>&1; then
  pass "plugin.json is valid JSON"
else
  fail "plugin.json is not valid JSON"
fi

# jq_ok FILTER — true when the jq filter selects something in plugin.json.
jq_ok() { jq -e "$1" "$PLUGIN_JSON" >/dev/null 2>&1; }

# Check haiku prompt hook is registered
if jq_ok '.hooks.UserPromptSubmit[].hooks[] | select(.type == "prompt")'; then
  pass "Haiku prompt hook registered in UserPromptSubmit"
else
  fail "Haiku prompt hook not found in UserPromptSubmit"
fi

# Check inject-flow-context is registered in SubagentStart
if jq_ok '.hooks.SubagentStart[] | select(.hooks[].command | test("inject-flow-context"))'; then
  pass "inject-flow-context.sh registered in SubagentStart"
else
  fail "inject-flow-context.sh not registered in SubagentStart"
fi
echo ""
|
|
163
|
+
|
|
164
|
+
# 8. Check built plugin matches source
echo "--- 8. Built Plugin ---"
if [ ! -d "$PLUGIN_BUILT" ]; then
  fail "Built plugin directory $PLUGIN_BUILT not found"
else
  # Spot-check a representative set of built files against their sources.
  for file in commands/research.md commands/verify.md hooks/inject-flow-context.sh rules/intent-routing.md; do
    built_file="$PLUGIN_BUILT/$file"
    if [ ! -f "$built_file" ]; then
      fail "Built $file not found (run bun run build:plugins)"
    elif diff -q "$PLUGIN_SRC/$file" "$built_file" >/dev/null 2>&1; then
      pass "Built $file matches source"
    else
      fail "Built $file differs from source (run bun run build:plugins)"
    fi
  done
fi
echo ""
|
|
182
|
+
|
|
183
|
+
# 9. Check readiness gates are defined
echo "--- 9. Readiness Gates ---"
# Each phrase below should appear somewhere in intent-routing.md; absence is
# only a warning since gates may be reworded.
while IFS= read -r gate; do
  if grep -q "$gate" "$ROUTING"; then
    pass "Readiness gate reference '$gate' found"
  else
    warn "Readiness gate reference '$gate' not found in intent-routing.md"
  fi
done <<'GATES'
Gate:
problem statement
PRD
acceptance criteria
local validation
GATES
echo ""
|
|
193
|
+
|
|
194
|
+
# 10. Check headless mode handling
echo "--- 10. Headless Mode ---"
# Use -E with '|' alternation: the \| BRE form is a GNU extension that
# BSD/macOS grep does not support.
if grep -qiE "headless|non-interactive" "$ROUTING"; then
  pass "Headless/non-interactive mode handling documented"
else
  fail "No headless/non-interactive mode handling in intent-routing.md"
fi
if grep -qi "do NOT ask" "$ROUTING"; then
  pass "Explicit 'do NOT ask' directive for headless mode"
else
  warn "Missing explicit 'do NOT ask' directive for headless mode"
fi
echo ""
|
|
207
|
+
|
|
208
|
+
# Summary
echo "=== Summary ==="
echo " Passed: checks completed"
echo " Errors: $ERRORS"
echo " Warnings: $WARNINGS"
# Guard clause: any error is fatal for deployment.
if [ "$ERRORS" -gt 0 ]; then
  echo ""
  echo "FAILED: $ERRORS error(s) found. Fix before deploying."
  exit 1
fi
echo ""
echo "ALL CHECKS PASSED."
exit 0
|