npm - @codyswann/lisa - Versions diffs - 2.159.5 → 2.159.6 - Mend

@codyswann/lisa 2.159.5 → 2.159.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (61)

package/package.json CHANGED

@@ -84,7 +84,7 @@
     "lodash": ">=4.18.1"
   },
   "name": "@codyswann/lisa",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Claude Code governance framework that applies guardrails, guidance, and automated enforcement to projects",
   "main": "dist/index.js",
   "exports": {

package/plugins/lisa/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Universal governance — agents, skills, commands, hooks, and rules for all projects",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa/.codex-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Universal governance: agents, skills, commands, hooks, and rules for all projects.",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa/skills/implement/SKILL.md CHANGED

@@ -111,6 +111,17 @@ IF it is a Fix (bug), execute the Reproduce sub-flow FIRST:
    1. Write a simple API client and call the offending API
    2. Start the server on localhost and use the Playwright CLI or Chrome DevTools
+For any Fix flow, and for any Build flow that changes user-visible behavior, regression coverage is a required deliverable at the highest practical observation level for the reported surface. If the project has a browser, device, or end-to-end harness for that platform (for example Playwright, Maestro, Detox, Cypress, or an equivalent runtime), the task plan and definition of done MUST include a deterministic regression spec against the reported surface, using mocked or seeded data where needed. This is alongside unit or integration coverage, not a substitute for it.
+The team lead may not waive, defer, demote, or phrase this regression spec as "optional", "if cheap", "nice to have", or equivalent. The only permitted exits are:
+1. The project genuinely has no end-to-end harness for the affected platform; record the checked locations and that absence in the task metadata, PR, and work-item evidence.
+2. A genuine technical blocker prevents adding or executing the spec in this PR; before merge, create a linked build-ready follow-up ticket, reference it from the PR and source work item, and keep the current item blocked or explicitly non-terminal until that follow-up is accepted.
+Completion evidence for the regression spec must prove execution, not mere existence. A green CI run is insufficient unless the PR evidence includes a CI log line, reporter output, or equivalent record naming the new spec and showing that it ran and passed. Guard explicitly against `test.skip`, suite-level environment gates, shard filters, and "0 tests" passes.
+If the required regression spec is still in flight on an auto-merge-enabled PR, pause auto-merge or use an equivalent merge gate until the spec commit is pushed and its execution proof is available. The flow must not allow the PR to merge before this non-demotable deliverable is satisfied or formally blocked through the linked follow-up path above.
 Using the general-purpose agent in Team Lead session, determine how you will know that the task is fully complete. Write this as an **effective completion condition** — one an independent verifier could confirm from observed output alone, not from your assertion that it works. A strong condition has:
 - **One measurable end state** — a status code, an exit code, a row count, an observable UI state, an empty queue. Not "it looks right" or "the code is correct".
@@ -146,13 +157,15 @@ Every task MUST include this JSON metadata block. Do NOT omit `skills` (use `[]`
 Before any task is implemented, the agent team must explore the codebase for relevant research (documentation, code, git history, etc) and update each task's `metadata.relevant_documentation` with the findings.
+For Fix tasks and user-visible Build tasks, `testing_requirements` must include the highest-practical-observation regression requirement above, including the selected harness or the recorded absence/blocker path. The completion condition must include the proof command and the required CI execution evidence for the new spec.
 Each task must be reviewed by the team to make sure their verification passes.
 Each task must have their learnings reviewed by the learner subagent.
 Before shutting down the team, execute the Verify flow:
 1. Run quality gates: lint, typecheck, tests — all must pass. These are prerequisites, NOT verification.
-2. `verification-specialist`: verify locally by running the actual system and observing results (empirical proof that the change works). This is the real verification step.
+2. `verification-specialist`: verify locally by running the actual system and observing results (empirical proof that the change works). This is the real verification step. For UI-surface bugs, the proof must observe the UI surface with browser/device automation against the target environment whenever such a harness exists; unit-level or API-only proof cannot satisfy the empirical verification contract for a UI-surface defect.
 2a. **Record the verification verdict** — the independent, machine-readable proof that gates completion. The `verification-specialist` writes `${CLAUDE_PROJECT_DIR:-.}/.lisa/verification-status.json` with one entry per acceptance criterion, each carrying the proof command's observed evidence:
     ```json
@@ -169,7 +182,7 @@ Before shutting down the team, execute the Verify flow:
     Set `status: "pass"` only when every criterion is `pass` with real evidence (output from running the system, not a claim). The verdict must be judged by an agent that did NOT implement the change (the `verification-specialist`), never self-certified by the implementer. This is runtime scratch — it is gitignored and MUST NOT be committed (treat it like the secrets exclusion in the commit step).
     On Claude, the `enforce-verification-gate.sh` Stop hook reads this file and **will not let the flow stop** until it shows a terminal, all-`pass` verdict — carrying over the non-bypassable completion gate of the `/goal` primitive, but checked deterministically against real evidence rather than by a transcript-only evaluator model. If you must stop before completion (a readiness gate failed, a blocker was found, a dependency is unresolved), write the verdict with `status: "blocked"` and the reason: that records the outcome and releases the gate instead of leaving it to spin. Other harnesses fall back to this prose obligation.
-3. Write e2e test encoding the verification
+3. Write the highest-practical-observation regression test encoding the verification. For user-visible bugs or user-visible Build changes with an available browser/device/e2e harness, this means a deterministic spec on the reported surface. Prove the new spec actually executed and passed in PR CI by recording a named spec log/reporter line or equivalent execution record; green CI without that named evidence does not satisfy this step.
 4. Record Implement usage on the originating work artifact via `lisa:usage-accounting` so the work item (or other implementation-owned artifact) gains a direct `implement` usage entry in the canonical `## Lisa Usage` section. If the parent / child graph is already known, prefer `record_and_rollup` so ancestor totals refresh in the same write; otherwise still write the direct entry, and if runtime usage is unavailable, use `source: unavailable` with nullable token/cost fields instead of skipping the row.
 5. Commit ALL outstanding changes in logical batches on the branch (minus sensitive data/information) — not just changes made by the agent team. This includes pre-existing uncommitted changes that were on the branch before the plan started. Do NOT filter commits to only "task-related" files. If it shows up in git status, it gets committed (unless it contains secrets).
 6. Push the changes - if any pre-push hook blocks you, create a task for the agent team to fix the error/problem whether it was pre-existing or not
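
For illustration only (this is not part of the package, and not the logic of `enforce-verification-gate.sh`, whose source is not shown in this diff): the verdict rule in step 2a could be sketched as below. The `status` key and the `pass` / `blocked` values come from the text above; the `criteria`, `evidence`, and `reason` field names are assumptions, because the JSON schema is elided in the hunk.

```python
import json


def gate_allows_stop(verdict_text: str) -> bool:
    """Return True when the verdict is terminal.

    Terminal means either all-pass with evidence on every criterion,
    or blocked with a stated reason.
    """
    verdict = json.loads(verdict_text)
    status = verdict.get("status")

    if status == "blocked":
        # A blocked verdict releases the gate only if it records why.
        # "reason" is a hypothetical field name.
        return bool(str(verdict.get("reason") or "").strip())

    if status != "pass":
        return False

    # "criteria" and "evidence" are hypothetical field names.
    criteria = verdict.get("criteria") or []
    if not criteria:
        # A pass with no criteria proves nothing.
        return False

    return all(
        c.get("status") == "pass" and bool(str(c.get("evidence") or "").strip())
        for c in criteria
    )
```

Under these assumptions, a `pass` verdict whose criterion carries an empty evidence string is rejected, which mirrors the rule that evidence must be output from running the system rather than a claim.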

package/plugins/lisa/skills/tdd-implementation/SKILL.md CHANGED

@@ -62,6 +62,9 @@ TDD Cycle:
 - Focus on testing behavior, not implementation details
 - The test must fail before you write any production code
 - If the imported module doesn't exist, Jest reports 0 tests found (not N failed) — this is expected RED behavior
+- For a Fix task, or a Build task that changes user-visible behavior, include a regression test at the highest practical observation level for the reported surface. If the project has a browser, device, or end-to-end harness for that platform (for example Playwright, Maestro, Detox, Cypress, or an equivalent runtime), the RED test plan must include a deterministic spec against the reported surface, using mocked or seeded data where needed.
+- The team lead may not waive, defer, or mark that user-visible regression spec as optional, "if cheap", or equivalent. The only exits are a recorded absence of an end-to-end harness for the affected platform, or a genuine technical blocker with a linked build-ready follow-up ticket created before merge and referenced from the PR and source work item.
+- A regression spec is not complete merely because it exists. Completion evidence must prove the spec actually ran and passed in PR CI with a named log line, reporter output, or equivalent execution record. Guard against `test.skip`, suite-level environment gates, shard filters, and "0 tests" passes.
 ### GREEN Phase
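
For illustration only (not part of the package): one way to produce the "named execution proof" that the added bullets require is to inspect a JUnit-style XML report rather than trusting a green CI status. The sketch below assumes the harness can emit JUnit XML with `<testcase>` elements and `<skipped>`, `<failure>`, or `<error>` children; the function name and the returned labels are hypothetical.

```python
import xml.etree.ElementTree as ET


def spec_execution_proof(junit_xml: str, spec_name: str) -> str:
    """Classify whether a named spec really ran and passed.

    Returns one of: "passed", "failed", "skipped", "not-run", "no-tests".
    """
    root = ET.fromstring(junit_xml)
    cases = list(root.iter("testcase"))

    if not cases:
        # Guards against a "0 tests" pass.
        return "no-tests"

    matches = [c for c in cases if spec_name in (c.get("name") or "")]
    if not matches:
        # The spec exists in the repo but was filtered out,
        # for example by a shard filter or an environment gate.
        return "not-run"

    for case in matches:
        if case.find("skipped") is not None:
            # Guards against test.skip and similar markers.
            return "skipped"
        if case.find("failure") is not None or case.find("error") is not None:
            return "failed"

    return "passed"
```

Only a `"passed"` result would count as execution proof under this sketch; every other label corresponds to one of the failure modes the text says to guard against.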

package/plugins/lisa/skills/verification-lifecycle/SKILL.md CHANGED

@@ -58,12 +58,20 @@ For each verification type, state:
 A verification plan that only lists `bun run test`, `bun run typecheck`, or `bun run lint` is NOT a verification plan. Those are quality gates handled in step 1.
+For a user-visible Fix, or a Build change that affects user-visible behavior, the verification plan must include the highest practical observation level for the reported surface. If the project has a browser, device, or end-to-end harness for the affected platform, plan a deterministic regression spec against that surface and the empirical command that observes the same surface. Unit-level or API-only verification does not satisfy a UI-surface defect when browser/device automation is available.
+The lead cannot waive, defer, or demote this regression spec as optional, "if cheap", or equivalent. The only acceptable exits are a recorded absence of an end-to-end harness for the platform, or a genuine technical blocker that is captured before merge as a linked build-ready follow-up ticket referenced from the PR and source work item.
 ### 6. Execute
 After implementation, run the verification plan. Execute each verification type in order.
 Evidence output must explicitly label each verification result as either `verified empirically` or `artifact-only / verification deferred`. Artifact-only evidence can support a blocked escalation packet, but it cannot mark a required runtime verification complete.
+For a required user-visible regression spec, evidence must prove execution, not only existence. Record a CI log line, reporter output, or equivalent artifact that names the new spec and shows it ran and passed in the PR. A green CI run without named execution proof is not enough; explicitly check for `test.skip`, suite-level environment gates, shard filters, and "0 tests" passes.
+If auto-merge is enabled while the regression spec is still in flight, disable auto-merge or apply an equivalent merge gate until the spec commit is pushed and its CI execution proof is available. Do not let the PR merge before the required regression deliverable is satisfied or formally blocked through the linked follow-up path.
 ### 7. Codify
 After each empirical verification produces PASS evidence, invoke the `codify-verification` skill to encode the verification as an automated regression test. The manual proof becomes a repeatable check that catches future regressions.
@@ -72,6 +80,8 @@ The `codify-verification` skill maps the verification type to the appropriate fr
 Codification is mandatory for every empirical verification type with one exception set: PR, Documentation, Deploy, and Investigate-Only spikes — those have inherently non-behavioral proof. For every other type, skipping codification is not allowed; if codification is genuinely impossible (e.g., the test framework does not exist and cannot be installed in scope), escalate via the Escalation Protocol rather than silently skipping.
+For UI-surface defects with an available browser/device/e2e harness, codification must happen in that harness or the nearest surface-equivalent automated harness. Lower-level tests may be added for diagnosis or edge cases, but they do not replace the reported-surface regression spec.
 A change is not "verified" in the lifecycle sense until each empirical verification has both passed AND been codified.
 ### 8. Spec Conformance
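
For illustration only (not part of the package): the Execute and Codify rules above combine into a simple completeness check. The two evidence labels and the codification exception set are taken from the text; the `Result` record and its field names are hypothetical.

```python
from dataclasses import dataclass

VERIFIED = "verified empirically"
DEFERRED = "artifact-only / verification deferred"

# Verification types whose proof is inherently non-behavioral (from the text).
CODIFY_EXEMPT = {"PR", "Documentation", "Deploy", "Investigate-Only"}


@dataclass
class Result:
    verification_type: str  # hypothetical field: the verification type name
    label: str              # one of the two evidence labels
    codified: bool          # True once a regression test encodes the proof


def not_yet_verified(results: list[Result]) -> list[str]:
    """Return the verification types that still block a 'verified' status."""
    pending = []
    for r in results:
        if r.label not in (VERIFIED, DEFERRED):
            raise ValueError(f"unlabeled evidence for {r.verification_type}")
        if r.label != VERIFIED:
            # Artifact-only evidence cannot complete a runtime verification.
            pending.append(r.verification_type)
        elif not r.codified and r.verification_type not in CODIFY_EXEMPT:
            # Passed empirically but not yet encoded as a regression test.
            pending.append(r.verification_type)
    return pending
```

An empty list means every empirical verification has both passed and been codified, which is the condition the lifecycle text sets for calling a change "verified".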

package/plugins/lisa-agy/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Universal governance — agents, skills, commands, hooks, and rules for all projects",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-agy/skills/implement/SKILL.md CHANGED

@@ -111,6 +111,17 @@ IF it is a Fix (bug), execute the Reproduce sub-flow FIRST:
    1. Write a simple API client and call the offending API
    2. Start the server on localhost and use the Playwright CLI or Chrome DevTools
+For any Fix flow, and for any Build flow that changes user-visible behavior, regression coverage is a required deliverable at the highest practical observation level for the reported surface. If the project has a browser, device, or end-to-end harness for that platform (for example Playwright, Maestro, Detox, Cypress, or an equivalent runtime), the task plan and definition of done MUST include a deterministic regression spec against the reported surface, using mocked or seeded data where needed. This is alongside unit or integration coverage, not a substitute for it.
+The team lead may not waive, defer, demote, or phrase this regression spec as "optional", "if cheap", "nice to have", or equivalent. The only permitted exits are:
+1. The project genuinely has no end-to-end harness for the affected platform; record the checked locations and that absence in the task metadata, PR, and work-item evidence.
+2. A genuine technical blocker prevents adding or executing the spec in this PR; before merge, create a linked build-ready follow-up ticket, reference it from the PR and source work item, and keep the current item blocked or explicitly non-terminal until that follow-up is accepted.
+Completion evidence for the regression spec must prove execution, not mere existence. A green CI run is insufficient unless the PR evidence includes a CI log line, reporter output, or equivalent record naming the new spec and showing that it ran and passed. Guard explicitly against `test.skip`, suite-level environment gates, shard filters, and "0 tests" passes.
+If the required regression spec is still in flight on an auto-merge-enabled PR, pause auto-merge or use an equivalent merge gate until the spec commit is pushed and its execution proof is available. The flow must not allow the PR to merge before this non-demotable deliverable is satisfied or formally blocked through the linked follow-up path above.
 Using the general-purpose agent in Team Lead session, determine how you will know that the task is fully complete. Write this as an **effective completion condition** — one an independent verifier could confirm from observed output alone, not from your assertion that it works. A strong condition has:
 - **One measurable end state** — a status code, an exit code, a row count, an observable UI state, an empty queue. Not "it looks right" or "the code is correct".
@@ -146,13 +157,15 @@ Every task MUST include this JSON metadata block. Do NOT omit `skills` (use `[]`
 Before any task is implemented, the agent team must explore the codebase for relevant research (documentation, code, git history, etc) and update each task's `metadata.relevant_documentation` with the findings.
+For Fix tasks and user-visible Build tasks, `testing_requirements` must include the highest-practical-observation regression requirement above, including the selected harness or the recorded absence/blocker path. The completion condition must include the proof command and the required CI execution evidence for the new spec.
 Each task must be reviewed by the team to make sure their verification passes.
 Each task must have their learnings reviewed by the learner subagent.
 Before shutting down the team, execute the Verify flow:
 1. Run quality gates: lint, typecheck, tests — all must pass. These are prerequisites, NOT verification.
-2. `verification-specialist`: verify locally by running the actual system and observing results (empirical proof that the change works). This is the real verification step.
+2. `verification-specialist`: verify locally by running the actual system and observing results (empirical proof that the change works). This is the real verification step. For UI-surface bugs, the proof must observe the UI surface with browser/device automation against the target environment whenever such a harness exists; unit-level or API-only proof cannot satisfy the empirical verification contract for a UI-surface defect.
 2a. **Record the verification verdict** — the independent, machine-readable proof that gates completion. The `verification-specialist` writes `${CLAUDE_PROJECT_DIR:-.}/.lisa/verification-status.json` with one entry per acceptance criterion, each carrying the proof command's observed evidence:
     ```json
@@ -169,7 +182,7 @@ Before shutting down the team, execute the Verify flow:
     Set `status: "pass"` only when every criterion is `pass` with real evidence (output from running the system, not a claim). The verdict must be judged by an agent that did NOT implement the change (the `verification-specialist`), never self-certified by the implementer. This is runtime scratch — it is gitignored and MUST NOT be committed (treat it like the secrets exclusion in the commit step).
     On Claude, the `enforce-verification-gate.sh` Stop hook reads this file and **will not let the flow stop** until it shows a terminal, all-`pass` verdict — carrying over the non-bypassable completion gate of the `/goal` primitive, but checked deterministically against real evidence rather than by a transcript-only evaluator model. If you must stop before completion (a readiness gate failed, a blocker was found, a dependency is unresolved), write the verdict with `status: "blocked"` and the reason: that records the outcome and releases the gate instead of leaving it to spin. Other harnesses fall back to this prose obligation.
-3. Write e2e test encoding the verification
+3. Write the highest-practical-observation regression test encoding the verification. For user-visible bugs or user-visible Build changes with an available browser/device/e2e harness, this means a deterministic spec on the reported surface. Prove the new spec actually executed and passed in PR CI by recording a named spec log/reporter line or equivalent execution record; green CI without that named evidence does not satisfy this step.
 4. Record Implement usage on the originating work artifact via `lisa:usage-accounting` so the work item (or other implementation-owned artifact) gains a direct `implement` usage entry in the canonical `## Lisa Usage` section. If the parent / child graph is already known, prefer `record_and_rollup` so ancestor totals refresh in the same write; otherwise still write the direct entry, and if runtime usage is unavailable, use `source: unavailable` with nullable token/cost fields instead of skipping the row.
 5. Commit ALL outstanding changes in logical batches on the branch (minus sensitive data/information) — not just changes made by the agent team. This includes pre-existing uncommitted changes that were on the branch before the plan started. Do NOT filter commits to only "task-related" files. If it shows up in git status, it gets committed (unless it contains secrets).
 6. Push the changes - if any pre-push hook blocks you, create a task for the agent team to fix the error/problem whether it was pre-existing or not

package/plugins/lisa-agy/skills/tdd-implementation/SKILL.md CHANGED

@@ -62,6 +62,9 @@ TDD Cycle:
 - Focus on testing behavior, not implementation details
 - The test must fail before you write any production code
 - If the imported module doesn't exist, Jest reports 0 tests found (not N failed) — this is expected RED behavior
+- For a Fix task, or a Build task that changes user-visible behavior, include a regression test at the highest practical observation level for the reported surface. If the project has a browser, device, or end-to-end harness for that platform (for example Playwright, Maestro, Detox, Cypress, or an equivalent runtime), the RED test plan must include a deterministic spec against the reported surface, using mocked or seeded data where needed.
+- The team lead may not waive, defer, or mark that user-visible regression spec as optional, "if cheap", or equivalent. The only exits are a recorded absence of an end-to-end harness for the affected platform, or a genuine technical blocker with a linked build-ready follow-up ticket created before merge and referenced from the PR and source work item.
+- A regression spec is not complete merely because it exists. Completion evidence must prove the spec actually ran and passed in PR CI with a named log line, reporter output, or equivalent execution record. Guard against `test.skip`, suite-level environment gates, shard filters, and "0 tests" passes.
 ### GREEN Phase

package/plugins/lisa-agy/skills/verification-lifecycle/SKILL.md CHANGED

@@ -58,12 +58,20 @@ For each verification type, state:
 A verification plan that only lists `bun run test`, `bun run typecheck`, or `bun run lint` is NOT a verification plan. Those are quality gates handled in step 1.
+For a user-visible Fix, or a Build change that affects user-visible behavior, the verification plan must include the highest practical observation level for the reported surface. If the project has a browser, device, or end-to-end harness for the affected platform, plan a deterministic regression spec against that surface and the empirical command that observes the same surface. Unit-level or API-only verification does not satisfy a UI-surface defect when browser/device automation is available.
+The lead cannot waive, defer, or demote this regression spec as optional, "if cheap", or equivalent. The only acceptable exits are a recorded absence of an end-to-end harness for the platform, or a genuine technical blocker that is captured before merge as a linked build-ready follow-up ticket referenced from the PR and source work item.
 ### 6. Execute
 After implementation, run the verification plan. Execute each verification type in order.
 Evidence output must explicitly label each verification result as either `verified empirically` or `artifact-only / verification deferred`. Artifact-only evidence can support a blocked escalation packet, but it cannot mark a required runtime verification complete.
+For a required user-visible regression spec, evidence must prove execution, not only existence. Record a CI log line, reporter output, or equivalent artifact that names the new spec and shows it ran and passed in the PR. A green CI run without named execution proof is not enough; explicitly check for `test.skip`, suite-level environment gates, shard filters, and "0 tests" passes.
+If auto-merge is enabled while the regression spec is still in flight, disable auto-merge or apply an equivalent merge gate until the spec commit is pushed and its CI execution proof is available. Do not let the PR merge before the required regression deliverable is satisfied or formally blocked through the linked follow-up path.
 ### 7. Codify
 After each empirical verification produces PASS evidence, invoke the `codify-verification` skill to encode the verification as an automated regression test. The manual proof becomes a repeatable check that catches future regressions.
@@ -72,6 +80,8 @@ The `codify-verification` skill maps the verification type to the appropriate fr
 Codification is mandatory for every empirical verification type with one exception set: PR, Documentation, Deploy, and Investigate-Only spikes — those have inherently non-behavioral proof. For every other type, skipping codification is not allowed; if codification is genuinely impossible (e.g., the test framework does not exist and cannot be installed in scope), escalate via the Escalation Protocol rather than silently skipping.
+For UI-surface defects with an available browser/device/e2e harness, codification must happen in that harness or the nearest surface-equivalent automated harness. Lower-level tests may be added for diagnosis or edge cases, but they do not replace the reported-surface regression spec.
 A change is not "verified" in the lifecycle sense until each empirical verification has both passed AND been codified.
 ### 8. Spec Conformance

package/plugins/lisa-cdk/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-cdk",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "AWS CDK-specific plugin",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-cdk/.codex-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-cdk",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "AWS CDK-specific Lisa plugin.",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-cdk-agy/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-cdk",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "AWS CDK-specific plugin",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-cdk-copilot/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-cdk",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "AWS CDK-specific plugin",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-cdk-cursor/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-cdk",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "AWS CDK-specific plugin",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-copilot/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Universal governance — agents, skills, commands, hooks, and rules for all projects",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-copilot/skills/implement/SKILL.md CHANGED

@@ -111,6 +111,17 @@ IF it is a Fix (bug), execute the Reproduce sub-flow FIRST:
    1. Write a simple API client and call the offending API
    2. Start the server on localhost and use the Playwright CLI or Chrome DevTools
+For any Fix flow, and for any Build flow that changes user-visible behavior, regression coverage is a required deliverable at the highest practical observation level for the reported surface. If the project has a browser, device, or end-to-end harness for that platform (for example Playwright, Maestro, Detox, Cypress, or an equivalent runtime), the task plan and definition of done MUST include a deterministic regression spec against the reported surface, using mocked or seeded data where needed. This is alongside unit or integration coverage, not a substitute for it.
+The team lead may not waive, defer, demote, or phrase this regression spec as "optional", "if cheap", "nice to have", or equivalent. The only permitted exits are:
+1. The project genuinely has no end-to-end harness for the affected platform; record the checked locations and that absence in the task metadata, PR, and work-item evidence.
+2. A genuine technical blocker prevents adding or executing the spec in this PR; before merge, create a linked build-ready follow-up ticket, reference it from the PR and source work item, and keep the current item blocked or explicitly non-terminal until that follow-up is accepted.
+Completion evidence for the regression spec must prove execution, not mere existence. A green CI run is insufficient unless the PR evidence includes a CI log line, reporter output, or equivalent record naming the new spec and showing that it ran and passed. Guard explicitly against `test.skip`, suite-level environment gates, shard filters, and "0 tests" passes.
+If the required regression spec is still in flight on an auto-merge-enabled PR, pause auto-merge or use an equivalent merge gate until the spec commit is pushed and its execution proof is available. The flow must not allow the PR to merge before this non-demotable deliverable is satisfied or formally blocked through the linked follow-up path above.
 Using the general-purpose agent in Team Lead session, determine how you will know that the task is fully complete. Write this as an **effective completion condition** — one an independent verifier could confirm from observed output alone, not from your assertion that it works. A strong condition has:
 - **One measurable end state** — a status code, an exit code, a row count, an observable UI state, an empty queue. Not "it looks right" or "the code is correct".
@@ -146,13 +157,15 @@ Every task MUST include this JSON metadata block. Do NOT omit `skills` (use `[]`
 Before any task is implemented, the agent team must explore the codebase for relevant research (documentation, code, git history, etc) and update each task's `metadata.relevant_documentation` with the findings.
+For Fix tasks and user-visible Build tasks, `testing_requirements` must include the highest-practical-observation regression requirement above, including the selected harness or the recorded absence/blocker path. The completion condition must include the proof command and the required CI execution evidence for the new spec.
 Each task must be reviewed by the team to make sure their verification passes.
 Each task must have their learnings reviewed by the learner subagent.
 Before shutting down the team, execute the Verify flow:
 1. Run quality gates: lint, typecheck, tests — all must pass. These are prerequisites, NOT verification.
-2. `verification-specialist`: verify locally by running the actual system and observing results (empirical proof that the change works). This is the real verification step.
+2. `verification-specialist`: verify locally by running the actual system and observing results (empirical proof that the change works). This is the real verification step. For UI-surface bugs, the proof must observe the UI surface with browser/device automation against the target environment whenever such a harness exists; unit-level or API-only proof cannot satisfy the empirical verification contract for a UI-surface defect.
 2a. **Record the verification verdict** — the independent, machine-readable proof that gates completion. The `verification-specialist` writes `${CLAUDE_PROJECT_DIR:-.}/.lisa/verification-status.json` with one entry per acceptance criterion, each carrying the proof command's observed evidence:
     ```json
@@ -169,7 +182,7 @@ Before shutting down the team, execute the Verify flow:
     Set `status: "pass"` only when every criterion is `pass` with real evidence (output from running the system, not a claim). The verdict must be judged by an agent that did NOT implement the change (the `verification-specialist`), never self-certified by the implementer. This is runtime scratch — it is gitignored and MUST NOT be committed (treat it like the secrets exclusion in the commit step).
     On Claude, the `enforce-verification-gate.sh` Stop hook reads this file and **will not let the flow stop** until it shows a terminal, all-`pass` verdict — carrying over the non-bypassable completion gate of the `/goal` primitive, but checked deterministically against real evidence rather than by a transcript-only evaluator model. If you must stop before completion (a readiness gate failed, a blocker was found, a dependency is unresolved), write the verdict with `status: "blocked"` and the reason: that records the outcome and releases the gate instead of leaving it to spin. Other harnesses fall back to this prose obligation.
-3. Write e2e test encoding the verification
+3. Write the highest-practical-observation regression test encoding the verification. For user-visible bugs or user-visible Build changes with an available browser/device/e2e harness, this means a deterministic spec on the reported surface. Prove the new spec actually executed and passed in PR CI by recording a named spec log/reporter line or equivalent execution record; green CI without that named evidence does not satisfy this step.
 4. Record Implement usage on the originating work artifact via `lisa:usage-accounting` so the work item (or other implementation-owned artifact) gains a direct `implement` usage entry in the canonical `## Lisa Usage` section. If the parent / child graph is already known, prefer `record_and_rollup` so ancestor totals refresh in the same write; otherwise still write the direct entry, and if runtime usage is unavailable, use `source: unavailable` with nullable token/cost fields instead of skipping the row.
 5. Commit ALL outstanding changes in logical batches on the branch (minus sensitive data/information) — not just changes made by the agent team. This includes pre-existing uncommitted changes that were on the branch before the plan started. Do NOT filter commits to only "task-related" files. If it shows up in git status, it gets committed (unless it contains secrets).
 6. Push the changes - if any pre-push hook blocks you, create a task for the agent team to fix the error/problem whether it was pre-existing or not
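
Reviewer sketch for the hunk above: step 2a says the gate releases only on an all-`pass` verdict backed by evidence, or on a `blocked` verdict that carries a reason. The JSON schema itself is elided by the hunk boundary, so the field names below (`criteria`, `criterion`, `evidence`, `reason`) and the `evaluateVerdict` helper are hypothetical. This is not the logic of `enforce-verification-gate.sh`, only one way the stated rule could be checked.

```typescript
// Hypothetical verdict shape: the diff does not show the real schema.
interface CriterionEntry {
  criterion: string;
  status: "pass" | "fail";
  evidence: string; // observed output from running the proof command
}

interface Verdict {
  status: "pass" | "fail" | "blocked";
  reason?: string;
  criteria: CriterionEntry[];
}

interface GateDecision {
  release: boolean;
  why: string;
}

// Decide whether a completion gate may release, following the prose rule:
// all criteria pass with non-empty evidence, or the verdict is blocked with a reason.
function evaluateVerdict(verdict: Verdict): GateDecision {
  if (verdict.status === "blocked") {
    const reason = (verdict.reason ?? "").trim();
    return reason.length > 0
      ? { release: true, why: `blocked: ${reason}` }
      : { release: false, why: "blocked verdict is missing a reason" };
  }
  if (verdict.status !== "pass") {
    return { release: false, why: "verdict status is not pass" };
  }
  if (verdict.criteria.length === 0) {
    return { release: false, why: "no acceptance criteria recorded" };
  }
  const weak = verdict.criteria.find(
    (c) => c.status !== "pass" || c.evidence.trim().length === 0,
  );
  return weak
    ? { release: false, why: `criterion lacks passing evidence: ${weak.criterion}` }
    : { release: true, why: "all criteria pass with evidence" };
}
```

A top-level `pass` with an empty `evidence` string is rejected here on purpose, since the hunk requires output from running the system rather than a claim.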

package/plugins/lisa-copilot/skills/tdd-implementation/SKILL.md CHANGED

@@ -62,6 +62,9 @@ TDD Cycle:
 - Focus on testing behavior, not implementation details
 - The test must fail before you write any production code
 - If the imported module doesn't exist, Jest reports 0 tests found (not N failed) — this is expected RED behavior
+- For a Fix task, or a Build task that changes user-visible behavior, include a regression test at the highest practical observation level for the reported surface. If the project has a browser, device, or end-to-end harness for that platform (for example Playwright, Maestro, Detox, Cypress, or an equivalent runtime), the RED test plan must include a deterministic spec against the reported surface, using mocked or seeded data where needed.
+- The team lead may not waive, defer, or mark that user-visible regression spec as optional, "if cheap", or equivalent. The only exits are a recorded absence of an end-to-end harness for the affected platform, or a genuine technical blocker with a linked build-ready follow-up ticket created before merge and referenced from the PR and source work item.
+- A regression spec is not complete merely because it exists. Completion evidence must prove the spec actually ran and passed in PR CI with a named log line, reporter output, or equivalent execution record. Guard against `test.skip`, suite-level environment gates, shard filters, and "0 tests" passes.
 ### GREEN Phase

package/plugins/lisa-copilot/skills/verification-lifecycle/SKILL.md CHANGED

@@ -58,12 +58,20 @@ For each verification type, state:
 A verification plan that only lists `bun run test`, `bun run typecheck`, or `bun run lint` is NOT a verification plan. Those are quality gates handled in step 1.
+For a user-visible Fix, or a Build change that affects user-visible behavior, the verification plan must include the highest practical observation level for the reported surface. If the project has a browser, device, or end-to-end harness for the affected platform, plan a deterministic regression spec against that surface and the empirical command that observes the same surface. Unit-level or API-only verification does not satisfy a UI-surface defect when browser/device automation is available.
+The lead cannot waive, defer, or demote this regression spec as optional, "if cheap", or equivalent. The only acceptable exits are a recorded absence of an end-to-end harness for the platform, or a genuine technical blocker that is captured before merge as a linked build-ready follow-up ticket referenced from the PR and source work item.
 ### 6. Execute
 After implementation, run the verification plan. Execute each verification type in order.
 Evidence output must explicitly label each verification result as either `verified empirically` or `artifact-only / verification deferred`. Artifact-only evidence can support a blocked escalation packet, but it cannot mark a required runtime verification complete.
+For a required user-visible regression spec, evidence must prove execution, not only existence. Record a CI log line, reporter output, or equivalent artifact that names the new spec and shows it ran and passed in the PR. A green CI run without named execution proof is not enough; explicitly check for `test.skip`, suite-level environment gates, shard filters, and "0 tests" passes.
+If auto-merge is enabled while the regression spec is still in flight, disable auto-merge or apply an equivalent merge gate until the spec commit is pushed and its CI execution proof is available. Do not let the PR merge before the required regression deliverable is satisfied or formally blocked through the linked follow-up path.
 ### 7. Codify
 After each empirical verification produces PASS evidence, invoke the `codify-verification` skill to encode the verification as an automated regression test. The manual proof becomes a repeatable check that catches future regressions.
@@ -72,6 +80,8 @@ The `codify-verification` skill maps the verification type to the appropriate fr
 Codification is mandatory for every empirical verification type with one exception set: PR, Documentation, Deploy, and Investigate-Only spikes — those have inherently non-behavioral proof. For every other type, skipping codification is not allowed; if codification is genuinely impossible (e.g., the test framework does not exist and cannot be installed in scope), escalate via the Escalation Protocol rather than silently skipping.
+For UI-surface defects with an available browser/device/e2e harness, codification must happen in that harness or the nearest surface-equivalent automated harness. Lower-level tests may be added for diagnosis or edge cases, but they do not replace the reported-surface regression spec.
 A change is not "verified" in the lifecycle sense until each empirical verification has both passed AND been codified.
 ### 8. Spec Conformance

package/plugins/lisa-cursor/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Universal governance — agents, skills, commands, hooks, and rules for all projects",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-cursor/skills/implement/SKILL.md CHANGED

@@ -111,6 +111,17 @@ IF it is a Fix (bug), execute the Reproduce sub-flow FIRST:
    1. Write a simple API client and call the offending API
    2. Start the server on localhost and use the Playwright CLI or Chrome DevTools
+For any Fix flow, and for any Build flow that changes user-visible behavior, regression coverage is a required deliverable at the highest practical observation level for the reported surface. If the project has a browser, device, or end-to-end harness for that platform (for example Playwright, Maestro, Detox, Cypress, or an equivalent runtime), the task plan and definition of done MUST include a deterministic regression spec against the reported surface, using mocked or seeded data where needed. This is alongside unit or integration coverage, not a substitute for it.
+The team lead may not waive, defer, demote, or phrase this regression spec as "optional", "if cheap", "nice to have", or equivalent. The only permitted exits are:
+1. The project genuinely has no end-to-end harness for the affected platform; record the checked locations and that absence in the task metadata, PR, and work-item evidence.
+2. A genuine technical blocker prevents adding or executing the spec in this PR; before merge, create a linked build-ready follow-up ticket, reference it from the PR and source work item, and keep the current item blocked or explicitly non-terminal until that follow-up is accepted.
+Completion evidence for the regression spec must prove execution, not mere existence. A green CI run is insufficient unless the PR evidence includes a CI log line, reporter output, or equivalent record naming the new spec and showing that it ran and passed. Guard explicitly against `test.skip`, suite-level environment gates, shard filters, and "0 tests" passes.
+If the required regression spec is still in flight on an auto-merge-enabled PR, pause auto-merge or use an equivalent merge gate until the spec commit is pushed and its execution proof is available. The flow must not allow the PR to merge before this non-demotable deliverable is satisfied or formally blocked through the linked follow-up path above.
 Using the general-purpose agent in Team Lead session, determine how you will know that the task is fully complete. Write this as an **effective completion condition** — one an independent verifier could confirm from observed output alone, not from your assertion that it works. A strong condition has:
 - **One measurable end state** — a status code, an exit code, a row count, an observable UI state, an empty queue. Not "it looks right" or "the code is correct".
@@ -146,13 +157,15 @@ Every task MUST include this JSON metadata block. Do NOT omit `skills` (use `[]`
 Before any task is implemented, the agent team must explore the codebase for relevant research (documentation, code, git history, etc) and update each task's `metadata.relevant_documentation` with the findings.
+For Fix tasks and user-visible Build tasks, `testing_requirements` must include the highest-practical-observation regression requirement above, including the selected harness or the recorded absence/blocker path. The completion condition must include the proof command and the required CI execution evidence for the new spec.
 Each task must be reviewed by the team to make sure their verification passes.
 Each task must have their learnings reviewed by the learner subagent.
 Before shutting down the team, execute the Verify flow:
 1. Run quality gates: lint, typecheck, tests — all must pass. These are prerequisites, NOT verification.
-2. `verification-specialist`: verify locally by running the actual system and observing results (empirical proof that the change works). This is the real verification step.
+2. `verification-specialist`: verify locally by running the actual system and observing results (empirical proof that the change works). This is the real verification step. For UI-surface bugs, the proof must observe the UI surface with browser/device automation against the target environment whenever such a harness exists; unit-level or API-only proof cannot satisfy the empirical verification contract for a UI-surface defect.
 2a. **Record the verification verdict** — the independent, machine-readable proof that gates completion. The `verification-specialist` writes `${CLAUDE_PROJECT_DIR:-.}/.lisa/verification-status.json` with one entry per acceptance criterion, each carrying the proof command's observed evidence:
     ```json
@@ -169,7 +182,7 @@ Before shutting down the team, execute the Verify flow:
     Set `status: "pass"` only when every criterion is `pass` with real evidence (output from running the system, not a claim). The verdict must be judged by an agent that did NOT implement the change (the `verification-specialist`), never self-certified by the implementer. This is runtime scratch — it is gitignored and MUST NOT be committed (treat it like the secrets exclusion in the commit step).
     On Claude, the `enforce-verification-gate.sh` Stop hook reads this file and **will not let the flow stop** until it shows a terminal, all-`pass` verdict — carrying over the non-bypassable completion gate of the `/goal` primitive, but checked deterministically against real evidence rather than by a transcript-only evaluator model. If you must stop before completion (a readiness gate failed, a blocker was found, a dependency is unresolved), write the verdict with `status: "blocked"` and the reason: that records the outcome and releases the gate instead of leaving it to spin. Other harnesses fall back to this prose obligation.
-3. Write e2e test encoding the verification
+3. Write the highest-practical-observation regression test encoding the verification. For user-visible bugs or user-visible Build changes with an available browser/device/e2e harness, this means a deterministic spec on the reported surface. Prove the new spec actually executed and passed in PR CI by recording a named spec log/reporter line or equivalent execution record; green CI without that named evidence does not satisfy this step.
 4. Record Implement usage on the originating work artifact via `lisa:usage-accounting` so the work item (or other implementation-owned artifact) gains a direct `implement` usage entry in the canonical `## Lisa Usage` section. If the parent / child graph is already known, prefer `record_and_rollup` so ancestor totals refresh in the same write; otherwise still write the direct entry, and if runtime usage is unavailable, use `source: unavailable` with nullable token/cost fields instead of skipping the row.
 5. Commit ALL outstanding changes in logical batches on the branch (minus sensitive data/information) — not just changes made by the agent team. This includes pre-existing uncommitted changes that were on the branch before the plan started. Do NOT filter commits to only "task-related" files. If it shows up in git status, it gets committed (unless it contains secrets).
 6. Push the changes - if any pre-push hook blocks you, create a task for the agent team to fix the error/problem whether it was pre-existing or not
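
Reviewer sketch for the completion-evidence rule in this hunk: a green CI run counts only if a record names the new spec and shows it ran and passed, with `test.skip`, shard filters, and "0 tests" runs ruled out. The `TestRecord` shape and `proveSpecExecuted` helper below are hypothetical. Real reporters (Playwright, Cypress, Detox, and so on) each emit their own format, so a project would first need to map that output into records like these.

```typescript
// Hypothetical normalized record; real reporter output differs per harness.
interface TestRecord {
  specFile: string;
  title: string;
  status: "passed" | "failed" | "skipped";
}

interface ExecutionProof {
  proven: boolean;
  reason: string;
}

// Check that the named spec appears in the CI results and that every one
// of its tests actually ran and passed.
function proveSpecExecuted(records: TestRecord[], specFile: string): ExecutionProof {
  const matching = records.filter((r) => r.specFile === specFile);
  if (matching.length === 0) {
    // Covers "0 tests" passes and shard or environment filters that dropped the spec.
    return { proven: false, reason: `no executed test names ${specFile}` };
  }
  const failed = matching.filter((r) => r.status === "failed");
  if (failed.length > 0) {
    return { proven: false, reason: `${failed.length} test(s) failed in ${specFile}` };
  }
  const skipped = matching.filter((r) => r.status === "skipped");
  if (skipped.length > 0) {
    // Conservative choice for this sketch: any skipped test blocks the proof.
    return { proven: false, reason: `${skipped.length} test(s) skipped in ${specFile}` };
  }
  return { proven: true, reason: `${matching.length} test(s) ran and passed in ${specFile}` };
}
```

Treating any skipped test as a failure of proof is stricter than the hunk demands. A project could relax it, but the strict form keeps a stray `test.skip` from hiding behind a passing sibling test.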

package/plugins/lisa-cursor/skills/tdd-implementation/SKILL.md CHANGED

@@ -62,6 +62,9 @@ TDD Cycle:
 - Focus on testing behavior, not implementation details
 - The test must fail before you write any production code
 - If the imported module doesn't exist, Jest reports 0 tests found (not N failed) — this is expected RED behavior
+- For a Fix task, or a Build task that changes user-visible behavior, include a regression test at the highest practical observation level for the reported surface. If the project has a browser, device, or end-to-end harness for that platform (for example Playwright, Maestro, Detox, Cypress, or an equivalent runtime), the RED test plan must include a deterministic spec against the reported surface, using mocked or seeded data where needed.
+- The team lead may not waive, defer, or mark that user-visible regression spec as optional, "if cheap", or equivalent. The only exits are a recorded absence of an end-to-end harness for the affected platform, or a genuine technical blocker with a linked build-ready follow-up ticket created before merge and referenced from the PR and source work item.
+- A regression spec is not complete merely because it exists. Completion evidence must prove the spec actually ran and passed in PR CI with a named log line, reporter output, or equivalent execution record. Guard against `test.skip`, suite-level environment gates, shard filters, and "0 tests" passes.
 ### GREEN Phase

package/plugins/lisa-cursor/skills/verification-lifecycle/SKILL.md CHANGED

@@ -58,12 +58,20 @@ For each verification type, state:
 A verification plan that only lists `bun run test`, `bun run typecheck`, or `bun run lint` is NOT a verification plan. Those are quality gates handled in step 1.
+For a user-visible Fix, or a Build change that affects user-visible behavior, the verification plan must include the highest practical observation level for the reported surface. If the project has a browser, device, or end-to-end harness for the affected platform, plan a deterministic regression spec against that surface and the empirical command that observes the same surface. Unit-level or API-only verification does not satisfy a UI-surface defect when browser/device automation is available.
+The lead cannot waive, defer, or demote this regression spec as optional, "if cheap", or equivalent. The only acceptable exits are a recorded absence of an end-to-end harness for the platform, or a genuine technical blocker that is captured before merge as a linked build-ready follow-up ticket referenced from the PR and source work item.
 ### 6. Execute
 After implementation, run the verification plan. Execute each verification type in order.
 Evidence output must explicitly label each verification result as either `verified empirically` or `artifact-only / verification deferred`. Artifact-only evidence can support a blocked escalation packet, but it cannot mark a required runtime verification complete.
+For a required user-visible regression spec, evidence must prove execution, not only existence. Record a CI log line, reporter output, or equivalent artifact that names the new spec and shows it ran and passed in the PR. A green CI run without named execution proof is not enough; explicitly check for `test.skip`, suite-level environment gates, shard filters, and "0 tests" passes.
+If auto-merge is enabled while the regression spec is still in flight, disable auto-merge or apply an equivalent merge gate until the spec commit is pushed and its CI execution proof is available. Do not let the PR merge before the required regression deliverable is satisfied or formally blocked through the linked follow-up path.
 ### 7. Codify
 After each empirical verification produces PASS evidence, invoke the `codify-verification` skill to encode the verification as an automated regression test. The manual proof becomes a repeatable check that catches future regressions.
@@ -72,6 +80,8 @@ The `codify-verification` skill maps the verification type to the appropriate fr
 Codification is mandatory for every empirical verification type with one exception set: PR, Documentation, Deploy, and Investigate-Only spikes — those have inherently non-behavioral proof. For every other type, skipping codification is not allowed; if codification is genuinely impossible (e.g., the test framework does not exist and cannot be installed in scope), escalate via the Escalation Protocol rather than silently skipping.
+For UI-surface defects with an available browser/device/e2e harness, codification must happen in that harness or the nearest surface-equivalent automated harness. Lower-level tests may be added for diagnosis or edge cases, but they do not replace the reported-surface regression spec.
 A change is not "verified" in the lifecycle sense until each empirical verification has both passed AND been codified.
 ### 8. Spec Conformance

package/plugins/lisa-expo/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-expo",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Expo/React Native-specific skills, agents, rules, and MCP servers",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-expo/.codex-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-expo",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Expo and React Native-specific skills, agents, rules, and MCP servers.",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-expo-agy/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-expo",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Expo/React Native-specific skills, agents, rules, and MCP servers",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-expo-copilot/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-expo",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Expo/React Native-specific skills, agents, rules, and MCP servers",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-expo-cursor/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-expo",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Expo/React Native-specific skills, agents, rules, and MCP servers",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-harper-fabric/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-harper-fabric",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Harper/Fabric-specific rules for TypeScript component apps",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-harper-fabric/.codex-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-harper-fabric",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Harper/Fabric-specific Lisa rules for TypeScript component apps.",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-harper-fabric-agy/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-harper-fabric",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Harper/Fabric-specific rules for TypeScript component apps",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-harper-fabric-copilot/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-harper-fabric",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Harper/Fabric-specific rules for TypeScript component apps",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-harper-fabric-cursor/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-harper-fabric",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Harper/Fabric-specific rules for TypeScript component apps",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-nestjs/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-nestjs",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "NestJS-specific skills (GraphQL, TypeORM) and hooks (migration write-protection)",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-nestjs/.codex-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-nestjs",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "NestJS-specific skills and migration write-protection hooks.",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-nestjs-agy/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-nestjs",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "NestJS-specific skills (GraphQL, TypeORM) and hooks (migration write-protection)",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-nestjs-copilot/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-nestjs",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "NestJS-specific skills (GraphQL, TypeORM) and hooks (migration write-protection)",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-nestjs-cursor/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-nestjs",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "NestJS-specific skills (GraphQL, TypeORM) and hooks (migration write-protection)",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-openclaw/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-openclaw",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Connect staff roles to Telegram or Slack via OpenClaw — facilitator/specialist hub-and-spoke routing and repo-coding topics, for Claude Code and Codex",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-openclaw/.codex-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-openclaw",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Connect staff roles to Telegram or Slack via OpenClaw — facilitator/specialist hub-and-spoke routing and repo-coding topics, across Claude and Codex.",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-openclaw-agy/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-openclaw",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Connect staff roles to Telegram or Slack via OpenClaw — facilitator/specialist hub-and-spoke routing and repo-coding topics, for Claude Code and Codex",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-openclaw-copilot/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-openclaw",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Connect staff roles to Telegram or Slack via OpenClaw — facilitator/specialist hub-and-spoke routing and repo-coding topics, for Claude Code and Codex",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-openclaw-cursor/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-openclaw",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Connect staff roles to Telegram or Slack via OpenClaw — facilitator/specialist hub-and-spoke routing and repo-coding topics, for Claude Code and Codex",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-rails/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-rails",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Ruby on Rails-specific hooks — RuboCop linting/formatting and ast-grep scanning on edit",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-rails/.codex-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-rails",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Ruby on Rails-specific skills and hooks for RuboCop and ast-grep scanning on edit.",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-rails-agy/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-rails",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Ruby on Rails-specific hooks — RuboCop linting/formatting and ast-grep scanning on edit",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-rails-copilot/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-rails",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Ruby on Rails-specific hooks — RuboCop linting/formatting and ast-grep scanning on edit",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-rails-cursor/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-rails",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Ruby on Rails-specific hooks — RuboCop linting/formatting and ast-grep scanning on edit",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-typescript/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-typescript",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "TypeScript-specific hooks — Prettier formatting, ESLint linting, ast-grep scanning, and error-suppression blocking on edit",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-typescript/.codex-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-typescript",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "TypeScript-specific hooks for formatting, linting, and ast-grep scanning on edit.",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-typescript-agy/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-typescript",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "TypeScript-specific hooks — Prettier formatting, ESLint linting, ast-grep scanning, and error-suppression blocking on edit",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-typescript-copilot/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-typescript",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "TypeScript-specific hooks — Prettier formatting, ESLint linting, ast-grep scanning, and error-suppression blocking on edit",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-typescript-cursor/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-typescript",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "TypeScript-specific hooks — Prettier formatting, ESLint linting, ast-grep scanning, and error-suppression blocking on edit",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-wiki/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-wiki",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "LLM Wiki — a distributable, git-native markdown knowledge base for Claude Code and Codex",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-wiki/.codex-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-wiki",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "Distributable LLM Wiki kernel — ingest, query, lint, and maintain a git-native markdown knowledge base across Claude and Codex.",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-wiki-agy/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-wiki",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "LLM Wiki — a distributable, git-native markdown knowledge base for Claude Code and Codex",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-wiki-copilot/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-wiki",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "LLM Wiki — a distributable, git-native markdown knowledge base for Claude Code and Codex",
   "author": {
     "name": "Cody Swann"

package/plugins/lisa-wiki-cursor/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lisa-wiki",
-  "version": "2.159.5",
+  "version": "2.159.6",
   "description": "LLM Wiki — a distributable, git-native markdown knowledge base for Claude Code and Codex",
   "author": {
     "name": "Cody Swann"

package/plugins/src/base/skills/implement/SKILL.md CHANGED

@@ -111,6 +111,17 @@ IF it is a Fix (bug), execute the Reproduce sub-flow FIRST:
    1. Write a simple API client and call the offending API
    2. Start the server on localhost and use the Playwright CLI or Chrome DevTools
+For any Fix flow, and for any Build flow that changes user-visible behavior, regression coverage is a required deliverable at the highest practical observation level for the reported surface. If the project has a browser, device, or end-to-end harness for that platform (for example Playwright, Maestro, Detox, Cypress, or an equivalent runtime), the task plan and definition of done MUST include a deterministic regression spec against the reported surface, using mocked or seeded data where needed. This is alongside unit or integration coverage, not a substitute for it.
+The team lead may not waive, defer, demote, or phrase this regression spec as "optional", "if cheap", "nice to have", or equivalent. The only permitted exits are:
+1. The project genuinely has no end-to-end harness for the affected platform; record the checked locations and that absence in the task metadata, PR, and work-item evidence.
+2. A genuine technical blocker prevents adding or executing the spec in this PR; before merge, create a linked build-ready follow-up ticket, reference it from the PR and source work item, and keep the current item blocked or explicitly non-terminal until that follow-up is accepted.
+Completion evidence for the regression spec must prove execution, not mere existence. A green CI run is insufficient unless the PR evidence includes a CI log line, reporter output, or equivalent record naming the new spec and showing that it ran and passed. Guard explicitly against `test.skip`, suite-level environment gates, shard filters, and "0 tests" passes.
+If the required regression spec is still in flight on an auto-merge-enabled PR, pause auto-merge or use an equivalent merge gate until the spec commit is pushed and its execution proof is available. The flow must not allow the PR to merge before this non-demotable deliverable is satisfied or formally blocked through the linked follow-up path above.
 Using the general-purpose agent in Team Lead session, determine how you will know that the task is fully complete. Write this as an **effective completion condition** — one an independent verifier could confirm from observed output alone, not from your assertion that it works. A strong condition has:
 - **One measurable end state** — a status code, an exit code, a row count, an observable UI state, an empty queue. Not "it looks right" or "the code is correct".
@@ -146,13 +157,15 @@ Every task MUST include this JSON metadata block. Do NOT omit `skills` (use `[]`
 Before any task is implemented, the agent team must explore the codebase for relevant research (documentation, code, git history, etc) and update each task's `metadata.relevant_documentation` with the findings.
+For Fix tasks and user-visible Build tasks, `testing_requirements` must include the highest-practical-observation regression requirement above, including the selected harness or the recorded absence/blocker path. The completion condition must include the proof command and the required CI execution evidence for the new spec.
 Each task must be reviewed by the team to make sure their verification passes.
 Each task must have their learnings reviewed by the learner subagent.
 Before shutting down the team, execute the Verify flow:
 1. Run quality gates: lint, typecheck, tests — all must pass. These are prerequisites, NOT verification.
-2. `verification-specialist`: verify locally by running the actual system and observing results (empirical proof that the change works). This is the real verification step.
+2. `verification-specialist`: verify locally by running the actual system and observing results (empirical proof that the change works). This is the real verification step. For UI-surface bugs, the proof must observe the UI surface with browser/device automation against the target environment whenever such a harness exists; unit-level or API-only proof cannot satisfy the empirical verification contract for a UI-surface defect.
 2a. **Record the verification verdict** — the independent, machine-readable proof that gates completion. The `verification-specialist` writes `${CLAUDE_PROJECT_DIR:-.}/.lisa/verification-status.json` with one entry per acceptance criterion, each carrying the proof command's observed evidence:
     ```json
@@ -169,7 +182,7 @@ Before shutting down the team, execute the Verify flow:
     Set `status: "pass"` only when every criterion is `pass` with real evidence (output from running the system, not a claim). The verdict must be judged by an agent that did NOT implement the change (the `verification-specialist`), never self-certified by the implementer. This is runtime scratch — it is gitignored and MUST NOT be committed (treat it like the secrets exclusion in the commit step).
     On Claude, the `enforce-verification-gate.sh` Stop hook reads this file and **will not let the flow stop** until it shows a terminal, all-`pass` verdict — carrying over the non-bypassable completion gate of the `/goal` primitive, but checked deterministically against real evidence rather than by a transcript-only evaluator model. If you must stop before completion (a readiness gate failed, a blocker was found, a dependency is unresolved), write the verdict with `status: "blocked"` and the reason: that records the outcome and releases the gate instead of leaving it to spin. Other harnesses fall back to this prose obligation.
-3. Write e2e test encoding the verification
+3. Write the highest-practical-observation regression test encoding the verification. For user-visible bugs or user-visible Build changes with an available browser/device/e2e harness, this means a deterministic spec on the reported surface. Prove the new spec actually executed and passed in PR CI by recording a named spec log/reporter line or equivalent execution record; green CI without that named evidence does not satisfy this step.
 4. Record Implement usage on the originating work artifact via `lisa:usage-accounting` so the work item (or other implementation-owned artifact) gains a direct `implement` usage entry in the canonical `## Lisa Usage` section. If the parent / child graph is already known, prefer `record_and_rollup` so ancestor totals refresh in the same write; otherwise still write the direct entry, and if runtime usage is unavailable, use `source: unavailable` with nullable token/cost fields instead of skipping the row.
 5. Commit ALL outstanding changes in logical batches on the branch (minus sensitive data/information) — not just changes made by the agent team. This includes pre-existing uncommitted changes that were on the branch before the plan started. Do NOT filter commits to only "task-related" files. If it shows up in git status, it gets committed (unless it contains secrets).
 6. Push the changes - if any pre-push hook blocks you, create a task for the agent team to fix the error/problem whether it was pre-existing or not
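
Note on the execution-proof requirement added in this file: the hunk asks for evidence that names the new spec and shows it ran and passed, and it calls out skipped specs and "0 tests" passes. A minimal sketch of such a check follows, assuming the harness can emit a JUnit-style XML report. The function name `spec_execution_proof` and the spec names used with it are hypothetical and are not part of the package.

```python
import xml.etree.ElementTree as ET


def spec_execution_proof(junit_xml: str, spec_name: str) -> tuple[bool, str]:
    """Return (ok, reason): did a test case matching `spec_name` run and pass?

    Assumes a JUnit-style report made of <testcase> elements, where a skipped
    case carries a <skipped> child and a failed case carries <failure> or <error>.
    """
    root = ET.fromstring(junit_xml)
    cases = list(root.iter("testcase"))

    # A run that collected nothing can still be "green"; treat it as no proof.
    if not cases:
        return False, "report contains 0 tests"

    # The proof must name the new spec; a shard filter or environment gate
    # would typically make it disappear from the report entirely.
    matches = [c for c in cases if spec_name in (c.get("name") or "")]
    if not matches:
        return False, f"no test case matching {spec_name!r} in report"

    for case in matches:
        if case.find("skipped") is not None:
            return False, "spec was skipped"
        if case.find("failure") is not None or case.find("error") is not None:
            return False, "spec failed"

    return True, f"{len(matches)} matching test case(s) ran and passed"
```

The returned reason string is the kind of named record that could be pasted into PR evidence; a real setup would read the report file that the project's own reporter writes.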

package/plugins/src/base/skills/tdd-implementation/SKILL.md CHANGED

@@ -62,6 +62,9 @@ TDD Cycle:
 - Focus on testing behavior, not implementation details
 - The test must fail before you write any production code
 - If the imported module doesn't exist, Jest reports 0 tests found (not N failed) — this is expected RED behavior
+- For a Fix task, or a Build task that changes user-visible behavior, include a regression test at the highest practical observation level for the reported surface. If the project has a browser, device, or end-to-end harness for that platform (for example Playwright, Maestro, Detox, Cypress, or an equivalent runtime), the RED test plan must include a deterministic spec against the reported surface, using mocked or seeded data where needed.
+- The team lead may not waive, defer, or mark that user-visible regression spec as optional, "if cheap", or equivalent. The only exits are a recorded absence of an end-to-end harness for the affected platform, or a genuine technical blocker with a linked build-ready follow-up ticket created before merge and referenced from the PR and source work item.
+- A regression spec is not complete merely because it exists. Completion evidence must prove the spec actually ran and passed in PR CI with a named log line, reporter output, or equivalent execution record. Guard against `test.skip`, suite-level environment gates, shard filters, and "0 tests" passes.
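
The guard against `test.skip` in the bullets above can be partly automated with a static scan of the spec source before trusting a green run. The sketch below is a heuristic only: it assumes JavaScript/TypeScript spec files that use `test`, `it`, or `describe` with `.skip` or `.fixme`, it will also flag matches inside comments, and it does not detect environment gates or shard filters. The helper name `find_skip_markers` is hypothetical.

```python
import re

# Matches markers such as test.skip, it.skip, describe.skip, test.fixme.
SKIP_PATTERN = re.compile(r"\b(?:test|it|describe)\.(?:skip|fixme)\b")


def find_skip_markers(source: str) -> list[tuple[int, str]]:
    """Return (line_number, marker) pairs for every skip marker in `source`."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in SKIP_PATTERN.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

An empty result does not prove the spec executed; it only rules out one of the failure modes listed above, so the named CI execution record is still needed.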
 ### GREEN Phase

package/plugins/src/base/skills/verification-lifecycle/SKILL.md CHANGED

@@ -58,12 +58,20 @@ For each verification type, state:
 A verification plan that only lists `bun run test`, `bun run typecheck`, or `bun run lint` is NOT a verification plan. Those are quality gates handled in step 1.
+For a user-visible Fix, or a Build change that affects user-visible behavior, the verification plan must include the highest practical observation level for the reported surface. If the project has a browser, device, or end-to-end harness for the affected platform, plan a deterministic regression spec against that surface and the empirical command that observes the same surface. Unit-level or API-only verification does not satisfy a UI-surface defect when browser/device automation is available.
+The lead cannot waive, defer, or demote this regression spec as optional, "if cheap", or equivalent. The only acceptable exits are a recorded absence of an end-to-end harness for the platform, or a genuine technical blocker that is captured before merge as a linked build-ready follow-up ticket referenced from the PR and source work item.
 ### 6. Execute
 After implementation, run the verification plan. Execute each verification type in order.
 Evidence output must explicitly label each verification result as either `verified empirically` or `artifact-only / verification deferred`. Artifact-only evidence can support a blocked escalation packet, but it cannot mark a required runtime verification complete.
+For a required user-visible regression spec, evidence must prove execution, not only existence. Record a CI log line, reporter output, or equivalent artifact that names the new spec and shows it ran and passed in the PR. A green CI run without named execution proof is not enough; explicitly check for `test.skip`, suite-level environment gates, shard filters, and "0 tests" passes.
+If auto-merge is enabled while the regression spec is still in flight, disable auto-merge or apply an equivalent merge gate until the spec commit is pushed and its CI execution proof is available. Do not let the PR merge before the required regression deliverable is satisfied or formally blocked through the linked follow-up path.
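
The merge-gate rule in the two paragraphs above reduces to a small decision: the PR may merge only once the spec commit is pushed and its execution proof is recorded, or once the formal follow-up path is in place. A minimal sketch of that decision, with hypothetical type, field, and function names that do not come from the package:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegressionSpecState:
    """Hypothetical snapshot of the required regression deliverable on a PR."""
    spec_commit_pushed: bool
    execution_proof_recorded: bool
    followup_ticket_linked: bool = False  # formal blocker path via linked ticket


def may_merge(state: RegressionSpecState) -> bool:
    """True when the deliverable is satisfied or formally blocked."""
    satisfied = state.spec_commit_pushed and state.execution_proof_recorded
    return satisfied or state.followup_ticket_linked


def auto_merge_action(state: RegressionSpecState, auto_merge_enabled: bool) -> str:
    """Decide whether auto-merge has to be paused for this PR."""
    if auto_merge_enabled and not may_merge(state):
        return "disable-auto-merge"
    return "no-action"
```

How the pause is applied depends on the host; on GitHub, one option is the GitHub CLI's `gh pr merge --disable-auto`, and any equivalent merge gate would serve the same purpose.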
 ### 7. Codify
 After each empirical verification produces PASS evidence, invoke the `codify-verification` skill to encode the verification as an automated regression test. The manual proof becomes a repeatable check that catches future regressions.
@@ -72,6 +80,8 @@ The `codify-verification` skill maps the verification type to the appropriate fr
 Codification is mandatory for every empirical verification type with one exception set: PR, Documentation, Deploy, and Investigate-Only spikes — those have inherently non-behavioral proof. For every other type, skipping codification is not allowed; if codification is genuinely impossible (e.g., the test framework does not exist and cannot be installed in scope), escalate via the Escalation Protocol rather than silently skipping.
+For UI-surface defects with an available browser/device/e2e harness, codification must happen in that harness or the nearest surface-equivalent automated harness. Lower-level tests may be added for diagnosis or edge cases, but they do not replace the reported-surface regression spec.
 A change is not "verified" in the lifecycle sense until each empirical verification has both passed AND been codified.
 ### 8. Spec Conformance