npm - @sun-asterisk/sungen - Versions diffs - 3.2.2-beta.8 → 3.2.2-beta.9 - Mend

@sun-asterisk/sungen 3.2.2-beta.8 → 3.2.2-beta.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9) hide show

package/dist/orchestrator/templates/ai-instructions/claude-cmd-create-test.md CHANGED Viewed

@@ -105,6 +105,7 @@ If the unit is **api-first** (`qa/api/<name>/` or `qa/api/flows/<name>/`), the d
      - **BALANCE** → stop expanding secondary viewpoints; add business-core scenarios first.
      - **TRACE** → align `VP-` ids with the viewpoint-overview.
    - Stop when the gate PASSes and findings clear, **or** the budget is exhausted → report residual gaps honestly (never fake a pass).
+   - **Phase gate (boundary — do NOT skip).** Run `sungen gate --screen <name> --phase create` (Bash, exit 2 = HALT). It is the deterministic create-boundary: every required obligation (spec · coverage · depth · trace) must be **satisfied or explicitly waived**. On **HALT**, you have not cleared the phase — keep repairing the blocking obligation(s) within budget; if a blocker is a genuinely-accepted gap (e.g. cross-screen depth owned by a flow), record it with `sungen journey --screen <name> --waive <OB> --reason "..."` (reason mandatory). **Do not converge (step 6) past a HALT** without a fix or a reasoned waiver — no bad output crosses the boundary.
 5.6. **Record (reuse + observability).** Build the manifest and report usage:
    - `sungen manifest --screen <name>` — fingerprints for next-run change detection. On a **re-run**, start the whole command by `sungen manifest --screen <name> --diff` and only regenerate scenarios whose spec section changed (keep/regenerate/retire).

package/dist/orchestrator/templates/ai-instructions/claude-cmd-run-test.md CHANGED Viewed

@@ -104,6 +104,7 @@ If the unit is **api-first**, skip every selector/capture phase (an API test has
 9. **Integrity check & trace (always run after the final run).**
    - `sungen script-check --screen <name>` — verify the generated spec is a **1:1** of the Gherkin (every non-@manual scenario ↔ one `test()`, no drift). If it reports **DRIFT** (spec hand-edited or stale), re-run `sungen generate --screen <name>` so the spec matches the feature, then re-run — **never hand-edit the generated spec** (auto-fix must edit `selectors.yaml`, not the `.spec.ts`).
    - `sungen ledger record --screen <name> --step run --ms <elapsed>` (record this run), then `sungen trace --screen <name>` — show the process map + bottlenecks + **HUMAN-LOOP FOCUS** (the @manual scenarios the QA must verify) to the user.
+   - **Phase gate (boundary — do NOT skip).** `sungen gate --screen <name> --phase run` (exit 2 = HALT): the run-boundary obligations (incl. automation) must be **satisfied or explicitly waived**. On **HALT**, classify per `sungen-error-mapping` § Source of truth (#387): a **selector-resolution** failure → fix `selectors.yaml` + re-run; an **assertion-vs-spec** failure → **report it as a candidate bug / leave it FAIL** (never weaken the assertion or edit the expected to pass); a genuinely-accepted gap → `sungen journey --screen <name> --waive <OB> --reason "..."`. Do **not** declare the run "done" past a HALT without a fix, a reported bug, or a reasoned waiver.
 10. **Capability-pending offer (consent-gated).** If `sungen audit --screen <name>` reports `AUTOMATION-READY-PENDING` (or the run shows `@requires:<cap>` tests skipped "requires …"), these are **automation-ready** scenarios waiting on an opt-in driver. Use `AskUserQuestion` to offer: *"N scenario(s) are automation-ready — enable `<cap>` to run them? (`sungen capability add <cap>`)"*. **Only on the user's yes** run `sungen capability add <cap>` then re-run those specs; on no, leave them skipped (they are NOT failures and NOT manual). **Never auto-install.**
 ## Playwright command guidelines

package/dist/orchestrator/templates/ai-instructions/copilot-cmd-create-test.md CHANGED Viewed

@@ -68,6 +68,7 @@ If the unit is **api-first** (`qa/api/<name>/` or `qa/api/flows/<name>/`), the d
    > **No parallel fan-out here.** Copilot has no sub-agents, so generation is sequential (the Claude Code variant fans out one `sungen-generator` per viewpoint group and merges). Same output, no speedup.
 5.4. **Depth self-check (deterministic — BEFORE the audit).** Run `sungen depth-lint --screen ${input:name}`. It splits every shallow business-critical scenario into **DEEPEN IN PLACE** (add a real value assertion — the printed `template` is a theme-keyed hint, apply judgment to the actual claim; never fake one onto a visibility/behavior scenario) and **CROSS-SCREEN** (route to a flow / tag `@manual:Mx` + reason — removes it from the depth denominator honestly). Act on both, re-run until `deepen` is empty (or only honest over-counts remain), THEN gate. Lifts first-pass `businessDepth` mechanically instead of via 2–3 repair rounds.
 5.5. **Quality gate & repair (harness — always run).** Per `sungen-harness-audit`: run `sungen audit --screen ${input:name}` (structural), THEN do an **independent semantic review inline** using the `sungen-reviewer` criteria (does each scenario's steps PROVE its title/viewpoint? observable Thens? business-critical assertion depth?). Merge both sets of issues; if gate FAILs / findings exist, repair (budget 3) and re-audit — GATE missing theme → generate it (cross-screen → **automate it in the flow** via `/sungen:add-flow`, NOT a full `@manual` screen duplicate — `sungen audit` flags an automatable `@manual` as `MANUAL-AUTOMATABLE`; reserve `@manual:Mx` for true judgment/missing-capability); DEPTH → add data assertions; BALANCE → add business-core first; TRACE → align VP ids. Never fake a pass.
+5.5b. **Phase gate (boundary — do NOT skip).** Run `sungen gate --screen ${input:name} --phase create` (exit 2 = HALT): every required obligation (spec · coverage · depth · trace) must be **satisfied or explicitly waived**. On **HALT**, keep repairing within budget; a genuinely-accepted gap → `sungen journey --screen ${input:name} --waive <OB> --reason "..."` (reason mandatory). Do **not** converge (step 6) past a HALT without a fix or a reasoned waiver.
 5.6. **Record.** `sungen manifest --screen ${input:name}`. Ledger **each phase** (not just repair) — pick one `runId` at the start and pass it so `trace`/`ledger report` show THIS run, not a mix: `sungen ledger record --screen ${input:name} --run <runId> --step <discovery|viewpoint|gherkin|audit|repair:N> --ms <elapsed>`. On re-run, start with `sungen manifest --screen ${input:name} --diff` and only regenerate changed sections.
 6. **Converge — show the trace.** Run `sungen trace --screen ${input:name}` and present: process map (phases + repair rounds), bottlenecks, **HUMAN-LOOP FOCUS** (@manual to verify), audit score + gate + residual gaps. Then offer next steps based on which tier was just generated:

package/dist/orchestrator/templates/ai-instructions/copilot-cmd-run-test.md CHANGED Viewed

@@ -95,6 +95,7 @@ If the unit is **api-first**, skip every selector/capture phase (an API test has
 7. **Phase 3 — Full Run**: Run all tests. Fix only **new** failures (elements unique to `@normal`/`@low`). Max 1 attempt. Don't loop on low-priority failures.
 8. **Phase 4 — Regression**: One final full run. Report results. No more fix loops.
 9. **Integrity & trace (always run after the final run).** `sungen script-check --screen <name>` — verify the spec is a **1:1** of the Gherkin; if **DRIFT**, re-run `sungen generate --screen <name>` (never hand-edit the `.spec.ts` — auto-fix edits `selectors.yaml`). Then `sungen ledger record --screen <name> --step run --ms <elapsed>` and `sungen trace --screen <name>` to show the process map + bottlenecks + **HUMAN-LOOP FOCUS**.
+9b. **Phase gate (boundary — do NOT skip).** `sungen gate --screen <name> --phase run` (exit 2 = HALT): run-boundary obligations (incl. automation) must be **satisfied or explicitly waived**. On HALT, classify per `sungen-error-mapping` § Source of truth (#387): selector-resolution failure → fix `selectors.yaml` + re-run; assertion-vs-spec failure → **report as a candidate bug / leave it FAIL** (never weaken to pass); accepted gap → `sungen journey --screen <name> --waive <OB> --reason "..."`. Don't declare "done" past a HALT without a fix, a reported bug, or a reasoned waiver.
 10. **Capability-pending offer (consent-gated).** If `sungen audit` reports `AUTOMATION-READY-PENDING` (or `@requires:<cap>` tests are skipped "requires …"), offer: *"N scenario(s) are automation-ready — enable `<cap>` to run them? (`sungen capability add <cap>`)"*. Only on the user's yes, run `sungen capability add <cap>` + re-run; on no, leave skipped (not failures, not manual). **Never auto-install.**
 ## Playwright command guidelines

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@sun-asterisk/sungen",
-  "version": "3.2.2-beta.8",
+  "version": "3.2.2-beta.9",
   "description": "Deterministic E2E Test Compiler - Gherkin + Selectors → Playwright tests",
   "main": "src/index.ts",
   "types": "src/index.ts",
@@ -33,7 +33,7 @@
     "node": ">=18.0.0"
   },
   "dependencies": {
-    "@sungen/driver-ui": "3.2.2-beta.8",
+    "@sungen/driver-ui": "3.2.2-beta.9",
     "@anthropic-ai/sdk": "^0.71.0",
     "@babel/parser": "^7.28.5",
     "@babel/traverse": "^7.28.5",

package/src/orchestrator/templates/ai-instructions/claude-cmd-create-test.md CHANGED Viewed

@@ -105,6 +105,7 @@ If the unit is **api-first** (`qa/api/<name>/` or `qa/api/flows/<name>/`), the d
      - **BALANCE** → stop expanding secondary viewpoints; add business-core scenarios first.
      - **TRACE** → align `VP-` ids with the viewpoint-overview.
    - Stop when the gate PASSes and findings clear, **or** the budget is exhausted → report residual gaps honestly (never fake a pass).
+   - **Phase gate (boundary — do NOT skip).** Run `sungen gate --screen <name> --phase create` (Bash, exit 2 = HALT). It is the deterministic create-boundary: every required obligation (spec · coverage · depth · trace) must be **satisfied or explicitly waived**. On **HALT**, you have not cleared the phase — keep repairing the blocking obligation(s) within budget; if a blocker is a genuinely-accepted gap (e.g. cross-screen depth owned by a flow), record it with `sungen journey --screen <name> --waive <OB> --reason "..."` (reason mandatory). **Do not converge (step 6) past a HALT** without a fix or a reasoned waiver — no bad output crosses the boundary.
 5.6. **Record (reuse + observability).** Build the manifest and report usage:
    - `sungen manifest --screen <name>` — fingerprints for next-run change detection. On a **re-run**, start the whole command by `sungen manifest --screen <name> --diff` and only regenerate scenarios whose spec section changed (keep/regenerate/retire).

package/src/orchestrator/templates/ai-instructions/claude-cmd-run-test.md CHANGED Viewed

@@ -104,6 +104,7 @@ If the unit is **api-first**, skip every selector/capture phase (an API test has
 9. **Integrity check & trace (always run after the final run).**
    - `sungen script-check --screen <name>` — verify the generated spec is a **1:1** of the Gherkin (every non-@manual scenario ↔ one `test()`, no drift). If it reports **DRIFT** (spec hand-edited or stale), re-run `sungen generate --screen <name>` so the spec matches the feature, then re-run — **never hand-edit the generated spec** (auto-fix must edit `selectors.yaml`, not the `.spec.ts`).
    - `sungen ledger record --screen <name> --step run --ms <elapsed>` (record this run), then `sungen trace --screen <name>` — show the process map + bottlenecks + **HUMAN-LOOP FOCUS** (the @manual scenarios the QA must verify) to the user.
+   - **Phase gate (boundary — do NOT skip).** `sungen gate --screen <name> --phase run` (exit 2 = HALT): the run-boundary obligations (incl. automation) must be **satisfied or explicitly waived**. On **HALT**, classify per `sungen-error-mapping` § Source of truth (#387): a **selector-resolution** failure → fix `selectors.yaml` + re-run; an **assertion-vs-spec** failure → **report it as a candidate bug / leave it FAIL** (never weaken the assertion or edit the expected to pass); a genuinely-accepted gap → `sungen journey --screen <name> --waive <OB> --reason "..."`. Do **not** declare the run "done" past a HALT without a fix, a reported bug, or a reasoned waiver.
 10. **Capability-pending offer (consent-gated).** If `sungen audit --screen <name>` reports `AUTOMATION-READY-PENDING` (or the run shows `@requires:<cap>` tests skipped "requires …"), these are **automation-ready** scenarios waiting on an opt-in driver. Use `AskUserQuestion` to offer: *"N scenario(s) are automation-ready — enable `<cap>` to run them? (`sungen capability add <cap>`)"*. **Only on the user's yes** run `sungen capability add <cap>` then re-run those specs; on no, leave them skipped (they are NOT failures and NOT manual). **Never auto-install.**
 ## Playwright command guidelines

package/src/orchestrator/templates/ai-instructions/copilot-cmd-create-test.md CHANGED Viewed

@@ -68,6 +68,7 @@ If the unit is **api-first** (`qa/api/<name>/` or `qa/api/flows/<name>/`), the d
    > **No parallel fan-out here.** Copilot has no sub-agents, so generation is sequential (the Claude Code variant fans out one `sungen-generator` per viewpoint group and merges). Same output, no speedup.
 5.4. **Depth self-check (deterministic — BEFORE the audit).** Run `sungen depth-lint --screen ${input:name}`. It splits every shallow business-critical scenario into **DEEPEN IN PLACE** (add a real value assertion — the printed `template` is a theme-keyed hint, apply judgment to the actual claim; never fake one onto a visibility/behavior scenario) and **CROSS-SCREEN** (route to a flow / tag `@manual:Mx` + reason — removes it from the depth denominator honestly). Act on both, re-run until `deepen` is empty (or only honest over-counts remain), THEN gate. Lifts first-pass `businessDepth` mechanically instead of via 2–3 repair rounds.
 5.5. **Quality gate & repair (harness — always run).** Per `sungen-harness-audit`: run `sungen audit --screen ${input:name}` (structural), THEN do an **independent semantic review inline** using the `sungen-reviewer` criteria (does each scenario's steps PROVE its title/viewpoint? observable Thens? business-critical assertion depth?). Merge both sets of issues; if gate FAILs / findings exist, repair (budget 3) and re-audit — GATE missing theme → generate it (cross-screen → **automate it in the flow** via `/sungen:add-flow`, NOT a full `@manual` screen duplicate — `sungen audit` flags an automatable `@manual` as `MANUAL-AUTOMATABLE`; reserve `@manual:Mx` for true judgment/missing-capability); DEPTH → add data assertions; BALANCE → add business-core first; TRACE → align VP ids. Never fake a pass.
+5.5b. **Phase gate (boundary — do NOT skip).** Run `sungen gate --screen ${input:name} --phase create` (exit 2 = HALT): every required obligation (spec · coverage · depth · trace) must be **satisfied or explicitly waived**. On **HALT**, keep repairing within budget; a genuinely-accepted gap → `sungen journey --screen ${input:name} --waive <OB> --reason "..."` (reason mandatory). Do **not** converge (step 6) past a HALT without a fix or a reasoned waiver.
 5.6. **Record.** `sungen manifest --screen ${input:name}`. Ledger **each phase** (not just repair) — pick one `runId` at the start and pass it so `trace`/`ledger report` show THIS run, not a mix: `sungen ledger record --screen ${input:name} --run <runId> --step <discovery|viewpoint|gherkin|audit|repair:N> --ms <elapsed>`. On re-run, start with `sungen manifest --screen ${input:name} --diff` and only regenerate changed sections.
 6. **Converge — show the trace.** Run `sungen trace --screen ${input:name}` and present: process map (phases + repair rounds), bottlenecks, **HUMAN-LOOP FOCUS** (@manual to verify), audit score + gate + residual gaps. Then offer next steps based on which tier was just generated:

package/src/orchestrator/templates/ai-instructions/copilot-cmd-run-test.md CHANGED Viewed

@@ -95,6 +95,7 @@ If the unit is **api-first**, skip every selector/capture phase (an API test has
 7. **Phase 3 — Full Run**: Run all tests. Fix only **new** failures (elements unique to `@normal`/`@low`). Max 1 attempt. Don't loop on low-priority failures.
 8. **Phase 4 — Regression**: One final full run. Report results. No more fix loops.
 9. **Integrity & trace (always run after the final run).** `sungen script-check --screen <name>` — verify the spec is a **1:1** of the Gherkin; if **DRIFT**, re-run `sungen generate --screen <name>` (never hand-edit the `.spec.ts` — auto-fix edits `selectors.yaml`). Then `sungen ledger record --screen <name> --step run --ms <elapsed>` and `sungen trace --screen <name>` to show the process map + bottlenecks + **HUMAN-LOOP FOCUS**.
+9b. **Phase gate (boundary — do NOT skip).** `sungen gate --screen <name> --phase run` (exit 2 = HALT): run-boundary obligations (incl. automation) must be **satisfied or explicitly waived**. On HALT, classify per `sungen-error-mapping` § Source of truth (#387): selector-resolution failure → fix `selectors.yaml` + re-run; assertion-vs-spec failure → **report as a candidate bug / leave it FAIL** (never weaken to pass); accepted gap → `sungen journey --screen <name> --waive <OB> --reason "..."`. Don't declare "done" past a HALT without a fix, a reported bug, or a reasoned waiver.
 10. **Capability-pending offer (consent-gated).** If `sungen audit` reports `AUTOMATION-READY-PENDING` (or `@requires:<cap>` tests are skipped "requires …"), offer: *"N scenario(s) are automation-ready — enable `<cap>` to run them? (`sungen capability add <cap>`)"*. Only on the user's yes, run `sungen capability add <cap>` + re-run; on no, leave skipped (not failures, not manual). **Never auto-install.**
 ## Playwright command guidelines