npm - @sun-asterisk/sungen - Versions diffs - 3.2.2-beta.2 → 3.2.2-beta.4 - Mend

@sun-asterisk/sungen 3.2.2-beta.2 → 3.2.2-beta.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9)

package/dist/orchestrator/templates/ai-instructions/claude-cmd-run-test.md CHANGED

@@ -9,6 +9,8 @@ allowed-tools: Read, Grep, Bash, Glob, Edit, Write, AskUserQuestion, mcp__playwr
 You are a **Senior Developer**. Use `sungen-selector-fix`, `sungen-selector-keys`, and `sungen-error-mapping` skills.
+> ⛔ **Source of truth — the live page is NOT the oracle; `.feature`/`test-data`/`spec.md` are.** Auto-fix is for **selector-resolution** failures (wrong locator → fix `selectors.yaml`). An **assertion-value** failure where the app contradicts the spec is a **CANDIDATE BUG → report it, let it FAIL** — never loosen the rule, weaken the assertion, edit the expected value/`.feature`, or hand-edit the generated `.spec.ts` to make it pass. See `sungen-error-mapping` § "Source of truth". (A `password > 8` test that fails on 6 chars is a bug to report, not a `>= 6` edit.)
 ## Parameters
 Parse from `$ARGUMENTS`:

package/dist/orchestrator/templates/ai-instructions/claude-skill-error-mapping.md CHANGED

@@ -21,6 +21,23 @@ Then choose the fix from the patterns below.
 ---
+## ⛔ Source of truth — classify EVERY failure before you "fix" it
+`.feature` + `test-data.yaml` + `spec.md` are the **oracle**. The **live page is NOT** — it may be the thing that's broken. A failing test is not automatically a test to "make pass". Classify first:
+- **Selector-resolution failure** (element not found / wrong locator / strict-mode / wrong element type) → the test looked in the wrong place. **Fix the locator in `selectors.yaml`** (re-snapshot, copy the exact accessible name). Legit auto-fix.
+- **Assertion-value failure** (element FOUND, but observed value ≠ expected) → STOP and ask: *is the TEST wrong, or is the APP wrong?*
+  - Expected value/rule is wrong **relative to `spec.md`** (typo, stale test-data) → fix `test-data.yaml`/`.feature` so it matches the **spec** — never the live page.
+  - App behaviour contradicts `spec.md` (spec says X, app shows Y) → **CANDIDATE BUG**. **Report it** (let the test FAIL / surface to the QA in the run summary). **NEVER** change the expected value, loosen the rule, weaken the assertion (`toHaveText`→`toContainText` to dodge a mismatch), edit `.feature`, or edit the generated `.spec.ts` to make it pass.
+> **Cardinal sin (do NOT do this):** a `password > 8 chars` rule fails on a 6-char input → "fix" it to `>= 6` so the test passes. The logic is now meaningless. A failing assertion is a **finding**, not a chore.
+**Auto-fix loop scope:** the run-test auto-fix loop engages ONLY on **selector-resolution** failures. On an assertion-value failure where the app contradicts the spec → **HALT and report**, do not loop it into passing.
+**Never hand-edit the generated `.spec.ts`** (e.g. inserting `page.evaluate`/`fetch` to bypass a broken control). `sungen script-check` regenerates the spec from `.feature` and flags any edit as DRIFT — regenerate, don't patch.
+---
 ## Fix Priority (try in order)
 1. **Auth issue** — page redirected to login? Fix auth first, everything else is noise
@@ -43,11 +60,13 @@ Then choose the fix from the patterns below.
 | not a select | Custom dropdown, not native `<select>` | Set `variant: 'custom'` |
 | Frame not found | iframe selector wrong or doesn't exist | Fix `frame` value, verify iframe in snapshot |
-### Assertion errors → fix in `test-data.yaml` or `.feature`
+### Assertion errors → apply the Source-of-truth gate above FIRST
-| Error | Diagnosis | Fix |
+> The "Fix" column below applies **only when the expected value was wrong relative to `spec.md`** (a test defect). If the app's value contradicts the spec, the row is a **candidate bug → report it, do not edit the expected to match live**. Never weaken `toHaveText`→`toContainText` just to pass.
+| Error | Diagnosis | Fix (only if the TEST was wrong per spec) |
 |---|---|---|
-| toHaveText mismatch | Expected text differs from actual | Fix value in test-data. If element is input type → change Gherkin type to `field`/`textarea` (triggers `toHaveValue` instead) |
+| toHaveText mismatch | Expected text differs from actual | If the test's expected was wrong per spec → fix value in test-data. If element is input type → change Gherkin type to `field`/`textarea` (triggers `toHaveValue`). If the app value contradicts spec → **report as bug**. |
 | toHaveValue mismatch | Expected value differs from actual | Fix value in test-data |
 | toContainText mismatch | Partial text not found | Fix expected partial text in test-data |
 | toBeVisible timeout | Element exists but hidden, or name wrong | Check: is element conditionally visible? Wrong name? Inside dialog? |
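
Reviewer note: the source-of-truth gate added in this hunk reads as a small decision procedure. The sketch below is one way to express it in TypeScript, under stated assumptions: the names `Failure`, `Action`, and `classifyFailure` are hypothetical and are not part of sungen, and the comparison against `spec.md` is reduced to a plain string check for illustration. The live page value is deliberately absent from the input, since the template says it is not the oracle.

```typescript
// Hypothetical types for illustration only; none of these names come from sungen.
type FailureKind = "selector-resolution" | "assertion-value";

interface Failure {
  kind: FailureKind;
  // Only meaningful for assertion-value failures:
  expected?: string;  // value the test currently asserts (test-data / .feature)
  specValue?: string; // value that spec.md requires
}

type Action =
  | "fix-selectors-yaml"          // legitimate auto-fix
  | "fix-test-data-to-match-spec" // test defect: align the test with the spec
  | "report-candidate-bug";       // app contradicts spec: let the test fail

function classifyFailure(f: Failure): Action {
  // Element not found / wrong locator: the test looked in the wrong place.
  if (f.kind === "selector-resolution") {
    return "fix-selectors-yaml";
  }
  // Element was found but the value differs. Compare the test against the
  // spec, never against what the live page happens to show.
  if (f.expected !== f.specValue) {
    return "fix-test-data-to-match-spec";
  }
  // The test already matches the spec, so the mismatch comes from the app.
  return "report-candidate-bug";
}
```

Applied to the password example from the hunk: the test expects the `> 8 chars` rule, the spec says the same, and the app accepts 6 characters. `expected` equals `specValue`, so the result is `report-candidate-bug` rather than an edit to `>= 6`.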

package/dist/orchestrator/templates/ai-instructions/copilot-cmd-run-test.md CHANGED

@@ -9,6 +9,8 @@ tools: [read, execute, edit, vscode/askQuestions, playwright/*]
 You are a **Senior Developer**. Use `sungen-selector-fix`, `sungen-selector-keys`, and `sungen-error-mapping` skills.
+> ⛔ **Source of truth — the live page is NOT the oracle; `.feature`/`test-data`/`spec.md` are.** Auto-fix is for **selector-resolution** failures (wrong locator → fix `selectors.yaml`). An **assertion-value** failure where the app contradicts the spec is a **CANDIDATE BUG → report it, let it FAIL** — never loosen the rule, weaken the assertion, edit the expected value/`.feature`, or hand-edit the generated `.spec.ts` to make it pass. See `sungen-error-mapping` § "Source of truth". (A `password > 8` test that fails on 6 chars is a bug to report, not a `>= 6` edit.)
 ## Parameters
 Parse from `$ARGUMENTS`:

package/dist/orchestrator/templates/ai-instructions/github-skill-sungen-error-mapping.md CHANGED

@@ -21,6 +21,23 @@ Then choose the fix from the patterns below.
 ---
+## ⛔ Source of truth — classify EVERY failure before you "fix" it
+`.feature` + `test-data.yaml` + `spec.md` are the **oracle**. The **live page is NOT** — it may be the thing that's broken. A failing test is not automatically a test to "make pass". Classify first:
+- **Selector-resolution failure** (element not found / wrong locator / strict-mode / wrong element type) → the test looked in the wrong place. **Fix the locator in `selectors.yaml`** (re-snapshot, copy the exact accessible name). Legit auto-fix.
+- **Assertion-value failure** (element FOUND, but observed value ≠ expected) → STOP and ask: *is the TEST wrong, or is the APP wrong?*
+  - Expected value/rule is wrong **relative to `spec.md`** (typo, stale test-data) → fix `test-data.yaml`/`.feature` so it matches the **spec** — never the live page.
+  - App behaviour contradicts `spec.md` (spec says X, app shows Y) → **CANDIDATE BUG**. **Report it** (let the test FAIL / surface to the QA in the run summary). **NEVER** change the expected value, loosen the rule, weaken the assertion (`toHaveText`→`toContainText` to dodge a mismatch), edit `.feature`, or edit the generated `.spec.ts` to make it pass.
+> **Cardinal sin (do NOT do this):** a `password > 8 chars` rule fails on a 6-char input → "fix" it to `>= 6` so the test passes. The logic is now meaningless. A failing assertion is a **finding**, not a chore.
+**Auto-fix loop scope:** the run-test auto-fix loop engages ONLY on **selector-resolution** failures. On an assertion-value failure where the app contradicts the spec → **HALT and report**, do not loop it into passing.
+**Never hand-edit the generated `.spec.ts`** (e.g. inserting `page.evaluate`/`fetch` to bypass a broken control). `sungen script-check` regenerates the spec from `.feature` and flags any edit as DRIFT — regenerate, don't patch.
+---
 ## Fix Priority (try in order)
 1. **Auth issue** — page redirected to login? Fix auth first, everything else is noise
@@ -43,11 +60,13 @@ Then choose the fix from the patterns below.
 | not a select | Custom dropdown, not native `<select>` | Set `variant: 'custom'` |
 | Frame not found | iframe selector wrong or doesn't exist | Fix `frame` value, verify iframe in snapshot |
-### Assertion errors → fix in `test-data.yaml` or `.feature`
+### Assertion errors → apply the Source-of-truth gate above FIRST
-| Error | Diagnosis | Fix |
+> The "Fix" column below applies **only when the expected value was wrong relative to `spec.md`** (a test defect). If the app's value contradicts the spec, the row is a **candidate bug → report it, do not edit the expected to match live**. Never weaken `toHaveText`→`toContainText` just to pass.
+| Error | Diagnosis | Fix (only if the TEST was wrong per spec) |
 |---|---|---|
-| toHaveText mismatch | Expected text differs from actual | Fix value in test-data. If element is input type → change Gherkin type to `field`/`textarea` (triggers `toHaveValue` instead) |
+| toHaveText mismatch | Expected text differs from actual | If the test's expected was wrong per spec → fix value in test-data. If element is input type → change Gherkin type to `field`/`textarea` (triggers `toHaveValue`). If the app value contradicts spec → **report as bug**. |
 | toHaveValue mismatch | Expected value differs from actual | Fix value in test-data |
 | toContainText mismatch | Partial text not found | Fix expected partial text in test-data |
 | toBeVisible timeout | Element exists but hidden, or name wrong | Check: is element conditionally visible? Wrong name? Inside dialog? |

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@sun-asterisk/sungen",
-  "version": "3.2.2-beta.2",
+  "version": "3.2.2-beta.4",
   "description": "Deterministic E2E Test Compiler - Gherkin + Selectors → Playwright tests",
   "main": "src/index.ts",
   "types": "src/index.ts",
@@ -33,7 +33,7 @@
     "node": ">=18.0.0"
   },
   "dependencies": {
-    "@sungen/driver-ui": "3.2.2-beta.2",
+    "@sungen/driver-ui": "3.2.2-beta.4",
     "@anthropic-ai/sdk": "^0.71.0",
     "@babel/parser": "^7.28.5",
     "@babel/traverse": "^7.28.5",

package/src/orchestrator/templates/ai-instructions/claude-cmd-run-test.md CHANGED

@@ -9,6 +9,8 @@ allowed-tools: Read, Grep, Bash, Glob, Edit, Write, AskUserQuestion, mcp__playwr
 You are a **Senior Developer**. Use `sungen-selector-fix`, `sungen-selector-keys`, and `sungen-error-mapping` skills.
+> ⛔ **Source of truth — the live page is NOT the oracle; `.feature`/`test-data`/`spec.md` are.** Auto-fix is for **selector-resolution** failures (wrong locator → fix `selectors.yaml`). An **assertion-value** failure where the app contradicts the spec is a **CANDIDATE BUG → report it, let it FAIL** — never loosen the rule, weaken the assertion, edit the expected value/`.feature`, or hand-edit the generated `.spec.ts` to make it pass. See `sungen-error-mapping` § "Source of truth". (A `password > 8` test that fails on 6 chars is a bug to report, not a `>= 6` edit.)
 ## Parameters
 Parse from `$ARGUMENTS`:

package/src/orchestrator/templates/ai-instructions/claude-skill-error-mapping.md CHANGED

@@ -21,6 +21,23 @@ Then choose the fix from the patterns below.
 ---
+## ⛔ Source of truth — classify EVERY failure before you "fix" it
+`.feature` + `test-data.yaml` + `spec.md` are the **oracle**. The **live page is NOT** — it may be the thing that's broken. A failing test is not automatically a test to "make pass". Classify first:
+- **Selector-resolution failure** (element not found / wrong locator / strict-mode / wrong element type) → the test looked in the wrong place. **Fix the locator in `selectors.yaml`** (re-snapshot, copy the exact accessible name). Legit auto-fix.
+- **Assertion-value failure** (element FOUND, but observed value ≠ expected) → STOP and ask: *is the TEST wrong, or is the APP wrong?*
+  - Expected value/rule is wrong **relative to `spec.md`** (typo, stale test-data) → fix `test-data.yaml`/`.feature` so it matches the **spec** — never the live page.
+  - App behaviour contradicts `spec.md` (spec says X, app shows Y) → **CANDIDATE BUG**. **Report it** (let the test FAIL / surface to the QA in the run summary). **NEVER** change the expected value, loosen the rule, weaken the assertion (`toHaveText`→`toContainText` to dodge a mismatch), edit `.feature`, or edit the generated `.spec.ts` to make it pass.
+> **Cardinal sin (do NOT do this):** a `password > 8 chars` rule fails on a 6-char input → "fix" it to `>= 6` so the test passes. The logic is now meaningless. A failing assertion is a **finding**, not a chore.
+**Auto-fix loop scope:** the run-test auto-fix loop engages ONLY on **selector-resolution** failures. On an assertion-value failure where the app contradicts the spec → **HALT and report**, do not loop it into passing.
+**Never hand-edit the generated `.spec.ts`** (e.g. inserting `page.evaluate`/`fetch` to bypass a broken control). `sungen script-check` regenerates the spec from `.feature` and flags any edit as DRIFT — regenerate, don't patch.
+---
 ## Fix Priority (try in order)
 1. **Auth issue** — page redirected to login? Fix auth first, everything else is noise
@@ -43,11 +60,13 @@ Then choose the fix from the patterns below.
 | not a select | Custom dropdown, not native `<select>` | Set `variant: 'custom'` |
 | Frame not found | iframe selector wrong or doesn't exist | Fix `frame` value, verify iframe in snapshot |
-### Assertion errors → fix in `test-data.yaml` or `.feature`
+### Assertion errors → apply the Source-of-truth gate above FIRST
-| Error | Diagnosis | Fix |
+> The "Fix" column below applies **only when the expected value was wrong relative to `spec.md`** (a test defect). If the app's value contradicts the spec, the row is a **candidate bug → report it, do not edit the expected to match live**. Never weaken `toHaveText`→`toContainText` just to pass.
+| Error | Diagnosis | Fix (only if the TEST was wrong per spec) |
 |---|---|---|
-| toHaveText mismatch | Expected text differs from actual | Fix value in test-data. If element is input type → change Gherkin type to `field`/`textarea` (triggers `toHaveValue` instead) |
+| toHaveText mismatch | Expected text differs from actual | If the test's expected was wrong per spec → fix value in test-data. If element is input type → change Gherkin type to `field`/`textarea` (triggers `toHaveValue`). If the app value contradicts spec → **report as bug**. |
 | toHaveValue mismatch | Expected value differs from actual | Fix value in test-data |
 | toContainText mismatch | Partial text not found | Fix expected partial text in test-data |
 | toBeVisible timeout | Element exists but hidden, or name wrong | Check: is element conditionally visible? Wrong name? Inside dialog? |

package/src/orchestrator/templates/ai-instructions/copilot-cmd-run-test.md CHANGED

@@ -9,6 +9,8 @@ tools: [read, execute, edit, vscode/askQuestions, playwright/*]
 You are a **Senior Developer**. Use `sungen-selector-fix`, `sungen-selector-keys`, and `sungen-error-mapping` skills.
+> ⛔ **Source of truth — the live page is NOT the oracle; `.feature`/`test-data`/`spec.md` are.** Auto-fix is for **selector-resolution** failures (wrong locator → fix `selectors.yaml`). An **assertion-value** failure where the app contradicts the spec is a **CANDIDATE BUG → report it, let it FAIL** — never loosen the rule, weaken the assertion, edit the expected value/`.feature`, or hand-edit the generated `.spec.ts` to make it pass. See `sungen-error-mapping` § "Source of truth". (A `password > 8` test that fails on 6 chars is a bug to report, not a `>= 6` edit.)
 ## Parameters
 Parse from `$ARGUMENTS`:

package/src/orchestrator/templates/ai-instructions/github-skill-sungen-error-mapping.md CHANGED

@@ -21,6 +21,23 @@ Then choose the fix from the patterns below.
 ---
+## ⛔ Source of truth — classify EVERY failure before you "fix" it
+`.feature` + `test-data.yaml` + `spec.md` are the **oracle**. The **live page is NOT** — it may be the thing that's broken. A failing test is not automatically a test to "make pass". Classify first:
+- **Selector-resolution failure** (element not found / wrong locator / strict-mode / wrong element type) → the test looked in the wrong place. **Fix the locator in `selectors.yaml`** (re-snapshot, copy the exact accessible name). Legit auto-fix.
+- **Assertion-value failure** (element FOUND, but observed value ≠ expected) → STOP and ask: *is the TEST wrong, or is the APP wrong?*
+  - Expected value/rule is wrong **relative to `spec.md`** (typo, stale test-data) → fix `test-data.yaml`/`.feature` so it matches the **spec** — never the live page.
+  - App behaviour contradicts `spec.md` (spec says X, app shows Y) → **CANDIDATE BUG**. **Report it** (let the test FAIL / surface to the QA in the run summary). **NEVER** change the expected value, loosen the rule, weaken the assertion (`toHaveText`→`toContainText` to dodge a mismatch), edit `.feature`, or edit the generated `.spec.ts` to make it pass.
+> **Cardinal sin (do NOT do this):** a `password > 8 chars` rule fails on a 6-char input → "fix" it to `>= 6` so the test passes. The logic is now meaningless. A failing assertion is a **finding**, not a chore.
+**Auto-fix loop scope:** the run-test auto-fix loop engages ONLY on **selector-resolution** failures. On an assertion-value failure where the app contradicts the spec → **HALT and report**, do not loop it into passing.
+**Never hand-edit the generated `.spec.ts`** (e.g. inserting `page.evaluate`/`fetch` to bypass a broken control). `sungen script-check` regenerates the spec from `.feature` and flags any edit as DRIFT — regenerate, don't patch.
+---
 ## Fix Priority (try in order)
 1. **Auth issue** — page redirected to login? Fix auth first, everything else is noise
@@ -43,11 +60,13 @@ Then choose the fix from the patterns below.
 | not a select | Custom dropdown, not native `<select>` | Set `variant: 'custom'` |
 | Frame not found | iframe selector wrong or doesn't exist | Fix `frame` value, verify iframe in snapshot |
-### Assertion errors → fix in `test-data.yaml` or `.feature`
+### Assertion errors → apply the Source-of-truth gate above FIRST
-| Error | Diagnosis | Fix |
+> The "Fix" column below applies **only when the expected value was wrong relative to `spec.md`** (a test defect). If the app's value contradicts the spec, the row is a **candidate bug → report it, do not edit the expected to match live**. Never weaken `toHaveText`→`toContainText` just to pass.
+| Error | Diagnosis | Fix (only if the TEST was wrong per spec) |
 |---|---|---|
-| toHaveText mismatch | Expected text differs from actual | Fix value in test-data. If element is input type → change Gherkin type to `field`/`textarea` (triggers `toHaveValue` instead) |
+| toHaveText mismatch | Expected text differs from actual | If the test's expected was wrong per spec → fix value in test-data. If element is input type → change Gherkin type to `field`/`textarea` (triggers `toHaveValue`). If the app value contradicts spec → **report as bug**. |
 | toHaveValue mismatch | Expected value differs from actual | Fix value in test-data |
 | toContainText mismatch | Partial text not found | Fix expected partial text in test-data |
 | toBeVisible timeout | Element exists but hidden, or name wrong | Check: is element conditionally visible? Wrong name? Inside dialog? |