npm - @sun-asterisk/sungen - Versions diffs - 3.2.1-beta.1 → 3.2.2-beta.10 - Mend

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (86)

package/src/orchestrator/templates/ai-instructions/github-skill-sungen-delivery.md CHANGED

@@ -88,6 +88,33 @@ Multi-locale (no `SUNGEN_ENV`): one **`<LOCALE> Auto`** sheet per locale + a sin
 ---
+## API delivery — extra worksheet
+For **api-kind units** (`qa/api/<area>/`), the `.xlsx` gains a third worksheet **`API detail`** (appended after Auto/Manual). The main BM-2-901-13 Testcases layout is unchanged. The CSV is unchanged (16-column, no extra sheet).
+### Required sources (API detail sheet only)
+| Source | Path | Created by |
+|--------|------|------------|
+| Endpoint catalog | `qa/api/<area>/api/apis.yaml` | `sungen add --api` or `sungen api import` |
+| Scenario annotations | `qa/api/<area>/features/<feature>.feature` | `create-test` |
+### API detail column mapping
+| Column | Source |
+|--------|--------|
+| Endpoint | `path` from `apis.yaml` catalog entry |
+| Method | `method` from catalog entry (uppercased) |
+| Auth / Datasource | catalog `datasource` + any `@auth:<role>` tag from scenarios calling this endpoint |
+| Request shape | catalog `body` + `params` fields composed as `body: {…}; params: [a, b]` |
+| Expected-status matrix | `@cases:<dataset>` label for data-driven scenarios; catalog `expect.status` as fallback |
+| Flow steps | Ordered `@api:<name>` call chain from multi-call scenarios (e.g. `register → count_users`) |
+| Concurrency invariant | `@concurrent:<N>` + `@query:<oracle>` from concurrent scenarios (e.g. `ok_count=2; @query user_count`) |
+**Sources are catalog + annotations only** — Field Metadata (FM) is not required for this sheet.
+---
 ## Excluded from CSV
 - `@steps:<name>` **base** scenarios — these are setup-only, inlined into `@extend:...` scenarios at compile time

package/src/orchestrator/templates/ai-instructions/github-skill-sungen-error-mapping.md CHANGED

@@ -21,6 +21,23 @@ Then choose the fix from the patterns below.
 ---
+## ⛔ Source of truth — classify EVERY failure before you "fix" it
+`.feature` + `test-data.yaml` + `spec.md` are the **oracle**. The **live page is NOT** — it may be the thing that's broken. A failing test is not automatically a test to "make pass". Classify first:
+- **Selector-resolution failure** (element not found / wrong locator / strict-mode / wrong element type) → the test looked in the wrong place. **Fix the locator in `selectors.yaml`** (re-snapshot, copy the exact accessible name). Legit auto-fix.
+- **Assertion-value failure** (element FOUND, but observed value ≠ expected) → STOP and ask: *is the TEST wrong, or is the APP wrong?*
+  - Expected value/rule is wrong **relative to `spec.md`** (typo, stale test-data) → fix `test-data.yaml`/`.feature` so it matches the **spec** — never the live page.
+  - App behaviour contradicts `spec.md` (spec says X, app shows Y) → **CANDIDATE BUG**. **Report it** (let the test FAIL / surface to the QA in the run summary). **NEVER** change the expected value, loosen the rule, weaken the assertion (`toHaveText`→`toContainText` to dodge a mismatch), edit `.feature`, or edit the generated `.spec.ts` to make it pass.
+> **Cardinal sin (do NOT do this):** a `password > 8 chars` rule fails on a 6-char input → "fix" it to `>= 6` so the test passes. The logic is now meaningless. A failing assertion is a **finding**, not a chore.
+**Auto-fix loop scope:** the run-test auto-fix loop engages ONLY on **selector-resolution** failures. On an assertion-value failure where the app contradicts the spec → **HALT and report**, do not loop it into passing.
+**Never hand-edit the generated `.spec.ts`** (e.g. inserting `page.evaluate`/`fetch` to bypass a broken control). `sungen script-check` regenerates the spec from `.feature` and flags any edit as DRIFT — regenerate, don't patch.
+---
 ## Fix Priority (try in order)
 1. **Auth issue** — page redirected to login? Fix auth first, everything else is noise
@@ -43,11 +60,13 @@ Then choose the fix from the patterns below.
 | not a select | Custom dropdown, not native `<select>` | Set `variant: 'custom'` |
 | Frame not found | iframe selector wrong or doesn't exist | Fix `frame` value, verify iframe in snapshot |
-### Assertion errors → fix in `test-data.yaml` or `.feature`
+### Assertion errors → apply the Source-of-truth gate above FIRST
-| Error | Diagnosis | Fix |
+> The "Fix" column below applies **only when the expected value was wrong relative to `spec.md`** (a test defect). If the app's value contradicts the spec, the row is a **candidate bug → report it, do not edit the expected to match live**. Never weaken `toHaveText`→`toContainText` just to pass.
+| Error | Diagnosis | Fix (only if the TEST was wrong per spec) |
 |---|---|---|
-| toHaveText mismatch | Expected text differs from actual | Fix value in test-data. If element is input type → change Gherkin type to `field`/`textarea` (triggers `toHaveValue` instead) |
+| toHaveText mismatch | Expected text differs from actual | If the test's expected was wrong per spec → fix value in test-data. If element is input type → change Gherkin type to `field`/`textarea` (triggers `toHaveValue`). If the app value contradicts spec → **report as bug**. |
 | toHaveValue mismatch | Expected value differs from actual | Fix value in test-data |
 | toContainText mismatch | Partial text not found | Fix expected partial text in test-data |
 | toBeVisible timeout | Element exists but hidden, or name wrong | Check: is element conditionally visible? Wrong name? Inside dialog? |
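
The source-of-truth gate added in this hunk amounts to a small decision procedure: selector-resolution failures are fixed in `selectors.yaml`, while assertion-value failures are fixed only when the expected value disagrees with `spec.md`, and are otherwise reported. A minimal TypeScript sketch of that logic follows. The `Failure` type, the `Action` labels, and `classify` are hypothetical names chosen for illustration; none of them appear in the sungen package.

```typescript
// Hypothetical types for illustration only; not part of the sungen package.
type Failure =
  | { kind: 'selector-resolution' }
  | { kind: 'assertion-value'; expectedMatchesSpec: boolean };

type Action =
  | 'fix-selectors-yaml'            // legit auto-fix: re-snapshot, correct the locator
  | 'fix-test-data-to-spec'         // test defect: expected value was wrong per spec.md
  | 'halt-and-report-candidate-bug'; // app contradicts spec.md: do not edit the expected value

function classify(failure: Failure): Action {
  if (failure.kind === 'selector-resolution') {
    return 'fix-selectors-yaml';
  }
  // Element was found but the observed value differs from the expected value.
  // If the expected value already matches the spec, the app is the suspect.
  return failure.expectedMatchesSpec
    ? 'halt-and-report-candidate-bug'
    : 'fix-test-data-to-spec';
}
```

Under this reading, only the first branch would ever feed the run-test auto-fix loop; the other two require a comparison against `spec.md`, never against the live page.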

package/src/orchestrator/templates/ai-instructions/github-skill-sungen-gherkin-syntax.md CHANGED

@@ -214,6 +214,8 @@ Options: `nth` `exact` `scope` `match` `variant` `frame` `contenteditable` `colu
 | `@cases:dataset` | Data-driven: run the scenario once per row of the `dataset` LIST in test-data → one `test()` per row |
 | `@query:name` | Database: run the named query from `database/queries.yaml` (precondition) and bind its rows to `{{name}}`; assert with `expect {{name.count}} …` + path access. Override params `@query:name(p={{v}})`. Repeatable. (Optional Data Driver — see Database verification above) |
 | `@api:name` | API: run the named request from `api/apis.yaml` (precondition) and bind the response to `{{name}}`; assert with `expect {{name.status}} …` + path access (`{{name.body.<path>}}`). Override params `@api:name(p={{v}})`. Repeatable. (Optional API Driver) |
+| `@concurrent:N` | API idempotency: fire the bound `@api` request N times in parallel, then bind aggregates on the `@api` name — `{{name.ok_count}}` (2xx count) and `{{name.status_counts}}` (status→count map). Assert the exactly-once invariant (`expect {{name.ok_count}} is 1`); pair with `@query` as a DB oracle. Tag order = run order: `@api` (mutate) before `@query` (verify). (Optional API Driver) |
+| `@hybrid` | One unit, two capabilities: a signed-in browser session (UI) authorizes the `@api` call — the API request reuses the UI `storageState`. (Optional API + UI Drivers) |
 ### Data-driven scenarios (`@cases`)

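The `@concurrent:N` row above describes two aggregates bound on the `@api` name: `ok_count` (the number of 2xx responses) and `status_counts` (a status-to-count map). A minimal sketch of how such aggregates could be computed from N parallel calls is shown below. `aggregateStatuses` and `fireConcurrent` are hypothetical helpers, and the generated sungen runtime may be structured differently.

```typescript
// Hypothetical helpers for illustration; the generated sungen runtime may differ.
interface ConcurrentAggregate {
  ok_count: number;                      // number of responses with a 2xx status
  status_counts: Record<string, number>; // status code -> how many responses had it
}

function aggregateStatuses(statuses: number[]): ConcurrentAggregate {
  const status_counts: Record<string, number> = {};
  let ok_count = 0;
  for (const status of statuses) {
    const key = String(status);
    status_counts[key] = (status_counts[key] ?? 0) + 1;
    if (status >= 200 && status < 300) ok_count += 1;
  }
  return { ok_count, status_counts };
}

// Fire the same request n times in parallel and aggregate the resulting statuses.
// `call` stands in for whatever performs one request and resolves to its HTTP status.
async function fireConcurrent(
  call: () => Promise<number>,
  n: number,
): Promise<ConcurrentAggregate> {
  const statuses = await Promise.all(Array.from({ length: n }, () => call()));
  return aggregateStatuses(statuses);
}
```

For an endpoint that honours the exactly-once invariant, three parallel calls returning `201, 409, 409` would aggregate to `ok_count` of 1, which is what `expect {{name.ok_count}} is 1` asserts.
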
package/src/orchestrator/templates/ai-instructions/github-skill-sungen-tc-generation.md CHANGED

@@ -9,6 +9,8 @@ user-invocable: false
 - **Write incrementally — never emit the whole suite in one response.** Build the `.feature` in batches via successive `Write`/`Edit` (≈10–15 scenarios per call). For **Full coverage**, write tier-by-tier: `Write` Tier 1 → `Edit` append Tier 2 → `Edit` append Tier 3.
   → One huge `Write` can exceed the model's output-token cap → `API Error: Claude's response exceeded the N output token maximum`. Single-pass full coverage only fits when `CLAUDE_CODE_MAX_OUTPUT_TOKENS ≥ 64000`; otherwise batch. Batching also lets the audit/reviewer run per batch — higher quality.
+- **Generate group-by-group (sequential here).** Copilot has no sub-agents, so generate one viewpoint group/theme at a time, tier-by-tier, keeping each `VP-` theme in its own id prefix. (The Claude Code variant fans these out as parallel `sungen-generator` shards and merges — same output shape, just no speedup. Keep each theme self-contained so it would merge cleanly either way.)
 - `spec_figma.md` exists → read file only, **NEVER** call `mcp__figma__*`
   → PAT auth flow already done by `sungen-capture` (mode figma-pat); re-calling fails or duplicates work.
@@ -273,6 +275,7 @@ Security:         [S1 – admin only]
 **Depth is a GATE dimension (harness-roadmap P1) — self-raise, never silently go shallow:**
 - For every data-correctness theme the catalog marks `depth.requires: data-assertion`, emit its `depth.template` shape by **default** — don't wait for the repair loop. `sungen audit` measures `businessDepth` (ratio of these scenarios that assert data) against an intent threshold (functional ≥ 0.70); below it the **gate FAILs**.
+- **Verify depth deterministically before the gate:** run `sungen depth-lint --screen <name>`. It classifies every shallow business-critical scenario into **deepen-in-place** (add the theme's value assertion — the printed `template` is a hint, fit it to the actual claim) vs **cross-screen** (route to a flow / `@manual:Mx`). Clear the `deepen` list first — this is the mechanical way to hit `businessDepth` on the first pass instead of churning repair rounds. Never fake a value assertion onto a visibility/behavior scenario the lint over-counts; leave it and note the over-count.
 - `depth.cross_screen: true` (cart / detail / filter / brand correctness) → write the deep capture/compare shape as an **automated flow scenario** (in the flow — do NOT leave a full-step `@manual` duplicate on the screen). `@manual` is **only** for genuine judgment (M6 visual/UX · M8 not-worth · M9 human) or a missing capability (M1–M5/M7), and it **must** carry a reason code (`@manual:Mx`, or a reason comment the planner can infer). A `@manual` scenario that still has full automatable steps (a data assertion, no visual/mock/a11y judgment) is now flagged by `sungen audit` as `MANUAL-AUTOMATABLE`, and business-critical scenarios you defer to `@manual` are reported as `DEPTH-DEFERRED` (they do NOT silently inflate `businessDepth`). Deferring automatable work to `@manual` lowers quality — automate it in the flow instead.
 - **Pick the right `@manual:Mx` code — it decides which driver can later automate the case** (`sungen audit` flags a code↔reason mismatch). Tag the code that matches the **oracle the reason describes**:

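The `businessDepth` gate in the hunk above is described as a ratio: of the scenarios whose theme requires a data assertion, the fraction that actually assert data, compared against a threshold of 0.70 for functional intent. The sketch below illustrates that arithmetic only. The `ScenarioDepth` shape, the function names, and the handling of an empty set are assumptions; the data model `sungen audit` really uses is not shown in this diff.

```typescript
// Hypothetical shape for illustration; sungen audit's real data model is not shown in the diff.
interface ScenarioDepth {
  id: string;
  requiresDataAssertion: boolean; // theme is marked depth.requires: data-assertion
  assertsData: boolean;           // scenario contains a value assertion
}

const FUNCTIONAL_THRESHOLD = 0.7; // "functional >= 0.70" from the skill text

function businessDepth(scenarios: ScenarioDepth[]): number {
  const required = scenarios.filter((s) => s.requiresDataAssertion);
  // Assumption: with nothing to measure, treat depth as fully satisfied.
  if (required.length === 0) return 1;
  return required.filter((s) => s.assertsData).length / required.length;
}

function depthGatePasses(scenarios: ScenarioDepth[]): boolean {
  return businessDepth(scenarios) >= FUNCTIONAL_THRESHOLD;
}
```

With four business-critical scenarios, three asserting data gives 0.75 and passes, while two gives 0.5 and fails, which is the situation `sungen depth-lint` is meant to catch before the audit runs.
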
package/src/orchestrator/templates/specs-api.ts CHANGED

@@ -49,6 +49,17 @@ function substitute(text: string, params: Record<string, any>): string {
   return text.replace(/:([A-Za-z_][A-Za-z0-9_]*)/g, (_m, p) => encodeURIComponent(String(params[p] ?? '')));
 }
+/**
+ * Join a datasource base URL with a catalog path. Concatenate rather than rely on Playwright's
+ * baseURL resolution: an absolute path (`/user/1`) resolves against the base ORIGIN and would drop
+ * a base path component (`/api/v3`). Most APIs are mounted under such a prefix, so the full URL must
+ * be built explicitly.
+ */
+export function joinApiUrl(base: string, urlPath: string): string {
+  const b = base.replace(/\/$/, '');
+  return urlPath.startsWith('/') ? b + urlPath : `${b}/${urlPath}`;
+}
 class ApiClient {
   private configs: Record<string, ApiDataSource> | null = null;
@@ -103,13 +114,13 @@ class ApiClient {
     // `storageState` (the @auth role's saved session) so the request shares the browser's
     // authenticated cookies. Disposed per call so no request context lingers and hangs the process.
     const ctx: APIRequestContext = await request.newContext({
-      baseURL: base,
       extraHTTPHeaders: headers,
       timeout: conf.timeout_ms ?? 15000,
       ...(opts.storageState ? { storageState: opts.storageState } : {}),
     });
     try {
-      const res = await ctx.fetch(urlPath, { method: req.method, ...bodyOpt });
+      // Full URL (not a baseURL-relative path) so a base path component like /api/v3 is preserved.
+      const res = await ctx.fetch(joinApiUrl(base, urlPath), { method: req.method, ...bodyOpt });
       const text = await res.text();
       let parsed: any = text;
       try { parsed = text ? JSON.parse(text) : null; } catch { /* non-JSON → keep text */ }
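
The doc comment on `joinApiUrl` explains why the change stops passing `baseURL`: an absolute path resolved against a base URL keeps only the origin. The sketch below reproduces that behaviour with the standard `URL` constructor, which follows the same resolution rules, and contrasts it with the concatenating helper copied from the diff. The host `example.test` and the paths are made-up values for illustration.

```typescript
// Copied from the diff above so the example is self-contained.
function joinApiUrl(base: string, urlPath: string): string {
  const b = base.replace(/\/$/, '');
  return urlPath.startsWith('/') ? b + urlPath : `${b}/${urlPath}`;
}

// Made-up datasource base with a path prefix.
const base = 'https://example.test/api/v3';

// Standard URL resolution: an absolute path replaces the base path entirely.
const resolved = new URL('/user/1', base).toString();
// resolved === 'https://example.test/user/1'  (the /api/v3 prefix is lost)

// Explicit concatenation keeps the prefix, with or without slashes on either side.
const joined = joinApiUrl(base, '/user/1');
// joined === 'https://example.test/api/v3/user/1'
const joinedTrailing = joinApiUrl(base + '/', 'user/1');
// joinedTrailing === 'https://example.test/api/v3/user/1'
```

Note that the helper strips at most one trailing slash from `base` and does not normalise a doubled slash inside `urlPath`; for catalog paths of the form `/user/:id` that is sufficient.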