@sun-asterisk/sungen (npm) 3.2.2-beta.10 → 3.2.2-beta.11

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5)

package/dist/orchestrator/templates/ai-instructions/claude-cmd-create-test.md CHANGED

@@ -77,6 +77,7 @@ If the unit is **api-first** (`qa/api/<name>/` or `qa/api/flows/<name>/`), the d
    - one **viewpoint theme** per shard — a `VP-` prefix from the viewpoint overview (`VP-SEC`, `VP-ERROR-EMPTY-STATE`, `VP-CAROUSEL`, …) — preferred when the viewpoint overview is rich (test-2/home had 47 items across many themes); **or**
    - one **`spec.md` section** per shard (the Mapping Contract walk, Table 1) — preferred when generating from spec.
    Each shard owns a disjoint `VP-` prefix ⇒ ids never collide. One shard → skip to 5c (no fan-out gain).
+   - **Budget-adaptive shard size (S4).** Size the fan-out to your context budget: `N = clamp(ceil(viewpoint_items / items_per_shard), 1, min(16, cores-2))`, where `items_per_shard` is **larger on a ~1M budget** (fewer, bigger shards; more held inline) and **smaller on a ~200k "Claude Standards" budget** (more, tighter shards + aggressive offload). The orchestrator keeps **only the compact summary each generator returns** (pointers to its fragment files) — never the raw fragments in-context. Each generator sees **only its slice** — its theme/section + the **one** matching `sungen-viewpoint` group + the relevant `spec.md` section(s); never load the other groups or the whole spec (lazy = context-cheap). If the budget is too tight even for one shard, **fall back to the sequential path (5d)** — same output, just slower; never fail for lack of budget.
    **5b. Parallel fan-out (Claude Code).** Spawn one **`sungen-generator`** sub-agent **per shard** (Task tool, `subagent_type: sungen-generator`) — issue all the Task calls **in a single message** so they run concurrently. Pass each: its shard (theme/section) + viewpoint slice, the **`sungen-discovery` report** (Step 3), only the `spec.md` section(s) it maps to, which one `sungen-viewpoint` group file holds its patterns, the unit (screen/flow) + name + tier, and its fragment paths `.sungen/fragments/<name>/<shard>.{feature,test-data.yaml}`. Each writes a **headerless** fragment + a test-data fragment and returns a compact summary. Small fragments also keep every generator under the output-token cap (the reason the single-pass path writes incrementally).
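
The shard-count formula added in this hunk, `N = clamp(ceil(viewpoint_items / items_per_shard), 1, min(16, cores-2))`, can be sketched as below. This is a minimal illustration, not code from the package: the function name `shard_count` and the sample `items_per_shard` values are hypothetical, and flooring the upper bound at 1 on machines with two or fewer cores is an assumption, since the formula itself does not say what happens when `cores-2` drops below 1.

```python
import math
import os
from typing import Optional


def shard_count(viewpoint_items: int, items_per_shard: int,
                cores: Optional[int] = None) -> int:
    """Hypothetical helper: number of generator shards for a fan-out."""
    if items_per_shard < 1:
        raise ValueError("items_per_shard must be at least 1")
    if cores is None:
        # os.cpu_count() may return None; fall back to a single core.
        cores = os.cpu_count() or 1
    # Upper bound from the formula: min(16, cores - 2).
    # Assumption: never let the bound fall below 1, so the clamp stays valid.
    upper = max(1, min(16, cores - 2))
    wanted = math.ceil(viewpoint_items / items_per_shard)
    # clamp(wanted, 1, upper)
    return max(1, min(wanted, upper))


# 47 items is the test-2/home figure quoted in the hunk; shard sizes are made up.
print(shard_count(47, 5, cores=12))   # tighter shards (smaller budget) -> 10
print(shard_count(47, 20, cores=12))  # bigger shards (larger budget) -> 3
```

A result of 1 corresponds to the "One shard → skip to 5c" case in the context lines above, where fan-out brings no gain.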

package/dist/orchestrator/templates/ai-instructions/copilot-cmd-create-test.md CHANGED

@@ -66,6 +66,7 @@ If the unit is **api-first** (`qa/api/<name>/` or `qa/api/flows/<name>/`), the d
 4. Follow the `sungen-tc-generation` skill for section identification, viewpoint generation, and output format. **For flows**, use the "Flow Test Generation" section in the skill. When requirements exist, use the "Requirements-Driven Generation" strategy. **For Tier 1**, apply the **Lightweight Guard** — verify required fields, validation rules, business rules, security checks, and key state transitions all have TCs after generation. **For Tier 2+**, **MUST** apply the full **Mapping Contract** — walk every `spec.md` section top-to-bottom and produce the indicated TCs per Table 1; handle `test-viewpoint.md` per Table 2. Do not silently skip sections. Present sections as a numbered list and let user pick.
 5. Generate or update `.feature` + `test-data.yaml` following `sungen-gherkin-syntax` and `sungen-tc-generation` skills. Generate **group-by-group** (one viewpoint group at a time, tier-by-tier `Write`/`Edit` batches) to stay under the output-token cap. **For flows**: use `[Screen:Element]` namespace format, namespace test-data by phase, add `@flow` tag.
    > **No parallel fan-out here.** Copilot has no sub-agents, so generation is sequential (the Claude Code variant fans out one `sungen-generator` per viewpoint group and merges). Same output, no speedup.
+   > **Load lazily (budget, S4).** Per group, load **only** the one matching `sungen-viewpoint` group file + the relevant `spec.md` section(s) — never all five groups or the whole spec. On a tight (~200k) budget, write smaller tier-by-tier batches and keep prior batches on disk; this is what keeps generation inside a non-1M context.
 5.4. **Depth self-check (deterministic — BEFORE the audit).** Run `sungen depth-lint --screen ${input:name}`. It splits every shallow business-critical scenario into **DEEPEN IN PLACE** (add a real value assertion — the printed `template` is a theme-keyed hint, apply judgment to the actual claim; never fake one onto a visibility/behavior scenario) and **CROSS-SCREEN** (route to a flow / tag `@manual:Mx` + reason — removes it from the depth denominator honestly). Act on both, re-run until `deepen` is empty (or only honest over-counts remain), THEN gate. Lifts first-pass `businessDepth` mechanically instead of via 2–3 repair rounds.
 5.5. **Quality gate & repair (harness — always run).** Per `sungen-harness-audit`: run `sungen audit --screen ${input:name}` (structural), THEN do an **independent semantic review inline** using the `sungen-reviewer` criteria (does each scenario's steps PROVE its title/viewpoint? observable Thens? business-critical assertion depth?). Merge both sets of issues; if gate FAILs / findings exist, repair (budget 3) and re-audit — GATE missing theme → generate it (cross-screen → **automate it in the flow** via `/sungen:add-flow`, NOT a full `@manual` screen duplicate — `sungen audit` flags an automatable `@manual` as `MANUAL-AUTOMATABLE`; reserve `@manual:Mx` for true judgment/missing-capability); DEPTH → add data assertions; BALANCE → add business-core first; TRACE → align VP ids. Never fake a pass.
 5.5b. **Phase gate (boundary — do NOT skip).** Run `sungen gate --screen ${input:name} --phase create` (exit 2 = HALT): every required obligation (spec · coverage · depth · trace) must be **satisfied or explicitly waived**. On **HALT**, keep repairing within budget; a genuinely-accepted gap → `sungen journey --screen ${input:name} --waive <OB> --reason "..."` (reason mandatory). Do **not** converge (step 6) past a HALT without a fix or a reasoned waiver.
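
The phase gate in step 5.5b hinges on the exit code of `sungen gate`. A rough sketch of how a wrapper script might act on it follows. Only the command line and "exit 2 = HALT" come from the instructions; the helper names are hypothetical, and treating exit 0 as a pass and any other code as an error is an assumption.

```python
import subprocess
from typing import List

HALT_EXIT_CODE = 2  # stated in the instructions: "exit 2 = HALT"


def gate_command(screen: str, phase: str = "create") -> List[str]:
    """Build the gate invocation quoted in step 5.5b."""
    return ["sungen", "gate", "--screen", screen, "--phase", phase]


def interpret_exit(code: int) -> str:
    """Map an exit code to a decision for the calling workflow."""
    if code == HALT_EXIT_CODE:
        return "HALT"   # keep repairing, or record a reasoned waiver
    if code == 0:
        return "PASS"   # assumption: 0 means every obligation is met or waived
    return "ERROR"      # assumption: any other code is an unexpected failure


def run_gate(screen: str) -> str:
    """Run the gate (requires the sungen CLI on PATH) and classify the result."""
    result = subprocess.run(gate_command(screen), check=False)
    return interpret_exit(result.returncode)
```

On `HALT`, the instructions call for further repair within budget or a waiver with a mandatory reason before moving on to step 6.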

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@sun-asterisk/sungen",
-  "version": "3.2.2-beta.10",
+  "version": "3.2.2-beta.11",
   "description": "Deterministic E2E Test Compiler - Gherkin + Selectors → Playwright tests",
   "main": "src/index.ts",
   "types": "src/index.ts",
@@ -33,7 +33,7 @@
     "node": ">=18.0.0"
   },
   "dependencies": {
-    "@sungen/driver-ui": "3.2.2-beta.10",
+    "@sungen/driver-ui": "3.2.2-beta.11",
     "@anthropic-ai/sdk": "^0.71.0",
     "@babel/parser": "^7.28.5",
     "@babel/traverse": "^7.28.5",

package/src/orchestrator/templates/ai-instructions/claude-cmd-create-test.md CHANGED

@@ -77,6 +77,7 @@ If the unit is **api-first** (`qa/api/<name>/` or `qa/api/flows/<name>/`), the d
    - one **viewpoint theme** per shard — a `VP-` prefix from the viewpoint overview (`VP-SEC`, `VP-ERROR-EMPTY-STATE`, `VP-CAROUSEL`, …) — preferred when the viewpoint overview is rich (test-2/home had 47 items across many themes); **or**
    - one **`spec.md` section** per shard (the Mapping Contract walk, Table 1) — preferred when generating from spec.
    Each shard owns a disjoint `VP-` prefix ⇒ ids never collide. One shard → skip to 5c (no fan-out gain).
+   - **Budget-adaptive shard size (S4).** Size the fan-out to your context budget: `N = clamp(ceil(viewpoint_items / items_per_shard), 1, min(16, cores-2))`, where `items_per_shard` is **larger on a ~1M budget** (fewer, bigger shards; more held inline) and **smaller on a ~200k "Claude Standards" budget** (more, tighter shards + aggressive offload). The orchestrator keeps **only the compact summary each generator returns** (pointers to its fragment files) — never the raw fragments in-context. Each generator sees **only its slice** — its theme/section + the **one** matching `sungen-viewpoint` group + the relevant `spec.md` section(s); never load the other groups or the whole spec (lazy = context-cheap). If the budget is too tight even for one shard, **fall back to the sequential path (5d)** — same output, just slower; never fail for lack of budget.
    **5b. Parallel fan-out (Claude Code).** Spawn one **`sungen-generator`** sub-agent **per shard** (Task tool, `subagent_type: sungen-generator`) — issue all the Task calls **in a single message** so they run concurrently. Pass each: its shard (theme/section) + viewpoint slice, the **`sungen-discovery` report** (Step 3), only the `spec.md` section(s) it maps to, which one `sungen-viewpoint` group file holds its patterns, the unit (screen/flow) + name + tier, and its fragment paths `.sungen/fragments/<name>/<shard>.{feature,test-data.yaml}`. Each writes a **headerless** fragment + a test-data fragment and returns a compact summary. Small fragments also keep every generator under the output-token cap (the reason the single-pass path writes incrementally).

package/src/orchestrator/templates/ai-instructions/copilot-cmd-create-test.md CHANGED

@@ -66,6 +66,7 @@ If the unit is **api-first** (`qa/api/<name>/` or `qa/api/flows/<name>/`), the d
 4. Follow the `sungen-tc-generation` skill for section identification, viewpoint generation, and output format. **For flows**, use the "Flow Test Generation" section in the skill. When requirements exist, use the "Requirements-Driven Generation" strategy. **For Tier 1**, apply the **Lightweight Guard** — verify required fields, validation rules, business rules, security checks, and key state transitions all have TCs after generation. **For Tier 2+**, **MUST** apply the full **Mapping Contract** — walk every `spec.md` section top-to-bottom and produce the indicated TCs per Table 1; handle `test-viewpoint.md` per Table 2. Do not silently skip sections. Present sections as a numbered list and let user pick.
 5. Generate or update `.feature` + `test-data.yaml` following `sungen-gherkin-syntax` and `sungen-tc-generation` skills. Generate **group-by-group** (one viewpoint group at a time, tier-by-tier `Write`/`Edit` batches) to stay under the output-token cap. **For flows**: use `[Screen:Element]` namespace format, namespace test-data by phase, add `@flow` tag.
    > **No parallel fan-out here.** Copilot has no sub-agents, so generation is sequential (the Claude Code variant fans out one `sungen-generator` per viewpoint group and merges). Same output, no speedup.
+   > **Load lazily (budget, S4).** Per group, load **only** the one matching `sungen-viewpoint` group file + the relevant `spec.md` section(s) — never all five groups or the whole spec. On a tight (~200k) budget, write smaller tier-by-tier batches and keep prior batches on disk; this is what keeps generation inside a non-1M context.
 5.4. **Depth self-check (deterministic — BEFORE the audit).** Run `sungen depth-lint --screen ${input:name}`. It splits every shallow business-critical scenario into **DEEPEN IN PLACE** (add a real value assertion — the printed `template` is a theme-keyed hint, apply judgment to the actual claim; never fake one onto a visibility/behavior scenario) and **CROSS-SCREEN** (route to a flow / tag `@manual:Mx` + reason — removes it from the depth denominator honestly). Act on both, re-run until `deepen` is empty (or only honest over-counts remain), THEN gate. Lifts first-pass `businessDepth` mechanically instead of via 2–3 repair rounds.
 5.5. **Quality gate & repair (harness — always run).** Per `sungen-harness-audit`: run `sungen audit --screen ${input:name}` (structural), THEN do an **independent semantic review inline** using the `sungen-reviewer` criteria (does each scenario's steps PROVE its title/viewpoint? observable Thens? business-critical assertion depth?). Merge both sets of issues; if gate FAILs / findings exist, repair (budget 3) and re-audit — GATE missing theme → generate it (cross-screen → **automate it in the flow** via `/sungen:add-flow`, NOT a full `@manual` screen duplicate — `sungen audit` flags an automatable `@manual` as `MANUAL-AUTOMATABLE`; reserve `@manual:Mx` for true judgment/missing-capability); DEPTH → add data assertions; BALANCE → add business-core first; TRACE → align VP ids. Never fake a pass.
 5.5b. **Phase gate (boundary — do NOT skip).** Run `sungen gate --screen ${input:name} --phase create` (exit 2 = HALT): every required obligation (spec · coverage · depth · trace) must be **satisfied or explicitly waived**. On **HALT**, keep repairing within budget; a genuinely-accepted gap → `sungen journey --screen ${input:name} --waive <OB> --reason "..."` (reason mandatory). Do **not** converge (step 6) past a HALT without a fix or a reasoned waiver.