@sun-asterisk/sungen 3.1.2 → 3.2.0-beta.142

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (290)
  1. package/README.md +4 -428
  2. package/dist/capabilities/builtins.d.ts +31 -0
  3. package/dist/capabilities/builtins.d.ts.map +1 -0
  4. package/dist/capabilities/builtins.js +84 -0
  5. package/dist/capabilities/builtins.js.map +1 -0
  6. package/dist/capabilities/context-router.d.ts +34 -0
  7. package/dist/capabilities/context-router.d.ts.map +1 -0
  8. package/dist/capabilities/context-router.js +49 -0
  9. package/dist/capabilities/context-router.js.map +1 -0
  10. package/dist/capabilities/context.d.ts +68 -0
  11. package/dist/capabilities/context.d.ts.map +1 -0
  12. package/dist/capabilities/context.js +17 -0
  13. package/dist/capabilities/context.js.map +1 -0
  14. package/dist/capabilities/discover.d.ts +2 -0
  15. package/dist/capabilities/discover.d.ts.map +1 -0
  16. package/dist/capabilities/discover.js +109 -0
  17. package/dist/capabilities/discover.js.map +1 -0
  18. package/dist/capabilities/registry.d.ts +92 -0
  19. package/dist/capabilities/registry.d.ts.map +1 -0
  20. package/dist/capabilities/registry.js +43 -0
  21. package/dist/capabilities/registry.js.map +1 -0
  22. package/dist/capabilities/sensor.d.ts +52 -0
  23. package/dist/capabilities/sensor.d.ts.map +1 -0
  24. package/dist/capabilities/sensor.js +3 -0
  25. package/dist/capabilities/sensor.js.map +1 -0
  26. package/dist/cli/commands/audit.d.ts.map +1 -1
  27. package/dist/cli/commands/audit.js +17 -11
  28. package/dist/cli/commands/audit.js.map +1 -1
  29. package/dist/cli/commands/capability.d.ts.map +1 -1
  30. package/dist/cli/commands/capability.js +57 -5
  31. package/dist/cli/commands/capability.js.map +1 -1
  32. package/dist/cli/commands/context.d.ts +9 -0
  33. package/dist/cli/commands/context.d.ts.map +1 -0
  34. package/dist/cli/commands/context.js +91 -0
  35. package/dist/cli/commands/context.js.map +1 -0
  36. package/dist/cli/commands/delivery.d.ts.map +1 -1
  37. package/dist/cli/commands/delivery.js +42 -30
  38. package/dist/cli/commands/delivery.js.map +1 -1
  39. package/dist/cli/commands/generate.d.ts.map +1 -1
  40. package/dist/cli/commands/generate.js +35 -8
  41. package/dist/cli/commands/generate.js.map +1 -1
  42. package/dist/cli/commands/ledger.d.ts.map +1 -1
  43. package/dist/cli/commands/ledger.js +15 -5
  44. package/dist/cli/commands/ledger.js.map +1 -1
  45. package/dist/cli/commands/manifest.d.ts.map +1 -1
  46. package/dist/cli/commands/manifest.js +10 -9
  47. package/dist/cli/commands/manifest.js.map +1 -1
  48. package/dist/cli/commands/repair.d.ts +8 -0
  49. package/dist/cli/commands/repair.d.ts.map +1 -0
  50. package/dist/cli/commands/repair.js +97 -0
  51. package/dist/cli/commands/repair.js.map +1 -0
  52. package/dist/cli/commands/script-check.d.ts.map +1 -1
  53. package/dist/cli/commands/script-check.js +13 -9
  54. package/dist/cli/commands/script-check.js.map +1 -1
  55. package/dist/cli/commands/trace.d.ts.map +1 -1
  56. package/dist/cli/commands/trace.js +7 -4
  57. package/dist/cli/commands/trace.js.map +1 -1
  58. package/dist/cli/index.js +14 -1
  59. package/dist/cli/index.js.map +1 -1
  60. package/dist/generators/test-generator/adapters/adapter-interface.d.ts +1 -0
  61. package/dist/generators/test-generator/adapters/adapter-interface.d.ts.map +1 -1
  62. package/dist/generators/test-generator/adapters/playwright/playwright-adapter.d.ts +1 -0
  63. package/dist/generators/test-generator/adapters/playwright/playwright-adapter.d.ts.map +1 -1
  64. package/dist/generators/test-generator/adapters/playwright/playwright-adapter.js.map +1 -1
  65. package/dist/generators/test-generator/adapters/playwright/templates/imports.hbs +3 -0
  66. package/dist/generators/test-generator/code-generator.d.ts +18 -9
  67. package/dist/generators/test-generator/code-generator.d.ts.map +1 -1
  68. package/dist/generators/test-generator/code-generator.js +162 -115
  69. package/dist/generators/test-generator/code-generator.js.map +1 -1
  70. package/dist/generators/test-generator/patterns/index.d.ts +0 -10
  71. package/dist/generators/test-generator/patterns/index.d.ts.map +1 -1
  72. package/dist/generators/test-generator/patterns/index.js +10 -47
  73. package/dist/generators/test-generator/patterns/index.js.map +1 -1
  74. package/dist/generators/test-generator/template-engine.d.ts +1 -0
  75. package/dist/generators/test-generator/template-engine.d.ts.map +1 -1
  76. package/dist/generators/test-generator/template-engine.js +1 -1
  77. package/dist/generators/test-generator/template-engine.js.map +1 -1
  78. package/dist/harness/annotation-overrides.d.ts +11 -0
  79. package/dist/harness/annotation-overrides.d.ts.map +1 -0
  80. package/dist/harness/annotation-overrides.js +38 -0
  81. package/dist/harness/annotation-overrides.js.map +1 -0
  82. package/dist/harness/audit.d.ts +9 -1
  83. package/dist/harness/audit.d.ts.map +1 -1
  84. package/dist/harness/audit.js +140 -10
  85. package/dist/harness/audit.js.map +1 -1
  86. package/dist/harness/capability-plan.d.ts +14 -0
  87. package/dist/harness/capability-plan.d.ts.map +1 -1
  88. package/dist/harness/capability-plan.js +63 -1
  89. package/dist/harness/capability-plan.js.map +1 -1
  90. package/dist/harness/catalog/drivers.yaml +35 -12
  91. package/dist/harness/data-driven-lint.d.ts.map +1 -1
  92. package/dist/harness/data-driven-lint.js +23 -0
  93. package/dist/harness/data-driven-lint.js.map +1 -1
  94. package/dist/harness/flow-check.d.ts +9 -0
  95. package/dist/harness/flow-check.d.ts.map +1 -1
  96. package/dist/harness/flow-check.js +13 -6
  97. package/dist/harness/flow-check.js.map +1 -1
  98. package/dist/harness/intent.d.ts +6 -0
  99. package/dist/harness/intent.d.ts.map +1 -1
  100. package/dist/harness/intent.js +20 -4
  101. package/dist/harness/intent.js.map +1 -1
  102. package/dist/harness/ledger.d.ts.map +1 -1
  103. package/dist/harness/ledger.js +3 -2
  104. package/dist/harness/ledger.js.map +1 -1
  105. package/dist/harness/manifest.d.ts.map +1 -1
  106. package/dist/harness/manifest.js +3 -2
  107. package/dist/harness/manifest.js.map +1 -1
  108. package/dist/harness/parse.d.ts +2 -0
  109. package/dist/harness/parse.d.ts.map +1 -1
  110. package/dist/harness/parse.js +16 -4
  111. package/dist/harness/parse.js.map +1 -1
  112. package/dist/harness/quality-gates.js +1 -1
  113. package/dist/harness/quality-gates.js.map +1 -1
  114. package/dist/harness/query-catalog.d.ts.map +1 -1
  115. package/dist/harness/query-catalog.js +0 -0
  116. package/dist/harness/query-catalog.js.map +1 -1
  117. package/dist/harness/repair.d.ts +20 -0
  118. package/dist/harness/repair.d.ts.map +1 -0
  119. package/dist/harness/repair.js +111 -0
  120. package/dist/harness/repair.js.map +1 -0
  121. package/dist/harness/script-check.d.ts +3 -1
  122. package/dist/harness/script-check.d.ts.map +1 -1
  123. package/dist/harness/script-check.js +22 -8
  124. package/dist/harness/script-check.js.map +1 -1
  125. package/dist/harness/sensors.d.ts +40 -0
  126. package/dist/harness/sensors.d.ts.map +1 -1
  127. package/dist/harness/sensors.js +54 -2
  128. package/dist/harness/sensors.js.map +1 -1
  129. package/dist/harness/trace.d.ts.map +1 -1
  130. package/dist/harness/trace.js +4 -3
  131. package/dist/harness/trace.js.map +1 -1
  132. package/dist/harness/unit-paths.d.ts +3 -0
  133. package/dist/harness/unit-paths.d.ts.map +1 -0
  134. package/dist/harness/unit-paths.js +52 -0
  135. package/dist/harness/unit-paths.js.map +1 -0
  136. package/dist/index.d.ts +22 -0
  137. package/dist/index.d.ts.map +1 -0
  138. package/dist/index.js +36 -0
  139. package/dist/index.js.map +1 -0
  140. package/dist/orchestrator/ai-rules-updater.d.ts.map +1 -1
  141. package/dist/orchestrator/ai-rules-updater.js +2 -0
  142. package/dist/orchestrator/ai-rules-updater.js.map +1 -1
  143. package/dist/orchestrator/context-discovery.d.ts +12 -0
  144. package/dist/orchestrator/context-discovery.d.ts.map +1 -0
  145. package/dist/orchestrator/context-discovery.js +46 -0
  146. package/dist/orchestrator/context-discovery.js.map +1 -0
  147. package/dist/orchestrator/templates/ai-instructions/claude-agent-reviewer.md +7 -1
  148. package/dist/orchestrator/templates/ai-instructions/claude-cmd-create-test.md +10 -5
  149. package/dist/orchestrator/templates/ai-instructions/claude-cmd-run-test.md +18 -1
  150. package/dist/orchestrator/templates/ai-instructions/claude-skill-api-design.md +62 -0
  151. package/dist/orchestrator/templates/ai-instructions/claude-skill-gherkin-syntax.md +1 -0
  152. package/dist/orchestrator/templates/ai-instructions/claude-skill-harness-audit.md +2 -1
  153. package/dist/orchestrator/templates/ai-instructions/claude-skill-tc-generation.md +19 -2
  154. package/dist/orchestrator/templates/ai-instructions/claude-skill-viewpoint.md +14 -0
  155. package/dist/orchestrator/templates/ai-instructions/copilot-cmd-create-test.md +10 -5
  156. package/dist/orchestrator/templates/ai-instructions/copilot-cmd-run-test.md +11 -1
  157. package/dist/orchestrator/templates/ai-instructions/github-skill-sungen-api-design.md +62 -0
  158. package/dist/orchestrator/templates/ai-instructions/github-skill-sungen-gherkin-syntax.md +1 -0
  159. package/dist/orchestrator/templates/ai-instructions/github-skill-sungen-harness-audit.md +2 -1
  160. package/dist/orchestrator/templates/ai-instructions/github-skill-sungen-tc-generation.md +19 -2
  161. package/dist/orchestrator/templates/ai-instructions/github-skill-sungen-viewpoint.md +14 -0
  162. package/dist/orchestrator/templates/specs-api.d.ts +55 -0
  163. package/dist/orchestrator/templates/specs-api.d.ts.map +1 -0
  164. package/dist/orchestrator/templates/specs-api.js +171 -0
  165. package/dist/orchestrator/templates/specs-api.js.map +1 -0
  166. package/dist/orchestrator/templates/specs-api.ts +154 -0
  167. package/dist/orchestrator/templates/specs-db.d.ts +3 -0
  168. package/dist/orchestrator/templates/specs-db.d.ts.map +1 -1
  169. package/dist/orchestrator/templates/specs-db.js +78 -1
  170. package/dist/orchestrator/templates/specs-db.js.map +1 -1
  171. package/dist/orchestrator/templates/specs-db.ts +78 -1
  172. package/dist/orchestrator/templates/specs-test-data.ts +2 -1
  173. package/package.json +7 -30
  174. package/src/capabilities/builtins.ts +85 -0
  175. package/src/capabilities/context-router.ts +66 -0
  176. package/src/capabilities/context.ts +65 -0
  177. package/src/capabilities/discover.ts +62 -0
  178. package/src/capabilities/registry.ts +113 -0
  179. package/src/capabilities/sensor.ts +47 -0
  180. package/src/cli/commands/audit.ts +15 -9
  181. package/src/cli/commands/capability.ts +53 -5
  182. package/src/cli/commands/context.ts +52 -0
  183. package/src/cli/commands/delivery.ts +40 -31
  184. package/src/cli/commands/generate.ts +37 -8
  185. package/src/cli/commands/ledger.ts +13 -5
  186. package/src/cli/commands/manifest.ts +9 -7
  187. package/src/cli/commands/repair.ts +57 -0
  188. package/src/cli/commands/script-check.ts +12 -8
  189. package/src/cli/commands/trace.ts +7 -4
  190. package/src/cli/index.ts +14 -1
  191. package/src/generators/test-generator/adapters/adapter-interface.ts +1 -1
  192. package/src/generators/test-generator/adapters/playwright/playwright-adapter.ts +1 -1
  193. package/src/generators/test-generator/adapters/playwright/templates/imports.hbs +3 -0
  194. package/src/generators/test-generator/code-generator.ts +163 -111
  195. package/src/generators/test-generator/patterns/index.ts +9 -35
  196. package/src/generators/test-generator/template-engine.ts +2 -2
  197. package/src/harness/annotation-overrides.ts +27 -0
  198. package/src/harness/audit.ts +141 -12
  199. package/src/harness/capability-plan.ts +51 -1
  200. package/src/harness/catalog/drivers.yaml +35 -12
  201. package/src/harness/data-driven-lint.ts +20 -0
  202. package/src/harness/flow-check.ts +15 -6
  203. package/src/harness/intent.ts +25 -4
  204. package/src/harness/ledger.ts +3 -2
  205. package/src/harness/manifest.ts +3 -2
  206. package/src/harness/parse.ts +11 -2
  207. package/src/harness/quality-gates.ts +1 -1
  208. package/src/harness/query-catalog.ts +0 -0
  209. package/src/harness/repair.ts +75 -0
  210. package/src/harness/script-check.ts +25 -8
  211. package/src/harness/sensors.ts +71 -2
  212. package/src/harness/trace.ts +4 -3
  213. package/src/harness/unit-paths.ts +14 -0
  214. package/src/index.ts +32 -0
  215. package/src/orchestrator/ai-rules-updater.ts +2 -0
  216. package/src/orchestrator/context-discovery.ts +50 -0
  217. package/src/orchestrator/templates/ai-instructions/claude-agent-reviewer.md +7 -1
  218. package/src/orchestrator/templates/ai-instructions/claude-cmd-create-test.md +10 -5
  219. package/src/orchestrator/templates/ai-instructions/claude-cmd-run-test.md +18 -1
  220. package/src/orchestrator/templates/ai-instructions/claude-skill-api-design.md +62 -0
  221. package/src/orchestrator/templates/ai-instructions/claude-skill-gherkin-syntax.md +1 -0
  222. package/src/orchestrator/templates/ai-instructions/claude-skill-harness-audit.md +2 -1
  223. package/src/orchestrator/templates/ai-instructions/claude-skill-tc-generation.md +19 -2
  224. package/src/orchestrator/templates/ai-instructions/claude-skill-viewpoint.md +14 -0
  225. package/src/orchestrator/templates/ai-instructions/copilot-cmd-create-test.md +10 -5
  226. package/src/orchestrator/templates/ai-instructions/copilot-cmd-run-test.md +11 -1
  227. package/src/orchestrator/templates/ai-instructions/github-skill-sungen-api-design.md +62 -0
  228. package/src/orchestrator/templates/ai-instructions/github-skill-sungen-gherkin-syntax.md +1 -0
  229. package/src/orchestrator/templates/ai-instructions/github-skill-sungen-harness-audit.md +2 -1
  230. package/src/orchestrator/templates/ai-instructions/github-skill-sungen-tc-generation.md +19 -2
  231. package/src/orchestrator/templates/ai-instructions/github-skill-sungen-viewpoint.md +14 -0
  232. package/src/orchestrator/templates/specs-api.ts +154 -0
  233. package/src/orchestrator/templates/specs-db.ts +78 -1
  234. package/src/orchestrator/templates/specs-test-data.ts +2 -1
  235. package/dist/generators/test-generator/patterns/assertion-patterns.d.ts +0 -7
  236. package/dist/generators/test-generator/patterns/assertion-patterns.d.ts.map +0 -1
  237. package/dist/generators/test-generator/patterns/assertion-patterns.js +0 -626
  238. package/dist/generators/test-generator/patterns/assertion-patterns.js.map +0 -1
  239. package/dist/generators/test-generator/patterns/capture-patterns.d.ts +0 -21
  240. package/dist/generators/test-generator/patterns/capture-patterns.d.ts.map +0 -1
  241. package/dist/generators/test-generator/patterns/capture-patterns.js +0 -87
  242. package/dist/generators/test-generator/patterns/capture-patterns.js.map +0 -1
  243. package/dist/generators/test-generator/patterns/database-patterns.d.ts +0 -6
  244. package/dist/generators/test-generator/patterns/database-patterns.d.ts.map +0 -1
  245. package/dist/generators/test-generator/patterns/database-patterns.js +0 -95
  246. package/dist/generators/test-generator/patterns/database-patterns.js.map +0 -1
  247. package/dist/generators/test-generator/patterns/form-patterns.d.ts +0 -6
  248. package/dist/generators/test-generator/patterns/form-patterns.d.ts.map +0 -1
  249. package/dist/generators/test-generator/patterns/form-patterns.js +0 -160
  250. package/dist/generators/test-generator/patterns/form-patterns.js.map +0 -1
  251. package/dist/generators/test-generator/patterns/interaction-patterns.d.ts +0 -6
  252. package/dist/generators/test-generator/patterns/interaction-patterns.d.ts.map +0 -1
  253. package/dist/generators/test-generator/patterns/interaction-patterns.js +0 -433
  254. package/dist/generators/test-generator/patterns/interaction-patterns.js.map +0 -1
  255. package/dist/generators/test-generator/patterns/keyboard-patterns.d.ts +0 -7
  256. package/dist/generators/test-generator/patterns/keyboard-patterns.d.ts.map +0 -1
  257. package/dist/generators/test-generator/patterns/keyboard-patterns.js +0 -47
  258. package/dist/generators/test-generator/patterns/keyboard-patterns.js.map +0 -1
  259. package/dist/generators/test-generator/patterns/navigation-patterns.d.ts +0 -6
  260. package/dist/generators/test-generator/patterns/navigation-patterns.d.ts.map +0 -1
  261. package/dist/generators/test-generator/patterns/navigation-patterns.js +0 -125
  262. package/dist/generators/test-generator/patterns/navigation-patterns.js.map +0 -1
  263. package/dist/generators/test-generator/patterns/scope-patterns.d.ts +0 -7
  264. package/dist/generators/test-generator/patterns/scope-patterns.d.ts.map +0 -1
  265. package/dist/generators/test-generator/patterns/scope-patterns.js +0 -36
  266. package/dist/generators/test-generator/patterns/scope-patterns.js.map +0 -1
  267. package/dist/generators/test-generator/patterns/scroll-patterns.d.ts +0 -7
  268. package/dist/generators/test-generator/patterns/scroll-patterns.d.ts.map +0 -1
  269. package/dist/generators/test-generator/patterns/scroll-patterns.js +0 -25
  270. package/dist/generators/test-generator/patterns/scroll-patterns.js.map +0 -1
  271. package/dist/generators/test-generator/patterns/setup-patterns.d.ts +0 -6
  272. package/dist/generators/test-generator/patterns/setup-patterns.d.ts.map +0 -1
  273. package/dist/generators/test-generator/patterns/setup-patterns.js +0 -72
  274. package/dist/generators/test-generator/patterns/setup-patterns.js.map +0 -1
  275. package/dist/generators/test-generator/patterns/table-patterns.d.ts +0 -19
  276. package/dist/generators/test-generator/patterns/table-patterns.d.ts.map +0 -1
  277. package/dist/generators/test-generator/patterns/table-patterns.js +0 -239
  278. package/dist/generators/test-generator/patterns/table-patterns.js.map +0 -1
  279. package/docs/orchestration-spec.md +0 -267
  280. package/src/generators/test-generator/patterns/assertion-patterns.ts +0 -691
  281. package/src/generators/test-generator/patterns/capture-patterns.ts +0 -97
  282. package/src/generators/test-generator/patterns/database-patterns.ts +0 -96
  283. package/src/generators/test-generator/patterns/form-patterns.ts +0 -167
  284. package/src/generators/test-generator/patterns/interaction-patterns.ts +0 -465
  285. package/src/generators/test-generator/patterns/keyboard-patterns.ts +0 -51
  286. package/src/generators/test-generator/patterns/navigation-patterns.ts +0 -140
  287. package/src/generators/test-generator/patterns/scope-patterns.ts +0 -40
  288. package/src/generators/test-generator/patterns/scroll-patterns.ts +0 -27
  289. package/src/generators/test-generator/patterns/setup-patterns.ts +0 -76
  290. package/src/generators/test-generator/patterns/table-patterns.ts +0 -279
@@ -0,0 +1,75 @@
+/**
+ * Repair planner (#343) — the consumer of the `repair` capability SPI.
+ *
+ * Gathers the unit-capability's fix rules and matches them against the audit findings (always) and
+ * the latest Playwright failures (best-effort), turning them into a concrete fix plan. Deterministic:
+ * the AI repair loop and a human get the same proposals. Backs `sungen repair`.
+ */
+import * as fs from 'fs';
+import * as path from 'path';
+import { capabilityRegistry } from '../capabilities/registry';
+import { discoverAndRegisterCapabilities } from '../capabilities/discover';
+import { scoringCapabilityFor } from './audit';
+
+export interface RepairProposal { source: 'audit' | 'runtime'; signal: string; ruleId: string; fix: string }
+export interface RepairPlan {
+  capability: string | undefined;
+  rulesAvailable: number;
+  proposals: RepairProposal[];
+  unmatched: string[]; // findings/failures with no matching rule (need a human)
+}
+
+/** Collect failure messages from a Playwright JSON result file (best-effort, defensive). */
+function failuresFromResult(file: string): string[] {
+  const out: string[] = [];
+  try {
+    const r = JSON.parse(fs.readFileSync(file, 'utf8'));
+    const visit = (suite: any) => {
+      for (const sp of suite.specs ?? []) {
+        for (const t of sp.tests ?? []) {
+          for (const res of t.results ?? []) {
+            if (res.status === 'failed' || res.status === 'timedOut') {
+              const msg = res.error?.message || res.errors?.[0]?.message || res.status;
+              out.push(`${sp.title}: ${String(msg).split('\n')[0].slice(0, 200)}`);
+            }
+          }
+        }
+      }
+      for (const s of suite.suites ?? []) visit(s);
+    };
+    for (const s of r.suites ?? []) visit(s);
+  } catch { /* missing/!json → no runtime signals */ }
+  return out;
+}
+
+/**
+ * Build the repair plan for a unit.
+ * @param unitId capability-resolution id (`api/<area>`, `flows/<flow>`, or a screen)
+ * @param reportName the bare name used for `.sungen/reports/<name>-audit.json` (+ test-result)
+ * @param generatedDir the unit's specs/generated dir (for runtime failures); optional
+ */
+export function planRepair(unitId: string, reportName: string, cwd: string, generatedDir?: string): RepairPlan {
+  discoverAndRegisterCapabilities();
+  const capId = scoringCapabilityFor(unitId, capabilityRegistry.defaultCapabilityId());
+  const rules = (capId ? capabilityRegistry.get(capId)?.repair?.rules : undefined) ?? [];
+
+  const signals: { source: 'audit' | 'runtime'; text: string }[] = [];
+  const auditPath = path.join(cwd, '.sungen', 'reports', `${reportName}-audit.json`);
+  if (fs.existsSync(auditPath)) {
+    try { for (const f of JSON.parse(fs.readFileSync(auditPath, 'utf8')).findings ?? []) signals.push({ source: 'audit', text: String(f) }); } catch { /* ignore */ }
+  }
+  if (generatedDir && fs.existsSync(generatedDir)) {
+    for (const f of fs.readdirSync(generatedDir)) {
+      if (/test-result.*\.json$/.test(f)) for (const msg of failuresFromResult(path.join(generatedDir, f))) signals.push({ source: 'runtime', text: msg });
+    }
+  }
+
+  const proposals: RepairProposal[] = [];
+  const unmatched: string[] = [];
+  for (const s of signals) {
+    const rule = rules.find((r) => r.match.test(s.text));
+    if (rule) proposals.push({ source: s.source, signal: s.text, ruleId: rule.id, fix: rule.fix });
+    else unmatched.push(s.text);
+  }
+  return { capability: capId, rulesAvailable: rules.length, proposals, unmatched };
+}
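The matching loop at the end of `planRepair` can be tried in isolation. The sketch below is a minimal stand-alone version under stated assumptions: the `DemoFixRule` shape is inferred only from the fields the hunk reads (`id`, `match`, `fix`), and the demo rules and signals are invented for illustration, not taken from any sungen capability.

```typescript
// Hypothetical shapes, inferred only from how planRepair reads rule.id, rule.match and rule.fix.
interface DemoFixRule { id: string; match: RegExp; fix: string }
interface DemoSignal { source: 'audit' | 'runtime'; text: string }
interface DemoProposal { source: 'audit' | 'runtime'; signal: string; ruleId: string; fix: string }

// First matching rule wins; signals without a rule are kept for a human to triage.
function matchSignals(
  rules: DemoFixRule[],
  signals: DemoSignal[],
): { proposals: DemoProposal[]; unmatched: string[] } {
  const proposals: DemoProposal[] = [];
  const unmatched: string[] = [];
  for (const s of signals) {
    const rule = rules.find((r) => r.match.test(s.text));
    if (rule) proposals.push({ source: s.source, signal: s.text, ruleId: rule.id, fix: rule.fix });
    else unmatched.push(s.text);
  }
  return { proposals, unmatched };
}

// Invented demo data (assumption, for illustration only).
const demoRules: DemoFixRule[] = [
  { id: 'demo-shallow', match: /shallow/i, fix: 'Add a data assertion to the scenario.' },
  { id: 'demo-timeout', match: /timeout/i, fix: 'Wait for the element before asserting.' },
];
const demoPlan = matchSignals(demoRules, [
  { source: 'audit', text: 'scenario "checkout total" is shallow' },
  { source: 'runtime', text: 'Login works: Timeout 5000ms exceeded' },
  { source: 'runtime', text: 'Cart: net::ERR_CONNECTION_REFUSED' },
]);
console.log(demoPlan.proposals.map((p) => p.ruleId), demoPlan.unmatched);
```

One design point worth keeping in mind with this pattern: `RegExp.prototype.test` advances `lastIndex` on regexes that carry the `g` or `y` flag, so rules matched this way are safer without those flags.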
@@ -16,6 +16,7 @@ import * as fs from 'fs';
 import * as path from 'path';
 import * as os from 'os';
 import { loadScenarios, ScenarioInfo } from './parse';
+import { featureBasename } from './unit-paths';
 
 export interface ScriptCheckResult {
   screen: string;
@@ -67,6 +68,9 @@ export function analyzeFaithfulness(specSrc: string, automatedTitles: Set<string
   const hollowSteps: { test: string; step: string }[] = [];
   for (const blk of extractTestBlocks(specSrc)) {
     if (!automatedTitles.has(blk.title)) continue; // only non-@manual scenarios
+    // TQ-11 — a capability-pending @requires scenario compiles to a `test.skip(true, …)` stub:
+    // it intentionally proves nothing here (it runs once the driver is added), so it is not a bypass.
+    if (blk.body.some((l) => /\btest\.skip\(\s*true\b/.test(l))) continue;
     const body = blk.body;
     // An assertion is a Playwright `expect(...)` OR a Data Driver DB assertion
     // (`db.assertRow/assertNoRow/assertCount/...`) — a DB check is a real oracle, so a
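The TQ-11 guard above hinges on one regular expression. As a rough illustration of what it does and does not match, the sketch below applies the same pattern to a few hand-written lines; the helper name `isPendingStub` and the sample lines are hypothetical.

```typescript
// Same pattern as in the hunk: an unconditional `test.skip(true, …)` call.
const SKIP_STUB = /\btest\.skip\(\s*true\b/;

// Hypothetical helper: true when any line of a test body is an unconditional skip stub.
function isPendingStub(body: string[]): boolean {
  return body.some((line) => SKIP_STUB.test(line));
}

const stubBody = ["  test.skip(true, 'requires a driver that is not installed yet');"];
const spacedBody = ['test.skip( true );'];
const conditionalBody = ["  test.skip(process.env.CI === '1', 'flaky on CI');"];
const lookalikeBody = ['  mytest.skip(true);', '  test.skip(trueish);'];

console.log(isPendingStub(stubBody), isPendingStub(spacedBody));
console.log(isPendingStub(conditionalBody), isPendingStub(lookalikeBody));
```

The word boundaries keep conditional skips and look-alike identifiers out, so only the literal `true` form is exempted from the faithfulness check.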
@@ -106,9 +110,18 @@ function normalize(src: string): string {
     .trim();
 }
 
-function findSpec(dir: string, name: string, flowMode: boolean): string | null {
+/** The unit kind drives the generated-spec subdir + the qa source dir. */
+export type UnitKind = 'screen' | 'flow' | 'api';
+
+/** Generated-spec subdir for a unit: screen → <name>, flow → flows/<name>, api → api/<name>. */
+function specSubdir(dir: string, name: string, kind: UnitKind): string {
+  return kind === 'flow' ? path.join(dir, 'flows', name) : kind === 'api' ? path.join(dir, 'api', name) : path.join(dir, name);
+}
+
+function findSpec(dir: string, name: string, kind: UnitKind): string | null {
   // Screens compile to <dir>/<name>/<feature>.spec.ts
   // Flows compile to <dir>/flows/<name>/<feature>.spec.ts
+  // Api compile to <dir>/api/<name>/<feature>.spec.ts
   // Scope the search to THIS target's own subdir — otherwise the first spec of
   // ANY other screen/flow is returned, which (for an uncompiled flow) falsely
   // reports the wrong screen's tests as drift.
@@ -121,19 +134,19 @@ function findSpec(dir: string, name: string, flowMode: boolean): string | null {
       else if (e.name.endsWith('.spec.ts')) hits.push(p);
     }
   };
-  const scoped = flowMode ? path.join(dir, 'flows', name) : path.join(dir, name);
+  const scoped = specSubdir(dir, name, kind);
   if (!fs.existsSync(scoped)) return null; // no spec for this target (e.g. not compiled yet)
   walk(scoped);
   return hits[0] ?? null;
 }
 
-export async function runScriptCheck(screenDir: string, screenName: string, flowMode: boolean): Promise<ScriptCheckResult> {
-  const featurePath = path.join(screenDir, 'features', `${screenName}.feature`);
+export async function runScriptCheck(screenDir: string, screenName: string, kind: UnitKind): Promise<ScriptCheckResult> {
+  const featurePath = path.join(screenDir, 'features', `${featureBasename(screenName)}.feature`);
   const scenarios = loadScenarios(featurePath);
   const automated = scenarios.filter((s) => !s.manual);
   const manual = scenarios.filter((s) => s.manual);
 
-  const committedSpec = findSpec(path.join(process.cwd(), 'specs', 'generated'), screenName, flowMode);
+  const committedSpec = findSpec(path.join(process.cwd(), 'specs', 'generated'), screenName, kind);
 
   const findings: string[] = [];
   let specTitles: string[] = [];
@@ -167,10 +180,14 @@ export async function runScriptCheck(screenDir: string, screenName: string, flow
     try {
       const { CodeGenerator } = require('../generators/test-generator/code-generator');
       const tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'sungen-scriptcheck-'));
-      const qaSourceDir = path.join(process.cwd(), 'qa', flowMode ? 'flows' : 'screens');
-      const gen = new CodeGenerator({ framework: 'playwright', screenName, runtimeData: true, flowMode });
+      const qaSourceDir = path.join(process.cwd(), 'qa', kind === 'flow' ? 'flows' : kind === 'api' ? 'api' : 'screens');
+      // api units derive their unit id (api/<area>) from the feature path like `generate --api`;
+      // screen/flow pass screenName + flowMode explicitly (unchanged → byte-identical regenerate).
+      const gen = kind === 'api'
+        ? new CodeGenerator({ framework: 'playwright', runtimeData: true })
+        : new CodeGenerator({ framework: 'playwright', screenName, runtimeData: true, flowMode: kind === 'flow' });
       await gen.generateAllTests(qaSourceDir, tmp, [featurePath]);
-      const fresh = findSpec(tmp, screenName, flowMode);
+      const fresh = findSpec(tmp, screenName, kind);
       if (fresh) {
         const a = normalize(specSrc);
         const b = normalize(fs.readFileSync(fresh, 'utf-8'));
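The script-check hunks above replace the boolean `flowMode` with a three-way `UnitKind`. A minimal sketch of the two directory mappings that follow from it is shown here. It is an illustration only: it joins with `/` instead of calling `path.join` so that it stays dependency-free, and the function names and the sample unit names are hypothetical.

```typescript
type DemoUnitKind = 'screen' | 'flow' | 'api';

// Generated-spec subdir: screen -> <dir>/<name>, flow -> <dir>/flows/<name>, api -> <dir>/api/<name>.
function demoSpecSubdir(dir: string, name: string, kind: DemoUnitKind): string {
  const parts = kind === 'flow' ? [dir, 'flows', name] : kind === 'api' ? [dir, 'api', name] : [dir, name];
  return parts.join('/');
}

// QA source dir under qa/: flows, api, or screens (the default).
function demoQaSourceDir(kind: DemoUnitKind): string {
  return ['qa', kind === 'flow' ? 'flows' : kind === 'api' ? 'api' : 'screens'].join('/');
}

// Hypothetical unit names, for illustration only.
console.log(demoSpecSubdir('specs/generated', 'login', 'screen'));
console.log(demoSpecSubdir('specs/generated', 'checkout', 'flow'));
console.log(demoSpecSubdir('specs/generated', 'orders', 'api'));
console.log(demoQaSourceDir('api'));
```

A union type makes the third case explicit at every call site, which a second boolean flag alongside `flowMode` would not.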
@@ -111,6 +111,11 @@ export interface DepthResult {
   businessCriticalShallow: number; // = depth-required scenarios that are shallow
   bcDepthRatio: number; // fraction of depth-required scenarios with a real data assertion
   shallowBusinessCritical: { name: string; category?: string }[];
+  // @manual scenarios that would be business-critical if automated (match a data-theme).
+  // They are EXCLUDED from bcDepthRatio, so deferring them to @manual collapses the
+  // denominator and inflates the ratio toward 1.0 — reported so a high ratio on a tiny
+  // denominator isn't misread as "all good" (TQ-3).
+  deferredBusinessCritical: number;
   // Depth-as-Gate (harness-roadmap P1)
   focus: string; // intent focus driving the threshold
   threshold: number; // required bcDepthRatio for this focus
@@ -124,6 +129,16 @@ const DEPTH_THRESHOLDS: Record<string, number> = {
 };
 const WARN_ONLY_FOCUS = new Set(['smoke']);
 
+/** The required businessDepth ratio for a focus (default `functional` = 0.7). Shared so a capability
+ * gate (e.g. the API gate, which computes its own depth) uses the SAME thresholds as the UI gate. */
+export function depthThresholdFor(focus: string): number {
+  return DEPTH_THRESHOLDS[focus] ?? DEPTH_THRESHOLDS.functional;
+}
+/** Whether a depth miss only WARNs (vs FAILs) for a focus (smoke). */
+export function depthWarnOnly(focus: string): boolean {
+  return WARN_ONLY_FOCUS.has(focus);
+}
+
 /**
  * Depth = do DATA-correctness scenarios actually assert DATA (not just visibility)?
  * "Depth-required" is CATALOG-DRIVEN: only scenarios matching a theme whose
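To make the fallback behaviour of the two new helpers concrete, here is a small sketch. The contents of `DEPTH_THRESHOLDS` are not visible in this diff, so the table below is an assumption: only the `functional` value of 0.7 and the `smoke` warn-only focus come from the hunk, and the `smoke` threshold of 0.3 is invented for illustration.

```typescript
// ASSUMED table: only functional = 0.7 is stated in the diff; the smoke value is made up.
const DEMO_DEFAULT_THRESHOLD = 0.7;
const DEMO_THRESHOLDS: Record<string, number> = {
  functional: DEMO_DEFAULT_THRESHOLD,
  smoke: 0.3,
};
const DEMO_WARN_ONLY = new Set(['smoke']);

// Unknown focuses fall back to the functional threshold.
function demoThresholdFor(focus: string): number {
  return DEMO_THRESHOLDS[focus] ?? DEMO_DEFAULT_THRESHOLD;
}

// A depth miss is downgraded to a warning for warn-only focuses.
function demoWarnOnly(focus: string): boolean {
  return DEMO_WARN_ONLY.has(focus);
}

// Hypothetical gate: compare a measured ratio against the threshold for a focus.
function demoVerdict(ratio: number, focus: string): 'PASS' | 'WARN' | 'FAIL' {
  if (ratio >= demoThresholdFor(focus)) return 'PASS';
  return demoWarnOnly(focus) ? 'WARN' : 'FAIL';
}

console.log(demoVerdict(0.5, 'functional'), demoVerdict(0.1, 'smoke'), demoVerdict(0.8, 'not-a-focus'));
```

The `demoVerdict` function is not part of the diff; it only shows how a second gate could reuse the same two lookups rather than carrying its own copy of the thresholds.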
@@ -151,6 +166,8 @@ export function assertionDepth(
 
   const required = nonManual.filter(isDepthRequired);
   const reqShallow = required.filter((s) => s.shallow);
+  // Business-critical scenarios deferred to @manual (match a data-theme but excluded above).
+  const deferredBusinessCritical = scenarios.filter((s) => s.manual && isDepthRequired(s)).length;
   // No data-theme scenarios on this screen → depth is not the binding constraint
   // (the viewpoint gate already flags missing data themes). Don't double-penalize.
   const ratio = required.length ? 1 - reqShallow.length / required.length : 1;
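The reason `deferredBusinessCritical` is reported alongside the ratio can be shown with a small worked example. The sketch below assumes that `nonManual` means "scenarios without `@manual`" and replaces `isDepthRequired` with a plain boolean field; both are simplifications, and the scenario data is invented.

```typescript
// Simplified stand-in for ScenarioInfo; depthRequired replaces the catalog-driven isDepthRequired().
interface DemoScenario { name: string; manual: boolean; depthRequired: boolean; shallow: boolean }

function demoDepth(scenarios: DemoScenario[]): { ratio: number; denominator: number; deferred: number } {
  const required = scenarios.filter((s) => !s.manual && s.depthRequired);
  const reqShallow = required.filter((s) => s.shallow);
  const deferred = scenarios.filter((s) => s.manual && s.depthRequired).length;
  const ratio = required.length ? 1 - reqShallow.length / required.length : 1;
  return { ratio, denominator: required.length, deferred };
}

// Four depth-required scenarios, two of them shallow.
const demoBefore: DemoScenario[] = [
  { name: 'total is recalculated', manual: false, depthRequired: true, shallow: false },
  { name: 'stock is decremented', manual: false, depthRequired: true, shallow: false },
  { name: 'discount is applied', manual: false, depthRequired: true, shallow: true },
  { name: 'tax is itemised', manual: false, depthRequired: true, shallow: true },
];
// Same suite after the two shallow scenarios are tagged @manual instead of being deepened.
const demoAfter: DemoScenario[] = demoBefore.map((s) => (s.shallow ? { ...s, manual: true } : s));

console.log(demoDepth(demoBefore)); // ratio 0.5 over 4 scenarios, 0 deferred
console.log(demoDepth(demoAfter)); // ratio 1 over 2 scenarios, 2 deferred
```

The ratio moves from 0.5 to 1 without any test getting deeper; the deferred count is what makes that visible.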
@@ -167,12 +184,64 @@ export function assertionDepth(
167
184
  businessCriticalShallow: reqShallow.length,
168
185
  bcDepthRatio: ratio,
169
186
  shallowBusinessCritical: reqShallow.map((s) => ({ name: s.name, category: s.category })),
187
+ deferredBusinessCritical,
170
188
  focus,
171
189
  threshold,
172
190
  verdict,
173
191
  };
174
192
  }
175
193
 
194
+ // ---------- Sensor 2b: Automatable-@manual (TQ-2) ----------
195
+
196
+ export interface AutomatableManualResult {
197
+ manualTotal: number; // all @manual scenarios
198
+ automatable: number; // @manual that are actually automatable
199
+ scenarios: { name: string; category?: string }[]; // the automatable ones (to surface)
200
+ }
201
+
202
+ // Genuine-judgment markers (M6/M8/M9 territory): visual/responsive/a11y/mock/network/
203
+ // external/empty-state — these legitimately stay @manual (or need a future driver).
204
+ const JUDGMENT_MARKER =
205
+ /\b(visual|responsive|layout|breakpoint|mobile|tablet|viewport|accessib|a11y|keyboard|screen reader|focus order|\baria\b|empty[- ]?(state|product|list|category|cart)|no[- ]?result|missing (image|product|data)|placeholder|fallback|slow|failing|offline|network|loading|spinner|external|new tab|video tutorial|email|mailbox|download|payment gateway|exploratory|not worth)\b/;
206
+
207
+ /**
208
+ * Automatable-@manual (TQ-2) — a `@manual` scenario whose steps are fully DSL-expressible
209
+ * (it carries a real data assertion) and shows no genuine-judgment marker is *automatable*:
210
+ * it was deferred (typically cross-screen → a flow) rather than truly un-automatable. Leaving
211
+ * it `@manual` creates a non-running duplicate AND inflates businessDepth (it's excluded from
212
+ * the ratio). The UI analog of the API driver's `api-manual-automatable`.
213
+ */
214
+ export function automatableManual(scenarios: ScenarioInfo[]): AutomatableManualResult {
215
+ const manual = scenarios.filter((s) => s.manual);
216
+ const automatable = manual.filter((s) => s.hasDataAssertion && !JUDGMENT_MARKER.test(s.haystack));
217
+ return {
218
+ manualTotal: manual.length,
219
+ automatable: automatable.length,
220
+ scenarios: automatable.map((s) => ({ name: s.name, category: s.category })),
221
+ };
222
+ }
223
+
224
+ // ---------- TQ-4: deferral-aware coverage credit ----------
225
+
226
+ /**
227
+ * Which of the given gate gap-themes are deeply covered by a FLOW scenario (a cross-screen
228
+ * deferral the flow actually fulfils). Returns theme → covering flow. The screen audit uses
229
+ * this to credit an inherently-cross-screen theme to the flow that owns it, instead of
230
+ * double-counting it as a screen gap. A flow scenario covers a theme when its haystack hits
231
+ * the theme keywords AND it carries a data assertion (`deep`).
232
+ */
233
+ export function flowCoveredThemes(
234
+ gaps: { theme: string; keywords: string[] }[],
235
+ flowScenarios: { flow: string; haystack: string; deep: boolean }[],
236
+ ): { theme: string; flow: string }[] {
237
+ const out: { theme: string; flow: string }[] = [];
238
+ for (const g of gaps) {
239
+ const hit = flowScenarios.find((s) => s.deep && g.keywords.some((k) => s.haystack.includes(k.toLowerCase())));
240
+ if (hit) out.push({ theme: g.theme, flow: hit.flow });
241
+ }
242
+ return out;
243
+ }
244
+
176
245
  /** Collect data-correctness themes (depth.requires) for a page-type + universal. */
177
246
  export function dataThemesFor(catalog: Catalog, pageType: string | null): CatalogTheme[] {
178
247
  const themes: CatalogTheme[] = [];
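The two helpers added in this hunk (`automatableManual` and `flowCoveredThemes`) can be exercised on toy data. The following is a minimal sketch, not the package's code: `ScenarioSketch` is a hypothetical stand-in for `ScenarioInfo`, `JUDGMENT_SKETCH` is a deliberately shortened marker list, the scenario and flow names are invented, and the haystack is assumed to be lower-cased text.

```typescript
// Hypothetical stand-in for the harness's ScenarioInfo (only the fields used here).
interface ScenarioSketch {
  name: string;
  category?: string;
  manual: boolean;
  hasDataAssertion: boolean;
  haystack: string; // assumption: lower-cased title + steps text
}

// Shortened, illustrative marker list; the real JUDGMENT_MARKER is much longer.
const JUDGMENT_SKETCH = /\b(visual|responsive|a11y|network|external)\b/;

// A @manual scenario with a data assertion and no judgment marker counts as automatable.
function automatableManualSketch(scenarios: ScenarioSketch[]) {
  const manual = scenarios.filter((s) => s.manual);
  const automatable = manual.filter((s) => s.hasDataAssertion && !JUDGMENT_SKETCH.test(s.haystack));
  return {
    manualTotal: manual.length,
    automatable: automatable.length,
    scenarios: automatable.map((s) => ({ name: s.name, category: s.category })),
  };
}

// A gap theme is credited to the first deep flow scenario whose haystack hits a keyword.
function flowCoveredThemesSketch(
  gaps: { theme: string; keywords: string[] }[],
  flowScenarios: { flow: string; haystack: string; deep: boolean }[],
): { theme: string; flow: string }[] {
  const out: { theme: string; flow: string }[] = [];
  for (const g of gaps) {
    const hit = flowScenarios.find((s) => s.deep && g.keywords.some((k) => s.haystack.includes(k.toLowerCase())));
    if (hit) out.push({ theme: g.theme, flow: hit.flow });
  }
  return out;
}

const sampleScenarios: ScenarioSketch[] = [
  { name: 'Cart shows the added product', category: 'cart', manual: true, hasDataAssertion: true,
    haystack: 'cart shows the added product user see [cart] row with {{name}}' },
  { name: 'Layout on tablet', manual: true, hasDataAssertion: true, haystack: 'responsive layout on tablet' },
  { name: 'Login happy path', manual: false, hasDataAssertion: true, haystack: 'login happy path' },
];

const manualReport = automatableManualSketch(sampleScenarios);

const credited = flowCoveredThemesSketch(
  [
    { theme: 'cart-correctness', keywords: ['Cart'] },
    { theme: 'filter-result-correctness', keywords: ['filter'] },
  ],
  [
    { flow: 'checkout', haystack: 'add to cart then see cart row with {{name}}', deep: true },
    { flow: 'browse', haystack: 'apply filter', deep: false },
  ],
);
```

In this sample only the cart scenario is reported as automatable, because the tablet scenario carries a judgment marker. Only `cart-correctness` is credited to a flow, because the `browse` flow matches the filter keyword but is not `deep`.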
@@ -384,8 +453,8 @@ const CLAIM_RULES: ClaimRule[] = [
384
453
  // "double-click does not create two orders" — not a per-feature keyword.
385
454
  claim: 'no-side-effect/no-duplicate',
386
455
  title: /(?=.*\b(submit|sen[dt]|resend|resubmit|re-?fire|re-?issue|re-?post|repost|create|charge|order|payment|\bpay\b|email|request|\botp\b|insert|register|book|duplicate|double[- ]?submit|again|twice)\b)(?=.*(\bno\b|\bnot\b|n['’]t\b|\bnever\b|\bwithout\b|\bcannot\b|prevent|block|avoid|reject|disabl|\bdeny\b|denies|\bkhông\b|\bchưa\b))/i,
387
- proof: /\bcount\b|row with \{\{|table with|tohavecount|is hidden|are hidden|not complete|no longer/,
388
- need: 'a record/request-count proof (count stays at one, e.g. `User see [Table] row with {{count}}`) or @manual with a request-count oracle',
456
+ proof: /\bcount\b|ok_count|status_counts|row with \{\{|table with|tohavecount|is hidden|are hidden|not complete|no longer/,
457
+ need: 'a record/request-count proof (count stays at one, e.g. `User see [Table] row with {{count}}`, an API `{{name.ok_count}}` invariant, or a `@query` DB count) or @manual with a request-count oracle',
389
458
  hint: 'a "does-not-happen / does-not-repeat" claim about a state-changing action is NOT proven by a terminal `see [...] page` — that page is identical whether or not the action (re-)fired. Prove the side-effect count is unchanged, or mark @manual with a setup→action→assert-no-duplicate oracle.',
390
459
  severity: 'fail',
391
460
  },
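One way to see why `ok_count` had to be added to `proof` as its own alternative: `_` is a word character, so `\bcount\b` finds no word boundary inside `ok_count`. The sketch below copies the old and new `proof` patterns from the hunk above and tries them on three illustrative step strings. The step strings and the assumption that the harness tests lower-cased step text are mine, not taken from the package.

```typescript
// The `proof` pattern before and after this change, copied from the hunk above.
const proofBefore = /\bcount\b|row with \{\{|table with|tohavecount|is hidden|are hidden|not complete|no longer/;
const proofAfter = /\bcount\b|ok_count|status_counts|row with \{\{|table with|tohavecount|is hidden|are hidden|not complete|no longer/;

// Illustrative step texts (assumption: the harness matches lower-cased steps).
const apiStep = 'expect {{pay.ok_count}} is 1';
const uiStep = 'user see [table] row with {{count}}';
const terminalStep = 'user see [sent] page';

const proofResults = {
  apiBefore: proofBefore.test(apiStep), // `\bcount\b` cannot match inside `ok_count`
  apiAfter: proofAfter.test(apiStep), // matched by the explicit `ok_count` alternative
  uiBefore: proofBefore.test(uiStep),
  uiAfter: proofAfter.test(uiStep),
  terminalAfter: proofAfter.test(terminalStep), // a terminal page check is still not a proof
};
```

Under these assumptions the API invariant step is accepted only by the new pattern, the UI row-count step is accepted by both, and a terminal `see [...] page` step is rejected by both.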
@@ -13,6 +13,7 @@
13
13
  */
14
14
  import * as fs from 'fs';
15
15
  import * as path from 'path';
16
+ import { reportSlug } from './unit-paths';
16
17
  import { segmentRuns, latestRunEvents, LedgerEvent } from './ledger';
17
18
 
18
19
  interface ManualItem { scenario: string; reason: string }
@@ -22,7 +23,7 @@ function readJson(p: string): any | null {
22
23
  }
23
24
 
24
25
  function readLedger(screen: string): any[] {
25
- const p = path.join(process.cwd(), '.sungen', 'ledger', `${screen}.jsonl`);
26
+ const p = path.join(process.cwd(), '.sungen', 'ledger', `${reportSlug(screen)}.jsonl`);
26
27
  if (!fs.existsSync(p)) return [];
27
28
  return fs.readFileSync(p, 'utf-8').split('\n').filter(Boolean).map((l) => { try { return JSON.parse(l); } catch { return null; } }).filter(Boolean);
28
29
  }
@@ -76,7 +77,7 @@ export function buildTrace(screenDir: string, screenName: string): TraceReport {
76
77
  const recordedSteps = [...new Set(ledger.map((e) => e.step.replace(/:\d+$/, '')))];
77
78
  const missingSteps = EXPECTED_PHASES.filter((p) => !recordedSteps.includes(p));
78
79
 
79
- const auditRaw = readJson(path.join(process.cwd(), '.sungen', 'reports', `${screenName}-audit.json`));
80
+ const auditRaw = readJson(path.join(process.cwd(), '.sungen', 'reports', `${reportSlug(screenName)}-audit.json`));
80
81
  let audit: TraceReport['audit'] = null;
81
82
  if (auditRaw) {
82
83
  const subs: Record<string, number> = {
@@ -91,7 +92,7 @@ export function buildTrace(screenDir: string, screenName: string): TraceReport {
91
92
  };
92
93
  }
93
94
 
94
- const scRaw = readJson(path.join(process.cwd(), '.sungen', 'reports', `${screenName}-script-check.json`));
95
+ const scRaw = readJson(path.join(process.cwd(), '.sungen', 'reports', `${reportSlug(screenName)}-script-check.json`));
95
96
  const drift = scRaw ? scRaw.drift : null;
96
97
 
97
98
  const manual = parseManual(path.join(screenDir, 'features', `${screenName}.feature`));
@@ -0,0 +1,14 @@
1
+ /**
2
+ * Unit-path helpers (api-flow fix). A unit id may be a bare name (`orders`, `login`) or a nested
3
+ * api-flow id (`flows/<flow>`). Two derivations the harness/CLI need:
4
+ * - featureBasename: the `.feature` filename — the LAST path segment (`flows/x` → `x`), so
5
+ * `<dir>/features/<basename>.feature` resolves (the bug: the full id looked for
6
+ * `features/flows/x.feature` → 0 scenarios).
7
+ * - reportSlug: a flat key for `.sungen/reports/<slug>-*.json` + `.sungen/ledger/<slug>.jsonl`
8
+ * (`flows/x` → `flows-x`), so artifacts never nest under a `flows/` subdir and read/write agree.
9
+ * Bare names (no slash) are unchanged by both → no regression for screens/flows/areas.
10
+ */
11
+ import * as path from 'path';
12
+
13
+ export const featureBasename = (unit: string): string => path.basename(unit);
14
+ export const reportSlug = (unit: string): string => unit.replace(/[\\/]+/g, '-');
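The two derivations described in the comment above can be sketched on a few unit ids. This is an illustration under assumptions: `featureBasenameSketch` is a dependency-free stand-in for the package's `path.basename` call, and the unit names (`orders`, `checkout`) are hypothetical.

```typescript
// Dependency-free stand-in for path.basename (the package itself uses Node's `path`).
const featureBasenameSketch = (unit: string): string => {
  const segments = unit.split(/[\\/]+/).filter(Boolean);
  return segments.length ? segments[segments.length - 1] : unit;
};

// Same expression as reportSlug in the diff: every run of slashes becomes one dash.
const reportSlugSketch = (unit: string): string => unit.replace(/[\\/]+/g, '-');

// Hypothetical unit ids: a bare screen, a flow, and a nested api flow.
const unitIds = ['orders', 'flows/checkout', 'api/flows/checkout'];

const derived = unitIds.map((unit) => ({
  unit,
  feature: `features/${featureBasenameSketch(unit)}.feature`,
  ledger: `.sungen/ledger/${reportSlugSketch(unit)}.jsonl`,
  audit: `.sungen/reports/${reportSlugSketch(unit)}-audit.json`,
}));
```

Bare names pass through both helpers unchanged, while nested ids keep only their last segment for the feature file and get a flat, slash-free key for ledger and report artifacts.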
package/src/index.ts ADDED
@@ -0,0 +1,32 @@
1
+ /**
2
+ * Public API of `@sun-asterisk/sungen` — the capability SPI plus the shared compiler/harness surface
3
+ * that capability drivers (`@sungen/driver-*`) build against. Drivers import from here; core never
4
+ * imports from a driver (discovery loads them at runtime). Keep this surface small and intentional.
5
+ */
6
+
7
+ // --- Capability SPI ---
8
+ export { capabilityRegistry, CapabilityRegistry } from './capabilities/registry';
9
+ export type { CapabilityDescriptor } from './capabilities/registry';
10
+ export type { Sensor, SensorFinding, AdvisoryScanInput, GateInput } from './capabilities/sensor';
11
+ export type { Context, DiscoveryProvider, ContextMapper, GenerationUnit, RepairProvider, RepairRule } from './capabilities/context';
12
+ export { discoverUnitContext } from './orchestrator/context-discovery';
13
+ export type { DiscoveredContext } from './orchestrator/context-discovery';
14
+
15
+ // --- Step-pattern authoring (a driver contributes step patterns via its descriptor) ---
16
+ export type { PatternContext, StepPattern, StepTemplateData } from './generators/test-generator/patterns/types';
17
+ export type { MappedStep } from './generators/test-generator/step-mapper';
18
+ export type { ParsedStep } from './generators/gherkin-parser';
19
+ export { getPathCode, inferPath, resolvePathVariables } from './generators/test-generator/utils/path-inference';
20
+
21
+ // --- Precondition-annotation override grammar (shared by the @query / @api driver codegen) ---
22
+ export { parseQueryOverrides } from './harness/annotation-overrides';
23
+
24
+ // --- Named-query catalog (shared: the DB driver's codegen + core's data-driven advisory lint) ---
25
+ export { resolveQuery, compileQuery, lintCatalog } from './harness/query-catalog';
26
+ export type { QueryEntry } from './harness/query-catalog';
27
+
28
+ // --- Shared harness: viewpoint catalog + coverage gate / assertion depth ---
29
+ // (the UI capability's gateProvider composes these; they also back core's ingest + audit fallback)
30
+ export { loadCatalog, viewpointGate, assertionDepth, dataThemesFor, depthThresholdFor, depthWarnOnly } from './harness/sensors';
31
+ export type { Catalog, GateResult, DepthResult } from './harness/sensors';
32
+ export type { ScenarioInfo, ViewpointEntry } from './harness/parse';
@@ -47,6 +47,7 @@ export const AI_RULES_FILE_MAPPING: [string, string][] = [
47
47
  ['claude-skill-selector-fix.md', '.claude/skills/sungen-selector-fix/SKILL.md'],
48
48
  ['claude-skill-tc-review.md', '.claude/skills/sungen-tc-review/SKILL.md'],
49
49
  ['claude-skill-harness-audit.md', '.claude/skills/sungen-harness-audit/SKILL.md'],
50
+ ['claude-skill-api-design.md', '.claude/skills/sungen-api-design/SKILL.md'],
50
51
  ['claude-skill-ingest-legacy.md', '.claude/skills/sungen-ingest-legacy/SKILL.md'],
51
52
  ['claude-skill-viewpoint.md', '.claude/skills/sungen-viewpoint/SKILL.md'],
52
53
  ['claude-skill-viewpoint-group-a-data-entry.md', '.claude/skills/sungen-viewpoint/group-a-data-entry.md'],
@@ -79,6 +80,7 @@ export const AI_RULES_FILE_MAPPING: [string, string][] = [
79
80
  ['github-skill-sungen-selector-fix.md', '.github/skills/sungen-selector-fix/SKILL.md'],
80
81
  ['github-skill-sungen-tc-review.md', '.github/skills/sungen-tc-review/SKILL.md'],
81
82
  ['github-skill-sungen-harness-audit.md', '.github/skills/sungen-harness-audit/SKILL.md'],
83
+ ['github-skill-sungen-api-design.md', '.github/skills/sungen-api-design/SKILL.md'],
82
84
  ['github-skill-sungen-ingest-legacy.md', '.github/skills/sungen-ingest-legacy/SKILL.md'],
83
85
  ['github-skill-sungen-viewpoint.md', '.github/skills/sungen-viewpoint/SKILL.md'],
84
86
  ['github-skill-sungen-viewpoint-group-a-data-entry.md', '.github/skills/sungen-viewpoint/group-a-data-entry.md'],
@@ -0,0 +1,50 @@
1
+ /**
2
+ * Orchestration phase: Discover → Contextualize (AO-3).
3
+ *
4
+ * The capability SPI declared `discovery` (sources → Context slice) + `contextMapper` (Context →
5
+ * generation units) since R1, but nothing consumed them. This is the deterministic consumer the
6
+ * orchestration loop (`/sungen:design`) calls before generating: resolve the unit's capability,
7
+ * run ITS discovery + contextMapper, and hand back the normalized Context + the generation-unit
8
+ * work-list. Capability-agnostic — a screen, an api area, or a future `mobile/` unit all flow here.
9
+ */
10
+ import { capabilityRegistry } from '../capabilities/registry';
11
+ import { discoverAndRegisterCapabilities } from '../capabilities/discover';
12
+ import type { Context, GenerationUnit } from '../capabilities/context';
13
+
14
+ export interface DiscoveredContext {
15
+ capability: string;
16
+ context: Context;
17
+ units: GenerationUnit[];
18
+ }
19
+
20
+ /** Map a unit id (relative to qa/) to a discovery target. */
21
+ function targetForUnit(unitId: string): Context['target'] {
22
+ if (unitId.startsWith('api/')) return { kind: 'api', id: unitId };
23
+ if (unitId.startsWith('flows/')) return { kind: 'flow', id: unitId };
24
+ return { kind: 'screen', id: unitId };
25
+ }
26
+
27
+ /**
28
+ * Run the Discover → Contextualize phase for a unit. `unitId` is the catalog-resolution id
29
+ * (`<screen>` · `flows/<flow>` · `api/<area>` · `api/flows/<flow>`).
30
+ */
31
+ export async function discoverUnitContext(unitId: string, cwd: string = process.cwd()): Promise<DiscoveredContext> {
32
+ discoverAndRegisterCapabilities();
33
+ const target = targetForUnit(unitId);
34
+ // The discovering capability is the first whose DiscoveryProvider claims the target; else the default.
35
+ const cap = capabilityRegistry.all().find((c) => c.discovery?.appliesTo?.(target))
36
+ ?? (capabilityRegistry.defaultCapabilityId() ? capabilityRegistry.get(capabilityRegistry.defaultCapabilityId()!) : undefined);
37
+
38
+ let context: Context = { target: { ...target, capability: cap?.id }, sources: {}, facts: {} };
39
+ if (cap?.discovery) {
40
+ const slice = await cap.discovery.discover(target, { cwd });
41
+ context = {
42
+ target: { ...target, capability: cap.id },
43
+ sources: { ...context.sources, ...(slice.sources ?? {}) },
44
+ facts: { ...context.facts, ...(slice.facts ?? {}) },
45
+ ...(slice.connectivity ? { connectivity: slice.connectivity } : {}),
46
+ };
47
+ }
48
+ const units = cap?.contextMapper?.decompose(context) ?? [];
49
+ return { capability: cap?.id ?? 'ui', context, units };
50
+ }
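The routing in `targetForUnit` and the "first capability that claims the target, else the default" selection can be sketched without the real registry. In the sketch below `CapabilitySketch` and `pickCapabilitySketch` are hypothetical simplifications (the package reads `appliesTo` from `c.discovery` and resolves the default through `capabilityRegistry`), and the unit ids are invented.

```typescript
type TargetKind = 'api' | 'flow' | 'screen';
interface TargetSketch { kind: TargetKind; id: string }

// Hypothetical, flattened capability shape; in the package `appliesTo` lives on `discovery`.
interface CapabilitySketch { id: string; appliesTo?: (t: TargetSketch) => boolean }

// Same prefix checks, in the same order, as targetForUnit in the diff.
function targetForUnitSketch(unitId: string): TargetSketch {
  if (unitId.startsWith('api/')) return { kind: 'api', id: unitId };
  if (unitId.startsWith('flows/')) return { kind: 'flow', id: unitId };
  return { kind: 'screen', id: unitId };
}

// First capability that claims the target wins; otherwise fall back to the default id.
function pickCapabilitySketch(
  caps: CapabilitySketch[],
  target: TargetSketch,
  defaultId?: string,
): CapabilitySketch | undefined {
  return caps.find((c) => c.appliesTo?.(target)) ?? caps.find((c) => c.id === defaultId);
}

const capsSketch: CapabilitySketch[] = [
  { id: 'ui' }, // declares no appliesTo, so it never claims a target explicitly
  { id: 'api', appliesTo: (t) => t.kind === 'api' },
];

const routed = ['login', 'flows/checkout', 'api/orders', 'api/flows/refund'].map((unitId) => {
  const target = targetForUnitSketch(unitId);
  return { unitId, kind: target.kind, capability: pickCapabilitySketch(capsSketch, target, 'ui')?.id };
});
```

Because the `api/` prefix is checked first, an id such as `api/flows/refund` is routed as kind `api` rather than `flow`. Screens and UI flows fall through to the default capability in this sketch.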
@@ -17,8 +17,14 @@ You are an **independent Senior QA Reviewer**. You did **not** write these tests
17
17
  - **Negative / "does-not-happen" claims** (any language — "does not", "no", "prevents", "không", "chưa"): the proof must be a step whose result **differs** between the claim holding and not holding. Ask: *would this `Then` still pass if the bad thing happened?* If yes, it proves nothing. The classic trap: title "browser back does **not** re-submit" with `Then see [sent] page` — that page is identical whether or not the request re-fired. Demand a **contrast/count** proof (record count unchanged, state hidden/empty, error shown) or a justified `@manual` with a setup→action→assert-absence oracle. This generalises to every side-effect (re-charge, duplicate order, resend OTP, data leak), not just re-submit.
18
18
  2. **Observable Then.** Is each `Then` an **observable outcome**, not a restated action or a tautology (e.g. `Then User see [Carousel] section` after clicking next — proves nothing changed)?
19
19
  3. **Business-critical depth.** For cart / product-detail / filter / list viewpoints, do steps assert **DATA** (name, price, quantity, all-items-belong) — not just page/modal visibility? Recommend the concrete deep step: `User remember [X] text as {{v}}` + `... with {{v}}`, or `User see all [X] contain {{v}}`.
20
- 4. **@manual justification.** Is each `@manual` genuinely unautomatable (cross-screen/external/visual) — or a cop-out to dodge the gate? Cross-screen → should be a flow.
20
+ 4. **@manual justification.** Is each `@manual` genuinely unautomatable (external/visual/a11y/mock-needed/judgment) — or a cop-out to dodge the gate? **Cross-screen is NOT a valid `@manual` reason** — a home→detail/cart journey runs as one automated test, so it belongs in a **flow** (`/sungen:add-flow`), not a `@manual` screen copy. A `@manual` scenario that still carries full automatable steps (a data assertion, no visual/mock/a11y judgment) is automatable — flag it (the gate reports it as `MANUAL-AUTOMATABLE`). Genuine `@manual` must name its reason (`@manual:Mx`).
21
21
  5. **Meaning-level duplicates & missing criticals** the keyword gate can't see.
22
+ 6. **API units** (`qa/api/<area>/` — `@api` scenarios, no UI). Judge what the api gate can't:
23
+ - **Prove the effect, not the status.** A mutating endpoint's success path asserting only `{{r.status}} is 201` proves nothing about WHAT changed — demand a **body** assertion (`{{r.body.id}}` / `{{r.body.<field>}}`), a **`@query`** DB side-effect, or (idempotency) a `{{r.ok_count}}` invariant. This is the API businessDepth bar.
24
+ - **Error matrix coherent.** `@cases` rows are a real failure family (validation/auth/conflict) with realistic inputs → declared statuses, not padding.
25
+ - **Flows self-clean.** A CRUD/auth chain deletes what it created (final `@api:delete_*`) or is `@cleanup`-tagged.
26
+ - **Idempotency uses the DB oracle.** A "no double-charge / exactly once" claim is proven by `@concurrent` + a `@query` count, not HTTP status alone (status can lie under a race).
27
+ - **Auth negatives** exist for protected mutations (401/403), not just the happy path.
22
28
 
23
29
  ## Output (do NOT edit any file)
24
30
  Return a concise verdict:
@@ -23,7 +23,11 @@ You are a **Senior QA Engineer** specialized in test case design. You structure
23
23
 
24
24
  Parse **name** from `$ARGUMENTS`. If missing, ask the user.
25
25
 
26
- **Auto-detect context**: check if `qa/flows/<name>/` exists → flow mode. Else check `qa/screens/<name>/` → screen mode. This determines paths, generation strategy, and CLI commands.
26
+ **Auto-detect context**: check if `qa/api/<name>/` or `qa/api/flows/<name>/` exists → **API unit mode** (below). Else if `qa/flows/<name>/` → flow mode. Else `qa/screens/<name>/` → screen mode. This determines paths, generation strategy, and CLI commands.
27
+
28
+ ## API unit mode (driver-api)
29
+
30
+ If the unit is **api-first** (`qa/api/<name>/` or `qa/api/flows/<name>/`), the design loop differs — **no visual capture, no selectors**; the contract is the named-endpoint catalog. **Follow the `sungen-api-design` skill end-to-end** instead of the screen/flow steps below: `sungen context --area <name>` (discover) → API viewpoint overview → generate `@api`/`@cases`/flow/`@concurrent`/`@query` scenarios → **`sungen audit --area <name>` gate + the `sungen-reviewer` sub-agent + repair loop to businessDepth ≥ 0.7** → record + trace. Then jump to the "Converge" next-step options (recommend `/sungen:run-test <name>`). The capture / viewpoint-group / selector steps do **not** apply.
27
31
 
28
32
  ## Steps
29
33
 
@@ -31,9 +35,10 @@ Parse **name** from `$ARGUMENTS`. If missing, ask the user.
31
35
  **Screen**: Verify `qa/screens/<name>/` exists. If not → `/sungen:add-screen` first.
32
36
  2. Check if `.feature` file already has scenarios.
33
37
  - If yes → use `AskUserQuestion` to ask the update mode (see `sungen-tc-generation` skill — mode depends on which tiers already exist).
34
- - If no → fresh creation. Use `AskUserQuestion` to ask generation scope:
35
- - **Tier 1 — Critical & High priority** — ~10-15 scenarios/section covering happy paths, core validation, security basics **(Recommended)**
36
- - **Full coverage — All tiers at once** — generates Tier 1 + 2 + 3 in one run. Large output (~40-60 scenarios/section), best for experienced users who want complete coverage immediately
38
+ - If no → fresh creation. **Write the feature file incrementally** (successive `Write`/`Edit`, ≈10-15 scenarios per call) — never emit the whole suite in one response, or it can exceed the model's output-token cap (`API Error: Claude's response exceeded the N output token maximum`). Use `AskUserQuestion` to ask generation scope:
39
+ - **Tier 1 — Critical & High priority** — ~10-15 scenarios/section: happy paths, core validation, security basics **(Recommended)**
40
+ - **Full coverage (incremental)** — Tier 1 + 2 + 3, written tier-by-tier in batches (`Write` T1 → `Edit` append T2 → `Edit` append T3). Safe on any output-token budget.
41
+ - **Full coverage (single pass)** — generate everything in one go (~40-60 scenarios/section). Faster, but **only if you raised your output cap** (`CLAUDE_CODE_MAX_OUTPUT_TOKENS ≥ 64000`) — otherwise it errors mid-generation. For power users on a high-token model/config.
37
42
  3. **Read project context + screen requirements**
38
43
 
39
44
  **Project context** — check `qa/context.md` (project root, not screen-specific):
@@ -73,7 +78,7 @@ Parse **name** from `$ARGUMENTS`. If missing, ask the user.
73
78
  - **Independent semantic review.** **Claude Code:** spawn the **`sungen-reviewer`** sub-agent (Task tool, `subagent_type: sungen-reviewer`) — it judges what the gate can't (does each scenario's steps PROVE its title/viewpoint, observable Thens, business-critical assertion depth) and returns `VERDICT` + `ISSUES` with concrete fixes. **Merge its NEEDS-REPAIR issues with the audit findings.** (Copilot / no sub-agents: run the same review inline using the `sungen-reviewer` criteria.)
74
79
  - Repair **both** the audit findings and the reviewer issues (budget 3 rounds), then re-audit:
75
80
  - If the gate FAILs or there are findings, **repair** (budget 3 rounds), then re-audit:
76
- - **GATE** missing critical theme → generate scenarios for it. If it is **cross-screen** (cart-correctness, product-detail-consistency, filter-result-correctness): write the scenario with **observable data assertions** (`... with {{value}}`, `table ... with {{value}}`), tag it `@manual`, and add a comment `# Deferred to a flow (<screen> -> <target>) for automation`. Do **not** fake a shallow single-screen pass.
81
+ - **GATE** missing critical theme → generate scenarios for it. If it is **cross-screen** (cart-correctness, product-detail-consistency, filter-result-correctness): **automate it in the flow** (`/sungen:add-flow` if none exists) with observable data assertions (`... with {{value}}`, `see all ... contain {{v}}`); a single home→target journey runs as one Playwright test. Do **not** write a full `@manual` duplicate of it on the screen (that is a non-running dead copy — `sungen audit` flags it `MANUAL-AUTOMATABLE`), and do **not** fake a shallow single-screen pass. Reserve `@manual` for true judgment / missing-capability, tagged `@manual:Mx`.
77
82
  - **DEPTH** → replace `see [X] page/section` on business-critical scenarios with data assertions.
78
83
  - **BALANCE** → stop expanding secondary viewpoints; add business-core scenarios first.
79
84
  - **TRACE** → align `VP-` ids with the viewpoint-overview.
@@ -30,7 +30,23 @@ If the count is 0 → use `AskUserQuestion` to offer:
30
30
 
31
31
  Skip this pre-flight when `--env` matches the base locale (no overlay needed in that case).
32
32
 
33
- **Auto-detect context**: check if `qa/flows/<name>/` exists → flow mode (base path: `qa/flows/<name>/`). Else check `qa/screens/<name>/` → screen mode (base path: `qa/screens/<name>/`).
33
+ **Auto-detect context**: check if `qa/api/<name>/` or `qa/api/flows/<name>/` exists → **API unit mode** (below). Else if `qa/flows/<name>/` → flow mode (base path: `qa/flows/<name>/`). Else `qa/screens/<name>/` → screen mode (base path: `qa/screens/<name>/`).
34
+
35
+ ## API unit mode (driver-api) — no selectors
36
+
37
+ If the unit is **api-first**, skip every selector/capture phase (an API test has no DOM). Instead:
38
+
39
+ 1. **Resolve the datasource** — ensure the `kind: api` datasource's `base_url` + auth are wired in `qa/datasources.yaml` + `.env.qa` (the `${X_URL}` key from `sungen api init`). A `production` datasource is refused unless `SUNGEN_ALLOW_PROD=1`.
40
+ 2. **Compile**: `[ -x ./bin/sungen.js ] && ./bin/sungen.js generate --area <name> || npx sungen generate --area <name>` → `specs/generated/api/<name>/`.
41
+ 3. **Run**: `npx playwright test specs/generated/api/<name>/<name>.spec.ts` (per-spec JSON results, as below).
42
+ 4. **Auto-fix** (no selectors — the failure classes differ): use `sungen-error-mapping`.
43
+ - **401/403** → wire `@hybrid` + `@auth:<role>` (reuse the UI session) or the catalog `Bearer :token` header; suggest `sungen makeauth <role>`.
44
+ - **datasource/base_url unresolved** → set the `${X_URL}` key in `.env.qa`.
45
+ - **missing/empty bound param** → trace `{{var}}` to test-data or a prior `@api` response; fill it.
46
+ - **`expect.status` mismatch** → reconcile against `apis.yaml`/spec (the catalog is the oracle); **never hand-edit the generated spec** (re-`generate --area` instead).
47
+ - **400 "parameter missing" / body ignored** → the endpoint wants a form body, not JSON → set `encoding: form` (or `multipart`) on the catalog entry, re-`generate --area`. Don't mark the scenario `@manual`.
48
+ - **flaky** → enforce self-cleaning flows, per-row isolation (`@cases`), `@concurrent` caps.
49
+ 5. **Integrity + trace** — `sungen script-check --area <name>` (verify the spec is a 1:1 of the Gherkin; on DRIFT re-`generate --area`, never hand-edit) and `sungen trace --area <name>` (process map + HUMAN-LOOP FOCUS). Then report + offer next steps.
34
50
 
35
51
  ## Pre-run (phased — per `sungen-selector-fix` skill)
36
52
 
@@ -86,6 +102,7 @@ Skip this pre-flight when `--env` matches the base locale (no overlay needed in
86
102
  9. **Integrity check & trace (always run after the final run).**
87
103
  - `sungen script-check --screen <name>` — verify the generated spec is a **1:1** of the Gherkin (every non-@manual scenario ↔ one `test()`, no drift). If it reports **DRIFT** (spec hand-edited or stale), re-run `sungen generate --screen <name>` so the spec matches the feature, then re-run — **never hand-edit the generated spec** (auto-fix must edit `selectors.yaml`, not the `.spec.ts`).
88
104
  - `sungen ledger record --screen <name> --step run --ms <elapsed>` (record this run), then `sungen trace --screen <name>` — show the process map + bottlenecks + **HUMAN-LOOP FOCUS** (the @manual scenarios the QA must verify) to the user.
105
+ 10. **Capability-pending offer (consent-gated).** If `sungen audit --screen <name>` reports `AUTOMATION-READY-PENDING` (or the run shows `@requires:<cap>` tests skipped "requires …"), these are **automation-ready** scenarios waiting on an opt-in driver. Use `AskUserQuestion` to offer: *"N scenario(s) are automation-ready — enable `<cap>` to run them? (`sungen capability add <cap>`)"*. **Only on the user's yes** run `sungen capability add <cap>` then re-run those specs; on no, leave them skipped (they are NOT failures and NOT manual). **Never auto-install.**
89
106
 
90
107
  ## Playwright command guidelines
91
108
 
@@ -0,0 +1,62 @@
1
+ ---
2
+ name: sungen-api-design
3
+ description: The API-first design loop for an api unit (qa/api/<area> or qa/api/flows/<flow>) — discover the catalog, lay out the API viewpoints, generate @api/@cases/flow/@concurrent scenarios, then drive the sungen audit --area gate + reviewer + repair to a high businessDepth (≥0.7). Use when create-test/run-test detects an api unit (no selectors, no visual capture).
4
+ ---
5
+
6
+ # API design loop (driver-api · Orchestration + Harness)
7
+
8
+ Use this when the unit is **api-first** — `qa/api/<area>/` or `qa/api/flows/<flow>/`. There are **no selectors and no visual capture**: the contract is the **named-endpoint catalog** (`api/apis.yaml`), referenced by `@api:<name>`. QA writes **no HTTP code**. Full annotation reference: the **API Steps** guide (`@api` / `@cases` / flows / `@concurrent` / `@hybrid`).
9
+
10
+ ## The loop (mirror of /sungen:design, API-native)
11
+
12
+ ### 1. Discover (no capture)
13
+ Run `sungen context --area <name>` — it reads the catalog and prints the **endpoints** + the **generation units** (one `matrix` unit per endpoint, an `async` unit per mutating endpoint, a `flow` unit for an api flow). Read `qa/api/<name>/requirements/spec.md` if present. No `apis.yaml` yet? → `sungen api import <openapi|csv>` or `sungen api add --area <name>` first.
14
+
15
+ ### 2. API viewpoint overview (by method-profile)
16
+ For each endpoint, cover its viewpoints — severity-weighted by method:
17
+
18
+ | Profile | Endpoints | Must cover | Then |
19
+ |---|---|---|---|
20
+ | read | GET, HEAD | `contract` (status + body shape) | `pagination`/`filter` (list), `not-found` (by-id) |
21
+ | mutating | POST/PUT/PATCH/DELETE | `contract`, `error` (validation/4xx/auth) | `idempotency` (`@concurrent`), `side-effect` (`@query`) |
22
+
23
+ Bands: **~70%** success+failure matrix · **~20%** flows (auth/CRUD chains) · **~10%** async/idempotency.
24
+
25
+ ### 3. Generate (incremental — never the whole suite in one Write)
26
+ - **Contract**: `@api:<name>` + `expect {{name.status}} is …` **and a body assertion** (`{{name.body.<path>}}`).
27
+ - **Error matrix**: `@api:<name>(p={{p}}) @cases:<dataset>` — one scenario, a dataset of `input → expected status`.
28
+ - **Flow**: ordered `@api` tags threading a prior response (`token={{login.body.token}}` → the catalog `Bearer :token` header; `id={{create.body.id}}` → a path param). Self-clean (delete what you create).
29
+ - **Idempotency**: `@api:<name> @concurrent:N` + `expect {{name.ok_count}} is 1`, cross-checked with `@query` (the DB is the oracle).
30
+
31
+ ### 4. Gate + repair (always — businessDepth ≥ 0.7 is the bar)
32
+ Run `sungen audit --area <name>`; read `gateStatus` + `findings`. Then the **semantic reviewer** (sungen-reviewer sub-agent, API criteria). Repair **both** (budget 3 rounds), re-audit until PASS:
33
+
34
+ | Finding | Repair |
35
+ |---|---|
36
+ | `VIEWPOINT-API-CONTRACT` | the endpoint is invoked but its response is never asserted → add `expect {{name.status}}` + a `{{name.body.…}}` check |
37
+ | `VIEWPOINT-API-ERROR` | a mutating endpoint has no failure scenario → add a `@cases` error matrix (or an explicit 4xx) |
38
+ | `VIEWPOINT-API-IDEMPOTENCY` | a mutating endpoint has no race check → add `@concurrent:N` + a `@query` DB cross-check |
39
+ | `VIEWPOINT-API-MANUAL-AUTOMATABLE` | a `@manual` scenario whose endpoint resolves is automatable → drop `@manual`, use `@api` (+ `@cases`); reserve `@manual` for genuine judgment cases |
40
+ | **`DEPTH-FAIL`** (businessDepth < 0.7) | a **mutating success** scenario asserts only `status` → make it **prove the effect**: assert a response **body** field, a **`@query`** side-effect, or a **`@concurrent` `ok_count`** invariant. (An error/`@cases` scenario proving the status is correct — it is *not* depth-required.) |
41
+
42
+ Stop when the gate PASSes + businessDepth ≥ 0.7, or the budget is exhausted → report residual gaps honestly (mark genuinely-unautomatable cases `@manual` with an oracle). Never fake a pass.
43
+
44
+ ### 5. Record + converge
45
+ `sungen manifest --area <name>` (reuse) and ledger each phase; show the trace + the HUMAN-LOOP FOCUS. (Integrity `script-check`/`trace` for api: see run-test.)
46
+
47
+ ## Taxonomy (label scenarios correctly)
48
+
49
+ | Class | What | Examples |
50
+ |---|---|---|
51
+ | **Functional** | single-endpoint behaviour | happy contract · error/validation (`@cases`) · boundary/edge |
52
+ | **Functional — flow/integration** | multi-endpoint journeys | auth/CRUD lifecycle (`create → login → get → delete`), cross-endpoint invariants |
53
+ | **Non-Functional** | performance · reliability · **security** · concurrency/idempotency | `@concurrent` race/idempotency |
54
+
55
+ A flow (`create → login → delete`) is a **Functional integration** test, **not** non-functional — don't file it under "Non-Functional". Reserve non-functional for perf/security/concurrency.
56
+
57
+ ## Rules
58
+ - **No HTTP, no selectors** — only `.feature` + the reviewed `apis.yaml` + `test-data`.
59
+ - **Non-prod default** — a `production` datasource is refused unless `SUNGEN_ALLOW_PROD=1`.
60
+ - **The DB is the oracle** for idempotency/side-effects — HTTP status alone can lie; pair `@api` with `@query`.
61
+ - **`@parallel` + mutating endpoints** — give each scenario **isolated data** (a `{{$uuid}}` email, a `@cases` row, or its own created resource) and **self-clean** (delete what it created); shared inputs race under parallel execution.
62
+ - **No dead data** — every `test-data` key must be bound into a scenario (`{{key}}`, a `@cases` dataset, or an override). `sungen audit`/the generate lint flag unreferenced keys.
@@ -213,6 +213,7 @@ Options: `nth` `exact` `scope` `match` `variant` `frame` `contenteditable` `colu
213
213
  | `@flow` | Mark feature as E2E flow (cross-screen testing) |
214
214
  | `@cases:dataset` | Data-driven: run the scenario once per row of the `dataset` LIST in test-data → one `test()` per row |
215
215
  | `@query:name` | Database: run the named query from `database/queries.yaml` (precondition) and bind its rows to `{{name}}`; assert with `expect {{name.count}} …` + path access. Override params `@query:name(p={{v}})`. Repeatable. (Optional Data Driver — see Database verification above) |
216
+ | `@api:name` | API: run the named request from `api/apis.yaml` (precondition) and bind the response to `{{name}}`; assert with `expect {{name.status}} …` + path access (`{{name.body.<path>}}`). Override params `@api:name(p={{v}})`. Repeatable. (Optional API Driver) |
216
217
 
217
218
  ### Data-driven scenarios (`@cases`)
218
219
 
@@ -58,7 +58,7 @@ Use these when repairing GATE/DEPTH findings for the hard viewpoints (cart/detai
58
58
  ```
59
59
  `see all [X] contain {{v}}` asserts EVERY matching element contains the value → "all displayed products belong to the selected category/brand", not just one.
60
60
 
61
- > Cross-screen flows (home → detail/cart): if the target screen is a separate screen, prefer a **flow** (`/sungen:add-flow`) so the journey is one test. On a single screen, keep the cross-screen assertion but tag `@manual` with a `# Deferred to a flow` comment.
61
+ > Cross-screen flows (home → detail/cart): **automate the journey as a flow** (`/sungen:add-flow`); it runs as one test, so it is automatable. Do **not** keep a full `@manual` duplicate of it on the screen (a non-running dead copy that `sungen audit` flags as `MANUAL-AUTOMATABLE` and that inflates nothing; deferred business-critical depth is reported as `DEPTH-DEFERRED`). The screen keeps its screen-contract; the flow owns the cross-screen depth. `@manual` is for genuine judgment / missing-capability only, tagged `@manual:Mx`.
62
62
 
63
63
  ## Repair loop rules
64
64
 
@@ -66,6 +66,7 @@ Use these when repairing GATE/DEPTH findings for the hard viewpoints (cart/detai
66
66
  2. **Stop when** `gateStatus == PASS` AND `findings` empty — or budget exhausted.
67
67
  3. **Never fake a pass.** A shallow `see [Cart] page` does not satisfy `cart-correctness`. If a gap is genuinely cross-screen or needs capabilities the DSL lacks (e.g. capture an element value to compare elsewhere), **report it as a residual gap / flow item** instead of forcing a green gate.
68
68
  4. **EP/data families are OK.** A `duplicates` cluster with `sameDataLikely=false` is an intentional equivalence-partition family (e.g. many invalid-email cases) — keep it; only collapse `sameDataLikely=true` exact duplicates.
69
+ 5. **Advisory findings — surface, don't gate.** `MANUAL-REASON-MISMATCH` → fix the scenario's `@manual:Mx` code (so the planner recommends the right driver) during repair. `CAPABILITY-SUGGESTION` → **present it to the user as a next-step option** (e.g. "N @manual could be automated — `sungen capability add api db`?"), **recommend-only — never auto-install**. Neither fails the gate.
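
Rules 2 and 5 together suggest a stop condition along the following lines. This is a minimal sketch under stated assumptions, not the tool's actual logic: `gateStatus` and `findings` are the names used in the rules, while the `code` field, the report shape, and the choice to exclude advisory findings from the "findings empty" check are assumptions made for illustration.

```typescript
// Assumed report shape; only `gateStatus` and `findings` are named in the rules.
type Finding = { code: string };
type AuditReport = { gateStatus: "PASS" | "FAIL"; findings: Finding[] };

// Advisory codes from rule 5: surfaced to the user, never gating.
const ADVISORY_CODES = ["MANUAL-REASON-MISMATCH", "CAPABILITY-SUGGESTION"];

function splitFindings(findings: Finding[]): { gating: Finding[]; advisory: Finding[] } {
  const gating: Finding[] = [];
  const advisory: Finding[] = [];
  for (const f of findings) {
    (ADVISORY_CODES.indexOf(f.code) >= 0 ? advisory : gating).push(f);
  }
  return { gating, advisory };
}

// Rule 2: stop when the gate passes with no gating findings left,
// or when the repair budget is exhausted.
function shouldStop(report: AuditReport, budgetLeft: number): boolean {
  if (budgetLeft <= 0) return true;
  const { gating } = splitFindings(report.findings);
  return report.gateStatus === "PASS" && gating.length === 0;
}
```

Under this reading, a report whose only remaining finding is `CAPABILITY-SUGGESTION` ends the loop, and the suggestion is then presented to the user rather than acted on automatically.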
69
70
 
70
71
  ## Discovery / fallback tree (when input is limited)
71
72