@therocketcode/gsd-core 1.7.5 → 1.8.0

Files changed (34)
  1. package/.claude-plugin/plugin.json +1 -1
  2. package/agents/gsd-plan-checker.md +2 -2
  3. package/commands/gsd/cicd-strategy.md +67 -0
  4. package/commands/gsd/discover-product.md +2 -2
  5. package/commands/gsd/infrastructure-strategy.md +65 -0
  6. package/gemini-extension.json +1 -1
  7. package/gsd-core/references/ai-test-quality.md +85 -0
  8. package/gsd-core/references/architecture-decision.md +10 -7
  9. package/gsd-core/references/cicd-strategy.md +115 -0
  10. package/gsd-core/references/contract-testing.md +9 -1
  11. package/gsd-core/references/data-environments.md +89 -0
  12. package/gsd-core/references/domain-modeling.md +14 -2
  13. package/gsd-core/references/e2e-tiering.md +2 -2
  14. package/gsd-core/references/infrastructure-strategy.md +91 -0
  15. package/gsd-core/references/product-discovery.md +7 -7
  16. package/gsd-core/references/test-doubles.md +88 -0
  17. package/gsd-core/references/test-strategy.md +6 -5
  18. package/gsd-core/templates/adr.md +21 -1
  19. package/gsd-core/templates/cicd-strategy.md +72 -0
  20. package/gsd-core/templates/domain-model.md +4 -2
  21. package/gsd-core/templates/infra-strategy.md +77 -0
  22. package/gsd-core/templates/product-brief.md +10 -8
  23. package/gsd-core/templates/test-strategy.md +8 -0
  24. package/gsd-core/workflows/add-tests.md +8 -3
  25. package/gsd-core/workflows/cicd-strategy.md +152 -0
  26. package/gsd-core/workflows/discover-product.md +13 -9
  27. package/gsd-core/workflows/discuss-phase.md +1 -1
  28. package/gsd-core/workflows/help/modes/full.md +2 -0
  29. package/gsd-core/workflows/infrastructure-strategy.md +142 -0
  30. package/gsd-core/workflows/model-domain.md +13 -13
  31. package/gsd-core/workflows/plan-phase.md +2 -2
  32. package/gsd-core/workflows/recommend-architecture.md +22 -8
  33. package/gsd-core/workflows/testing-strategy.md +6 -4
  34. package/package.json +1 -1
@@ -40,7 +40,7 @@ cat .planning/DOMAIN-MODEL.md 2>/dev/null || true
40
40
 
41
41
  **If `NO_DOMAIN_MODEL`:** tell the user "No domain model found — I'll ask the complexity questions directly. (Consider `/gsd:model-domain` first for a sharper result.)" Then gather, per major area: is it core/supporting/generic, and how complex (rich rules vs CRUD)?
42
42
 
43
- From DOMAIN-MODEL.md (or the answers): extract each subdomain's type + complexity. The **core** subdomain's complexity is the primary driver of Axis A.
43
+ From DOMAIN-MODEL.md (or the answers): extract each subdomain's type + complexity, **plus** the bounded contexts (owns / talks-to), the context-map relationships, and any flagged polysemes — Step 4.5 consumes them. The **core** subdomain's complexity is the primary driver of Axis A.
44
44
 
45
45
  ## Step 3: Axis A — domain-logic organization (per subdomain)
46
46
 
@@ -53,12 +53,14 @@ For the **core** subdomain, use `AskUserQuestion` (header "Core logic"):
53
53
  **Reconcile against DOMAIN-MODEL (do not skip).** If the user's answer contradicts the documented core complexity — e.g. they say "it's just CRUD" but DOMAIN-MODEL marks this a high-complexity *differentiating* core with rich invariants — the **documented complexity governs**. Present the discrepancy back ("the domain model flags rich pricing/scoring invariants here — that's a Domain Model, not CRUD") and default to the higher rung unless the user gives a concrete reason the model is wrong. Never let an understatement silently lower the core's rung — a scripted core over a documented rich domain is the exact under-engineering this skill exists to prevent.
54
54
 
55
55
  Then probe the orthogonal signals (only when relevant):
56
- - **Hexagonal wrapper?** "Will you swap or add adapters (DBs, queues, 3rd-party APIs), have multiple delivery mechanisms, or need high test isolation over a long lifespan?" Yes → wrap **the core only** in Hexagonal/Clean. No → skip (one-impl ports would be over-engineering). If the user asks for Clean/Hexagonal across the *whole app*, push back: scope it to the core; CRUD subdomains over one DB don't earn ports.
56
+ - **Hexagonal wrapper?** "Is there a **current, concrete** second adapter or delivery mechanism (a real DB/queue/3rd-party swap, a second front-end), or a genuinely pure core worth isolating for test speed?" Yes → wrap **the core only** in Hexagonal/Clean. No → skip (one-impl ports would be over-engineering). Soft criteria — long lifespan, "testability" in the abstract — lean **no**: a framework-free domain layer enforced by fitness function gives the same isolation without one-impl ports. If the user asks for Clean/Hexagonal across the *whole app*, push back: scope it to the core; CRUD subdomains over one DB don't earn ports.
57
57
  - **CQRS?** only if reads structurally diverge from writes / reads ≫ writes.
58
58
  - **Event Sourcing?** First ask whether any financial/regulatory flow (payments, invoicing, tax) carries an audit / retention / temporal-reconstruction mandate. If there is an audit need, prefer a **scoped append-only audit log on that component** — reserve Event Sourcing for when full temporal reconstruction is the hard requirement. "Sounds robust" is never sufficient.
59
59
 
60
60
  For a **supporting** subdomain whose rules are *growing* (a state machine + multiplying invariants, e.g. an "emerging" subdomain), keep it at the floor (Transaction Script) but record a **promotion trigger** — the concrete signal that would later move it to a Domain Model.
61
61
 
62
+ For an ingestion/pipeline-shaped subdomain, the rung describes its business logic only — also record the data shape (buffer/queue, backpressure, retention, hot/cold) as a module note, or the planner gets a rung but no shape for a firehose.
63
+
62
64
  Record each subdomain → rung + the **concrete signal** that justifies anything above Transaction Script. If no concrete signal exists, stay at the floor.
63
65
 
64
66
  ## Step 4: Axis B — deployment topology
@@ -68,16 +70,26 @@ Default is **Modular Monolith**. Recommend microservices ONLY if all three gates
68
70
  2. "Is there CD / monitoring / DevOps maturity already in place?"
69
71
  3. "Are the bounded contexts well-understood already (not still being discovered)?"
70
72
 
71
- If **any** is "no" → **Modular Monolith. Say so and stop here on Axis B** (note: complexity alone never justifies microservices — the microservice premium). If the user argues for microservices from **scale** ("we'll have millions of X"), name it as axis confusion: scale is an Axis-B NFR met by horizontally replicating the monolith (and, if one component later scales divergently, a Hard-Parts extraction of *that component only*) — not by splitting into services. Say this explicitly.
73
+ If **any** is "no" → **Modular Monolith. Say so and stop on the *microservices* question** (note: complexity alone never justifies microservices — the microservice premium). If the user argues for microservices from **scale** ("we'll have millions of X"), name it as axis confusion: scale is an Axis-B NFR met by horizontally replicating the monolith (and, if one component later scales divergently, a Hard-Parts extraction of *that component only*) — not by splitting into services. Say this explicitly.
74
+
75
+ Then — **whether or not the gates passed** — if any single component shows divergent pressure (scaling, volatility, fault isolation), run the **Hard-Parts scan** on that component: score the 6 disintegrators (low cohesion · divergent volatility · divergent scaling · fault isolation · differential security · independent extensibility) vs the 4 integrators (ACID across data · tight workflow · shared code · tight data relationships). Net disintegrators ≫ integrators → extraction **candidate**: extract **now** only if the disintegrating pressure is **current (not projected)** AND the CD/ops gate passes — otherwise record it as that component's promotion trigger. Integrators dominate → keep it modular. Warn explicitly against a **distributed monolith** (services that can't deploy independently).
72
76
 
73
- If all three pass, OR a single component looks special, run the **Hard-Parts scan** on that component: score the 6 disintegrators (low cohesion · divergent volatility · divergent scaling · fault isolation · differential security · independent extensibility) vs the 4 integrators (ACID across data · tight workflow · shared code · tight data relationships). Net disintegrators > integrators → extract it; otherwise keep it modular. Warn explicitly against a **distributed monolith** (services that can't deploy independently).
77
+ **Tenancy probe (mandatory when the product serves multiple customer orgs):** ask "single-tenant or multi-tenant? If multi-tenant: does any customer segment carry a contractual/regulatory isolation requirement?" The ADR **must** decide the isolation rung: **shared schema + tenant-scoped RLS** (the default) → schema-per-tenant → DB-per-tenant. A *real* contractual/regulatory mandate climbs the rung; "enterprise customers will expect it" is the "sounds robust" of tenancy: name it as such, stay at the default, and record the mandate-lands condition as a promotion trigger.
74
78
 
75
79
  When recommending the monolith (a gate failed), **record the promotion trigger** — the concrete future signal that would justify revisiting Axis B (a second team forms, a component's scaling diverges, a bounded context stabilizes). The monolith is **sacrificial/evolutionary**, not permanent: note that the eventual split, if it comes, uses **Strangler Fig + an Anti-Corruption Layer + data-decomposition-behind-module-boundaries-first (+ sagas/outbox for cross-service consistency)** — never a big-bang rewrite. Enforcing "no cross-module DB access" as a fitness function now is what makes that future split cheap. (See *Evolving the topology* in the reference.)
76
80
 
81
+ ## Step 4.5: Module map (modular monolith)
82
+
83
+ A modular monolith needs named modules — without a module list, "no cross-module DB access" is unenforceable. Derive it from DOMAIN-MODEL, don't invent it:
84
+ - **Modules = bounded contexts** when the Bounded Contexts table is filled; **modules = subdomain groupings** when a single context was assumed.
85
+ - **Resolve every flagged polyseme:** assign each meaning to one owning module — one word meaning two things across modules is exactly the rot the fitness functions exist to prevent.
86
+ - Carry the **context-map relationships** into inter-module contracts — and apply an **ACL now** to any third-party/legacy integration whose model differs from yours, not only at a future split.
87
+ - Record each module → what it **owns** (incl. its schema) → who it **talks to** and **via what** (sync call vs event). If any interaction is async, decide the mechanism in one line (in-process events / job queue / outbox); if the domain implies recurring/scheduled computation, decide where it runs (cron / job queue / recompute-on-read).
88
+
77
89
  ## Step 5: Over-/under-engineering check (the meta-tell)
78
90
 
79
91
  Run this check in **both directions** — it is a first-class gate, not a formality:
80
- - **Downward (over-engineering):** for every non-floor rung chosen in Steps 3–4, name the **current, concrete requirement** that justifies it (a real second adapter, a real divergent-scaling component, a real second team, a real audit mandate). No concrete requirement → **drop it to the floor**.
92
+ - **Downward (over-engineering):** for every non-floor rung chosen in Steps 3–4, name the **current, concrete requirement** that justifies it (a real second adapter or delivery mechanism, a real divergent-scaling component, a real second team, a real audit mandate, a real tenant-isolation mandate, a genuinely pure core isolated for test speed). No concrete requirement → **drop it to the floor**.
81
93
  - **Upward (under-engineering):** for the core, re-check DOMAIN-MODEL. If the documented complexity (rich invariants, scoring, state machines, audit needs) is *not* captured by the chosen rung — including when a user understatement lowered it in Step 3 — **raise the rung**. A scripted core over a documented rich domain, or thin CRUD over a regulated/audited flow, must be raised.
82
94
 
83
95
  State the surviving justifications; you'll record them in the ADR.
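
The Hard-Parts scan added in Step 4 reads as a small decision rule, sketched below for illustration only. It is not part of the package: `HardPartsScan`, `decide`, and the `margin` parameter are hypothetical names, and the default margin of 2 is an assumed stand-in for "≫", since the workflow text gives no numeric threshold.

```python
from dataclasses import dataclass

# The six disintegrators and four integrators named in the workflow text.
DISINTEGRATORS = frozenset({
    "low cohesion", "divergent volatility", "divergent scaling",
    "fault isolation", "differential security", "independent extensibility",
})
INTEGRATORS = frozenset({
    "ACID across data", "tight workflow", "shared code", "tight data relationships",
})


@dataclass(frozen=True)
class HardPartsScan:
    disintegrators: frozenset  # disintegrators that apply to the component
    integrators: frozenset     # integrators that apply to the component
    pressure_is_current: bool  # the pressure exists today, not as a projection
    cd_ops_gate_passes: bool   # the CD / monitoring / DevOps maturity gate


def decide(scan: HardPartsScan, margin: int = 2) -> str:
    """Return the outcome for one component (margin is an assumed threshold)."""
    unknown = (scan.disintegrators - DISINTEGRATORS) | (scan.integrators - INTEGRATORS)
    if unknown:
        raise ValueError(f"unrecognized forces: {sorted(unknown)}")
    net = len(scan.disintegrators) - len(scan.integrators)
    if net < margin:
        # Integrators dominate, or the lead is too small: keep it modular.
        return "keep modular"
    if scan.pressure_is_current and scan.cd_ops_gate_passes:
        return "extract now"
    # Extraction candidate, but the pressure is projected or the ops gate fails.
    return "record promotion trigger"
```

Counting forces equally is the simplest possible scoring; a team that weights, say, ACID coupling more heavily than shared code would replace the subtraction with its own weights.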
@@ -88,7 +100,7 @@ State the surviving justifications; you'll record them in the ADR.
88
100
  - Your recommended option in one paragraph (the Axis-A rungs per subdomain + the Axis-B topology), how it compares to the default baseline (modular monolith + Domain Model in the complex core + Transaction Script elsewhere + ADRs + fitness functions), and 1–2 alternatives with trade-offs.
89
101
  - options: "Accept the recommendation", "Adjust (I'll tell you what)", "Show me the alternatives in detail"
90
102
 
91
- Once approved, render `@~/.claude/gsd-core/templates/adr.md` (fill `[NNNN]` with the next number, `[DATE]` with today's date, `[PROJECT_TITLE]` from PROJECT.md, Status = Accepted). Fill: Context (complexity + NFRs + team/ops), Decision (Axis A per-subdomain table + Axis B with the three gate answers), Consequences (incl. **fitness functions** to enforce boundaries), Alternatives rejected, and the over-/under-engineering check (each non-floor rung ↔ its concrete requirement).
103
+ Once approved, render `@~/.claude/gsd-core/templates/adr.md` (fill `[NNNN]` with the next number, `[DATE]` with today's date, `[PROJECT_TITLE]` from PROJECT.md, Status = Accepted). Fill: Context (complexity + NFRs + team/ops), Decision (Axis A per-subdomain table + Axis B with the three gate answers + tenancy when multi-tenant + the Module map), Promotion triggers (component → observable condition → response), Consequences (incl. **fitness functions** to enforce boundaries), Alternatives rejected, and the over-/under-engineering check (each non-floor rung ↔ its concrete requirement).
92
104
 
93
105
  Write to `.planning/adr/NNNN-architecture.md`. Capture *why* and trade-offs, not implementation detail.
94
106
 
@@ -112,6 +124,7 @@ ADR-NNNN written — architecture decided.
112
124
  Axis A (domain logic): [core → rung]; [others → Transaction Script]
113
125
  Axis B (topology): [Modular Monolith | Microservices | component X extracted]
114
126
  Fitness functions: [N] to enforce boundaries
127
+ Promotion triggers: [N] recorded — re-check at /gsd:new-milestone
115
128
 
116
129
  Next: /gsd:plan-phase (planning + test strategy will follow this shape)
117
130
  ```
@@ -120,7 +133,7 @@ Next: /gsd:plan-phase (planning + test strategy will follow this shape)
120
133
 
121
134
  <critical_rules>
122
135
  - **Keep the two axes separate.** Domain-logic complexity drives Axis A; team structure + NFRs + ops maturity drive Axis B. Never let high scale imply a Domain Model, or rich logic imply microservices.
123
- - **Modular monolith is the default.** Microservices require ALL three "tall enough" gates; if any fails, recommend the monolith and stop — regardless of complexity.
136
+ - **Modular monolith is the default.** Microservices require ALL three "tall enough" gates; if any fails, recommend the monolith — regardless of complexity. The stop applies to the *microservices* question only: the per-component Hard-Parts scan still runs whenever a single component shows divergent pressure.
124
137
  - **The meta-tell governs.** Every non-floor rung must map to a current, concrete requirement, or it drops to the floor.
125
138
  - **Recommend, don't dictate.** Present trade-offs and alternatives; the user approves. They have context you lack.
126
139
  - **Always emit an ADR** with explicit fitness functions, and respect `commit_docs` / `response_language`.
@@ -129,7 +142,8 @@ Next: /gsd:plan-phase (planning + test strategy will follow this shape)
129
142
  <success_criteria>
130
143
  - Context loaded (PROJECT.md, REQUIREMENTS.md, DOMAIN-MODEL.md if present)
131
144
  - Axis A decided per subdomain, each non-floor rung tied to a concrete signal
132
- - Axis B decided via the three gates (+ Hard-Parts scan for any split); modular monolith unless all gates pass
145
+ - Axis B decided via the three gates (+ Hard-Parts scan for any divergent component); modular monolith unless all gates pass
146
+ - Module map derived from DOMAIN-MODEL (polysemes resolved); tenancy isolation decided when multi-tenant
133
147
  - Over-/under-engineering meta-tell applied
134
148
  - Recommendation presented with trade-offs/alternatives and user-approved
135
149
  - ADR written to `.planning/adr/NNNN-architecture.md` with fitness functions; committed when commit_docs is true
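
The "no cross-module DB access" fitness function that the module map makes enforceable could look roughly like the check below. This is a hedged sketch, not the package's implementation: `cross_module_violations` and its inputs are hypothetical, and how the `(module, table)` access pairs are collected (static analysis, query logs, ORM metadata) is assumed to happen elsewhere.

```python
def cross_module_violations(
    owned_tables: dict[str, set[str]],
    accesses: list[tuple[str, str]],
) -> list[tuple[str, str, str]]:
    """Return (accessing_module, table, owning_module) for each boundary breach.

    owned_tables maps each module in the module map to the tables it owns.
    accesses lists (module, table) pairs observed in the codebase.
    """
    owner_of: dict[str, str] = {}
    for module, tables in owned_tables.items():
        for table in tables:
            if table in owner_of:
                # Two owners for one table means the module map is ambiguous.
                raise ValueError(f"{table} is owned by both {owner_of[table]} and {module}")
            owner_of[table] = module

    violations = []
    for module, table in accesses:
        owner = owner_of.get(table)
        # Tables missing from the map are ignored in this sketch.
        if owner is not None and owner != module:
            violations.append((module, table, owner))
    return violations
```

Run in CI, a non-empty result would fail the build, which is what turns the boundary from a convention into a fitness function.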
@@ -48,11 +48,11 @@ cat TESTING-STANDARDS.md 2>/dev/null || true
48
48
  The shape is an **output**, never a target you pick. For each subdomain, map its architecture rung → primary test level (use the reference's table):
49
49
  - **Domain Model / rich core** → more **small (unit)** tests of the domain logic through its public API; **sociable** (real collaborators), mock only at ports.
50
50
  - **Transaction Script / CRUD-over-DB** → more **medium (integration)** tests against a real DB (see `test-containers` / `db-test-isolation`); few unit tests.
51
- - **Hexagonal core** → unit-test the pure domain; integration-test the adapters.
51
+ - **Hexagonal core** → pure domain needs no doubles; test the application core with in-memory **fakes** at its ports (see `test-doubles.md`); integration-test the adapters real.
52
52
  - **Many external integrations** → medium integration tests at the ports; contract tests where a 3rd-party can't be seeded.
53
53
  - **Bought / off-the-shelf (Generic)** → do NOT test the vendor's internals; thin integration smoke at your own adapter seam only.
54
54
 
55
- Record subdomain → primary level + the rung that justifies it. Do NOT announce a chosen "pyramid/diamond" — let the distribution emerge. If the user asks to pick a shape, redirect: the architecture already determines where the behavior lives. If the user asks to mock the database or all collaborators, reject it: integration tests run against a **real** DB (see `test-containers` / `db-test-isolation`); mock ONLY at external ports — never the DB or in-process collaborators.
55
+ Record subdomain → primary level + the rung that justifies it. Do NOT announce a chosen "pyramid/diamond" — let the distribution emerge. If the user asks to pick a shape, redirect: the architecture already determines where the behavior lives. If the user asks to mock the database or all collaborators, reject it: integration tests run against a **real** DB (see `test-containers` / `db-test-isolation`); mock ONLY at external ports — never the DB or in-process collaborators. If the user proposes mocking a 3rd-party API in integration tests and calling it covered, reject that too: a mock proves nothing about the real provider — use a verified contract, or, for vendors who won't run verification, the schema + recorded-fixtures + sandbox-smoke fallback (see `contract-testing.md`).
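
The "in-memory fakes at its ports" guidance for a Hexagonal core can be illustrated with a minimal sketch. Everything here is hypothetical (`Order`, `OrderRepository`, `InMemoryOrderRepository`, `place_order`); the point is only that the fake is a working implementation of the port, not a mock with scripted return values.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class Order:
    id: str
    total_cents: int


class OrderRepository(Protocol):
    """The port: what the application core needs from persistence."""

    def save(self, order: Order) -> None: ...
    def get(self, order_id: str) -> Optional[Order]: ...


class InMemoryOrderRepository:
    """A fake: real behavior backed by a dict instead of a database."""

    def __init__(self) -> None:
        self._orders: dict[str, Order] = {}

    def save(self, order: Order) -> None:
        self._orders[order.id] = order

    def get(self, order_id: str) -> Optional[Order]:
        return self._orders.get(order_id)


def place_order(repo: OrderRepository, order_id: str, total_cents: int) -> Order:
    """Application-core use case, tested through the port."""
    if total_cents <= 0:
        raise ValueError("order total must be positive")
    order = Order(id=order_id, total_cents=total_cents)
    repo.save(order)
    return order
```

A test calls `place_order` with the fake and asserts on what `repo.get` returns afterwards; the real database adapter gets its own integration test against a real DB, as the step requires.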
56
56
 
57
57
  ## Step 4: Gnarly bits + what NOT to test
58
58
 
@@ -60,6 +60,8 @@ From DOMAIN-MODEL + REQUIREMENTS, identify the **pure, logic-dense** code that e
60
60
 
61
61
  State what NOT to test: framework/library code, trivial getters/setters, mock behavior; and the rule — **each behavior tested once, at the cheapest level** (no duplicate unit+integration+e2e coverage of the same behavior).
62
62
 
63
+ If existing code already violates a standard you are recording (e.g. money stored as floats), flag it in TEST-STRATEGY.md's Notes as a **pre-test remediation task** (refactor first) — never write tests that enshrine the violating representation.
64
+
63
65
  ## Step 5: Persistent critical-path e2e (the smoke list)
64
66
 
65
67
  Ask (AskUserQuestion, header "E2E", or a text list): "Which flows are so essential they must be smoke-tested on every CI run? (e.g., auth, payment, the core journey)." Capture 3–7 → the **persistent** smoke suite (keep it lean, <5 min). Note that everything else is **transient** (throwaway dev-loop e2e, demoted to integration once covered cheaper). If the user asks to e2e every feature/edge case, redirect: that is the ice-cream cone — cap the persistent suite at 3–7 critical journeys (the <5 min budget wins over any test count) and push edge cases down to unit/integration.
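
On the pre-test remediation rule from Step 4 (money stored as floats): the sketch below shows why a test written against the float representation would lock in a defect. The function names are hypothetical, and integer cents is only one possible remediation, assumed here for simplicity.

```python
def total_as_floats(prices: list[float]) -> float:
    # Binary floats cannot represent most decimal fractions exactly,
    # so sums of prices drift away from the decimal answer.
    return sum(prices)


def total_as_cents(prices_cents: list[int]) -> int:
    # Integer minor units add exactly.
    return sum(prices_cents)


def format_cents(cents: int) -> str:
    # Display helper for non-negative amounts only.
    return f"{cents // 100}.{cents % 100:02d}"


# The float total of 0.10 and 0.20 is not exactly 0.30 ...
float_total_is_exact = total_as_floats([0.10, 0.20]) == 0.30
# ... while the integer-cents total is exactly 30.
cents_total_is_exact = total_as_cents([10, 20]) == 30
```

A test asserting on the float total would have to tolerate the rounding error, which is the "enshrining" the step warns against; refactoring to an exact representation first lets the test assert equality.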
@@ -68,11 +70,11 @@ Ask (AskUserQuestion, header "E2E", or a text list): "Which flows are so essenti
68
70
 
69
71
  - **Coverage = floor, not a target.** Record that. If the user demands a coverage *target* (e.g. 100%), reframe it as a floor and warn that an excessively high floor forces low-value tests of trivial glue — the real quality signal is **mutation score** on the gnarly bits. **Mutation testing (Stryker)** applies to the critical modules (the gnarly bits from Step 4 + the core domain logic).
70
72
  - **TDD stance:** mandate behavior-level tests + **small uniform increments** + a regression floor with a real RED step. Test-first vs test-after is the `workflow.tdd_mode` knob (currently **${TDD_MODE}**) — surface it; don't force test-first as dogma. **Exception:** where `TESTING-STANDARDS.md` mandates a red-first/test-first phase for a module, that project standard governs and overrides the knob there.
71
- - Confirm the existing `TESTING-STANDARDS.md` standards remain in force, and **carry any project-specific standards beyond the reference's defaults (e.g. clock-seam concurrency, no-elapsed-time assertions, delete-bad-tests, the `fast-check` property tier) into TEST-STRATEGY.md's Notes** so downstream skills see them.
73
+ - **If `TESTING-STANDARDS.md` exists:** confirm its standards remain in force, and **carry any project-specific standards beyond the reference's defaults (e.g. clock-seam concurrency, no-elapsed-time assertions, delete-bad-tests, the `fast-check` property tier) into TEST-STRATEGY.md's Notes** so downstream skills see them. **If it's absent (greenfield):** adopt the reference's defaults (real-code-only, no vacuous assertions, typed surface, `fast-check` property tier, Stryker ≥80 on critical modules) as the project baseline, write them into TEST-STRATEGY.md's Notes as the initial standards, and offer to generate `TESTING-STANDARDS.md` from them.
72
74
 
73
75
  ## Step 7: Write TEST-STRATEGY.md
74
76
 
75
- Render `@~/.claude/gsd-core/templates/test-strategy.md` (fill `[DATE]`, `[PROJECT_TITLE]`, `[ADR-NNNN]`). Fill the per-subdomain level table, the gnarly-bits list, what-not-to-test, the integration note, the persistent/transient e2e split, coverage/mutation, and TDD stance (render `tdd_mode=false` as "off", `true` as "on").
77
+ Render `@~/.claude/gsd-core/templates/test-strategy.md` (fill `[DATE]`, `[PROJECT_TITLE]`, `[ADR-NNNN]`). Fill the per-subdomain level table, the gnarly-bits list, what-not-to-test, the integration note, the persistent/transient e2e split, coverage/mutation, the **CI execution map** (which tiers run at the PR gate vs merge-to-main vs nightly — it feeds `/gsd:cicd-strategy`), and TDD stance (render `tdd_mode=false` as "off", `true` as "on").
76
78
 
77
79
  Write to `.planning/TEST-STRATEGY.md`.
78
80
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@therocketcode/gsd-core",
3
- "version": "1.7.5",
3
+ "version": "1.8.0",
4
4
  "description": "GSD Core is a meta-prompting, context engineering, and spec-driven development system for AI coding agents.",
5
5
  "bin": {
6
6
  "gsd-core": "bin/install.js",