forgecraft-mcp 1.7.0 → 1.8.0
- package/README.md +79 -0
- package/dist/registry/remote-gates.d.ts +16 -0
- package/dist/registry/remote-gates.d.ts.map +1 -1
- package/dist/registry/remote-gates.js +56 -0
- package/dist/registry/remote-gates.js.map +1 -1
- package/dist/registry/sentinel-domain-map.d.ts.map +1 -1
- package/dist/registry/sentinel-domain-map.js +16 -9
- package/dist/registry/sentinel-domain-map.js.map +1 -1
- package/dist/registry/sentinel-renderer.d.ts +13 -8
- package/dist/registry/sentinel-renderer.d.ts.map +1 -1
- package/dist/registry/sentinel-renderer.js +440 -162
- package/dist/registry/sentinel-renderer.js.map +1 -1
- package/dist/shared/harness-budget.d.ts +49 -0
- package/dist/shared/harness-budget.d.ts.map +1 -0
- package/dist/shared/harness-budget.js +123 -0
- package/dist/shared/harness-budget.js.map +1 -0
- package/dist/shared/hook-installer.d.ts.map +1 -1
- package/dist/shared/hook-installer.js +2 -1
- package/dist/shared/hook-installer.js.map +1 -1
- package/dist/tools/close-cycle-helpers.d.ts +9 -0
- package/dist/tools/close-cycle-helpers.d.ts.map +1 -1
- package/dist/tools/close-cycle-helpers.js.map +1 -1
- package/dist/tools/close-cycle.d.ts.map +1 -1
- package/dist/tools/close-cycle.js +29 -0
- package/dist/tools/close-cycle.js.map +1 -1
- package/dist/tools/contribute-gate.d.ts +30 -4
- package/dist/tools/contribute-gate.d.ts.map +1 -1
- package/dist/tools/contribute-gate.js +180 -66
- package/dist/tools/contribute-gate.js.map +1 -1
- package/dist/tools/gate-genesis.d.ts +47 -0
- package/dist/tools/gate-genesis.d.ts.map +1 -0
- package/dist/tools/gate-genesis.js +241 -0
- package/dist/tools/gate-genesis.js.map +1 -0
- package/dist/tools/learning-graph.d.ts +31 -0
- package/dist/tools/learning-graph.d.ts.map +1 -0
- package/dist/tools/learning-graph.js +266 -0
- package/dist/tools/learning-graph.js.map +1 -0
- package/dist/tools/setup-artifact-writers.d.ts +15 -3
- package/dist/tools/setup-artifact-writers.d.ts.map +1 -1
- package/dist/tools/setup-artifact-writers.js +149 -13
- package/dist/tools/setup-artifact-writers.js.map +1 -1
- package/dist/tools/setup-phase2.d.ts +9 -0
- package/dist/tools/setup-phase2.d.ts.map +1 -1
- package/dist/tools/setup-phase2.js +13 -0
- package/dist/tools/setup-phase2.js.map +1 -1
- package/dist/tools/setup-project.d.ts +6 -0
- package/dist/tools/setup-project.d.ts.map +1 -1
- package/dist/tools/setup-project.js +21 -4
- package/dist/tools/setup-project.js.map +1 -1
- package/package.json +99 -98
- package/templates/api/instructions.yaml +50 -188
- package/templates/universal/instructions.yaml +194 -1003
@@ -39,158 +39,60 @@ blocks:
  title: "Dev Environment Hygiene"
  content: |
  ## Dev Environment Hygiene
-
-
-
-
-
- - Before
- -
- - Never run `code --install-extension` unconditionally in scripts or setup steps.
- - Installing the same extension twice on the same day = a bug in your script.
-
- ### Docker Containers & Volumes
- - Check before creating: `docker ps -a --filter name=<service>` — if it exists, start it, don't create it.
- - Prefer `docker compose up` (reuse) over bare `docker run` (always creates new).
- - One Compose file per project. Split files for the same project = tech debt.
- - Log pruning: run `docker system prune -f` periodically. Never let container logs exceed 500 MB total.
- - Time-series or synthetic data volumes: before writing >100 MB, ask whether raw retention,
- statistical condensation, or deletion after the run is preferred.
- - Synthetic datasets older than 7 days with no code reference: ask to delete.
-
- ### Python Virtual Environments
- - One `.venv` per project root, one per standalone package subdirectory — never more.
- - Before creating: check if `.venv/` exists and `python --version` matches the required major.minor.
- Recreate only on major version mismatch or explicit user request.
- - Never create a venv in a subdirectory unless that directory is a standalone installable package.
- - Sanitize dependencies: if `pip list --not-required` reveals packages not in requirements, flag them.
-
- ### General Install Hygiene
- - Before any install/download: check version already installed. Skip if within the required range.
- - If project directory disk usage outside of `node_modules/`, `.venv/`, `dist/`, `.next/`
- exceeds 2 GB: surface a warning and ask before continuing any file-generating operation.
- - Never silently grow the workspace. When uncertain about retention, ask.
+ - VS Code: `code --list-extensions | grep -i <name>` before install; skip if in-range. Never install unconditionally in scripts.
+ - Docker: `docker ps -a --filter name=<service>` before create — reuse if exists. Prefer `docker compose up` over `docker run`. One Compose file per project.
+ - Docker: `docker system prune -f` periodically; container logs < 500 MB total.
+ - Data volumes: before writing >100 MB, ask retention. Synthetic data >7 days with no code ref: ask to delete.
+ - Python: one `.venv` per project root or standalone package — never more. Check `.venv/` + `python --version` before creating; recreate only on major mismatch.
+ - Before any install: check installed version, skip if in-range.
+ - If project dir usage (excluding `node_modules/`, `.venv/`, `dist/`, `.next/`) > 2 GB: warn and ask before generating files.

  - id: dependency-registry
  tier: core
  title: "Dependency Registry"
  content: |
- ## Dependency Registry
-
-
-
-
-
-
- ### The registry artifact
-
- File: **`docs/approved-packages.md`** — emit in P1 alongside schema, tsconfig, package.json.
- Update it every time a dependency is added or upgraded. If it exists only in prose or a
- README reference, it does not exist.
-
- ```markdown
- # Approved Packages
-
- | Package | Version range | Purpose | Alternatives rejected | Rationale | Audit status |
- |---|---|---|---|---|---|
- | example-pkg | ^2.4 | HTTP client | axios (larger bundle), node-fetch (no TS types) | Wide adoption, zero known CVEs | 0 HIGH/CRITICAL |
- ```
-
- The AI populates every row. The registry is the authoritative record of WHY each
- dependency was chosen and that it was clean at the time of addition.
-
- ### Process rules — stack-agnostic
-
- 1. **Before adding any package**: run the project's audit command (see table below)
- with `--dry-run` or equivalent to check the candidate for known CVEs.
- - If HIGH or CRITICAL found: choose an alternative and document the rejection.
- - If no CVE-free alternative exists: document the accepted risk and create an ADR
- naming the approver. Zero-tolerance is the default; exceptions require a record.
- 2. **After adding a package**: add a row to `docs/approved-packages.md` with audit status.
- 3. **Commit gate**: the pre-commit hook runs the audit command. HIGH or CRITICAL blocks
- the commit. If audit is not in the pre-commit hook, the gate does not exist.
- 4. **Version pins**: approved version ranges are locked in the lockfile (package-lock.json,
- uv.lock, Cargo.lock). The lockfile is committed. Ranges without a lockfile are not pins.
-
- ### Audit commands by ecosystem
+ ## Dependency Registry
+ - File: **`docs/approved-packages.md`** — emit in P1, update on every add/upgrade. Columns: Package, Version range, Purpose, Alternatives rejected, Rationale, Audit status.
+ - Before adding any package: run the audit command (table) for CVEs. HIGH/CRITICAL → pick an alternative and document the rejection. No clean alternative → ADR naming the approver.
+ - After adding: add a row with audit status.
+ - Commit gate: pre-commit hook runs audit; HIGH/CRITICAL blocks. Not in the hook = gate does not exist.
+ - Version pins live in the committed lockfile (package-lock.json, uv.lock, Cargo.lock).

  | Ecosystem | Audit command | Threshold |
  |---|---|---|
- | npm
- | pnpm | `pnpm audit --audit-level=high` | HIGH
- | yarn | `yarn npm audit --severity high` | HIGH
- |
- |
- | Rust | `cargo audit` | HIGH
- | Go | `govulncheck ./...` | Any
- |
- | Ruby | `bundle audit` | HIGH
-
- The correct command for **this project's ecosystem** must appear in the pre-commit hook
- emitted in P1. Discovering CVEs at code review is too late.
+ | npm | `npm audit --audit-level=high` | HIGH/CRITICAL |
+ | pnpm | `pnpm audit --audit-level=high` | HIGH/CRITICAL |
+ | yarn | `yarn npm audit --severity high` | HIGH/CRITICAL |
+ | pip | `pip-audit --fail-on-severity high` | HIGH/CRITICAL |
+ | uv | `uv audit` | HIGH/CRITICAL |
+ | Rust | `cargo audit` | HIGH/CRITICAL |
+ | Go | `govulncheck ./...` | Any direct |
+ | Maven | `mvn dependency-check:check -DfailBuildOnCVSS=7` | CVSS ≥ 7 |
+ | Ruby | `bundle audit` | HIGH/CRITICAL |

  - id: language-stack-constraints
  tier: core
  title: "Language Stack Constraints"
  content: |
- ## Language Stack Constraints — Seed Defaults
-
- These are **starting defaults for {{language}} projects** — use them to populate the
- initial rows of `docs/approved-packages.md` in P1. They are not a permanent approved
- list: the AI maintains the registry from here forward, keeps versions current, and
- replaces any entry that develops a known CVE. The Dependency Registry block above
- governs the process.
-
- Before adding any dependency not listed here, apply the audit-before-add process.
+ ## Language Stack Constraints — Seed Defaults for {{language}}
+ Seed rows for `docs/approved-packages.md` in P1; apply audit-before-add for anything not listed.

  {{#if language_is_typescript}}
- ### TypeScript / Node.js
-
-
- -
- -
-
- `process.env.*` from `string | undefined` at compile time.
-
- **Linting**
- - `eslint@^9` + `@typescript-eslint/eslint-plugin@^8` + `@typescript-eslint/parser@^8`
- - NOT `@typescript-eslint@^5` or `^6` — old `minimatch` transitive dep has known CVEs.
- - NOT `tslint` — deprecated.
-
- **Test runner**
- - `vitest@^2` (preferred — native ESM, fast, Jest-compatible API) or `jest@^29`.
- - NOT `mocha` + `chai` for new projects (weaker TypeScript support).
- - NOT `jasmine` (no active maintenance for Node.js use).
-
- **Formatting**
- - `prettier@^3` — configured via `.prettierrc`, integrated with ESLint via
- `eslint-config-prettier`. NOT separate manual formatting.
+ ### TypeScript / Node.js
+ - Node.js `^20 LTS` min. NOT `^16`/`^18` (EOL).
+ - TypeScript `^5.4` min. `tsconfig.json`: `"strict": true` AND `"noUncheckedIndexedAccess": true`.
+ - `eslint@^9` + `@typescript-eslint/*@^8`. NOT `^5`/`^6` (minimatch CVE). NOT `tslint` (deprecated).
+ - `vitest@^2` or `jest@^29`. NOT `mocha`+`chai` (weak TS). NOT `jasmine` (unmaintained).
+ - `prettier@^3` via `.prettierrc` + `eslint-config-prettier`.
  {{/if}}

  {{#if language_is_python}}
- ### Python
-
-
- -
-
-
- **Linting / formatting**
- - `ruff@^0.4` — replaces `flake8` + `isort` + `black` with a single 10–100× faster tool.
- - NOT separate `flake8` + `isort` + `black` for new projects.
-
- **Type checking**
- - `pyright@^1.1` (strict mode) — same engine as Pylance, best TypedDict support.
- - `mypy@^1.9` is acceptable. Strict mode required in both cases.
- - NOT unchecked Python — all public functions must be typed.
-
- **Test runner**
- - `pytest@^8` — NOT `unittest` for new projects.
- - Async tests: `pytest-asyncio@^0.23`.
-
- **Dependency management**
- - `uv@^0.1` (recommended — fastest resolver) or `poetry@^1.7`.
- - ALL dependencies pinned in lockfile (`uv.lock` or `poetry.lock`). Lockfile committed.
- - `pip-tools` acceptable for library projects.
+ ### Python
+ - Python `^3.11` min. NOT `3.8`/`3.9` (no tomllib/ExceptionGroup).
+ - `ruff@^0.4`. NOT separate `flake8`+`isort`+`black`.
+ - `pyright@^1.1` strict or `mypy@^1.9` strict. All public functions typed.
+ - `pytest@^8` (NOT `unittest`); async via `pytest-asyncio@^0.23`.
+ - `uv@^0.1` or `poetry@^1.7`. Deps pinned in committed `uv.lock`/`poetry.lock`.
  {{/if}}

  - id: production-code-standards
@@ -198,322 +100,114 @@ blocks:
  title: "Production Code Standards"
  content: |
  ## Production Code Standards — NON-NEGOTIABLE
-
-
-
-
-
- -
- -
- -
- -
- -
-
- `IEmailSender` etc. as interfaces in the domain/service layer first. Services depend on
- the interface. The Prisma/SQL/HTTP concrete implementation lives in the adapter layer and
- is injected at the composition root. Emit these interfaces in P1 alongside the schema —
- a service that imports a concrete class cannot be unit-tested, cannot be swapped, and
- is not Composable.
-
- ### Zero Hardcoded Values
- - ALL configuration through environment variables or config files. No exceptions.
- - ALL external URLs, ports, credentials, thresholds, feature flags must be configurable.
- - ALL magic numbers must be named constants with documentation.
- - Config is validated at startup — fail fast if required values are missing.
-
- ### Zero Mocks in Application Code
- - No mock objects, fake data, or stub responses in source code. Ever.
- - Mocks belong ONLY in test files.
- - For local dev: create proper interface implementations selected via config.
- - No `if DEBUG: return fake_data` patterns. Use dependency injection to swap implementations.
- - No TODO/FIXME stubs returning hardcoded values. Use NotImplementedError with a description.
-
- ### Interfaces First
- Before writing any implementation:
- 1. Define the interface/protocol/abstract class
- 2. Define the data contracts (input/output DTOs)
- 3. Write the consuming code against the interface
- 4. Write tests against the interface
- 5. THEN implement the concrete class
-
- ### Dependency Injection
- - Every service receives dependencies through its constructor.
- - A composition root (main.py / app.ts / container) wires everything.
- - No service locator pattern. No global singletons. No module-level instances.
-
- ### Error Handling
- - Custom exception hierarchy per module. No bare Exception raises.
- - Errors carry context: IDs, timestamps, operation names.
- - Fail fast, fail loud. No silent swallowing of exceptions.
- - Domain code never returns HTTP status codes — that's the API layer's job.
-
- ### Modular from Day One
- - Feature-based modules over layer-based. Each feature owns its models, service, repository, routes.
- - Module dependency graph must be acyclic.
- - Every module has a clear public API via {{#if language_is_typescript}}index.ts{{/if}}{{#if language_is_python}}__init__.py{{/if}} exports.
+ Apply to ALL code including prototypes.
+
+ - **SOLID**: SRP (one reason to change), OCP (extend, don't modify), LSP (swappable, no isinstance), ISP (small interfaces), DIP (depend on abstractions; inject concretes at composition root).
+ - Define port interfaces (`IUserRepository`, `IEmailSender`) in the domain/service layer in P1; concrete impls live in adapters, injected at the root.
+ - Zero hardcoded values: all config via env/config files, validated at startup (fail fast). Magic numbers → named constants.
+ - Zero mocks in source: mocks only in test files. No `if DEBUG: return fake_data`. Stubs use NotImplementedError, not hardcoded returns.
+ - Interfaces first: interface → DTOs → consuming code → tests → concrete impl.
+ - Type-driven design: make illegal states unrepresentable. {{#if language_is_typescript}}Discriminated unions for state, `Result<T,E>` for fallible ops, branded types for validated values; parse external input into typed objects at the boundary (Zod) — never pass raw input inward.{{/if}}{{#if language_is_python}}Tagged unions (`Literal` + dataclass), `NewType` for validated values, frozen dataclasses; parse external input at the boundary (Pydantic) — never pass raw dicts inward.{{/if}}
+ - DI via constructor; composition root wires everything. No service locator, global singletons, module-level instances.
+ - Error handling: custom exception hierarchy per module, errors carry context (IDs, timestamps, op name), fail loud. Domain never returns HTTP status codes.
+ - Feature-based modules (own models/service/repo/routes), acyclic graph, public API via {{#if language_is_typescript}}index.ts{{/if}}{{#if language_is_python}}__init__.py{{/if}}.

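The type-driven design bullet added to the Production Code Standards block names three TypeScript techniques: discriminated unions, `Result<T,E>`, and branded types. A minimal sketch of how they might fit together follows. Every name in it (`Result`, `Email`, `parseEmail`, `OrderState`, `describeOrder`) is hypothetical and not taken from the package, and the email check is deliberately simplistic; the template suggests Zod for real boundary parsing.

```typescript
// Hypothetical illustration; none of these names come from forgecraft-mcp.

// Result<T, E> for fallible operations instead of throwing.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// Branded type: a string that has passed validation.
type Email = string & { readonly __brand: "Email" };

// Parse raw input at the boundary; only a validated Email travels inward.
function parseEmail(raw: string): Result<Email, string> {
  const trimmed = raw.trim();
  // Simplistic check, for illustration only.
  if (!/^[^@\s]+@[^@\s]+$/.test(trimmed)) {
    return { ok: false, error: `invalid email: ${raw}` };
  }
  return { ok: true, value: trimmed as Email };
}

// Discriminated union: each state carries only the fields valid for it.
type OrderState =
  | { kind: "pending" }
  | { kind: "shipped"; trackingId: string }
  | { kind: "cancelled"; reason: string };

function describeOrder(state: OrderState): string {
  switch (state.kind) {
    case "pending":
      return "awaiting shipment";
    case "shipped":
      return `shipped (${state.trackingId})`;
    case "cancelled":
      return `cancelled: ${state.reason}`;
  }
}
```

With this shape, a `shipped` order without a `trackingId` does not type-check, which is one reading of "make illegal states unrepresentable".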
  - id: layered-architecture
  tier: recommended
  title: "Layered Architecture"
  content: |
  ## Layered Architecture (Ports & Adapters / Hexagonal)
+ Layers (outer→inner): API/CLI/Handlers (thin, validate+delegate) → Services (orchestrate, depend on ports only) → Domain models (pure, no I/O, no framework) → Port interfaces → Repositories/Adapters (external I/O) → Infrastructure/Config (DI, env).

-
-
-
-
- │ Services (Business Logic) │ ← Orchestration. Depends on PORT INTERFACES only.
- ├─────────────────────────────┤
- │ Domain Models │ ← Pure data + behavior. No I/O. No framework imports.
- │ (Entities, Value Objects) │ The inner hexagon. Zero external dependencies.
- ├─────────────────────────────┤
- │ Port Interfaces │ ← Abstract contracts (Repository, Gateway, Notifier).
- │ │ Defined by the domain, implemented by adapters.
- ├─────────────────────────────┤
- │ Repositories / Adapters │ ← DRIVEN ADAPTERS (secondary). All external I/O
- │ │ (DB, APIs, files, queues, email, caches).
- ├─────────────────────────────┤
- │ Infrastructure / Config │ ← DI container, env config, connection factories
- └─────────────────────────────┘
- ```
+ ### Ports & Adapters
+ - Ports (`UserRepository`, `PaymentGateway`, `EmailSender`) defined in domain/service layer, never in adapters. Specify WHAT, not HOW.
+ - Driving adapters (HTTP/CLI/consumers) call through ports; driven adapters (PostgresUserRepository, StripePaymentGateway) are called through ports.
+ - Adapters interchangeable: swap Postgres for InMemory in tests with zero logic changes.

- ###
- -
- -
- -
- - Port interfaces specify WHAT, never HOW.
-
- ### Adapters (Implementations of ports)
- - **Driving adapters** (primary): HTTP controllers, CLI handlers, message consumers
- — they CALL the application through port interfaces.
- - **Driven adapters** (secondary): PostgresUserRepository, StripePaymentGateway,
- SESEmailSender — they ARE CALLED BY the application through port interfaces.
- - Adapters are interchangeable. Swap `PostgresUserRepository` for `InMemoryUserRepository`
- in tests without changing a single line of business logic.
-
- ### Data Transfer Objects (DTOs)
- - Use DTOs at layer boundaries — never pass domain entities to/from the API layer.
- - **Request DTOs**: validated at the API boundary ({{#if language_is_typescript}}Zod schema{{/if}}{{#if language_is_python}}Pydantic model{{/if}} → typed object).
- - **Response DTOs**: shaped for the consumer, not mirroring the domain model.
- - **Domain ↔ Persistence mapping**: repositories map between domain entities and DB rows/documents.
- - DTOs are plain data objects — no methods, no behavior, no framework decorators.
+ ### DTOs
+ - DTOs at every layer boundary — never pass domain entities to/from the API layer.
+ - Request DTOs validated at boundary ({{#if language_is_typescript}}Zod{{/if}}{{#if language_is_python}}Pydantic{{/if}}); Response DTOs shaped for consumer; repos map domain ↔ persistence.
+ - DTOs are plain data — no methods, no framework decorators.

  ### Layer Rules
- - Never skip layers
- -
- - Domain models have ZERO external dependencies.
- - The domain layer does not know HTTP, SQL, or any framework exists.
+ - Never skip layers (handlers never call repos directly). Dependencies point INWARD only.
+ - Domain models have ZERO external dependencies; the domain does not know HTTP/SQL/frameworks exist.

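The Ports & Adapters rules in this block (ports defined beside the domain, services depending on ports only, adapters swappable in tests) can be sketched as below. This is an assumed illustration, not code from the package: `UserRepository` and `InMemoryUserRepository` reuse the names mentioned in the template text, while `User`, `UserService`, `UserNotFoundError`, and `changeEmail` are hypothetical.

```typescript
// Hypothetical sketch; names mirror the template's examples, not package code.

interface User {
  readonly id: string;
  readonly email: string;
}

// Port: lives beside the domain and states WHAT is needed, not HOW.
interface UserRepository {
  findById(id: string): Promise<User | undefined>;
  save(user: User): Promise<void>;
}

// Module-specific error instead of a bare Error.
class UserNotFoundError extends Error {}

// Service depends on the port only and receives it through the constructor.
class UserService {
  private readonly users: UserRepository;

  constructor(users: UserRepository) {
    this.users = users;
  }

  async changeEmail(id: string, email: string): Promise<User> {
    const user = await this.users.findById(id);
    if (!user) throw new UserNotFoundError(`user not found: ${id}`);
    const updated: User = { ...user, email }; // copy-on-modify
    await this.users.save(updated);
    return updated;
  }
}

// Driven adapter for tests; a Postgres-backed adapter would implement the same port.
class InMemoryUserRepository implements UserRepository {
  private readonly rows = new Map<string, User>();

  async findById(id: string): Promise<User | undefined> {
    return this.rows.get(id);
  }

  async save(user: User): Promise<void> {
    this.rows.set(user.id, user);
  }
}
```

Because `UserService` never imports a concrete repository, a test can pass `new InMemoryUserRepository()` to the constructor without touching the service logic.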
  - id: clean-code-principles
  tier: recommended
  title: "Clean Code Principles"
  content: |
  ## Clean Code Principles
+ - **CQS**: commands change state return void; queries return data no side effects. A function does one, not both.
+ - **Guard clauses**: handle invalid cases first, return early; happy path at shallowest indent.
+ - **Composition over inheritance**: compose via interfaces/delegation; inheritance only for genuine "is-a".
+ - **Law of Demeter**: no chaining through objects (`order.getCustomer().getAddress()` — bad); add `order.getShippingCity()`.
+ - **Immutability by default**: {{#if language_is_typescript}}`const`/`readonly`, `ReadonlyArray<T>`{{/if}}{{#if language_is_python}}`Final`, `frozen=True` dataclasses, `tuple` over `list`{{/if}}; copy-on-modify; restrict mutable state to smallest scope.
+ - **Pure functions**: domain logic/validation/calculation pure; push I/O to adapters.
+ - **Factory pattern**: encapsulate construction (`User.create(dto)`); the DI container is the top-level factory.

-
- - **Commands** change state but return nothing (void).
- - **Queries** return data but change nothing (no side effects).
- - A function should do one or the other, never both.
- - Exception: stack.pop() style operations where separation is impractical — document why.
-
- ### Guard Clauses & Early Return
- - Eliminate deep nesting. Handle invalid cases first, return early.
- - The happy path runs at the shallowest indentation level.
- - Before:
- ```
- if (user) {
- if (user.isActive) {
- if (user.hasPermission) {
- // actual logic buried 3 levels deep
- ```
- - After:
- ```
- if (!user) throw new NotFoundError(...);
- if (!user.isActive) throw new InactiveError(...);
- if (!user.hasPermission) throw new ForbiddenError(...);
- // actual logic at top level
- ```
-
- ### Composition over Inheritance
- - Prefer composing objects via interfaces and delegation over class inheritance.
- - Inheritance creates tight coupling and fragile hierarchies.
- - Use inheritance ONLY for genuine "is-a" relationships (rare).
- - When in doubt, compose: inject a collaborator, don't extend a base class.
-
- ### Law of Demeter (Principle of Least Knowledge)
- - A method should only call methods on: its own object, its parameters, objects it creates,
- its direct dependencies.
- - Do NOT chain through objects: `order.getCustomer().getAddress().getCity()` — BAD.
- - Instead: `order.getShippingCity()` or pass the needed data directly.
-
- ### Immutability by Default
- {{#if language_is_typescript}}- Use `const` over `let`. Use `readonly` on properties and parameters.
- - Prefer `ReadonlyArray<T>`, `Readonly<T>`, `ReadonlyMap`, `ReadonlySet`.{{/if}}{{#if language_is_python}}- Use `Final` for constants. Use `frozen=True` on dataclasses.
- - Prefer `tuple` over `list` for immutable sequences. Use `MappingProxyType` for immutable dicts.{{/if}}
- - When you need to "modify" data, create a new copy with the change.
- - Mutable state is the #1 source of bugs. Restrict it to the smallest possible scope.
-
- ### Pure Functions
- - A pure function: same inputs → same outputs, no side effects.
- - Domain logic, validation, transformation, and calculation should be pure.
- - Side effects (I/O, logging, database) are pushed to the edges (adapters).
- - Pure functions are trivially testable — no mocks needed.
-
- ### Factory Pattern
- - Use factories to encapsulate complex object construction.
- - Factory methods on the class itself for simple cases: `User.create(dto)`.
- - Factory classes/functions when construction involves dependencies or conditional logic.
- - Factories are the natural companion to dependency injection — the DI container
- IS the top-level factory.
-
- > **Design reference patterns** (DDD, CQRS, GoF) available on demand via `get_design_reference` tool.
+ > Design reference patterns (DDD, CQRS, GoF) on demand via `get_design_reference` tool.

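The condensed guard-clauses bullet drops the before/after snippet that the removed text carried. A runnable version of the "after" shape might look like the sketch below. The error class names follow the removed snippet; the `User` fields, the `greet` function, and the messages are assumptions made for illustration.

```typescript
// Hypothetical types; the error names follow the removed snippet.
interface User {
  readonly name: string;
  readonly isActive: boolean;
  readonly hasPermission: boolean;
}

class NotFoundError extends Error {}
class InactiveError extends Error {}
class ForbiddenError extends Error {}

// Invalid cases are rejected first; the happy path stays at the top level.
function greet(user: User | undefined): string {
  if (!user) throw new NotFoundError("user not found");
  if (!user.isActive) throw new InactiveError("user is inactive");
  if (!user.hasPermission) throw new ForbiddenError("user lacks permission");
  return `Welcome, ${user.name}`;
}
```

Each guard also narrows the type: after the first check, `user` is known to be defined, so the remaining lines need no optional chaining.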
  - id: twelve-factor-ops
  tier: optional
  title: "12-Factor & Operational Readiness"
  content: |
- ## 12-Factor
-
-
- -
- -
- -
-
-
- - Application processes are stateless. Session data lives in external stores (Redis, DB).
- - Any process can be killed and restarted without data loss.
- - File uploads go to object storage (S3, GCS), not local disk.
-
- ### Port Binding
- - The application is self-contained and exports services via port binding.
- - No runtime injection of a web server — the app embeds its own (Express, Uvicorn, etc.).
-
- ### Disposability
- - Processes start fast (< 5 seconds) and shut down gracefully.
- - SIGTERM triggers: stop accepting new work → finish in-flight requests → close connections → exit.
- - Workers use robust job queues so interrupted work is retried, not lost.
-
- ### Dev/Prod Parity
- - Minimize gaps between development and production environments.
- - Use the same backing services in dev as prod (same DB engine, same cache).
- - Docker / containers recommended for environment parity.
-
- ### Logs as Event Streams
- - The app writes logs to stdout/stderr — never to local files.
- - Log aggregation is an ops concern (ELK, Datadog, CloudWatch), not an application concern.
- - Structured JSON logs with correlation IDs for tracing across services.
-
- ### Build, Release, Run
- - Strict separation: build (compile + assets), release (build + config), run (execute).
- - Every release is immutable and tagged. Rollback = deploy a previous release.
- - CI/CD pipeline automates: lint → test → build → deploy with gates at each stage.
+ ## 12-Factor & Operational Readiness
+ - **Config**: all from env/config services, validated at startup (fail fast). Commit `.env.example`, gitignore `.env`.
+ - **Stateless processes**: session data in external stores (Redis/DB); uploads to object storage (S3/GCS), not local disk.
+ - **Port binding**: app embeds its own server (Express, Uvicorn); no runtime web-server injection.
+ - **Disposability**: start < 5s; SIGTERM → stop intake → drain → close → exit; workers use durable queues.
+ - **Dev/prod parity**: same backing services (DB engine, cache); containers recommended.
+ - **Logs as streams**: write to stdout/stderr (never files); structured JSON + correlation IDs; aggregation is an ops concern.
+ - **Build/release/run**: strict separation; immutable tagged releases; rollback = redeploy prior release.

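The Config bullet ("validated at startup (fail fast)") could be realized roughly as follows. This is a sketch under stated assumptions: the `AppConfig` shape, the `loadConfig` function, and the `DATABASE_URL` and `PORT` variable names are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical config shape; the variable names are illustrative assumptions.
interface AppConfig {
  readonly databaseUrl: string;
  readonly port: number;
}

// Takes an env-like record so the function stays pure and easy to test.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const required = ["DATABASE_URL", "PORT"];
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    // Fail fast: refuse to start with an incomplete configuration.
    throw new Error(`missing required config: ${missing.join(", ")}`);
  }
  const port = Number(env["PORT"]);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer in 1..65535, got "${env["PORT"]}"`);
  }
  return { databaseUrl: env["DATABASE_URL"] as string, port };
}
```

At a composition root this would typically be called once, for example with `process.env` in Node.js, before any service is constructed.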
  - id: cicd-deployment
  tier: recommended
  title: "CI/CD & Deployment"
  content: |
  ## CI/CD & Deployment
-
-
- -
- -
- -
- - Failed pipelines block merge. No exceptions.
-
- ### Environments
- - Minimum three environments: **development** (local), **staging** (mirrors prod), **production**.
- - Environment config is injected — same artifact runs everywhere with different env vars.
- - Staging is a faithful replica of production (same provider, same DB engine, same services).
-
- ### Deployment Strategy
- - Default: **rolling deployment** with health checks (zero downtime).
- - For critical services: **blue-green** or **canary** with automated rollback on error rate spike.
- - Every deploy is tagged with git SHA. Rollback = redeploy a previous SHA.
- - Deployment must be one command or one button. No multi-step manual runbooks.
-
- ### Preview Environments
- - Pull requests get ephemeral preview deployments where feasible (Vercel, Netlify, Railway).
- - Preview URLs in PR comments for stakeholder review before merge.
+ - Pipeline on push: lint → type-check → unit → build → integration. On main: + security scan → staging → smoke → promote. < 10 min (parallelize, cache). Failed pipeline blocks merge.
+ - Three environments min: development (local), staging (faithful prod replica), production. Same artifact, injected config.
+ - Default rolling deploy with health checks; blue-green/canary for critical services with auto-rollback on error spike.
+ - Tag every deploy with git SHA; rollback = redeploy prior SHA. One command/button — no manual runbooks.
+ - PRs get ephemeral preview deploys where feasible (Vercel, Netlify, Railway); preview URL in PR comment.

438
175
|
- id: testing-pyramid
|
|
439
176
|
tier: core
|
|
440
177
|
title: "Testing Pyramid"
|
|
441
178
|
content: |
|
|
442
179
|
## Testing Pyramid
|
|
443
|
-
|
|
444
|
-
```
|
|
445
|
-
/ E2E \ ← 5-10% of tests. Core journeys only.
|
|
446
|
-
/ Integration \ ← 20-30%. Real dependencies at boundaries.
|
|
447
|
-
/ Unit Tests \ ← 60-75%. Fast, isolated, every public function.
|
|
448
|
-
```
|
|
180
|
+
Unit 60-75% (fast, isolated, every public fn) · Integration 20-30% (real deps at boundaries) · E2E 5-10% (core journeys only).
|
|
449
181
|
|
|
450
182
|
### Coverage Targets
|
|
451
|
-
- Overall
|
|
452
|
-
-
|
|
453
|
-
-
|
|
454
|
-
- Mutation score (MSI) — overall: ≥ 65% (blocks PR merge)
|
|
455
|
-
- Mutation score (MSI) — new/changed code: ≥ 70% (measured on diff)
|
|
456
|
-
- Note: Line coverage and mutation score are both required. 80% line coverage can coexist
|
|
457
|
-
with 58% MSI when tests execute code without asserting its behavior (confirmed in Shattered
|
|
458
|
-
Stars). Run stryker-mutator immediately after writing each test batch, not only pre-release.
|
|
459
|
-
Tooling: stryker-mutator (JS/TS), mutmut (Python), Pitest (Java).
|
|
183
|
+
- Overall: {{coverage_minimum | default: 80}}% line (blocks commit). New/changed: {{coverage_new_code_min | default: 90}}% on diff. Critical paths (auth, data pipelines, financial): 95%+.
|
|
184
|
+
- Mutation score (MSI): overall ≥ 65% (blocks PR merge), new/changed ≥ 70% on diff.
|
|
185
|
+
- Line coverage AND MSI are both required (80% line coverage can coexist with 58% MSI when tests execute code without asserting its behavior). Run mutation testing after each test batch. Tooling: stryker-mutator (JS/TS), mutmut (Python), Pitest (Java).
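The gap between line coverage and mutation score can be illustrated with a small sketch. Everything here is an assumption made for illustration: `applyDiscount`, the hand-written mutant, and the two test functions are hypothetical, and a real tool such as stryker-mutator generates and runs mutants automatically.

```typescript
// Hypothetical function under test (not part of the package).
function applyDiscount(total: number, percent: number): number {
  return total - (total * percent) / 100;
}

// Hand-written mutant: the subtraction is flipped to an addition.
function applyDiscountMutant(total: number, percent: number): number {
  return total + (total * percent) / 100;
}

type Impl = (total: number, percent: number) => number;

// Weak test: executes every line (full line coverage) but asserts nothing
// about the value, so it also passes against the mutant.
function weakTest(impl: Impl): boolean {
  const result = impl(200, 10);
  return typeof result === "number";
}

// Strong test: pins the expected value, so the mutant fails it.
function strongTest(impl: Impl): boolean {
  return impl(200, 10) === 180;
}

// Mutation score indicator: killed mutants over total mutants, in percent.
function mutationScore(killed: number, total: number): number {
  if (total <= 0) {
    throw new RangeError("total mutants must be positive");
  }
  return (killed * 100) / total;
}

// A mutant counts as killed when the test fails against it.
const weakKills = weakTest(applyDiscountMutant) ? 0 : 1;
const strongKills = strongTest(applyDiscountMutant) ? 0 : 1;
```

Both tests pass against the original function and cover the same lines, but only the strong test kills the mutant. That difference is what the MSI gate measures and the line-coverage gate cannot see.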
|
|
460
186
|
|
|
461
187
|
### Test Rules
|
|
462
|
-
-
|
|
463
|
-
- No empty catch
|
|
464
|
-
-
|
|
465
|
-
-
|
|
466
|
-
|
|
467
|
-
|
|
468
|
-
|
|
469
|
-
|
|
470
|
-
|
|
471
|
-
- **Stub**: Returns canned data. No assertions on calls. Use when you need to control input.
|
|
472
|
-
- **Spy**: Records calls. Assert after the fact. Use to verify side effects.
|
|
473
|
-
- **Fake**: Working implementation with shortcuts (in-memory DB). Use for integration-speed tests.
|
|
474
|
-
- **Mock**: Pre-programmed expectations. Assert call patterns. Use sparingly — they couple to implementation.
|
|
475
|
-
Prefer stubs and fakes over mocks. Tests that mock everything test nothing.
|
|
476
|
-
|
|
477
|
-
### Test Data Builders
|
|
478
|
-
- Use Builder or Factory pattern for test data: `UserBuilder.anAdmin().withName('Alice').build()`.
|
|
479
|
-
- One builder per domain entity. Builders provide sensible defaults so tests only specify what matters.
|
|
480
|
-
- No raw object literals scattered across tests. Centralize in `tests/fixtures/` or `tests/builders/`.
|
|
481
|
-
|
|
482
|
-
### Property-Based Testing
|
|
483
|
-
- For pure functions with wide input ranges, add property tests (fast-check, Hypothesis, QuickCheck).
|
|
484
|
-
- Define invariants, not examples: "sorting is idempotent", "encode then decode = identity".
|
|
485
|
-
- Property tests complement, not replace, example-based tests.
|
|
188
|
+
- Test name = spec: `test_rejects_duplicate_member_ids` not `test_validation`.
|
|
189
|
+
- No empty catch, no `assert True`, no tests that can't fail.
|
|
190
|
+
- Colocate `[module].test.[ext]` or mirror src in `tests/`. Flaky tests are bugs — fix or quarantine.
|
|
191
|
+
- Run Stryker per module after writing its tests; surviving mutants = missing assertions.
|
|
192
|
+
|
|
193
|
+
### Test Doubles
|
|
194
|
+
- Stub (canned data), Spy (record+assert calls), Fake (in-memory working impl), Mock (preprogrammed expectations — use sparingly). Prefer stubs/fakes.
|
|
195
|
+
- Test data via Builder/Factory (`UserBuilder.anAdmin().build()`), one per entity, centralized in `tests/builders/`. No scattered literals.
|
|
196
|
+
- Property tests (fast-check, Hypothesis) for pure functions: assert invariants, not examples. Complement example tests.
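As a rough sketch of the builder and fake described above: the `User` shape, the `UserRepository` port, and `renameUser` are hypothetical names chosen for illustration, and only `UserBuilder.anAdmin()` echoes the text. The builder supplies defaults so a test states only what matters, and the fake is a working in-memory implementation with no call expectations attached.

```typescript
interface User {
  id: string;
  name: string;
  role: "admin" | "member";
}

// Builder with sensible defaults: tests override only the relevant fields.
class UserBuilder {
  private user: User = { id: "user-1", name: "Default User", role: "member" };

  static anAdmin(): UserBuilder {
    return new UserBuilder().withRole("admin");
  }

  withName(name: string): UserBuilder {
    this.user = { ...this.user, name };
    return this;
  }

  withRole(role: User["role"]): UserBuilder {
    this.user = { ...this.user, role };
    return this;
  }

  build(): User {
    return { ...this.user };
  }
}

// Hypothetical port that the code under test depends on.
interface UserRepository {
  save(user: User): void;
  findById(id: string): User | undefined;
}

// Fake: a real (if simplified) implementation backed by a plain object.
class InMemoryUserRepository implements UserRepository {
  private readonly users: { [id: string]: User } = {};

  save(user: User): void {
    this.users[user.id] = user;
  }

  findById(id: string): User | undefined {
    return this.users[id];
  }
}

// Hypothetical code under test.
function renameUser(repo: UserRepository, id: string, name: string): User {
  const existing = repo.findById(id);
  if (existing === undefined) {
    throw new Error(`unknown user: ${id}`);
  }
  const renamed: User = { ...existing, name };
  repo.save(renamed);
  return renamed;
}

const repo = new InMemoryUserRepository();
repo.save(UserBuilder.anAdmin().withName("Alice").build());
const renamed = renameUser(repo, "user-1", "Alicia");
```

Because the test observes state through the fake rather than asserting call patterns, it stays valid if `renameUser` is later restructured internally.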
|
|
486
197
|
|
|
487
198
|
- id: tdd-methodology
|
|
488
199
|
tier: core
|
|
489
200
|
title: "Test-Driven Development"
|
|
490
201
|
content: |
|
|
491
202
|
## Test-Driven Development (TDD)
|
|
203
|
+
- **RED**: write a failing test, run it, confirm it fails (if it passes, the test is wrong).
|
|
204
|
+
- **GREEN**: minimum code to pass. No more.
|
|
205
|
+
- **REFACTOR**: clean up while green; no new behavior.
|
|
206
|
+
Repeat for every feature, function, and bug fix.
|
|
492
207
|
|
|
493
|
-
|
|
494
|
-
|
|
495
|
-
|
|
496
|
-
2. **GREEN**: Write the minimum code to make the test pass. No more.
|
|
497
|
-
3. **REFACTOR**: Clean up while all tests stay green. No new behavior in this step.
|
|
498
|
-
Repeat. Every feature, every function, every bug fix follows this cycle.
|
|
499
|
-
|
|
500
|
-
### Tests Are Specifications, Not Confirmations
|
|
501
|
-
- Write tests against **expected behavior**, never against current implementation.
|
|
502
|
-
- A test that passes on broken code is worse than no test — it provides false confidence.
|
|
503
|
-
- Never weaken an assertion to match what the code currently does. If the code disagrees
|
|
504
|
-
with the spec, the code is wrong.
|
|
505
|
-
- Never write a test suite after the fact that just "locks in" existing behavior without
|
|
506
|
-
verifying it's correct.
|
|
507
|
-
|
|
508
|
-
### Bug Fix Protocol
|
|
509
|
-
- **Every bug fix starts with a failing test** that reproduces the bug.
|
|
510
|
-
- The test must fail before the fix and pass after. No exceptions.
|
|
511
|
-
- If you can't write a reproducing test, you don't understand the bug well enough to fix it.
|
|
512
|
-
|
|
513
|
-
### One Behavior Per Test
|
|
514
|
-
- Each test verifies exactly one behavior or rule.
|
|
515
|
-
- A test with multiple unrelated assertions is testing multiple things — split it.
|
|
516
|
-
- Test name = the specification: `rejects_expired_tokens`, not `test_auth`.
|
|
208
|
+
- Tests are specs: write against expected behavior, never current implementation. Never weaken an assertion to match the code. Never write after-the-fact tests that lock in unverified behavior.
|
|
209
|
+
- Bug fix: start with a failing test that reproduces the bug (fails before the fix, passes after). If you can't write a reproducing test, you don't understand the bug well enough to fix it.
|
|
210
|
+
- One behavior per test; test name = spec (`rejects_expired_tokens`, not `test_auth`).
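The bug-fix rule can be sketched as one reproducing test run against two implementations. The cart functions below are hypothetical and exist only to show the test failing before the fix and passing after; in practice there is one function, and the two runs are separated by the fix commit.

```typescript
type Cart = readonly string[];

// Buggy version: drops every item after the removed one.
function removeItemBuggy(cart: Cart, item: string): Cart {
  const index = cart.indexOf(item);
  return index === -1 ? cart : cart.slice(0, index);
}

// Fixed version: removes only the requested item.
function removeItemFixed(cart: Cart, item: string): Cart {
  const index = cart.indexOf(item);
  return index === -1
    ? cart
    : [...cart.slice(0, index), ...cart.slice(index + 1)];
}

// Reproducing test, named after the single behavior it specifies.
function removing_an_item_keeps_the_items_after_it(
  remove: (cart: Cart, item: string) => Cart
): boolean {
  const result = remove(["apple", "pear", "plum"], "pear");
  return result.length === 2 && result[0] === "apple" && result[1] === "plum";
}

// RED: the test fails against the buggy code, which confirms it reproduces the bug.
const passesBeforeFix = removing_an_item_keeps_the_items_after_it(removeItemBuggy);
// GREEN: the same, unmodified test passes once the fix is in.
const passesAfterFix = removing_an_item_keeps_the_items_after_it(removeItemFixed);
```

The test is written against the expected behavior and is never edited between the two runs; only the implementation changes.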
|
|
517
211
|
|
|
518
212
|
- id: tdd-enforcement
|
|
519
213
|
tier: core
|
|
@@ -521,144 +215,47 @@ blocks:
|
|
|
521
215
|
content: |
|
|
522
216
|
## TDD Enforcement — Forbidden Patterns and Gate Protocol
|
|
523
217
|
|
|
524
|
-
|
|
525
|
-
|
|
526
|
-
|
|
527
|
-
|
|
528
|
-
|
|
529
|
-
|
|
530
|
-
- **NEVER write an implementation file before running and showing a failing test.**
|
|
531
|
-
Stating that "the test would fail" is not equivalent to running it. Run it.
|
|
532
|
-
- **NEVER write tests after implementation** except for bug fix reproduction tests on
|
|
533
|
-
pre-existing code not yet covered. Even then: write the test, show it fails, fix,
|
|
534
|
-
show it passes.
|
|
535
|
-
- **NEVER weaken an assertion** to make a test pass. If the assertion disagrees with
|
|
536
|
-
the output, the implementation is wrong.
|
|
537
|
-
- **NEVER skip the refactor phase** because "the code is clean enough." The refactor
|
|
538
|
-
phase exists to enforce separation of concerns under green. Skipping it is a
|
|
539
|
-
commitment not to separate concerns in that increment.
|
|
540
|
-
- **NEVER commit a `feat:` or `fix:` with no corresponding `test:` commit** preceding
|
|
541
|
-
it in the same branch. The test commit is the audit trail that the red phase occurred.
|
|
542
|
-
|
|
543
|
-
### The Session Gate Protocol
|
|
544
|
-
TDD across a multi-step session requires explicit checkpoints the AI reports and the
|
|
545
|
-
human can verify. At each gate, the AI must output the actual test runner output,
|
|
546
|
-
not a summary of what it expects.
|
|
218
|
+
### Forbidden (non-negotiable)
|
|
219
|
+
- NEVER write an implementation file before running and showing a failing test. "Would fail" ≠ ran it.
|
|
220
|
+
- NEVER write tests after implementation (except bug-fix repro on pre-existing code: write, show fail, fix, show pass).
|
|
221
|
+
- NEVER weaken an assertion to pass a test.
|
|
222
|
+
- NEVER skip the refactor phase.
|
|
223
|
+
- NEVER commit `feat:`/`fix:` without a preceding `test:` commit in the same branch.
|
|
547
224
|
|
|
548
|
-
|
|
549
|
-
|
|
550
|
-
|
|
551
|
-
|
|
552
|
-
│ Gate: Run test — paste full failure output │
|
|
553
|
-
│ Block: Cannot proceed until failure is shown │
|
|
554
|
-
│ Commit: test(scope): [RED] describe behavior │
|
|
555
|
-
└───────────────────┬─────────────────────────────────┘
|
|
556
|
-
│ failure confirmed
|
|
557
|
-
┌───────────────────▼─────────────────────────────────┐
|
|
558
|
-
│ PHASE 2: GREEN │
|
|
559
|
-
│ Action: Write minimum implementation │
|
|
560
|
-
│ Gate: Run test — paste full passing output │
|
|
561
|
-
│ Block: Cannot proceed until passing is shown │
|
|
562
|
-
│ Commit: feat(scope): implement to satisfy test │
|
|
563
|
-
└───────────────────┬─────────────────────────────────┘
|
|
564
|
-
│ green confirmed
|
|
565
|
-
┌───────────────────▼─────────────────────────────────┐
|
|
566
|
-
│ PHASE 3: REFACTOR │
|
|
567
|
-
│ Action: Improve structure, not behavior │
|
|
568
|
-
│ Gate: Run full suite — paste summary output │
|
|
569
|
-
│ Block: Cannot commit if any test regresses │
|
|
570
|
-
│ Commit: refactor(scope): clean without behavior │
|
|
571
|
-
└─────────────────────────────────────────────────────┘
|
|
572
|
-
```
|
|
225
|
+
### Gate Protocol — paste actual runner output at each gate, not a summary
|
|
226
|
+
- RED: write test → run, paste full failure → commit `test(scope): [RED] describe behavior`.
|
|
227
|
+
- GREEN: minimum impl → run, paste full pass → commit `feat(scope): implement to satisfy test`.
|
|
228
|
+
- REFACTOR: improve structure → run full suite, paste summary → commit `refactor(scope): clean without behavior`.
|
|
573
229
|
|
|
574
|
-
|
|
575
|
-
The git log for any feature must be readable as:
|
|
576
|
-
```
|
|
577
|
-
test(cart): [RED] add test for removing last item empties cart
|
|
578
|
-
feat(cart): remove last item empties cart
|
|
579
|
-
refactor(cart): extract empty-check to CartState predicate
|
|
580
|
-
```
|
|
581
|
-
This sequence is auditable. An AI that wrote the `feat:` commit without the preceding
|
|
582
|
-
`test:` commit either skipped the red phase entirely or conflated it with implementation.
|
|
583
|
-
The commit hook `pre-commit-tdd-check.sh` detects the second pattern before it lands.
|
|
584
|
-
|
|
585
|
-
### Why Instructions Alone Are Not Sufficient
|
|
586
|
-
A language model generating in a single context window experiences no time delay between
|
|
587
|
-
writing a test and writing an implementation that passes it. The RED phase is structurally
|
|
588
|
-
collapsed. The gates above exist precisely to make the phases non-simultaneous:
|
|
589
|
-
- The test commit must happen before the implementation can be written.
|
|
590
|
-
- The failure output must be produced (by running the code) before the game state is known.
|
|
591
|
-
- The model cannot "know" the failure output without actually running the test,
|
|
592
|
-
because the failure messages are not in the training distribution for this specific code.
|
|
593
|
-
These gates transform TDD from a discipline into a constraint.
|
|
230
|
+
The git log per feature reads as the audit trail (test → feat → refactor). `pre-commit-tdd-check.sh` detects a `feat:` with no preceding `test:`.
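A minimal sketch of the audit-trail check, written as a pure function over commit subjects. This is not the logic of `pre-commit-tdd-check.sh`, whose implementation is not shown here. The function name, the regular expression, and the lenient rule (any earlier `test:` commit on the branch counts) are all assumptions.

```typescript
// Matches a conventional-commit prefix such as "feat(cart): " or "test: ".
const TYPE_PATTERN = /^([a-z]+)(\([^)]*\))?!?: /;

function commitType(subject: string): string | null {
  const match = TYPE_PATTERN.exec(subject);
  return match?.[1] ?? null;
}

// Returns the feat/fix subjects that have no test commit before them.
// Input is the branch's commit subjects, oldest first.
function findUnbackedCommits(subjectsOldestFirst: readonly string[]): string[] {
  const violations: string[] = [];
  let seenTest = false;
  for (const subject of subjectsOldestFirst) {
    const type = commitType(subject);
    if (type === "test") {
      seenTest = true;
    } else if ((type === "feat" || type === "fix") && !seenTest) {
      violations.push(subject);
    }
  }
  return violations;
}

const auditableLog = [
  "test(cart): [RED] add test for removing last item empties cart",
  "feat(cart): remove last item empties cart",
  "refactor(cart): extract empty-check to CartState predicate",
];
const skippedRedPhase = ["feat(cart): remove last item empties cart"];

const auditableViolations = findUnbackedCommits(auditableLog);
const skippedViolations = findUnbackedCommits(skippedRedPhase);
```

A stricter variant could require one `test:` commit per `feat:`/`fix:` commit, or match scopes; the lenient rule is used here because it follows the wording of the rule most literally.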
|
|
594
231
|
|
|
595
232
|
- id: adversarial-testing
|
|
596
233
|
tier: core
|
|
597
234
|
title: "Adversarial Testing Posture"
|
|
598
235
|
content: |
|
|
599
236
|
## Adversarial Testing Posture
|
|
600
|
-
|
|
601
|
-
|
|
602
|
-
|
|
603
|
-
|
|
604
|
-
### The adversarial posture
|
|
605
|
-
- Design every test as if the implementation is wrong until proven otherwise.
|
|
606
|
-
- Write tests that FAIL on incorrect code — not tests that pass on any reasonable implementation.
|
|
607
|
-
- If a test is hard to make fail, the specification is underspecified, not the test.
|
|
608
|
-
|
|
609
|
-
### Name tests as behaviors, not paths
|
|
610
|
-
- `rejects_expired_tokens` not `test_validate_token`
|
|
611
|
-
- `throws_on_missing_required_field` not `test_error_handling`
|
|
612
|
-
- `returns_empty_list_not_null_when_no_results` not `test_query`
|
|
613
|
-
|
|
614
|
-
### Cover the adversarial surface
|
|
615
|
-
For every public function or API endpoint, write tests for:
|
|
616
|
-
1. **Valid boundary values**: minimum, maximum, exact-zero, single-element
|
|
617
|
-
2. **Invalid boundary values**: below-minimum, above-maximum, empty, null/undefined
|
|
618
|
-
3. **Constraint violations**: values that look valid but break invariants (negative balance, future birth date)
|
|
619
|
-
4. **Ordering and concurrency**: does order matter? what if called twice?
|
|
620
|
-
5. **Authorization boundaries**: can a user access another user's resource?
|
|
621
|
-
|
|
622
|
-
A test suite that only exercises the happy path is documentation, not specification.
|
|
623
|
-
Every mutation that survives is a missing adversarial test.
|
|
237
|
+
- Write tests that FAIL on incorrect code, not tests that pass on any reasonable impl. Hard to make fail = underspecified.
|
|
238
|
+
- Name as behaviors: `rejects_expired_tokens` not `test_validate_token`; `returns_empty_list_not_null_when_no_results` not `test_query`.
|
|
239
|
+
- Per public function/endpoint, cover: valid boundaries (min/max/zero/single) · invalid boundaries (below/above/empty/null) · constraint violations (negative balance, future birth date) · ordering/concurrency · authorization boundaries.
|
|
240
|
+
- Happy-path-only suite is documentation, not spec. Every surviving mutant = a missing adversarial test.
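The coverage list above can be sketched as a table of named cases for one function. The `withdraw` function and its rules are hypothetical, chosen because "negative balance" appears in the text as a constraint violation. Ordering/concurrency and authorization cases are omitted since they need more context than a pure function offers.

```typescript
// Hypothetical function under test.
function withdraw(balance: number, amount: number): number {
  if (!isFinite(balance) || !isFinite(amount)) {
    throw new RangeError("non-finite input");
  }
  if (amount <= 0) {
    throw new RangeError("amount must be positive");
  }
  if (amount > balance) {
    throw new RangeError("insufficient funds");
  }
  return balance - amount;
}

// True only when the action throws a RangeError.
function throwsRangeError(action: () => unknown): boolean {
  try {
    action();
    return false;
  } catch (error) {
    return error instanceof RangeError;
  }
}

interface NamedCase {
  name: string;
  run: () => boolean;
}

const adversarialCases: NamedCase[] = [
  // Valid boundaries
  { name: "allows_withdrawing_exact_balance", run: () => withdraw(100, 100) === 0 },
  { name: "allows_smallest_whole_amount", run: () => withdraw(100, 1) === 99 },
  // Invalid boundaries
  { name: "rejects_zero_amount", run: () => throwsRangeError(() => withdraw(100, 0)) },
  { name: "rejects_negative_amount", run: () => throwsRangeError(() => withdraw(100, -1)) },
  { name: "rejects_amount_just_above_balance", run: () => throwsRangeError(() => withdraw(100, 101)) },
  { name: "rejects_nan_amount", run: () => throwsRangeError(() => withdraw(100, NaN)) },
  // Constraint violation: looks like a number, breaks the invariant
  { name: "rejects_infinite_balance", run: () => throwsRangeError(() => withdraw(Infinity, 1)) },
];

// Names of the cases that did not hold.
const failedCases = adversarialCases
  .filter((testCase) => !testCase.run())
  .map((testCase) => testCase.name);
```

Each case is named as a behavior, so a failure reads as a broken rule rather than a broken code path.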
|
|
624
241
|
|
|
625
242
|
- id: property-based-testing
|
|
626
243
|
tier: recommended
|
|
627
244
|
title: "Property-Based Testing"
|
|
628
245
|
content: |
|
|
629
246
|
## Property-Based Testing
|
|
247
|
+
Add property tests for: pure functions with wide input domains (serialization, parsing, math, sorting); encoder/decoder pairs (`decode(encode(x)) === x`); sort idempotence (`sort(sort(xs)) === sort(xs)`); financial calculations (bounded results for all valid inputs).
|
|
630
248
|
|
|
631
|
-
|
|
632
|
-
|
|
633
|
-
|
|
634
|
-
|
|
635
|
-
|
|
636
|
-
|
|
637
|
-
|
|
638
|
-
|
|
639
|
-
- Any sort or ranking: `sort(sort(xs))` must equal `sort(xs)` (idempotence)
|
|
640
|
-
- Any financial calculation: results must be within bounds for all valid inputs
|
|
641
|
-
|
|
642
|
-
### Ecosystem tools (language-agnostic principle)
|
|
643
|
-
Use whatever property testing library matches the project's language:
|
|
644
|
-
- TypeScript / JavaScript: `fast-check`
|
|
645
|
-
- Python: `hypothesis`
|
|
646
|
-
- Java / Kotlin: `jqwik` or `kotest`
|
|
647
|
-
- Go: `gopter` or `rapid`
|
|
648
|
-
- Rust: `proptest`
|
|
649
|
-
- Scala: `scalacheck`
|
|
650
|
-
|
|
651
|
-
### Template invariant structure
|
|
652
|
-
```
|
|
653
|
-
property("encode-decode round trip", () => {
|
|
654
|
-
forAll(arbitrary_valid_input(), (input) => {
|
|
655
|
-
expect(decode(encode(input))).toEqual(input);
|
|
656
|
-
});
|
|
657
|
-
});
|
|
658
|
-
```
|
|
249
|
+
| Ecosystem | Tool |
|
|
250
|
+
|---|---|
|
|
251
|
+
| TS/JS | `fast-check` |
|
|
252
|
+
| Python | `hypothesis` |
|
|
253
|
+
| Java/Kotlin | `jqwik` / `kotest` |
|
|
254
|
+
| Go | `gopter` / `rapid` |
|
|
255
|
+
| Rust | `proptest` |
|
|
256
|
+
| Scala | `scalacheck` |
|
|
659
257
|
|
|
660
|
-
|
|
661
|
-
Property failures are bugs, not edge cases to suppress.
|
|
258
|
+
A property failure is a bug: add the failing input as a regression example test; do not suppress it.
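To make the two invariants named above concrete, here is a dependency-free sketch. It is a stand-in for a real library such as `fast-check`, not an example of that library's API: the seeded generator, `findCounterexample`, and the comma-separated `encode`/`decode` pair are all hypothetical, and there is no shrinking of failing inputs.

```typescript
// Small seeded linear congruential generator so runs are reproducible.
function makeRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

// Arbitrary: integer arrays of length 0..19 with values in [-1000, 1000].
function randomIntArray(random: () => number): number[] {
  const length = Math.floor(random() * 20);
  const values: number[] = [];
  for (let i = 0; i < length; i++) {
    values.push(Math.floor(random() * 2001) - 1000);
  }
  return values;
}

function sameArray(a: readonly number[], b: readonly number[]): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// Hypothetical functions under test.
const sortAscending = (xs: readonly number[]): number[] =>
  xs.slice().sort((a, b) => a - b);
const encode = (xs: readonly number[]): string => xs.join(",");
const decode = (text: string): number[] =>
  text === "" ? [] : text.split(",").map((part) => parseInt(part, 10));

// Invariants, stated for all inputs rather than for chosen examples.
const sortIsIdempotent = (xs: number[]): boolean =>
  sameArray(sortAscending(sortAscending(xs)), sortAscending(xs));
const roundTrips = (xs: number[]): boolean => sameArray(decode(encode(xs)), xs);

// Returns the first generated input that violates the property, or null.
function findCounterexample(
  property: (xs: number[]) => boolean,
  runs: number,
  seed: number
): number[] | null {
  const random = makeRandom(seed);
  for (let i = 0; i < runs; i++) {
    const input = randomIntArray(random);
    if (!property(input)) return input;
  }
  return null;
}
```

When `findCounterexample` returns an input, that input is the one to copy into an example-based regression test, in line with the rule above.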
|
|
662
259
|
|
|
663
260
|
- id: spec-meta-query
|
|
664
261
|
tier: recommended
|
|
@@ -712,139 +309,22 @@ blocks:
|
|
|
712
309
|
title: "Commit Protocol"
|
|
713
310
|
content: |
|
|
714
311
|
## Commit Protocol
|
|
312
|
+
- Conventional commits: `feat|fix|refactor|docs|test|chore(scope): description`.
|
|
313
|
+
- A commit must pass: compilation, lint, tests, coverage gate, mutation gate (Stryker on changed modules), anti-pattern scan.
|
|
314
|
+
- Atomic — one logical change per commit. Never combine a behavior change with a refactor.
|
|
315
|
+
- Commit BEFORE any risky refactor. Update Status.md at end of every session.
|
|
715
316
|
|
|
716
|
-
|
|
717
|
-
|
|
718
|
-
no new anti-patterns introduced.
|
|
719
|
-
|
|
720
|
-
- Conventional commits: `feat|fix|refactor|docs|test|chore(scope): description`
|
|
721
|
-
- Commits must pass: compilation, lint, tests, coverage gate, mutation score gate (Stryker on changed modules), anti-pattern scan.
|
|
722
|
-
- Keep commits atomic — one logical change per commit.
|
|
723
|
-
- Commit BEFORE any risky refactor. Tag stable states.
|
|
724
|
-
- Update Status.md at the end of every session.
|
|
725
|
-
|
|
726
|
-
### Commit Hooks — Emit, Don't Reference
|
|
727
|
-
Commit hooks, commit-message linting, and the CI pipeline must be **emitted as fenced
|
|
728
|
-
code blocks** in the first session response — not merely referenced in prose or README
|
|
729
|
-
text. A hook that exists only as "you should add a pre-commit hook" in documentation
|
|
730
|
-
provides zero enforcement. If the file is not written to disk, the gate does not exist.
|
|
731
|
-
|
|
732
|
-
The following files must be emitted for any new project:
|
|
733
|
-
|
|
734
|
-
**`package.json`** — add to `scripts` and `devDependencies`:
|
|
735
|
-
```json
|
|
736
|
-
"scripts": { "prepare": "husky install" },
|
|
737
|
-
"devDependencies": {
|
|
738
|
-
"husky": "^9.0.0",
|
|
739
|
-
"@commitlint/cli": "^19.0.0",
|
|
740
|
-
"@commitlint/config-conventional": "^19.0.0"
|
|
741
|
-
}
|
|
742
|
-
```
|
|
743
|
-
|
|
744
|
-
**`.husky/pre-commit`**:
|
|
745
|
-
```bash
|
|
746
|
-
#!/usr/bin/env sh
|
|
747
|
-
. "$(dirname -- "$0")/_/husky.sh"
|
|
748
|
-
npx tsc --noEmit && npm run lint && npm test -- --passWithNoTests
|
|
749
|
-
```
|
|
750
|
-
|
|
751
|
-
**`.husky/commit-msg`**:
|
|
752
|
-
```bash
|
|
753
|
-
#!/usr/bin/env sh
|
|
754
|
-
. "$(dirname -- "$0")/_/husky.sh"
|
|
755
|
-
npx commitlint --edit "$1"
|
|
756
|
-
```
|
|
757
|
-
|
|
758
|
-
**`commitlint.config.js`**:
|
|
759
|
-
```js
|
|
760
|
-
module.exports = { extends: ['@commitlint/config-conventional'] };
|
|
761
|
-
```
|
|
762
|
-
|
|
763
|
-
### Linter Config — Emit in P0, Don't Reference
|
|
764
|
-
Linter configuration is infrastructure, not application code. It must be committed to the
|
|
765
|
-
repo root in the **first response** (P0) alongside hooks and CI config — not added post-hoc.
|
|
766
|
-
A linter mentioned only in documentation does not enforce anything.
|
|
767
|
-
|
|
768
|
-
**TypeScript / JavaScript** — emit `.eslintrc.json` (or `eslint.config.js` for flat config):
|
|
769
|
-
```json
|
|
770
|
-
{
|
|
771
|
-
"parser": "@typescript-eslint/parser",
|
|
772
|
-
"plugins": ["@typescript-eslint"],
|
|
773
|
-
"rules": {
|
|
774
|
-
"no-unused-vars": "off",
|
|
775
|
-
"@typescript-eslint/no-unused-vars": "error",
|
|
776
|
-
"@typescript-eslint/no-explicit-any": "error"
|
|
777
|
-
}
|
|
778
|
-
}
|
|
779
|
-
```
|
|
780
|
-
|
|
781
|
-
**Python** — emit `ruff.toml` (or `[tool.ruff]` section in `pyproject.toml`):
|
|
782
|
-
```toml
|
|
783
|
-
[tool.ruff]
|
|
784
|
-
select = ["E", "F", "I"]
|
|
785
|
-
ignore = []
|
|
786
|
-
line-length = 100
|
|
787
|
-
```
|
|
788
|
-
|
|
789
|
-
**Go** — emit `.golangci.yaml`:
|
|
790
|
-
```yaml
|
|
791
|
-
linters:
|
|
792
|
-
enable:
|
|
793
|
-
- unused
|
|
794
|
-
- govet
|
|
795
|
-
- errcheck
|
|
796
|
-
```
|
|
797
|
-
|
|
798
|
-
The correct linter config for **this project's language** must be committed to the repo root
|
|
799
|
-
in the same response that emits hooks and CI. Discovering lint errors at code review is too late.
|
|
800
|
-
|
|
801
|
-
### CI Pipeline — Emit, Don't Reference
|
|
802
|
-
`.github/workflows/ci.yml` must be emitted as a fenced code block in the first response.
|
|
803
|
-
A CI configuration described only in documentation does not enforce anything.
|
|
804
|
-
Adapt service blocks, branch names, and language-specific commands to the project stack.
|
|
805
|
-
The mutation gate step (`npx stryker run` for JS/TS, `mutmut run` for Python, `pitest` for
|
|
806
|
-
Java) is non-negotiable — it is the only gate that verifies test quality, not just
|
|
807
|
-
test execution. Line coverage at 80% can coexist with 58% mutation score; the mutation
|
|
808
|
-
gate catches the difference.
|
|
809
|
-
|
|
810
|
-
Minimum CI for a Node.js/TypeScript project:
|
|
811
|
-
```yaml
|
|
812
|
-
name: CI
|
|
813
|
-
on:
|
|
814
|
-
push:
|
|
815
|
-
branches: [main, develop]
|
|
816
|
-
pull_request:
|
|
817
|
-
branches: [main, develop]
|
|
818
|
-
jobs:
|
|
819
|
-
ci:
|
|
820
|
-
runs-on: ubuntu-latest
|
|
821
|
-
steps:
|
|
822
|
-
- uses: actions/checkout@v4
|
|
823
|
-
- uses: actions/setup-node@v4
|
|
824
|
-
with:
|
|
825
|
-
node-version: '20'
|
|
826
|
-
cache: 'npm'
|
|
827
|
-
- run: npm ci
|
|
828
|
-
- run: npx tsc --noEmit
|
|
829
|
-
- run: npm run lint
|
|
830
|
-
- run: npm test -- --coverage --passWithNoTests
|
|
831
|
-
- name: Mutation gate
|
|
832
|
-
run: npx stryker run
|
|
833
|
-
```
|
|
317
|
+
### One logical change =
|
|
318
|
+
feature+tests · behavior-preserving refactor · spec change + its code change · bug fix + repro test.
|
|
834
319
|
|
|
835
|
-
###
|
|
836
|
-
|
|
837
|
-
-
|
|
838
|
-
|
|
839
|
-
The AI uses commit history as context in future sessions. Typed, scoped conventional
|
|
840
|
-
messages are a queryable episodic record. `wip` and `changes` are not.
|
|
320
|
+
### Message precision
|
|
321
|
+
- ❌ `fix bug` — not queryable.
|
|
322
|
+
- ✅ `fix(auth): reject expired tokens at middleware boundary before service invocation`
|
|
323
|
+
Commit history is episodic memory the AI reads in future sessions; `wip`/`changes` are not.
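The message format can be checked mechanically. The sketch below is an assumption-laden illustration, not the commitlint rule set: `parseCommitSubject` is a hypothetical helper, and it treats the scope as mandatory because the format line above shows it that way.

```typescript
interface ParsedSubject {
  type: string;
  scope: string;
  description: string;
}

// type(scope): description, restricted to the types listed in the protocol.
const SUBJECT_PATTERN =
  /^(feat|fix|refactor|docs|test|chore)\(([^()\s]+)\): (\S.*)$/;

// Returns the parsed parts, or null when the subject is not queryable.
function parseCommitSubject(subject: string): ParsedSubject | null {
  const match = SUBJECT_PATTERN.exec(subject);
  if (match === null) return null;
  const type = match[1];
  const scope = match[2];
  const description = match[3];
  if (type === undefined || scope === undefined || description === undefined) {
    return null;
  }
  return { type, scope, description };
}

const vague = parseCommitSubject("fix bug");
const precise = parseCommitSubject(
  "fix(auth): reject expired tokens at middleware boundary before service invocation"
);
```

A parsed subject can be filtered by type and scope later, which is what makes the history usable as a record; `fix bug`, `wip`, and `changes` all parse to `null`.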
|
|
841
324
|
|
|
842
|
-
###
|
|
843
|
-
-
|
|
844
|
-
-
|
|
845
|
-
- A spec update (constitution change + the code change it governs): one commit.
|
|
846
|
-
- A bug fix with the reproducing test included: one commit.
|
|
847
|
-
Never combine a behavior change with a refactor in the same commit.
|
|
325
|
+
### Emit, Don't Reference (P0/P1)
|
|
326
|
+
Hooks (`.husky/pre-commit`, `.husky/commit-msg`, `commitlint.config.js`), linter config (`.eslintrc.json`/`ruff.toml`/`.golangci.yaml` for this stack), and `.github/workflows/ci.yml` must be written to disk as fenced code blocks in the first response — not referenced in prose. If the file is not on disk, the gate does not exist. The hook stack emits these configs in P0/P1.
|
|
327
|
+
- CI steps: checkout → install → type-check → lint → test --coverage → mutation gate (`stryker run`/`mutmut run`/`pitest`). The mutation gate is non-negotiable — it verifies test quality, not just execution.
|
|
848
328
|
|
|
849
329
|
- id: clarification-protocol
|
|
850
330
|
tier: core
|
|
@@ -868,56 +348,20 @@ blocks:
|
|
|
868
348
|
title: "Feature Completion Protocol"
|
|
869
349
|
content: |
|
|
870
350
|
## Feature Completion Protocol
|
|
871
|
-
|
|
872
|
-
|
|
873
|
-
|
|
874
|
-
|
|
875
|
-
(Or `npm test` + manual HTTP check if forgecraft is not installed.)
|
|
876
|
-
A feature is not done until verify passes. Do not proceed to docs if it fails.
|
|
877
|
-
|
|
878
|
-
### 2. Commit (code only)
|
|
879
|
-
Commit after `verify` passes. This triggers CI and the staging deploy pipeline.
|
|
880
|
-
`feat(scope): <description>` — describes the feature, not the docs update.
|
|
881
|
-
|
|
882
|
-
### 3. Deploy to Staging + Smoke Gate
|
|
883
|
-
After the CI pipeline deploys to staging, run the smoke suite:
|
|
884
|
-
```
|
|
885
|
-
npx playwright test --config playwright.smoke.config.ts --grep @smoke
|
|
886
|
-
```
|
|
887
|
-
If smoke fails: **revert the deploy**. Do not proceed to production and do not cascade docs
|
|
888
|
-
for a feature that is broken in the deployed environment.
|
|
889
|
-
|
|
890
|
-
### 4. Doc Sync Cascade
|
|
891
|
-
Update the following in order — skip any that do not exist in this project:
|
|
892
|
-
1. **spec.md** — update the relevant feature section (APIs, behavior, contract changes)
|
|
893
|
-
2. **docs/adrs/** — add an ADR if a new architectural decision was made
|
|
894
|
-
3. **docs/diagrams/c4-*.md** — update `c4-context.md` or `c4-container.md` if a new
|
|
895
|
-
module, container, or external dependency was added. Diagrams must be written to disk
|
|
896
|
-
as fenced Mermaid blocks — updating prose that references a diagram is not an update.
|
|
897
|
-
4. **docs/diagrams/sequence-*.md / state-*.md / flow-*.md** — update or create the
|
|
898
|
-
relevant diagram file for the changed surface. Sequence diagrams must name real
|
|
899
|
-
participants; state diagrams must name real states and transitions; flow diagrams must
|
|
900
|
-
have entry/exit nodes and decision diamonds. A file containing only `<!-- UNFILLED -->`
|
|
901
|
-
markers is a specification gap, not a completed diagram.
|
|
902
|
-
5. **docs/TechSpec.md** — update module list, API reference, or technology choice sections
|
|
903
|
-
6. **docs/use-cases.md** — update or add use cases if new actor interactions were introduced
|
|
904
|
-
7. **Status.md** — always update: what changed, current state, next steps
|
|
351
|
+
1. **Verify** (local): `npx forgecraft-mcp verify .` (or `npm test` + manual HTTP check). Not done until it passes; don't proceed to docs on failure.
|
|
352
|
+
2. **Commit** (code only) after verify passes: `feat(scope): <description>`. Triggers CI + staging deploy.
|
|
353
|
+
3. **Smoke gate**: after staging deploy, `npx playwright test --config playwright.smoke.config.ts --grep @smoke`. On failure: revert the deploy, do not cascade docs.
|
|
354
|
+
4. **Doc sync cascade** in order (skip non-existent): spec.md → docs/adrs/ (if new decision) → docs/diagrams/c4-*.md → docs/diagrams/sequence|state|flow-*.md → docs/TechSpec.md → docs/use-cases.md → Status.md (always). Diagrams written to disk as real Mermaid (named participants/states/nodes); `<!-- UNFILLED -->` is a gap, not a diagram.
|
|
905
355
|
|
|
906
356
|
- id: mcp-tooling
|
|
907
357
|
tier: recommended
|
|
908
358
|
title: "MCP-Powered Tooling"
|
|
909
359
|
content: |
|
|
910
360
|
## MCP-Powered Tooling
|
|
911
|
-
### CodeSeeker —
|
|
912
|
-
|
|
913
|
-
|
|
914
|
-
-
|
|
915
|
-
- **Graph traversal**: imports, calls, extends — follow dependency chains.
|
|
916
|
-
- **Coding standards**: auto-detected validation, error handling, and state patterns.
|
|
917
|
-
- **Contextual reads**: `get_file_context` returns a file with its related code.
|
|
918
|
-
Indexing is automatic on first search (~30s–5min depending on codebase size).
|
|
919
|
-
Most valuable on mid-to-large projects (10K+ files) with established patterns.
|
|
920
|
-
Install: `npx codeseeker install --vscode` or see https://github.com/jghiringhelli/codeseeker
|
|
361
|
+
### CodeSeeker — graph-powered code intelligence (hybrid vector+text+path, RRF)
|
|
362
|
+
- Semantic search (beyond grep), graph traversal (imports/calls/extends), auto-detected standards, `get_file_context` contextual reads.
|
|
363
|
+
- Auto-indexes on first search (~30s–5min). Most valuable on 10K+ file projects.
|
|
364
|
+
- Install: `npx codeseeker install --vscode` — https://github.com/jghiringhelli/codeseeker
|
|
921
365
|
|
|
922
366
|
- id: engineering-preferences
|
|
923
367
|
tier: recommended
|
|
@@ -938,108 +382,36 @@ blocks:
|
|
|
938
382
|
title: "Agent Mechanical Constraints"
|
|
939
383
|
content: |
|
|
940
384
|
## Agent Mechanical Constraints — Non-Negotiable Overrides
|
|
941
|
-
|
|
942
|
-
|
|
943
|
-
|
|
944
|
-
|
|
945
|
-
|
|
946
|
-
|
|
947
|
-
|
|
948
|
-
2. Commit that cleanup separately under `chore(scope): strip dead code`.
|
|
949
|
-
3. Only then begin the real work.
|
|
950
|
-
Dead code accelerates context compaction. Every orphaned import is budget spent on
|
|
951
|
-
tokens that contribute nothing to the task.
|
|
952
|
-
|
|
953
|
-
### 2. Phase Limit — 5 Files Maximum
|
|
954
|
-
Never attempt multi-file refactors in a single response. Break work into explicit phases.
|
|
955
|
-
Each phase must touch **no more than 5 files**. Complete Phase N, verify it compiles and
|
|
956
|
-
all tests pass, then wait for explicit approval before Phase N+1.
|
|
957
|
-
Why: context compaction fires around 167K tokens and discards all intermediate reasoning,
|
|
958
|
-
file reads, and decision chains. Keeping phases small prevents mid-task compaction.
|
|
959
|
-
|
|
960
|
-
### 3. Senior Dev Override
|
|
961
|
-
The model's default system-level directives instruct it to "try the simplest approach
|
|
962
|
-
first" and "avoid refactoring beyond what was asked." These are active constraints that
|
|
963
|
-
fight your intent on architectural tasks.
|
|
964
|
-
**Override them explicitly** when quality requires it:
|
|
965
|
-
> "What would a senior, experienced, perfectionist dev reject in code review? Fix all of it."
|
|
966
|
-
This reframes what "done" means. System-level brevity mandates yield to an explicit
|
|
967
|
-
quality bar stated in the prompt.
|
|
968
|
-
|
|
969
|
-
### 4. Sub-Agent Parallelism for Large Tasks
|
|
970
|
-
For tasks spanning > 5 independent files, **launch parallel sub-agents** (5–8 files per
|
|
971
|
-
agent). Each agent gets its own isolated context window and compaction cycle.
|
|
972
|
-
Sequential processing of large tasks guarantees context decay by the later files.
|
|
973
|
-
Batching into parallel agents multiplies the effective working memory proportionally.
|
|
974
|
-
|
|
975
|
-
### 5. File Read Budget — 2,000-Line Cap
|
|
976
|
-
Each file read is hard-capped at 2,000 lines. Everything past that is silently truncated.
|
|
977
|
-
The model does not know what it didn't see — it will hallucinate the rest.
|
|
978
|
-
**For any file over 500 LOC**: read in sequential chunks using `offset` and `limit`
|
|
979
|
-
parameters. Never assume a single read captured the full file.
|
|
980
|
-
|
|
981
|
-
### 6. Tool Result Truncation
|
|
982
|
-
Tool results exceeding ~50,000 characters are truncated to a 2,000-byte preview.
|
|
983
|
-
The model works from the preview and does not know results were cut.
|
|
984
|
-
If any search returns suspiciously few results: re-run it with narrower scope
|
|
985
|
-
(single directory, stricter glob). State explicitly when truncation may have occurred.
|
|
986
|
-
|
|
987
|
-
### 7. Grep Is Not an AST
|
|
988
|
-
`grep` is raw text pattern matching. It cannot distinguish a function call from a
|
|
989
|
-
comment, a type reference from a string literal, or an import from one module vs another.
|
|
990
|
-
On any rename or signature change, search **separately** for:
|
|
991
|
-
- Direct calls and references
|
|
992
|
-
- Type-level references (interfaces, generics, `typeof`)
|
|
993
|
-
- String literals containing the name
|
|
994
|
-
- Dynamic imports and `require()` calls
|
|
995
|
-
- Re-exports and barrel file entries (`index.ts`, `__init__.py`)
|
|
996
|
-
- Test files and mocks
|
|
997
|
-
Never assume a single grep caught everything. Verify or expect regressions.
|
|
385
|
+
1. **Dead code first**: before any refactor on a file > 300 LOC, strip dead props/exports/imports/logs and commit `chore(scope): strip dead code` separately.
|
|
386
|
+
2. **Phase limit — 5 files max** per response. Complete a phase, verify compile+tests, await approval before the next. (Compaction fires ~167K tokens and discards intermediate reasoning.)
|
|
387
|
+
3. **Senior dev override**: explicitly counter the "simplest approach / don't refactor beyond ask" defaults when quality requires — "What would a perfectionist senior reject in review? Fix all of it."
|
|
388
|
+
4. **Sub-agent parallelism**: for > 5 independent files, launch parallel sub-agents (5–8 files each) for isolated context windows.
|
|
389
|
+
5. **File read budget — 2,000-line cap** (silent truncation beyond). For files > 500 LOC, read in `offset`/`limit` chunks.
|
|
390
|
+
6. **Tool result truncation**: results > ~50K chars truncate to a 2K preview. On suspiciously few results, re-run narrower and state when truncation may have occurred.
|
|
391
|
+
7. **Grep is not an AST**: on rename/signature change, search separately for direct calls, type-level refs, string literals, dynamic imports/`require()`, re-exports/barrels (`index.ts`/`__init__.py`), and test files/mocks.
|
|
998
392
|
|
|
999
393
|
- id: code-generation-verification
|
|
1000
394
|
tier: core
|
|
1001
395
|
title: "Code Generation — Self-Verify Loop"
|
|
1002
396
|
content: |
|
|
1003
397
|
## Code Generation — Verify Before Returning
|
|
398
|
+
Show the evidence — do not claim without running.
|
|
399
|
+
1. **Compile**: `tsc --noEmit` / `mypy` / equiv — 0 errors.
|
|
400
|
+
2. **Test suite**: full run (`jest --runInBand`, `pytest`) — 0 failures.
|
|
401
|
+
3. **Interface consistency**: when changing a signature, fix ALL callers in the same pass (else oscillation).
|
|
402
|
+
4. **DRY check**: duplication < 5% (min-tokens 50) on `src/` — see project-gates.yaml `no-code-duplication`; extract above threshold.
|
|
403
|
+
5. **Interface completeness**: every interface method implemented by its concrete class — see `interface-contract-completeness`.
|
|
1004
404
|
|
|
1005
|
-
|
|
1006
|
-
until the following are true. Show the evidence in your response — do not claim without running.
|
|
1007
|
-
|
|
1008
|
-
### Verification steps (in order)
|
|
1009
|
-
1. **Compile check**: Run `tsc --noEmit` (TypeScript), `mypy` (Python), or equivalent.
|
|
1010
|
-
Zero errors required. Do not return with type errors outstanding.
|
|
1011
|
-
2. **Test suite**: Run the full test suite (`jest --runInBand`, `pytest`, etc.).
|
|
1012
|
-
Zero failures required. Fix every failure before returning.
|
|
1013
|
-
3. **Interface consistency**: When fixing a compile error in file A, check ALL callers of
|
|
1014
|
-
the changed interface. Fixing one side without seeing the other causes oscillation:
|
|
1015
|
-
the model fixes `service.ts` (3-param signature) but `routes.ts` still calls it with
|
|
1016
|
-
an object — same error reappears inverted next pass.
|
|
1017
|
-
4. **§8 DRY Check**: Run duplication detector on `src/`. Duplicated lines must be < 5%
|
|
1018
|
-
(min-tokens 50). Use the tool appropriate for your stack (see project-gates.yaml:
|
|
1019
|
-
`no-code-duplication`). If above threshold, extract duplicated logic to a shared utility
|
|
1020
|
-
before closing.
|
|
1021
|
-
5. **§9 Interface Completeness**: Every method declared in each interface must be implemented
|
|
1022
|
-
by its concrete class. Run static type checking (0 errors required). Use the tool
|
|
1023
|
-
appropriate for your stack (see project-gates.yaml: `interface-contract-completeness`).
|
|
1024
|
-
If errors exist, implement missing methods before closing.
|
|
1025
|
-
|
|
1026
|
-
### Required evidence in the final response
|
|
405
|
+
Required evidence:
|
|
1027
406
|
```
|
|
1028
407
|
tsc --noEmit: 0 errors
|
|
1029
408
|
Jest: 109 passed, 0 failed, 11 suites
|
|
1030
409
|
```
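One way to produce evidence lines like those above is to run each gate as a subprocess and keep its exit code and last output line. This is a rough sketch, not part of the package: `GateResult`, `run_gate`, and `verify` are hypothetical names, and the commands passed in are whatever the project actually uses.

```python
import subprocess
from typing import List, NamedTuple, Sequence, Tuple


class GateResult(NamedTuple):
    name: str
    exit_code: int
    tail: str  # last line of combined output, kept as evidence


def run_gate(name: str, command: Sequence[str]) -> GateResult:
    """Run one gate command and capture its exit code and final output line."""
    proc = subprocess.run(list(command), capture_output=True, text=True)
    lines = (proc.stdout + proc.stderr).strip().splitlines()
    return GateResult(name, proc.returncode, lines[-1] if lines else "")


def verify(gates: Sequence[Tuple[str, Sequence[str]]]) -> List[GateResult]:
    """Run gates in order and stop at the first failure."""
    results: List[GateResult] = []
    for name, command in gates:
        result = run_gate(name, command)
        results.append(result)
        if result.exit_code != 0:
            break  # later gates are meaningless until this one is fixed
    return results
```

In a TypeScript project the gate list might be `[("compile", ["npx", "tsc", "--noEmit"]), ("tests", ["npx", "jest", "--runInBand"])]`, assuming both tools are installed locally.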
|
|
1031
410
|
|
|
1032
|
-
###
|
|
1033
|
-
-
|
|
1034
|
-
|
|
1035
|
-
|
|
1036
|
-
- **`deleteMany` in FK order, not `DROP SCHEMA`**.
|
|
1037
|
-
`$executeRawUnsafe('DROP SCHEMA public CASCADE; CREATE SCHEMA public;')` throws
|
|
1038
|
-
error 42601 — pg rejects multi-statement queries in prepared statements.
|
|
1039
|
-
Use ordered `deleteMany()` calls in `beforeEach` instead.
|
|
1040
|
-
- **JWT_SECRET minimum length**: HS256 requires ≥ 32 characters.
|
|
1041
|
-
Test secrets like `"test-secret"` (11 chars) cause startup errors.
|
|
1042
|
-
Use `"test-secret-that-is-at-least-32-chars"` in test env.
|
|
411
|
+
### Test-setup pitfalls (TS/Prisma)
|
|
412
|
+
- Use `prisma db push --accept-data-loss`, not `migrate deploy` (no-ops without a migrations folder).
|
|
413
|
+
- Reset via ordered `deleteMany()` in FK order, not `DROP SCHEMA` (pg error 42601 on multi-statement).
|
|
414
|
+
- JWT_SECRET ≥ 32 chars (HS256) in test env.
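The secret-length pitfall can be caught before the app starts with a small guard. This is a sketch under two assumptions: "characters" is read as UTF-8 bytes (32 bytes is 256 bits, the HS256 hash size), and the name `check_jwt_secret` is hypothetical.

```python
MIN_HS256_SECRET_BYTES = 32  # 256 bits, the size of the HS256 hash output


def check_jwt_secret(secret: str) -> None:
    """Raise ValueError when the secret is too short for HS256."""
    size = len(secret.encode("utf-8"))
    if size < MIN_HS256_SECRET_BYTES:
        raise ValueError(
            f"JWT_SECRET is {size} bytes; HS256 needs at least {MIN_HS256_SECRET_BYTES}"
        )
```

Calling this from the test setup gives a clear failure message naming the secret length, which is easier to diagnose than a generic startup error.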
|
|
1043
415
|
|
|
1044
416
|
- id: known-pitfalls
|
|
1045
417
|
tier: core
|
|
@@ -1089,8 +461,7 @@ blocks:
|
|
|
1089
461
|
content: |
|
|
1090
462
|
## Testing Architecture
|
|
1091
463
|
|
|
1092
|
-
### Test Types by Scope and Purpose
|
|
1093
|
-
Listed from fastest/most-isolated to slowest/most-integrated:
|
|
464
|
+
### Test Types by Scope and Purpose (fast/isolated → slow/integrated)
|
|
1094
465
|
|
|
1095
466
|
| Type | Description | Tooling |
|
|
1096
467
|
|---|---|---|
|
|
@@ -1116,19 +487,9 @@ blocks:
|
|
|
1116
487
|
| **Exploratory** | Manual, session-based, scheduled. Charter-driven. Findings become regression tests. | Manual + session notes |
|
|
1117
488
|
|
|
1118
489
|
### Variant Coverage Dimensions
|
|
1119
|
-
|
|
1120
|
-
|
|
1121
|
-
- **Happy path** — nominal, valid inputs. Necessary but never sufficient.
|
|
1122
|
-
- **Sad / Negative path** — correct rejection of invalid input or sequences.
|
|
1123
|
-
- **Edge case / BVA** — boundary values: max, min, empty, null, type coercions.
|
|
1124
|
-
- **Corner case** — intersection of two or more simultaneous edge conditions. Requires explicit enumeration.
|
|
1125
|
-
- **State transition** — valid and invalid state machine transitions. Requires a state diagram as prerequisite.
|
|
1126
|
-
- **Equivalence partitioning** — one representative from each equivalence class. Reduces test count without reducing coverage.
|
|
1127
|
-
- **Error path** — infrastructure/dependency failure: timeout, 500, DB refused, queue full — conditions the user did not cause.
|
|
1128
|
-
- **Security / Adversarial input** — SQL injection, XSS, path traversal, oversized payloads, malformed tokens. Required at every layer touching user-supplied data.
|
|
1129
|
-
- **Random / Monkey** — unstructured random input. Subsumed by property-based layer.
|
|
490
|
+
Happy path · Sad/Negative · Edge/BVA (max/min/empty/null/coercion) · Corner (intersecting edges) · State transition (needs state diagram) · Equivalence partition · Error path (timeout/500/DB refused) · Security/Adversarial (SQLi, XSS, path traversal, oversized, malformed tokens) · Random/Monkey (via property-based).
|
|
1130
491
|
|
|
1131
|
-
**Variant coverage matrix** (✓
|
|
492
|
+
**Variant coverage matrix** (✓ required, ~ structural, — n/a):
|
|
1132
493
|
|
|
1133
494
|
| Variant | Unit | Integration | Contract | API | E2E | Smoke | Chaos |
|
|
1134
495
|
|---|---|---|---|---|---|---|---|
|
|
@@ -1142,8 +503,7 @@ blocks:
|
|
|
1142
503
|
| Security / Adversarial | — | — | — | ✓ | — | — | ~ always adversarial |
|
|
1143
504
|
| Random / Monkey | via property-based | — | — | — | — | — | ✓ |
|
|
1144
505
|
|
|
1145
|
-
### Test Pipeline Mapping
|
|
1146
|
-
Each trigger gate accumulates the prior gates. A gate may not be skipped.
|
|
506
|
+
### Test Pipeline Mapping (each gate accumulates prior gates; none skippable)
|
|
1147
507
|
|
|
1148
508
|
| Trigger | Gate Contents | Target Duration |
|
|
1149
509
|
|---|---|---|
|
|
@@ -1161,10 +521,7 @@ blocks:
|
|
|
1161
521
|
title: "Active Release Phase Gate"
|
|
1162
522
|
content: |
|
|
1163
523
|
## Active Release Phase: {{release_phase | default: development}}
|
|
1164
|
-
|
|
1165
|
-
Your current phase determines which test gates are **required now**, not advisory.
|
|
1166
|
-
The full taxonomy and trigger mapping are in the Testing section above.
|
|
1167
|
-
Read your phase row below and apply every requirement listed.
|
|
524
|
+
Your phase determines which gates are blocking NOW. Apply every requirement in your phase row.
|
|
1168
525
|
|
|
1169
526
|
| Phase | Required now — blocking | Not required yet |
|
|
1170
527
|
|---|---|---|
|
|
@@ -1175,89 +532,18 @@ blocks:
|
|
|
1175
532
|
|
|
1176
533
|
**Current active phase: `{{release_phase | default: development}}`**
|
|
1177
534
|
|
|
1178
|
-
> If
|
|
1179
|
-
> Hardening tests (load, DAST, penetration) are REQUIRED in this session, not deferred.
|
|
1180
|
-
> Do not proceed to merge without completing the required gate for your phase.
|
|
1181
|
-
> The Testing section above maps each gate to its tooling and target duration.
|
|
535
|
+
> If `pre-release`/`release-candidate`: hardening tests (load, DAST, penetration) are REQUIRED this session, not deferred. Do not merge without completing your phase's gate.
|
|
1182
536
|
|
|
1183
537
|
- id: gs-test-techniques
|
|
1184
538
|
tier: recommended
|
|
1185
539
|
title: "Generative Specification — Testing Techniques"
|
|
1186
540
|
content: |
|
|
1187
541
|
## Generative Specification: Testing Techniques
|
|
1188
|
-
|
|
1189
|
-
|
|
1190
|
-
|
|
1191
|
-
|
|
1192
|
-
|
|
1193
|
-
- Tests are written to FAIL on incorrect code — to find the input or condition that exposes
|
|
1194
|
-
a violation, not to confirm the current behavior.
|
|
1195
|
-
- Tests must be written against interfaces, not implementations.
|
|
1196
|
-
A test coupled to internal state fails on correct refactors and passes on behavioral violations
|
|
1197
|
-
that happen to preserve internal structure. That is the worst outcome.
|
|
1198
|
-
|
|
1199
|
-
### Expose-Store-to-Window (Interactive / Game / Real-Time UIs)
|
|
1200
|
-
For applications with a shared state store (Redux, Zustand, Pinia, state machine), expose the
|
|
1201
|
-
store to `window` in the test environment:
|
|
1202
|
-
```typescript
|
|
1203
|
-
if (process.env.NODE_ENV === 'test') {
|
|
1204
|
-
(window as any).__store = store;
|
|
1205
|
-
}
|
|
1206
|
-
```
|
|
1207
|
-
Playwright tests can then assert both what the screen renders AND what the application believes
|
|
1208
|
-
is true — the store's internal state — without coupling assertions to DOM structure. This catches
|
|
1209
|
-
the failure class that renders correctly but corrupts internal state (score displays right, stored wrong;
|
|
1210
|
-
entity in undefined state not yet manifested as a visual defect).
|
|
1211
|
-
|
|
1212
|
-
### Vertical Chain Test
|
|
1213
|
-
A single UI action triggers Playwright, which then:
|
|
1214
|
-
1. Queries the service layer response
|
|
1215
|
-
2. Queries the database state and any affected indexes
|
|
1216
|
-
3. Verifies correct propagation through every boundary the action crosses
|
|
1217
|
-
4. Returns to the UI to confirm the visible outcome matches the stored state
|
|
1218
|
-
|
|
1219
|
-
Not a unit test, not a visual check, not a flow test: a chain verification. One trigger, inspected
|
|
1220
|
-
at every boundary it crosses. Specify which critical flows receive this treatment in the test
|
|
1221
|
-
architecture document. A defect anywhere in the chain (service logic, persistence, index consistency,
|
|
1222
|
-
UI rendering) is surfaced in a single pass.
|
|
1223
|
-
|
|
1224
|
-
### Mutation Testing as Adversarial Audit
|
|
1225
|
-
An AI-generated test suite carries a structural risk: tests written by a system that knows the
|
|
1226
|
-
correct implementation may pass it rather than catch violations of it.
|
|
1227
|
-
- Run Stryker (JS/TS) or mutmut (Python) against every AI-generated suite before accepting it.
|
|
1228
|
-
- A test that passes a mutant is not testing the contract — it is confirming the absence of one
|
|
1229
|
-
specific mutation, no more.
|
|
1230
|
-
- Coverage measures what was executed. Mutation score measures what was caught. The second is
|
|
1231
|
-
the meaningful metric.
|
|
1232
|
-
- Gates: 70% mutation score at PR, 80% at release candidate on changed code.
|
|
1233
|
-
|
|
1234
|
-
### Multimodal Quality Gates (Generative Assets)
|
|
1235
|
-
When content is AI-generated (images, audio, video), the acceptance criteria must be executable.
|
|
1236
|
-
Manual review at scale is not a pipeline.
|
|
1237
|
-
|
|
1238
|
-
**Visual assets (sprite sheets, generated imagery):**
|
|
1239
|
-
```python
|
|
1240
|
-
# PCA-based orientation check
|
|
1241
|
-
import numpy as np
from skimage.metrics import structural_similarity as ssim
from sklearn.decomposition import PCA
|
|
1242
|
-
pca = PCA(n_components=2).fit(ship_pixel_coordinates)
|
|
1243
|
-
angle = np.degrees(np.arctan2(*pca.components_[0][::-1]))
|
|
1244
|
-
assert abs(angle) <= 15, f"Sprite orientation {angle:.1f}° exceeds 15° tolerance"
|
|
1245
|
-
|
|
1246
|
-
# Symmetry check (horizontal flip similarity)
|
|
1247
|
-
similarity = ssim(img_half_left, np.fliplr(img_half_right))
|
|
1248
|
-
assert similarity >= 0.85, f"Symmetry {similarity:.2f} below 0.85 threshold"
|
|
1249
|
-
```
|
|
1250
|
-
|
|
1251
|
-
**Audio assets:**
|
|
1252
|
-
- Loudness normalization: assert target LUFS within ±1 dB of spec (pyloudnorm).
|
|
1253
|
-
- Frequency profile: no asset competes in the 2–4 kHz presence range during dialogue.
|
|
1254
|
-
- Silence detection: reject assets with generation artifacts (> X ms silence in unexpected positions).
|
|
1255
|
-
|
|
1256
|
-
**MCP-mediated inspection (judgment-requiring defects):**
|
|
1257
|
-
An instrumented game/app state exposed through an MCP server lets a language model
|
|
1258
|
-
evaluate whether a running scene satisfies its acceptance criteria without pre-scripting
|
|
1259
|
-
every assertion. Feed the model the scene spec + MCP access; it reports violations.
|
|
1260
|
-
This addresses defects that are easy to name but hard to encode as assertions.
|
|
542
|
+
- **Adversarial posture**: write tests to FAIL on incorrect code; against interfaces, not internal state.
|
|
543
|
+
- **Expose-store-to-window**: for shared-state UIs (Redux/Zustand/Pinia), set `window.__store` in test env so Playwright asserts internal state, not just DOM. Catches "renders right, stored wrong".
|
|
544
|
+
- **Vertical chain test**: one UI action → assert service response → DB state + indexes → back to UI. Surfaces a defect anywhere in the chain in one pass. Name which critical flows get it.
|
|
545
|
+
- **Mutation as adversarial audit**: run Stryker (JS/TS) / mutmut (Python) on every AI-generated suite. Gates: 70% at PR, 80% at RC on changed code.
|
|
546
|
+
- **Multimodal quality gates** (generated assets): executable acceptance, not manual review. Visual: PCA orientation (≤15°), SSIM symmetry (≥0.85). Audio: LUFS ±1 dB, frequency profile, silence detection. MCP-mediated inspection for judgment-requiring defects.
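The mutation gates above (70% at PR, 80% at RC) can be sketched as a simple threshold check. The score here is assumed to be killed mutants over killed plus survived. Real tools such as Stryker and mutmut also report timeouts and uncovered mutants, so take the score from the tool's own report where possible. `mutation_score`, `passes_gate`, and the stage keys are hypothetical names.

```python
GATES = {"pr": 70.0, "release-candidate": 80.0}  # percentages from the text above


def mutation_score(killed: int, survived: int) -> float:
    """Percentage of mutants the test suite caught (simplified definition)."""
    total = killed + survived
    if total == 0:
        raise ValueError("no mutants were generated; the score is undefined")
    return 100.0 * killed / total


def passes_gate(stage: str, killed: int, survived: int) -> bool:
    """True when the score meets the threshold for the given stage."""
    return mutation_score(killed, survived) >= GATES[stage]
```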
|
|
1261
547
|
|
|
1262
548
|
- id: artifact-grammar
|
|
1263
549
|
tier: core
|
|
@@ -1357,140 +643,45 @@ blocks:
|
|
|
1357
643
|
title: "ADR Protocol"
|
|
1358
644
|
content: |
|
|
1359
645
|
## ADR Protocol — Persistent Memory
|
|
1360
|
-
|
|
1361
|
-
Every non-obvious architectural decision produces an ADR before implementation begins.
|
|
1362
|
-
An unrecorded architectural decision is a gap in the grammar.
|
|
1363
|
-
|
|
1364
|
-
Without an ADR, the AI will "improve" intentional decisions that appear suboptimal
|
|
1365
|
-
without context — turning deliberate architectural tradeoffs into silently-introduced drift.
|
|
646
|
+
Every non-obvious architectural decision produces an ADR before implementation.
|
|
1366
647
|
|
|
1367
648
|
### Format (minimal)
|
|
1368
649
|
```markdown
|
|
1369
|
-
# ADR-NNNN: [
|
|
1370
|
-
|
|
650
|
+
# ADR-NNNN: [Title]
|
|
1371
651
|
**Date**: YYYY-MM-DD
|
|
1372
652
|
**Status**: Proposed | Accepted | Deprecated | Superseded by ADR-NNNN
|
|
1373
|
-
|
|
1374
|
-
## Context
|
|
1375
|
-
What is the situation that requires a decision? What forces are in tension?
|
|
1376
|
-
|
|
1377
|
-
## Decision
|
|
1378
|
-
What was decided? State it plainly.
|
|
1379
|
-
|
|
1380
|
-
## Alternatives Considered
|
|
1381
|
-
What other options were evaluated and why were they not chosen?
|
|
1382
|
-
|
|
1383
|
-
## Consequences
|
|
1384
|
-
What becomes easier or harder as a result of this decision?
|
|
1385
|
-
What will the AI need to know to work within this constraint?
|
|
653
|
+
## Context / Decision / Alternatives Considered / Consequences
|
|
1386
654
|
```
|
|
1387
655
|
|
|
1388
|
-
### When to
|
|
1389
|
-
-
|
|
1390
|
-
- Any decision that involves a tradeoff (performance vs. simplicity, security vs. UX)
|
|
1391
|
-
- Any decision that was reached after considering alternatives
|
|
1392
|
-
- Any decision that future engineers (or AI sessions) might be tempted to "fix"
|
|
1393
|
-
- Any change to the architectural constitution itself
|
|
1394
|
-
|
|
1395
|
-
### ADR Directory
|
|
1396
|
-
- Path: `docs/adrs/` (zero-padded, kebab-case: `ADR-0001-short-title.md`)
|
|
1397
|
-
- ADRs are immutable once Accepted. To change a decision: write a new ADR that supersedes the old one.
|
|
1398
|
-
- The old ADR is updated only to add `Superseded by ADR-NNNN` to its status.
|
|
1399
|
-
|
|
1400
|
-
### ADR Stubs — Emit in P1
|
|
1401
|
-
When starting a new project, emit ADR stub files as **fenced code blocks** in the first
|
|
1402
|
-
response alongside `prisma/schema.prisma`, `tsconfig.json`, and `package.json`.
|
|
1403
|
-
ADRs referenced only in a README but not written as files are not present in the project.
|
|
1404
|
-
The model cannot reference a file that does not exist. Emit the file.
|
|
1405
|
-
|
|
1406
|
-
**Minimum ADRs to emit in P1** (adapt titles to the actual stack chosen):
|
|
1407
|
-
- `docs/adrs/ADR-0001-stack.md` — language, runtime, framework, ORM selection and rationale
|
|
1408
|
-
- `docs/adrs/ADR-0002-authentication.md` — auth strategy (JWT/session), hashing algorithm and why
|
|
1409
|
-
- `docs/adrs/ADR-0003-architecture.md` — layered/hexagonal architecture decision and boundary rules
|
|
1410
|
-
|
|
1411
|
-
Each ADR stub must contain real content in `Status`, `Context`, `Decision`, and `Consequences`
|
|
1412
|
-
fields — not placeholder text. A stub that says "TBD" is not an ADR.
|
|
1413
|
-
|
|
1414
|
-
**ADR reference check:** If your README mentions `docs/adrs/ADR-0001-stack.md`, that file
|
|
1415
|
-
must appear as a fenced code block in the same response. A reference to a non-emitted file
|
|
1416
|
-
is an Auditable violation — it creates the appearance of traceability without the substance.
|
|
1417
|
-
|
|
1418
|
-
Also emit **`CHANGELOG.md`** in P1 with initial content documenting the P1 decisions:
|
|
1419
|
-
```markdown
|
|
1420
|
-
# Changelog
|
|
1421
|
-
All notable changes to this project will be documented here.
|
|
1422
|
-
Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/)
|
|
1423
|
-
|
|
1424
|
-
## [Unreleased]
|
|
1425
|
-
### Added
|
|
1426
|
-
- Initial project scaffold: layered architecture, Prisma schema, repository interfaces
|
|
1427
|
-
- Authentication: JWT + Argon2 (see ADR-0002)
|
|
1428
|
-
- Dependency registry: docs/approved-packages.md with audit baseline
|
|
1429
|
-
- CI pipeline: lint, type-check, test, npm audit, mutation gate
|
|
1430
|
-
- Pre-commit hooks: tsc, lint, audit, test gates
|
|
1431
|
-
```
|
|
1432
|
-
A CHANGELOG that exists only as "we will add one" is not Auditable. Write the file.
|
|
1433
|
-
Document the P1 decisions immediately — the first entry is not a release entry, it is the
|
|
1434
|
-
architectural record of what was built in this session.
|
|
656
|
+
### When to write
|
|
657
|
+
Non-obvious choice · tradeoff (perf vs simplicity, security vs UX) · decided after alternatives · future sessions might "fix" it · any change to the constitution.
|
|
1435
658
|
|
|
1436
|
-
|
|
1437
|
-
|
|
1438
|
-
|
|
1439
|
-
first
|
|
659
|
+
- Path `docs/adrs/ADR-0001-short-title.md` (zero-padded, kebab). Immutable once Accepted; supersede with a new ADR.
|
|
660
|
+
- **Emit in P1** as fenced code blocks (a referenced-but-unwritten ADR is an Auditable violation). Minimum: ADR-0001-stack, ADR-0002-authentication (auth strategy + hashing), ADR-0003-architecture. Real content, not "TBD".
|
|
661
|
+
- Also emit **`CHANGELOG.md`** in P1 (Keep a Changelog format) documenting the P1 decisions under `[Unreleased] > Added`.
|
|
662
|
+
- Session start: read open ADRs first. Modifying an ADR-governed boundary without reading it = drift, even if it compiles.
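The referenced-but-unwritten check can be sketched as a comparison between ADR paths mentioned in a document and the files actually emitted. The function name `missing_adrs` is hypothetical, and the pattern assumes the `docs/adrs/ADR-NNNN-kebab-title.md` convention stated above.

```python
import re
from typing import Iterable, List

# Four-digit zero-padded number, then a lowercase kebab-case title.
ADR_PATH = re.compile(r"docs/adrs/ADR-\d{4}-[a-z0-9]+(?:-[a-z0-9]+)*\.md")


def missing_adrs(document_text: str, emitted_paths: Iterable[str]) -> List[str]:
    """Return ADR paths that the text references but that were never emitted."""
    emitted = set(emitted_paths)
    referenced = sorted(set(ADR_PATH.findall(document_text)))
    return [path for path in referenced if path not in emitted]
```

An empty result means every ADR the text points at exists as a file. Any entry in the list is a reference with nothing behind it.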
|
|
1440
663
|
|
|
1441
664
|
- id: use-case-triple-derivation
|
|
1442
665
|
tier: recommended
|
|
1443
666
|
title: "Use Case Triple Derivation"
|
|
1444
667
|
content: |
|
|
1445
668
|
## Use Cases — Triple Derivation
|
|
669
|
+
The UC format IS Design by Contract (Meyer): precondition/postcondition/invariant — the contract is the executable spec.
|
|
670
|
+
One precise use case derives three artifacts:
|
|
671
|
+
1. **Implementation contract** — actor/precondition/trigger/postcondition is the spec the service layer is written against.
|
|
672
|
+
2. **Acceptance test** — same artifact in test dialect (Playwright/Cucumber). Hard to write the test = underspecified use case.
|
|
673
|
+
3. **User documentation** — the same content narrated for a non-technical reader; a rendering pass, not a rewrite.
|
|
1446
674
|
|
|
1447
|
-
|
|
1448
|
-
In a generative specification it is a multi-purpose production rule: a single, precise
|
|
1449
|
-
description of an interaction from which three artifacts derive independently and without
|
|
1450
|
-
redundancy.
|
|
1451
|
-
|
|
1452
|
-
### The Three Derivations
|
|
1453
|
-
1. **Implementation contract** — The use case names the actor, precondition, trigger, and
|
|
1454
|
-
postcondition with enough precision to be unambiguous. This is the specification the
|
|
1455
|
-
service layer is written against. When the AI reads a well-formed use case before
|
|
1456
|
-
generating the corresponding service method, it has what a human architect would
|
|
1457
|
-
communicate in a design review.
|
|
1458
|
-
|
|
1459
|
-
2. **Acceptance test** — The use case and the test scenario are the same artifact expressed
|
|
1460
|
-
in different dialects. A Playwright E2E test for a checkout flow is the checkout use case
|
|
1461
|
-
transcribed into executable form. A Cucumber scenario in Given-When-Then is the use case
|
|
1462
|
-
in declarative test notation. When the use case is precise, the test writes itself.
|
|
1463
|
-
**When the test is hard to write, the use case is underspecified.** The test difficulty
|
|
1464
|
-
is the diagnostic for underspecification.
|
|
1465
|
-
|
|
1466
|
-
3. **User documentation** — A use case narrated to a non-technical reader (actor, goal,
|
|
1467
|
-
precondition, sequence, expected outcome, error cases) is a user manual section.
|
|
1468
|
-
The content is identical. The framing is different. A specification with complete use
|
|
1469
|
-
cases does not need a separate documentation writing pass — it needs a rendering pass.
|
|
1470
|
-
|
|
1471
|
-
### Use Case Format (minimal)
|
|
675
|
+
### Format (minimal)
|
|
1472
676
|
```markdown
|
|
1473
677
|
## UC-NNN: [Action] [Domain Object]
|
|
1474
|
-
|
|
1475
|
-
**Actor**:
|
|
1476
|
-
**Precondition**:
|
|
1477
|
-
**Trigger**:
|
|
1478
|
-
**Main Flow**:
|
|
1479
|
-
1. [Step one]
|
|
1480
|
-
2. [Step two]
|
|
1481
|
-
**Postcondition**: [what is true after success]
|
|
1482
|
-
**Error Cases**:
|
|
1483
|
-
- [Condition]: [System response]
|
|
1484
|
-
**Acceptance Criteria** (machine-checkable):
|
|
1485
|
-
- [ ] [Criterion 1]
|
|
1486
|
-
- [ ] [Criterion 2]
|
|
678
|
+
**Actor**: / **Precondition**: / **Trigger**:
|
|
679
|
+
**Main Flow**: 1. ... 2. ...
|
|
680
|
+
**Postcondition**: / **Error Cases**: [Condition]: [response]
|
|
681
|
+
**Acceptance Criteria** (machine-checkable): - [ ] ...
|
|
1487
682
|
```
|
|
1488
683
|
|
|
1489
|
-
|
|
1490
|
-
Before writing any service method, write the use case first. If you cannot state the
|
|
1491
|
-
precondition and postcondition precisely, you do not yet understand the behavior well enough
|
|
1492
|
-
to implement it correctly. The implementation will be wrong. The use case forces the
|
|
1493
|
-
understanding the implementation requires.
|
|
684
|
+
Diagnostic: write the use case before any service method. Can't state pre/postcondition precisely = don't understand the behavior well enough to implement it.
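A rough sketch of how one use case record might feed both the acceptance check and the user-facing text. Everything here is hypothetical: the `UseCase` fields, the `check_contract` and `render_doc` helpers, and the withdrawal example are illustrations, not the package's format.

```python
from dataclasses import dataclass
from typing import Callable, Dict

State = Dict[str, int]


@dataclass(frozen=True)
class UseCase:
    uc_id: str
    title: str
    actor: str
    precondition_text: str
    postcondition_text: str
    precondition: Callable[[State], bool]
    postcondition: Callable[[State, State], bool]


def check_contract(uc: UseCase, action: Callable[[State], State], before: State) -> State:
    """Acceptance-test dialect: run `action` and enforce the use case's contract."""
    if not uc.precondition(before):
        raise ValueError(f"{uc.uc_id}: precondition not met: {uc.precondition_text}")
    after = action(dict(before))  # copy so `before` stays comparable
    if not uc.postcondition(before, after):
        raise AssertionError(f"{uc.uc_id}: postcondition violated: {uc.postcondition_text}")
    return after


def render_doc(uc: UseCase) -> str:
    """Documentation dialect: the same fields, narrated for a reader."""
    return (
        f"{uc.title}: as a {uc.actor}, you need {uc.precondition_text}. "
        f"Afterwards, {uc.postcondition_text}."
    )


WITHDRAW = UseCase(
    uc_id="UC-001",
    title="Withdraw Funds",
    actor="account holder",
    precondition_text="a balance of at least 10",
    postcondition_text="the balance is lower by exactly 10",
    precondition=lambda state: state["balance"] >= 10,
    postcondition=lambda before, after: after["balance"] == before["balance"] - 10,
)


def withdraw_ten(state: State) -> State:
    """Candidate implementation written against the contract."""
    state["balance"] -= 10
    return state
```

If the precondition or postcondition cannot be written as a predicate like these, that is the underspecification signal the diagnostic describes.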
|
|
1494
685
|
|
|
1495
686
|
- id: living-documentation
|
|
1496
687
|
tier: recommended
|