@curdx/flow 1.1.4 → 1.1.6

Files changed (89)
  1. package/.claude-plugin/marketplace.json +25 -0
  2. package/.claude-plugin/plugin.json +43 -0
  3. package/CHANGELOG.md +279 -0
  4. package/agent-preamble/preamble.md +214 -0
  5. package/agents/flow-adversary.md +216 -0
  6. package/agents/flow-architect.md +190 -0
  7. package/agents/flow-debugger.md +325 -0
  8. package/agents/flow-edge-hunter.md +273 -0
  9. package/agents/flow-executor.md +246 -0
  10. package/agents/flow-planner.md +204 -0
  11. package/agents/flow-product-designer.md +146 -0
  12. package/agents/flow-qa-engineer.md +276 -0
  13. package/agents/flow-researcher.md +155 -0
  14. package/agents/flow-reviewer.md +280 -0
  15. package/agents/flow-security-auditor.md +398 -0
  16. package/agents/flow-triage-analyst.md +290 -0
  17. package/agents/flow-ui-researcher.md +227 -0
  18. package/agents/flow-ux-designer.md +247 -0
  19. package/agents/flow-verifier.md +283 -0
  20. package/agents/persona-amelia.md +128 -0
  21. package/agents/persona-david.md +141 -0
  22. package/agents/persona-emma.md +179 -0
  23. package/agents/persona-john.md +105 -0
  24. package/agents/persona-mary.md +95 -0
  25. package/agents/persona-oliver.md +136 -0
  26. package/agents/persona-rachel.md +126 -0
  27. package/agents/persona-serena.md +175 -0
  28. package/agents/persona-winston.md +117 -0
  29. package/bin/curdx-flow.js +5 -2
  30. package/cli/install.js +44 -5
  31. package/commands/audit.md +170 -0
  32. package/commands/autoplan.md +184 -0
  33. package/commands/debug.md +199 -0
  34. package/commands/design.md +155 -0
  35. package/commands/discuss.md +162 -0
  36. package/commands/doctor.md +124 -0
  37. package/commands/fast.md +128 -0
  38. package/commands/help.md +119 -0
  39. package/commands/implement.md +381 -0
  40. package/commands/index.md +261 -0
  41. package/commands/init.md +105 -0
  42. package/commands/install-deps.md +128 -0
  43. package/commands/party.md +241 -0
  44. package/commands/plan-ceo.md +117 -0
  45. package/commands/plan-design.md +107 -0
  46. package/commands/plan-dx.md +104 -0
  47. package/commands/plan-eng.md +108 -0
  48. package/commands/qa.md +118 -0
  49. package/commands/requirements.md +146 -0
  50. package/commands/research.md +141 -0
  51. package/commands/review.md +168 -0
  52. package/commands/security.md +109 -0
  53. package/commands/sketch.md +118 -0
  54. package/commands/spec.md +135 -0
  55. package/commands/spike.md +181 -0
  56. package/commands/start.md +189 -0
  57. package/commands/status.md +139 -0
  58. package/commands/switch.md +95 -0
  59. package/commands/tasks.md +189 -0
  60. package/commands/triage.md +160 -0
  61. package/commands/verify.md +124 -0
  62. package/gates/adversarial-review-gate.md +219 -0
  63. package/gates/coverage-audit-gate.md +184 -0
  64. package/gates/devex-gate.md +255 -0
  65. package/gates/edge-case-gate.md +194 -0
  66. package/gates/karpathy-gate.md +130 -0
  67. package/gates/security-gate.md +218 -0
  68. package/gates/tdd-gate.md +188 -0
  69. package/gates/verification-gate.md +183 -0
  70. package/hooks/hooks.json +56 -0
  71. package/hooks/scripts/fail-tracker.sh +31 -0
  72. package/hooks/scripts/inject-karpathy.sh +52 -0
  73. package/hooks/scripts/quick-mode-guard.sh +64 -0
  74. package/hooks/scripts/session-start.sh +76 -0
  75. package/hooks/scripts/stop-watcher.sh +166 -0
  76. package/knowledge/atomic-commits.md +262 -0
  77. package/knowledge/epic-decomposition.md +307 -0
  78. package/knowledge/execution-strategies.md +278 -0
  79. package/knowledge/karpathy-guidelines.md +219 -0
  80. package/knowledge/planning-reviews.md +211 -0
  81. package/knowledge/poc-first-workflow.md +227 -0
  82. package/knowledge/spec-driven-development.md +183 -0
  83. package/knowledge/systematic-debugging.md +384 -0
  84. package/knowledge/two-stage-review.md +233 -0
  85. package/knowledge/wave-execution.md +387 -0
  86. package/package.json +14 -3
  87. package/schemas/config.schema.json +100 -0
  88. package/schemas/spec-frontmatter.schema.json +42 -0
  89. package/schemas/spec-state.schema.json +117 -0
@@ -0,0 +1,255 @@
1
+ ---
2
+ gate: devex-gate
3
+ category: enterprise-mode
4
+ severity: warning
5
+ depends_on: []
6
+ ---
7
+
8
+ # DevEx Gate — Developer Experience Review
9
+
10
+ > Optional in Enterprise mode. Reviews whether code is friendly to **the next maintainer**.
11
+
12
+ ---
13
+
14
+ ## Trigger Timing
15
+
16
+ - When `/curdx-flow:plan-dx` runs (design phase)
17
+ - When `/curdx-flow:review --devex` runs (code phase)
18
+ - Enabled by default in open-source / multi-person collaboration scenarios
19
+
20
+ ---
21
+
22
+ ## Core Question
23
+
24
+ **"Six months from now, can I (or my colleague) quickly take over this code?"**
25
+
26
+ Not considering this → code becomes legacy → maintenance cost grows exponentially.
27
+
28
+ ---
29
+
30
+ ## 8 Dimensions
31
+
32
+ ### DX-01: Clear Naming
33
+
34
+ Naming = the most important documentation.
35
+
36
+ ❌ Bad:
37
+ ```typescript
38
+ function doStuff(x, y) { ... }
39
+ const d = new Date()
40
+ let flag = true
41
+ ```
42
+
43
+ ✓ Good:
44
+ ```typescript
45
+ function validateEmailFormat(email: string): boolean { ... }
46
+ const currentTimestamp = new Date()
47
+ let isAuthenticationPending = true
48
+ ```
49
+
50
+ **Checks**:
51
+ - Abbreviations (`usr`, `pwd`, `addr`) should be expanded (unless domain-standard like `API`, `URL`)
52
+ - Booleans use `is/has/can/should` prefix
53
+ - Functions start with verbs
54
+ - Variables use nouns
55
+ - Single letters only in loops/lambdas (`i`, `x`)
56
+
57
+ ---
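The boolean-prefix check in DX-01 is mechanical enough to automate. The sketch below is one possible approach, assuming camelCase identifiers; `BOOLEAN_PREFIX` and `hasBooleanPrefix` are hypothetical names and are not part of @curdx/flow.

```typescript
// Hypothetical checker for the "Booleans use is/has/can/should prefix" rule.
// Assumes camelCase identifiers; other naming styles would need a different pattern.
const BOOLEAN_PREFIX = /^(is|has|can|should)[A-Z]/;

function hasBooleanPrefix(name: string): boolean {
  return BOOLEAN_PREFIX.test(name);
}

hasBooleanPrefix("isAuthenticationPending"); // true
hasBooleanPrefix("flag");                    // false
hasBooleanPrefix("island");                  // false: "is" must be followed by a capital letter
```

Requiring a capital letter after the prefix avoids false positives on ordinary words that merely start with `is` or `can`.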
58
+
59
+ ### DX-02: Intent Comments
60
+
61
+ Comments explain **why**, not **what**.
62
+
63
+ ❌ Bad (comment adds no value):
64
+ ```typescript
65
+ // get the user
66
+ const user = await getUser(id)
67
+
68
+ // increment i by 1
69
+ i++
70
+ ```
71
+
72
+ ✓ Good (comment explains why):
73
+ ```typescript
74
+ // bcrypt.compare has fixed execution time; run it even when the user does not exist to prevent timing attacks
75
+ const hash = user?.passwordHash ?? FAKE_HASH
76
+ await bcrypt.compare(inputPwd, hash)
77
+ ```
78
+
79
+ **Checks**:
80
+ - Are there low-value comments (`// set x to 1`)?
81
+ - Are there magic numbers / odd practices lacking explanation?
82
+ - Do public APIs have doc comments?
83
+
84
+ ---
85
+
86
+ ### DX-03: Discoverable File Structure
87
+
88
+ ```
89
+ src/auth/
90
+ ├── index.ts ← export entry
91
+ ├── login.ts ← business logic
92
+ ├── login.test.ts ← tests (colocated)
93
+ ├── types.ts ← types
94
+ └── __mocks__/ ← mocks (if any)
95
+ ```
96
+
97
+ **Checks**:
98
+ - Are new files placed in the right location (follow existing patterns)?
99
+ - Naming conventions are consistent (don't mix `login.ts` with `LoginService.ts`)
100
+ - Depth does not exceed 4 levels
101
+
102
+ ---
103
+
104
+ ### DX-04: Useful Error Messages
105
+
106
+ ❌ Bad:
107
+ ```
108
+ Error: Failed
109
+ Error: Validation error
110
+ ```
111
+
112
+ ✓ Good:
113
+ ```
114
+ Error: Failed to send email to user@example.com (SMTP 554 - rejected)
115
+ Error: email must match pattern /^[^@]+@[^@]+\.[^@]+$/, got ""
116
+ ```
117
+
118
+ **Checks**:
119
+ - Error messages include **where it went wrong**, **what went wrong**, and **what the user/developer can do**
120
+
121
+ ---
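The three parts named in the DX-04 check (where it went wrong, what went wrong, what to do next) are easier to keep when every message goes through one helper. This is a minimal sketch under that assumption; `ErrorContext` and `formatError` are hypothetical names, not something the gate defines.

```typescript
// Hypothetical helper: not part of @curdx/flow.
interface ErrorContext {
  where: string;   // operation that failed, e.g. "send email to user@example.com"
  what: string;    // concrete cause, e.g. "SMTP 554 - rejected"
  action?: string; // optional next step for the user or developer
}

function formatError(ctx: ErrorContext): string {
  const base = `Failed to ${ctx.where} (${ctx.what})`;
  return ctx.action ? `${base}. ${ctx.action}` : base;
}

const message = formatError({
  where: "send email to user@example.com",
  what: "SMTP 554 - rejected",
  action: "Check the recipient address and retry.",
});
// message === "Failed to send email to user@example.com (SMTP 554 - rejected). Check the recipient address and retry."
```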
122
+
123
+ ### DX-05: Easy Setup
124
+
125
+ After a newcomer clones the repo:
126
+
127
+ ```bash
128
+ git clone ...
129
+ cd project
130
+ <single command to run>
131
+ ```
132
+
133
+ The project should start with a single command (for example, `npm install && npm run dev`).
134
+
135
+ **Checks**:
136
+ - README has a "Getting Started" section
137
+ - No hidden dependencies ("must install PostgreSQL and create a user first")
138
+ - Alternatively, setup is a single docker-compose command
139
+
140
+ ---
141
+
142
+ ### DX-06: Clear Types
143
+
144
+ TypeScript / Python with types:
145
+ ```typescript
146
+ // ❌ any hell
147
+ function process(data: any): any { ... }
148
+
149
+ // ✓ explicit types
150
+ function processUser(user: User): ProcessedUser { ... }
151
+ ```
152
+
153
+ **Checks**:
154
+ - `any` usage (the less the better)
155
+ - `unknown` for boundary inputs
156
+ - Interfaces / Types have meaningful names
157
+
158
+ ---
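The "`unknown` for boundary inputs" check can be illustrated with a type guard: untrusted data enters as `unknown` and is narrowed only after each field has been checked. The `User` shape, `isUser`, and `parseUser` below are assumptions made for the example; the gate does not define them.

```typescript
// Hypothetical shape for illustration; the gate does not define a User type.
interface User {
  id: string;
  email: string;
}

// Type guard: narrows an untrusted value to User only after checking each field.
function isUser(value: unknown): value is User {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record.id === "string" && typeof record.email === "string";
}

// Boundary function: accepts raw JSON text, returns a typed value or throws.
function parseUser(json: string): User {
  const parsed: unknown = JSON.parse(json);
  if (!isUser(parsed)) {
    throw new Error(`parseUser: expected {id: string, email: string}, got ${json}`);
  }
  return parsed;
}
```

Code past the boundary can then rely on `User` without any `any` leaking in from `JSON.parse`.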
159
+
160
+ ### DX-07: Tests as Documentation
161
+
162
+ ```typescript
163
+ describe("login endpoint", () => {
164
+ test("rejects empty email with 400", async () => { ... })
165
+ test("rejects invalid format email with 400", async () => { ... })
166
+ test("rejects unknown email with 401", async () => { ... })
167
+ test("rejects wrong password with 401", async () => { ... })
168
+ test("accepts valid credentials with 200 + JWT", async () => { ... })
169
+ })
170
+ ```
171
+
172
+ Reading these test names = reading API behavior documentation.
173
+
174
+ **Checks**:
175
+ - Test names describe **behavior** ("returns 400 for empty email"), not implementation ("calls validateEmail")
176
+ - Tests cover happy + edge + error paths
177
+ - Each test is independent (does not depend on order)
178
+
179
+ ---
180
+
181
+ ### DX-08: Fast Dev Loop
182
+
183
+ - Is build time acceptable? (< 30s to rebuild after a one-line change)
184
+ - Is test time acceptable? (< 60s to run relevant tests)
185
+ - Does HMR / live reload work?
186
+
187
+ **Checks**:
188
+ - tsc --watch / bundler configuration is reasonable
189
+ - Test runner supports --watch
190
+ - Non-critical tests are optional (e.g. E2E in CI, not on every commit)
191
+
192
+ ---
193
+
194
+ ## Checking Methods
195
+
196
+ ### Agent Automatic
197
+
198
+ When `flow-ux-designer` / `flow-reviewer` applies this gate, it uses at least 4 rounds of sequential-thinking to scan the 8 dimensions.
199
+
200
+ ### Human Review
201
+
202
+ Attach a DevEx checklist at PR time:
203
+ - [ ] Clear naming (reviewed at least 3 times)
204
+ - [ ] Critical comments exist
205
+ - [ ] Consistent structure
206
+ - [ ] Actionable error messages
207
+ - [ ] Tests as docs
208
+
209
+ ---
210
+
211
+ ## Scoring
212
+
213
+ Each dimension 0-10 points:
214
+
215
+ ```
216
+ 10 = best practice
217
+ 8 = good
218
+ 5 = pass (production-usable)
219
+ 3 = needs improvement
220
+ 0 = serious issue
221
+ ```
222
+
223
+ Total 40+ / 80 = pass (warning, non-blocking).
224
+ Total < 40 = blocked, improvement required.
225
+
226
+ ---
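The scoring rule (eight dimensions, 0-10 each, pass at 40 or more out of 80) can be written down as a small function. This is a sketch of the arithmetic only; `devexVerdict` and `Verdict` are hypothetical names.

```typescript
type Verdict = "pass" | "blocked";

// Sums eight per-dimension scores (0-10 each) and applies the 40/80 threshold.
function devexVerdict(scores: readonly number[]): { total: number; verdict: Verdict } {
  if (scores.length !== 8) {
    throw new Error(`expected 8 dimension scores, got ${scores.length}`);
  }
  for (const s of scores) {
    if (s < 0 || s > 10) throw new Error(`score out of range 0-10: ${s}`);
  }
  const total = scores.reduce((sum, s) => sum + s, 0);
  return { total, verdict: total >= 40 ? "pass" : "blocked" };
}

// Scores DX-01..DX-08 taken from the sample report in the Output Format section.
const result = devexVerdict([7, 8, 9, 5, 8, 7, 6, 9]);
// result.total === 59, result.verdict === "pass"
```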
227
+
228
+ ## Output Format
229
+
230
+ ```markdown
231
+ ## DevEx Gate Report
232
+
233
+ Scores:
234
+ DX-01 naming: 7/10 — 2 abbreviations (usr, pwd)
235
+ DX-02 comments: 8/10 — magic number 42 not explained
236
+ DX-03 structure: 9/10 — consistent
237
+ DX-04 errors: 5/10 — 2 uninformative "Failed"
238
+ DX-05 setup: 8/10 — README complete
239
+ DX-06 types: 7/10 — 3 instances of any
240
+ DX-07 tests: 6/10 — test names describe implementation details
241
+ DX-08 dev loop: 9/10 — HMR works well
242
+
243
+ Total: 59/80 (pass)
244
+
245
+ Improvement recommendations:
246
+ 1. Replace usr/pwd with user/password
247
+ 2. Comment magic number 42 (reason for timeout=42s)
248
+ 3. Change error message "Failed" → specific reason
249
+ 4. Several any usages can be typed explicitly
250
+ 5. Rewrite test names with "behavior" descriptions
251
+ ```
252
+
253
+ ---
254
+
255
+ _Source: years of development experience and gstack's DX review philosophy._
@@ -0,0 +1,194 @@
1
+ ---
2
+ gate: edge-case-gate
3
+ category: enterprise-mode
4
+ severity: warning
5
+ depends_on: []
6
+ ---
7
+
8
+ # Edge Case Gate — Edge Case Hunter
9
+
10
+ > Derived from BMAD-METHOD's "Edge Case Hunter".
11
+ >
12
+ > **Core**: specifically hunts for **non-happy-path** scenarios. User stories describe the happy path; the real world is full of edge cases.
13
+
14
+ ---
15
+
16
+ ## Trigger Timing
17
+
18
+ - After the requirements phase ends (to supplement edge conditions)
19
+ - After the design phase (to check error-path completeness)
20
+ - After tests are written (to check whether only the happy path is covered)
21
+ - When explicitly requested via `/curdx-flow:audit`
22
+
23
+ ---
24
+
25
+ ## 7 Categories
26
+
27
+ Systematically inspect the object under review (function / component / API):
28
+
29
+ ### 1. Boundary Values
30
+
31
+ - 0, -1, 1
32
+ - INT_MAX, INT_MIN
33
+ - Empty array `[]`, single-element array `[x]`, large array `[x...10000]`
34
+ - Empty string `""`, single character `"a"`, extra-long string
35
+ - First element / last element / middle element
36
+
37
+ ### 2. Nullish
38
+
39
+ - `null`
40
+ - `undefined`
41
+ - Empty object `{}`
42
+ - Object with missing fields (key does not exist in JSON)
43
+ - Whether default parameters are actually applied
44
+
45
+ ### 3. Concurrency
46
+
47
+ - Two requests arriving simultaneously
48
+ - Write conflicts (optimistic / pessimistic lock)
49
+ - Read-modify-write race
50
+ - Cache invalidation timing
51
+ - Distributed locks
52
+
53
+ ### 4. Error Recovery
54
+
55
+ - Network outage → retry strategy?
56
+ - DB unavailable → circuit breaker?
57
+ - Disk full → degrade?
58
+ - Permission revoked → graceful exit?
59
+ - Dependency service 500 → fallback?
60
+
61
+ ### 5. Security
62
+
63
+ - SQL/Command/XSS injection
64
+ - Unauthorized access (use A's token to access B's resource)
65
+ - Sensitive data leakage (logs / error messages / response)
66
+ - Rate limiting bypass
67
+ - CSRF / session fixation
68
+ - Timing attack
69
+
70
+ ### 6. Internationalization (i18n)
71
+
72
+ - Unicode (emoji, CJK, combining characters)
73
+ - RTL (Arabic)
74
+ - Time zones (UTC vs local, DST jumps)
75
+ - Number formats (decimal point vs comma)
76
+ - Sorting (locale-aware collation)
77
+
78
+ ### 7. Performance
79
+
80
+ - N+1 queries
81
+ - Slow queries (missing indexes)
82
+ - Large responses (MB/GB scale)
83
+ - Memory leaks (event listeners, closures)
84
+ - Deadlocks / long-running transactions
85
+
86
+ ---
87
+
88
+ ## Required Question Checklist
89
+
90
+ For each category, the agent must answer (via sequential-thinking):
91
+
92
+ ```
93
+ Q1. What inputs/scenarios will this feature encounter for [category]?
94
+ Q2. If the input is [extreme value], what will the current implementation do?
95
+ Q3. Is there a test covering this scenario?
96
+ Q4. If no test, what test should be added to cover it?
97
+ ```
98
+
99
+ ---
100
+
101
+ ## Execution Flow
102
+
103
+ ```
104
+ Input: object under review (function / component / API) + requirements + tests
105
+
106
+ For each category (1-7):
107
+ 1. Use sequential-thinking to list at least 3 possible edge scenarios
108
+ 2. Check whether each scenario has corresponding coverage in tests
109
+ 3. Add uncovered ones to the "gap list"
110
+
111
+ Output: edge-cases.md
112
+ ```
113
+
114
+ ---
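Steps 2 and 3 of the execution flow (check each scenario against the tests, collect the uncovered ones) amount to a filter. The sketch below assumes a hypothetical `Scenario` model and treats a scenario as covered when some test name contains one of its keywords. That matching rule is a simplification chosen for illustration; the gate itself does not prescribe how coverage is detected.

```typescript
// Hypothetical data model for illustration; the gate only specifies the steps.
interface Scenario {
  category: string;            // one of the 7 categories, e.g. "Concurrency"
  description: string;         // the edge scenario listed in step 1
  keywords: readonly string[]; // phrases a covering test name is assumed to contain
}

// Returns the scenarios with no matching test name (the "gap list").
function findGaps(scenarios: readonly Scenario[], testNames: readonly string[]): Scenario[] {
  const names = testNames.map((n) => n.toLowerCase());
  return scenarios.filter(
    (s) => !s.keywords.some((k) => names.some((n) => n.includes(k.toLowerCase()))),
  );
}

const gaps = findGaps(
  [
    { category: "Boundary Values", description: "email longer than 255 chars", keywords: [">255", "too long"] },
    { category: "Nullish", description: "empty email", keywords: ["empty email"] },
  ],
  ["rejects empty email with 400", "rejects wrong password with 401"],
);
// gaps contains only the "Boundary Values" scenario
```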
115
+
116
+ ## Output Format
117
+
118
+ ```markdown
119
+ ## Edge Case Hunt Report
120
+
121
+ Object under review: src/auth/login.ts + login.test.ts
122
+
123
+ ## Covered (✓)
124
+
125
+ - Valid email + password → 200 + JWT
126
+ - Invalid email format → 400
127
+ - Non-existent user → 401
128
+ - Wrong password → 401
129
+
130
+ ## Gap List (✗)
131
+
132
+ ### 1. Boundary Values
133
+ - ✗ Extra-long email (>255) may cause DB errors
134
+ - Recommendation: test("rejects email >255 chars", ...)
135
+ - ✗ Password containing a NUL character (bcrypt has historical issues with NUL bytes)
136
+ - Recommendation: test("handles NUL in password safely", ...)
137
+
138
+ ### 2. Nullish
139
+ - ✗ email is empty string vs undefined
140
+ - Currently: both return 400 (via schema validation), but no test
141
+ - Recommendation: explicit test for both cases
142
+
143
+ ### 3. Concurrency
144
+ - ✗ Same user with 2 concurrent logins
145
+ - Risk: token generation uniqueness?
146
+ - Recommendation: test("handles concurrent logins", async () => Promise.all([...]))
147
+
148
+ ### 4. Error Recovery
149
+ - ✗ bcrypt.compare() timeout
150
+ - Currently: no timeout, will wait indefinitely
151
+ - Recommendation: add Promise.race + timeout test
152
+
153
+ ### 5. Security
154
+ - ⚠ Error message leak (user enumeration)
155
+ - Already reported in adversarial review
156
+ - ✗ Timing attack: response time differs depending on whether the email exists
157
+ - Recommendation: run bcrypt.compare() in both cases, test response time difference < 10ms
158
+
159
+ ### 6. Internationalization
160
+ - ✗ Unicode email (RFC 6531)
161
+ - Currently: regex may reject legitimate Unicode emails
162
+ - Recommendation: test("accepts unicode email like ñ@example.com")
163
+
164
+ ### 7. Performance
165
+ - ⚠ bcrypt cost 12 response time (~100ms) not tested
166
+ - Recommendation: benchmark test, expect < 200ms P99
167
+
168
+ ## Summary
169
+
170
+ Covered: 4 scenarios
171
+ Gaps: 9 scenarios
172
+ Priority ranking: 3 (concurrency) > 4 (timeout) > 5 (timing attack) > others
173
+
174
+ Fix recommendations:
175
+ - High priority: add 4 tests (concurrency, timeout, timing attack, unicode email)
176
+ - Medium priority: add edge-case-tests.test.ts to unify edge-case test management
177
+ ```
178
+
179
+ ---
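The Error Recovery gap in the sample report recommends `Promise.race` plus a timeout around `bcrypt.compare()`. A minimal sketch of that idea follows; `compareWithTimeout` is a hypothetical helper, and it takes any `Promise<boolean>` so that it stays independent of a particular bcrypt library.

```typescript
// Hypothetical helper: rejects if `compare` has not settled within `ms` milliseconds.
// Note: this only stops waiting; it does not cancel the underlying operation.
function compareWithTimeout(compare: Promise<boolean>, ms: number): Promise<boolean> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`compare timed out after ${ms}ms`)), ms);
  });
  // Whichever promise settles first wins; the timer is cleared either way.
  return Promise.race([compare, timeout]).then(
    (matched) => {
      clearTimeout(timer);
      return matched;
    },
    (error) => {
      clearTimeout(timer);
      throw error;
    },
  );
}

// Intended use (assuming a bcrypt library is available in the project):
//   const matched = await compareWithTimeout(bcrypt.compare(inputPwd, hash), 2000);
```

A test for the timeout path can pass a promise that never settles and assert that the call rejects.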
180
+
181
+ ## Difference from Adversarial Review
182
+
183
+ | Dimension | adversarial | edge-case |
184
+ |------|-------------|-----------|
185
+ | Goal | find **any** issue | find **edge-case** issues |
186
+ | Scope | all dimensions (architecture/implementation/...) | inputs / scenarios |
187
+ | Style | "attacker perspective" | "extreme case search" |
188
+ | Output | issue list + fix recommendations | gap list + test recommendations |
189
+
190
+ The two are complementary. Enterprise mode recommends enabling both.
191
+
192
+ ---
193
+
194
+ _Source: BMAD-METHOD's edge-case-hunter._
@@ -0,0 +1,130 @@
1
+ ---
2
+ gate: karpathy-gate
3
+ category: always-on
4
+ severity: blocking
5
+ depends_on: []
6
+ ---
7
+
8
+ # Karpathy Gate — Thinking Baseline Check
9
+
10
+ > **Always enabled**. This is the code-level enforcement of L1. High-level violations (G3, G4) block immediately; Medium-level violations (G1, G2) raise warnings.
11
+
12
+ This gate maps to Karpathy's 4 principles. All flow-executor and flow-reviewer agents must enforce it.
13
+
14
+ ---
15
+
16
+ ## Trigger Timing
17
+
18
+ - Before code is written (pre-check)
19
+ - Before commit (re-check)
20
+ - When `/curdx-flow:review` runs (full review)
21
+
22
+ ---
23
+
24
+ ## 4 Checks
25
+
26
+ ### G1. Think Before Coding
27
+
28
+ **Violation patterns**:
29
+ - ✗ Code embodies unstated assumptions (e.g. default encoding, default pagination count, default permission scope)
30
+ - ✗ User goal has multiple interpretations but the agent picked one without saying so
31
+ - ✗ Business-relevant changes (data export, permission modification) were not confirmed with the user
32
+
33
+ **Check method**:
34
+ 1. Read commit message + change scope
35
+ 2. Look in `.progress.md` for "assumption:" entries
36
+ 3. If a key assumption is not explicit, mark as violation
37
+
38
+ **Auto-fix**: not possible. Report to the user.
39
+
40
+ ---
41
+
42
+ ### G2. Simplicity First
43
+
44
+ **Violation patterns**:
45
+ - ✗ Introduces an abstraction with only one usage point (Strategy / Factory / Observer used in one place)
46
+ - ✗ Code goes beyond task requirements (user asked for `A`, implemented `A` + `B` + `C`)
47
+ - ✗ Over-defensive (error handling for cases that obviously won't happen)
48
+ - ✗ Premature parameterization (hooks left "in case we need it later")
49
+ - ✗ Tests changed to "always pass" to accommodate implementation
50
+
51
+ **Check method**:
52
+ 1. Cross-reference with the FR list in requirements.md
53
+ 2. Check whether the commit's diff scope exceeds the FR description
54
+ 3. Scan new classes / interfaces / factories; each is reasonable only if used in more than one place
55
+
56
+ **Auto-fix**:
57
+ - Dispatch flow-adversary agent to review, flag redundant code
58
+ - Auto-deletion is not allowed (the code may exist for a reason); list the items and let the user decide
59
+
60
+ ---
61
+
62
+ ### G3. Surgical Changes
63
+
64
+ **Violation patterns**:
65
+ - ✗ Task only modifies `auth/login.ts`, but the commit contains changes in `utils/`
66
+ - ✗ Task is to add a feature, but the commit contains "incidental" refactoring
67
+ - ✗ Changed comments, quotes, or indentation unrelated to the task
68
+ - ✗ Deleted pre-existing (not self-caused) "dead code"
69
+
70
+ **Check method**:
71
+ 1. Read the Files field in tasks.md
72
+ 2. Compare to the commit's changed files
73
+ 3. If there is a difference (commit changed files not in Files), mark as violation
74
+
75
+ **Auto-fix**:
76
+ - Dispatch flow-executor to extract the "incidental changes" into a separate commit
77
+ - Or roll back and redo
78
+
79
+ ---
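The G3 check method is a set difference: files changed by the commit minus files declared in the task. A sketch, assuming both lists are already available as repo-relative paths (reading them from tasks.md and from git is left out); `undeclaredChanges` is a hypothetical name.

```typescript
// Returns files touched by the commit that the task did not declare.
// Both inputs are assumed to be repo-relative paths.
function undeclaredChanges(
  declaredFiles: readonly string[],
  changedFiles: readonly string[],
): string[] {
  const declared = new Set(declaredFiles);
  return changedFiles.filter((file) => !declared.has(file));
}

const extra = undeclaredChanges(
  ["src/auth/login.ts", "src/auth/login.test.ts"],
  ["src/auth/login.ts", "src/utils/format.ts"],
);
// extra is ["src/utils/format.ts"], so G3 would be marked as violated
```

The file paths in the usage example are invented for illustration.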
80
+
81
+ ### G4. Goal-Driven Execution
82
+
83
+ **Violation patterns**:
84
+ - ✗ Commit message contains "should", "probably", "seems", "fixed" without verification evidence
85
+ - ✗ The `Verify` field is skipped (claiming complete without running)
86
+ - ✗ Tests were deleted instead of fixed (turning green into gray)
87
+ - ✗ Claims "done" but AC-X.Y still cannot be verified via curl
88
+
89
+ **Check method**:
90
+ 1. Grep commit messages for forbidden words
91
+ 2. Check .progress.md for Verify output records
92
+ 3. For each AC, confirm that an automated verification path can be found
93
+
94
+ **Auto-fix**:
95
+ - Trigger flow-verifier to run reverse verification
96
+ - If AC is not met, send back for rework (dispatch flow-executor to fix)
97
+
98
+ ---
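Step 1 of the G4 check method (grep commit messages for forbidden words) could look like the sketch below. The word list is the one given under the violation patterns; `findForbiddenWords` is a hypothetical name. It only surfaces candidates: whether verification evidence accompanies the word still has to be checked separately.

```typescript
// Hedging words listed under G4; the list comes from the gate text.
const FORBIDDEN = ["should", "probably", "seems", "fixed"] as const;

// Returns the forbidden words that appear as whole words in a commit message.
function findForbiddenWords(commitMessage: string): string[] {
  return FORBIDDEN.filter((word) => new RegExp(`\\b${word}\\b`, "i").test(commitMessage));
}

findForbiddenWords("fix(auth): login should work now"); // ["should"]
findForbiddenWords("feat(auth): add login endpoint");   // []
```

Whole-word matching keeps `fix` in a conventional-commit prefix from being reported as `fixed`.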
99
+
100
+ ## Violation Levels
101
+
102
+ | Violation | Level | Block? |
103
+ |------|------|-------|
104
+ | G1 (unstated assumption) | Medium | warning, require user confirmation |
105
+ | G2 (over-engineering) | Medium | warning + suggest simplification |
106
+ | G3 (surgical failure) | High | **block**, must split the commit |
107
+ | G4 (no evidence) | High | **block**, must run verification |
108
+
109
+ ---
110
+
111
+ ## Output Format
112
+
113
+ ```markdown
114
+ ## Karpathy Gate Check Result
115
+
116
+ [G1] Think Before Coding: ✓ pass (3 explicit assumption records)
117
+ [G2] Simplicity First: ⚠ warning — src/auth/login-strategy.ts has a single-use Strategy pattern
118
+ [G3] Surgical Changes: ✗ violated — commit abc123 contains incidental changes in utils/
119
+ [G4] Goal-Driven: ✓ pass (all ACs have verification records)
120
+
121
+ Blockers: 1
122
+ Warnings: 1
123
+
124
+ Fix recommendations:
125
+ G3: git reset HEAD~1, split commit abc123 into 2 atomic commits
126
+ ```
127
+
128
+ ---
129
+
130
+ _Applied to: all agents (built into preamble.md); this file contains the detailed rules for the concrete checks._