lemmaly 0.1.0

Files changed (97)
  1. package/LICENSE +201 -0
  2. package/README.md +238 -0
  3. package/cli/gen-agents-md.js +60 -0
  4. package/cli/gen-rule-docs.js +885 -0
  5. package/cli/lemmaly.js +162 -0
  6. package/commands/benchmark.md +40 -0
  7. package/commands/budget.md +53 -0
  8. package/commands/complexity.md +26 -0
  9. package/commands/cut.md +27 -0
  10. package/commands/hotpath.md +22 -0
  11. package/commands/invariant.md +22 -0
  12. package/commands/n-plus-one.md +20 -0
  13. package/commands/profile.md +34 -0
  14. package/commands/regress.md +43 -0
  15. package/commands/scale-check.md +37 -0
  16. package/commands/ship-check.md +26 -0
  17. package/package.json +48 -0
  18. package/rules/cpp.json +46 -0
  19. package/rules/csharp.json +38 -0
  20. package/rules/go.json +46 -0
  21. package/rules/java.json +38 -0
  22. package/rules/javascript.json +102 -0
  23. package/rules/php.json +38 -0
  24. package/rules/python.json +62 -0
  25. package/rules/ruby.json +38 -0
  26. package/rules/rust.json +38 -0
  27. package/rules/shell.json +38 -0
  28. package/rules/sql.json +54 -0
  29. package/skills/complexity-cuts/SKILL.md +259 -0
  30. package/skills/invariant-guard/SKILL.md +310 -0
  31. package/skills/lemmaly/AGENTS.md +1869 -0
  32. package/skills/lemmaly/SKILL.md +365 -0
  33. package/skills/lemmaly/references/async.md +135 -0
  34. package/skills/lemmaly/references/complexity.md +66 -0
  35. package/skills/lemmaly/references/hot-paths.md +87 -0
  36. package/skills/lemmaly/references/memory.md +118 -0
  37. package/skills/lemmaly/references/n-plus-one.md +139 -0
  38. package/skills/lemmaly/rules/cpp-map-double-lookup.md +38 -0
  39. package/skills/lemmaly/rules/cpp-range-loop-copy.md +33 -0
  40. package/skills/lemmaly/rules/cpp-raw-new.md +36 -0
  41. package/skills/lemmaly/rules/cpp-string-concat-in-loop.md +45 -0
  42. package/skills/lemmaly/rules/cpp-vector-push-no-reserve.md +40 -0
  43. package/skills/lemmaly/rules/cs-async-void.md +45 -0
  44. package/skills/lemmaly/rules/cs-disposable-no-using.md +32 -0
  45. package/skills/lemmaly/rules/cs-list-contains-in-loop.md +36 -0
  46. package/skills/lemmaly/rules/cs-string-concat-in-loop.md +42 -0
  47. package/skills/lemmaly/rules/go-defer-in-loop.md +39 -0
  48. package/skills/lemmaly/rules/go-err-not-checked.md +38 -0
  49. package/skills/lemmaly/rules/go-loop-var-capture.md +47 -0
  50. package/skills/lemmaly/rules/go-slice-append-no-cap.md +39 -0
  51. package/skills/lemmaly/rules/go-string-concat-in-loop.md +44 -0
  52. package/skills/lemmaly/rules/java-arraylist-remove-in-for-i.md +44 -0
  53. package/skills/lemmaly/rules/java-bare-catch-exception.md +42 -0
  54. package/skills/lemmaly/rules/java-list-contains-in-loop.md +40 -0
  55. package/skills/lemmaly/rules/java-string-concat-in-loop.md +42 -0
  56. package/skills/lemmaly/rules/js-anonymous-handler-jsx.md +31 -0
  57. package/skills/lemmaly/rules/js-array-key-index.md +29 -0
  58. package/skills/lemmaly/rules/js-async-in-foreach.md +43 -0
  59. package/skills/lemmaly/rules/js-await-in-for-loop.md +41 -0
  60. package/skills/lemmaly/rules/js-deep-clone-via-json.md +33 -0
  61. package/skills/lemmaly/rules/js-helper-call-in-iterator.md +41 -0
  62. package/skills/lemmaly/rules/js-includes-in-iterator.md +37 -0
  63. package/skills/lemmaly/rules/js-inline-object-jsx-prop.md +35 -0
  64. package/skills/lemmaly/rules/js-nested-for-loops.md +45 -0
  65. package/skills/lemmaly/rules/js-spread-in-reduce.md +38 -0
  66. package/skills/lemmaly/rules/js-unique-via-indexof.md +35 -0
  67. package/skills/lemmaly/rules/js-useeffect-missing-deps.md +33 -0
  68. package/skills/lemmaly/rules/php-count-in-for-condition.md +45 -0
  69. package/skills/lemmaly/rules/php-in-array-in-loop.md +42 -0
  70. package/skills/lemmaly/rules/php-loose-equality.md +35 -0
  71. package/skills/lemmaly/rules/php-query-in-loop.md +47 -0
  72. package/skills/lemmaly/rules/py-bare-except.md +39 -0
  73. package/skills/lemmaly/rules/py-django-loop-without-eager.md +42 -0
  74. package/skills/lemmaly/rules/py-in-list-literal.md +37 -0
  75. package/skills/lemmaly/rules/py-mutable-default-arg.md +39 -0
  76. package/skills/lemmaly/rules/py-open-without-with.md +33 -0
  77. package/skills/lemmaly/rules/py-range-len.md +35 -0
  78. package/skills/lemmaly/rules/py-string-concat-in-loop.md +43 -0
  79. package/skills/lemmaly/rules/rb-bare-rescue.md +41 -0
  80. package/skills/lemmaly/rules/rb-include-in-iterator.md +37 -0
  81. package/skills/lemmaly/rules/rb-n-plus-one-activerecord.md +39 -0
  82. package/skills/lemmaly/rules/rb-string-concat-in-loop.md +39 -0
  83. package/skills/lemmaly/rules/rs-clone-in-loop.md +38 -0
  84. package/skills/lemmaly/rules/rs-string-push-no-capacity.md +43 -0
  85. package/skills/lemmaly/rules/rs-unwrap-in-prod.md +36 -0
  86. package/skills/lemmaly/rules/rs-vec-push-no-capacity.md +42 -0
  87. package/skills/lemmaly/rules/sh-for-ls.md +41 -0
  88. package/skills/lemmaly/rules/sh-set-e-no-pipefail.md +37 -0
  89. package/skills/lemmaly/rules/sh-unquoted-var.md +35 -0
  90. package/skills/lemmaly/rules/sh-useless-cat-pipe.md +32 -0
  91. package/skills/lemmaly/rules/sql-leading-wildcard-like.md +34 -0
  92. package/skills/lemmaly/rules/sql-not-in-subquery.md +38 -0
  93. package/skills/lemmaly/rules/sql-or-in-where.md +35 -0
  94. package/skills/lemmaly/rules/sql-select-no-limit.md +37 -0
  95. package/skills/lemmaly/rules/sql-select-star.md +29 -0
  96. package/skills/lemmaly/rules/sql-update-no-where.md +35 -0
  97. package/skills/mathguard/SKILL.md +277 -0
@@ -0,0 +1,259 @@
1
+ ---
2
+ name: complexity-cuts
3
+ description: Use when refactoring existing code that has poor Big-O — nested loops, O(n^2)+ scans, repeated work, redundant allocations, blown memory, or stated symptoms like "this is slow on large inputs", "times out", "OOM", "too much memory", "reduce complexity", "optimize this algorithm". Targets time and space complexity of code that already exists. For preventing bad complexity before code is written, use lemmaly. For math-level optimizations (Bloom, HLL, FFT, JL projection), escalate to mathguard.
4
+ metadata:
5
+ priority: 2
6
+ role: corrective
7
+ pathPatterns:
8
+ - '**/*.{js,jsx,ts,tsx,mjs,cjs}'
9
+ - '**/*.py'
10
+ - '**/*.sql'
11
+ - '**/*.java'
12
+ - '**/*.cs'
13
+ - '**/*.go'
14
+ - '**/*.rs'
15
+ - '**/*.{cpp,cc,cxx,hpp,hh,hxx}'
16
+ - '**/*.php'
17
+ - '**/*.rb'
18
+ bashPatterns:
19
+ - 'slow'
20
+ - 'timeout'
21
+ - 'oom'
22
+ - 'optimize'
23
+ chainTo:
24
+ - skill: lemmaly
25
+ when: 'about to write new code instead of fixing existing'
26
+ - skill: invariant-guard
27
+ when: '3+ transformations have failed tests — likely a missing contract, not a missing optimization'
28
+ - skill: mathguard
29
+ when: 'classical floor reached and approximate/probabilistic structure could help'
30
+ retrieval:
31
+ aliases:
32
+ - optimize-bigo
33
+ - fix-n-squared
34
+ - lower-complexity
35
+ intents:
36
+ - optimize slow code
37
+ - reduce big-o
38
+ - fix n+1 query
39
+ - lower memory usage
40
+ ---
41
+
42
+ # complexity-cuts — Lower Big-O on Existing Code
43
+
44
+ lemmaly prevents bad complexity before code is written. complexity-cuts fixes it after the fact: code already exists, it works, but its time or space complexity is worse than necessary.
45
+
46
+ **Violating the letter of these rules is violating the spirit of the skill.** Adapting "just a little" is how a faster-but-wrong rewrite ships.
47
+
48
+ ## The Iron Law
49
+
50
+ ```text
51
+ NO TRANSFORMATION WITHOUT EXISTING TESTS GREEN BEFORE AND AFTER
52
+ ```
53
+
54
+ If the code has no tests, you write a characterization test first (golden input → current output). Then transform. Then verify the test still passes. If you skip this, the optimization can silently break callers — and faster-but-wrong is worse than slow-and-right.
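A characterization test can be as small as the sketch below. It is illustrative only: `legacyDedupe`, `fastDedupe`, `characterize`, and `verify` are hypothetical names, not part of this package, and the JSON comparison assumes outputs that serialize faithfully (no `NaN`, `undefined`, or cyclic values).

```typescript
// Hypothetical legacy function standing in for the code to be optimized.
function legacyDedupe(xs: number[]): number[] {
  const out: number[] = [];
  for (const x of xs) {
    if (out.indexOf(x) === -1) out.push(x); // linear scan per element: O(n^2)
  }
  return out;
}

type Golden<I, O> = { input: I; output: O }[];

// Record current behavior: golden input -> current output.
function characterize<I, O>(fn: (input: I) => O, inputs: I[]): Golden<I, O> {
  return inputs.map(input => ({ input, output: fn(input) }));
}

// Return every golden input whose output no longer matches the recording.
function verify<I, O>(fn: (input: I) => O, golden: Golden<I, O>): I[] {
  return golden
    .filter(g => JSON.stringify(fn(g.input)) !== JSON.stringify(g.output))
    .map(g => g.input);
}

// Step 1: capture behavior BEFORE touching the code.
const goldenCases = characterize(legacyDedupe, [[], [1], [3, 1, 3, 2, 1], [2, 2, 2]]);

// Step 2: candidate rewrite. Set iteration follows insertion order,
// so first-occurrence order is preserved.
function fastDedupe(xs: number[]): number[] {
  return Array.from(new Set(xs));
}

// Step 3: an empty list means the rewrite matches the recorded behavior.
const failingInputs = verify(fastDedupe, goldenCases);
```

A rewrite that also sorted its output would fail on `[3, 1, 3, 2, 1]`, which is exactly the kind of ordering change the golden file is there to catch.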
55
+
56
+ ## Non-negotiable rules
57
+
58
+ 1. **State current and target Big-O before touching code.** In one line:
59
+ - Current: `time = O(?)`, `space = O(?)`
60
+ - Target: `time = O(?)`, `space = O(?)`
61
+ - Dominant input dimension (n = what, how large in practice)
62
+
63
+ If you cannot state current Big-O, you do not yet understand the code. Read more.
64
+
65
+ 2. **Identify the bottleneck, do not guess.** Point to the exact line(s) responsible for the dominant term. Nested loop? Repeated linear scan? Recomputation? Allocation inside a hot loop? The fix lives there, not elsewhere.
66
+
67
+ 3. **One transformation at a time, with a verify-revert-stop loop.** The loop is:
68
+
69
+ 1. Apply exactly one transformation from the playbook.
70
+ 2. Run the existing test suite (or the characterization test you wrote per the Iron Law).
71
+ 3. If any test breaks: **revert immediately.** Do not patch the test. Do not patch around the failure. Revert.
72
+ 4. Count reverts on this piece of code. If **3 reverts in a row**, STOP optimizing. The bottleneck is wrong, the transformation is wrong, or the code has invariants you have not modeled. Escalate to `invariant-guard` and write the missing contract — do not try a fourth transformation.
73
+ 5. Only after a transformation lands green: pick the next one.
74
+
75
+ Stacked changes hide regressions. Patched tests hide them even more thoroughly.
76
+
77
+ 4. **Preserve semantics exactly.** Lower complexity must not change outputs, ordering guarantees, stability, or error behavior. If the optimization requires a semantic change (e.g. unordered output), call it out explicitly and confirm it is acceptable.
78
+
79
+ 5. **No invented numbers.** Never write "10x faster" or "saves 200MB" without measuring. Write `<measured: TBD>` and move on, or actually measure with a representative input.
80
+
81
+ 6. **Always report the measured speedup ratio after a transformation lands.** Once the new code is green, run a representative benchmark (same input, same machine, warm cache) and report `before → after` plus the ratio as `N× faster` (or `N× less memory`). One line, attached to the diff:
82
+
83
+ ```text
84
+ p50: 186 ms → 1.1 ms (169× faster, n=20,000, 200 samples)
85
+ ```
86
+
87
+ If you cannot measure (e.g. the win is purely asymptotic on inputs you don't have), say so explicitly: `asymptotic only, no measurement — O(n²) → O(n)`. Never silently skip this step. The speedup ratio is the headline number reviewers and downstream readers will quote — make sure it exists and is honest.
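One way to produce that line is sketched below. The helper names (`p50`, `timeIt`, `speedupLine`) are assumptions made for illustration, and `Date.now` is only a coarse default clock; a real benchmark would pass in a higher-resolution timer and control for warm-up.

```typescript
// Median (p50) of a list of timings in milliseconds.
function p50(samplesMs: number[]): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const mid = sorted.length >> 1;
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Collect `samples` wall-clock timings of fn. The clock is injectable so a
// higher-resolution timer can replace the Date.now default.
function timeIt(fn: () => void, samples: number, now: () => number = Date.now): number[] {
  const out: number[] = [];
  for (let i = 0; i < samples; i++) {
    const start = now();
    fn();
    out.push(now() - start);
  }
  return out;
}

// Format the one-line report. A ratio below 1 would mean a slowdown and
// should be reported as such rather than as "faster".
function speedupLine(beforeMs: number[], afterMs: number[], n: number): string {
  const before = p50(beforeMs);
  const after = p50(afterMs);
  if (after === 0) {
    return `p50: ${before} ms → below timer resolution (n=${n}, ${afterMs.length} samples)`;
  }
  const ratio = before / after;
  return `p50: ${before} ms → ${after} ms (${ratio.toFixed(0)}× faster, n=${n}, ${afterMs.length} samples)`;
}
```

Both sample lists should come from the same input on the same machine, as the rule requires; the formatter only reports what it is given.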
88
+
89
+ ## The transformation playbook
90
+
91
+ The vast majority of real-world Big-O wins come from a small set of moves. Try them in this order:
92
+
93
+ ### Time-complexity reductions
94
+
95
+ | Smell | Fix | Typical win |
96
+ |---|---|---|
97
+ | `for x in A: if x in B` where B is list/array | Convert B to `Set`/`Map` once | O(n·m) → O(n+m) |
98
+ | Nested loop computing pairs/joins | Hash-join on the key; index by lookup field | O(n·m) → O(n+m) |
99
+ | Repeated `.find` / `.indexOf` / `.includes` inside a loop | Precompute index `Map<key, item>` outside loop | O(n^2) → O(n) |
100
+ | Repeated recomputation of same value | Memoize / cache by input key | O(n·f(n)) → O(n + f(n)) |
101
+ | Sort inside a loop | Sort once outside | O(n^2 log n) → O(n log n) |
102
+ | Linear scan for min/max/median repeatedly (k queries over n items) | Heap / sorted structure | O(n·k) → O(n + k log n) |
103
+ | Recursive recomputation (naive Fibonacci shape) | Memoize, or convert to iterative DP | exponential → O(n) |
104
+ | String concatenation in a loop (some langs) | Use builder / `join` / `array.push` then join | O(n^2) → O(n) |
105
+ | Repeated regex compile in loop | Compile once outside | constant-factor, large |
106
+ | Counting / grouping via nested loop | Single pass with `Counter` / `Map<k, count>` | O(n^2) → O(n) |
107
+ | Sliding-window written as nested loop | Two-pointer / windowed sum | O(n^2) → O(n) |
108
+ | Repeated prefix sums | Precompute prefix array, O(1) range queries | O(n·q) → O(n+q) |
109
+ | Pairwise distance / containment checks on intervals | Sort + sweep line | O(n^2) → O(n log n) |
110
+ | Top-K via full sort | Heap of size K | O(n log n) → O(n log k) |
111
+ | Repeated set membership in loop body | `Set` once, reuse | O(n·m) → O(n+m) |
112
+ | `await` inside a `for` over independent items | `Promise.all` / batched concurrency | wall-clock O(n·latency) → O(latency) |
113
+ | ORM query inside a loop (N+1) | `IN (...)` / `select_related` / bulk fetch | O(n) round-trips → O(1) |
114
+
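As one instance of the table, the prefix-sum row can be sketched as follows. The function names are hypothetical, and ranges are assumed to be half-open `[lo, hi)` with valid indices.

```typescript
// Naive: every query rescans its range, O(n·q) in total.
function rangeSumsNaive(a: number[], queries: [number, number][]): number[] {
  return queries.map(([lo, hi]) => {
    let s = 0;
    for (let i = lo; i < hi; i++) s += a[i];
    return s;
  });
}

// Prefix array: prefix[i] = sum of a[0..i), so
// sum of a[lo..hi) = prefix[hi] - prefix[lo]. O(n) build, O(1) per query.
function rangeSumsPrefix(a: number[], queries: [number, number][]): number[] {
  const prefix = new Array<number>(a.length + 1).fill(0);
  for (let i = 0; i < a.length; i++) prefix[i + 1] = prefix[i] + a[i];
  return queries.map(([lo, hi]) => prefix[hi] - prefix[lo]);
}
```

For integer inputs the two versions agree exactly. With floating-point values the subtraction can round differently from a direct sum, which is a semantic risk to write down rather than assume away.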
115
+ ### Space-complexity reductions
116
+
117
+ | Smell | Fix | Typical win |
118
+ |---|---|---|
119
+ | Materializing whole list/array just to iterate | Generator / iterator / stream | O(n) → O(1) |
120
+ | Building intermediate arrays via chained `.map().filter().map()` on huge data | Single-pass loop or lazy pipeline | k·O(n) → O(n) (often O(1) extra) |
121
+ | Caching every intermediate result of a recursion | Rolling window (keep last k states) | O(n) → O(k) |
122
+ | Storing parents/visited for graph traversal when only count needed | Bitset / counter only | O(n) → O(1) |
123
+ | Copying input to mutate | In-place mutation when caller allows | O(n) → O(1) |
124
+ | Reading entire file before processing | Stream line-by-line / chunked | O(file) → O(chunk) |
125
+ | Deep-clone for safety in a loop | Clone once, or use structural sharing / immutables | O(n·m) → O(n+m) |
126
+ | Holding references that prevent GC (closures, listeners, caches) | Bound the cache (LRU), remove listeners, scope closures tightly | unbounded → bounded |
127
+ | Loading full result set from DB | Cursor / pagination / streaming query | O(rows) → O(page) |
128
+ | `JSON.parse(JSON.stringify(x))` for cloning | `structuredClone` or targeted copy | constant-factor: skips the intermediate string; a targeted copy touches only what changes |
129
+
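The chained `.map().filter()` row, for example, might be rewritten as a lazy pipeline along these lines. This is a sketch with made-up function names, assuming a runtime and compile target that support generators.

```typescript
// Eager: map and filter each allocate an intermediate array of up to n elements.
function sumOfEvenSquaresEager(xs: number[]): number {
  return xs.map(x => x * x).filter(x => x % 2 === 0).reduce((acc, x) => acc + x, 0);
}

// Lazy stages: each value flows through the whole pipeline before the next
// one is produced, so extra space stays O(1).
function* squares(xs: Iterable<number>): Generator<number> {
  for (const x of xs) yield x * x;
}

function* evens(xs: Iterable<number>): Generator<number> {
  for (const x of xs) if (x % 2 === 0) yield x;
}

// The source can itself be lazy, so the input never has to be materialized.
function* range(n: number): Generator<number> {
  for (let i = 0; i < n; i++) yield i;
}

function sumOfEvenSquaresLazy(xs: Iterable<number>): number {
  let acc = 0;
  for (const x of evens(squares(xs))) acc += x;
  return acc;
}
```

Generators carry their own per-element overhead, so on small arrays the eager version may well be faster; the win here is in peak memory, not necessarily in time.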
130
+ ### When you cannot lower asymptotic Big-O
131
+
132
+ Sometimes O(n log n) really is the floor. Then move to constant-factor wins:
133
+ - Replace pointer-chasing structures with contiguous arrays (cache locality).
134
+ - Hoist invariants out of loops.
135
+ - Avoid allocation in the hot loop (reuse buffers).
136
+ - Prefer typed arrays / native containers over boxed objects for numeric work.
137
+ - Batch syscalls / I/O.
138
+
139
+ State explicitly: "Asymptotic floor is O(n log n); applying constant-factor optimizations only."
140
+
141
+ ## Required workflow
142
+
143
+ For each piece of code you optimize:
144
+
145
+ 1. **Measure or estimate current Big-O.** Write it down.
146
+ 2. **Identify the bottleneck line(s).** Point at them.
147
+ 3. **Pick one transformation from the playbook.** Name it.
148
+ 4. **Apply it.** One change.
149
+ 5. **Verify behavior.** Tests pass, or outputs match on a representative input.
150
+ 6. **State new Big-O.** Time and space.
151
+ 7. **Repeat if more wins exist and are worth the complexity cost.**
152
+
153
+ ## Canonical example — workflow vs no-workflow
154
+
155
+ The same optimization with and without the verify-revert-stop loop.
156
+
157
+ **Bottleneck.** `getOrdersWithUsers()` takes 10 s on 10k orders. Cause: `users.find(u => u.id === o.userId)` inside the map → O(n·m).
158
+
159
+ <Bad>
160
+
161
+ ```ts
162
+ // No workflow: change semantics + the optimization in one go
163
+ export function getOrdersWithUsers(orders, users) {
164
+ const userById = Object.fromEntries(users.map(u => [u.id, u]));
165
+ return orders
166
+ .map(o => ({ ...o, user: userById[o.userId] }))
167
+ .filter(o => o.user); // silently drops orders whose user was deleted
168
+ }
169
+ ```
170
+
171
+ Faster, *and* changes the result set. Existing tests catch it — but the diff also "fixes" a flaky test by removing the assertion that checked the old behavior. Ships green. Breaks the billing report two weeks later.
172
+
173
+ </Bad>
174
+
175
+ <Good>
176
+
177
+ ```ts
178
+ // Workflow applied:
179
+ // Bottleneck: orders.map → users.find (line 14)
180
+ // Current: time = O(n·m), space = O(1)
181
+ // Target: time = O(n+m), space = O(m)
182
+ // Transformation: precompute index Map<userId, User> outside the loop
183
+ // Semantic risk: None — orders with missing users still emit `user: undefined` exactly as before
184
+ // Reverts so far: 0
185
+
186
+ export function getOrdersWithUsers(orders, users) {
187
+ const userById = new Map(users.map(u => [u.id, u]));
188
+ return orders.map(o => ({ ...o, user: userById.get(o.userId) }));
189
+ }
190
+ ```
191
+
192
+ One transformation. Existing tests stay untouched. Run them. If green, ship. If red, revert (don't patch). After 3 reverts, stop and load `invariant-guard` — the bottleneck is wrong, or the function has a contract no one wrote down.
193
+
194
+ </Good>
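The difference between the two rewrites can be checked against the original behavior directly. In this sketch the `User` and `Order` shapes, the function names, and the sample data are assumptions for illustration; the filtered variant applies the same drop-missing-users step as the Bad example.

```typescript
interface User { id: number; name: string }
interface Order { id: number; userId: number }
type OrderWithUser = Order & { user: User | undefined };

// Original shape from the bottleneck description: users.find inside the map, O(n·m).
function original(orders: Order[], users: User[]): OrderWithUser[] {
  return orders.map(o => ({ ...o, user: users.find(u => u.id === o.userId) }));
}

// Map-index rewrite, as in the Good example: O(n+m).
function indexed(orders: Order[], users: User[]): OrderWithUser[] {
  const userById = new Map<number, User>(users.map((u): [number, User] => [u.id, u]));
  return orders.map(o => ({ ...o, user: userById.get(o.userId) }));
}

// Same index plus the filter from the Bad example: the result set changes.
function indexedAndFiltered(orders: Order[], users: User[]): OrderWithUser[] {
  return indexed(orders, users).filter(o => o.user !== undefined);
}

const sampleUsers: User[] = [{ id: 1, name: "Ada" }, { id: 2, name: "Lin" }];
const sampleOrders: Order[] = [
  { id: 10, userId: 1 },
  { id: 11, userId: 3 }, // user 3 does not exist
  { id: 12, userId: 2 },
];

const before = original(sampleOrders, sampleUsers);
const afterGood = indexed(sampleOrders, sampleUsers);
const afterBad = indexedAndFiltered(sampleOrders, sampleUsers);
```

Here `before` and `afterGood` both contain three entries, with `user` undefined for order 11, while `afterBad` contains two. One assumption is worth stating: if `users` could contain duplicate ids, `find` returns the first match and the `Map` keeps the last, so "Semantic risk: None" holds only when ids are unique.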
195
+
196
+ ## Output discipline
197
+
198
+ When proposing or applying an optimization, your message must contain — in this order:
199
+
200
+ 1. **Bottleneck** — file:line and one-sentence reason.
201
+ 2. **Current complexity** — `time = O(?)`, `space = O(?)`.
202
+ 3. **Transformation** — name from the playbook (or describe it if novel).
203
+ 4. **New complexity** — `time = O(?)`, `space = O(?)`.
204
+ 5. **Semantic risk** — anything callers might notice (ordering, stability, error timing). "None" is a valid answer if true.
205
+ 6. **Measured speedup** — `before → after` with the ratio as `N× faster` (or `asymptotic only` if not measured). One line, honest numbers.
206
+ 7. **The diff.**
207
+
208
+ If any of 1–6 is missing, the optimization is not ready to apply.
209
+
210
+ ## Stop conditions — do not optimize further when
211
+
212
+ - Asymptotic Big-O already matches a known lower bound for the problem.
213
+ - The input is provably small and bounded (n < ~100 and not on a hot path).
214
+ - The optimization would obscure correctness or harm readability without a measured win.
215
+ - The bottleneck is I/O or external service latency, not CPU/memory — go fix that instead.
216
+
217
+ Premature optimization past these points adds risk without payoff.
218
+
219
+ ## Rationalizations to watch for
220
+
221
+ These are real verbatim thoughts captured from a controlled test where the model produced a correct optimization but skipped the workflow that would document it for reviewers:
222
+
223
+ | Excuse | Reality |
224
+ | --- | --- |
225
+ | "I already solved this in my head — just paste the diff and add labels after." | Retrofitted labels lie about the reasoning order. Write bottleneck → complexity → transformation → diff in that order, or you are writing fiction. |
226
+ | "Stating the current Big-O is busywork — everyone can see the nested loop." | If everyone can see it, writing one line costs nothing. If only you can see it, you just saved the reviewer's time. |
227
+ | "Semantic risk is None, skip that step." | "None" is a valid answer — but write it. The next reader does not know which guarantees you considered. |
228
+ | "I'll do all three transformations in one diff." | Stacked transformations hide regressions. One transformation, verify, repeat. |
229
+ | "It's just a small refactor, the workflow is overkill." | Then it takes 30 seconds. The cases where you skip the workflow are the ones where you miss the optimization next to the obvious one. |
230
+ | "I'll measure later." | Later is `<measured: TBD>` forever. Either measure now or accept the asymptotic argument as the only claim. |
231
+
232
+ If any of these sound familiar mid-edit: stop, restart the seven-step output discipline.
233
+
234
+ ## Red flags — STOP
235
+
236
+ - Optimizing without stating current Big-O.
237
+ - "This should be faster" without identifying a specific bottleneck line.
238
+ - Stacking multiple transformations before verifying any one of them.
239
+ - Claiming a speedup without measuring or without an asymptotic argument.
240
+ - Lowering complexity by silently changing output semantics.
241
+ - Rewriting code that runs once at startup with n = 12.
242
+
243
+ All of these mean: back up, restate the rules, start the workflow over.
244
+
245
+ ## Verification checklist
246
+
247
+ Before claiming an optimization is complete:
248
+
249
+ - [ ] Existing tests (or a written characterization test) were green BEFORE the transformation.
250
+ - [ ] Exactly one transformation was applied.
251
+ - [ ] Tests are green AFTER the transformation.
252
+ - [ ] No test was modified, weakened, or skipped to make it pass.
253
+ - [ ] Current Big-O and target Big-O are stated in the diff or PR description.
254
+ - [ ] Semantic risk is written down ("None" is valid if true).
255
+ - [ ] Measured speedup ratio is reported as `before → after · N× faster` (or explicitly marked `asymptotic only` if no measurement was possible).
256
+ - [ ] If a measured claim was made (e.g. "3x faster"), the measurement command is included.
257
+ - [ ] Revert count on this code is < 3.
258
+
259
+ Cannot check every box? The optimization is not done. Either revert or finish the gap — do not ship a half-verified speedup.
@@ -0,0 +1,310 @@
1
+ ---
2
+ name: invariant-guard
3
+ description: Use when writing or reviewing algorithms where the obvious implementation is subtly wrong — postcondition stronger than the loop's natural invariant (Boyer–Moore majority, Floyd cycle, leftmost vs any binary search, QuickSelect partition); in-place mutation with read+write pointers (dedup-in-place, partition, rotate); recursion with multiple parameters or accumulator state; off-by-one suspects with duplicates, empty inputs, boundary values; iterative refinements that must terminate (fixed-point, Newton, EM); any function where you catch yourself thinking "I know this algorithm" — the trap is usually in the contract, not the loop body. Forces writing the function contract (especially the postcondition) and loop invariant BEFORE code. Pairs with lemmaly (picks the algorithm) and mathguard (picks the math).
4
+ metadata:
5
+ priority: 2
6
+ role: correctness
7
+ pathPatterns:
8
+ - '**/*.{js,jsx,ts,tsx,mjs,cjs}'
9
+ - '**/*.py'
10
+ - '**/*.go'
11
+ - '**/*.rs'
12
+ - '**/*.java'
13
+ - '**/*.kt'
14
+ - '**/*.cs'
15
+ - '**/*.{cpp,cc,cxx,hpp,hh,hxx}'
16
+ - '**/*.php'
17
+ - '**/*.rb'
18
+ - '**/*.{sh,bash}'
19
+ chainTo:
20
+ - skill: lemmaly
21
+ when: 'algorithm choice is unsettled — pick the family first, then prove it'
22
+ - skill: mathguard
23
+ when: 'invariants involve ε-bounds (approximate / randomized algorithms)'
24
+ retrieval:
25
+ aliases:
26
+ - loop-invariants
27
+ - correctness-first
28
+ - postcondition-first
29
+ intents:
30
+ - prove the algorithm is correct
31
+ - write loop invariants
32
+ - handle edge cases
33
+ - off-by-one safety
34
+ ---
35
+
36
+ # invariant-guard — Correctness-First Coding
37
+
38
+ The model knows what a loop invariant is. It knows recursion needs a base case. It knows about empty lists, integer overflow, and the difference between `<` and `≤`. It just does not write these down before producing code, so it ships subtle correctness bugs that tests do not catch.
39
+
40
+ invariant-guard fixes the behavior. State the invariants. State the base case. State the termination argument. State the edge cases. Then write the code — and verify that the code maintains what you stated.
41
+
42
+ **Violating the letter of these rules is violating the spirit of the skill.** "I know this algorithm" is the exact rationalization that ships off-by-one and missing-postcondition bugs.
43
+
44
+ ## The Iron Law
45
+
46
+ ```text
47
+ NO LOOP OR RECURSION WITHOUT A WRITTEN INVARIANT AND TERMINATION ARGUMENT
48
+ ```
49
+
50
+ If you cannot write the invariant in one sentence, you have not designed the loop. Write code anyway and you are coding by guess — and the bug will be in the case you did not enumerate.
51
+
52
+ ## Non-negotiable rules
53
+
54
+ 1. **Every loop gets a one-line invariant.** Before writing any loop, state in one sentence what is true at the top of every iteration. Examples:
55
+ - "At loop top: `result` contains the sum of `a[0..i)`."
56
+ - "At loop top: `lo ≤ target_position ≤ hi`."
57
+ - "At loop top: `seen` contains every element processed so far; `dups` contains every element that appeared at least twice."
58
+
59
+ If you cannot write the invariant in one sentence, you have not designed the loop yet.
60
+
61
+ 2. **Every loop gets a one-line termination argument.** Name the quantity that strictly decreases (or strictly increases toward a bound) on every iteration. Examples:
62
+ - "`hi − lo` strictly decreases each iteration."
63
+ - "`i` increases by 1 and is bounded above by `n`."
64
+ - "`stack.length` strictly decreases each pop; nothing pushes inside this branch."
65
+
66
+ No termination argument, no loop.
67
+
68
+ 3. **Every recursion gets an explicit base case and a measure.** Before writing a recursive function, state:
69
+ - The base case(s) — the smallest inputs that return without recursing.
70
+ - The measure — a non-negative integer that strictly decreases on every recursive call (e.g. `len(xs)`, `hi − lo`, `depth`, `n`).
71
+ - The combination — how the recursive results combine into the answer.
72
+
73
+ No base case + measure, no recursion. (Mutual recursion: state the measure across the cycle.)
74
+
75
+ 4. **List edge cases before writing, not after.** For every function operating on a collection or number, list which of these apply and how they behave:
76
+ - Empty input (`[]`, `""`, `null`, `undefined`, `None`).
77
+ - Singleton (`[x]`).
78
+ - All-equal elements.
79
+ - Already-sorted / reverse-sorted input.
80
+ - Duplicates (when uniqueness is assumed).
81
+ - Negative numbers, zero, exactly the boundary value.
82
+ - Integer overflow / underflow at the type max/min.
83
+ - NaN, ±Infinity, `-0`, denormals (for floats).
84
+ - Off-by-one boundaries: index 0, index n−1, index n, length 0, length 1.
85
+ - Concurrent modification while iterating.
86
+
87
+ The cases that apply must each have a one-phrase expected behavior written down.
88
+
89
+ 5. **Make illegal states unreachable, not just unhandled.** Prefer encoding constraints in types and structure so the wrong state cannot be constructed:
90
+ - Sum type over boolean flag soup (`Loading | Loaded(data) | Error(msg)` not `{loading, data, error}`).
91
+ - Newtype for IDs that must not be swapped (`UserId` vs `OrderId`).
92
+ - Non-empty list type when the function requires at least one element.
93
+ - Parsed value at the boundary, not validated repeatedly downstream (parse-don't-validate).
94
+
95
+ If the language cannot encode it, write the invariant as a comment and assert it at the boundary.
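In TypeScript, the four bullets of rule 5 might look like the sketch below. All names are illustrative, and the `u_<digits>` id format is an invented convention, not something this package defines.

```typescript
// Sum type instead of flag soup: "loaded without data" cannot be constructed.
type RemoteData<T> =
  | { state: "loading" }
  | { state: "loaded"; data: T }
  | { state: "error"; message: string };

// Branded newtypes: a UserId cannot be passed where an OrderId is expected.
type UserId = string & { readonly __brand: "UserId" };
type OrderId = string & { readonly __brand: "OrderId" };

// Non-empty list: the type guarantees index 0 exists.
type NonEmpty<T> = [T, ...T[]];

// Parse at the boundary: callers receive a UserId or null, never a raw string
// that still needs validating downstream. The id format is hypothetical.
function parseUserId(raw: string): UserId | null {
  return /^u_[0-9]+$/.test(raw) ? (raw as UserId) : null;
}

function first<T>(xs: NonEmpty<T>): T {
  return xs[0];
}

// Narrowing on `state` makes r.data and r.message available only in the
// branches where they exist.
function render(r: RemoteData<number>): string {
  switch (r.state) {
    case "loading": return "loading";
    case "loaded": return `loaded ${r.data}`;
    case "error": return `error: ${r.message}`;
  }
}
```

Brands are erased at runtime, so they only protect code that goes through the type checker; values arriving from outside still need the parse step.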
96
+
97
+ ## The pre-write protocol
98
+
99
+ Before producing non-trivial code that has loops, recursion, or non-trivial state, your message must contain — in this order:
100
+
101
+ 1. **Function contract** — preconditions, postconditions, and what the function returns. One line each.
102
+ 2. **Loop invariants** — one per loop. (Rule 1.)
103
+ 3. **Termination arguments** — one per loop or recursion. (Rules 2, 3.)
104
+ 4. **Base cases and measure** — for recursion. (Rule 3.)
105
+ 5. **Edge case table** — bullets, one per applicable case, with expected behavior. (Rule 4.)
106
+ 6. **Illegal states made unrepresentable** — name the types or asserts that enforce invariants. (Rule 5.)
107
+ 7. **The code.**
108
+ 8. **Self-check** — one line per loop confirming the invariant holds at top, body preserves it, and exit implies postcondition.
109
+
110
+ If any of 1–6 is missing, do not emit code.
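Applied to a small in-place routine, the protocol could read as follows. This is a sketch, not a prescribed layout: the numbered comments map to the protocol steps, and `dedupSortedInPlace` is a hypothetical example function.

```typescript
// 1. contract
//    pre:  a is sorted ascending
//    post: a holds the distinct values of the original a, in order; returns the new length
// 5. edge cases
//    empty []      -> returns 0, a stays []
//    singleton [x] -> returns 1, a unchanged
//    all-equal     -> returns 1
//    no duplicates -> a unchanged
// 6. illegal states: sortedness is not encoded in the type, so it is asserted at the boundary
function dedupSortedInPlace(a: number[]): number {
  // inv:  a[0..i) is sorted ascending
  // term: i increases by 1 and is bounded above by a.length
  for (let i = 1; i < a.length; i++) {
    if (a[i - 1] > a[i]) throw new RangeError("input must be sorted ascending");
  }
  if (a.length === 0) return 0;

  let write = 1;
  // 2. inv:  a[0..write) holds the distinct values of the original a[0..read), in order,
  //          and write <= read
  // 3. term: read increases by 1 each iteration and is bounded above by a.length
  for (let read = 1; read < a.length; read++) {
    if (a[read] !== a[write - 1]) {
      a[write] = a[read];
      write++;
    }
  }
  // 8. self-check: the invariant holds at read = write = 1; the body preserves it because
  //    write <= read means no unread element has been overwritten; at exit read === a.length,
  //    so a[0..write) is exactly the postcondition.
  a.length = write; // drop the stale tail
  return write;
}
```

Step 4 (base cases and measure) is omitted because the function is not recursive.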
111
+
112
+ ## Worked trap — Boyer–Moore majority vote
113
+
114
+ This is the canonical "the trap is in the contract, not the loop body" case. Every step of the protocol either catches the bug or fails to — observe which.
115
+
116
+ **Naive baseline (what gets shipped without the skill):**
117
+
118
+ ```typescript
119
+ function findMajority(arr: number[]): number | null {
120
+ if (arr.length === 0) return null;
121
+ let candidate = arr[0], count = 0;
122
+ for (const x of arr) {
123
+ if (count === 0) candidate = x;
124
+ if (x === candidate) count++; else count--;
125
+ }
126
+ return candidate; // BUG: returns the candidate even when no majority exists
127
+ }
128
+ ```
129
+
130
+ This implementation fails on `[1,2,3]` (returns `3`, expected `null`) and `[2,2,1,1]` (returns `2`, expected `null`). The voting loop is correct; the postcondition is not met.
131
+
132
+ **Why the protocol catches it.** Writing **step 1 (function contract)** forces the postcondition in plain language:
133
+
134
+ > Returns `x` iff `count(x, arr) > arr.length / 2`; else `null`.
135
+
136
+ Then writing **step 2 (loop invariant)** forces the invariant of the voting pass:
137
+
138
+ > If a strict majority element exists in `arr`, it equals `candidate` when the loop exits.
139
+
140
+ These two statements are not equivalent. The loop invariant guarantees "if a majority exists, it is the candidate" — not "the candidate is a majority." Once you write both down, the gap is visible: you need a second pass to verify, or the postcondition is unmet.
141
+
142
+ **Correct implementation that survives the protocol:**
143
+
144
+ ```typescript
145
+ function findMajority(arr: number[]): number | null {
146
+ if (arr.length === 0) return null;
147
+ // Pass 1: vote.
148
+ let candidate = arr[0], count = 0;
149
+ // inv: arr[0..i) splits into `count` unpaired copies of candidate plus pairs of unequal elements, so a strict majority of arr, if one exists, equals candidate at exit.
150
+ for (const x of arr) {
151
+ if (count === 0) candidate = x;
152
+ if (x === candidate) count++; else count--;
153
+ }
154
+ // Pass 2: verify — the voting invariant is strictly weaker than the postcondition.
155
+ let tally = 0;
156
+ // inv: tally = count of candidate in arr[0..i).
157
+ for (const x of arr) if (x === candidate) tally++;
158
+ return tally * 2 > arr.length ? candidate : null;
159
+ }
160
+ ```
161
+
162
+ **Pattern to generalize.** The same trap appears in:
163
+
164
+ - **Floyd's cycle detection** — finding the meeting point tells you a cycle exists, *not* where it starts. You need a second walk.
165
+ - **Two-pointer "find any"** vs **"find leftmost"** — the loop invariant for one does not satisfy the postcondition of the other.
166
+ - **QuickSelect partition** — the loop returns a position; the postcondition is that the element at that position is the k-th smallest. Off by one in the partition invariant silently breaks it.
167
+ - **DP with reconstruction** — the table tells you the optimum value; reconstructing the optimum path needs separate invariants on the choice array.
168
+
169
+ In every case: **write the postcondition first; write the loop invariant second; check that the second implies the first. If not, you are missing a pass, a check, or an auxiliary state.**
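The Floyd bullet makes a compact illustration of the same rule. In the sketch below, `ListNode` and `cycleStart` are hypothetical names; the first loop alone would satisfy only "a cycle exists", and the second walk is what meets the stronger postcondition.

```typescript
interface ListNode { value: number; next: ListNode | null }

// post: returns the first node of the cycle, or null if the list is acyclic
function cycleStart(head: ListNode | null): ListNode | null {
  let slow = head;
  let fast = head;
  // Phase 1 (detect). inv: fast is twice as many steps from head as slow.
  // term: in an acyclic list fast reaches null; in a cyclic list the gap between
  // the pointers shrinks by 1 per step once both are inside the cycle.
  while (fast !== null && fast.next !== null) {
    slow = slow!.next; // slow trails fast, so it cannot be null here
    fast = fast.next.next;
    if (slow === fast) {
      // Phase 2 (locate). The meeting point proves a cycle exists, not where it starts.
      // Walking one pointer from head and one from the meeting point in lockstep
      // makes them meet at the first node of the cycle.
      let p = head;
      while (p !== slow) {
        p = p!.next;
        slow = slow!.next;
      }
      return p;
    }
  }
  return null;
}
```

Returning `slow` straight after phase 1 would pass any test that only asks "is there a cycle?", which is how the weaker invariant goes unnoticed.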
170
+
171
+ ## Canonical example — binary search for the leftmost match
172
+
173
+ Most "I know binary search" implementations are written for "find any match." The trap is the postcondition.
174
+
175
+ **Problem.** Given a sorted array with duplicates, return the index of the **leftmost** occurrence of `target`, or `-1`.
176
+
177
+ <Bad>
178
+
179
+ ```ts
180
+ function leftmost(a: number[], target: number): number {
181
+ let lo = 0, hi = a.length - 1;
182
+ while (lo <= hi) {
183
+ const mid = (lo + hi) >> 1;
184
+ if (a[mid] === target) return mid; // ❌ returns ANY occurrence
185
+ if (a[mid] < target) lo = mid + 1; else hi = mid - 1;
186
+ }
187
+ return -1;
188
+ }
189
+ // leftmost([1,2,2,2,3], 2) → returns 2 (the middle occurrence); the leftmost is index 1
190
+ ```
191
+
192
+ The loop invariant ("target lies in a[lo..hi] if anywhere") is satisfied. But the postcondition ("returned index is the *smallest* i with a[i] === target") is strictly stronger. The loop body's early return abandons the search before reaching the leftmost.
193
+
194
+ </Bad>
195
+
196
+ <Good>
197
+
198
+ ```ts
199
+ function leftmost(a: number[], target: number): number {
200
+ // contract:
201
+ // pre: a is sorted ascending
202
+ // post: returns smallest i with a[i] === target, or -1 if absent
203
+ let lo = 0, hi = a.length; // half-open [lo, hi)
204
+ // inv: every index < lo has a[i] < target; every index ≥ hi has a[i] ≥ target
205
+ // term: hi - lo strictly decreases each iteration (it roughly halves)
206
+ while (lo < hi) {
207
+ const mid = (lo + hi) >> 1;
208
+ if (a[mid] < target) lo = mid + 1; else hi = mid;
209
+ }
210
+ // exit: lo === hi, and by invariant lo is the leftmost index where a[lo] >= target (or a.length if there is none)
211
+ return lo < a.length && a[lo] === target ? lo : -1;
212
+ }
213
+ ```
214
+
215
+ </Good>
216
+
217
+ Same loop shape. The difference is the contract was written first — and the loop body was chosen to maintain an invariant that *implies* the postcondition.
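
One way to gain confidence that the loop really implies the postcondition is to compare it against an obviously correct linear scan on every small input. The sketch below restates `leftmost` from the good example and adds two hypothetical helpers, `sortedArrays` and `countMismatches`, that exist only for this illustration. It relies on `Array.prototype.indexOf`, which returns the first matching index or `-1`.

```typescript
function leftmost(a: number[], target: number): number {
  let lo = 0, hi = a.length; // half-open [lo, hi)
  // inv: a[i] < target for all i < lo; a[i] >= target for all i >= hi
  // term: hi - lo strictly decreases
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (a[mid] < target) lo = mid + 1; else hi = mid;
  }
  return lo < a.length && a[lo] === target ? lo : -1;
}

// Hypothetical helper: every non-decreasing array of length <= maxLen over values 0..maxVal.
function sortedArrays(maxLen: number, maxVal: number): number[][] {
  const out: number[][] = [];
  // recursion: base case is prefix.length === maxLen; measure is maxLen - prefix.length
  const extend = (prefix: number[]): void => {
    out.push(prefix);
    if (prefix.length === maxLen) return;
    const start = prefix.length === 0 ? 0 : prefix[prefix.length - 1];
    for (let v = start; v <= maxVal; v++) extend(prefix.concat(v));
  };
  extend([]);
  return out;
}

// Hypothetical helper: counts inputs where leftmost disagrees with a linear scan.
function countMismatches(maxLen: number, maxVal: number): number {
  let bad = 0;
  for (const a of sortedArrays(maxLen, maxVal)) {
    // Targets range one below and one above the stored values to cover "absent" cases.
    for (let t = -1; t <= maxVal + 1; t++) {
      if (leftmost(a, t) !== a.indexOf(t)) bad++;
    }
  }
  return bad;
}
```

A call such as `countMismatches(4, 2)` covers the empty array, singletons, all-equal arrays, and absent targets on both sides. It complements the invariant argument and does not replace it.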
218
+
219
+ ## Common invariant patterns to reach for
220
+
221
+ | Loop / algorithm shape | Canonical invariant | Termination |
222
+ |---|---|---|
223
+ | Linear scan accumulating | `acc = f(a[0..i))` at top | `i` increases by 1, bounded by `n` |
224
+ | Two-pointer (sorted) | `target (if any) lies in a[lo..hi]` | `hi − lo` strictly decreases |
225
+ | Binary search | `target (if present) ∈ a[lo..hi]` | `hi − lo` strictly decreases (roughly halves) |
226
+ | Sliding window | window `[l..r)` satisfies the constraint; best so far is optimal over all windows already examined | `r` advances at least once per outer iter |
227
+ | BFS | every node at distance < d has been popped; queue contains only nodes at distance d or d + 1 | each node is enqueued at most once, so the number of pops is bounded by the node count |
228
+ | DFS / recursion on tree | result for subtree rooted at v = combine(children results) | depth (or remaining nodes) strictly decreases |
229
+ | Divide and conquer | result on `a[lo..hi]` = combine(results on the two halves) | `hi − lo` strictly halves |
230
+ | Greedy with priority queue | extracted item is globally optimal for the remaining problem | heap size strictly decreases per extract |
231
+ | Union-Find op | `find(x)` always returns the canonical root of x's component | tree height bounded by O(log n) (with rank) |
232
+ | In-place partition | `a[0..i)` < pivot; `a[i..j)` ≥ pivot; `a[j..n)` unseen | `n − j` strictly decreases |
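
To show how a row of this table turns into code, here is a minimal sketch of the in-place partition row. The function name `partition` and its signature are assumptions made for this example. The comments carry the invariant and termination argument from the table.

```typescript
// pre: none (any array, including empty)
// post: returns p such that a[0..p) < pivot and a[p..n) >= pivot; a is permuted in place
function partition(a: number[], pivot: number): number {
  let i = 0;
  // inv: a[0..i) < pivot; a[i..j) >= pivot; a[j..n) unseen
  // term: n - j strictly decreases
  for (let j = 0; j < a.length; j++) {
    if (a[j] < pivot) {
      // a[i] is the first element known to be >= pivot (or i === j); swapping keeps both regions valid
      const tmp = a[i];
      a[i] = a[j];
      a[j] = tmp;
      i++;
    }
  }
  // exit: j === n, the unseen region is empty, so the invariant is exactly the postcondition
  return i;
}
```

Here the exit state and the postcondition coincide, so no second pass is needed.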
233
+
234
+ ## Edge case table — defaults to consider
235
+
236
+ | Input shape | Cases to check |
237
+ |---|---|
238
+ | Array / list | empty, singleton, all-equal, sorted, reversed, with duplicates |
239
+ | String | empty, single char, all whitespace, unicode (surrogates, combining), bytes vs code points |
240
+ | Integer | 0, 1, −1, MIN, MAX, MAX − 1, near overflow in arithmetic, division by 0 |
241
+ | Float | 0.0, −0.0, NaN, ±Inf, denormal; equality checks should use a tolerance (ε), not exact comparison |
242
+ | Map / dict | empty, missing key (default vs error), key collision semantics |
243
+ | Tree / graph | empty, single node, cycle (directed and undirected), self-loop, multigraph, disconnected |
244
+ | Stream / iterator | empty, infinite, single yield, exception mid-iteration |
245
+ | Time / date | DST transition, leap second/day, timezone offset, epoch boundary |
246
+ | Concurrent | empty contention, single thread, max contention, cancellation mid-op |
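
For the float row, a tolerance-based comparison might look like the sketch below. The helper name `approxEqual` and its default tolerances are illustrative assumptions, not recommended values; the right ε depends on the computation.

```typescript
// Hypothetical helper; relTol and absTol defaults are placeholders for illustration.
// post: true iff x and y are equal within max(absTol, relTol * max(|x|, |y|)),
//       with NaN never equal to anything and infinities equal only to themselves
function approxEqual(x: number, y: number, relTol = 1e-9, absTol = 1e-12): boolean {
  if (isNaN(x) || isNaN(y)) return false; // NaN compares unequal to everything
  if (x === y) return true; // covers equal infinities and 0 === -0
  if (!isFinite(x) || !isFinite(y)) return false; // Inf vs finite, or opposite infinities
  const diff = Math.abs(x - y);
  return diff <= Math.max(absTol, relTol * Math.max(Math.abs(x), Math.abs(y)));
}
```

The early checks exist because the subtraction alone mishandles the table's cases: `Infinity - Infinity` is `NaN`, and any comparison with `NaN` is false.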
247
+
248
+ ## Output discipline
249
+
250
+ Code you emit must:
251
+
252
+ - Have one comment per loop stating the invariant (use `// inv:` or `# inv:`).
253
+ - Have one comment per recursion stating the base case and measure.
254
+ - Handle every edge case you listed in step 5, or explicitly delegate ("throws on empty — caller responsibility").
255
+ - Assert preconditions at function entry when the language supports it cheaply.
256
+ - Use types (sum types, newtypes, non-empty, non-null) over runtime checks where the language allows.
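
A minimal sketch of what code following these rules could look like. The names `NonEmpty`, `maxOf`, and `digitSum` are hypothetical and chosen only for this example. It shows a type that makes the empty case unrepresentable, an `// inv:` comment on the loop, a cheap entry assertion, and a recursion that names its base case and measure.

```typescript
// Hypothetical type: a tuple with at least one element, so "empty" cannot be constructed.
type NonEmpty<T> = [T, ...T[]];

// pre: none beyond the type
// post: returns the largest element of xs
function maxOf(xs: NonEmpty<number>): number {
  let best = xs[0];
  // inv: best === max(xs[0..i))
  // term: i increases by 1, bounded by xs.length
  for (let i = 1; i < xs.length; i++) {
    if (xs[i] > best) best = xs[i];
  }
  return best;
}

// pre: n is a finite, non-negative integer (asserted at entry)
// post: returns the sum of the decimal digits of n
function digitSum(n: number): number {
  if (!isFinite(n) || Math.floor(n) !== n || n < 0) {
    throw new RangeError("digitSum expects a non-negative integer");
  }
  // base case: n < 10; measure: n strictly decreases (n -> floor(n / 10))
  if (n < 10) return n;
  return (n % 10) + digitSum(Math.floor(n / 10));
}
```

With `NonEmpty`, a call like `maxOf([])` is rejected at compile time, so the function needs no "throws on empty" branch.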
257
+
258
+ ## When to escalate or redirect
259
+
260
+ - The function is performance-critical and you have not picked the algorithm — go back to **lemmaly** first; pick the algorithm, then state its invariants here.
261
+ - The technique is mathematical (probabilistic, FFT, geometry) — load **mathguard**; invariants for approximate algorithms include ε-bounds, not equality.
262
+ - The code is concurrent — invariants must account for interleaving; explicitly state "single-threaded only" if that is the assumption.
263
+
264
+ ## Rationalizations to watch for
265
+
266
+ These are real verbatim thoughts captured from a controlled test where confidence in a "well-known algorithm" caused the model to ship a Boyer–Moore implementation that was wrong on `[1,2,3]` and `[2,2,1,1]`:
267
+
268
+ | Excuse | Reality |
269
+ | --- | --- |
270
+ | "I know this algorithm — single pass, done." | Knowing the loop ≠ knowing the contract. The trap usually lives in the postcondition the loop does not enforce. |
271
+ | "I traced it in my head, it works." | Mental tracing skips edge cases. Write the invariant; check it implies the postcondition. |
272
+ | "Edge cases are obvious." | Then write them down in 30 seconds. If they are obvious, the table is cheap. If they are not, the table just saved you. |
273
+ | "Tests will catch it." | Tests catch the examples you thought of. The trap is the example you did not. A postcondition constrains every input, not just the ones you tested. |
274
+ | "The postcondition is implied." | If it were, the natural loop invariant would equal it. When they differ (Boyer–Moore, leftmost search, QuickSelect), you need a second pass, an extra check, or auxiliary state. |
275
+ | "Adding a verification pass feels redundant." | Boyer–Moore voting + verification is still O(n). "Feels redundant" is the rationalization that ships the bug. |
276
+
277
+ If any of these sound familiar mid-thought: stop, write the contract and the invariant, check that one implies the other.
278
+
279
+ ## Red flags — STOP and write the invariant first
280
+
281
+ - About to write `while (...)` without having stated what is true on entry.
282
+ - About to write `if (i === n - 1)` or `if (i === n)`: the boundary is suspicious, so restate the invariant.
283
+ - About to recurse without naming the base case in this message.
284
+ - About to write `// TODO: handle empty` — handle it now or change the type so empty is impossible.
285
+ - About to use `==` on floats.
286
+ - About to compare across signed/unsigned or across types where overflow wraps around.
287
+ - About to silently swallow an error in the middle of a loop ("just continue").
288
+ - Tests pass but you did not actually state what the function guarantees.
289
+ - "It works on the examples I tried."
290
+
291
+ All of these mean: stop, restart the eight-step protocol, write the invariant, then write the code.
292
+
293
+ ## Verification checklist
294
+
295
+ Before claiming the function is correct:
296
+
297
+ - [ ] Every loop has a one-line `// inv:` comment in code.
298
+ - [ ] Every loop has a termination argument written down (in comment or PR description).
299
+ - [ ] Every recursion names its base case and measure in code.
300
+ - [ ] The function's postcondition is written and is implied by the exit state of the last loop.
301
+ - [ ] Every applicable edge case from the table has a test or an explicit "delegated to caller" note.
302
+ - [ ] At least one test exercises each non-trivial boundary (empty, singleton, max, off-by-one).
303
+ - [ ] Illegal states the function rejects are either unrepresentable in the type, or asserted at entry.
304
+ - [ ] For approximate/randomized algorithms (escalated to mathguard): ε-bounds are part of the postcondition, not equality.
305
+
306
+ Cannot check every box? The code is example-correct, not behavior-correct. Either fill the gap or downgrade the function's claimed contract.
307
+
308
+ ## The thesis, in one line
309
+
310
+ > **Tests verify examples. Invariants verify behavior. AI assistants ship example-correct, behavior-wrong code by default. invariant-guard makes them reason about behavior first.**