@simpleapps-com/augur-skills 2026.4.15 → 2026.4.17

package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@simpleapps-com/augur-skills",
3
- "version": "2026.04.15",
3
+ "version": "2026.04.17",
4
4
  "description": "Install curated Claude Code skills",
5
5
  "license": "MIT",
6
6
  "type": "module",
@@ -18,8 +18,8 @@ The entire plugin system exists to remove the user as the bottleneck. Every perm
18
18
 
19
19
  | Tier | Method | Speed | Example |
20
20
  |------|--------|-------|---------|
21
- | 1 | Dedicated tools (Read, Grep, Glob, Edit) | **WILL** run immediately, zero permission chance | `Grep(pattern: "...", path: "repo")` |
22
- | 2 | Simple Bash (one command, no operators) | **MAY** run immediately if pre-approved | `pnpm typecheck` |
21
+ | 1 | Dedicated tools (Read, Edit, Write) | **WILL** run immediately, zero permission chance | `Read(file_path: "repo/src/foo.ts")` |
22
+ | 2 | Simple Bash (one command, no operators) | **MAY** run immediately if pre-approved | `pnpm typecheck`, `grep -rn pattern repo` |
23
23
  | 3 | Complex Bash (operators, plumbing) | **WILL** trigger a permission prompt | `pnpm typecheck 2>&1; echo $?` |
24
24
 
25
25
  Prefer tier 1 over tier 2. Use tier 2 only when no dedicated tool exists. NEVER use tier 3.
@@ -37,26 +37,32 @@ The Bash tool is a managed environment, not a raw shell. It already captures std
37
37
  | Limit output | Returned in full | `\| head`, `\| tail`, `\| grep` |
38
38
  | Run the next step | Make a separate tool call | `&&`, `;`, `\|\|` |
39
39
  | Pass output to another command | Write to a tmp file | `$(...)`, backticks |
40
- | Run inline code | Use Read/Grep/Edit tools | `node -e`, `python -c` |
40
+ | Run inline code | Use Read/Edit tools | `node -e`, `python -c` |
41
41
 
42
42
  **One command per Bash call. No operators. No plumbing. If the command has a `;`, `&&`, `|`, `$()`, `2>&1`, or `2>/dev/null` in it, it is wrong.**
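A minimal sketch of how the rule above could be checked mechanically. The helper name and the operator list are assumptions made for illustration (the list combines the operators named in the rule and in the table); this is not part of the package, and it is a naive substring scan that does not parse shell quoting.

```typescript
// Hypothetical helper, not part of augur-skills.
// Operators taken from the rule and table above.
const FORBIDDEN_OPERATORS: readonly string[] = [
  ";", "&&", "||", "|", "$(", "`", "2>&1", "2>/dev/null",
];

// Returns every forbidden operator found in the command.
// An empty array means the command is a single simple command.
// Caveat: an operator inside a quoted argument is flagged too,
// and a command containing "||" also matches "|".
function findForbiddenOperators(command: string): string[] {
  return FORBIDDEN_OPERATORS.filter((op) => command.includes(op));
}

console.log(findForbiddenOperators("pnpm typecheck"));               // []
console.log(findForbiddenOperators("pnpm typecheck 2>&1; echo $?")); // [ ';', '2>&1' ]
```

A check like this errs on the side of rejecting, which matches the intent of the rule: when in doubt, split the work into separate calls.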
43
43
 
44
44
  ## Use Dedicated Tools
45
45
 
46
- Dedicated tools are faster, require no permission, and produce better output. MUST use them instead of Bash equivalents:
46
+ Dedicated tools are faster, require no permission, and produce better output. MUST use them instead of Bash equivalents when one exists:
47
47
 
48
48
  | Instead of | Use |
49
49
  |------------|-----|
50
- | `grep`, `rg` | Grep tool |
51
- | `find`, `ls` (for search) | Glob tool |
52
50
  | `cat`, `head`, `tail` | Read tool |
53
51
  | `sed`, `awk` | Edit tool |
54
52
  | `echo >`, `cat <<EOF` | Write tool |
55
53
 
56
- Reserve Bash for commands that have no dedicated tool equivalent: build tools, test runners, git, package managers, and system commands.
54
+ **Search is now Bash-only.** Claude Code 2.1.117 removed the dedicated Grep and Glob tools. Search files with one of:
55
+
56
+ | Use case | Bash command |
57
+ |----------|--------------|
58
+ | Search file contents | `grep -rn <pattern> <path>` or `rg <pattern> <path>` |
59
+ | Find files by name | `find <path> -name <pattern>` |
60
+ | List directory entries | `ls <path>` |
61
+
62
+ Reserve Bash for these and for commands that never had a dedicated tool: build tools, test runners, git, package managers, system commands.
57
63
 
58
64
  These commands are **denied** in project settings and will always be rejected. Do not attempt them:
59
- `cd`, `cat`, `grep`, `rg`, `find`, `sed`, `awk`, `head`, `tail`, `sleep`, `kill`, `pkill`
65
+ `cd`, `cat`, `sed`, `awk`, `head`, `tail`, `sleep`, `kill`, `pkill`
60
66
 
61
67
  MUST NOT use `node -e` or `python -c` to run inline scripts. These trigger permission prompts. If you need to read a file, use the Read tool. If you need to process data, do it in your response, not in a shell script.
62
68
 
@@ -64,13 +70,11 @@ MUST NOT use `node -e` or `python -c` to run inline scripts. These trigger permi
64
70
 
65
71
  If a Bash call is denied, do NOT retry the same command and do NOT ask the user to approve it. Before anything else, check for a tool equivalent or shell plumbing that can be decomposed:
66
72
 
67
- - `grep`/`rg` → Grep tool (for files on disk); for command output, the Bash tool already returned it — read what you have
68
- - `find`/`ls` → Glob tool
69
73
  - `cat`/`head`/`tail` → Read tool
70
74
  - `sed`/`awk` → Edit tool
71
75
  - `|`, `2>&1`, `&&`, `;`, `$()` → split into separate calls; the Bash tool already captures stdout, stderr, and exit code
72
76
 
73
- Worked example: `pnpm --filter <package> typecheck 2>&1 | grep -c "error TS"` is denied because of the pipe to `grep`. The fix is to run `pnpm --filter <package> typecheck` alone — the Bash tool returns the full output and exit code — then count "error TS" occurrences in the returned output yourself. No grep, no redirection, no retry.
77
+ Worked example: `pnpm --filter <package> typecheck 2>&1 | grep -c "error TS"` is denied because of the pipe and redirection. The fix is to run `pnpm --filter <package> typecheck` alone — the Bash tool returns the full output and exit code — then count "error TS" occurrences in the returned output yourself. No pipe, no redirection, no retry. (`grep` itself is allowed; the deny is on the shell plumbing around it.)
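The "count it yourself" step from the worked example can be sketched as a plain string operation. The function name and the sample compiler output below are invented for illustration. Note that `grep -c` counts matching lines while this sketch counts occurrences; for typical `tsc` output, which reports one error per line, the two agree.

```typescript
// Hypothetical helper: counts non-overlapping occurrences of `needle` in `output`.
function countOccurrences(output: string, needle: string): number {
  if (needle.length === 0) return 0; // avoid counting empty matches
  return output.split(needle).length - 1;
}

// Made-up sample of what the Bash tool might return for a failing typecheck.
const sampleOutput = [
  "src/a.ts(3,5): error TS2322: Type 'string' is not assignable to type 'number'.",
  "src/b.ts(9,1): error TS2304: Cannot find name 'foo'.",
  "Found 2 errors.",
].join("\n");

console.log(countOccurrences(sampleOutput, "error TS")); // 2
```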
74
78
 
75
79
  ## Background Tasks
76
80
 
@@ -96,21 +100,21 @@ Do not retry the server start until the user confirms the port is free.
96
100
 
97
101
  ## Cross-Project Searching
98
102
 
99
- When looking at another project's code, use dedicated tools with the project path. MUST NOT use shell commands.
103
+ When looking at another project's code, search with Bash directly using the project path. MUST keep it to one simple command per call — no pipes, no `-exec`, no `2>&1 | head`.
100
104
 
101
105
  Wrong: `find {path}/repo -name "*.ts" -exec grep -l "pattern" {} \; 2>/dev/null | head -10`
102
- Right: `Grep(pattern: "pattern", path: "{path}/repo", glob: "*.ts")`
106
+ Right: `grep -rln --include="*.ts" "pattern" {path}/repo`
103
107
 
104
- Wrong: `ls {path}/repo/src/components/`
105
- Right: `Glob(pattern: "{path}/repo/src/components/**/*")`
108
+ Wrong: `ls {path}/repo/src/components/ | head`
109
+ Right: `ls {path}/repo/src/components/`
106
110
 
107
- All project paths are known and predictable (see `simpleapps:wiki` Cross-Project Wiki Access). MUST NOT search the filesystem with `find` or download from the internet. Just use the dedicated tool with the known path.
111
+ All project paths are known and predictable (see `simpleapps:wiki` Cross-Project Wiki Access). Use the known path; do not search the entire filesystem.
108
112
 
109
113
  ## Subagent Responsibility
110
114
 
111
115
  Subagents do NOT inherit this skill. They see only the prompt you give them. The primary agent MUST brief every subagent on bash-simplicity before delegating shell work, and owns the output that comes back.
112
116
 
113
- Every subagent prompt that touches Bash MUST include a one-liner: "One command per Bash call. No operators. Use dedicated tools (Read, Grep, Glob, Edit, Write) over shell equivalents."
117
+ Every subagent prompt that touches Bash MUST include a one-liner: "One command per Bash call. No operators. Use dedicated tools (Read, Edit, Write) over their shell equivalents (`cat`, `sed`, `awk`, `echo >`). Search with Bash directly: `grep -rn`, `find`, `ls` — Claude Code 2.1.117 removed the Grep/Glob tools."
114
118
 
115
119
  If a subagent returns a command containing any forbidden operator (see the table above), that is the primary agent's failure. Reject and ask for a re-plan, or translate into separate simple calls. Do not execute it. A subagent violating this is running on a stale prompt; fix the prompt.
116
120
 
@@ -0,0 +1,270 @@
1
+ ---
2
+ name: code-contracts
3
+ description: When working in load-bearing code (the 5-10% where correctness matters most — money math, auth, concurrency, state machines, security boundaries), tighten the type system first, then add formal contracts (@requires/@ensures/@invariant/@trusted) in the host language's native comment syntax. Use Unicode glyphs (∀, ∈, ≥, ℕ) for AI priming, paired with ASCII gloss for human readers. The contracts pay off through three independent mechanisms: priming agent reasoning, surfacing tests, and adding context-window value that agents themselves report. Drift between contract and code is a defect.
4
+ ---
5
+
6
+ # Code Contracts
7
+
8
+ > **Status: EXPERIMENTAL** for the priming hypothesis (mechanism #1). The test-surfacing (#2) and agent-reported-value (#3) mechanisms are observed in practice. Apply selectively to load-bearing code; treat as a defensible bet across three channels — pays off if any one of them lands.
9
+
10
+ Persistent prompt engineering encoded in source code. Formal contracts on load-bearing functions pay off through three independent mechanisms.
11
+
12
+ ## Three mechanisms — why this is worth the context cost
13
+
14
+ | # | Mechanism | What it does | Status |
15
+ |---|-----------|--------------|--------|
16
+ | 1 | Latent-space priming | Unicode glyph rarity pulls hidden state toward the formal-methods neighborhood — sharper reasoning on the *next edit* | Hypothesis (defensible bet) |
17
+ | 2 | Test surfacing | Explicit clauses make intent legible — agents generate *new and better tests* using each clause as an oracle | **Observed in practice** |
18
+ | 3 | Agent-reported value | Agents themselves report that contracts add useful context for complex / load-bearing methods | **Observed in practice** |
19
+
20
+ The contracts are the artifact. The three mechanisms are *consequences*. Even if mechanism #1 is weaker than hoped, mechanisms #2 and #3 are already paying off.
21
+
22
+ ## What this is — and is not
23
+
24
+ This is **not** documentation. It is **not** a parallel comment style for human readers alone. It is a cognitive switch that targets future agents reading the file, paired with an ASCII gloss that bridges human readers.
25
+
26
+ Models trained on F\*/Lean/Dafny/Coq corpora develop a "spec-then-implement" reasoning policy — slower, more rigorous, explicit about pre/postconditions and effects. Plain TS/PHP/Python does not activate that policy because the surface form does not match. Unicode-heavy contract prose in the same file *does* match — the model upshifts into the more rigorous mode for the function it precedes.
27
+
28
+ The annotations do not add information the model could not infer. They change the *mode* the model reasons in. Same idea as "let's reason step by step" or "you are an expert at X" — moved out of the system prompt and into the artifact, where it primes every future agent that touches the code, not just the current session.
29
+
30
+ ## When to use it
31
+
32
+ MUST apply ONLY in **load-bearing code** — the 5-10% of functions where a subtle bug compounds:
33
+
34
+ - Money math (pricing, tax, fees, totals, currency conversion, rounding)
35
+ - Auth and permission decisions
36
+ - Concurrency and ordering (locks, queues, retries, idempotency keys)
37
+ - State machines and protocols (multi-step flows that must not be entered out of order)
38
+ - Security boundaries (input validation at trust boundaries, sanitization, sinks like SQL/HTML/shell)
39
+ - Algorithms with non-trivial invariants (custom sort/search variants, bespoke data structures)
40
+
41
+ MUST NOT apply by default to:
42
+
43
+ - Getters, setters, simple field accessors
44
+ - Glue code, plumbing, framework adapters
45
+ - UI components, presentational code
46
+ - One-off scripts, throwaway code
47
+ - Test bodies (the test name is the spec)
48
+
49
+ Annotation density has a real context-window cost on every read. MUST concentrate it where the rigorous-reasoning upgrade matters most.
50
+
51
+ ## Two surfaces per clause
52
+
53
+ Every clause has two surfaces riding the same line:
54
+
55
+ | Surface | Purpose |
56
+ |---------|---------|
57
+ | **Formal** (Unicode) | Activates the rigorous-reasoning latent circuits in readers with the bandwidth to parse it. Rarity is the activation. |
58
+ | **Prose** (gloss + assumptions) | Renders the same content as natural language **with assumptions named**, so readers with less parsing bandwidth derive the same conclusions without paying a notation tax. |
59
+
60
+ These are not redundant translations. **The prose surface carries assumptions, not just symbols.**
61
+
62
+ ### The gloss MUST name assumptions
63
+
64
+ When a clause depends on something the formal notation does not make explicit — a precondition the caller is presumed to satisfy, a sentinel value the contract treats as out-of-scope, a side condition the body relies on — the gloss MUST name it. Naming what is *not* part of the contract is as important as naming what is.
65
+
66
+ The formal notation cannot say "and X is assumed to hold." The gloss can. Examples:
67
+
68
+ - `@requires q ≥ 0 // q is non-negative; assumes caller validated input, no NaN check here`
69
+ - `@ensures result ∈ ℕ // result is a non-negative integer; overflow is caller's responsibility`
70
+ - `@invariant balance ≥ 0 // balance never negative; assumes no concurrent mutators outside this lock`
71
+
72
+ Without explicit assumption-naming, a reader must derive the assumptions from negative space — by inspecting callers, the implementation, related tests. That derivation is the parsing tax that costs careful reasoning capacity. The gloss eliminates the tax by naming it directly.
73
+
74
+ ### Why both surfaces
75
+
76
+ Dense formal notation can crowd out attention to factual claims when parsing it costs the reader most of their bandwidth — the cognitive cost of parsing symbols leaves less capacity for engaging with what the contract actually says. Pairing the formal surface with prose-and-assumptions means:
77
+
78
+ - Readers with high parsing bandwidth get the formal surface activating careful reasoning **plus** the gloss catching what the formal notation leaves implicit
79
+ - Readers with less parsing bandwidth derive the same conclusions from the prose alone, no tax paid
80
+ - Either reader gets a working contract; neither is left to derive the assumptions from negative space
81
+
82
+ This is also why the gloss helps human readers: a junior dev seeing `∀ x ∈ xs. x ≥ 0` cold has nowhere to grab. The same dev seeing `∀ x ∈ xs. x ≥ 0 // every x in xs is non-negative; assumes upstream validation` reads it once and starts building intuition for the symbol set with the assumptions made explicit.
83
+
84
+ ### Two pairing patterns
85
+
86
+ Pick one and apply consistently within a file.
87
+
88
+ ```ts
89
+ // Pattern A — inline assumption-naming gloss after the formal clause
90
+ @requires q ≥ 0 // q is non-negative; assumes caller validated input
91
+ @ensures result ∈ ℕ // result is a natural number; overflow is caller's responsibility
92
+ ```
93
+
94
+ ```ts
95
+ // Pattern B — bracketed gloss on the same line
96
+ @requires q ≥ 0 (q is non-negative; assumes caller validated input)
97
+ @ensures ∀ x ∈ items. x.qty ≥ 0 (every item has non-negative qty; empty array is allowed)
98
+ ```
99
+
100
+ Pattern A is denser; Pattern B reads more like prose. Either works.
101
+
102
+ See `vocabulary.md` for the full glyph palette, the clause-first derivation table, and per-language examples.
103
+
104
+ ## Order of preference
105
+
106
+ When a function deserves a contract, work down this list. MUST use the first form that expresses the property; SHOULD fall back to looser forms only when the tighter one cannot.
107
+
108
+ ### 1. Tighten the type system first
109
+
110
+ A type the compiler enforces beats a comment the compiler ignores. The model reads types the same way it reads contracts.
111
+
112
+ ```ts
113
+ // Loose type with comment contract
114
+ // @requires y !== 0
115
+ function divideUnchecked(x: number, y: number): number { return x / y; }
116
+
117
+ // Branded type: passing an unvalidated divisor is now a compile-time error
118
+ type NonZero = number & { readonly __brand: 'NonZero' };
119
+ function divide(x: number, y: NonZero): number { return x / y; }
120
+ ```
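A branded type only helps if there is a single place that creates it. The sketch below adds a validating constructor; `toNonZero` is a hypothetical name, and returning `null` on failure is one possible design (throwing would be another).

```typescript
type NonZero = number & { readonly __brand: 'NonZero' };

// Hypothetical smart constructor: the only place a plain number becomes NonZero.
// Rejects 0, NaN, and ±Infinity so the brand really implies a usable divisor.
function toNonZero(n: number): NonZero | null {
  return Number.isFinite(n) && n !== 0 ? (n as NonZero) : null;
}

function divide(x: number, y: NonZero): number {
  return x / y;
}

const divisor = toNonZero(4);
if (divisor !== null) {
  console.log(divide(10, divisor)); // 2.5
}
// divide(10, 4) would not compile: a plain number is not assignable to NonZero.
```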
121
+
122
+ Tools by language:
123
+
124
+ - **TypeScript** — branded types, narrow union types, `readonly`, template literal types, exhaustive `switch` over discriminated unions
125
+ - **PHP** — typed properties, `readonly`, enums (8.1+), Psalm template types
126
+ - **Python** — `Literal`, `Final`, `NewType`, `Annotated`, `Protocol`, `assert_never`
127
+
128
+ ### 2. Use the language's checker-enforced annotation
129
+
130
+ When the type system cannot express the property, reach for an annotation a real static analyzer enforces. Drift produces a tool error, not just an agent surprise.
131
+
132
+ | Language | Tool | Real annotations |
133
+ |----------|------|------------------|
134
+ | PHP | Psalm | `@psalm-assert`, `@psalm-pure`, `@psalm-immutable`, `@psalm-mutation-free` |
135
+ | PHP | PHPStan | `@phpstan-assert`, `@phpstan-pure`, generic types |
136
+ | TypeScript | tsc + ESLint | branded types, `eslint-plugin-functional` for purity, exhaustive `never` checks (e.g., a hand-written `assertNever` helper) |
137
+ | Python | mypy / pyright | `assert_type`, `TypeGuard`, `TypeIs`, `Never`, `@final` |
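For the TypeScript row, exhaustiveness over a discriminated union is usually enforced with a small user-defined helper that accepts `never`. The sketch below assumes a hypothetical `PriceState` union; `assertNever` is a common convention, not a built-in.

```typescript
// Hypothetical union used only for this illustration.
type PriceState =
  | { kind: 'loaded'; amount: number }
  | { kind: 'call-for-price' };

// User-defined helper: reachable only if a variant was left unhandled.
function assertNever(value: never): never {
  throw new Error(`Unhandled variant: ${JSON.stringify(value)}`);
}

function describePrice(state: PriceState): string {
  switch (state.kind) {
    case 'loaded':
      return `$${state.amount.toFixed(2)}`;
    case 'call-for-price':
      return 'Call for price';
    default:
      // Adding a new variant to PriceState makes this line a compile error.
      return assertNever(state);
  }
}

console.log(describePrice({ kind: 'loaded', amount: 12.5 })); // $12.50
console.log(describePrice({ kind: 'call-for-price' }));       // Call for price
```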
138
+
139
+ ### 3. Formal contract for the residual
140
+
141
+ For properties no checker can express — algebraic laws, multi-step protocol invariants, invariants over external state — leave a contract in the host language's native comment syntax with Unicode + ASCII gloss.
142
+
143
+ Annotation forms:
144
+
145
+ - `@requires <precondition>` — must hold of inputs at call time
146
+ - `@ensures <postcondition>` — guaranteed of result / observable state
147
+ - `@invariant <property>` — holds at loop head, between method calls, across state transitions
148
+ - `@trusted <param>` — value inlined into a security-sensitive sink; origin must be trusted code, never user input
149
+ - `@pure` / `@mutates X` / `@throws Y` / `@io` — effect declaration
150
+ - `@property <law>` — algebraic law (idempotence, associativity, commutativity, monotonicity)
151
+ - `@time <bound>` / `@space <bound>` — asymptotic complexity. Use `Θ(...)` for tight bound, `O(...)` for upper bound only, `Ω(...)` for lower bound. Most contracts write `O(...)`; SHOULD prefer `Θ(...)` when the bound is actually tight, because `O(n²)` is technically true of an `O(n)` function and that looseness primes the wrong reasoning.
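As an illustration of how several of these forms might sit together on one function, here is a minimal sketch. The function `discountCents` and its clauses are hypothetical, chosen only to show the shape; each gloss names the assumption the formal clause leaves implicit.

```typescript
/**
 * @requires subtotalCents ∈ ℕ           // whole, non-negative cents; assumes caller rejected NaN and fractions
 * @requires 0 ≤ percent ≤ 100           // percentage points; assumes caller clamped user input
 * @ensures  0 ≤ result ≤ subtotalCents  // discount never exceeds the subtotal; rounds down to whole cents
 * @pure
 * @time Θ(1)
 */
function discountCents(subtotalCents: number, percent: number): number {
  return Math.floor((subtotalCents * percent) / 100);
}

console.log(discountCents(1999, 15)); // 299
```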
152
+
153
+ See `vocabulary.md` for the clause-first derivation table (clause shape → encoding tier), the full glyph palette, and the complexity-notation glossary.
154
+
155
+ ## Style rule — form is the activation
156
+
157
+ The form MUST match formal-language conventions. Informal prose does not switch the reasoning mode. There are four tiers; the strongest pairs Unicode formal notation with an assumption-naming gloss:
158
+
159
+ - ✗ informal English: `// always positive` — neither activation nor assumptions
160
+ - ✗ ASCII formal alone: `@ensures result >= 0` — weak priming, no assumption-naming
161
+ - ◐ Unicode formal + translation gloss: `@ensures result ≥ 0 // result >= 0` — primes the formal surface, gloss is redundant translation
162
+ - ✓ Unicode formal + assumption-naming gloss: `@ensures result ≥ 0 // result is non-negative; overflow is caller's responsibility` — primes high-bandwidth readers AND gives low-bandwidth readers the same content as prose with assumptions explicit
163
+
164
+ The shape signals "this is a function to reason about formally," not "this is a function to skim and pattern-match." The gloss makes the contract robust across reader capacities.
165
+
166
+ ## Semantic-ambiguity second pass
167
+
168
+ After writing range and type constraints, ask: **does any constant or sentinel in this function carry more than one meaning?**
169
+
170
+ Formal annotations bias toward easy formal targets — range, type, sign. They miss *semantic overloading*: a single value standing for two distinct domain states. The blind spot is structural — `@requires x ≥ 0` cannot express "and `0` is distinct from `null`."
171
+
172
+ Common patterns to flag:
173
+
174
+ - `value ?? 0` (or `?? ""`, `?? -1`) where the coalesced-from value and the coalesced-to value carry different meanings downstream
175
+ - Sentinel integers (`-1` for "not found", `999` for "all", `0` for "default")
176
+ - Empty string vs missing string
177
+ - `0` returned from a counter that also legitimately returns `0`
178
+
179
+ When you find one, lift the sentinel into the type, for example `null | { kind: 'loaded'; price: PositiveAmount } | { kind: 'call-for-price' }`, so that the two states can no longer be represented by the same value. The formal annotation then regains coverage.
180
+
181
+ ### Real example
182
+
183
+ `@simpleapps-com/augur-utils` `derive-price.ts` was written with the contract treatment, including a self-aware note about IEEE-754 vs ℝ. It still missed this exact blind spot:
184
+
185
+ ```ts
186
+ const unitPrice = priceData.unitPrice ?? 0; // null collapses to 0
187
+ isCallForPrice: unitPrice === 0, // 0 means "call for price"
188
+ ```
189
+
190
+ The contract correctly enforces `unitPrice ≥ 0 ∧ Number.isFinite(unitPrice)` — but cannot say "and `null` is distinct from `0`," because `null` was already coalesced away. The fix is to handle `priceData.unitPrice === null` *before* the coalesce, lifting the two states into the type.
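One way the "handle `null` before the coalesce" fix could look is sketched below. This is not the actual `derive-price.ts` code: the type, the function name, and the choice to treat `null` as "missing" and `0` as "call for price" are assumptions made for illustration.

```typescript
// Hypothetical state type: each domain state gets its own variant.
type UnitPriceState =
  | { kind: 'missing' }
  | { kind: 'call-for-price' }
  | { kind: 'loaded'; unitPrice: number };

// Classify the raw value before any `?? 0` can merge null into 0.
function classifyUnitPrice(raw: number | null): UnitPriceState {
  if (raw === null) return { kind: 'missing' };
  if (raw === 0) return { kind: 'call-for-price' };
  return { kind: 'loaded', unitPrice: raw };
}

console.log(classifyUnitPrice(null).kind);  // missing
console.log(classifyUnitPrice(0).kind);     // call-for-price
console.log(classifyUnitPrice(19.99).kind); // loaded
```

Once the states are separate variants, a range clause such as `unitPrice ≥ 0` applies only to the `loaded` variant and no longer has to cover the sentinel.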
191
+
192
+ ## Drift is a defect — not a sync target
193
+
194
+ If the code contradicts an annotation, that is a bug. MUST decide which is wrong and fix it. MUST NOT silently rewrite the annotation to match incorrect code.
195
+
196
+ Unenforced annotations that drift do active harm — they prime future agents toward the wrong invariant. A wrong contract is worse than no contract.
197
+
198
+ When you find drift while editing:
199
+
200
+ 1. Read the contract carefully
201
+ 2. Read the code carefully
202
+ 3. Decide which one captures the intended behavior — look at call sites, tests, related code
203
+ 4. Fix whichever is wrong
204
+ 5. If you cannot tell which is intended, MUST stop and ask the user. MUST NOT guess.
205
+
206
+ ### Tautological postconditions
207
+
208
+ A related anti-pattern: postconditions that mirror the constructor. `@ensures result.kind === 'loaded'` on `function loaded(item) { return { kind: 'loaded', item } }` adds nothing — the constructor already guarantees it. Useful postconditions assert properties the *reader of the call* would not derive from the constructor alone (algebraic laws, conservation invariants, observable state changes). Restating the constructor adds bookkeeping without adding reasoning, and is a sign the contract was back-fitted rather than written clause-first.
209
+
210
+ ## Why mechanism #1 works (priming hypothesis)
211
+
212
+ LLM behavior is conditional on context shape. Models trained on verified-language corpora develop latent circuits for spec-then-implementation reasoning. Native-syntax contracts in the F\*/Lean/Dafny shape *plausibly* activate those circuits during code generation, review, and refactoring — even though the host language has no formal semantics.
213
+
214
+ **Rarity is the activation.** The Unicode glyphs (∀, ∃, ⟨⟩, ↦, ⊑, ≥, ≤, ≠, ⇒, ∧, ∨, ¬, ℕ, ℝ, ∈, ∉) co-occur in training data with theorem-prover output, type-theory papers, and formal-methods source. Sampling tokens with those glyphs pulls hidden state toward that neighborhood. ASCII transliterations (`forall`, `>=`, `=>`) appear in far more common code-review text; for AI priming, that familiarity dampens the shift.
215
+
216
+ The asymmetry: if the priming hypothesis holds, annotations added today become more valuable as proof-trained models improve. The developer does no extra work; the priming benefit grows with each model upgrade.
217
+
218
+ ## Why mechanism #2 works (test surfacing)
219
+
220
+ Distinct from priming. **Observed in practice:** when a function carries explicit clauses, the agent generates *new and better tests* on subsequent edits — tests it would not have surfaced reading the implementation alone.
221
+
222
+ Contracts make intent legible. Each clause is an oracle for at least one test class:
223
+
224
+ - Each `@requires` → input-boundary tests (`q < 0`, `q = 0`, `q = NaN`)
225
+ - Each `@ensures` → property to assert on the output
226
+ - Each `@invariant` → multi-step test (call sequence, then check invariant)
227
+ - Each `@trusted` → fuzz-test target on untrusted inputs
228
+ - Each `@time O(...)` → scaling benchmark (assert the bound holds at n=10, n=100, n=1000)
229
+ - Each `@space O(...)` → memory-stability test (assert the bound holds across input sizes)
230
+ - Clauses also expose edge cases by negation: `@requires q ≥ 0` makes the agent ask "what about q < 0? what about NaN?"
231
+
232
+ Without the contract, the agent has only the function body to inspect — and the body rarely announces its boundaries explicitly. With the contract, every clause is a test-case oracle. This effect is observable session-by-session — test count, edge-case coverage, mutation-test kill rate.
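To make the clause-to-test mapping concrete, here is a small sketch with one `@requires` and one `@ensures`, followed by the boundary checks those clauses suggest. The function `wholeUnits` and the helper `throwsRangeError` are hypothetical; in this variant the function enforces its own precondition instead of assuming the caller did.

```typescript
/**
 * @requires q ≥ 0       // q is non-negative and finite; NaN and ±Infinity are rejected here
 * @ensures  result ∈ ℕ  // result is a non-negative integer; fractional part is discarded
 */
function wholeUnits(q: number): number {
  if (!Number.isFinite(q) || q < 0) {
    throw new RangeError(`q must be a finite, non-negative number, got ${q}`);
  }
  return Math.floor(q);
}

// Tiny helper so the checks below read as one line each.
function throwsRangeError(fn: () => unknown): boolean {
  try {
    fn();
    return false;
  } catch (e) {
    return e instanceof RangeError;
  }
}

// Checks derived from @requires (boundary and negation of the clause).
console.log(throwsRangeError(() => wholeUnits(-1)));         // true
console.log(throwsRangeError(() => wholeUnits(Number.NaN))); // true
console.log(wholeUnits(0));                                  // 0

// Check derived from @ensures (result is a non-negative integer).
const units = wholeUnits(2.7);
console.log(Number.isInteger(units) && units >= 0);          // true
```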
233
+
234
+ ### Close cousin: improvement-opportunity surfacing
235
+
236
+ The same legibility that surfaces tests also surfaces *improvement opportunities*. A function annotated `@time O(n²)` invites the agent reading it to ask "can this be improved to `O(n log n)`?" The complexity contract makes the target visible. Without it, the agent maintains the function as written; with it, the agent has an explicit invariant to question. Same mechanism (legible intent → visible gap), different output (refactor candidates instead of test cases).
237
+
238
+ This is why complexity contracts are not just documentation. A function honestly labeled `@time O(n²)` and called from a hot loop is a refactor target the agent will surface on the next read.
239
+
240
+ ## Why mechanism #3 works (agent-reported value)
241
+
242
+ Agents themselves report, in active session use, that the contract content adds useful context for complex / load-bearing methods. The cost of carrying contracts in the context window is paid back in usefulness, not just consumed. This confirmation is qualitative, but it is a real-world signal that the practice pays off in routine session work.
243
+
244
+ ## Evidence and falsifiability
245
+
246
+ Mechanism #1 is **plausible but not directly measured for in-source contracts**. Cite-able adjacent evidence:
247
+
248
+ - **ContractEval** (arXiv 2510.12047) — LLM contract-satisfaction rises from 0% (vanilla) to ~50% when contracts are stated in the prompt. Demonstrates the mechanism works for *prompt-supplied* contracts.
249
+ - **Specification-Guided Repair of Dafny Programs with LLMs** (arXiv 2507.03659) — LLMs reason measurably better when Dafny pre/postconditions are present.
250
+ - **Type-Constrained Code Generation** (arXiv 2504.09246) — type annotations cut hallucinated APIs and compilation errors >50%. Supports "tighten the type system first."
251
+ - **CoT mech-interp** (arXiv 2402.18312, arXiv 2507.22928) — surface cues like "let's think step by step" route through identifiable internal circuits in larger models. Supports "surface form conditions reasoning mode."
252
+
253
+ **Falsifiable prediction:** annotating a load-bearing function with these contracts produces, on the next agent edit, fewer correctness regressions and/or more rigorous reasoning traces than the same function un-annotated.
254
+
255
+ **Practical signals before the formal experiment runs:**
256
+
257
+ - Test-count / edge-case-coverage / mutation-kill-rate delta between Unicode-bearing and ASCII-bearing contracts on equivalent functions (mechanism #2; observable session-by-session)
258
+ - Fewer correctness regressions in functions carrying contract annotations (mechanism #1; quarter-scale)
259
+ - Fewer "oops, missed the case where X" follow-up commits to annotated functions
260
+ - Agent feedback on whether the contracts are pulling weight in their reasoning (mechanism #3; per-session)
261
+
262
+ If signals trend the wrong way over a quarter of usage, the bet has not paid off and the skill SHOULD be downgraded or removed.
263
+
264
+ Until then, treat this skill as a defensible bet across three independent mechanisms.
265
+
266
+ ## See also
267
+
268
+ - **`vocabulary.md`** — full Unicode glyph palette, ASCII gloss patterns, clause-first derivation table, per-language examples
269
+ - **`audit.md`** — the audit modes (per-file + session-aware), invoked by `/contract-audit`
270
+ - **`apply.md`** — six-phase loop (Orient → Read → Draft → Trim → Discover → Verify) for adding contracts to existing code
@@ -0,0 +1,154 @@
1
+ # Code Contracts — Apply
2
+
3
+ The six-phase loop for adding contracts to existing code. Use when starting from a function that does not yet carry contracts and you have decided (manually or via `audit.md`) it is load-bearing enough to warrant them.
4
+
5
+ ## When to invoke this
6
+
7
+ Three triggers:
8
+
9
+ 1. **Audit recommended it.** A `/contract-audit` report named this file as a contract candidate.
10
+ 2. **You're already editing the file** for unrelated reasons and noticed it deserves contracts. Add them as part of the edit pass; do not split into a separate task.
11
+ 3. **A bug fix landed here recently.** The fix is the highest-quality signal that the function carries non-trivial invariants worth documenting. Walk the loop after the fix.
12
+
13
+ MUST NOT invoke broadly. Applying contracts to non-load-bearing code is itself a failure of the practice — it produces JSDoc bloat without the priming or test-surfacing payoff.
14
+
15
+ ## The six phases
16
+
17
+ Run them in order. Each phase has a discrete output that feeds the next.
18
+
19
+ ### Phase 1: Orient
20
+
21
+ Read the file's recent history. Bug fixes are bug-locus candidates — they reveal where the invariants matter.
22
+
23
+ ```
24
+ git -C repo log --oneline -10 -- <path>
25
+ ```
26
+
27
+ Flag any `fix:` commits as locus candidates. For each, read the commit message and the diff — what invariant was being violated? Capture it; phase 3 will turn it into a clause.
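Flagging the `fix:` commits can be done by reading the returned log directly. As a sketch of the same filter in code: the function name is hypothetical, the sample log lines and SHAs are invented, and the pattern assumes Conventional Commits style subjects (`fix:`, `fix(scope):`, `fix!:`).

```typescript
// Hypothetical helper: keeps only lines of `git log --oneline` output whose
// subject starts with a Conventional Commits "fix" type.
function fixCommits(onelineLog: string): string[] {
  return onelineLog
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => /^[0-9a-f]+ fix(\([^)]*\))?!?:/.test(line));
}

// Invented sample output; real SHAs and subjects will differ.
const sampleLog = [
  "a1b2c3d fix(pricing): treat null unit price as missing",
  "e4f5a6b feat: add bulk discount tiers",
  "0c9d8e7 fix: round tax after summing lines",
  "9f8e7d6 docs: fixture notes",
].join("\n");

console.log(fixCommits(sampleLog).length); // 2
```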
28
+
29
+ Re-verify any wiki rule that may apply (e.g., the project's PHP conventions for contract docblocks, naming conventions, helper patterns) before invoking it. Wiki content drifts; do not rely on stale recall.
30
+
31
+ **Output:** a short list of candidate invariants pulled from recent fix commits.
32
+
33
+ ### Phase 2: Read
34
+
35
+ Read three things, in order:
36
+
37
+ 1. **The target file** — full content, not skimmed.
38
+ 2. **The sibling methods it dispatches to** — if `processOrder` calls `validateLineItems` and `applyTax`, read both.
39
+ 3. **The heaviest callers** — find the top three call sites and read enough of each to understand what they assume about this function's behavior.
40
+
41
+ Reasoning from method names alone produces false recommendations. The contract clauses MUST come from observed behavior (the body) AND observed assumptions (the call sites). Skip this phase and the contracts will be plausible-but-wrong — the worst kind of contract.
42
+
43
+ **Output:** a working understanding of what the function actually does, what its callers assume, and any drift between intent and implementation.
44
+
45
+ ### Phase 3: Draft
46
+
47
+ Write the contract clauses, clause-first.
48
+
49
+ For load-bearing files, write a **file-level docblock** that frames the file's role in the system. For each load-bearing method, write a per-method contract:
50
+
51
+ ```php
52
+ /**
53
+ * @requires <precondition> (ASCII gloss)
54
+ * @ensures <postcondition> (ASCII gloss)
55
+ * @invariant <property> (ASCII gloss, where applicable)
56
+ *
57
+ * Footgun: <named footgun from phase 1 or 2 — bug that hit, sentinel ambiguity, etc.>
58
+ * (ASCII gloss of the footgun)
59
+ */
60
+ ```
61
+
62
+ Use Unicode glyphs paired with ASCII gloss per the dual-audience pattern in `SKILL.md`. See `vocabulary.md` for the glyph palette and per-language examples.
63
+
64
+ Cite specifics:
65
+
66
+ - The bug-fix commit (e.g., "Footgun: see fix in <sha> — empty array vs single-zero array were silently the same precondition")
67
+ - Sibling files where applicable (e.g., "see `OrderTotal.php` for how the result is consumed")
68
+ - Wiki rules where they apply (file:line citations)
69
+
70
+ **Output:** a draft docblock per load-bearing method, plus a file-level overview if the file's role warrants one.
71
+
72
+ ### Phase 4: Trim
73
+
74
+ Cut anything an agent could recover from reading the code itself. The contract MUST add information, not restate the implementation.
75
+
76
+ Concrete cuts:
77
+
78
+ - Postconditions that mirror the constructor (e.g., `@ensures result.kind === 'loaded'` on `function loaded(item) { return { kind: 'loaded', item } }`) — drop them.
79
+ - Range assertions that the type system already enforces (`@requires q ≥ 0` when the parameter type is `Natural`) — drop them.
80
+ - Prose that narrates the body (`// loop builds the running sum`) — drop them.
81
+ - Comments that explain syntax (`// using ?? for default`) — drop them.
82
+
83
+ What stays:
84
+
85
+ - Footguns that aren't visible in the body (sentinel ambiguities, cross-file trust boundaries, ordering invariants, conservation properties)
86
+ - Algebraic laws (idempotence, commutativity, monotonicity)
87
+ - Cross-method invariants
88
+ - Anything a future reader would not derive from the body alone
89
+
90
+ **Output:** a tight contract that earns every line.
91
+
92
+ ### Phase 5: Discover
93
+
94
+ Ask: **what invariant did you surface that does not belong in this docblock but should be captured as a wiki rule?**
95
+
96
+ The discover phase is the highest-leverage step and the easiest to skip. While writing contracts, you often surface unstated conventions — a parameter ordering rule, a naming convention, a trust assumption that applies across many files. These do not belong in any one docblock but are the kind of knowledge that evaporates on `/clear`.
97
+
98
+ For each discovered convention:
99
+
100
+ - Decide where it belongs (project wiki page, shared skill, a `README.md` next to the code)
101
+ - Write it down (or offer to write it down — the user MAY want to phrase it themselves)
102
+ - Cite the file/line that surfaced it
103
+
104
+ **Output:** zero or more wiki-bound captures, written or queued.
105
+
106
+ Examples of the kind of invariants this phase catches:
107
+
108
+ - A `$siteId`-first-parameter rule across helper functions that was implicit in the code but not documented
109
+ - A trust assumption that strings labeled "internal" never reach SQL without going through a quoter — true in practice, undocumented
110
+ - A "delete flag is `'Y'`, never `null`" convention that the code relies on but no test or comment names
111
+
112
+ ### Phase 6: Verify
113
+
114
+ For each clause written in phase 3 (and not cut in phase 4), ask: **what tests does this clause demand that the test suite does not yet have?**
115
+
116
+ Walk the clauses:
117
+
118
+ - Each `@requires` → name the missing input-boundary tests (negative inputs, NaN, empty, max, etc.)
119
+ - Each `@ensures` → name the missing output-property assertions
120
+ - Each `@invariant` → name the missing multi-step tests that exercise the invariant across call sequences
121
+ - Each `@trusted` → name the missing fuzz cases on untrusted-input simulation
122
+ - Each `@time O(...)` / `Θ(...)` → name the missing scaling benchmark (assert the bound holds at n=10, n=100, n=1000)
123
+ - Each `@space O(...)` / `Θ(...)` → name the missing memory-stability test across input sizes
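One way to make a `@time Θ(n)` claim falsifiable without flaky wall-clock timing is to count the steps a function takes at each input size. The sketch below assumes a hypothetical `total_cents` function and a hypothetical `StepCounter` helper; neither comes from the package.

```python
class StepCounter:
    """Counts how many elements a function touches (a stand-in for runtime)."""
    def __init__(self) -> None:
        self.steps = 0

def total_cents(items: list[tuple[int, int]], counter: StepCounter) -> int:
    """@time Θ(n)  # claimed: one step per line item, n = len(items)"""
    total = 0
    for qty, unit_price in items:
        counter.steps += 1
        total += qty * unit_price
    return total

def steps_at(n: int) -> int:
    """Run total_cents on n identical line items and report the step count."""
    counter = StepCounter()
    total_cents([(1, 100)] * n, counter)
    return counter.steps

# Scaling check at the sizes named in the clause walk: n = 10, 100, 1000.
sizes = [10, 100, 1000]
steps = [steps_at(n) for n in sizes]
# For a linear bound, steps per element stay roughly constant as n grows.
ratios = [s / n for s, n in zip(steps, sizes)]
is_linear = max(ratios) <= 2 * min(ratios)
```

If a refactor introduced a nested loop, the per-element ratio would grow tenfold between sizes and `is_linear` would turn false. The factor of 2 is an arbitrary tolerance chosen for illustration.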
124
+
125
+ This phase is the test-gap report for the contract. It is parallel to phase 5: phase 5 captures wiki-bound knowledge gaps, phase 6 captures test-bound knowledge gaps. Complexity-claim gaps are especially load-bearing — an unverified `@time Θ(n)` claim drifts to `O(n²)` silently across refactors.
126
+
127
+ The output of phase 6 is a list of tests to write. **Do not write them in this loop** — that's a separate task. The list is the artifact; whether the user writes the tests now or later is their call.
128
+
129
+ **Output:** a list of missing tests, one per clause, ordered by which would catch the most likely class of regression first.
130
+
131
+ ## Convention authority
132
+
133
+ The PHP-side convention for contract docblocks lives in `wiki/PHP-Conventions.md § Contract Docblocks for Load-Bearing Methods` (in the originating repo's wiki). MUST cite that page rather than duplicating its content here. This skill's role is the *workflow*; the *convention* is a per-repo artifact.
134
+
135
+ For TS and Python, the conventions are folded directly into `SKILL.md` and `vocabulary.md` because those languages do not yet have a per-repo conventions page that lives elsewhere.
136
+
137
+ ## Canonical examples
138
+
139
+ Reference the existing contracted methods so an agent can model new work on a verified example:
140
+
141
+ - `packages/roark/src/MathUtils.php::normalizeL2`
142
+ - `packages/roark/src/StringUtils.php::makeKey`
143
+ - `packages/roark/src/Helpers/CacheHelper.php` (file-level overview plus `tryLock`, `lock`, `isLocked`, `get`, `getLockKeyByName`)
144
+ - `packages/roark/src/Enums/CacheTtl.php` (file-level overview plus `toSeconds`)
145
+ - `packages/roark/src/Enums/StatusChar.php` (file-level overview plus `description`)
146
+ - `packages/open_search/src/Helpers/ItemsHelper.php` (file-level overview plus `applyIndexAction`, `getIndexAction`, `modifyIndex`)
147
+
148
+ (All paths are in the `simpleapps-com/augur` repo. Cross-repo read may be needed; the WIP-side `code-contracts-cluster.md` notes this as an open item.)
149
+
150
+ ## Reference
151
+
152
+ - `SKILL.md` — the writing skill (auto-triggered on load-bearing edits) and the three-mechanism framing
153
+ - `vocabulary.md` — full glyph palette + clause-first derivation table + per-language examples
154
+ - `audit.md` — the audit modes (per-file + session-aware)
@@ -0,0 +1,173 @@
1
+ # Code Contracts — Audit
2
+
3
+ The audit modes invoked by `/contract-audit`. Two modes; pick by argument:
4
+
5
+ | Mode | Trigger | What it does |
6
+ |------|---------|--------------|
7
+ | **Per-file** | `/contract-audit <file>` | Walks the file for security seams and contract gaps; produces contracts and a test-gap report |
8
+ | **Session-aware** | `/contract-audit` (no args) | Scans files **read or written in this session**; for each, evaluates load-bearing-ness and existing contracts, then gives a recommendation |
9
+
10
+ The session-aware mode is the high-leverage one: it surfaces contract candidates *while context is fresh*, instead of relying on the user to remember to run the audit later.
11
+
12
+ ## Frame the audit as bug discovery, not documentation
13
+
14
+ The value of the exercise is that the discipline of writing contracts forces real bugs to surface. The annotations themselves are a secondary artifact. Every audit MUST produce a *report*, not auto-applied annotations; the human stays in the loop on what to annotate vs. fix.
15
+
16
+ ## Per-file audit
17
+
18
+ For a given file, walk these four questions in order:
19
+
20
+ ### 1. Where do types not capture a runtime constraint?
21
+
22
+ Look for:
23
+
24
+ - `string` parameters inlined into SQL, HTML, shell, or other security-sensitive sinks — trusted vs. untrusted origin is invisible
25
+ - Coupled optional fields (e.g. `afterKey` requires `afterKeyColumn`)
26
+ - Non-null assertions (`!`) whose safety depends on a runtime guard elsewhere
27
+ - Sentinel values (`-1`, `0`, `999`, empty string) carrying domain meaning the type cannot express
28
+ - Numbers that should be ℕ, ℤ⁺, or a refinement (e.g. a percentage in `[0, 100]`)
29
+ - Loops over inputs of unknown size — implicit `O(n)` or `O(n²)` claims with no contract to make the cost visible
30
+
31
+ For each finding, propose a contract clause AND the type-level fix that would make the clause unnecessary (the order-of-preference rule from `SKILL.md`). For complexity findings, propose `@time` / `@space` clauses *and* flag the function as a refactor candidate if the asymptotic class is plausibly improvable (e.g., nested-loop O(n²) where a hash-keyed O(n) is reachable).
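To illustrate the refactor-candidate case, here is a sketch of a nested-loop function next to a hash-keyed alternative. Both functions and their names are hypothetical; the complexity clauses follow the `@time` / `@space` forms described in `vocabulary.md`, and the linear bound assumes hash-set lookups take constant expected time.

```python
def find_duplicates_quadratic(ids: list[str]) -> set[str]:
    """@time Θ(n²)  # compares every pair; n = len(ids)"""
    dupes = set()
    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):
            if ids[i] == ids[j]:
                dupes.add(ids[i])
    return dupes

def find_duplicates_linear(ids: list[str]) -> set[str]:
    """@time Θ(n) expected  # single pass; assumes constant expected-time hash lookups
    @space O(n)             # the seen-set can grow with the input
    """
    seen, dupes = set(), set()
    for item in ids:
        if item in seen:
            dupes.add(item)
        seen.add(item)
    return dupes

sample_ids = ["a", "b", "a", "c", "b"]
```

Writing the clause on the first version makes the cost visible; the second version shows the kind of improvement the audit would flag as reachable, at the price of extra auxiliary space.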
32
+
33
+ ### 2. Does the function's name and docstring match its actual behavior?
34
+
35
+ Read the function name, its docstring (if any), and its body. Look for:
36
+
37
+ - Plural names returning singular results (e.g. `guardrailPlugins` returning one plugin)
38
+ - Verbs that misstate the operation (`getX` that mutates, `isX` that returns a string)
39
+ - Docstrings describing intent that the body doesn't enforce
40
+ - Functions that do N+1 things when the name promises 1
41
+
42
+ For each finding, decide: rename the function OR rewrite the body. Drift between name and behavior is a defect by the same rule that applies to contracts.
43
+
44
+ ### 3. Does this file produce values consumed elsewhere with implicit contracts?
45
+
46
+ This is the **highest-value finding** and the hardest to surface. Walk every value the file emits — return values, exported constants, structures pushed into shared state, strings written to global queries, fields assigned on shared objects.
47
+
48
+ For each emitted value, ask:
49
+
50
+ - Where is it consumed? (Other files in the repo)
51
+ - What does the consumer assume about its shape, format, sanitization, or origin?
52
+ - Is that assumption written down anywhere?
53
+
54
+ The producer/consumer pair has an *implicit contract* in the gap. Surface it. Either annotate the producer with a `@trusted`/`@ensures` clause, or harden the consumer to defend against violation. Cross-file trust boundaries are where the worst bugs live.
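A minimal sketch of both options applied to one producer/consumer pair, assuming a hypothetical SQL `ORDER BY` sink. The names `TrustedColumn`, `ALLOWED_COLUMNS`, `trusted_column`, and `order_by_clause` are illustrative and not part of the package.

```python
from typing import NewType

# A column name vetted against an allowlist (hypothetical type).
TrustedColumn = NewType("TrustedColumn", str)

ALLOWED_COLUMNS = frozenset({"created_at", "total_cents", "status"})

def trusted_column(name: str) -> TrustedColumn:
    """Producer.

    @ensures result ∈ ALLOWED_COLUMNS  # safe to inline; raw user input never passes
    """
    if name not in ALLOWED_COLUMNS:
        raise ValueError(f"column not allowlisted: {name!r}")
    return TrustedColumn(name)

def order_by_clause(column: TrustedColumn) -> str:
    """Consumer.

    @trusted column  # inlined into SQL; must originate from trusted_column()
    """
    # NewType is erased at runtime, so the consumer re-checks at the sink
    # instead of relying on the annotation alone.
    if column not in ALLOWED_COLUMNS:
        raise ValueError("untrusted column reached the SQL sink")
    return f"ORDER BY {column}"
```

The producer annotation writes the implicit contract down; the consumer guard keeps the sink safe even when a caller bypasses the producer.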
55
+
56
+ ### 4. For each contract this audit produces, what tests does the contract demand that the test suite does not yet have?
57
+
58
+ This is the test-gap report. For each clause from questions 1–3:
59
+
60
+ - `@requires` → list missing input-boundary tests (negative inputs, NaN, empty, max, etc.)
61
+ - `@ensures` → list missing output-property assertions
62
+ - `@invariant` → list missing multi-step tests that exercise the invariant across calls
63
+ - `@trusted` → list missing fuzz cases on untrusted-input simulation
64
+ - `@time O(...)` / `@time Θ(...)` → list missing scaling benchmarks (assert the bound holds at n=10, n=100, n=1000)
65
+ - `@space O(...)` / `@space Θ(...)` → list missing memory-stability tests across input sizes
66
+
67
+ The test-gap pass turns the audit from "find bugs" into "find bugs *and* find the test that would have caught them next time." Complexity-claim gaps are especially load-bearing — without scaling tests, the claim is unfalsifiable and an `O(n²)` regression slips into an `O(n)`-claimed function unchecked.
68
+
69
+ ## When NOT to annotate
70
+
71
+ This section MUST appear in every audit report, before the per-section findings. Without it, the audit produces JSDoc bloat instead of signal.
72
+
73
+ Skip annotations on:
74
+
75
+ - **Pure transforms whose contract is fully expressed in the type signature.** A function `(n: NonZero) => number` already says `@requires n !== 0` in the type — no comment needed.
76
+ - **Test files.** The test name is the spec. Annotating tests adds noise.
77
+ - **Generated code.** Don't annotate; the generator is the spec.
78
+ - **One-off scripts and throwaway code.** Not load-bearing; not worth the context cost.
79
+ - **Glue code, plumbing, framework adapters, UI components, getters/setters.** Same scope rule as `SKILL.md`.
80
+
81
+ If the comment would just restate the type, it MUST NOT be added. The audit's signal is bugs surfaced and gaps in tests, not annotation count.
82
+
83
+ ## Output format
84
+
85
+ Every audit produces a markdown report with this structure:
86
+
87
+ ```markdown
88
+ # Audit: <file or session>
89
+
90
+ ## Scope
91
+ <file paths audited; load-bearing assessment per file>
92
+
93
+ ## When NOT to annotate (carryover)
94
+ <note any patterns in this audit that fall in the skip list — pure transforms whose type already says it, tests, generated, etc.>
95
+
96
+ ## Findings
97
+
98
+ ### 1. Types that don't capture runtime constraints
99
+ <list per-file findings; for each, propose contract + type-level fix>
100
+
101
+ ### 2. Name / docstring drift
102
+ <list per-file findings; for each, propose rename OR body fix>
103
+
104
+ ### 3. Cross-file trust boundaries (highest-value)
105
+ <list producer/consumer pairs with implicit contracts; propose @trusted/@ensures on the producer or hardening on the consumer>
106
+
107
+ ### 4. Test gaps
108
+ <list per-finding tests that the proposed contracts demand and the suite does not have>
109
+
110
+ ## Suggested next steps
111
+ <ordered list — usually: fix bugs found in section 1, address drift in 2, harden boundaries in 3, write missing tests in 4>
112
+ ```
113
+
114
+ The audit is **read-only**. MUST NOT auto-apply contracts or modify code or tests. The human reviews the report and decides what to act on.
115
+
116
+ ## Session-aware mode
117
+
118
+ When `/contract-audit` is invoked with no arguments, run a session-aware scan instead of per-file.
119
+
120
+ ### Discover the candidate set
121
+
122
+ Look at the files **read or written in this session** (using session context, recent tool-call history, or open editor state). Filter to:
123
+
124
+ - Source files in `repo/` (not test files, not config, not generated)
125
+ - Files with at least one substantive read or edit this session
126
+ - Files large enough to plausibly contain load-bearing functions (skip files < ~50 lines unless the agent has reason to believe they're security-critical)
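The filter above could be sketched as a predicate over session files. This is an illustration under stated assumptions: the `SessionFile` record, the test-file heuristics, and the 50-line default are hypothetical, and detection of config and generated files is left out.

```python
from dataclasses import dataclass

@dataclass
class SessionFile:
    path: str
    line_count: int
    touched: bool                    # substantively read or edited this session
    security_critical: bool = False  # agent's own judgement, overrides the size cutoff

def is_candidate(f: SessionFile, min_lines: int = 50) -> bool:
    """Apply the candidate-set filter; heuristics here are assumptions."""
    in_repo = f.path.startswith("repo/")
    is_test = "/tests/" in f.path or f.path.endswith((".test.ts", "Test.php"))
    big_enough = f.line_count >= min_lines or f.security_critical
    return in_repo and not is_test and f.touched and big_enough
```

The size cutoff is approximate in the text (`~50 lines`), so the default is a parameter and not a fixed rule.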
127
+
128
+ ### Score each file
129
+
130
+ For each candidate, evaluate:
131
+
132
+ | Dimension | Question |
133
+ |-----------|----------|
134
+ | Load-bearing-ness | Does this file contain money math, auth, concurrency, state machines, security boundaries, or non-trivial algorithms? |
135
+ | Existing contracts | Are there already `@requires` / `@ensures` / `@invariant` / `@trusted` clauses? |
136
+ | Worth adding | If no contracts present, would adding them surface a bug or fill a test gap? |
137
+
138
+ The agent has been in the file's context this session — use that. Don't re-derive load-bearing-ness from cold.
139
+
140
+ ### Output: ranked recommendations
141
+
142
+ Produce a ranked list:
143
+
144
+ ```markdown
145
+ # Session-aware contract audit — N candidates
146
+
147
+ ## Recommended (highest leverage)
148
+
149
+ 1. **`<path>`** — <one-line rationale: load-bearing reason + observed gap>
150
+ Suggested action: <add contracts to function X; harden the boundary at line Y; etc.>
151
+
152
+ 2. **`<path>`** — ...
153
+
154
+ ## Already covered
155
+
156
+ <files with contracts already in place — note any drift the agent observed during the session>
157
+
158
+ ## Skip (not load-bearing)
159
+
160
+ <files in the candidate set that don't meet the scope rule; brief reason>
161
+ ```
162
+
163
+ Keep the rationale short. The user picks the top item and runs `/contract-audit <path>` to do a per-file deep dive.
164
+
165
+ ### Why this mode pays off
166
+
167
+ The user does not have to remember to run an audit. The agent surfaces candidates while the context is fresh. Mechanism #3 (agent-reported value) is put into practice here: the agent saw the file in this session and can score it; a cold audit cannot.
168
+
169
+ ## Reference
170
+
171
+ - `SKILL.md` — the writing skill (auto-triggered on load-bearing edits)
172
+ - `vocabulary.md` — full glyph palette + clause-first derivation table + per-language examples
173
+ - `apply.md` — six-phase loop for adding contracts to existing code (when this audit recommends "add contracts to function X")
@@ -0,0 +1,196 @@
1
+ # Code Contracts — Vocabulary Reference
2
+
3
+ The full glyph palette, ASCII gloss patterns, clause-first derivation table, and per-language examples. Loaded by `SKILL.md` on demand.
4
+
5
+ ## Glyph palette
6
+
7
+ A small, opinionated set. MUST stay narrow — each glyph the agent emits should be one a reader has seen elsewhere in this codebase, not a novelty.
8
+
9
+ | Family | Glyphs | ASCII gloss |
10
+ |--------|--------|-------------|
11
+ | Logic | `∀`, `∃`, `∧`, `∨`, `¬`, `⇒`, `⇔` | forall, exists, and, or, not, implies, iff |
12
+ | Comparison | `≤`, `≥`, `≠`, `≡`, `≜` | <=, >=, !=, ==, := (definition) |
13
+ | Sets | `∈`, `∉`, `⊆`, `⊂`, `∪`, `∩`, `∅` | in, not in, subset of, proper subset, union, intersection, empty |
14
+ | Numbers | `ℕ`, `ℤ`, `ℚ`, `ℝ` | natural, integer, rational, real |
15
+ | Functions | `→`, `↦`, `∘` | function-of, mapsto, compose |
16
+ | Brackets | `⟨ ⟩`, `⟦ ⟧` | tuple, denotation |
17
+ | Complexity | `Θ`, `O`, `Ω`, `ω`, `~`, superscripts (`²`, `³`, `ⁿ`) | tight bound, upper bound, lower bound, strictly lower, asymptotic equivalence, exponentiation |
18
+ | Limits | `→ ∞`, `lim` | "as n grows without bound" |
19
+
20
+ ### What stays ASCII
21
+
22
+ - Tag prefixes (`@requires`, `@ensures`, `@invariant`, `@trusted`, `@pure`) — for tooling compatibility (Psalm, PHPStan, JSDoc tooling, mypy, pyright)
23
+ - Operators inside type signatures (TS/PHP/Python syntax — `&`, `|`, `:`, `?`, etc.)
24
+ - Inline gloss text on every clause (the dual-audience pattern)
25
+
26
+ Unicode lives inside the *clause body*. ASCII gloss lives on the same or next line.
27
+
28
+ ### Common glyph confusions to avoid
29
+
30
+ - `⇒` (implication) vs `→` (function arrow) vs `↦` (mapsto). Use `⇒` for logical implication, `→` for "function from A to B," `↦` for "x mapsto f(x)."
31
+ - `≡` (logical equivalence / equal-by-definition in some traditions) vs `≜` (definition). Prefer `≜` for definitions, `≡` for "same up to" relations.
32
+ - `⊆` (subset-or-equal) vs `⊂` (proper subset). Most contract uses want `⊆`.
33
+ - `Θ(f)` (tight bound) vs `O(f)` (upper bound only) vs `Ω(f)` (lower bound). Most code calling itself "O(n log n)" is actually `Θ(n log n)`: the bound is tight, not just an upper limit. SHOULD use `Θ` when the bound is tight; reserve `O` for cases where the function may run *faster* than f (a tight bound also satisfies `O`, but `O` states less). Loose `O` claims prime the wrong reasoning ("this is at most O(n²)" when the agent should think "this is exactly Θ(n log n)").
34
+
35
+ ## Pairing patterns
36
+
37
+ Pick one and apply consistently within a file. **The gloss MUST carry assumption-naming, not just translation** (see `SKILL.md` § "Two surfaces per clause"). The Unicode formal surface activates careful reasoning in readers with bandwidth to parse it; the prose gloss carries the same content for readers without that bandwidth, plus the assumptions the formal notation cannot express.
38
+
39
+ ### Pattern A — inline assumption-naming gloss after the formal clause
40
+
41
+ ```ts
42
+ @requires q ≥ 0 // q is non-negative; assumes caller validated input
43
+ @ensures result ∈ ℕ // result is a natural number; overflow is caller's responsibility
44
+ ```
45
+
46
+ Denser. Reads as a column of formal clauses with the prose-and-assumptions as a sidebar.
47
+
48
+ ### Pattern B — bracketed gloss on the same line
49
+
50
+ ```ts
51
+ @requires q ≥ 0 (q is non-negative; assumes caller validated input)
52
+ @ensures ∀ x ∈ items. x.qty ≥ 0 (every item has non-negative qty; empty array is allowed)
53
+ ```
54
+
55
+ Less dense. Reads more like prose. Better when the clauses are short and the assumption-naming is short.
56
+
57
+ ### What the gloss is NOT
58
+
59
+ A redundant translation. `@requires q ≥ 0 // q >= 0` is the wrong gloss — both lines say the same thing, neither names what is assumed. The gloss MUST add at least one of:
60
+
61
+ - An assumption the formal notation does not express (no NaN, integer not float, validated upstream)
62
+ - A side condition (locking, transactional context, ordering)
63
+ - A boundary the contract treats as out-of-scope (overflow, empty input, sentinels)
64
+
65
+ If the gloss would only restate the clause, as in `@requires q ≥ 0 // q >= 0`, drop it; it's bookkeeping, not load-bearing prose.
66
+
67
+ ## Clause-first derivation table
68
+
69
+ Write the clause first. The encoding falls out of the clause shape.
70
+
71
+ | Clause shape | Tier | Encoding |
72
+ |--------------|------|----------|
73
+ | Refinement: `q : ℕ ∧ q > 0` *(q is a positive natural)* | 1 | Branded type + smart constructor: `type PositiveInt = number & { readonly __brand: 'PositiveInt' }; function mkPositive(n: number): PositiveInt` |
74
+ | Sum: `Status ≜ Loaded(Item) ∨ NotFound ∨ CallForPrice` *(tagged union of three states)* | 1 | Discriminated union: `{ kind: 'loaded'; item: Item } \| { kind: 'not-found' } \| { kind: 'call-for-price' }` |
75
+ | Predicate: `∀ x ∈ xs. p(x)` *(every x in xs satisfies p)* | 2 | `xs.every(p)` |
76
+ | Predicate: `∃ x ∈ xs. p(x)` *(some x in xs satisfies p)* | 2 | `xs.some(p)` |
77
+ | Effect: `pure`, `mutates X`, `throws Y` | 2 | `@psalm-pure`, `@psalm-mutation-free`, `eslint-plugin-functional` |
78
+ | Otherwise (algebraic law, multi-step protocol invariant, external state) | 3 | Formal-prose annotation in JSDoc / docstring |
79
+
80
+ The discipline: write the clause **before** the function. Reversing the order — writing the function and back-fitting a contract — bypasses the cognitive work and produces tautological annotations.
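The table's encodings are written in TypeScript. A rough Python analogue of the tier 1 refinement row and the tier 2 predicate rows might look like the sketch below; `PositiveInt`, `mk_positive`, `all_positive`, and `some_positive` are hypothetical names, and `NewType` is only checked by static type checkers, not at runtime.

```python
from typing import NewType

# Tier 1, refinement: q : ℕ ∧ q > 0, encoded as a NewType plus a smart constructor.
PositiveInt = NewType("PositiveInt", int)

def mk_positive(n: int) -> PositiveInt:
    """The only sanctioned way to obtain a PositiveInt."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(n, bool) or not isinstance(n, int) or n <= 0:
        raise ValueError(f"expected a positive integer, got {n!r}")
    return PositiveInt(n)

# Tier 2, predicates: ∀ maps to all(), ∃ maps to any().
def all_positive(xs: list[int]) -> bool:
    return all(x > 0 for x in xs)   # ∀ x ∈ xs. x > 0 (vacuously true for empty xs)

def some_positive(xs: list[int]) -> bool:
    return any(x > 0 for x in xs)   # ∃ x ∈ xs. x > 0 (false for empty xs)
```

The empty-collection behaviour is worth naming in the gloss: `all` accepts an empty list and `any` rejects it, which is exactly the kind of assumption the formal clause does not state.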
81
+
82
+ ## Per-language examples
83
+
84
+ ### TypeScript
85
+
86
+ ```ts
87
+ /**
88
+ * Compute order total in cents.
89
+ *
90
+ * @requires items.length > 0 // at least one line item; empty cart is caller's responsibility
91
+ * @requires ∀ i ∈ items. i.qty > 0 // every item has positive qty; assumes upstream validation
92
+ * @ensures result === Σ(items, i ↦ i.qty × i.unitPrice)
93
+ * // result is the sum of qty × unitPrice; assumes integer cents, no rounding here
94
+ * @ensures result ∈ ℕ // result is non-negative; overflow is caller's responsibility
95
+ * @time Θ(n) // linear in n = items.length
96
+ * @space O(1) // auxiliary space — accumulator only
97
+ * @pure
98
+ */
99
+ function totalCents(items: LineItem[]): number { ... }
100
+ ```
101
+
102
+ ### PHP
103
+
104
+ ```php
105
+ /**
106
+ * Rebuild the latest snapshot from the event log.
107
+ *
108
+ * @requires $events is sorted ascending by occurredAt
109
+ * (events are chronological; ordering is caller's responsibility, not validated here)
110
+ *
111
+ * @ensures result ≜ fold($events, replay)
112
+ * (result is the snapshot replayable from the event log; assumes replay is deterministic)
113
+ *
114
+ * @invariant during fold, accumulator state ≡ replay($events[0..i])
115
+ * (at each step i, accumulator equals replay of events 0 through i; holds only if events are immutable during fold)
116
+ *
117
+ * @time Θ(n) (linear fold over events; n = count($events))
118
+ * @space O(1) (running accumulator, no buffering of events)
119
+ * @pure
120
+ *
121
+ * Footgun: $events === ∅ vs $events === [genesis] are not the same precondition.
122
+ * (empty array means "no genesis"; caller must distinguish from "not yet bootstrapped")
123
+ */
124
+ function rebuild(array $events): Snapshot { ... }
125
+ ```
126
+
127
+ ### Python
128
+
129
+ ```python
130
+ def transfer(src: Account, dst: Account, cents: int) -> None:
131
+ """
132
+ Transfer cents from src to dst, atomically.
133
+
134
+ @requires cents > 0 # cents must be positive; zero-amount is caller's responsibility
135
+ @requires src.balance ≥ cents # sufficient funds; assumes balance was read inside the same lock
136
+ @requires src ≠ dst # src and dst differ; self-transfer is caller's bug, not ours
137
+ @ensures src.balance ≡ old(src.balance) − cents
138
+ # src.balance decreases by cents; assumes no concurrent mutators outside this lock
139
+ @ensures dst.balance ≡ old(dst.balance) + cents
140
+ # dst.balance increases by cents; same locking assumption
141
+ @invariant src.balance + dst.balance ≡ old(src.balance) + old(dst.balance)
142
+ # total balance is conserved; holds across the atomic boundary, not mid-flight
143
+ @time Θ(1) # constant; two balance updates
144
+ @space Θ(1)
145
+ @mutates src, dst
146
+ """
147
+ ```
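The docstring above states the contract but has no body. A minimal sketch of a body that checks the clauses at runtime is shown here, assuming a hypothetical `Account` dataclass holding integer cents and a single-threaded caller; the locking that the glosses refer to is deliberately left out.

```python
from dataclasses import dataclass

@dataclass
class Account:
    balance: int  # integer cents (assumed representation)

def transfer(src: Account, dst: Account, cents: int) -> None:
    # @requires clauses, checked at the boundary.
    if cents <= 0:
        raise ValueError("cents must be positive")
    if src is dst:  # identity check: src ≠ dst means two distinct accounts
        raise ValueError("src and dst must differ")
    if src.balance < cents:
        raise ValueError("insufficient funds")

    old_total = src.balance + dst.balance  # old(...) snapshot for the invariant
    src.balance -= cents
    dst.balance += cents
    # @invariant: total balance is conserved across the transfer.
    assert src.balance + dst.balance == old_total

a, b = Account(500), Account(100)
transfer(a, b, 200)
```

Raising on violated preconditions is one design choice among several; the contract itself assigns those cases to the caller, so a production version might rely on tests or a type-level encoding instead.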
148
+
149
+ ## Annotation forms reference
150
+
151
+ - `@requires <precondition>` — must hold of inputs at call time
152
+ - `@ensures <postcondition>` — guaranteed of result / observable state
153
+ - `@invariant <property>` — holds at loop head, between method calls, across state transitions
154
+ - `@trusted <param>` — value inlined into a security-sensitive sink (SQL, HTML, shell); origin must be trusted code, never user input
155
+ - `@pure` — no observable effects (no mutation, no IO, no throws under normal inputs)
156
+ - `@mutates <state>` — names the state mutated (a parameter, a field, a global)
157
+ - `@throws <type>` — names the exceptions that may be raised
158
+ - `@io` — performs IO (filesystem, network, console)
159
+ - `@property <law>` — algebraic law the function satisfies (idempotence, commutativity, associativity, monotonicity)
160
+ - `@time <bound>` — runtime complexity. Use `Θ(...)` for tight bound, `O(...)` for upper bound only, `Ω(...)` for lower bound. Amortized: `@time Θ(1) amortized`. Worst/avg split: `@time worst Θ(n²) avg Θ(n log n)`.
161
+ - `@space <bound>` — auxiliary space complexity (excluding input). Same `Θ`/`O`/`Ω` conventions.
162
+
163
+ ## Glossary
164
+
165
+ A reader unfamiliar with the symbols can use this section. Over repeated exposure these become routine.
166
+
167
+ | Symbol | Reads as | Means |
168
+ |--------|----------|-------|
169
+ | `∀ x ∈ S. P(x)` | "for all x in S, P of x" | every element of S satisfies the predicate P |
170
+ | `∃ x ∈ S. P(x)` | "there exists x in S such that P of x" | at least one element of S satisfies P |
171
+ | `A ⇒ B` | "A implies B" | if A then B |
172
+ | `A ⇔ B` | "A iff B" | A holds exactly when B holds |
173
+ | `A ∧ B` | "A and B" | both A and B hold |
174
+ | `A ∨ B` | "A or B" | at least one of A, B holds |
175
+ | `¬A` | "not A" | A does not hold |
176
+ | `x ∈ S` | "x is in S" | x is an element of S |
177
+ | `S ⊆ T` | "S is a subset of T" | every element of S is in T |
178
+ | `S ∪ T` | "S union T" | elements in S or T (or both) |
179
+ | `S ∩ T` | "S intersect T" | elements in both S and T |
180
+ | `∅` | "empty" | the empty set or empty collection |
181
+ | `ℕ` | "naturals" | non-negative integers (0, 1, 2, …) |
182
+ | `ℤ` | "integers" | …, -1, 0, 1, … |
183
+ | `ℝ` | "reals" | the real numbers (note IEEE-754 ≠ ℝ) |
184
+ | `f : A → B` | "f from A to B" | f is a function from set A to set B |
185
+ | `x ↦ f(x)` | "x mapsto f of x" | the function that maps x to f(x) |
186
+ | `≜` | "is defined as" | left side is defined to mean the right |
187
+ | `≡` | "equivalent to" | the two are interchangeable in this context |
188
+ | `Σ(xs, f)` | "sum of f over xs" | sum of f(x) for each x in xs |
189
+ | `Θ(f(n))` | "Theta of f of n" | tight asymptotic bound: function grows at the same rate as f(n), up to constant factors |
190
+ | `O(f(n))` | "Big-O of f of n" | upper bound only — function grows at most as fast as f(n) |
191
+ | `Ω(f(n))` | "Big-Omega of f of n" | lower bound — function grows at least as fast as f(n) |
192
+ | `ω(f(n))` | "little-omega of f of n" | strict lower bound: function grows strictly faster than f(n) |
193
+ | `f ~ g` | "f is asymptotic to g" | f(n)/g(n) → 1 as n → ∞ |
194
+ | `n → ∞` | "as n grows without bound" | the limiting case for asymptotic claims |
195
+
196
+ The glossary is intentionally short. Other symbols MAY be added when load-bearing across multiple files; bare additions for a single use SHOULD be avoided.
@@ -56,7 +56,7 @@ The parent `{project}/` is NOT a git repo. It keeps code and wiki side-by-side.
56
56
  | Temporary files | `tmp/` | Scratch space: commit msgs, PR bodies, intermediate output. Full access. |
57
57
  | SimpleApps config | `.simpleapps/` | Settings, site profile, credentials (see below) |
58
58
 
59
- **WIP**: Research, plans, decisions, test results. MUST NOT contain secrets, final docs, or code.
59
+ **WIP**: Research, plans, decisions, test results. MUST NOT contain secrets, final docs, or code. See `simpleapps:wip` for the frontmatter schema, status lifecycle, retention rule, and daily processing via `/process-wips`.
60
60
 
61
61
  **tmp/**: Fully available for scratch work: commit messages, PR bodies, issue comments, intermediate output, and any throwaway files. Read, write, and delete freely without asking. Create the folder if missing. Clean up files after use.
62
62
 
@@ -148,30 +148,30 @@ Every project SHOULD configure `.claude/settings.local.json` with these deny rul
148
148
  },
149
149
  "permissions": {
150
150
  "allow": [
151
- "Bash(pnpm:*)",
151
+ "Bash(basename:*)",
152
+ "Bash(dirname:*)",
153
+ "Bash(find:*)",
154
+ "Bash(grep:*)",
152
155
  "Bash(ls:*)",
153
- "Bash(wc:*)",
156
+ "Bash(lsof:*)",
154
157
  "Bash(md5:*)",
155
158
  "Bash(md5sum:*)",
159
+ "Bash(pnpm:*)",
160
+ "Bash(pwd:*)",
156
161
  "Bash(readlink:*)",
162
+ "Bash(rg:*)",
163
+ "Bash(wc:*)",
157
164
  "Bash(which:*)",
158
- "Bash(basename:*)",
159
- "Bash(dirname:*)",
160
- "Bash(pwd:*)",
161
- "Bash(lsof:*)",
162
165
  "mcp__plugin_simpleapps_augur-api__*"
163
166
  ],
164
167
  "deny": [
165
168
  "Bash(awk:*)",
166
169
  "Bash(cat:*)",
167
170
  "Bash(cd:*)",
168
- "Bash(find:*)",
169
171
  "Bash(for:*)",
170
- "Bash(grep:*)",
171
172
  "Bash(head:*)",
172
173
  "Bash(kill:*)",
173
174
  "Bash(pkill:*)",
174
- "Bash(rg:*)",
175
175
  "Bash(sed:*)",
176
176
  "Bash(sleep:*)",
177
177
  "Bash(tail:*)",
@@ -187,16 +187,17 @@ Why each is denied:
187
187
  - **`awk`**: Use the Edit tool instead.
188
188
  - **`cat`**: Use the Read tool instead.
189
189
  - **`cd`**: MUST NOT use in any Bash command, including compound commands (`cd /path && git`). Use `git -C repo` for git, path arguments for everything else. Compound cd+git commands trigger an unblockable Claude Code security prompt that interrupts the user even when `cd` is denied.
190
- - **`find`**: Use the Glob tool instead.
191
- - **`for`**: Shell loops are unnecessary; use dedicated tools or make multiple tool calls instead.
192
- - **`grep`**: Use the Grep tool instead.
190
+ - **`for`**: Shell loops are unnecessary; make multiple tool calls instead.
193
191
  - **`head`/`tail`**: Use the Read tool with `offset` and `limit` parameters instead.
194
192
  - **`kill`/`pkill`**: Use `TaskStop` to manage background processes. `TaskStop` cleanly shuts down the task and updates Claude Code's internal tracking.
195
- - **`rg`**: Use the Grep tool instead (it uses ripgrep internally).
196
193
  - **`sed`**: Use the Edit tool instead.
197
194
  - **`sleep`**: Unnecessary; use proper sequencing or background tasks.
198
195
  - **`Edit(~/.claude/plugins/**)` / `Write(~/.claude/plugins/**)`**: The installed plugin tree is a cache. Marketplace updates clobber it. To change plugin behavior, edit the plugin's source repo (e.g., `~/projects/simpleapps/augur-skills/`) instead.
199
196
 
197
+ Why `find`, `grep`, and `rg` are now ALLOWED (previously denied):
198
+
199
+ Claude Code 2.1.117 removed the dedicated Grep and Glob tools. Search is now done via Bash. Denying `grep`/`find`/`rg` while no dedicated alternative exists makes the agent unable to search anything. These commands are allowed by default; the bash-simplicity skill still applies (one command per call, no shell plumbing).
200
+
200
201
  ## Bin Scripts (PATH)
201
202
 
202
203
  The augur-skills plugin includes shell scripts (`cld`, `cldo`, `tmcld`, etc.) in `plugins/simpleapps/bin/`. When installed via the Claude Code marketplace, these live at:
@@ -188,12 +188,12 @@ Every wiki on the machine is a local knowledge base. When looking for how someth
188
188
  2. Pull the latest for all wikis before searching:
189
189
  - `git -C {projectRoot}/clients/*/wiki pull` (one call per wiki, not a glob)
190
190
  - `git -C {projectRoot}/simpleapps/*/wiki pull`
191
- 3. Search across all wikis with Grep:
192
- - `Grep(pattern: "...", path: "{projectRoot}/clients", glob: "*/wiki/*.md")`
193
- - `Grep(pattern: "...", path: "{projectRoot}/simpleapps", glob: "*/wiki/*.md")`
191
+ 3. Search across all wikis with Bash `grep`:
192
+ - `grep -rn --include="*.md" "<pattern>" {projectRoot}/clients/*/wiki/`
193
+ - `grep -rn --include="*.md" "<pattern>" {projectRoot}/simpleapps/*/wiki/`
194
194
  4. Read the matching pages to get the full context
195
195
 
196
- Use Glob to discover which projects have wikis: `Glob(pattern: "{projectRoot}/clients/*/wiki")` and `Glob(pattern: "{projectRoot}/simpleapps/*/wiki")`.
196
+ Discover which projects have wikis with Bash `ls`: `ls -d {projectRoot}/clients/*/wiki` and `ls -d {projectRoot}/simpleapps/*/wiki`.
197
197
 
198
198
  The wikis are kept fresh by `/curate-wiki` runs across projects. Searching locally is instant and requires no internet access; the knowledge is already on the machine.
199
199
 
@@ -0,0 +1,125 @@
1
+ ---
2
+ name: wip
3
+ description: WIP file conventions. Frontmatter schema, status lifecycle, retention, promotion to wiki, and daily processing rules. Use when creating, updating, or retiring files in wip/.
4
+ ---
5
+
6
+ # WIP
7
+
8
+ WIP files (`wip/*.md`) hold active task context: the issue or request, investigation notes, the plan, implementation decisions, and post-ship notes. They are gitignored and local to each project. This skill defines the format and lifecycle so tools and agents treat them consistently.
9
+
10
+ ## Frontmatter schema
11
+
12
+ Every WIP file MUST start with YAML frontmatter. The `/wip` command writes it on scaffold; other commands update it as the work progresses.
13
+
14
+ ```yaml
15
+ ---
16
+ issue: https://github.com/simpleapps-com/<repo>/issues/<N>
17
+ branch: <type>/<N>-<slug>
18
+ status: open
19
+ created: 2026-04-22
20
+ last_reviewed: 2026-04-22
21
+ shipped_at:
22
+ pr:
23
+ disposition:
24
+ wiki_candidates:
25
+ ---
26
+ ```
27
+
28
+ Fields:
29
+
30
+ | Field | Values | Written by | Notes |
31
+ |-------|--------|------------|-------|
32
+ | `issue` | URL or empty | `/wip` | GitHub URL, Basecamp URL, or empty for freeform WIPs |
33
+ | `branch` | string | `/wip`, `/implement` | Feature branch the work lands on |
34
+ | `status` | `open` \| `in-progress` \| `shipped` \| `abandoned` | lifecycle commands | See state machine below |
35
+ | `created` | ISO date | `/wip` | Never changes |
36
+ | `last_reviewed` | ISO date | every lifecycle command | Used to detect stale WIPs |
37
+ | `shipped_at` | ISO date or empty | `/submit` (after CI green) | Set exactly once, when status flips to `shipped` |
38
+ | `pr` | URL or SHA | `/submit` | PR URL if one exists, otherwise the commit SHA |
39
+ | `disposition` | `keep` \| `promote` \| `delete` or empty | `/process-wips` or user | Empty means "not yet decided" |
40
+ | `wiki_candidates` | comma-separated wiki page names | `/process-wips` (draft) or user | Optional; only meaningful when `disposition: promote` |
41
+
42
+ Freeform WIPs (no issue) leave `issue`, `branch`, and `pr` empty. Everything else still applies.
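As a rough illustration of the schema above, the flat `key: value` frontmatter can be read with the standard library alone. This is a minimal sketch, not the plugin's actual parser: the function name `parse_frontmatter`, the sample repository name `example-repo`, and issue number 42 are all hypothetical, and it assumes every field is a single-line scalar as in the template.

```python
def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` frontmatter delimited by `---` lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no frontmatter at all (legacy file)
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing delimiter reached
        # Split on the first colon only, so URLs in the value stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # assumption: a missing closing delimiter counts as malformed


# Hypothetical sample; the repo name and issue number are placeholders.
SAMPLE = """---
issue: https://github.com/simpleapps-com/example-repo/issues/42
branch: feat/42-example
status: open
created: 2026-04-22
last_reviewed: 2026-04-22
shipped_at:
pr:
disposition:
wiki_candidates:
---
# Notes
"""

fields = parse_frontmatter(SAMPLE)
```

Empty fields come back as empty strings, which matches the "empty means not yet decided" reading of `disposition`.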
43
+
44
+ ## Status lifecycle
45
+
46
+ ```
47
+ open ──┬─▶ in-progress ──▶ shipped ──▶ (retained 7d) ──▶ retired
48
+ └─▶ abandoned ──▶ (retained 7d) ──▶ retired
49
+ ```
50
+
51
+ | Status | Set when | Set by |
52
+ |--------|----------|--------|
53
+ | `open` | Scaffold — issue exists, no work started | `/wip` |
54
+ | `in-progress` | Research or code work is underway | `/investigate`, `/implement` |
55
+ | `shipped` | Work is on main and CI is green | `/submit` (after push + CI pass) |
56
+ | `abandoned` | User decides not to pursue | User edits manually, or `/process-wips` on request |
57
+
58
+ `/implement` marks `in-progress` on entry (it's starting work). `/submit` marks `shipped` only after confirming the push landed and CI went green; if either fails it leaves status unchanged so the user can re-run after fixing.
59
+
60
+ ## Retention rule
61
+
62
+ A shipped or abandoned WIP is retained for **7 days** after `shipped_at` (or the date abandoned) to give the user time to lift content into the wiki. After 7 days:
63
+
64
+ - `disposition: delete` → auto-deleted by `/process-wips`
65
+ - `disposition: promote` → flagged for interactive review; deleted once promotion completes
66
+ - `disposition:` empty → flagged for interactive review; the user chooses between promote and delete in the review table
67
+
68
+ `open` and `in-progress` WIPs are never auto-deleted. `last_reviewed` is advisory: if it has not moved in 30 days, `/process-wips` surfaces the file with a "stale, likely abandoned" note for the user to confirm.
69
+
70
+ ## Daily processing (`/process-wips`)
71
+
72
+ `/process-wips` walks `wip/*.md` and classifies each file into one of three buckets based on frontmatter plus ground truth (gh issue state, branch merge, `shipped_at` age):
73
+
74
+ 1. **Auto-delete** (silent, reported in summary): `status: shipped` or `abandoned`, `shipped_at` > 7 days ago, `disposition: delete`. Deleted without prompting.
75
+ 2. **Confirm promotion** (interactive, per-row): `disposition: promote` and older than 7 days. The command drafts the proposed wiki edit (which page, what content), shows it to the user, and writes on `y`. Delete the WIP after the wiki commits.
76
+ 3. **Leave alone** (no action): anything with `status` in `open`/`in-progress`, or shipped/abandoned within the 7-day window, or with `disposition` empty and recently shipped.
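The bucket rules above can be sketched as a pure function over the parsed frontmatter. This is a hedged illustration, not the command's real implementation: `classify` and its return labels are hypothetical names. The list does not name the case of an empty `disposition` past the 7-day window, so the sketch follows the retention rule and reports it as `needs-decision`; it also assumes `shipped_at` carries the relevant date for abandoned WIPs.

```python
from datetime import date, timedelta

RETENTION = timedelta(days=7)
TERMINAL = {"shipped", "abandoned"}


def classify(fields: dict[str, str], today: date) -> str:
    """Return 'auto-delete', 'confirm-promotion', 'needs-decision', or 'leave-alone'."""
    if fields.get("status") not in TERMINAL:
        return "leave-alone"  # open and in-progress WIPs are never touched
    ended = fields.get("shipped_at", "")
    if not ended:
        return "leave-alone"  # assumption: without a date, age cannot be judged
    if today - date.fromisoformat(ended) <= RETENTION:
        return "leave-alone"  # still inside the 7-day window
    disposition = fields.get("disposition", "")
    if disposition == "delete":
        return "auto-delete"
    if disposition == "promote":
        return "confirm-promotion"
    return "needs-decision"  # empty disposition: user picks promote or delete


today = date(2026, 4, 22)
old_delete = classify(
    {"status": "shipped", "shipped_at": "2026-04-10", "disposition": "delete"}, today
)
recent = classify(
    {"status": "shipped", "shipped_at": "2026-04-20", "disposition": "delete"}, today
)
```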
77
+
78
+ ### Reconciliation
79
+
80
+ Before classifying, the command reconciles frontmatter against ground truth and overwrites stale fields:
81
+
82
+ - If `issue` is set and `gh issue view` shows `state: CLOSED` but the WIP still reads `status: open`/`in-progress`, flip to `shipped` and set `shipped_at` to the issue's `closed_at`.
83
+ - If `branch` is set and `git log origin/main` shows it merged, the work shipped even if the frontmatter claims otherwise.
84
+ - If no frontmatter exists at all (legacy file), the command treats the whole file as the migration case: read the file, infer initial values, write frontmatter, set `last_reviewed` to today, report as migrated.
85
+
86
+ Reconciliation runs on every invocation so users can edit source of truth (GitHub) and let the daily run propagate.
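The first reconciliation rule (closed issue, stale status) might look like the sketch below. It is an illustration under stated assumptions: `reconcile` is a hypothetical name, the caller is assumed to have already fetched the issue state and close timestamp, and truncating the timestamp to its date portion is an assumption made so the value fits the ISO-date `shipped_at` field. The branch-merge and legacy-migration rules are not covered.

```python
def reconcile(fields: dict[str, str], issue_state: str, closed_at: str) -> dict[str, str]:
    """Return a copy of `fields` with status corrected for a closed issue."""
    updated = dict(fields)  # never mutate the caller's mapping
    if issue_state == "CLOSED" and fields.get("status") in ("open", "in-progress"):
        updated["status"] = "shipped"
        # Assumption: closed_at is an ISO timestamp; keep only YYYY-MM-DD.
        updated["shipped_at"] = closed_at[:10]
    return updated


stale = {"status": "in-progress", "shipped_at": ""}
fixed = reconcile(stale, "CLOSED", "2026-04-20T15:04:05Z")
```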
87
+
88
+ ## Promotion to wiki
89
+
90
+ A WIP is worth promoting when it contains **evergreen content**: new conventions, gotchas, architectural decisions, patterns other agents will encounter again. Task-specific details (the exact PR, the exact bug, the specific file paths) are NOT evergreen and should be left to delete.
91
+
92
+ When the user marks `disposition: promote`, the agent's job at `/process-wips` time is to:
93
+
94
+ 1. Scan the WIP for sections that generalize (Analysis > Suggested approach, Research > "gotcha" notes, Implementation > Deviations with reasoning).
95
+ 2. Match those against existing wiki pages (`wiki/*.md`). If a page exists on the topic, propose an append or inline edit. If no page fits, propose a new page and note it in the draft.
96
+ 3. Draft the exact wiki edit (full new content, not a summary) and present it for user confirmation before writing.
97
+ 4. After the user approves and the wiki edit commits, delete the WIP.
98
+
99
+ Do NOT promote the whole WIP. A WIP is scaffolding; only the load-bearing bits belong in the wiki. If no section generalizes, change `disposition` to `delete` and let the next run clean it up.
100
+
101
+ ## Freeform WIPs
102
+
103
+ Not every WIP has an issue. Investigation notes, spike results, meeting takeaways, and one-off plans are legitimate. Freeform WIPs:
104
+
105
+ - Leave `issue`, `branch`, and `pr` empty
106
+ - Use a descriptive filename (no `GH`/`BC` prefix) like `wip/bin-scripts-and-tmux.md`
107
+ - Still have `status`, `created`, `last_reviewed`, `disposition`
108
+ - Transition to `shipped` manually when the user is done, or `abandoned` when the user walks away
109
+
110
+ They follow the same retention rule once `status` is terminal. `/process-wips` leaves them alone while `status` is `open` or `in-progress`.
111
+
112
+ ## What NOT to put in a WIP
113
+
114
+ - Secrets, tokens, credentials
115
+ - Final user-facing documentation (that's the wiki)
116
+ - Production code (that's the repo)
117
+ - Large binary attachments (link to them instead)
118
+
119
+ ## Related
120
+
121
+ - `/wip` — scaffold a WIP from an issue or URL; writes frontmatter
122
+ - `/investigate` — research; bumps `last_reviewed`, sets `status: in-progress`
123
+ - `/implement` — build; bumps `last_reviewed`, keeps `status: in-progress`
124
+ - `/submit` — commit and push; after CI green, flips `status: shipped` and fills `shipped_at` / `pr`
125
+ - `/process-wips` — daily reconciliation and retention pass
@@ -11,13 +11,14 @@ Do not add features, refactor surrounding code, or "improve" beyond the request.
11
11
 
12
12
  ## Use the right tool
13
13
 
14
- Prefer dedicated tools over Bash equivalents. They are faster, need no permissions, and produce cleaner output:
14
+ Prefer dedicated tools over Bash equivalents when one exists. They are faster, need no permissions, and produce cleaner output:
15
15
  - Read not `cat`/`head`/`tail`
16
- - Grep not `grep`/`rg`
17
- - Glob not `find`/`ls`
18
16
  - Edit not `sed`/`awk`
17
+ - Write not `echo >`/`cat <<EOF`
19
18
 
20
- Reserve Bash for commands that have no dedicated tool equivalent.
19
+ Search is Bash-only: Claude Code 2.1.117 removed the dedicated Grep and Glob tools. Use `grep -rn`, `rg`, `find`, and `ls` directly via Bash, one command per call (no `-exec`, no piping to `head`).
20
+
21
+ Reserve Bash for searches above and for commands that never had a dedicated tool.
21
22
 
22
23
  MUST NOT use `cd` in any Bash command, not even in compound commands like `cd /path && git log`. Use `git -C repo` for git, and path arguments for everything else. The `cd` deny rule does not suppress Claude Code's built-in security prompt for compound cd+git commands, so any `cd` usage will interrupt the user.
23
24
 
@@ -29,12 +30,19 @@ Before asking the user for credentials, tokens, siteId, domain, or any site-spec
29
30
 
30
31
  When debugging in the browser, MUST check for error overlays (red error pill/badge at the bottom of the page) before guessing at the problem. Click it, read the full error, stack trace, and source location. The answer is almost always right there.
31
32
 
32
- ## Protect the context window
33
+ ## Context discipline
34
+
35
+ Every file read, command output, and subagent response sits in context for the rest of the session. The agent behaviors that matter:
36
+
37
+ - Broad exploration ("where is X wired up", "how does Y work") → delegate to an Explore subagent with a word cap. The entire exploration happens outside your context; only the returned summary costs you tokens. This is the single biggest lever for keeping the main thread slim. The trick is asking for everything you will need up front — file paths, line numbers, surrounding context, edge cases — in one specific request. A complete request yields a complete answer; a vague one forces a second round-trip that erases the saving. See `subagent-briefing.md` for the required briefing elements.
38
+ - Do not re-read files already loaded in this session. Trust the earlier Read.
39
+ - After Grep gives you a line number, Read with offset/limit — not the whole file.
40
+ - Commands with large output (test runs, build logs, long grep results): redirect to a `tmp/` file, then Grep or targeted-Read the parts you need.
41
+ - Do not duplicate subagent work. If you delegated the search, use the answer — do not re-run the greps inline to verify.
33
42
 
34
- - Prefer targeted searches over broad exploration
35
- - Use subagents for verbose operations (test runs, log analysis, large file reads)
36
- - `/clear` between unrelated tasks
37
- - Two sentences that answer the question beat two pages that fill the context window
43
+ ## Response length
44
+
45
+ Be complete and concise. Accuracy and completeness come first — do not truncate a real answer to look terse. But verbosity is not thoroughness. Every token sent to the user is a token they are expected to read; too many tokens raise cognitive load and annoy them. Output tokens also cost ~5x input on Opus, so the waste compounds. Multi-option writeups, draft code blocks, and "here are my thoughts" bullets are the default failure mode when a paragraph would cover it. Say what's needed, then stop.
38
46
 
39
47
  ## Verify your own work
40
48
 
@@ -96,24 +104,23 @@ These actions hide problems; they do not fix them. If a rule or test seems wrong
96
104
 
97
105
  ## Branch hygiene before starting work
98
106
 
99
- The lifecycle commands `/wip`, `/investigate`, and `/implement` MUST verify branch state before doing anything else. This is a HARD STOP, not a warning. A warning the agent emits and then ignores is identical to no check: three concerns end up on one branch and the mess is only discovered at `/submit`.
107
+ Branch management is the agent's job, not the user's. The lifecycle commands `/wip`, `/investigate`, and `/implement` MUST verify branch state before starting, but the agent SHOULD handle the safe transitions itself rather than blocking the user with "go run `git switch` first."
100
108
 
101
109
  When invoked for issue `#N`:
102
110
 
103
111
  1. Run `git -C repo branch --show-current` → branch `B`
104
112
  2. Run `git -C repo status --porcelain` → working tree state `T`
105
113
 
106
- Proceed ONLY if one of these holds:
107
-
108
- - `B` is `main` or `master` AND `T` is clean
109
- - `B` contains `N` (e.g., `feat/N-...`, `fix/N-…`): you are continuing in-flight work for the same issue; dirty tree is allowed
110
-
111
- Otherwise STOP. Report exactly what you saw and the recovery path:
114
+ | `B` | `T` | Action |
115
+ |-----|-----|--------|
116
+ | Contains `N` | any | Proceed: you are continuing in-flight work for this issue |
117
+ | `main` / `master` | clean | For `/wip` and `/investigate` (read/scaffold only): proceed on `main`. For `/implement` (code changes): create the branch yourself with `git -C repo switch -c <type>/<N>-<slug>`, then proceed. Derive `<type>` from the issue title prefix (`feat:`, `fix:`, `chore:`, `docs:`). Derive `<slug>` from the issue title (lowercase-hyphenated, ≤40 chars). |
118
+ | `main` / `master` | dirty | HARD STOP — uncommitted changes need to land somewhere first. Report exactly which files are modified. Let the user decide (commit on a branch, discard, stash). MUST NOT touch their changes. |
119
+ | Belongs to a different issue (`feat/M-…`, `M ≠ N`) | any | HARD STOP: tell the user to `/submit` the in-flight work first, then re-run. MUST NOT switch branches with uncommitted work present. |
112
120
 
113
- - If `B` is for a different issue: tell the user to `/submit` the in-flight work first, then `git -C repo switch main` and re-run the command. Do NOT offer to commit or stash on their behalf.
114
- - If `B` is `main`/`master` but `T` is dirty: tell the user the uncommitted changes need to land somewhere (their own branch + `/submit`, or explicit discard) before starting new work.
121
+ The HARD STOPs only fire when the user's working state would be lost or mixed by proceeding. Clean main + known issue is not a stop condition — `/implement` creates the branch itself; `/wip` and `/investigate` proceed in place because they don't write code.
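The decision table above can be sketched as a small function. This is a minimal illustration, not the plugin's code: `slugify`, `branch_for`, and `decide` are hypothetical names. It assumes "contains `N`" means the issue number appears as a whole number in the branch name, that a title without a recognised prefix is an error, and that any branch not covered by the table is treated like a branch for a different issue.

```python
import re


def slugify(title: str, limit: int = 40) -> str:
    """Lowercase-hyphenated slug of at most `limit` characters."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug[:limit].rstrip("-")


def branch_for(issue: int, title: str) -> str:
    """Build `<type>/<N>-<slug>` from a title such as 'fix: Handle empty cart'."""
    match = re.match(r"(feat|fix|chore|docs):\s*(.*)", title)
    if not match:
        # Assumption: the source does not say what to do without a prefix.
        raise ValueError("title has no recognised type prefix")
    kind, rest = match.groups()
    return f"{kind}/{issue}-{slugify(rest)}"


def decide(branch: str, dirty: bool, issue: int, command: str, title: str) -> str:
    """Map branch `B` and tree state `T` to an action, following the table."""
    # Digit boundaries keep issue 4 from matching a branch for issue 42.
    if re.search(rf"(?<!\d){issue}(?!\d)", branch):
        return "proceed"
    if branch in ("main", "master"):
        if dirty:
            return "hard-stop: uncommitted changes on main"
        if command == "/implement":
            return f"git -C repo switch -c {branch_for(issue, title)}"
        return "proceed"  # /wip and /investigate do not write code
    return "hard-stop: /submit the in-flight work first"


action = decide("main", False, 12, "/implement", "fix: Handle empty cart")
```

Returning the command as a string keeps the sketch free of side effects; the caller decides whether and how to run it.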
115
122
 
116
- Do NOT proceed with a "warning." Do NOT scaffold, investigate, or implement on a stale or wrong branch. Branching mistakes compound silently and the cost of recovery scales with how many commands later they are caught.
123
+ Branching mistakes compound silently, and the cost of recovery scales with how many commands later they are caught. But the answer is the agent doing the safe transition autonomously, not screaming that the sky is falling at the user every time the workflow requires a routine `git switch`.
117
124
 
118
125
  ## Track progress
119
126
 
@@ -92,6 +92,8 @@ The three shipping commands (`/submit`, `/deploy`, `/publish`) read project-spec
92
92
 
93
93
  Commands like `/research` and `/discuss` can be used at any stage. `/quality`, `/verify`, `/curate-wiki`, and `/wiki-audit` can run independently.
94
94
 
95
+ `/process-wips` runs daily (outside the lifecycle above) to reconcile WIP frontmatter with ground truth, auto-delete shipped WIPs older than 7 days, and confirm wiki promotions. See `simpleapps:wip` for the frontmatter schema and retention rule.
96
+
95
97
  ## References
96
98
 
97
99
  - See `simpleapps:basecamp` skill for MCP tools, Chrome fallback, and Basecamp navigation