@neikyun/ciel 6.11.0 → 6.11.1

Files changed (38)
  1. package/assets/.claude/hooks/memory-engine.py +29 -4
  2. package/assets/commands/ciel-create-skill.md +2 -2
  3. package/assets/commands/ciel-status.md +1 -1
  4. package/assets/platforms/opencode/.opencode/agents/ciel-improver.md +2 -2
  5. package/assets/platforms/opencode/.opencode/commands/ciel-create-skill.md +2 -2
  6. package/assets/platforms/opencode/.opencode/commands/ciel-memory-bootstrap.md +195 -0
  7. package/assets/skills/workflow/adr-auto/SKILL.md +88 -0
  8. package/assets/skills/workflow/ai-failure-modes-detector/SKILL.md +180 -0
  9. package/assets/skills/workflow/ask-window/SKILL.md +119 -0
  10. package/assets/skills/workflow/avec-quoi-versioner/SKILL.md +111 -0
  11. package/assets/skills/workflow/ci-watcher/SKILL.md +194 -0
  12. package/assets/skills/workflow/critiquer-auditor/SKILL.md +135 -0
  13. package/assets/skills/workflow/critiquer-auditor/reference.md +134 -0
  14. package/assets/skills/workflow/debug-reasoning-rca/SKILL.md +174 -0
  15. package/assets/skills/workflow/depth-classifier/SKILL.md +118 -0
  16. package/assets/skills/workflow/diverge/SKILL.md +91 -0
  17. package/assets/skills/workflow/doc-validator-official/SKILL.md +196 -0
  18. package/assets/skills/workflow/evaluer-sizer/SKILL.md +112 -0
  19. package/assets/skills/workflow/faire-gatekeeper/SKILL.md +99 -0
  20. package/assets/skills/workflow/flux-narrator/SKILL.md +93 -0
  21. package/assets/skills/workflow/memoire/SKILL.md +198 -0
  22. package/assets/skills/workflow/memoire-consolidator/SKILL.md +91 -0
  23. package/assets/skills/workflow/meta-critiquer/SKILL.md +112 -0
  24. package/assets/skills/workflow/modern-patterns-checker/SKILL.md +166 -0
  25. package/assets/skills/workflow/pattern-fitness-check/SKILL.md +108 -0
  26. package/assets/skills/workflow/playwright-visual-critic/SKILL.md +98 -0
  27. package/assets/skills/workflow/pr-review-responder/SKILL.md +214 -0
  28. package/assets/skills/workflow/prouver-verifier/SKILL.md +184 -0
  29. package/assets/skills/workflow/prouver-verifier/reference.md +152 -0
  30. package/assets/skills/workflow/quoi-framer/SKILL.md +91 -0
  31. package/assets/skills/workflow/relire-critic/SKILL.md +99 -0
  32. package/assets/skills/workflow/security-regression-check/SKILL.md +86 -0
  33. package/assets/skills/workflow/self-consistency-verifier/SKILL.md +85 -0
  34. package/assets/skills/workflow/spike-mode/SKILL.md +101 -0
  35. package/assets/skills/workflow/stride-analyzer/SKILL.md +96 -0
  36. package/assets/skills/workflow/stride-analyzer/reference.md +144 -0
  37. package/assets/skills/workflow/test-strategy-vitest-playwright/SKILL.md +119 -0
  38. package/package.json +1 -1
@@ -0,0 +1,119 @@
1
+ ---
2
+ name: ask-window
3
+ description: How to use the ASK window in Ciel v5 — before coding, clarify ambiguities using the question tool (OpenCode) or plan mode (Claude Code). Covers steps 3 (ASK) and 10 (ASK2) of the pipeline. Prevents coding on assumptions.
4
+ ---
5
+
6
+ # ASK Window — Clarify Before You Code (Ciel v5)
7
+
8
+ ## What this covers
9
+
10
+ How to use the ASK window in the Ciel v5 pipeline. Before coding, the agent must ask clarifying questions rather than assuming. This skill covers steps 3 (ASK after QUOI) and 10 (ASK2 after EVALUER).
11
+
12
+ ## Core principle
13
+
14
+ **Do not code on assumptions.** When requirements are ambiguous, parameters undefined, or choices implicit -> ask. Use the question tool (OpenCode) or plan mode (Claude Code).
15
+
16
+ ## Two modes — when to use which
17
+
18
+ ```
19
+ MODE ASK (step 3) → "What should I build?" → after QUOI, before research
20
+ MODE ASK2 (step 10) → "Should I build this way?" → after EVALUER, before coding
21
+ ```
22
+
23
+ ASK is about **requirements** — clarify what to build. ASK2 is about **the plan** — validate how to build it.
24
+
25
+ ### ASK (step 3) — clarify requirements
26
+
27
+ After QUOI, before any research or coding. Questions are about the **what**, not the **how**:
28
+
29
+ - Requirements: "Is the email field required?"
30
+ - Ambiguities: "Session cookie or JWT?"
31
+ - Assumptions: "I assume the database is PostgreSQL, correct?"
32
+ - Missing info: "What is the expected throughput?"
33
+ - Scope boundaries: "Does this include the admin panel?"
34
+
35
+ ### ASK2 (step 10) — validate the plan
36
+
37
+ After EVALUER, before FAIRE. Questions are about the **approach**, not the **requirements**:
38
+
39
+ - Approach validation: "I'm going with approach A because X. OK?"
40
+ - Trade-off validation: "Approach A is simpler but B is more flexible. OK with A?"
41
+ - Risk confirmation: "The main risk is X. Acceptable?"
42
+ - Effort check: "This will take ~2 hours. OK?"
43
+
44
+ ## How to ask
45
+
46
+ ### OpenCode: use the `question` tool
47
+
48
+ The `question` tool is a built-in OpenCode tool. Each question includes:
49
+ 1. A header (category of question)
50
+ 2. The question text
51
+ 3. A list of options (at least 2)
52
+ 4. A custom answer option
53
+
54
+ Example:
55
+ ```
56
+ Tool: question
57
+ Parameters:
58
+ header: "Database choice"
59
+ question: "Which database should we use for the new feature?"
60
+ options: ["PostgreSQL (existing)", "SQLite (simpler)", "MySQL (new)"]
61
+ ```
62
+
63
+ ### Claude Code: use plan mode
64
+
65
+ In Claude Code, switch to plan mode (Shift+Tab cycles through the modes), then:
66
+ 1. List your assumptions explicitly
67
+ 2. Ask questions one at a time
68
+ 3. Wait for answers before proceeding
69
+ 4. Update the plan based on responses
70
+
71
+ ## Structure of a good question
72
+
73
+ ```
74
+ QUESTION CATEGORY: <requirement | assumption | tradeoff | risk | scope>
75
+
76
+ Context: <1-2 sentences explaining why you're asking>
77
+
78
+ Question: <clear, specific question>
79
+
80
+ Options:
81
+ A: <option>
82
+ B: <option>
83
+ C: <other> (if relevant)
84
+ ```
85
+
86
+ ## What NOT to ask
87
+
88
+ - Things you can discover yourself (read the code, check package.json)
89
+ - Trivial preferences that don't affect the design (naming, formatting)
90
+ - The same question twice (check your previous answers)
91
+ - Questions you already have the answer to (check .ciel/memory.json, overlay)
92
+
93
+ ## Output format
94
+
95
+ After the ASK phase, include in the plan:
96
+
97
+ ```
98
+ ## Questions asked (ASK)
99
+
100
+ 1. <question> -> <answer selected>
101
+ 2. <question> -> <custom answer>
102
+ ```
103
+
104
+ ## Common rationalizations
105
+
106
+ | Rationalization | Reality |
107
+ |---|---|
108
+ | "I'll just assume and fix it later" | Later is when it's in production and costs 10x to fix. Asking now costs 30 seconds. |
109
+ | "The user would have told me if it mattered" | Users don't know what they don't specify. Assumptions are silent bugs waiting to surface. |
110
+ | "I can figure it out from the code" | Code tells you WHAT, not WHY. If there are two valid approaches, code can't tell you which one the project prefers. |
111
+ | "Asking makes me look uncertain" | Coding on wrong assumptions makes you look incompetent. Asking is what senior engineers do. |
112
+
113
+ ## How to verify
114
+
115
+ - [ ] All unclear requirements have been asked about?
116
+ - [ ] Assumptions have been validated (not silently filled)?
117
+ - [ ] Trade-offs have been offered to the user?
118
+ - [ ] Questions are specific, not vague (e.g. "what do you think?")?
119
+ - [ ] Answers are captured in the plan?
@@ -0,0 +1,111 @@
1
+ ---
2
+ name: avec-quoi-versioner
3
+ description: Reads actual installed library versions from package.json, build.gradle, go.mod, Cargo.toml, pyproject.toml, Gemfile.lock — never trusts memory or assumptions. Loads ciel-overlay.md if present for project-specific stack context. Invoked before research to ensure all subsequent docs lookups target the correct versions.
4
+ allowed-tools: Read, Grep, Glob, Bash
5
+ ---
6
+
7
+ # avec-quoi-versioner — Read real installed versions
8
+
9
+ Step 2 of CRÉER. The research quality is bounded by version accuracy. A skill that looks up "Ktor 2.x docs" when the project runs Ktor 3.x produces anti-patterns.
10
+
11
+ ---
12
+
13
+ ## Process
14
+
15
+ ### 1. Detect package manager(s)
16
+
17
+ Scan project root for the following files (in order):
18
+
19
+ | File | Stack |
20
+ |------|-------|
21
+ | `package.json` + `package-lock.json` | npm / Node.js |
22
+ | `package.json` + `yarn.lock` | yarn |
23
+ | `package.json` + `pnpm-lock.yaml` | pnpm |
24
+ | `package.json` + `bun.lock` / `bun.lockb` | bun |
25
+ | `build.gradle.kts` / `build.gradle` | JVM / Gradle |
26
+ | `pom.xml` | Maven |
27
+ | `go.mod` + `go.sum` | Go |
28
+ | `Cargo.toml` + `Cargo.lock` | Rust |
29
+ | `pyproject.toml` + `poetry.lock` / `uv.lock` | Python |
30
+ | `requirements.txt` | Python (pip) |
31
+ | `Gemfile` + `Gemfile.lock` | Ruby |
32
+ | `composer.json` + `composer.lock` | PHP |
33
+ | `Package.swift` / `Package.resolved` | Swift |
34
+
35
+ Multiple lockfiles may exist (monorepo). Read them all.
36
+
37
+ ### 2. Extract exact versions (not semver ranges)
38
+
39
+ For each relevant dependency in the task scope:
40
+
41
+ - Read the **lockfile** for the pinned version (not `package.json`'s range)
42
+ - For Gradle, run `./gradlew dependencies` if needed, or read `gradle.properties`
43
+ - For Go, `go.mod` already pins; verify with `go list -m all`
44
+ - For Maven, effective POM: `mvn help:effective-pom`
45
+
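A minimal sketch of reading a pinned version from a lockfile instead of a manifest range, assuming a `Cargo.lock` in its usual `[[package]]` / `name` / `version` layout. The helper name `pinned_cargo_version` is hypothetical and not part of this skill. It is written in POSIX shell with `awk`, and a crate that is locked at several versions prints one line per version.

```shell
# pinned_cargo_version CRATE [LOCKFILE]   (hypothetical helper, illustration only)
# Prints every pinned version of CRATE found in a Cargo.lock-style file.
pinned_cargo_version() {
  crate=$1
  lockfile=${2:-Cargo.lock}
  # No lockfile means the version is unknown; never guess one
  [ -f "$lockfile" ] || { echo "version unknown (no $lockfile)"; return 1; }
  awk -v want="\"$crate\"" '
    # A name line starts a package entry; remember whether it is the one we want
    $1 == "name" && $2 == "=" { in_pkg = ($3 == want) }
    # The version line of the wanted package carries the pinned version
    in_pkg && $1 == "version" && $2 == "=" {
      gsub(/"/, "", $3); print $3; in_pkg = 0
    }
  ' "$lockfile"
}
```

The same idea applies to the other lockfiles in the table: look up the resolved entry, not the range declared in the manifest.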
46
+ ### 3. Load ciel-overlay.md
47
+
48
+ If present at project root, extract:
49
+
50
+ - `## Stack` section — project's declared stack
51
+ - `## Versions` section — URLs to docs
52
+ - Any project-specific rules in `## Règles projet-spécifiques`
53
+
54
+ ### 4. State assumptions explicitly
55
+
56
+ For anything NOT verified from lockfile:
57
+
58
+ - "Assuming build tool X because [reason]."
59
+ - "Assuming PostgreSQL is running on default port because [reason]."
60
+
61
+ These assumptions must be flagged for `researcher` to verify.
62
+
63
+ ---
64
+
65
+ ## Output format
66
+
67
+ ```
68
+ ## AVEC QUOI
69
+
70
+ Stack detected:
71
+ - Frontend: <framework> <version> (from <file>)
72
+ - Backend: <framework> <version> (from <file>)
73
+ - Database: <type> <version> (from <file or overlay>)
74
+ - Test: <framework> <version> (from <file>)
75
+ - Build: <tool> <version>
76
+
77
+ Overlay:
78
+ - [Loaded: yes/no]
79
+ - [Relevant sections: Stack, Versions, Règles, Leçons]
80
+
81
+ Assumptions (NOT from lockfile):
82
+ - <assumption> — <reason>
83
+
84
+ Docs URLs (from overlay):
85
+ - <lib>: <url>
86
+ ```
87
+
88
+ ---
89
+
90
+ ## Guardrails
91
+
92
+ - **Never assume a version** — if lockfile is absent, state "version unknown" and flag it
93
+ - **Range vs pinned**: always report the pinned version from the lockfile, not the `^1.2.3` range from the manifest
94
+ - **Monorepo caution**: multiple lockfiles may diverge across packages. Specify which package the version applies to.
95
+ - **Don't guess URLs**: only report doc URLs from the overlay. Let `researcher` agent WebSearch for the rest.
96
+
97
+ ---
98
+
99
+ ## How to verify
100
+
101
+ - [ ] Versions read from lock files (not package.json ranges)?
102
+ - [ ] ciel-overlay.md consulted for project-specific versions?
103
+ - [ ] Framework detected (React/Vue/Svelte/Ktor/Express/etc)?
104
+ - [ ] Version gaps flagged (installed vs latest)?
105
+ - [ ] Overlay updated if new versions discovered?
106
+
107
+ ## When triggered
108
+
109
+ - Standard/Critical tasks, immediately after `quoi-framer`
110
+ - Before dispatching `researcher` agent (research quality depends on version accuracy)
111
+ - When user asks "what versions are we on?" or the task mentions a specific library
@@ -0,0 +1,194 @@
1
+ ---
2
+ name: ci-watcher
3
+ description: Streams GitHub Actions via `gh run watch`, classifies failures flaky (≥15% fail rate on main → auto-`gh run rerun --failed`) vs real (hand off to debug-reasoning-rca). Invoke after pr-opener, before pr-merger, or on "CI stuck" / "why is CI red" / "flaky test". Inline.
4
+ allowed-tools: Bash, Read
5
+ context: inline
6
+ ---
7
+
8
+ # ci-watcher — Watch CI, distinguish flaky from broken, retry smart
9
+
10
+ `prouver-verifier` takes a single-point snapshot of CI state. `ci-watcher` watches over time: streams the run, waits for completion, classifies failures as flaky vs real, retries only what's safe.
11
+
12
+ ---
13
+
14
+ ## Inputs
15
+
16
+ ```
17
+ BRANCH: [current branch — from git rev-parse]
18
+ PR_NUMBER: [optional — derived if branch has an open PR]
19
+ WORKFLOW: [optional — filter to specific workflow name; default all]
20
+ MODE: [watch | snapshot — default watch; snapshot = single poll + return]
21
+ MAX_RETRIES: [default 1 — for flaky-detected failures only]
22
+ FLAKY_THRESHOLD: [default 15 — % fail rate on main that classifies as flaky]
23
+ ```
24
+
25
+ ### Auto-inference sources
26
+
27
+ - **BRANCH** → `git rev-parse --abbrev-ref HEAD`
28
+ - **PR_NUMBER** → `gh pr view --json number --jq .number 2>/dev/null`
29
+ - **WORKFLOW** → all workflows that ran on the branch
30
+
31
+ ---
32
+
33
+ ## Preflight
34
+
35
+ ```bash
36
+ gh auth status 2>&1 | grep -q "Logged in" || exit 1
37
+ BRANCH=${BRANCH:-$(git rev-parse --abbrev-ref HEAD)}
38
+
39
+ # Confirm branch has at least one run
40
+ LATEST=$(gh run list --branch="$BRANCH" --limit=1 --json databaseId,status,conclusion --jq '.[0] // empty')
41
+ [ -z "$LATEST" ] && { echo "No runs found for branch $BRANCH — push first"; exit 1; }
42
+ ```
43
+
44
+ ---
45
+
46
+ ## Process
47
+
48
+ ### 1. Stream or snapshot
49
+
50
+ **Watch mode (default)** — stream until completion:
51
+
52
+ ```bash
53
+ RUN_ID=$(gh run list --branch="$BRANCH" --limit=1 --json databaseId --jq '.[0].databaseId')
54
+ gh run watch "$RUN_ID" --exit-status
55
+ RESULT=$?
56
+ ```
57
+
58
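+ `--exit-status` makes `gh run watch` exit non-zero if the run fails. `gh run watch` polls the run and prints job and step progress until it completes; it does not stream log output (use `gh run view --log-failed` for logs).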
59
+
60
+ **Snapshot mode** — single poll:
61
+
62
+ ```bash
63
+ gh run list --branch="$BRANCH" --limit=5 --json databaseId,name,status,conclusion,url
64
+ ```
65
+
66
+ ### 2. On failure — classify flaky vs real
67
+
+ ```bash
+ # Failing jobs for this run, one name per line (names may contain spaces)
+ FAILED_JOBS=$(gh run view "$RUN_ID" --json jobs --jq '.jobs[] | select(.conclusion == "failure") | .name')
+
+ # Base branch of the PR; fall back to main when there is no PR
+ BASE=$(gh pr view "$PR_NUMBER" --json baseRefName --jq .baseRefName 2>/dev/null || echo "main")
+ WORKFLOW=$(gh run view "$RUN_ID" --json workflowName --jq .workflowName)
+ FLAKY_THRESHOLD=${FLAKY_THRESHOLD:-15}
+ FLAKY_JOBS=()
+ REAL_FAILURES=()
+
+ # Completed and failed counts for the same workflow over its last 50 runs on the base branch.
+ # This is a workflow-level rate, so every failed job in this run shares it.
+ read -r TOTAL FAILED <<< "$(gh run list \
+   --branch="$BASE" \
+   --workflow="$WORKFLOW" \
+   --limit=50 \
+   --json conclusion \
+   --jq '[.[] | select((.conclusion // "") != "")] | "\(length) \(map(select(.conclusion == "failure")) | length)"')"
+
+ # Divide by the number of completed runs actually returned, not a fixed 50
+ FAIL_PCT=0
+ [ "${TOTAL:-0}" -gt 0 ] && FAIL_PCT=$((FAILED * 100 / TOTAL))
+
+ while IFS= read -r JOB; do
+   [ -z "$JOB" ] && continue
+   if [ "$FAIL_PCT" -ge "$FLAKY_THRESHOLD" ]; then
+     echo "Job '$JOB': workflow fails ${FAIL_PCT}% of the time on $BASE — CLASSIFIED FLAKY"
+     FLAKY_JOBS+=("$JOB")
+   else
+     echo "Job '$JOB': workflow fails ${FAIL_PCT}% of the time on $BASE — CLASSIFIED REAL FAILURE"
+     REAL_FAILURES+=("$JOB")
+   fi
+ done <<< "$FAILED_JOBS"
+ ```
96
+
97
+ **Flaky threshold rationale**: 15% of 50 runs is 7.5, so with integer percentages the flaky classification starts at 8 or more failures in 50 runs. Below that, a single failure is likely the PR's fault. Above, it's environmental/test-harness instability.
98
+
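The threshold arithmetic can be checked in isolation. The sketch below is illustrative only: the helper names `fail_pct` and `classify_failure` are hypothetical, and it assumes integer division as in the script above, which is why the 15% cut-off falls between 7 and 8 failures out of 50.

```shell
# fail_pct FAILED TOTAL -> integer percentage, 0 when TOTAL is 0
fail_pct() {
  if [ "$2" -gt 0 ]; then
    echo $(( $1 * 100 / $2 ))
  else
    echo 0
  fi
}

# classify_failure FAILED TOTAL [THRESHOLD] -> "flaky" or "real"
classify_failure() {
  pct=$(fail_pct "$1" "$2")
  if [ "$pct" -ge "${3:-15}" ]; then
    echo flaky
  else
    echo real
  fi
}

classify_failure 7 50   # 14% is below the threshold -> real
classify_failure 8 50   # 16% meets the threshold    -> flaky
```

Guarding the zero-run case matters on new workflows, where the base branch may have no history yet.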
99
+ ### 3. Retry flaky jobs (up to MAX_RETRIES)
100
+
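+ ```bash
+ RETRY_COUNT=${RETRY_COUNT:-0}
+ MAX_RETRIES=${MAX_RETRIES:-1}
+
+ # Retry only when every failed job is flaky (see Guardrails) and the retry budget allows it
+ if [ ${#FLAKY_JOBS[@]} -gt 0 ] && [ ${#REAL_FAILURES[@]} -eq 0 ] && [ "$RETRY_COUNT" -lt "$MAX_RETRIES" ]; then
+   echo "Retrying flaky jobs (attempt $((RETRY_COUNT+1))/$MAX_RETRIES)"
+   gh run rerun "$RUN_ID" --failed
+   RETRY_COUNT=$((RETRY_COUNT+1))
+   # Re-enter step 1 (watch the rerun)
+ fi
+ ```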
109
+
110
+ `--failed` only reruns failed jobs (saves CI minutes).
111
+
112
+ ### 4. Extract log excerpt for real failures
113
+
114
+ For handoff to `debug-reasoning-rca`:
115
+
+ ```bash
+ # REAL_FAILURES is an array; quote the expansion so job names with spaces survive
+ for JOB in "${REAL_FAILURES[@]}"; do
+   JOB_ID=$(gh run view "$RUN_ID" --json jobs --jq ".jobs[] | select(.name == \"$JOB\") | .databaseId")
+
+   # Last 50 lines of the failing step
+   gh run view --job="$JOB_ID" --log-failed | tail -50
+ done
+ ```
124
+
125
+ ### 5. Emit output
126
+
127
+ ```
128
+ [CI WATCHER]
129
+ Run: <URL>
130
+ Status: <success | failure | in_progress>
131
+ Duration: <Xm Ys>
132
+
133
+ Jobs:
134
+ [OK] build
135
+ [OK] lint
136
+ [WARN] integration-tests — FLAKY (fails 18% on main, retried — now green)
137
+ [FAIL] unit-tests — REAL FAILURE (fails 2% on main — investigate)
138
+
139
+ Handoff (if real failures):
140
+ - debug-reasoning-rca with SYMPTOM=<failing test name> + LOG excerpt
141
+ ```
142
+
143
+ ---
144
+
145
+ ## Guardrails
146
+
147
+ - **MAX_RETRIES=1 by default** — a flaky test that fails twice in a row is likely not flaky. Don't spam retries.
148
+ - **Never retry real failures** — the retry mechanism is ONLY for jobs classified flaky. Real failures need a code fix.
149
+ - **Never retry pre-merge checks on main** — only PR branches. Retrying on main risks hiding real regressions.
150
+ - **Budget-aware**: large rerun loops burn CI minutes. Log estimated minutes cost before retry on repos with tight budgets.
151
+ - **Respect timeouts**: `gh run watch` can hang if a job hangs. Wrap in `timeout 1800 gh run watch` for 30-min ceiling.
152
+ - **Flaky classification is per-job, not per-run**: if 3 of 5 jobs are flaky but 1 is real, DO NOT retry — fix the real one first.
153
+ - **Store flaky detections** — append to `.claude/flaky-tests.log` (optional, per-project) so patterns surface across sessions.
154
+
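One way to act on the timeout guardrail is to separate "the run failed" from "the watch hit the ceiling". This is a sketch under assumptions: it relies on GNU `timeout`, which exits with status 124 when the limit is reached, and the helper name `describe_watch_status` is hypothetical.

```shell
# describe_watch_status CODE
# Maps the exit status of `timeout ... gh run watch --exit-status` to an outcome.
describe_watch_status() {
  case "$1" in
    0)   echo "success" ;;
    124) echo "timed out" ;;   # GNU timeout's exit status when the limit is hit
    *)   echo "failure" ;;
  esac
}

# Usage (not executed here; needs gh and a run ID):
#   timeout 1800 gh run watch "$RUN_ID" --exit-status
#   describe_watch_status $?
```

A timed-out watch says nothing about the run's conclusion, so it is worth reporting separately instead of treating it as a red build.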
155
+ ---
156
+
157
+ ## When triggered
158
+
159
+ - After `pr-opener` in Standard pipeline (step 11 post-insertion)
160
+ - Before `pr-merger` as CI-green verification (replaces inline `gh run list` check)
161
+ - User says: "watch CI", "is CI green?", "CI is flaky", "rerun failed jobs"
162
+ - `prouver-verifier` detects a red CI and needs disambiguation
163
+
164
+ ---
165
+
166
+ ## Anti-pattern
167
+
168
+ ```
169
+ ❌ Failed → rerun blindly → rerun → rerun → real bug hidden, minutes wasted
170
+ ✅ Failed → classify (fail % on main) → retry only flaky → real fail → debug-reasoning-rca
171
+ ```
172
+
173
+ ```
174
+ ❌ sleep 300 && gh run list # blocked by harness; also cache-cold
175
+ ✅ gh run watch --exit-status # streams, no sleep
176
+ ```
177
+
178
+ ---
179
+
180
+ ## Handoff
181
+
182
+ - **If all green** → `pr-merger` can proceed
183
+ - **If real failure** → `debug-reasoning-rca` via `@ciel-critic` with log excerpt as SYMPTOM + failing job as SCOPE
184
+ - **If flaky detected + retry succeeded** → proceed to `pr-merger`, log flaky for future `/ciel-improve` signal
185
+ - **If flaky + retry failed** → escalate to user (flaky-turned-real or real-misclassified)
186
+
187
+ ---
188
+
189
+ ## References
190
+
191
+ - `gh run watch` — cli.github.com/manual/gh_run_watch
192
+ - `gh run rerun --failed` — cli.github.com/manual/gh_run_rerun
193
+ - Flaky test classification — Memon et al., "Taming Google-Scale Continuous Testing" (ICSE-SEIP 2017), cited as the 15% threshold baseline
194
+ - Ciel pipeline: pr-opener → ci-watcher → (flaky? retry : debug-reasoning-rca) → pr-merger
@@ -0,0 +1,135 @@
1
+ ---
2
+ name: critiquer-auditor
3
+ description: How to audit code comprehensively — 7-dimension review methodology covering expected behavior, assumptions, scope, code-vs-model comparison with STRIDE security, pattern consistency, findings with severity, and closing the loop. For PR reviews, retrospective audits, and "is this code correct?" questions.
4
+ allowed-tools: Read, Grep, Glob, Bash, WebSearch
5
+ ---
6
+
7
+ # Code Audit — 7-Dimension Review Methodology
8
+
9
+ ## What this covers
10
+
11
+ How to do a thorough code audit. Distinct from quick self-review (relire-critic) — this is the comprehensive methodology for PR reviews, retrospective audits, and quality checks.
12
+
13
+ ## Core principle
14
+
15
+ **Read the diff/changed files FIRST.** All dimensions operate on actual code, never on assumptions. Description lies; code doesn't.
16
+
17
+ ## Dimension 1: Expected behavior model
18
+
19
+ From issue/spec/PR description: "what was this SUPPOSED to do?"
20
+
21
+ - Build a bypass signal checklist for this change type BEFORE scanning code
22
+ - If external lib involved: search `[lib] [version] anti-patterns common mistakes`
23
+
24
+ Output: 1-2 sentence behavior model + min 3 bypass signals to look for.
25
+
26
+ ## Dimension 2: Assumptions
27
+
28
+ - Git blame: why was the original code written this way?
29
+ - Surface 3 assumptions, verify each (grep / blame / read)
30
+
31
+ Output: 3 assumptions + verification status each.
32
+
33
+ ## Dimension 3: Scope
34
+
35
+ - "What if we do nothing?" considered?
36
+ - Scope of change proportional to the problem?
37
+
38
+ Output: counterfactual + proportionality judgment.
39
+
40
+ ## Dimension 4: Code vs model + STRIDE + OPS
41
+
42
+ - Code matches expected behavior model? (grep-backed)
43
+ - All bypass signals checked from dimension 1's list?
44
+ - **STRIDE all 6 categories**: S / T / R / I / D / E — mark N/A explicitly, never skip silently
45
+ - OPS lens: unclosed connections, memory leaks, locks, 100x volume
46
+
47
+ ### STRIDE reference
48
+
49
+ | Category | What to check |
50
+ |----------|--------------|
51
+ | **S**poofing | Authentication bypass, identity assumption |
52
+ | **T**ampering | Data integrity, unauthorized modification |
53
+ | **R**epudiation | Audit trail, logging completeness |
54
+ | **I**nformation disclosure | Data exposure, error messages, logs |
55
+ | **D**enial of service | Resource exhaustion, infinite loops, missing limits |
56
+ | **E**levation of privilege | Authorization bypass, role escalation |
57
+
58
+ ## Dimension 5: Consistency
59
+
60
+ - Grep: pattern used consistently elsewhere in the codebase?
61
+ - Layer boundaries respected (no business logic in routes, no DB in controllers)?
62
+ - Health thresholds from overlay met (complexity, coverage)?
63
+
64
+ ## Dimension 6: Findings with severity
65
+
66
+ Format: `RISQUE: X parce que Y — IMPACT: Z`
67
+
68
+ Severity levels:
69
+ - **BLOCKING** — must fix before merge (correctness, security, data loss). Requires specific FIX.
70
+ - **IMPORTANT** — should fix (degraded behavior, tech debt with near-term risk)
71
+ - **MINOR** — nice to fix (style, naming, low-risk improvement)
72
+ - **VALIDATED** — explicitly checked and confirmed correct
73
+
74
+ Every finding: RISQUE format. Every BLOCKING: specific FIX + NOT-X (what solution must NOT do).
75
+
76
+ ## Dimension 7: Close the loop
77
+
78
+ - New anti-pattern found? → add to Guards or project overlay
79
+ - New failure mode? → add Guard immediately
80
+ - Capture learnings for future reference
81
+
82
+ ## Output format
83
+
84
+ ```
85
+ ## AUDIT
86
+
87
+ ### Expected behavior
88
+ <1-2 sentences + bypass signals>
89
+
90
+ ### Assumptions
91
+ 1. <assumption> — verified: <yes/no, evidence>
92
+ 2. ...
93
+ 3. ...
94
+
95
+ ### Scope
96
+ - Nothing-counterfactual: <consequence if no change>
97
+ - Scope proportional: <yes/no, reason>
98
+
99
+ ### Code vs model + STRIDE
100
+ - Code vs model: <matches | deviates at file:line>
101
+ - Bypass signals: <N/3 flagged>
102
+ - STRIDE:
103
+ - S: <N/A because X | RISQUE: ...>
104
+ - T/R/I/D/E: ...
105
+
106
+ ### Consistency
107
+ - Pattern: <grep evidence>
108
+ - Layers: <clean | violation at file:line>
109
+ - Thresholds: <met | violation>
110
+
111
+ ### Findings
112
+ BLOCKING: <RISQUE + FIX>
113
+ IMPORTANT: <RISQUE + FIX/ACCEPT>
114
+ MINOR: <note>
115
+ VALIDATED: <what was verified>
116
+
117
+ ### Learnings
118
+ - New Guard: <yes/no>
119
+ - Overlay update: <yes/no>
120
+ ```
121
+
122
+ ## How to verify
123
+
124
+ - [ ] All 7 dimensions completed (Expected behavior, Assumptions, Scope, Code vs model + STRIDE, Consistency, Findings, Learnings)?
125
+ - [ ] All 6 STRIDE categories present (even if N/A)?
126
+ - [ ] Findings have severity (BLOCKING/IMPORTANT/MINOR)?
127
+ - [ ] VALIDATED section identifies what code got right?
128
+ - [ ] Learnings captured?
129
+
130
+ ## Common mistakes
131
+
132
+ - **Operating from PR description alone**: always read the actual code
133
+ - **Skipping STRIDE categories**: all 6 must be explicit, even if N/A
134
+ - **BLOCKING without FIX**: if you can't name the fix, it's not actionable enough for BLOCKING
135
+ - **No VALIDATED section**: reviews that only report problems miss what the code got right
@@ -0,0 +1,134 @@
1
+ # critiquer-auditor — Reference
2
+
3
+ ## STRIDE — audit probes (7-step audit context)
4
+
5
+ Use these probes when running COMPARER on each STRIDE category. Mark N/A explicitly; never skip.
6
+
7
+ ### S — Spoofing
8
+ - Can I impersonate another user/service in this code path?
9
+ - Identity: client-supplied or server-resolved?
10
+ - WebSocket / SSE / GraphQL subscription: same auth as REST?
11
+
12
+ ### T — Tampering
13
+ - Input modified in transit? HTTPS? Signatures?
14
+ - Idempotency keys present?
15
+ - CSRF protection on state-changing endpoints?
16
+
17
+ ### R — Repudiation
18
+ - Audit log coverage: who, what, when recorded?
19
+ - Log integrity: append-only? remote-shipped?
20
+
21
+ ### I — Information Disclosure
22
+ - Error messages: stack traces? SQL? paths?
23
+ - Logs: PII? secrets?
24
+ - Response bodies: over-fetching? unprojected columns?
25
+ - Enumeration and timing: does a 404 vs 403 distinction, or a response-time difference, reveal whether a resource exists?
26
+
27
+ ### D — Denial of Service
28
+ - Rate limiting per IP/user/endpoint?
29
+ - Resource bounds: payload size, query depth, file upload?
30
+ - Algorithmic complexity on user-controlled input?
31
+ - Regex catastrophic backtracking?
32
+
33
+ ### E — Elevation of Privilege
34
+ - Permission check BEFORE action?
35
+ - Horizontal escalation: can user A read user B's data?
36
+ - Vertical escalation: mass assignment setting `isAdmin`?
37
+
38
+ ## Severity rubric
39
+
40
+ ### BLOCKING
41
+ - Correctness bug: code produces wrong result for some input
42
+ - Security: any STRIDE finding that an attacker can exploit
43
+ - Data loss: delete/overwrite without backup/confirm
44
+ - Production crash: uncaught exception on common path
45
+
46
+ ### IMPORTANT
47
+ - Degraded behavior: works but slow / intermittent
48
+ - Tech debt with near-term risk: pattern that will break at 2x current load
49
+ - Accessibility violation: keyboard/screen reader broken
50
+ - Test debt: feature ships without meaningful test
51
+
52
+ ### MINOR
53
+ - Naming / style inconsistency
54
+ - Unused import
55
+ - Todo comment for future work
56
+ - Minor DRY violation (< 3 copies)
57
+
58
+ ### VALIDATED
59
+ - Explicit callout of what was checked and confirmed correct
60
+ - Useful because it shows the reviewer's mental map
61
+ - Helps author understand what was covered vs skipped
62
+
63
+ ## Counterfactual analysis
64
+
65
+ Questions to answer in QUESTIONNER step:
66
+
67
+ - What if we merged without this change? What breaks?
68
+ - Would 10% of this change solve 90% of the problem?
69
+ - Is this fixing a symptom or a cause? If symptom: where's the cause?
70
+ - Is this change reversible? If yes, risk is lower.
71
+
72
+ ## Bypass signal checklist (build in APPRENDRE)
73
+
74
+ Common bypass signals to look for per framework:
75
+
76
+ ### React / frontend
77
+ - `window.*` or `document.*` inside components
78
+ - `useEffect` with no dependency array
79
+ - Direct DOM manipulation via `ref.current`
80
+ - `dangerouslySetInnerHTML` with non-sanitized input
81
+
82
+ ### Backend / JVM
83
+ - Raw SQL string concatenation
84
+ - `catch(Exception e) { }` or `catch → null`
85
+ - `as` cast without type guard (Kotlin) or unchecked cast (Java)
86
+ - Thread creation without pool
87
+
88
+ ### Async / concurrent
89
+ - `async` function called without `await`
90
+ - Promise created but not awaited
91
+ - Race conditions on shared state
92
+ - Timeout of 0 or infinite
93
+
94
+ ## Layer boundary violations
95
+
96
+ - Business logic in routes / controllers → should be in services
97
+ - DB calls in controllers → should be behind repository
98
+ - UI logic in models → should be in view layer
99
+ - Tests reaching across layers without mocks
100
+
101
+ ## Overlay thresholds
102
+
103
+ If `ciel-overlay.md` exists and has a `Santé du code` section, check its thresholds:
104
+
105
+ ```
106
+ ### Santé du code
107
+ - Complexité cyclomatique: < 15 par fonction
108
+ - Profondeur d'imbrication: < 4
109
+ - Taille de fonction: < 50 lignes
110
+ - Couverture test: > 80% lignes modifiées
111
+ ```
112
+
113
+ If any violation: IMPORTANT finding (can be demoted to MINOR if tiny exceedance).
114
+
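The demotion rule can be made explicit. The sketch below is an assumption-laden illustration, not the skill's own logic: it treats a "tiny exceedance" as within 10% of the limit (a hypothetical tolerance), applies only to "lower is better" metrics such as complexity or nesting depth, and uses the hypothetical helper name `threshold_severity`.

```shell
# threshold_severity VALUE LIMIT [TOLERANCE_PCT]
# For metrics with a "< LIMIT" threshold (complexity, nesting depth, function length).
threshold_severity() {
  value=$1
  limit=$2
  tol=${3:-10}   # assumed tolerance: within 10% of the limit counts as tiny
  if [ "$value" -lt "$limit" ]; then
    echo "OK"
  elif [ $(( $value * 100 )) -le $(( $limit * (100 + $tol) )) ]; then
    echo "MINOR"
  else
    echo "IMPORTANT"
  fi
}

threshold_severity 12 15   # OK
threshold_severity 16 15   # MINOR (within 10% of the limit)
threshold_severity 22 15   # IMPORTANT
```

Coverage works the other way round (higher is better), so it would need the comparison inverted.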
115
+ ## Learnings capture format (CAPITALISER)
116
+
117
+ When `learnings-capture` is invoked from CAPITALISER:
118
+
119
+ ```
120
+ [YYYY-MM-DD] MISTAKE: <what happened, 1 line>
121
+ → RULE: <how to avoid in future, 1 line>
122
+ → Invoke: <which skill/guard catches this>
123
+ → Evidence: <file:line where it was found>
124
+ ```
125
+
126
+ This format feeds into `ciel-overlay.md` under `## Leçons projet` (project-specific) or `.claude/learnings.md` (general).
127
+
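A small sketch of emitting one entry in the format above. The helper name `format_learning` and its argument order are hypothetical; the date is passed in so the output is reproducible.

```shell
# format_learning DATE MISTAKE RULE INVOKE EVIDENCE   (hypothetical helper)
# Prints one learning entry in the four-line format shown above.
format_learning() {
  printf '[%s] MISTAKE: %s\n' "$1" "$2"
  printf '→ RULE: %s\n' "$3"
  printf '→ Invoke: %s\n' "$4"
  printf '→ Evidence: %s\n' "$5"
}

# Usage (not executed here): append today's entry to the general learnings file
#   format_learning "$(date +%Y-%m-%d)" "what happened" "how to avoid it" \
#     "critiquer-auditor" "src/file.ts:42" >> .claude/learnings.md
```

Keeping each field on its own line makes the entries easy to grep by `MISTAKE:` or `RULE:` later.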
128
+ ## Anti-patterns in audits
129
+
130
+ - Reviewing without reading the diff first → the audit operates on assumptions
131
+ - STRIDE performed but all 6 "N/A" → didn't actually probe each category
132
+ - Only finding problems (no VALIDATED) → unclear what was checked
133
+ - BLOCKING without FIX → not actionable, author can't resolve
134
+ - Copying PR description into audit → pure theater, no independent thought