@neikyun/ciel 6.10.1 → 6.11.1

Files changed (40)
  1. package/assets/.claude/hooks/memory-engine.py +256 -0
  2. package/assets/commands/ciel-audit.md +42 -0
  3. package/assets/commands/ciel-create-skill.md +2 -2
  4. package/assets/commands/ciel-status.md +1 -1
  5. package/assets/platforms/opencode/.opencode/agents/ciel-improver.md +2 -2
  6. package/assets/platforms/opencode/.opencode/commands/ciel-create-skill.md +2 -2
  7. package/assets/platforms/opencode/.opencode/commands/ciel-memory-bootstrap.md +195 -0
  8. package/assets/skills/ciel/SKILL.md +2 -1
  9. package/assets/skills/workflow/adr-auto/SKILL.md +88 -0
  10. package/assets/skills/workflow/ai-failure-modes-detector/SKILL.md +180 -0
  11. package/assets/skills/workflow/ask-window/SKILL.md +119 -0
  12. package/assets/skills/workflow/avec-quoi-versioner/SKILL.md +111 -0
  13. package/assets/skills/workflow/ci-watcher/SKILL.md +194 -0
  14. package/assets/skills/workflow/critiquer-auditor/SKILL.md +135 -0
  15. package/assets/skills/workflow/critiquer-auditor/reference.md +134 -0
  16. package/assets/skills/workflow/debug-reasoning-rca/SKILL.md +174 -0
  17. package/assets/skills/workflow/depth-classifier/SKILL.md +118 -0
  18. package/assets/skills/workflow/diverge/SKILL.md +91 -0
  19. package/assets/skills/workflow/doc-validator-official/SKILL.md +196 -0
  20. package/assets/skills/workflow/evaluer-sizer/SKILL.md +112 -0
  21. package/assets/skills/workflow/faire-gatekeeper/SKILL.md +99 -0
  22. package/assets/skills/workflow/flux-narrator/SKILL.md +93 -0
  23. package/assets/skills/workflow/memoire/SKILL.md +198 -0
  24. package/assets/skills/workflow/memoire-consolidator/SKILL.md +91 -0
  25. package/assets/skills/workflow/meta-critiquer/SKILL.md +112 -0
  26. package/assets/skills/workflow/modern-patterns-checker/SKILL.md +166 -0
  27. package/assets/skills/workflow/pattern-fitness-check/SKILL.md +108 -0
  28. package/assets/skills/workflow/playwright-visual-critic/SKILL.md +98 -0
  29. package/assets/skills/workflow/pr-review-responder/SKILL.md +214 -0
  30. package/assets/skills/workflow/prouver-verifier/SKILL.md +184 -0
  31. package/assets/skills/workflow/prouver-verifier/reference.md +152 -0
  32. package/assets/skills/workflow/quoi-framer/SKILL.md +91 -0
  33. package/assets/skills/workflow/relire-critic/SKILL.md +99 -0
  34. package/assets/skills/workflow/security-regression-check/SKILL.md +86 -0
  35. package/assets/skills/workflow/self-consistency-verifier/SKILL.md +85 -0
  36. package/assets/skills/workflow/spike-mode/SKILL.md +101 -0
  37. package/assets/skills/workflow/stride-analyzer/SKILL.md +96 -0
  38. package/assets/skills/workflow/stride-analyzer/reference.md +144 -0
  39. package/assets/skills/workflow/test-strategy-vitest-playwright/SKILL.md +119 -0
  40. package/package.json +1 -1
@@ -0,0 +1,135 @@
1
+ ---
2
+ name: critiquer-auditor
3
+ description: How to audit code comprehensively — 7-dimension review methodology covering expected behavior, assumptions, scope, code-vs-model comparison, STRIDE security, pattern consistency, and findings with severity. For PR reviews, retrospective audits, and "is this code correct?" questions.
4
+ allowed-tools: Read, Grep, Glob, Bash, WebSearch
5
+ ---
6
+
7
+ # Code Audit — 7-Dimension Review Methodology
8
+
9
+ ## What this covers
10
+
11
+ How to do a thorough code audit. Distinct from quick self-review (relire-critic) — this is the comprehensive methodology for PR reviews, retrospective audits, and quality checks.
12
+
13
+ ## Core principle
14
+
15
+ **Read the diff/changed files FIRST.** All dimensions operate on actual code, never on assumptions. Descriptions lie; code doesn't.
16
+
17
+ ## Dimension 1: Expected behavior model
18
+
19
+ From issue/spec/PR description: "what was this SUPPOSED to do?"
20
+
21
+ - Build a bypass signal checklist for this change type BEFORE scanning code
22
+ - If external lib involved: search `[lib] [version] anti-patterns common mistakes`
23
+
24
+ Output: 1-2 sentence behavior model + min 3 bypass signals to look for.
25
+
26
+ ## Dimension 2: Assumptions
27
+
28
+ - Git blame: why was the original code written this way?
29
+ - Surface 3 assumptions, verify each (grep / blame / read)
30
+
31
+ Output: 3 assumptions + verification status each.
32
+
33
+ ## Dimension 3: Scope
34
+
35
+ - "What if we do nothing?" considered?
36
+ - Scope of change proportional to the problem?
37
+
38
+ Output: counterfactual + proportionality judgment.
39
+
40
+ ## Dimension 4: Code vs model + STRIDE + OPS
41
+
42
+ - Code matches expected behavior model? (grep-backed)
43
+ - All bypass signals checked from dimension 1's list?
44
+ - **STRIDE all 6 categories**: S / T / R / I / D / E — mark N/A explicitly, never skip silently
45
+ - OPS lens: unclosed connections, memory leaks, locks, 100x volume
46
+
47
+ ### STRIDE reference
48
+
49
+ | Category | What to check |
50
+ |----------|--------------|
51
+ | **S**poofing | Authentication bypass, identity assumption |
52
+ | **T**ampering | Data integrity, unauthorized modification |
53
+ | **R**epudiation | Audit trail, logging completeness |
54
+ | **I**nformation disclosure | Data exposure, error messages, logs |
55
+ | **D**enial of service | Resource exhaustion, infinite loops, missing limits |
56
+ | **E**levation of privilege | Authorization bypass, role escalation |
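The "mark N/A explicitly, never skip silently" rule can be checked mechanically. The following is a minimal sketch, assuming the audit report uses one `- X: ...` line per STRIDE letter (the shape shown in this skill's output format). The function name `missing_stride_categories` is hypothetical and not part of Ciel.

```python
import re

STRIDE = ("S", "T", "R", "I", "D", "E")

def missing_stride_categories(report: str) -> list[str]:
    """Return the STRIDE letters that have no explicit, non-empty `- X: ...` line."""
    found = set()
    for line in report.splitlines():
        # Matches lines such as "- S: N/A because ..." or "- T: RISQUE: ...".
        # A bare "- D:" with nothing after the colon does not count as covered.
        match = re.match(r"\s*-\s*([STRIDE]):\s*\S", line)
        if match:
            found.add(match.group(1))
    return [letter for letter in STRIDE if letter not in found]
```

An empty result means every category was either probed or explicitly marked N/A. It says nothing about whether the N/A justifications are sound, which still needs a human read.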
57
+
58
+ ## Dimension 5: Consistency
59
+
60
+ - Grep: pattern used consistently elsewhere in the codebase?
61
+ - Layer boundaries respected (no business logic in routes, no DB in controllers)?
62
+ - Health thresholds from overlay met (complexity, coverage)?
63
+
64
+ ## Dimension 6: Findings with severity
65
+
66
+ Format: `RISQUE: X parce que Y — IMPACT: Z`
67
+
68
+ Severity levels:
69
+ - **BLOCKING** — must fix before merge (correctness, security, data loss). Requires specific FIX.
70
+ - **IMPORTANT** — should fix (degraded behavior, tech debt with near-term risk)
71
+ - **MINOR** — nice to fix (style, naming, low-risk improvement)
72
+ - **VALIDATED** — explicitly checked and confirmed correct
73
+
74
+ Every finding: RISQUE format. Every BLOCKING: specific FIX + NOT-X (what solution must NOT do).
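The severity rules above can be expressed as a small validation step. This is a sketch under stated assumptions: the `Finding` class and its field names (`risk`, `cause`, `impact`, `fix`, `not_x`) are hypothetical stand-ins for the X / Y / Z parts of the RISQUE format and the FIX / NOT-X requirements, not a structure defined by Ciel.

```python
from dataclasses import dataclass

SEVERITIES = ("BLOCKING", "IMPORTANT", "MINOR", "VALIDATED")

@dataclass
class Finding:
    severity: str
    risk: str        # X in the RISQUE format
    cause: str       # Y in the RISQUE format
    impact: str      # Z in the RISQUE format
    fix: str = ""    # required for BLOCKING
    not_x: str = ""  # what the fix must NOT do; required for BLOCKING

    def problems(self) -> list[str]:
        """Return rule violations for this finding (empty list means well-formed)."""
        issues = []
        if self.severity not in SEVERITIES:
            issues.append(f"unknown severity: {self.severity}")
        if self.severity == "BLOCKING":
            if not self.fix:
                issues.append("BLOCKING finding has no FIX")
            if not self.not_x:
                issues.append("BLOCKING finding has no NOT-X")
        return issues
```

A BLOCKING finding that fails this check is a candidate for rewording or for demotion to IMPORTANT until a concrete fix can be named.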
75
+
76
+ ## Dimension 7: Close the loop
77
+
78
+ - New anti-pattern found? → add to Guards or project overlay
79
+ - New failure mode? → add Guard immediately
80
+ - Capture learnings for future reference
81
+
82
+ ## Output format
83
+
84
+ ```
85
+ ## AUDIT
86
+
87
+ ### Expected behavior
88
+ <1-2 sentences + bypass signals>
89
+
90
+ ### Assumptions
91
+ 1. <assumption> — verified: <yes/no, evidence>
92
+ 2. ...
93
+ 3. ...
94
+
95
+ ### Scope
96
+ - Nothing-counterfactual: <consequence if no change>
97
+ - Scope proportional: <yes/no, reason>
98
+
99
+ ### Code vs model + STRIDE
100
+ - Code vs model: <matches | deviates at file:line>
101
+ - Bypass signals: <N/3 flagged>
102
+ - STRIDE:
103
+ - S: <N/A because X | RISQUE: ...>
104
+ - T/R/I/D/E: ...
105
+
106
+ ### Consistency
107
+ - Pattern: <grep evidence>
108
+ - Layers: <clean | violation at file:line>
109
+ - Thresholds: <met | violation>
110
+
111
+ ### Findings
112
+ BLOCKING: <RISQUE + FIX + NOT-X>
113
+ IMPORTANT: <RISQUE + FIX/ACCEPT>
114
+ MINOR: <note>
115
+ VALIDATED: <what was verified>
116
+
117
+ ### Learnings
118
+ - New Guard: <yes/no>
119
+ - Overlay update: <yes/no>
120
+ ```
121
+
122
+ ## How to verify
123
+
124
+ - [ ] All 7 dimensions completed (Expected behavior, Assumptions, Scope, Code vs model + STRIDE, Consistency, Findings, Learnings)?
125
+ - [ ] All 6 STRIDE categories present (even if N/A)?
126
+ - [ ] Findings have severity (BLOCKING/IMPORTANT/MINOR)?
127
+ - [ ] VALIDATED section identifies what code got right?
128
+ - [ ] Learnings captured?
129
+
130
+ ## Common mistakes
131
+
132
+ - **Operating from PR description alone**: always read the actual code
133
+ - **Skipping STRIDE categories**: all 6 must be explicit, even if N/A
134
+ - **BLOCKING without FIX**: if you can't name the fix, it's not actionable enough for BLOCKING
135
+ - **No VALIDATED section**: reviews that only report problems miss what the code got right
@@ -0,0 +1,134 @@
1
+ # critiquer-auditor — Reference
2
+
3
+ ## STRIDE — audit probes (7-step audit context)
4
+
5
+ Use these probes when running COMPARER on each STRIDE category. Mark N/A explicitly; never skip.
6
+
7
+ ### S — Spoofing
8
+ - Can I impersonate another user/service in this code path?
9
+ - Identity: client-supplied or server-resolved?
10
+ - WebSocket / SSE / GraphQL subscription: same auth as REST?
11
+
12
+ ### T — Tampering
13
+ - Input modified in transit? HTTPS? Signatures?
14
+ - Idempotency keys present?
15
+ - CSRF protection on state-changing endpoints?
16
+
17
+ ### R — Repudiation
18
+ - Audit log coverage: who, what, when recorded?
19
+ - Log integrity: append-only? remote-shipped?
20
+
21
+ ### I — Information Disclosure
22
+ - Error messages: stack traces? SQL? paths?
23
+ - Logs: PII? secrets?
24
+ - Response bodies: over-fetching? unprojected columns?
25
+ - Side channels: does a 404 vs 403 distinction reveal that a resource exists? Response-time differences?
26
+
27
+ ### D — Denial of Service
28
+ - Rate limiting per IP/user/endpoint?
29
+ - Resource bounds: payload size, query depth, file upload?
30
+ - Algorithmic complexity on user-controlled input?
31
+ - Regex catastrophic backtracking?
32
+
33
+ ### E — Elevation of Privilege
34
+ - Permission check BEFORE action?
35
+ - Horizontal escalation: user A read user B's data?
36
+ - Vertical escalation: mass assignment setting `isAdmin`?
37
+
38
+ ## Severity rubric
39
+
40
+ ### BLOCKING
41
+ - Correctness bug: code produces wrong result for some input
42
+ - Security: any STRIDE finding that an attacker can exploit
43
+ - Data loss: delete/overwrite without backup/confirm
44
+ - Production crash: uncaught exception on common path
45
+
46
+ ### IMPORTANT
47
+ - Degraded behavior: works but slow / intermittent
48
+ - Tech debt with near-term risk: pattern that will break at 2x current load
49
+ - Accessibility violation: keyboard/screen reader broken
50
+ - Test debt: feature ships without meaningful test
51
+
52
+ ### MINOR
53
+ - Naming / style inconsistency
54
+ - Unused import
55
+ - Todo comment for future work
56
+ - Minor DRY violation (< 3 copies)
57
+
58
+ ### VALIDATED
59
+ - Explicit callout of what was checked and confirmed correct
60
+ - Useful because it shows the reviewer's mental map
61
+ - Helps author understand what was covered vs skipped
62
+
63
+ ## Counterfactual analysis
64
+
65
+ Questions to answer in QUESTIONNER step:
66
+
67
+ - What if this change were never merged? What breaks?
68
+ - Is there a 10% subset of this change that would solve 90% of the problem?
69
+ - Is this fixing a symptom or a cause? If symptom: where's the cause?
70
+ - Is this change reversible? If yes, risk is lower.
71
+
72
+ ## Bypass signal checklist (build in APPRENDRE)
73
+
74
+ Common bypass signals to look for per framework:
75
+
76
+ ### React / frontend
77
+ - `window.*` or `document.*` inside components
78
+ - `useEffect` with no dependency array
79
+ - Direct DOM manipulation via `ref.current`
80
+ - `dangerouslySetInnerHTML` with non-sanitized input
81
+
82
+ ### Backend / JVM
83
+ - Raw SQL string concatenation
84
+ - `catch(Exception e) { }` or `catch → null`
85
+ - `as` cast without type guard (Kotlin) or unchecked cast (Java)
86
+ - Thread creation without pool
87
+
88
+ ### Async / concurrent
89
+ - `async` function called without `await`
90
+ - Promise created but not awaited
91
+ - Race conditions on shared state
92
+ - Timeout of 0 or infinite
93
+
94
+ ## Layer boundary violations
95
+
96
+ - Business logic in routes / controllers → should be in services
97
+ - DB calls in controllers → should be behind repository
98
+ - UI logic in models → should be in view layer
99
+ - Tests reaching across layers without mocks
100
+
101
+ ## Overlay thresholds
102
+
103
+ If `ciel-overlay.md` exists and contains a `## Santé du code` (code health) section, check its thresholds:
104
+
105
+ ```
106
+ ### Santé du code
107
+ - Complexité cyclomatique: < 15 par fonction
108
+ - Profondeur d'imbrication: < 4
109
+ - Taille de fonction: < 50 lignes
110
+ - Couverture test: > 80% lignes modifiées
111
+ ```
112
+
113
+ Any violation is an IMPORTANT finding (it can be demoted to MINOR when the threshold is only marginally exceeded).
114
+
115
+ ## Capitalization format
116
+
117
+ When `learnings-capture` is invoked from CAPITALISER:
118
+
119
+ ```
120
+ [YYYY-MM-DD] MISTAKE: <what happened, 1 line>
121
+ → RULE: <how to avoid in future, 1 line>
122
+ → Invoke: <which skill/guard catches this>
123
+ → Evidence: <file:line where it was found>
124
+ ```
125
+
126
+ This format feeds into `ciel-overlay.md` under `## Leçons projet` (project-specific) or `.claude/learnings.md` (general).
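The capitalization entry above is simple enough to generate programmatically. A minimal sketch follows; the function name `format_learning` and its parameters are hypothetical, and enforcing a single line for all four fields is an assumption (the template only states "1 line" for MISTAKE and RULE).

```python
from datetime import date

def format_learning(mistake: str, rule: str, invoke: str, evidence: str, when: date) -> str:
    """Render one learning entry in the [YYYY-MM-DD] MISTAKE / RULE / Invoke / Evidence shape."""
    fields = {"mistake": mistake, "rule": rule, "invoke": invoke, "evidence": evidence}
    for name, value in fields.items():
        # Assumption: every field is kept to one non-empty line.
        if "\n" in value or not value.strip():
            raise ValueError(f"{name} must be a single non-empty line")
    return "\n".join([
        f"[{when.isoformat()}] MISTAKE: {mistake}",
        f"→ RULE: {rule}",
        f"→ Invoke: {invoke}",
        f"→ Evidence: {evidence}",
    ])
```

Passing the date explicitly, rather than calling `date.today()` inside the function, keeps the output reproducible.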
127
+
128
+ ## Anti-patterns in audits
129
+
130
+ - Reviewing without reading the diff first → operating on assumptions
131
+ - STRIDE performed but all 6 "N/A" → didn't actually probe each category
132
+ - Only finding problems (no VALIDATED) → unclear what was checked
133
+ - BLOCKING without FIX → not actionable, author can't resolve
134
+ - Copying PR description into audit → pure theater, no independent thought
@@ -0,0 +1,174 @@
1
+ ---
2
+ name: debug-reasoning-rca
3
+ description: How to debug systematically — hypothesis-driven root cause analysis methodology. 3 parallel hypotheses, fault-type taxonomy (model/context/orchestration/environment), semantic diff between expected and actual behavior. For bugs, incidents, flaky tests, regressions, production failures.
4
+ allowed-tools: Read, Grep, Glob, Bash
5
+ ---
6
+
7
+ # Systematic Debugging — Root Cause Analysis Methodology
8
+
9
+ ## What this covers
10
+
11
+ How to find the real cause of a bug, not just patch the symptom. Default LLM failure: jump to the first plausible fix. Proper debugging is hypothesis-driven (Hunt & Thomas) and catches 75% more recurrences (STRATUS 2025).
12
+
13
+ ## Core principle
14
+
15
+ **Never propose a fix before a hypothesis is SUPPORTED by evidence.** "It might be this, let me fix it" is forbidden.
16
+
17
+ ## Step 1: Gather context
18
+
19
+ Before hypothesizing, understand the failure:
20
+
21
+ - **Read the error literally** — stack trace, log line, exit code. What does the system actually say?
22
+ - **Read the failing code** at the exact `file:line` from the trace
23
+ - **Check recent changes** — `git log -p --since="7 days ago" -- <scope>`. A recent bug usually has a recent cause.
24
+ - **Run the repro** once and capture full output
25
+
26
+ Skip this step = hypotheses based on vibes.
27
+
28
+ ## Step 2: Generate 3 hypotheses
29
+
30
+ Generate EXACTLY 3 **causally distinct** hypotheses. Not 3 variants of the same theory.
31
+
32
+ Format:
33
+ ```
34
+ H<n>: <cause> → <mechanism> → <observable effect>
35
+ Evidence for: <what would be true if correct>
36
+ Evidence against: <what would be true if wrong>
37
+ Fault-type: [MODEL | CONTEXT | ORCHESTRATION | ENVIRONMENT]
38
+ ```
39
+
40
+ ### Fault-type taxonomy
41
+
42
+ | Type | What it means | Example |
43
+ |------|--------------|---------|
44
+ | **MODEL** | Code logic wrong | Off-by-one, wrong algorithm, wrong assumption |
45
+ | **CONTEXT** | Missing/stale input | Wrong config, race window, state leak |
46
+ | **ORCHESTRATION** | Infrastructure misconfigured | Retry/timeout wrong, queue backlog |
47
+ | **ENVIRONMENT** | External change | Dependency drift, OS change, infra outage |
48
+
49
+ **Distribution rule**: hypotheses must span AT LEAST 2 fault-types. Three MODEL hypotheses = tunnel vision.
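The count and distribution rules can be verified before any validation work starts. The sketch below assumes hypotheses are available as `(cause, fault_type)` pairs; the function name `check_hypotheses` is hypothetical. It cannot judge whether the hypotheses are causally distinct, which remains a judgment call.

```python
FAULT_TYPES = {"MODEL", "CONTEXT", "ORCHESTRATION", "ENVIRONMENT"}

def check_hypotheses(hypotheses: list[tuple[str, str]]) -> list[str]:
    """Return rule violations for a list of (cause, fault_type) pairs."""
    problems = []
    if len(hypotheses) != 3:
        problems.append(f"expected exactly 3 hypotheses, got {len(hypotheses)}")
    used = {fault_type for _, fault_type in hypotheses}
    unknown = sorted(used - FAULT_TYPES)
    if unknown:
        problems.append(f"unknown fault-types: {', '.join(unknown)}")
    # Distribution rule: at least 2 distinct fault-types from the taxonomy.
    if len(used & FAULT_TYPES) < 2:
        problems.append("hypotheses must span at least 2 fault-types")
    return problems
```

For example, three MODEL hypotheses are flagged as tunnel vision, while a MODEL / CONTEXT / ENVIRONMENT mix passes.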
50
+
51
+ ## Step 3: Validate (targeted checks)
52
+
53
+ For each hypothesis, run ONE targeted check (a check, not a fix):
54
+
55
+ - MODEL → add a log line or unit test asserting the expected invariant
56
+ - CONTEXT → dump actual input/config at failure point; diff vs expected
57
+ - ORCHESTRATION → check retry count, timeout, queue depth at failure time
58
+ - ENVIRONMENT → `<pkg-mgr> list | grep <dep>` vs lockfile; `uname -a`
59
+
60
+ Record: evidence collected, hypothesis supported/refuted/inconclusive.
61
+
62
+ ## Step 4: Semantic diff
63
+
64
+ Once supported, write the diff between expected and actual:
65
+
66
+ ```
67
+ EXPECTED: <behavior that should happen>
68
+ ACTUAL: <behavior that happens>
69
+ GAP: <precise mechanism>
70
+ ROOT: <why the gap exists — not "because of the bug", the underlying why>
71
+ ```
72
+
73
+ If ROOT reads like "because the code is buggy" — you've only found the symptom. Ask "why" again.
74
+
75
+ ## Step 5: Fix (two layers)
76
+
77
+ - **Direct fix** — address the supported hypothesis (the bug itself)
78
+ - **Systemic fix** — address why the bug was possible (missing test, missing alert, missing type)
79
+
80
+ The systemic fix is what prevents recurrence. Don't skip it on Critical bugs.
81
+
82
+ ## Output format
83
+
84
+ ```
85
+ ## RCA VERDICT
86
+
87
+ ### Symptom
88
+ <1 sentence>
89
+
90
+ ### Repro
91
+ <exact command or "flaky — triggers ~1/N runs">
92
+
93
+ ### Hypotheses explored
94
+ H1 [MODEL]: <cause> — <supported|refuted|inconclusive> — <evidence>
95
+ H2 [CONTEXT]: <cause> — <supported|refuted|inconclusive> — <evidence>
96
+ H3 [ORCHESTRATION]: <cause> — <supported|refuted|inconclusive> — <evidence>
97
+
98
+ ### Root cause
99
+ <hypothesis number>: <cause>
100
+
101
+ ### Semantic diff
102
+ EXPECTED/ACTUAL/GAP/ROOT
103
+
104
+ ### Fix
105
+ - Direct: <exact code change>
106
+ - Systemic: <test/alert/process to add>
107
+
108
+ ### Confidence
109
+ HIGH | MEDIUM | LOW — <why>
110
+ ```
111
+
112
+ ## Auto-inference (before asking the user)
113
+
114
+ Exhaust these sources before flagging input as unknown:
115
+
116
+ - **SYMPTOM** → grep last error in user's prompt; tail service logs; check recent PR descriptions
117
+ - **REPRO** → read `package.json` scripts, `Makefile`, `README.md`, test files, CI workflow
118
+ - **SCOPE** → `git diff HEAD~10 --stat` then rank by overlap with symptom keywords
119
+ - **RECENT_CHANGES** → `git log --since="7 days ago" --oneline -- <scope>`
120
+
121
+ State inferred values as `[ASSUMED from <source>]`. Only flag as `[UNKNOWN]` if truly blocking.
122
+
123
+ ## How to verify
124
+
125
+ - [ ] Exactly 3 causally distinct hypotheses generated (not just 1)?
126
+ - [ ] Each hypothesis has a fault type from the taxonomy, spanning at least 2 fault types overall?
127
+ - [ ] Semantic diff completed (EXPECTED / ACTUAL / GAP / ROOT)?
128
+ - [ ] Root cause identified with evidence (file:line)?
129
+ - [ ] Fix addresses root cause, not symptom?
130
+ - [ ] Confidence level stated (HIGH/MEDIUM/LOW)?
131
+
132
+ ## Anti-patterns
133
+
134
+ - **Patch-the-symptom**: add try/catch without understanding WHY it failed
135
+ - **Fix-the-test**: modify assertion to match wrong behavior instead of fixing code
136
+ - **Guess-and-check**: 5 commits titled "try fix" — no hypothesis discipline
137
+ - **First-hypothesis-wins**: commit first theory without validating alternatives
138
+ - **No repro, no RCA**: chasing intermittent bugs without deterministic repro burns hours
139
+
140
+ ## Structured RCA methods (complementary)
141
+
142
+ The 3-hypothesis method above is the default — fast, hypothesis-driven, good for most bugs. For complex, recurrent, or systemic problems, these structured RCA methods add depth.
143
+
144
+ ### Decision guide
145
+
146
+ | Problem type | Method | Why |
147
+ |-------------|--------|-----|
148
+ | Linear, single-symptom | **3 hypotheses** (default) | Fastest — parallel hypotheses, minimal overhead |
149
+ | Recurrent incident, process failure | **5 Whys** | Iterative questioning reaches systemic root cause |
150
+ | Multi-factor, need exhaustive exploration | **Ishikawa (Fishbone)** | 6M families (Method/Machine/Manpower/Material/Milieu/Measurement) guide complete coverage |
151
+ | Multi-layer, complex system | **Drill Down / Tree Diagram** | Decompose recursively (build → deploy → runtime → data) into atomic sub-causes; visualize as tree |
152
+ | Interacting causes, feedback loops | **Relations Diagram** | Map causal links, count outbound/inbound arrows to find drivers vs effects |
153
+
154
+ **When to use the full sequence**: if the problem involves ≥ 3 interacting factors across distinct system layers, use the full chain: Ishikawa (explore) → Relations Diagram (map interactions) → 5 Whys on each promising node → Tree Diagram (document). For simpler problems, pick one method from the guide.
155
+
156
+ ### 5 Whys
157
+
158
+ Ask "why?" iteratively (5× typical) on the symptom. Each answer becomes the next question. Stop when the cause is systemic/process-level, not technical. **Anti-pattern**: stopping at "error 500" — the real cause may be "no integration test catches this path."
159
+
160
+ ### Ishikawa (Fishbone)
161
+
162
+ Draw a horizontal spine ending at the problem (fish head). Add diagonal bones for 6 families: Method, Machine, Manpower, Material, Milieu, Measurement (adapt to software: Technology, Data/API). Branch sub-causes off each family. **Anti-pattern**: filling every family superficially — depth > breadth.
163
+
164
+ ### Drill Down / Tree Diagram
165
+
166
+ Decompose the problem into 2-4 MECE sub-causes at each level, recursing until atomic (directly fixable). Visualize the result as a hierarchical tree with AND/OR logic per branch. Drill Down and Tree Diagram are two halves of the same analytical process: decomposition (Drill Down) and visualization (Tree Diagram). **Anti-pattern**: stopping at shallow levels. "Module X crashes" isn't actionable; "method Y throws Z when condition W" is.
167
+
168
+ ### Relations Diagram
169
+
170
+ List all discovered factors. For each pair, ask if causation exists and in which direction. Draw arrows. Count outbound (drivers) vs inbound (effects). Nodes with the most outbound arrows are root cause candidates. **Anti-pattern**: connecting everything — if most factors connect to most others, the diagram is not discriminating; focus on clear causal links only.
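The arrow-counting step can be sketched in a few lines. This assumes the causal links have already been written down as `(cause, effect)` pairs; the function name `rank_drivers` and the example factor names are hypothetical.

```python
from collections import Counter

def rank_drivers(edges: list[tuple[str, str]]) -> list[tuple[str, int, int]]:
    """Given (cause, effect) pairs, return (factor, outbound, inbound) rows, drivers first."""
    outbound = Counter(cause for cause, _ in edges)
    inbound = Counter(effect for _, effect in edges)
    factors = set(outbound) | set(inbound)
    # Most outbound arrows first; ties broken by fewest inbound arrows, then by name.
    return sorted(
        ((factor, outbound[factor], inbound[factor]) for factor in factors),
        key=lambda row: (-row[1], row[2], row[0]),
    )

edges = [
    ("no staging env", "untested migration"),
    ("no staging env", "config drift"),
    ("untested migration", "prod outage"),
    ("config drift", "prod outage"),
]
for factor, out_count, in_count in rank_drivers(edges):
    print(f"{factor}: out={out_count} in={in_count}")
```

Factors at the top of the ranking are root cause candidates; factors with only inbound arrows are effects. If nearly every factor has similar counts, the diagram is not discriminating, as the anti-pattern above warns.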
171
+
172
+ ## Key insight
173
+
174
+ The hardest part of debugging is not finding the fix — it's resisting the urge to fix before understanding. The 3-hypothesis discipline forces you to consider alternatives before committing to one.
@@ -0,0 +1,118 @@
1
+ ---
2
+ name: depth-classifier
3
+ description: Classifies a coding task as Trivial, Standard, or Critical based on mechanical signals (auth paths, security code, DB tables, diff size, route handlers). Use at the start of every Ciel workflow to determine which downstream skills to invoke. Returns a one-word depth + rationale + pipeline recommendation.
4
+ allowed-tools: Read, Grep, Glob
5
+ ---
6
+
7
+ # depth-classifier — Classify task depth
8
+
9
+ Gatekeeper skill at the entry of every Ciel workflow. Wrong classification = wrong depth = either waste (over-processing trivial) or risk (under-processing critical).
10
+
11
+ ---
12
+
13
+ ## Inputs
14
+
15
+ - **task**: the task description in natural language (from `/ciel <task>` or user message)
16
+ - **project-root** (optional): absolute path, defaults to CWD
17
+ - **overlay** (optional): `ciel-overlay.md` content if available
18
+
19
+ ---
20
+
21
+ ## Classification signals
22
+
23
+ ### Critical if ANY match:
24
+
25
+ - Path patterns: `auth/`, `security/`, `Token`, `Password`, `Secret`, `Session`, `Crypto`
26
+ - DB table names: `users`, `sessions`, `tokens`, `accounts`, `credentials`, `2fa`, `api_keys`
27
+ - Code patterns: `.executeQuery`, `.executeUpdate`, raw SQL, `userId` (server-provided vs client-provided), `role`, `permission`
28
+ - Task keywords: "authentication", "authorization", "payment", "migration (DB schema)", "JWT", "OAuth", "encryption", "2FA", "session"
29
+ - Scope: touches user data, money, audit trails
30
+
31
+ ### Standard if ANY match (and not Critical):
32
+
33
+ - Path patterns: `routes/`, `controllers/`, `services/`, `components/`, `hooks/`
34
+ - **CI/CD & pipeline files**: `.github/workflows/*.yml`, `.gitlab-ci.yml`, `.circleci/`, `Dockerfile`, `docker-compose*.yml`, `Jenkinsfile`, `.buildkite/`, `.drone.yml`
35
+ - **PR-review signals**:
36
+ - Prompt contains a PR number (`#\d+`, `PR \d+`, `pull request \d+`) OR phrases "open PR", "review PR", "fix PR", "merge PR"
37
+ - Planned tool calls include `gh pr list`, `gh pr view`, `gh pr checks`, `gh pr review`, `gh pr merge` (any variant: `--auto`, `--squash`, `--merge`, `--rebase`)
38
+ - Planned edits touch any CI/CD pipeline file (see the CI/CD bullet above)
39
+ - Diff scope (estimated): > 1 file OR > 50 lines changed
40
+ - Code patterns: `validate`, `sanitize`, `rateLimit`, route handlers, state management
41
+ - Task keywords: "add endpoint", "new component", "refactor", "extract helper", "feature", "integration"
42
+
43
+ **Floor rule**: if ANY PR-review signal OR any CI/CD-file signal is present, depth is **at minimum Standard** — Trivial is disqualified even if the diff is small. PR review plus CI fix is never "just a one-line change".
44
+
45
+ ### Trivial otherwise:
46
+
47
+ - Rename, typo, 1-line fix, copyright update, README edit
48
+ - Single-file localized change ≤ 10 lines
49
+ - No business logic change
50
+
51
+ ### Default rule
52
+
53
+ If unsure → **Standard**. If touching user data or auth → **Critical**.
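The signal lists above lend themselves to a mechanical first pass. The following is a rough sketch, not the skill's actual implementation: the function `classify`, its parameters, and the exact regular expressions are assumptions, and only a subset of the listed signals (paths, task keywords, PR-review phrases, CI/CD files, diff size) is covered. "No business logic change" and the "don't infer from filename alone" guardrail cannot be checked this way and still require reading the proposed change.

```python
import re

CRITICAL_PATH = re.compile(r"auth/|security/|Token|Password|Secret|Session|Crypto")
CRITICAL_KEYWORDS = ("authentication", "authorization", "payment", "jwt",
                     "oauth", "encryption", "2fa", "session")
PR_SIGNAL = re.compile(
    r"#\d+|\bPR \d+|\bpull request \d+|\b(?:open|review|fix|merge) PR\b",
    re.IGNORECASE,
)
CI_FILE = re.compile(
    r"\.github/workflows/.*\.yml$|\.gitlab-ci\.yml$|\.circleci/|Dockerfile$"
    r"|docker-compose.*\.yml$|Jenkinsfile$|\.buildkite/|\.drone\.yml$"
)
STANDARD_PATH = re.compile(r"routes/|controllers/|services/|components/|hooks/")

def classify(task: str, paths: list[str], lines_changed: int) -> str:
    """First-pass depth from the task text, touched paths, and estimated diff size."""
    task_lower = task.lower()
    if any(CRITICAL_PATH.search(p) for p in paths) or any(k in task_lower for k in CRITICAL_KEYWORDS):
        return "Critical"
    # Floor rule: any PR-review or CI/CD-file signal disqualifies Trivial.
    floor = bool(PR_SIGNAL.search(task)) or any(CI_FILE.search(p) for p in paths)
    if floor or any(STANDARD_PATH.search(p) for p in paths) or len(paths) > 1 or lines_changed > 50:
        return "Standard"
    if len(paths) == 1 and lines_changed <= 10:
        return "Trivial"
    return "Standard"  # default rule: unsure -> Standard
```

Checking Critical first and falling through to Standard mirrors the asymmetric bias described in the guardrails: a false Critical costs extra review, while a missed one costs more.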
54
+
55
+ ---
56
+
57
+ ## Pipeline recommendations
58
+
59
+ Return pipeline for each depth:
60
+
61
+ ### Trivial
62
+ `quoi-framer` → `pattern-fitness-check` → `faire-gatekeeper` → `relire-critic` (inline) → push → `meta-critiquer`
63
+
64
+ ### Standard
65
+ `quoi-framer` → `avec-quoi-versioner` → [researcher agent + explorer agent IN PARALLEL] → `evaluer-sizer` → `faire-gatekeeper` → `critic` agent MODE=RELIRE → `prouver-verifier` → `meta-critiquer`
66
+
67
+ ### Critical
68
+ All of Standard + `stride-analyzer` (after `avec-quoi-versioner`) + `security-regression-check` (between FAIRE and RELIRE) + critic agent MANDATORY
69
+
70
+ ---
71
+
72
+ ## Output format
73
+
74
+ ```
75
+ ## DEPTH CLASSIFICATION
76
+
77
+ Depth: **Trivial | Standard | Critical**
78
+
79
+ Signals detected:
80
+ - [signal 1 with source — e.g. "path matches /auth/"]
81
+ - [signal 2]
82
+
83
+ Rationale: [1-2 sentences]
84
+
85
+ Pipeline:
86
+ 1. <skill>
87
+ 2. <skill>
88
+ ...
89
+
90
+ Agents required:
91
+ - [researcher: yes/no]
92
+ - [explorer: yes/no]
93
+ - [critic: yes/no]
94
+ ```
95
+
96
+ ---
97
+
98
+ ## Guardrails
99
+
100
+ - **Asymmetric bias**: when borderline between Trivial/Standard → Standard wins. When borderline between Standard/Critical → Critical wins. Missing a Critical is worse than over-processing a Standard.
101
+ - **Auth/security override**: any mention of auth, credentials, tokens, or user identity → Critical regardless of diff size
102
+ - **Single-line fix can still be Critical**: e.g. a 1-char fix in an auth check is Critical
103
+ - **Don't infer from filename alone**: `UserService.kt` could be Trivial if the change is a rename. Look at the actual code change being proposed.
104
+
105
+ ---
106
+
107
+ ## How to verify
108
+
109
+ - [ ] Classification signals checked (Critical, Standard, Trivial)?
110
+ - [ ] Pipeline recommendation provided?
111
+ - [ ] Default rule applied (Unsure → Standard)?
112
+ - [ ] Auth/security files → Critical?
113
+
114
+ ## When triggered
115
+
116
+ - Automatically at start of `/ciel <task>` via the `ciel` orchestrator
117
+ - By `UserPromptSubmit` hook (light classification hint injected into context)
118
+ - Explicitly when depth is ambiguous after initial assessment
@@ -0,0 +1,91 @@
1
+ ---
2
+ name: diverge
3
+ description: How to explore 2-3 radically different approaches before choosing one (Ciel v5 step 5). Used after AVEC QUOI, before RECHERCHE. Prevents single-approach bias and premature convergence. Use when the task is non-trivial and there are multiple valid approaches.
4
+ ---
5
+
6
+ # Divergent Exploration — 2-3 Approaches Before Choosing (Ciel v5)
7
+
8
+ ## What this covers
9
+
10
+ How to explore multiple approaches before committing to one. In Ciel v5, this is step 5 (DIVERGE). The goal is to avoid premature convergence on the first viable approach that comes to mind.
11
+
12
+ ## Core principle
13
+
14
+ **Generate 2-3 approaches before evaluating any of them.** The first approach that works is rarely the best. In v5, DIVERGE happens after AVEC QUOI (versions checked) and before RECHERCHE (external research).
15
+
16
+ ## When to use
17
+
18
+ - Non-trivial tasks with multiple valid solutions
19
+ - Architectural decisions
20
+ - Library/framework choices
21
+ - Design patterns
22
+ - Database schema design
23
+ - API design
24
+
25
+ **When NOT to use**: 1-line fix, rename, trivial config change, obvious solution.
26
+
27
+ ## The process
28
+
29
+ ### Step 1: Generate at least 2 approaches
30
+
31
+ For each approach, describe:
32
+ - What it does (1-2 sentences)
33
+ - Key trade-offs (not "it's better" -- specific pros/cons)
34
+ - Implementation effort (rough estimate)
35
+ - Risk level (low/medium/high)
36
+
37
+ Approaches should be GENUINELY different. "Use React vs use React with hooks" is the same approach twice. Examples:
38
+ - Bad: "Use PostgreSQL vs MySQL" (trivial database choice)
39
+ - Good: "Use REST vs GraphQL" (genuinely different)
40
+
41
+ ### Step 2: Let them compete (don't decide yet)
42
+
43
+ Generate approaches WITHOUT evaluating them. Evaluation happens in EVALUER (step 9), after RECHERCHE has gathered external data about each approach.
44
+
45
+ Common trap: generating 2 approaches but immediately choosing the first one without research.
46
+
47
+ ### Step 3: Document for EVALUER
48
+
49
+ Pass all approaches (with their trade-offs, effort, risk) to the EVALUER phase. The researcher should check documentation for EVERY approach, not only the favored one.
50
+
51
+ ## Output format
52
+
53
+ ```
54
+ ## DIVERGE
55
+
56
+ ### Approach A: <name>
57
+ What: <1-2 sentences>
58
+ Trade-offs:
59
+ + <pro>
60
+ - <con>
61
+ Effort: <XS/S/M/L/XL>
62
+ Risk: <low/medium/high>
63
+
64
+ ### Approach B: <name>
65
+ What: <1-2 sentences>
66
+ Trade-offs:
67
+ + <pro>
68
+ - <con>
69
+ Effort: <XS/S/M/L/XL>
70
+ Risk: <low/medium/high>
71
+
72
+ ### (Optional) Approach C: <name>
73
+ ...
74
+ ```
75
+
76
+ ## Common rationalizations
77
+
78
+ | Rationalization | Reality |
79
+ |---|---|
80
+ | "I already know the best approach" | You know the first approach that came to mind. That's not the same as the best approach. Generate 2-3 then compare. |
81
+ | "Diverging takes too long" | It takes 5 minutes. Committing to the wrong approach costs days. The math is clear. |
82
+ | "There's only one valid way to do this" | There are almost always 2+ valid approaches. If you can't think of alternatives, you don't understand the problem well enough. |
83
+
84
+ ## How to verify
85
+
86
+ - [ ] >= 2 genuinely different approaches generated?
87
+ - [ ] Approaches are different in kind, not degree?
88
+ - [ ] Trade-offs documented for each?
89
+ - [ ] Effort estimated?
90
+ - [ ] Risk assessed?
91
+ - [ ] Evaluation deferred to next phase (EVALUER)?