@neikyun/ciel 6.3.0 → 6.4.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/assets/.claude/settings.json +1 -1
- package/assets/CLAUDE.md +5 -9
- package/assets/commands/ciel-audit.md +195 -59
- package/assets/commands/ciel-status.md +1 -1
- package/assets/commands/ciel-update.md +4 -0
- package/assets/dist/plugin/index.js +7 -9
- package/assets/platforms/opencode/.opencode/agents/ciel-critic.md +320 -483
- package/assets/platforms/opencode/.opencode/agents/ciel-explorer.md +113 -95
- package/assets/platforms/opencode/.opencode/agents/ciel-improver.md +204 -273
- package/assets/platforms/opencode/.opencode/agents/ciel-researcher.md +259 -270
- package/assets/platforms/opencode/.opencode/agents/ciel.md +1 -1
- package/assets/platforms/opencode/.opencode/commands/ciel-audit.md +300 -10
- package/assets/platforms/opencode/.opencode/commands/ciel-create-skill.md +75 -10
- package/assets/platforms/opencode/.opencode/commands/ciel-eval.md +71 -10
- package/assets/platforms/opencode/.opencode/commands/ciel-improve.md +7 -13
- package/assets/platforms/opencode/.opencode/commands/ciel-init.md +165 -11
- package/assets/platforms/opencode/.opencode/commands/ciel-migrate.md +5 -0
- package/assets/platforms/opencode/.opencode/commands/ciel-refresh.md +89 -13
- package/assets/platforms/opencode/.opencode/commands/ciel-status.md +6 -1
- package/assets/platforms/opencode/.opencode/commands/ciel-update.md +31 -18
- package/assets/platforms/opencode/.opencode/commands/ciel.md +1 -2
- package/assets/platforms/opencode/.opencode/plugins/ciel.ts +146 -0
- package/assets/platforms/opencode/AGENTS.md +3 -3
- package/assets/skills/ciel/SKILL.md +1 -1
- package/dist/plugin/index.d.ts.map +1 -1
- package/dist/plugin/index.js +7 -9
- package/dist/plugin/index.js.map +1 -1
- package/package.json +3 -2
|
@@ -33,7 +33,7 @@ Tu es l'orchestrateur Ciel v6. Analyse, planifie, implemente, verifie.
|
|
|
33
33
|
|
|
34
34
|
## Regles (immutables)
|
|
35
35
|
|
|
36
|
-
1. **
|
|
36
|
+
1. **Pipeline interne** — classifie la profondeur et suis les 16 étapes en interne. N'affiche pas la classification.
|
|
37
37
|
2. **Pipeline** — suis les 16 etapes (tableau ci-dessous). Le plugin injecte un rappel avant chaque message.
|
|
38
38
|
3. **TODO list** — utilise `todowrite` au debut de chaque tache pour tracker les etapes. Marque chaque etape completed/in_progress au fur et a mesure.
|
|
39
39
|
4. **ASK** — utilise `question` tool SEULEMENT si ambigu. Si le contexte est suffisant, DECIDE et avance. Ne demande pas pour chaque etape.
|
|
@@ -1,19 +1,309 @@
|
|
|
1
1
|
---
|
|
2
|
-
|
|
3
|
-
description: Session post-mortem — detect Ciel paradigm violations
|
|
2
|
+
description: ---
|
|
4
3
|
subtask: false
|
|
5
4
|
---
|
|
6
5
|
|
|
7
|
-
|
|
6
|
+
---
|
|
7
|
+
description: Audits the current Claude Code session for Ciel paradigm violations (missed Task dispatches, inline gathering, hook inactivity, skill overlaps, intent routing misses). Produces a structured report with a Ciel Health Score (0-100). If score < 75, creates a GitHub Issue on the Ciel repository with the findings and session timeline. Hook-independent — works even when Ciel hooks are broken.
|
|
8
|
+
---
|
|
8
9
|
|
|
9
|
-
|
|
10
|
+
# /ciel-audit — Session post-mortem
|
|
11
|
+
|
|
12
|
+
*Generates a structured report of Ciel behavior violations observed in the current session. Calculates a Ciel Health Score (0-100). If the score is below 75, creates a GitHub Issue on the Ciel repository (github.com/KaosKyun/Ciel) with the full timeline and findings — otherwise produces the report only without creating an issue.*
|
|
10
13
|
|
|
11
14
|
Usage: `/ciel-audit`
|
|
12
15
|
|
|
13
|
-
|
|
16
|
+
Runs inline in the main session. Does not dispatch agents. Does not depend on hooks being active.
|
|
17
|
+
|
|
18
|
+
---
|
|
19
|
+
|
|
20
|
+
## Instructions to the model
|
|
21
|
+
|
|
22
|
+
You are auditing the **current conversation session** — the one you are participating in right now. Jump directly to the analysis. No preamble. No meta-commentary. No "I will now audit…".
|
|
23
|
+
|
|
24
|
+
### What to audit
|
|
25
|
+
|
|
26
|
+
Scan the session's tool-use history (your own prior turns). For each `/ciel <task>` invocation in this session, check the **eight dimensions** below. For each dimension, assign a severity and penalty score to calculate the final Ciel Health Score.
|
|
27
|
+
|
|
28
|
+
#### Dimension 1: Dispatch discipline (critical) — penalty up to -25
|
|
29
|
+
|
|
30
|
+
- Did the assistant emit a `Task(subagent_type="ciel-*")` within the **first 3 tool calls** after the `/ciel` prompt?
|
|
31
|
+
- If NO: count the inline `Bash` / `Read` / `Grep` / `Glob` / `WebSearch` / `WebFetch` calls emitted in the main session before any dispatch. This is the v2.1.5 anti-pattern documented in `skills/ciel/SKILL.md:146-156`.
|
|
32
|
+
- Exception: Trivial tasks (rename, typo, 1-line fix, docs-only) are allowed to run inline.
|
|
33
|
+
- Severity→score:
|
|
34
|
+
- Critical dispatch violation (Standard/Critical task with no Task()): **-25**
|
|
35
|
+
- High (delayed dispatch, >1 inline tool before dispatch): **-15**
|
|
36
|
+
- OK (Task() within first 3 tool calls): **0**
|
|
37
|
+
|
|
38
|
+
#### Dimension 2: Hook activity (critical) — penalty up to -25
|
|
39
|
+
|
|
40
|
+
Search the session transcript for strings that Ciel hooks would have injected:
|
|
41
|
+
|
|
42
|
+
- `"CIEL depth hint:"` — from `hooks/user-prompt-submit.sh:36`, injected via `additionalContext` on every UserPromptSubmit.
|
|
43
|
+
- `"CIEL "` prefix (e.g., `"CIEL [CRITIQUE]"`, `"CIEL src/...` ) — from `hooks/pre-tool-write.sh:34,36`, injected before every Write/Edit.
|
|
44
|
+
- Session banner from `hooks/session-start.sh` (SessionStart context).
|
|
45
|
+
- `"META-CRITIQUER"` or similar end-of-session signal from `hooks/stop.sh`.
|
|
46
|
+
|
|
47
|
+
- None found despite writes/edits: **-25** → hooks broken
|
|
48
|
+
- Partial (some hooks firing but not all): **-10**
|
|
49
|
+
- All present: **0**
|
|
50
|
+
|
|
51
|
+
Most likely root cause: relative paths in `Ciel/settings.json:8,19,31,42,54,65,76`. The `command` field is written as `bash .claude/plugins/ciel/hooks/<file>.sh`, which Claude Code resolves against the current working directory, not the plugin directory.
|
|
52
|
+
|
|
53
|
+
#### Dimension 3: Skill invocation coverage vs depth — penalty up to -15
|
|
54
|
+
|
|
55
|
+
Cross-reference `skills/ciel/SKILL.md:20-64` (Depth Gauge) with what actually happened:
|
|
56
|
+
|
|
57
|
+
- **Standard** task → `researcher` and `explorer` agents must be dispatched in parallel before FAIRE. Missing both: **-15**. Missing one: **-8**.
|
|
58
|
+
- **Critical** task → must additionally invoke `stride-analyzer` and `security-regression-check`. Missing: **-15**.
|
|
59
|
+
- Depth ambiguous and `depth-classifier` not invoked: **-5**.
|
|
60
|
+
|
|
61
|
+
#### Dimension 4: Skill overlap / redundancy — penalty up to -10
|
|
62
|
+
|
|
63
|
+
- Both `relire-critic` AND `critiquer-auditor` on the same diff: **-10**
|
|
64
|
+
- `meta-critiquer` did not fire at end-of-task: **-5**
|
|
65
|
+
- Same skill invoked 3+ times for same scope: **-5**
|
|
66
|
+
|
|
67
|
+
#### Dimension 5: Agent report quality — penalty up to -5
|
|
68
|
+
|
|
69
|
+
- Any `Task(subagent_type="ciel-*")` with returned message under 200 tokens: **-5** per occurrence (max -10)
|
|
70
|
+
|
|
71
|
+
#### Dimension 6: Intent routing hits / misses — penalty up to -10
|
|
72
|
+
|
|
73
|
+
Scan the user's `/ciel` prompt text against the intent signals in `SKILL.md:79-94`. For each matched intent, verify the corresponding skill was invoked. Each miss: **-5** (max -10).
|
|
74
|
+
|
|
75
|
+
#### Dimension 7: npm version staleness — penalty up to -10
|
|
76
|
+
|
|
77
|
+
Check if a newer version of Ciel is available on npm:
|
|
78
|
+
|
|
79
|
+
```bash
|
|
80
|
+
NPM_VERSION=$(npm view @neikyun/ciel version 2>/dev/null || echo "unknown")
|
|
81
|
+
LOCAL_VERSION=$(cat /path/to/VERSION 2>/dev/null || cat package.json 2>/dev/null | grep '"version"' | head -1 | cut -d'"' -f4 || echo "unknown")
|
|
82
|
+
```
|
|
83
|
+
|
|
84
|
+
- npm version > local version: **-10**
|
|
85
|
+
- Version check failed (npm not installed, no network): **0** (not a violation, just incomplete data)
|
|
86
|
+
- Versions match or local is newer: **0**
|
|
87
|
+
|
|
88
|
+
If npm version > local version, include an **Update notification** in the report:
|
|
89
|
+
> A newer version of Ciel is available on npm: **{npm_version}** (installed: **{local_version}**). Run `ciel-update` or `npm update -g @neikyun/ciel` to upgrade.
|
|
90
|
+
|
|
91
|
+
#### Dimension 8: Platform health — penalty up to -5
|
|
92
|
+
|
|
93
|
+
Check which Ciel platforms are present and whether they have valid configurations:
|
|
94
|
+
|
|
95
|
+
```bash
|
|
96
|
+
ls /path/to/platforms/ 2>/dev/null
|
|
97
|
+
```
|
|
98
|
+
|
|
99
|
+
Expected platforms: codex, cursor, kilocode, lmstudio, ollama, opencode, windsurf
|
|
100
|
+
|
|
101
|
+
- All platforms present: **0**
|
|
102
|
+
- Missing 1-2 platforms: **-3**
|
|
103
|
+
- Missing 3+ platforms: **-5**
|
|
104
|
+
|
|
105
|
+
---
|
|
106
|
+
|
|
107
|
+
### Scoring
|
|
108
|
+
|
|
109
|
+
**Ciel Health Score** = 100 - sum(penalties)
|
|
110
|
+
|
|
111
|
+
| Score range | Status | Issue created? |
|
|
112
|
+
|-------------|--------|----------------|
|
|
113
|
+
| 90-100 | Excellent | No |
|
|
114
|
+
| 75-89 | Good (minor issues) | No |
|
|
115
|
+
| 50-74 | Needs improvement | **Yes** — creates issue with timeline |
|
|
116
|
+
| 0-49 | Critical | **Yes** — creates issue with timeline |
|
|
117
|
+
|
|
118
|
+
The score is calculated automatically from the detected violations.
|
|
119
|
+
|
|
120
|
+
---
|
|
121
|
+
|
|
122
|
+
### Report format
|
|
123
|
+
|
|
124
|
+
Begin the output with the literal line `# Ciel Session Audit Report`. End with the literal line `**End of audit report.**` on its own line.
|
|
125
|
+
|
|
126
|
+
```markdown
|
|
127
|
+
# Ciel Session Audit Report
|
|
128
|
+
|
|
129
|
+
**Date**: <today's date>
|
|
130
|
+
**Ciel Health Score**: <N>/100 — <Excellent|Good|Needs improvement|Critical>
|
|
131
|
+
**npm**: local v<X.Y.Z> | npm v<X.Y.Z> | <up-to-date|update available>
|
|
132
|
+
**Platforms**: codex ✓ cursor ✓ kilo ✓ lmstudio ✓ ollama ✓ opencode ✓ windsurf ✓ (or ✗ if missing)
|
|
133
|
+
**Session summary**: <N> /ciel invocation(s), <N> total tool calls, <N> Task() dispatches, <N> inline Bash/Read/Grep/WebSearch calls in main session.
|
|
134
|
+
|
|
135
|
+
**Verdict**: <PASS | VIOLATIONS FOUND>
|
|
136
|
+
|
|
137
|
+
---
|
|
138
|
+
|
|
139
|
+
## Violations detected
|
|
140
|
+
|
|
141
|
+
### <N>. <Short violation name> — penalty: -<N>
|
|
142
|
+
|
|
143
|
+
**Severity**: <critical | high | medium | low>
|
|
144
|
+
**Evidence from session**
|
|
145
|
+
- User prompt (turn N): `<exact prompt text, truncated to 200 chars>`
|
|
146
|
+
- Assistant's first 3 tool calls after this prompt:
|
|
147
|
+
1. `<tool_name>(<brief args>)`
|
|
148
|
+
2. `<tool_name>(<brief args>)`
|
|
149
|
+
3. `<tool_name>(<brief args>)`
|
|
150
|
+
- Expected: `Task(subagent_type="ciel-<role>", ...)` as first tool call
|
|
151
|
+
- Observed: `<actual behavior>`
|
|
152
|
+
|
|
153
|
+
**Root cause hypothesis**
|
|
154
|
+
<one or two sentences>
|
|
155
|
+
|
|
156
|
+
**Proposed fix**
|
|
157
|
+
- <concrete edit with file:line>
|
|
158
|
+
|
|
159
|
+
...
|
|
160
|
+
|
|
161
|
+
---
|
|
162
|
+
|
|
163
|
+
## Update notification
|
|
164
|
+
|
|
165
|
+
<If npm version > local version, show update notification here. Otherwise omit this section.>
|
|
166
|
+
|
|
167
|
+
---
|
|
168
|
+
|
|
169
|
+
## Scoring breakdown
|
|
170
|
+
|
|
171
|
+
| Dimension | Penalty |
|
|
172
|
+
|-----------|---------|
|
|
173
|
+
| D1 — Dispatch discipline | -<N> |
|
|
174
|
+
| D2 — Hook activity | -<N> |
|
|
175
|
+
| D3 — Skill coverage vs depth | -<N> |
|
|
176
|
+
| D4 — Skill overlap | -<N> |
|
|
177
|
+
| D5 — Agent report quality | -<N> |
|
|
178
|
+
| D6 — Intent routing | -<N> |
|
|
179
|
+
| D7 — npm version | -<N> |
|
|
180
|
+
| D8 — Platform health | -<N> |
|
|
181
|
+
| **Total** | **-<N>** |
|
|
182
|
+
| **Health Score** | **<N>/100** |
|
|
183
|
+
|
|
184
|
+
---
|
|
185
|
+
|
|
186
|
+
## Summary of fixes to apply
|
|
187
|
+
|
|
188
|
+
Apply these in order. Each points to a file:line and a concrete change.
|
|
189
|
+
|
|
190
|
+
1. **<Fix name>** — `<path>:<line>` — <one-line description>
|
|
191
|
+
2. ...
|
|
192
|
+
|
|
193
|
+
After each fix: re-run the scenario that triggered the violation in a fresh Claude session to verify.
|
|
194
|
+
|
|
195
|
+
---
|
|
196
|
+
|
|
197
|
+
**End of audit report.**
|
|
198
|
+
```
|
|
199
|
+
|
|
200
|
+
### If no violations are found (score = 100)
|
|
201
|
+
|
|
202
|
+
Output a single short section — **no issue is created** for PASS verdicts:
|
|
203
|
+
|
|
204
|
+
```markdown
|
|
205
|
+
# Ciel Session Audit Report
|
|
206
|
+
|
|
207
|
+
**Date**: <today's date>
|
|
208
|
+
**Ciel Health Score**: 100/100 — Excellent
|
|
209
|
+
**npm**: local v<X.Y.Z> | npm v<X.Y.Z> | up-to-date
|
|
210
|
+
**Platforms**: all 7 platforms present ✓
|
|
211
|
+
**Session summary**: <N> /ciel invocation(s), <N> tool calls, <N> Task() dispatches.
|
|
212
|
+
**Verdict**: PASS
|
|
213
|
+
|
|
214
|
+
**No violations found.** No issue created.
|
|
215
|
+
|
|
216
|
+
**End of audit report.**
|
|
217
|
+
```
|
|
218
|
+
|
|
219
|
+
---
|
|
220
|
+
|
|
221
|
+
### GitHub Issue creation (only if score < 75)
|
|
222
|
+
|
|
223
|
+
If the Ciel Health Score is **below 75**, create a GitHub Issue with the report AND the session timeline.
|
|
224
|
+
|
|
225
|
+
**Important**: Do NOT create an issue if score >= 75. Only create for scores < 75.
|
|
226
|
+
|
|
227
|
+
1. **Check for duplicate issues first**:
|
|
228
|
+
```bash
|
|
229
|
+
EXISTING=$(gh issue list --repo KaosKyun/Ciel --label audit --state open \
|
|
230
|
+
--json title --jq '.[].title' 2>/dev/null | \
|
|
231
|
+
grep -c "^\[CIEL-AUDIT\] $(date +%Y-%m-%d) -" || true)
|
|
232
|
+
if [ "$EXISTING" -gt 0 ]; then
|
|
233
|
+
echo "Warning: today's audit issue already exists. Skipping creation."
|
|
234
|
+
echo "View existing at: https://github.com/KaosKyun/Ciel/issues?q=is%3Aopen+label%3Aaudit"
|
|
235
|
+
exit 0
|
|
236
|
+
fi
|
|
237
|
+
```
|
|
238
|
+
|
|
239
|
+
2. **Save the report to a temp file**: Write the complete report text to a temp file:
|
|
240
|
+
```bash
|
|
241
|
+
REPORT_FILE="/tmp/ciel-audit-report-$(date +%Y%m%d).md"
|
|
242
|
+
cat > "$REPORT_FILE" << 'REPORT_EOF'
|
|
243
|
+
# Ciel Session Audit Report
|
|
244
|
+
...
|
|
245
|
+
**End of audit report.**
|
|
246
|
+
REPORT_EOF
|
|
247
|
+
```
|
|
248
|
+
|
|
249
|
+
3. **Determine the repository**: run `gh repo view --json owner,name` to confirm, or default to `KaosKyun/Ciel`.
|
|
250
|
+
|
|
251
|
+
4. **Create the issue** with `gh issue create`, using Python to avoid shell quoting issues:
|
|
252
|
+
```bash
|
|
253
|
+
python3 -c "
|
|
254
|
+
import subprocess, sys
|
|
255
|
+
ymd, date_str, score, verdict = sys.argv[1:5]
|
|
256
|
+
report_path = f'/tmp/ciel-audit-report-{ymd}.md'
|
|
257
|
+
with open(report_path) as f:
|
|
258
|
+
body = f.read()
|
|
259
|
+
subprocess.run(['gh', 'issue', 'create',
|
|
260
|
+
'--repo', 'KaosKyun/Ciel',
|
|
261
|
+
'--title', f'[CIEL-AUDIT] {date_str} - {verdict} (Score: {score}/100)',
|
|
262
|
+
'--label', 'audit,ciel',
|
|
263
|
+
'--body', body])
|
|
264
|
+
" "$(date +%Y%m%d)" "$(date +%Y-%m-%d)" "<score>" "<verdict>"
|
|
265
|
+
```
|
|
266
|
+
|
|
267
|
+
5. **Include the session timeline** in the issue body. Before the report, add a timeline section listing all `/ciel` invocations in chronological order:
|
|
268
|
+
```markdown
|
|
269
|
+
## Session Timeline
|
|
270
|
+
|
|
271
|
+
| Time | Event |
|
|
272
|
+
|------|-------|
|
|
273
|
+
| T+N | /ciel <prompt excerpt> |
|
|
274
|
+
| T+N | Write/Edit on <file> |
|
|
275
|
+
| T+N | Task() dispatch to <agent> |
|
|
276
|
+
...
|
|
277
|
+
|
|
278
|
+
---
|
|
279
|
+
|
|
280
|
+
<full audit report>
|
|
281
|
+
```
|
|
282
|
+
|
|
283
|
+
6. **Handle errors gracefully**:
|
|
284
|
+
- If `gh` is not available: print a message telling the user to install (`brew install gh`) and print the report to stdout instead.
|
|
285
|
+
- If the repository can't be reached: skip issue creation, print a warning, but still output the report.
|
|
286
|
+
|
|
287
|
+
7. **After creating the issue**: include the issue URL at the end of your response so the user can open it directly.
|
|
288
|
+
|
|
289
|
+
**Title format**: `[CIEL-AUDIT] YYYY-MM-DD - VIOLATIONS FOUND (Score: <N>/100)`
|
|
290
|
+
|
|
291
|
+
**Labels**: `audit`, `ciel`
|
|
292
|
+
|
|
293
|
+
---
|
|
294
|
+
|
|
295
|
+
### Tone and length
|
|
296
|
+
|
|
297
|
+
- **Mechanically actionable, not narrative**. File paths, line numbers, before/after snippets.
|
|
298
|
+
- Target length: 400–800 lines of markdown for a session with 2–3 violations. Shorter if PASS.
|
|
299
|
+
- No rhetorical preamble. No meta-commentary about the audit process itself. No emoji. No closing remarks after the `**End of audit report.**` marker.
|
|
300
|
+
|
|
301
|
+
### What NOT to do
|
|
14
302
|
|
|
15
|
-
|
|
16
|
-
|
|
17
|
-
|
|
18
|
-
|
|
19
|
-
|
|
303
|
+
- Do NOT fix violations in the current session. Only produce the report and optionally create the issue.
|
|
304
|
+
- Do NOT invoke other Ciel skills. This command is fully self-contained.
|
|
305
|
+
- Do NOT dispatch `Task()` agents. Audit happens inline.
|
|
306
|
+
- Do NOT ask clarifying questions. Produce the report with the information you have.
|
|
307
|
+
- Do NOT create an issue if score >= 75. Only create for score < 75.
|
|
308
|
+
- Do NOT create duplicate issues — run the `gh issue list` check before creating.
|
|
309
|
+
- Do NOT restart, rerun, or attempt to fix the session in-flight. The audit report is the deliverable.
|
|
@@ -1,20 +1,85 @@
|
|
|
1
1
|
---
|
|
2
|
-
|
|
3
|
-
description: Generate a new Ciel SKILL.md scaffold
|
|
2
|
+
description: ---
|
|
4
3
|
agent: ciel-improver
|
|
5
4
|
subtask: true
|
|
6
5
|
---
|
|
7
6
|
|
|
8
|
-
|
|
7
|
+
> **OpenCode note**: This command requires `claude --print` headless mode for full functionality (binary evals, skill scaffold generation). On OpenCode it runs in degraded mode — the improver agent returns proposals only. For the full harness, use Claude Code.
|
|
9
8
|
|
|
10
|
-
|
|
9
|
+
---
|
|
10
|
+
description: Generates a valid Ciel SKILL.md scaffold following Anthropic Skills-first rules (kebab-case ≤64, YAML description ≤1024, body ≤500 lines).
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
# /ciel-create-skill — Create a new Ciel skill
|
|
14
|
+
|
|
15
|
+
*Generates a valid SKILL.md scaffold following Anthropic Skills-first rules (kebab-case name ≤64 chars, YAML frontmatter ≤1024-char description, ≤500-line body, progressive disclosure to one reference.md).*
|
|
11
16
|
|
|
12
17
|
Usage: `/ciel-create-skill <name> <purpose>`
|
|
13
18
|
|
|
14
|
-
|
|
19
|
+
- `name` — kebab-case, max 64 chars, unique across `skills/`
|
|
20
|
+
- `purpose` — one-line description of what the skill does
|
|
21
|
+
|
|
22
|
+
---
|
|
23
|
+
|
|
24
|
+
## What it does
|
|
25
|
+
|
|
26
|
+
1. Dispatches the `improver` agent in MODE=CREATE-SKILL
|
|
27
|
+
2. The agent invokes `skill-creator` skill
|
|
28
|
+
3. Validates name, category (you pick), description length, uniqueness, paths glob
|
|
29
|
+
4. Generates a scaffold SKILL.md + optional reference.md
|
|
30
|
+
5. Returns the proposed files for user review
|
|
31
|
+
6. On approval, writes files + registers in `skills/ciel/reference.md` catalog
|
|
32
|
+
|
|
33
|
+
---
|
|
34
|
+
|
|
35
|
+
## Example
|
|
36
|
+
|
|
37
|
+
```
|
|
38
|
+
/ciel-create-skill kotlin-coroutines-mastery "Expert in Kotlin coroutines: structured concurrency, Flow operators, cancellation, testing. Use when working with suspend functions, CoroutineScope, Flow, or coroutine builders."
|
|
39
|
+
```
|
|
40
|
+
|
|
41
|
+
Expected output:
|
|
42
|
+
1. Validation: ✓ name valid, unique, no reserved words
|
|
43
|
+
2. Category suggestion: `domain` (based on "Expert in X" pattern)
|
|
44
|
+
3. Preview of `skills/domain/kotlin-coroutines-mastery/SKILL.md` (~150 lines)
|
|
45
|
+
4. Preview of `skills/domain/kotlin-coroutines-mastery/reference.md` (~300 lines) with Flow operators cheatsheet
|
|
46
|
+
5. Catalog entry: `| kotlin-coroutines-mastery | Kotlin files with suspend/Flow |`
|
|
47
|
+
6. Approve? [y/n/edit]
|
|
48
|
+
|
|
49
|
+
---
|
|
50
|
+
|
|
51
|
+
## Naming rules
|
|
52
|
+
|
|
53
|
+
- kebab-case only (no underscores, no camelCase)
|
|
54
|
+
- Max 64 chars
|
|
55
|
+
- No reserved words: `anthropic`, `claude`, `mcp`
|
|
56
|
+
- Must be unique across `skills/**/SKILL.md`
|
|
57
|
+
- Warn if starts with category name (e.g. `workflow-foo` in `workflow/` is redundant)
|
|
58
|
+
|
|
59
|
+
---
|
|
60
|
+
|
|
61
|
+
## Category decision
|
|
62
|
+
|
|
63
|
+
- `workflow` — enforces a CRÉER/CRITIQUER/META-CRITIQUER step
|
|
64
|
+
- `research` — finds information outside the codebase
|
|
65
|
+
- `domain` — encodes expertise in a specific tech or pattern family
|
|
66
|
+
- `utility` — wraps a frequent mechanical operation
|
|
67
|
+
- `meta` — modifies Ciel itself
|
|
68
|
+
|
|
69
|
+
---
|
|
70
|
+
|
|
71
|
+
## Guardrails
|
|
72
|
+
|
|
73
|
+
- **Max 1 new skill per invocation** — prevents skill explosion
|
|
74
|
+
- **SKILL.md size**: ≤ 300 lines (hard cap)
|
|
75
|
+
- **reference.md size**: ≤ 500 lines (hard cap)
|
|
76
|
+
- **Duplication detection**: warns if description overlaps ≥ 70% with existing skill
|
|
77
|
+
- **Never creates**: files in `.claude-plugin/`, agents, hooks, commands — those use different patterns
|
|
78
|
+
|
|
79
|
+
---
|
|
80
|
+
|
|
81
|
+
## After creation
|
|
15
82
|
|
|
16
|
-
1.
|
|
17
|
-
2.
|
|
18
|
-
3.
|
|
19
|
-
4. Returns proposed skill for user approval
|
|
20
|
-
5. Never writes autonomously
|
|
83
|
+
1. Test the skill: `/ciel-eval <name>` (runs baseline eval if dataset exists)
|
|
84
|
+
2. If it's a `workflow` skill, update `skills/ciel/SKILL.md` pipeline sections to reference it
|
|
85
|
+
3. Commit with message `feat(ciel): add <category>/<name> skill`
|
|
@@ -1,27 +1,88 @@
|
|
|
1
1
|
---
|
|
2
|
-
|
|
3
|
-
description: Run binary eval harness for Ciel skills
|
|
2
|
+
description: ---
|
|
4
3
|
agent: ciel-improver
|
|
5
4
|
subtask: true
|
|
6
5
|
---
|
|
7
6
|
|
|
7
|
+
> **OpenCode note**: This command requires `claude --print` headless mode for full functionality (binary evals, skill scaffold generation). On OpenCode it runs in degraded mode — the improver agent returns proposals only. For the full harness, use Claude Code.
|
|
8
|
+
|
|
9
|
+
---
|
|
10
|
+
description: Runs the binary eval dataset for one or all Ciel skills via claude --print headless, comparing variants.
|
|
11
|
+
---
|
|
12
|
+
|
|
8
13
|
# /ciel-eval — Run eval harness
|
|
9
14
|
|
|
10
|
-
Runs the binary eval dataset for one skill (or all skills), comparing variants and persisting scoreboards
|
|
15
|
+
*Runs the binary eval dataset for one skill (or all skills) via `claude --print` headless mode, comparing variants and persisting scoreboards.*
|
|
11
16
|
|
|
12
17
|
Usage: `/ciel-eval [skill-name]`
|
|
13
18
|
|
|
14
19
|
- No arg → runs baseline eval for ALL skills that have a dataset
|
|
15
20
|
- `skill-name` → runs eval for that skill only (baseline + any variants in `variants/`)
|
|
16
21
|
|
|
17
|
-
|
|
22
|
+
---
|
|
23
|
+
|
|
24
|
+
## What it does
|
|
18
25
|
|
|
19
26
|
1. Dispatches the `improver` agent in MODE=EVAL
|
|
20
27
|
2. The agent invokes `skill-variant-evaluator` skill
|
|
21
28
|
3. For each target skill:
|
|
22
|
-
- Loads dataset from `evals/datasets/<skill>.jsonl`
|
|
23
|
-
-
|
|
24
|
-
-
|
|
25
|
-
|
|
26
|
-
|
|
27
|
-
|
|
29
|
+
- Loads dataset from `evals/datasets/<skill-name>.jsonl`
|
|
30
|
+
- If variants exist (`skills/<category>/<name>/variants/*.md`), evaluates each
|
|
31
|
+
- Otherwise, baseline-only scoring
|
|
32
|
+
4. Persists results to `evals/results/<skill-name>-<timestamp>.json`
|
|
33
|
+
5. Returns a scoreboard
|
|
34
|
+
|
|
35
|
+
---
|
|
36
|
+
|
|
37
|
+
## Example output
|
|
38
|
+
|
|
39
|
+
```
|
|
40
|
+
# Eval results — flux-narrator
|
|
41
|
+
|
|
42
|
+
Dataset: evals/datasets/flux-narration.jsonl (8 entries)
|
|
43
|
+
Variants evaluated: 3
|
|
44
|
+
|
|
45
|
+
| Variant | Score | Tokens | Duration |
|
|
46
|
+
|---------|-------|--------|----------|
|
|
47
|
+
| A (baseline) | 0.72 | 14.5k | 12.5s |
|
|
48
|
+
| B (tightened) | 0.89 | 15.1k | 13.2s ← WINNER |
|
|
49
|
+
| C (reduced) | 0.81 | 12.8k | 11.7s |
|
|
50
|
+
|
|
51
|
+
Winner: **Variant B** (+0.17 over baseline)
|
|
52
|
+
|
|
53
|
+
Recommendation: adopt Variant B via `/ciel-improve`
|
|
54
|
+
|
|
55
|
+
Result persisted: evals/results/flux-narrator-2026-04-17-142345.json
|
|
56
|
+
```
|
|
57
|
+
|
|
58
|
+
---
|
|
59
|
+
|
|
60
|
+
## Guardrails
|
|
61
|
+
|
|
62
|
+
- **Max 20 eval entries per dataset** — prevents runaway costs
|
|
63
|
+
- **Max 3 variants per skill** — A/B/C, more is noise
|
|
64
|
+
- **Cost estimation** — displayed before starting; user can abort if > 500k tokens projected
|
|
65
|
+
- **Read-only evals** — `--disallowed-tools "Write Edit NotebookEdit"` to prevent side effects
|
|
66
|
+
- **Graceful fallback** — if `claude --print` is unavailable, outputs manual eval prompts and waits for user input
|
|
67
|
+
|
|
68
|
+
---
|
|
69
|
+
|
|
70
|
+
## When to run
|
|
71
|
+
|
|
72
|
+
- Before and after every `/ciel-improve` pass (baseline vs improved)
|
|
73
|
+
- On every CHANGELOG version bump — regression sanity check
|
|
74
|
+
- When creating a new skill — establish baseline score
|
|
75
|
+
- When adding new eval criteria — re-score existing skills
|
|
76
|
+
|
|
77
|
+
---
|
|
78
|
+
|
|
79
|
+
## Eval dataset format
|
|
80
|
+
|
|
81
|
+
One JSON object per line (JSONL). See `evals/datasets/depth-classification.jsonl` for examples. Required fields:
|
|
82
|
+
|
|
83
|
+
- `id` — unique in dataset
|
|
84
|
+
- `input` — prompt / context passed to skill
|
|
85
|
+
- `expected_behavior` — object mapping criterion name → boolean
|
|
86
|
+
- `skill` — which skill this tests
|
|
87
|
+
|
|
88
|
+
See `skills/meta/skill-variant-evaluator/reference.md` for the full schema and supported criterion evaluators.
|
|
@@ -1,19 +1,13 @@
|
|
|
1
1
|
---
|
|
2
|
-
|
|
3
|
-
description: Self-improvement — analyze sessions and propose skill patches
|
|
4
|
-
agent: ciel-improver
|
|
5
|
-
subtask: true
|
|
2
|
+
description: Slash trigger for the ciel-improve skill on OpenCode.
|
|
6
3
|
---
|
|
7
4
|
|
|
8
|
-
|
|
5
|
+
Invoke the `ciel-improve` skill via the Skill tool with the user's arguments:
|
|
9
6
|
|
|
10
|
-
|
|
7
|
+
```
|
|
8
|
+
$ARGUMENTS
|
|
9
|
+
```
|
|
11
10
|
|
|
12
|
-
|
|
11
|
+
If `$ARGUMENTS` is empty, invoke the skill with no argument — it will classify the current context and prompt for a task if needed.
|
|
13
12
|
|
|
14
|
-
|
|
15
|
-
|
|
16
|
-
1. Scans `.ciel/learnings.md` for recent user corrections
|
|
17
|
-
2. Analyzes session transcripts for failure patterns
|
|
18
|
-
3. Produces a patch-set with specific skill rewrites
|
|
19
|
-
4. Never rewrites autonomously — returns proposals for review
|
|
13
|
+
The full logic (depth classifier, intent routing, pipeline selection, agent dispatch rules) lives in the `ciel-improve` skill itself. This command file is a thin trigger; modify the skill, not this file, to change behavior.
|