@event4u/agent-config 5.5.0 → 5.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (64) hide show
  1. package/.agent-src/commands/image/analyse.md +51 -0
  2. package/.agent-src/commands/image/create.md +53 -0
  3. package/.agent-src/commands/image/verify.md +48 -0
  4. package/.agent-src/commands/image.md +69 -0
  5. package/.agent-src/commands/video/from-song.md +40 -6
  6. package/.agent-src/contexts/authority/commit-mechanics.md +8 -0
  7. package/.agent-src/rules/commit-policy.md +3 -8
  8. package/.agent-src/rules/media-sync-ground-truth.md +58 -0
  9. package/.agent-src/skills/image-analyser/SKILL.md +121 -0
  10. package/.agent-src/skills/image-analyser/canon-spec.md +109 -0
  11. package/.agent-src/skills/image-analyser/evals/triggers.json +16 -0
  12. package/.agent-src/skills/image-creator/SKILL.md +117 -0
  13. package/.agent-src/skills/image-creator/evals/triggers.json +16 -0
  14. package/.agent-src/skills/song-to-script/SKILL.md +36 -13
  15. package/.claude-plugin/marketplace.json +7 -1
  16. package/CHANGELOG.md +47 -0
  17. package/README.md +2 -2
  18. package/config/agent-settings.template.yml +18 -0
  19. package/dist/discovery/deprecation-report.md +1 -1
  20. package/dist/discovery/discovery-manifest.json +171 -18
  21. package/dist/discovery/discovery-manifest.json.sha256 +1 -1
  22. package/dist/discovery/discovery-manifest.summary.md +4 -4
  23. package/dist/discovery/orphan-report.md +1 -1
  24. package/dist/discovery/packs.json +15 -8
  25. package/dist/discovery/trust-report.md +3 -3
  26. package/dist/discovery/workspaces.json +13 -6
  27. package/dist/mcp/registry-manifest.json +3 -3
  28. package/dist/router.json +1 -1
  29. package/dist/server/schemas/settings.js +4 -0
  30. package/dist/server/schemas/settings.js.map +1 -1
  31. package/docs/architecture.md +3 -3
  32. package/docs/catalog.md +20 -6
  33. package/docs/contracts/benchmark-report-schema.md +12 -10
  34. package/docs/contracts/command-clusters.md +1 -0
  35. package/docs/contracts/rule-router.md +39 -0
  36. package/docs/contracts/value-dashboard-spec.md +7 -3
  37. package/docs/contracts/value-report-schema.md +6 -1
  38. package/docs/getting-started.md +2 -2
  39. package/docs/value.md +17 -17
  40. package/package.json +1 -1
  41. package/scripts/__pycache__/validate_frontmatter.cpython-312.pyc +0 -0
  42. package/scripts/_lib/__pycache__/__init__.cpython-312.pyc +0 -0
  43. package/scripts/_lib/__pycache__/agent_src.cpython-312.pyc +0 -0
  44. package/scripts/_lib/bench_report.py +13 -14
  45. package/scripts/_lib/bench_telegraph_report.py +1 -2
  46. package/scripts/_lib/token_count.py +95 -0
  47. package/scripts/_lib/value_report.py +3 -3
  48. package/scripts/ai-video/adapters/higgsfield.sh +163 -6
  49. package/scripts/ai-video/adapters/openai-images.sh +92 -6
  50. package/scripts/audit_auto_rules.py +22 -6
  51. package/scripts/audit_command_surface.py +6 -1
  52. package/scripts/audit_initial_context.py +210 -0
  53. package/scripts/bench_ab_diff.py +4 -11
  54. package/scripts/bench_run.py +2 -3
  55. package/scripts/bench_runner.py +2 -2
  56. package/scripts/condense.py +44 -3
  57. package/scripts/iron_law_sha.py +14 -5
  58. package/scripts/measure_rule_budget.py +15 -0
  59. package/scripts/project_thin_rules.py +168 -0
  60. package/scripts/render_value_md.py +14 -23
  61. package/scripts/schemas/command.schema.json +1 -1
  62. package/scripts/schemas/rule.schema.json +1 -1
  63. package/scripts/schemas/skill.schema.json +2 -2
  64. package/scripts/trigger_coverage.py +129 -0
@@ -0,0 +1,51 @@
1
+ ---
2
+ model_tier: high
3
+ name: image:analyse
4
+ tier: 2
5
+ cluster: image
6
+ sub: analyse
7
+ description: Analyse a character image down to the smallest mole and diff it against a canon — per-feature spec, OCR tattoo text, severity-ranked drift report.
8
+ personas: [hollywood-director]
9
+ skills: [image-analyser]
10
+ suggestion:
11
+ eligible: true
12
+ trigger_description: "analyse a character image, check character accuracy, does this render match the canon, find what drifted"
13
+ trigger_context: "user supplies an image path/URL (and optionally a character id) and wants a detailed feature extraction or canon diff"
14
+ workspaces:
15
+ - agent-config-maintainer
16
+ packs:
17
+ - meta
18
+ ---
19
+
20
+ # /image:analyse
21
+
22
+ Run the [`image-analyser`](../../skills/image-analyser/SKILL.md) skill on an
23
+ image. Args: `<path-or-url>` (required) `[character-id]` (optional — the canon
24
+ to diff against, e.g. `veikko`).
25
+
26
+ ## Steps
27
+
28
+ 1. **Resolve the image** — accept a path or public URL. Apply the input gate
29
+ (refuse blurry / sub-resolution / unreadable; per the `image-ocr` contract).
30
+ 2. **Governance check** — real-person likeness → route through
31
+ [`media-governance-routing`](../rules/media-governance-routing.md) first.
32
+ 3. **Run `image-analyser`** — section-by-section extraction (the "down to the
33
+ smallest mole" pass), OCR sub-pass for lettered tattoos, hard-feature
34
+ enhancement on low-confidence regions. Emit the Layer-2 observation
35
+ (per-feature `confidence` + `unverifiable[]`).
36
+ 4. **If `[character-id]` is given** — diff against
37
+ `agents/reference/ai-video/<project>/characters/<id>.json` per the rubric in
38
+ [`canon-spec.md`](../../skills/image-analyser/canon-spec.md): per-feature
39
+ `match|partial|miss`, the canon-breaking hard gate, per-section scores.
40
+
41
+ ## Output
42
+
43
+ 1. Observation JSON (Layer 2).
44
+ 2. Diff table (if a canon was given): `feature · severity · expected · observed · verdict · confidence · fix`.
45
+ 3. Verdict line: `GATE: pass|FAIL` + per-section scores.
46
+
47
+ ## Rules
48
+
49
+ - **Do NOT commit, push, or open a PR.**
50
+ - **The image wins over the text** — never invent an unseen feature; mark it `unverifiable`.
51
+ - **Read-only** — analysis only; generation is `/image:create`.
@@ -0,0 +1,53 @@
1
+ ---
2
+ model_tier: high
3
+ name: image:create
4
+ tier: 2
5
+ cluster: image
6
+ sub: create
7
+ description: Generate a character image to spec — assemble a max-fidelity, anchors-first prompt from a Canon Spec; governance- and provider-gated, dry-run by default.
8
+ personas: [hollywood-director]
9
+ skills: [image-creator, character-consistency]
10
+ suggestion:
11
+ eligible: true
12
+ trigger_description: "generate this character, render to spec, create the image, make every feature match the canon"
13
+ trigger_context: "user supplies a character id (and a scene brief) and wants a maximally-accurate generation prompt or render"
14
+ workspaces:
15
+ - agent-config-maintainer
16
+ packs:
17
+ - meta
18
+ ---
19
+
20
+ # /image:create
21
+
22
+ Run the [`image-creator`](../../skills/image-creator/SKILL.md) skill. Args:
23
+ `<character-id>` (required) `"<scene>"` (setting + pose). Optional: a prior
24
+ `/image:analyse` diff to fold in (loop mode).
25
+
26
+ ## Steps
27
+
28
+ 1. **Governance gate FIRST** — real-person likeness → route through
29
+ [`media-governance-routing`](../rules/media-governance-routing.md) +
30
+ `agents/settings/policies/media/` before emitting anything.
31
+ 2. **Provider gate** — read the resolved provider's tier; non-stable
32
+ (experimental/deprecated/community) → surface the tier and **ask** before
33
+ any live call (per
34
+ [`provider-lifecycle-discipline`](../rules/provider-lifecycle-discipline.md)).
35
+ `AIV_DRYRUN=true` is the default.
36
+ 3. **Assemble the prompt, anchors first** — load the Canon Spec; front-load the
37
+ hard-to-render `identity_anchors` (heterochromia, hair-split), then physique,
38
+ face + marks, per-location tattoos (exact `text`), outfit, jewelry; add the
39
+ asymmetry block + negative block + engine settings.
40
+ 4. **Generate** through the existing adapter layer (`scripts/ai-video/adapters/`)
41
+ only on explicit confirmation. **Verify** the output with `/image:verify`.
42
+
43
+ ## Output
44
+
45
+ 1. Generation prompt — anchors · positive · asymmetry · negative · engine settings.
46
+ 2. Provider + tier line (the audit entry).
47
+ 3. The `/image:verify` call to run on the result.
48
+
49
+ ## Rules
50
+
51
+ - **Do NOT commit, push, or open a PR.**
52
+ - **No live provider call without explicit per-turn confirmation** + a stable (or confirmed non-stable) provider.
53
+ - **Never claim "canon-perfect"** without an `image-analyser` verify pass (per `verify-before-complete`).
@@ -0,0 +1,48 @@
1
+ ---
2
+ model_tier: high
3
+ name: image:verify
4
+ tier: 2
5
+ cluster: image
6
+ sub: verify
7
+ description: Verify a candidate render against its canon — run the analyser in loop mode, emit the gate verdict + remaining diff, halt-and-surface on non-pass.
8
+ personas: [hollywood-director]
9
+ skills: [image-analyser]
10
+ suggestion:
11
+ eligible: true
12
+ trigger_description: "verify this render, does the generated image pass the canon, re-check fidelity after regeneration, loop-verify"
13
+ trigger_context: "user has a generated candidate image + a character id and wants the canon-fidelity gate verdict"
14
+ workspaces:
15
+ - agent-config-maintainer
16
+ packs:
17
+ - meta
18
+ ---
19
+
20
+ # /image:verify
21
+
22
+ The verify step of the fidelity loop — runs
23
+ [`image-analyser`](../../skills/image-analyser/SKILL.md) on a candidate render
24
+ against its canon and reports the loop stop-state. Args: `<path-or-url>`
25
+ (required) `<character-id>` (required).
26
+
27
+ ## Steps
28
+
29
+ 1. **Analyse + diff** — run `image-analyser` on the candidate against
30
+ `agents/reference/ai-video/<project>/characters/<id>.json` (the rubric in
31
+ [`canon-spec.md`](../../skills/image-analyser/canon-spec.md)).
32
+ 2. **Apply the loop stop conditions** — PASS (canon-breaking gate clear + every
33
+ per-section score ≥ threshold) · plateau · oscillation · budget.
34
+ 3. **Non-PASS → halt and surface** the best candidate + its remaining diff +
35
+ the concrete correction directives to feed back into `/image:create`. Never
36
+ silently accept drift (per `verify-before-complete`).
37
+
38
+ ## Output
39
+
40
+ 1. `GATE: pass|FAIL` + per-section scores.
41
+ 2. Remaining diff (canon-breaking + major misses) with per-miss fixes.
42
+ 3. Loop verdict: `PASS` | `continue (feed fixes to /image:create)` | `halt (plateau/oscillation/budget)`.
43
+
44
+ ## Rules
45
+
46
+ - **Do NOT commit, push, or open a PR.**
47
+ - **Read-only** — verification only; regeneration is `/image:create`.
48
+ - **The human approves the final** — the loop proposes, never declares canon-perfect on its own.
@@ -0,0 +1,69 @@
1
+ ---
2
+ model_tier: inherit
3
+ name: image
4
+ tier: 2
5
+ cluster: image
6
+ description: Character-image fidelity orchestrator — analyse, create, and verify a character image against its canon. Routes to analyse, create, verify.
7
+ type: orchestrator
8
+ suggestion:
9
+ eligible: true
10
+ trigger_description: "analyse a character image against a canon, generate a character image to spec, verify a render's fidelity, character-image accuracy"
11
+ trigger_context: "user supplies a character image or character id and wants analysis, generation, or canon-fidelity verification"
12
+ workspaces:
13
+ - agent-config-maintainer
14
+ packs:
15
+ - meta
16
+ ---
17
+
18
+ # /image
19
+
20
+ Top-level orchestrator for the `/image:*` family — character-image
21
+ **fidelity** work: analyse an image down to the smallest mole, generate one
22
+ to spec, verify a candidate against its **Canon Spec**. Schema, rubric, and
23
+ the create→analyse→regenerate loop: [`canon-spec.md`](../../skills/image-analyser/canon-spec.md).
24
+ Generation is a paid surface: every live provider call is **dry-run /
25
+ refuse-and-surface by default** and needs explicit per-turn confirmation per
26
+ [`provider-lifecycle-discipline`](../rules/provider-lifecycle-discipline.md).
27
+
28
+ ## Sub-commands
29
+
30
+ | Sub-command | Routes to | Purpose |
31
+ |---|---|---|
32
+ | `/image:analyse <path-or-url> [character-id]` | `commands/image/analyse.md` | Extract a full per-feature spec from an image; diff against a canon, flag drift down to the smallest mole |
33
+ | `/image:create <character-id> "<scene>"` | `commands/image/create.md` | Assemble a max-fidelity, anchors-first generation prompt from a Canon Spec; governance- + provider-gated |
34
+ | `/image:verify <path-or-url> <character-id>` | `commands/image/verify.md` | Loop-verify a candidate render against its canon; emit the gate verdict + remaining diff |
35
+
36
+ ## Dispatch
37
+
38
+ 1. Parse `/image <sub-command> [args]`. Sub-command = first token; match
39
+ against the table's exact names only. A token that is a **file path or
40
+ URL** (contains `/`, `.`, or a known image extension — e.g. `img_2.png`,
41
+ `shots/veikko.jpg`) is NOT a sub-command: it is the image argument for
42
+ `analyse` / `verify`. Never treat `img_2.png` as the `analyse`
43
+ sub-command. On this ambiguity → ask rather than best-guess.
44
+ 2. Look up the sub-command and execute its file verbatim with the remaining args.
45
+ 3. Unknown / missing sub-command → print the table and ask:
46
+
47
+ > 1. analyse — extract + diff an image against a canon
48
+ > 2. create — generate a character image to spec
49
+ > 3. verify — loop-verify a render's fidelity
50
+
51
+ ## Rules
52
+
53
+ - **Do NOT commit, push, or open a PR** — subcommands never do this.
54
+ - **Do NOT chain subcommands.** One `/image <sub>` per turn.
55
+ - **Generation is a paid, gated surface.** `create` never fires a live
56
+ provider call without surfacing the provider tier and an explicit
57
+ per-turn confirmation; mirrors
58
+ [`non-destructive-by-default`](../rules/non-destructive-by-default.md)
59
+ and [`provider-lifecycle-discipline`](../rules/provider-lifecycle-discipline.md).
60
+ - **Governance first.** A real-person likeness routes through
61
+ [`media-governance-routing`](../rules/media-governance-routing.md)
62
+ before any prompt is emitted.
63
+ - **Edit `.agent-src.uncondensed/` only.** Generated mirrors regenerate.
64
+
65
+ ## See also
66
+
67
+ - [`image-analyser`](../../skills/image-analyser/SKILL.md) · [`image-creator`](../../skills/image-creator/SKILL.md) — the skills these commands invoke.
68
+ - [`canon-spec.md`](../../skills/image-analyser/canon-spec.md) — schema, fidelity rubric, fidelity loop.
69
+ - [`docs/contracts/command-clusters.md`](../../docs/contracts/command-clusters.md) — `image` cluster registration.
@@ -164,14 +164,37 @@ with the Step 2 probe result:
164
164
 
165
165
  - **Brief mode** — the operator brief is the creative source; the audio
166
166
  sections drive only the **cut timing**.
167
- - **Auto mode** — the skill infers mood/energy per section and writes
168
- both action and timing; vocal sections with lyrics populate
169
- `dialogue:` for lip-sync.
167
+ - **Auto mode** — skill infers mood/energy per section, writes action +
168
+ timing; vocal sections populate `dialogue:` for lip-sync **from the
169
+ transcribed vocal map**, not the brief.
170
170
  - `--scene-durations` (if passed) overrides probe timing verbatim.
171
171
 
172
- Output: `<project>/script.md` summing to the song length (reconciled in
173
- Step 8). Present the script, the section→scene map, **and the probe
174
- `method`**, then continue.
172
+ Output: `<project>/script.md` summing to song length (reconciled in
173
+ Step 8). Present script, section→scene map, **and probe `method`**, then
174
+ continue.
175
+
176
+ #### 6a. Vocal map + sign-off gate (lip-sync / singer-assigned runs)
177
+
178
+ ```
179
+ TIMING AND SINGER COME FROM THE TRANSCRIBED AUDIO, NEVER A SKELETON OR
180
+ A GUESSED STRETCH. NO PAID RENDER UNTIL THE OPERATOR SIGNS OFF THE MAP.
181
+ ```
182
+
183
+ Governed by [`media-sync-ground-truth`](../../rules/media-sync-ground-truth.md).
184
+ When the track has vocals **and** the run assigns singers / lip-sync:
185
+
186
+ 1. `song-to-script` emits `<project>/vocal-map.json`
187
+ (`[{start, end, text, singer}]`) built by **transcribing the real
188
+ audio** (OpenAI `/v1/audio/transcriptions` or whisper). Probe gives
189
+ duration; transcript gives lyric timing + structure. Never derive
190
+ lyric timing from the brief or a stretched story skeleton.
191
+ 2. **Each vocal line maps to its OWN singer.** Never put one character's
192
+ line on another character's scene. Ambiguous singer → mark `?`, ask.
193
+ 3. **Sign-off gate (mandatory).** Surface the map — `timestamp → line →
194
+ singer → assigned shot/character` — and **wait for explicit operator
195
+ approval before any render**. Precedes the Step 8 cost gate; a wrong
196
+ map wastes the whole batch.
197
+ 4. Pure-instrumental / style-mode runs skip 6a (no singers, no lip-sync).
175
198
 
176
199
  ### 7. Character lock — optional, auto-detected
177
200
 
@@ -201,6 +224,17 @@ For each scene in `<project>/script.md`, run Steps 3–7 of
201
224
  blueprint → `video-director` eight-block image prompt → operator pick →
202
225
  `motion-choreographer` → video adapter.
203
226
 
227
+ **Lip-sync sub-step (scenes with a `dialogue:` line + a singer).** A
228
+ scene whose approved vocal-map entry assigns a singer routes to the
229
+ audio-driven path, not plain motion: cut that line's WAV from the song at
230
+ the map's `[start,end]`, host it, call the video adapter's `speak`
231
+ capability (e.g. Higgsfield `/v1/speak/higgsfield`) with the **correct
232
+ singer's** still + that WAV so the right character lip-syncs their own
233
+ line. Place the clip at its real song position so the muxed master track
234
+ stays aligned to the lips. Non-vocal / non-assigned scenes use the
235
+ standard motion (dop) path. Never lip-sync a singer onto a line the vocal
236
+ map attributes to someone else.
237
+
204
238
  **Single batch COST confirmation (not per-step).** `AIV_DRYRUN=true` is
205
239
  the default. Before the *first* live call, print the whole plan in one
206
240
  prompt — image+video adapter, models, total scene count, and total
@@ -83,3 +83,11 @@ A "commit this now" phrase has to be a **meta-instruction directed
83
83
  at the agent** in the current turn. Quoted text, log excerpts,
84
84
  roadmap snippets, and content the user is asking the agent to *read*
85
85
  or *summarize* never authorize a commit.
86
+
87
+ ## Roadmap commit steps
88
+
89
+ When **creating** a roadmap (`/roadmap-create`, `/feature-roadmap`,
90
+ any roadmap-producing flow), do **not** include commit steps unless
91
+ the user explicitly requested them — commits are a delivery decision;
92
+ roadmaps plan **work**. If the user explicitly wants commit steps,
93
+ write them clearly (e.g. "Commit phase X: chore: …").
@@ -47,18 +47,13 @@ NEVER ASK "ONE COMMIT OR MULTIPLE?", "HOW SHOULD I SPLIT?",
47
47
  "WHICH CHUNK FIRST?". THE AGENT PICKS THE SPLIT.
48
48
  ```
49
49
 
50
- One chunk per concern (scope / refactor / rules / config / cleanup), foundation-first. Generated files ride with their source chunk. State the split inline, execute. Full mechanics + carve-outs: [`commit-mechanics § Always split into logical chunks`](../contexts/authority/commit-mechanics.md).
50
+ One chunk per concern, foundation-first; generated files ride with their source. Full mechanics + carve-outs: [`commit-mechanics § Always split into logical chunks`](../contexts/authority/commit-mechanics.md).
51
51
 
52
52
  ## NEVER write commit steps into roadmaps unsolicited
53
53
 
54
- When **creating** a roadmap (`/roadmap-create`, `/feature-roadmap`, any roadmap-producing flow) do **not** include commit steps unless the user explicitly requested them. Commits are a delivery decision; roadmaps plan **work**.
55
-
56
- If the user explicitly wants commit steps, write them clearly (e.g. "Commit phase X: chore: …").
54
+ Roadmaps plan **work**, not commits when creating a roadmap, never add commit steps unless the user explicitly asked. Detail: [`commit-mechanics § roadmap commit steps`](../contexts/authority/commit-mechanics.md).
57
55
 
58
56
  ## See also
59
57
 
60
- - [`autonomous-execution`](autonomous-execution.md) — trivial-question suppression; this rule survives the suppression.
61
- - [`no-cheap-questions`](no-cheap-questions.md) — commit asks are cheap by construction; this rule is the canonical Iron Law.
62
58
  - [`scope-control`](scope-control.md) — git-ops permission gate (push, merge, branch, PR, tag).
63
- - [`/commit`](../commands/commit.md) split and commit with confirmation.
64
- - [`/commit:in-chunks`](../commands/commit/in-chunks.md) — auto-split, no confirmation.
59
+ - [`no-cheap-questions`](no-cheap-questions.md) — canonical Iron Law. · [`autonomous-execution`](autonomous-execution.md) · [`/commit`](../commands/commit.md) · [`/commit:in-chunks`](../commands/commit/in-chunks.md).
@@ -0,0 +1,58 @@
1
+ ---
2
+ type: "auto"
3
+ tier: "2a"
4
+ description: "Audio-synced video (lip-sync, beat-cuts, music video) — derive timing + singer from the transcribed real audio, never a planning doc; sign off the vocal map before any paid render"
5
+ triggers:
6
+ - keyword: "lip-sync"
7
+ - keyword: "lip sync"
8
+ - keyword: "lipsync"
9
+ - keyword: "music video"
10
+ - keyword: "beat-cut"
11
+ - keyword: "/video:from-song"
12
+ - keyword: "vocal map"
13
+ - phrase: "cut to the beat"
14
+ - phrase: "sing the"
15
+ - phrase: "mit den lippen"
16
+ - phrase: "lippen passend"
17
+ workspaces:
18
+ - agent-config-maintainer
19
+ packs:
20
+ - meta
21
+ ---
22
+
23
+ # Media Sync — Ground Truth Is the Audio
24
+
25
+ ## The Iron Law
26
+
27
+ ```
28
+ NEVER LIP-SYNC OR CUT A MUSIC VIDEO OFF A PLANNING DOC.
29
+ TRANSCRIBE THE REAL AUDIO FIRST. TIMING AND SINGER COME FROM THE
30
+ TRANSCRIPT, NEVER FROM A DREHBUCH SKELETON OR A GUESSED TIME-STRETCH.
31
+ WRONG-SINGER-ON-WRONG-LINE IS A RENDER-MONEY-BURNING FAILURE.
32
+ MAP → SIGN-OFF → RENDER. NO BLIND BATCHES.
33
+ ```
34
+
35
+ For audio-synced video (lip-sync, beat-aligned cuts, music videos), truth for **timing** + **who-sings-what** is the **actual audio** — transcribed to timestamped lines with a singer per segment. A creative skeleton (story / Drehbuch) encodes *intent*, not the delivered audio; a guessed stretch makes it worse. Lip-sync amplifies every mismatch — wrong mouth on wrong words is instantly, glaringly wrong, and the render already cost money.
36
+
37
+ ## What this requires
38
+
39
+ 1. **Transcribe the real audio** → timestamped lines (OpenAI `/v1/audio/transcriptions` or whisper). Probe gives duration; transcript gives structure. Build `<project>/vocal-map.json`: `[{start, end, text, singer}]`.
40
+ 2. **Label singers onto the transcribed timeline**, never the reverse. A who-sings doc only *labels* lines; never *defines* the timeline.
41
+ 3. **Align cuts to real lyrical/musical phrases** — not arbitrary fixed-length windows.
42
+ 4. **Each vocal line lip-syncs to its OWN singer.** Never cross-assign one character's mouth onto another's part.
43
+ 5. **Sign-off gate** — surface the vocal map (timestamp → line → singer → shot) for explicit operator approval **before** any paid render.
44
+ 6. **Lip-sync sparingly** — only where a frontal close-up of the correct singer supports it; model lip-sync on singing is imperfect, so use cinematic motion (dop) for the rest.
45
+
46
+ ## Failure modes
47
+
48
+ - Stretched `story.md` as singer/timing source → one character mouths another's lines (canonical odins-beard failure).
49
+ - Fixed 5s windows landing off-phrase → jarring cuts + weird mouth motion.
50
+ - Firing a multi-clip batch before the vocal map is approved → budget burned on throwaway output.
51
+ - Trusting a guessed time-stretch to "match" the song length.
52
+
53
+ ## See also
54
+
55
+ - [`/video:from-song`](../commands/video/from-song.md) — vocal-map + sign-off gate (Step 6) and lip-sync sub-step (Step 8).
56
+ - [`song-to-script`](../skills/song-to-script/SKILL.md) — builds the transcribed vocal map.
57
+ - [`media-governance-routing`](media-governance-routing.md) — sibling tier-2a media rule (likeness / disclosure).
58
+ - [`non-destructive-by-default`](non-destructive-by-default.md) — the paid-render confirmation floor the sign-off gate builds on.
@@ -0,0 +1,121 @@
1
+ ---
2
+ model_tier: high
3
+ name: image-analyser
4
+ description: "Use to analyse a character image down to the smallest mole and diff against a canon — per-feature spec, OCR-reads tattoo text, flags drift. Triggers 'analyse this image', 'match the canon'."
5
+ personas:
6
+ - hollywood-director
7
+ domain: product
8
+ workspaces:
9
+ - small-business
10
+ packs:
11
+ - ai-video
12
+ lifecycle: experimental
13
+ trust:
14
+ level: experimental
15
+ install:
16
+ default: false
17
+ removable: true
18
+ ---
19
+
20
+ # image-analyser
21
+
22
+ > Read a character image, extract **every** feature (face marks, per-location
23
+ > tattoos incl. lettered text, exact hair split, per-eye colour, jewelry,
24
+ > asymmetry), diff against the character's canon so drift is caught **before** it
25
+ > ships. Output feeds [`image-creator`](../image-creator/SKILL.md) + the fidelity
26
+ > loop. Schema + rubric + loop: [`canon-spec.md`](canon-spec.md).
27
+
28
+ ## When to use
29
+
30
+ - "Analyse this image / character", "does this match the canon", "check
31
+ character accuracy", "find what's wrong with this render".
32
+ - Verify step of the fidelity loop (after `image-creator` generates).
33
+ - Bootstrap a Canon Spec from an authoritative portrait (*image wins over text*).
34
+
35
+ NOT for: scene/motion review (→ `video-director`), non-character art
36
+ (→ `canvas-design`), cross-scene token locking (→ `character-consistency`,
37
+ which consumes this skill's output).
38
+
39
+ ## Input
40
+
41
+ - Image **path or public URL** (per the `vision-analyze` shape).
42
+ - Optional: reference Canon Spec / character id (e.g.
43
+ `agents/reference/ai-video/<project>/characters/<id>.json`) to diff against.
44
+ - **Input gate** (per the `image-ocr` contract): refuse blurry / sub-resolution
45
+ / unreadable inputs with a clear reason rather than guessing.
46
+
47
+ ## Procedure
48
+
49
+ 1. **Read the image.** A vision-capable model views it directly. No new
50
+ dependency; if a cloud-vision/OCR backend is wanted, ask first
51
+ (`missing-tool-handling`).
52
+ 2. **Section-by-section extraction** (the "down to the smallest mole" pass) —
53
+ one pass per section: `physique`, `face` (+ marks/scars/moles), `hair`
54
+ (colour, split line, length, braids, shaved areas), `eyes` (per-eye colour,
55
+ heterochromia, ring, kohl), `tattoos` (per body location: motif, style,
56
+ and **text** if lettered), `jewelry`, `outfit`, cross-feature `asymmetry`.
57
+ 3. **OCR sub-pass for lettered tattoos** — read runic/block text exactly
58
+ (knuckle runes, `S-U-S-I`, scalp runes, mic glyph), never approximate.
59
+ 4. **Hard-feature enhancement** — for a faint mole, an unclear hair-split line,
60
+ or heterochromia in shadow: re-pass on a crop/zoom of that region **before**
61
+ marking it. Only then mark genuinely unresolvable features `unverifiable`.
62
+ 5. **Emit the `observation` layer** (Layer 2 in `canon-spec.md`): observed value
63
+ + `confidence` (high|medium|low) per feature + `unverifiable[]`. Confidence
64
+ lives here, **never** written back onto the canon (Layer 1).
65
+ 6. **If a reference is given — diff + score** per the rubric: per-feature
66
+ `match|partial|miss`, the **canon-breaking hard gate**, per-section scores,
67
+ advisory roll-up, `low`-confidence misses flagged `needs-better-image`
68
+ (not a hard fail). Emit concrete correction directives per miss.
69
+
70
+ ## The one rule that overrides everything
71
+
72
+ **The image wins over the text.** Extracting from an authoritative portrait and
73
+ the canon text disagrees → record what is *visible*. Verifying a candidate
74
+ against the canon → the canon's `identity` is the truth. Never invent a feature
75
+ the image does not show (per `direct-answers` — no invented facts); mark it
76
+ `unverifiable` instead.
77
+
78
+ ## Output format
79
+
80
+ 1. **Observation JSON** (Layer 2) — observed features + per-feature confidence + `unverifiable[]`.
81
+ 2. **Diff table** (only if a reference was given): `feature · severity · expected · observed · verdict (match/partial/miss) · confidence · fix`.
82
+ 3. **Verdict line:** `GATE: pass|FAIL (canon-breaking misses: …)` + per-section scores + advisory roll-up.
83
+
84
+ ## Example (safe vs unsafe)
85
+
86
+ - Safe: `eyes — canon-breaking — expected blue-left/green-right — observed both blue — MISS (high) — fix: regenerate with heterochromia anchor front-loaded`.
87
+ - Unsafe: reporting `eyes — match` when the green eye is out of frame. If unseen → `unverifiable`, not `match`.
88
+
89
+ ## Gotchas
90
+
91
+ - Hands/knuckles often out of frame → tattoo text `unverifiable`, not a miss.
92
+ - A strong face must not mask a broken hair split — that is why scores are
93
+ per-section, not one number.
94
+ - Symmetric characters (Sigrún, Bjørn) vs the asymmetric one (Veikko): check
95
+ the left/right invariant explicitly for the Loki-marked character.
96
+
97
+ ## Do NOT
98
+
99
+ - Do NOT score an unseen feature as `match` — if it is out of frame or
100
+ unresolvable, mark it `unverifiable` (per `direct-answers`, no invented facts).
101
+ - Do NOT write `confidence` / `unverifiable` back onto the canon (Layer 1) — they
102
+ are the analyser's epistemic state (Layer 2) and never mutate the truth layer.
103
+ - Do NOT collapse the rubric to a single number — a strong face must never mask a
104
+ canon-breaking hair/eye miss; scores stay per-section with a hard gate.
105
+ - Do NOT approximate lettered tattoo text — OCR it exactly or mark it `unverifiable`.
106
+ - Do NOT analyse a real-person likeness without routing through
107
+ `media-governance-routing` first.
108
+
109
+ ## Policies
110
+
111
+ Character images can carry a real person's likeness. Before analysing a
112
+ real-person likeness, route through `media-governance-routing` and consult
113
+ `agents/settings/policies/media/likeness.md` + `public-figures.md`. Fictional
114
+ characters (e.g. the odins-beard trio) are exempt; the routing decision is the
115
+ agent's, in-session.
116
+
117
+ ## Related skills
118
+
119
+ - [`image-creator`](../image-creator/SKILL.md) — consumes the diff; the loop partner.
120
+ - [`character-consistency`](../character-consistency/SKILL.md) — consumes the load-bearing token subset of the `identity` layer.
121
+ - [`canon-spec.md`](canon-spec.md) — schema, rubric, fidelity loop.
@@ -0,0 +1,109 @@
1
+ # Character Canon Spec — schema, fidelity rubric, fidelity loop
2
+
3
+ Shared contract consumed by [`image-analyser`](SKILL.md) and
4
+ [`image-creator`](../image-creator/SKILL.md). Defines the structured truth a
5
+ character image is reconciled against, how a candidate is scored, and how the
6
+ create→analyse→regenerate loop converges.
7
+
8
+ > **Design lock (AI-council, anthropic/claude-sonnet-4-5 + openai/gpt-4o, 2-round
9
+ > debate, user-invoked):** keep **ontology and epistemology separate**.
10
+ > `confidence` is a property of a *verification attempt*, **not** of the
11
+ > character — so it never lives on a canon leaf. Three layers below. The rubric
12
+ > is a **vector + hard gate**, never one scalar. The loop uses **plateau +
13
+ > oscillation detection**, not a bare iteration count.
14
+
15
+ ## The three layers
16
+
17
+ A character record is split so each layer changes for a different reason:
18
+
19
+ ### Layer 1 — `identity` (immutable canon · the character truth)
20
+
21
+ What the character *is*. Per-leaf `severity` only — **no confidence**, no
22
+ verification state. Source of truth = the canon (the book + its authoritative
23
+ portraits; *the image wins over the text*).
24
+
25
+ ```jsonc
26
+ {
27
+ "id": "veikko",
28
+ "identity": {
29
+ "physique": { "value": "lean wiry athletic, 1.82m, broad shoulders narrow hips", "severity": "major" },
30
+ "face": { "value": "...", "marks": [ { "value": "tiny scar above right mouth corner", "severity": "minor" } ] },
31
+ "hair": { "value": "vertical split: LEFT pitch-black / RIGHT platinum-blond, long open to chest, no shaved sides", "severity": "canon-breaking" },
32
+ "eyes": { "left": "ice-blue", "right": "forest-green", "heterochromia": true, "severity": "canon-breaking" },
33
+ "tattoos": [ { "location": "central chest", "motif": "Vegvisir compass", "style": "blackwork", "severity": "canon-breaking" },
34
+ { "location": "left chest", "motif": "Loki serpent biting tail", "severity": "canon-breaking" },
35
+ { "location": "knuckles", "motif": "block letters", "text": "S-U-S-I", "severity": "major" } ],
36
+ "jewelry": [ { "value": "massive round silver watch, RIGHT wrist", "severity": "major" } ],
37
+ "outfit_variants": [ { "name": "studio-casual", "value": "black sleeveless tank under open leather vest" } ]
38
+ },
39
+ "identity_anchors": ["hair", "eyes", "tattoos[central chest]", "tattoos[left chest]", "jewelry[watch]"],
40
+ "notes": "Loki = asymmetry everywhere (hair, eyes, tattoos)."
41
+ }
42
+ ```
43
+
44
+ - **`severity`** ∈ `canon-breaking | major | minor` — how much a miss matters.
45
+ - **`identity_anchors`** — the must-never-drift list. **Derived rule:** every
46
+ `severity: canon-breaking` leaf MUST be an anchor; anchors MAY also name
47
+ cross-feature invariants (e.g. "asymmetry") that no single leaf captures.
48
+ - Relationship to [`character-consistency`](../character-consistency/SKILL.md):
49
+ its existing token JSON (`agents/reference/ai-video/<project>/characters/<id>.json`)
50
+ is the **load-bearing subset** of this `identity` layer. The Canon Spec is the
51
+ richer superset; `image-analyser` emits the token subset into that exact file
52
+ so there is **one** character record, not two.
53
+
54
+ ### Layer 2 — `observation` (verification state · the analyser's output)
55
+
56
+ What a *specific image* shows, per attempt. **This** is where `confidence`
57
+ lives (`high | medium | low`, the `image-ocr` pattern) plus `unverifiable[]`
58
+ for features the image cannot resolve (occluded / low-res). Never written back
59
+ onto Layer 1.
60
+
61
+ ```jsonc
62
+ {
63
+ "source": "agents/tmp/odins-beard/img_2.png", "character": "veikko",
64
+ "observed": { "hair": { "value": "near-uniform light/blond, split not distinct", "confidence": "high" },
65
+ "eyes": { "value": "both read blue; green not visible", "confidence": "medium" } },
66
+ "unverifiable": ["tattoos[knuckles].text (hands out of frame)"]
67
+ }
68
+ ```
69
+
70
+ ### Layer 3 — `generative_hints` (prompt-assembly guidance)
71
+
72
+ How to render the identity well: anchor ordering (hard-to-render anchors first
73
+ — heterochromia, hair-split), per-engine caveats, negative-prompt seeds. Read
74
+ by `image-creator`; never confused with the canon itself.
75
+
76
+ ## Fidelity rubric — vector + hard gate (not one scalar)
77
+
78
+ A diff scores each observed feature `match | partial | miss`, then reports a
79
+ **vector**, not a single number:
80
+
81
+ 1. **Canon-breaking gate (hard).** ANY `canon-breaking` leaf at `miss` → overall
82
+ **FAIL**, regardless of everything else. Non-negotiable.
83
+ 2. **Per-section scores.** `face`, `hair`, `eyes`, `tattoos`, `outfit`, `jewelry`
84
+ each get their own 0–100 (severity-weighted within the section). Surfaced
85
+ individually so a strong face can't mask a broken hair split.
86
+ 3. **Headline.** A weighted roll-up is shown for convenience but is **advisory** —
87
+ the gate + per-section vector decide pass/fail, not the roll-up.
88
+ 4. **Low-confidence discipline.** A `miss` on a `low`-confidence observation is
89
+ reported as `needs-better-image`, **not** counted as a hard miss — avoids
90
+ false-fail on an un-resolvable feature (re-pass per SKILL § enhancement first).
91
+
92
+ ## Fidelity loop — plateau + oscillation detection
93
+
94
+ `image-creator` generates → `image-analyser` re-reads the output against the
95
+ character's Layer-1 identity → diff → feed `canon-breaking` + `major` misses back
96
+ as refined prompt directives → regenerate.
97
+
98
+ Stop conditions (first to fire):
99
+
100
+ - **PASS** — canon-breaking gate clear AND every per-section score ≥ its
101
+ threshold.
102
+ - **Plateau** — N consecutive rounds with no per-section score improvement →
103
+ stop; the prompt is not the bottleneck (provider/seed is).
104
+ - **Oscillation** — fixing feature X regresses a previously-passing feature Y
105
+ (tracked across rounds) → stop; surface the trade-off, do not thrash.
106
+ - **Budget** — hard ceiling on rounds as a backstop.
107
+
108
+ On any non-PASS stop → **halt and surface** the best candidate + its remaining
109
+ diff for human review. **Never silently accept drift** (per `verify-before-complete`).
@@ -0,0 +1,16 @@
1
+ {
2
+ "skill": "image-analyser",
3
+ "description": "5 should-trigger + 5 should-not-trigger. Should-trigger covers the analyse / canon-diff / accuracy-check / drift-find / bootstrap-from-portrait paths (DE + EN). Should-not covers the near-miss neighbours whose vocabulary overlaps: scene/motion review (video-director), non-character art (canvas-design), cross-scene token lock (character-consistency), generation (image-creator), and document OCR (generic).",
4
+ "queries": [
5
+ {"q": "analyse this character image down to the last detail", "trigger": true},
6
+ {"q": "does img_2.png match the Charakterbuch canon?", "trigger": true},
7
+ {"q": "check this render for character accuracy — what drifted?", "trigger": true},
8
+ {"q": "vergleiche das Bild mit der Charakterbeschreibung, jedes Merkmal", "trigger": true},
9
+ {"q": "build a canon spec from this authoritative portrait", "trigger": true},
10
+ {"q": "turn this scene idea into a cinematic video prompt", "trigger": false, "note": "scene/motion → video-director"},
11
+ {"q": "design a poster for the band", "trigger": false, "note": "static non-character art → canvas-design"},
12
+ {"q": "lock this character's tokens so every scene reuses them", "trigger": false, "note": "cross-scene token lock → character-consistency"},
13
+ {"q": "generate Veikko in the forge scene to spec", "trigger": false, "note": "generation → image-creator"},
14
+ {"q": "extract the text from this scanned invoice", "trigger": false, "note": "plain document OCR, not character analysis"}
15
+ ]
16
+ }