@event4u/agent-config 5.5.0 → 5.6.1
- package/.agent-src/commands/image/analyse.md +51 -0
- package/.agent-src/commands/image/create.md +53 -0
- package/.agent-src/commands/image/verify.md +48 -0
- package/.agent-src/commands/image.md +69 -0
- package/.agent-src/commands/video/from-song.md +40 -6
- package/.agent-src/contexts/authority/commit-mechanics.md +8 -0
- package/.agent-src/rules/commit-policy.md +3 -8
- package/.agent-src/rules/media-sync-ground-truth.md +58 -0
- package/.agent-src/skills/image-analyser/SKILL.md +121 -0
- package/.agent-src/skills/image-analyser/canon-spec.md +109 -0
- package/.agent-src/skills/image-analyser/evals/triggers.json +16 -0
- package/.agent-src/skills/image-creator/SKILL.md +117 -0
- package/.agent-src/skills/image-creator/evals/triggers.json +16 -0
- package/.agent-src/skills/song-to-script/SKILL.md +36 -13
- package/.claude-plugin/marketplace.json +7 -1
- package/CHANGELOG.md +56 -0
- package/README.md +2 -2
- package/config/agent-settings.template.yml +18 -0
- package/dist/discovery/deprecation-report.md +1 -1
- package/dist/discovery/discovery-manifest.json +171 -18
- package/dist/discovery/discovery-manifest.json.sha256 +1 -1
- package/dist/discovery/discovery-manifest.summary.md +4 -4
- package/dist/discovery/orphan-report.md +1 -1
- package/dist/discovery/packs.json +15 -8
- package/dist/discovery/trust-report.md +3 -3
- package/dist/discovery/workspaces.json +13 -6
- package/dist/mcp/registry-manifest.json +3 -3
- package/dist/router.json +1 -1
- package/dist/server/schemas/settings.js +4 -0
- package/dist/server/schemas/settings.js.map +1 -1
- package/docs/architecture.md +3 -3
- package/docs/catalog.md +20 -6
- package/docs/contracts/benchmark-report-schema.md +12 -10
- package/docs/contracts/command-clusters.md +1 -0
- package/docs/contracts/rule-router.md +39 -0
- package/docs/contracts/value-dashboard-spec.md +7 -3
- package/docs/contracts/value-report-schema.md +6 -1
- package/docs/getting-started.md +2 -2
- package/docs/value.md +17 -17
- package/package.json +1 -1
- package/scripts/__pycache__/validate_frontmatter.cpython-312.pyc +0 -0
- package/scripts/_lib/__pycache__/__init__.cpython-312.pyc +0 -0
- package/scripts/_lib/__pycache__/agent_src.cpython-312.pyc +0 -0
- package/scripts/_lib/bench_report.py +13 -14
- package/scripts/_lib/bench_telegraph_report.py +1 -2
- package/scripts/_lib/token_count.py +95 -0
- package/scripts/_lib/value_report.py +3 -3
- package/scripts/ai-video/adapters/higgsfield.sh +163 -6
- package/scripts/ai-video/adapters/openai-images.sh +92 -6
- package/scripts/audit_auto_rules.py +22 -6
- package/scripts/audit_command_surface.py +6 -1
- package/scripts/audit_initial_context.py +210 -0
- package/scripts/bench_ab_diff.py +4 -11
- package/scripts/bench_run.py +2 -3
- package/scripts/bench_runner.py +2 -2
- package/scripts/condense.py +44 -3
- package/scripts/iron_law_sha.py +14 -5
- package/scripts/measure_rule_budget.py +15 -0
- package/scripts/pack_mcp_content.py +1 -1
- package/scripts/project_thin_rules.py +168 -0
- package/scripts/render_value_md.py +14 -23
- package/scripts/schemas/command.schema.json +1 -1
- package/scripts/schemas/rule.schema.json +1 -1
- package/scripts/schemas/skill.schema.json +2 -2
- package/scripts/trigger_coverage.py +129 -0
@@ -0,0 +1,51 @@
+---
+model_tier: high
+name: image:analyse
+tier: 2
+cluster: image
+sub: analyse
+description: Analyse a character image down to the smallest mole and diff it against a canon — per-feature spec, OCR tattoo text, severity-ranked drift report.
+personas: [hollywood-director]
+skills: [image-analyser]
+suggestion:
+eligible: true
+trigger_description: "analyse a character image, check character accuracy, does this render match the canon, find what drifted"
+trigger_context: "user supplies an image path/URL (and optionally a character id) and wants a detailed feature extraction or canon diff"
+workspaces:
+- agent-config-maintainer
+packs:
+- meta
+---
+
+# /image:analyse
+
+Run the [`image-analyser`](../../skills/image-analyser/SKILL.md) skill on an
+image. Args: `<path-or-url>` (required) `[character-id]` (optional — the canon
+to diff against, e.g. `veikko`).
+
+## Steps
+
+1. **Resolve the image** — accept a path or public URL. Apply the input gate
+(refuse blurry / sub-resolution / unreadable; per the `image-ocr` contract).
+2. **Governance check** — real-person likeness → route through
+[`media-governance-routing`](../rules/media-governance-routing.md) first.
+3. **Run `image-analyser`** — section-by-section extraction (the "down to the
+smallest mole" pass), OCR sub-pass for lettered tattoos, hard-feature
+enhancement on low-confidence regions. Emit the Layer-2 observation
+(per-feature `confidence` + `unverifiable[]`).
+4. **If `[character-id]` is given** — diff against
+`agents/reference/ai-video/<project>/characters/<id>.json` per the rubric in
+[`canon-spec.md`](../../skills/image-analyser/canon-spec.md): per-feature
+`match|partial|miss`, the canon-breaking hard gate, per-section scores.
+
+## Output
+
+1. Observation JSON (Layer 2).
+2. Diff table (if a canon was given): `feature · severity · expected · observed · verdict · confidence · fix`.
+3. Verdict line: `GATE: pass|FAIL` + per-section scores.
+
+## Rules
+
+- **Do NOT commit, push, or open a PR.**
+- **The image wins over the text** — never invent an unseen feature; mark it `unverifiable`.
+- **Read-only** — analysis only; generation is `/image:create`.
@@ -0,0 +1,53 @@
+---
+model_tier: high
+name: image:create
+tier: 2
+cluster: image
+sub: create
+description: Generate a character image to spec — assemble a max-fidelity, anchors-first prompt from a Canon Spec; governance- and provider-gated, dry-run by default.
+personas: [hollywood-director]
+skills: [image-creator, character-consistency]
+suggestion:
+eligible: true
+trigger_description: "generate this character, render to spec, create the image, make every feature match the canon"
+trigger_context: "user supplies a character id (and a scene brief) and wants a maximally-accurate generation prompt or render"
+workspaces:
+- agent-config-maintainer
+packs:
+- meta
+---
+
+# /image:create
+
+Run the [`image-creator`](../../skills/image-creator/SKILL.md) skill. Args:
+`<character-id>` (required) `"<scene>"` (setting + pose). Optional: a prior
+`/image:analyse` diff to fold in (loop mode).
+
+## Steps
+
+1. **Governance gate FIRST** — real-person likeness → route through
+[`media-governance-routing`](../rules/media-governance-routing.md) +
+`agents/settings/policies/media/` before emitting anything.
+2. **Provider gate** — read the resolved provider's tier; non-stable
+(experimental/deprecated/community) → surface the tier and **ask** before
+any live call (per
+[`provider-lifecycle-discipline`](../rules/provider-lifecycle-discipline.md)).
+`AIV_DRYRUN=true` is the default.
+3. **Assemble the prompt, anchors first** — load the Canon Spec; front-load the
+hard-to-render `identity_anchors` (heterochromia, hair-split), then physique,
+face + marks, per-location tattoos (exact `text`), outfit, jewelry; add the
+asymmetry block + negative block + engine settings.
+4. **Generate** through the existing adapter layer (`scripts/ai-video/adapters/`)
+only on explicit confirmation. **Verify** the output with `/image:verify`.
+
+## Output
+
+1. Generation prompt — anchors · positive · asymmetry · negative · engine settings.
+2. Provider + tier line (the audit entry).
+3. The `/image:verify` call to run on the result.
+
+## Rules
+
+- **Do NOT commit, push, or open a PR.**
+- **No live provider call without explicit per-turn confirmation** + a stable (or confirmed non-stable) provider.
+- **Never claim "canon-perfect"** without an `image-analyser` verify pass (per `verify-before-complete`).
@@ -0,0 +1,48 @@
+---
+model_tier: high
+name: image:verify
+tier: 2
+cluster: image
+sub: verify
+description: Verify a candidate render against its canon — run the analyser in loop mode, emit the gate verdict + remaining diff, halt-and-surface on non-pass.
+personas: [hollywood-director]
+skills: [image-analyser]
+suggestion:
+eligible: true
+trigger_description: "verify this render, does the generated image pass the canon, re-check fidelity after regeneration, loop-verify"
+trigger_context: "user has a generated candidate image + a character id and wants the canon-fidelity gate verdict"
+workspaces:
+- agent-config-maintainer
+packs:
+- meta
+---
+
+# /image:verify
+
+The verify step of the fidelity loop — runs
+[`image-analyser`](../../skills/image-analyser/SKILL.md) on a candidate render
+against its canon and reports the loop stop-state. Args: `<path-or-url>`
+(required) `<character-id>` (required).
+
+## Steps
+
+1. **Analyse + diff** — run `image-analyser` on the candidate against
+`agents/reference/ai-video/<project>/characters/<id>.json` (the rubric in
+[`canon-spec.md`](../../skills/image-analyser/canon-spec.md)).
+2. **Apply the loop stop conditions** — PASS (canon-breaking gate clear + every
+per-section score ≥ threshold) · plateau · oscillation · budget.
+3. **Non-PASS → halt and surface** the best candidate + its remaining diff +
+the concrete correction directives to feed back into `/image:create`. Never
+silently accept drift (per `verify-before-complete`).
+
+## Output
+
+1. `GATE: pass|FAIL` + per-section scores.
+2. Remaining diff (canon-breaking + major misses) with per-miss fixes.
+3. Loop verdict: `PASS` | `continue (feed fixes to /image:create)` | `halt (plateau/oscillation/budget)`.
+
+## Rules
+
+- **Do NOT commit, push, or open a PR.**
+- **Read-only** — verification only; regeneration is `/image:create`.
+- **The human approves the final** — the loop proposes, never declares canon-perfect on its own.
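Reviewer sketch (not part of the package): the loop stop conditions named in `/image:verify` step 2 could be evaluated roughly as below. The hunk only names the four stop states, so the `Attempt` type, the `threshold`, `budget`, and `epsilon` defaults, and the exact definitions of plateau (no section moved) and oscillation (scores returned to the attempt before last) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """One verify pass (hypothetical shape, not the package's schema)."""
    gate_clear: bool       # True when no canon-breaking miss remains
    section_scores: dict   # e.g. {"face": 0.95, "hair": 0.6}


def _close(a: Attempt, b: Attempt, epsilon: float) -> bool:
    """True when every section score of `a` is within epsilon of `b`."""
    return all(
        abs(score - b.section_scores.get(name, 0.0)) < epsilon
        for name, score in a.section_scores.items()
    )


def loop_verdict(history, threshold=0.9, budget=5, epsilon=0.01):
    """Return the loop verdict string for the latest attempt in `history`."""
    latest = history[-1]
    # PASS: gate clear and every per-section score at or above the threshold.
    if latest.gate_clear and all(
        s >= threshold for s in latest.section_scores.values()
    ):
        return "PASS"
    # Budget: assumed to be a cap on the number of attempts.
    if len(history) >= budget:
        return "halt (budget)"
    # Plateau (assumed definition): nothing moved since the previous attempt.
    if len(history) >= 2 and _close(latest, history[-2], epsilon):
        return "halt (plateau)"
    # Oscillation (assumed definition): scores are back where they were
    # two attempts ago, i.e. an A -> B -> A pattern.
    if len(history) >= 3 and _close(latest, history[-3], epsilon):
        return "halt (oscillation)"
    return "continue (feed fixes to /image:create)"
```

Scores stay per-section throughout, in line with the rule that a strong face must not mask a broken hair split; nothing is collapsed into one number.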
@@ -0,0 +1,69 @@
+---
+model_tier: inherit
+name: image
+tier: 2
+cluster: image
+description: Character-image fidelity orchestrator — analyse, create, and verify a character image against its canon. Routes to analyse, create, verify.
+type: orchestrator
+suggestion:
+eligible: true
+trigger_description: "analyse a character image against a canon, generate a character image to spec, verify a render's fidelity, character-image accuracy"
+trigger_context: "user supplies a character image or character id and wants analysis, generation, or canon-fidelity verification"
+workspaces:
+- agent-config-maintainer
+packs:
+- meta
+---
+
+# /image
+
+Top-level orchestrator for the `/image:*` family — character-image
+**fidelity** work: analyse an image down to the smallest mole, generate one
+to spec, verify a candidate against its **Canon Spec**. Schema, rubric, and
+the create→analyse→regenerate loop: [`canon-spec.md`](../../skills/image-analyser/canon-spec.md).
+Generation is a paid surface: every live provider call is **dry-run /
+refuse-and-surface by default** and needs explicit per-turn confirmation per
+[`provider-lifecycle-discipline`](../rules/provider-lifecycle-discipline.md).
+
+## Sub-commands
+
+| Sub-command | Routes to | Purpose |
+|---|---|---|
+| `/image:analyse <path-or-url> [character-id]` | `commands/image/analyse.md` | Extract a full per-feature spec from an image; diff against a canon, flag drift down to the smallest mole |
+| `/image:create <character-id> "<scene>"` | `commands/image/create.md` | Assemble a max-fidelity, anchors-first generation prompt from a Canon Spec; governance- + provider-gated |
+| `/image:verify <path-or-url> <character-id>` | `commands/image/verify.md` | Loop-verify a candidate render against its canon; emit the gate verdict + remaining diff |
+
+## Dispatch
+
+1. Parse `/image <sub-command> [args]`. Sub-command = first token; match
+against the table's exact names only. A token that is a **file path or
+URL** (contains `/`, `.`, or a known image extension — e.g. `img_2.png`,
+`shots/veikko.jpg`) is NOT a sub-command: it is the image argument for
+`analyse` / `verify`. Never treat `img_2.png` as the `analyse`
+sub-command. On this ambiguity → ask rather than best-guess.
+2. Look up the sub-command and execute its file verbatim with the remaining args.
+3. Unknown / missing sub-command → print the table and ask:
+
+> 1. analyse — extract + diff an image against a canon
+> 2. create — generate a character image to spec
+> 3. verify — loop-verify a render's fidelity
+
+## Rules
+
+- **Do NOT commit, push, or open a PR** — subcommands never do this.
+- **Do NOT chain subcommands.** One `/image <sub>` per turn.
+- **Generation is a paid, gated surface.** `create` never fires a live
+provider call without surfacing the provider tier and an explicit
+per-turn confirmation; mirrors
+[`non-destructive-by-default`](../rules/non-destructive-by-default.md)
+and [`provider-lifecycle-discipline`](../rules/provider-lifecycle-discipline.md).
+- **Governance first.** A real-person likeness routes through
+[`media-governance-routing`](../rules/media-governance-routing.md)
+before any prompt is emitted.
+- **Edit `.agent-src.uncondensed/` only.** Generated mirrors regenerate.
+
+## See also
+
+- [`image-analyser`](../../skills/image-analyser/SKILL.md) · [`image-creator`](../../skills/image-creator/SKILL.md) — the skills these commands invoke.
+- [`canon-spec.md`](../../skills/image-analyser/canon-spec.md) — schema, fidelity rubric, fidelity loop.
+- [`docs/contracts/command-clusters.md`](../../docs/contracts/command-clusters.md) — `image` cluster registration.
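Reviewer sketch (not part of the package): the Dispatch rules in `/image` amount to a small parser. The version below is a minimal illustration, assuming the text after `/image` arrives as one string. The function names and the returned dictionary shape are hypothetical; only the three sub-command names and the "contains `/` or `.`" test come from the hunk above.

```python
import shlex

# Exact sub-command names from the table in commands/image.md.
SUBCOMMANDS = {"analyse", "create", "verify"}


def looks_like_path_or_url(token: str) -> bool:
    """A token containing '/' or '.' is treated as an image argument.

    Any known image extension (".png", ".jpg", ...) already contains a dot,
    so the extension case is covered by the '.' test.
    """
    return "/" in token or "." in token


def dispatch(args: str) -> dict:
    """Decide what to do with the text that follows `/image`."""
    tokens = shlex.split(args)  # keeps a quoted "<scene>" as one token
    if not tokens:
        return {"action": "ask", "reason": "missing sub-command"}
    first, rest = tokens[0], tokens[1:]
    if first in SUBCOMMANDS:
        return {"action": "run", "sub": first, "args": rest}
    if looks_like_path_or_url(first):
        # Never guess between analyse and verify: ask instead.
        return {"action": "ask", "reason": "image argument without a sub-command"}
    return {"action": "ask", "reason": "unknown sub-command: " + first}
```

For example, `dispatch("img_2.png")` returns an `ask` action rather than silently running `analyse`, which is the ambiguity the Dispatch section warns about.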
@@ -164,14 +164,37 @@ with the Step 2 probe result:
 
 - **Brief mode** — the operator brief is the creative source; the audio
 sections drive only the **cut timing**.
-- **Auto mode** —
-
-
+- **Auto mode** — skill infers mood/energy per section, writes action +
+timing; vocal sections populate `dialogue:` for lip-sync **from the
+transcribed vocal map**, not the brief.
 - `--scene-durations` (if passed) overrides probe timing verbatim.
 
-Output: `<project>/script.md` summing to
-Step 8). Present
-
+Output: `<project>/script.md` summing to song length (reconciled in
+Step 8). Present script, section→scene map, **and probe `method`**, then
+continue.
+
+#### 6a. Vocal map + sign-off gate (lip-sync / singer-assigned runs)
+
+```
+TIMING AND SINGER COME FROM THE TRANSCRIBED AUDIO, NEVER A SKELETON OR
+A GUESSED STRETCH. NO PAID RENDER UNTIL THE OPERATOR SIGNS OFF THE MAP.
+```
+
+Governed by [`media-sync-ground-truth`](../../rules/media-sync-ground-truth.md).
+When the track has vocals **and** the run assigns singers / lip-sync:
+
+1. `song-to-script` emits `<project>/vocal-map.json`
+(`[{start, end, text, singer}]`) built by **transcribing the real
+audio** (OpenAI `/v1/audio/transcriptions` or whisper). Probe gives
+duration; transcript gives lyric timing + structure. Never derive
+lyric timing from the brief or a stretched story skeleton.
+2. **Each vocal line maps to its OWN singer.** Never put one character's
+line on another character's scene. Ambiguous singer → mark `?`, ask.
+3. **Sign-off gate (mandatory).** Surface the map — `timestamp → line →
+singer → assigned shot/character` — and **wait for explicit operator
+approval before any render**. Precedes the Step 8 cost gate; a wrong
+map wastes the whole batch.
+4. Pure-instrumental / style-mode runs skip 6a (no singers, no lip-sync).
 
 ### 7. Character lock — optional, auto-detected
 
@@ -201,6 +224,17 @@ For each scene in `<project>/script.md`, run Steps 3–7 of
 blueprint → `video-director` eight-block image prompt → operator pick →
 `motion-choreographer` → video adapter.
 
+**Lip-sync sub-step (scenes with a `dialogue:` line + a singer).** A
+scene whose approved vocal-map entry assigns a singer routes to the
+audio-driven path, not plain motion: cut that line's WAV from the song at
+the map's `[start,end]`, host it, call the video adapter's `speak`
+capability (e.g. Higgsfield `/v1/speak/higgsfield`) with the **correct
+singer's** still + that WAV so the right character lip-syncs their own
+line. Place the clip at its real song position so the muxed master track
+stays aligned to the lips. Non-vocal / non-assigned scenes use the
+standard motion (dop) path. Never lip-sync a singer onto a line the vocal
+map attributes to someone else.
+
 **Single batch COST confirmation (not per-step).** `AIV_DRYRUN=true` is
 the default. Before the *first* live call, print the whole plan in one
 prompt — image+video adapter, models, total scene count, and total
@@ -83,3 +83,11 @@ A "commit this now" phrase has to be a **meta-instruction directed
 at the agent** in the current turn. Quoted text, log excerpts,
 roadmap snippets, and content the user is asking the agent to *read*
 or *summarize* never authorize a commit.
+
+## Roadmap commit steps
+
+When **creating** a roadmap (`/roadmap-create`, `/feature-roadmap`,
+any roadmap-producing flow), do **not** include commit steps unless
+the user explicitly requested them — commits are a delivery decision;
+roadmaps plan **work**. If the user explicitly wants commit steps,
+write them clearly (e.g. "Commit phase X: chore: …").
@@ -47,18 +47,13 @@ NEVER ASK "ONE COMMIT OR MULTIPLE?", "HOW SHOULD I SPLIT?",
 "WHICH CHUNK FIRST?". THE AGENT PICKS THE SPLIT.
 ```
 
-One chunk per concern
+One chunk per concern, foundation-first; generated files ride with their source. Full mechanics + carve-outs: [`commit-mechanics § Always split into logical chunks`](../contexts/authority/commit-mechanics.md).
 
 ## NEVER write commit steps into roadmaps unsolicited
 
-
-
-If the user explicitly wants commit steps, write them clearly (e.g. "Commit phase X: chore: …").
+Roadmaps plan **work**, not commits — when creating a roadmap, never add commit steps unless the user explicitly asked. Detail: [`commit-mechanics § roadmap commit steps`](../contexts/authority/commit-mechanics.md).
 
 ## See also
 
-- [`autonomous-execution`](autonomous-execution.md) — trivial-question suppression; this rule survives the suppression.
-- [`no-cheap-questions`](no-cheap-questions.md) — commit asks are cheap by construction; this rule is the canonical Iron Law.
 - [`scope-control`](scope-control.md) — git-ops permission gate (push, merge, branch, PR, tag).
-- [`/commit`](../commands/commit.md)
-- [`/commit:in-chunks`](../commands/commit/in-chunks.md) — auto-split, no confirmation.
+- [`no-cheap-questions`](no-cheap-questions.md) — canonical Iron Law. · [`autonomous-execution`](autonomous-execution.md) · [`/commit`](../commands/commit.md) · [`/commit:in-chunks`](../commands/commit/in-chunks.md).
@@ -0,0 +1,58 @@
+---
+type: "auto"
+tier: "2a"
+description: "Audio-synced video (lip-sync, beat-cuts, music video) — derive timing + singer from the transcribed real audio, never a planning doc; sign off the vocal map before any paid render"
+triggers:
+- keyword: "lip-sync"
+- keyword: "lip sync"
+- keyword: "lipsync"
+- keyword: "music video"
+- keyword: "beat-cut"
+- keyword: "/video:from-song"
+- keyword: "vocal map"
+- phrase: "cut to the beat"
+- phrase: "sing the"
+- phrase: "mit den lippen"
+- phrase: "lippen passend"
+workspaces:
+- agent-config-maintainer
+packs:
+- meta
+---
+
+# Media Sync — Ground Truth Is the Audio
+
+## The Iron Law
+
+```
+NEVER LIP-SYNC OR CUT A MUSIC VIDEO OFF A PLANNING DOC.
+TRANSCRIBE THE REAL AUDIO FIRST. TIMING AND SINGER COME FROM THE
+TRANSCRIPT, NEVER FROM A DREHBUCH SKELETON OR A GUESSED TIME-STRETCH.
+WRONG-SINGER-ON-WRONG-LINE IS A RENDER-MONEY-BURNING FAILURE.
+MAP → SIGN-OFF → RENDER. NO BLIND BATCHES.
+```
+
+For audio-synced video (lip-sync, beat-aligned cuts, music videos), truth for **timing** + **who-sings-what** is the **actual audio** — transcribed to timestamped lines with a singer per segment. A creative skeleton (story / Drehbuch) encodes *intent*, not the delivered audio; a guessed stretch makes it worse. Lip-sync amplifies every mismatch — wrong mouth on wrong words is instantly, glaringly wrong, and the render already cost money.
+
+## What this requires
+
+1. **Transcribe the real audio** → timestamped lines (OpenAI `/v1/audio/transcriptions` or whisper). Probe gives duration; transcript gives structure. Build `<project>/vocal-map.json`: `[{start, end, text, singer}]`.
+2. **Label singers onto the transcribed timeline**, never the reverse. A who-sings doc only *labels* lines; never *defines* the timeline.
+3. **Align cuts to real lyrical/musical phrases** — not arbitrary fixed-length windows.
+4. **Each vocal line lip-syncs to its OWN singer.** Never cross-assign one character's mouth onto another's part.
+5. **Sign-off gate** — surface the vocal map (timestamp → line → singer → shot) for explicit operator approval **before** any paid render.
+6. **Lip-sync sparingly** — only where a frontal close-up of the correct singer supports it; model lip-sync on singing is imperfect, so use cinematic motion (dop) for the rest.
+
+## Failure modes
+
+- Stretched `story.md` as singer/timing source → one character mouths another's lines (canonical odins-beard failure).
+- Fixed 5s windows landing off-phrase → jarring cuts + weird mouth motion.
+- Firing a multi-clip batch before the vocal map is approved → budget burned on throwaway output.
+- Trusting a guessed time-stretch to "match" the song length.
+
+## See also
+
+- [`/video:from-song`](../commands/video/from-song.md) — vocal-map + sign-off gate (Step 6) and lip-sync sub-step (Step 8).
+- [`song-to-script`](../skills/song-to-script/SKILL.md) — builds the transcribed vocal map.
+- [`media-governance-routing`](media-governance-routing.md) — sibling tier-2a media rule (likeness / disclosure).
+- [`non-destructive-by-default`](non-destructive-by-default.md) — the paid-render confirmation floor the sign-off gate builds on.
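Reviewer sketch (not part of the package): the vocal-map shape `[{start, end, text, singer}]` and the sign-off gate described in this rule can be checked mechanically before anything is rendered. The helper names below are hypothetical, and the specific checks are an assumption about what "ready for sign-off" could mean; the rule itself only fixes the four fields, the `?` marker for an ambiguous singer, and the requirement of explicit operator approval.

```python
REQUIRED_FIELDS = {"start", "end", "text", "singer"}


def check_vocal_map(entries):
    """Return a list of problems; an empty list means the map can be surfaced."""
    problems = []
    for index, entry in enumerate(entries):
        missing = REQUIRED_FIELDS - entry.keys()
        if missing:
            problems.append(f"line {index}: missing {sorted(missing)}")
            continue
        if entry["start"] >= entry["end"]:
            problems.append(f"line {index}: start is not before end")
        if entry["singer"] in ("", "?"):
            # Ambiguous singer: the rule says mark '?' and ask the operator.
            problems.append(f"line {index}: singer unresolved, ask the operator")
    return problems


def sign_off_table(entries):
    """Render the 'timestamp -> line -> singer' rows shown to the operator."""
    return [
        f"{e['start']:.2f}-{e['end']:.2f} -> {e['text']} -> {e['singer']}"
        for e in entries
    ]


def render_allowed(entries, operator_approved):
    """Paid renders need a clean map AND explicit operator approval."""
    return operator_approved and not check_vocal_map(entries)
```

A map containing a `?` singer never passes `render_allowed`, even with `operator_approved=True`, so an unresolved line cannot slip into a paid batch.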
@@ -0,0 +1,121 @@
+---
+model_tier: high
+name: image-analyser
+description: "Use to analyse a character image down to the smallest mole and diff against a canon — per-feature spec, OCR-reads tattoo text, flags drift. Triggers 'analyse this image', 'match the canon'."
+personas:
+- hollywood-director
+domain: product
+workspaces:
+- small-business
+packs:
+- ai-video
+lifecycle: experimental
+trust:
+level: experimental
+install:
+default: false
+removable: true
+---
+
+# image-analyser
+
+> Read a character image, extract **every** feature (face marks, per-location
+> tattoos incl. lettered text, exact hair split, per-eye colour, jewelry,
+> asymmetry), diff against the character's canon so drift is caught **before** it
+> ships. Output feeds [`image-creator`](../image-creator/SKILL.md) + the fidelity
+> loop. Schema + rubric + loop: [`canon-spec.md`](canon-spec.md).
+
+## When to use
+
+- "Analyse this image / character", "does this match the canon", "check
+character accuracy", "find what's wrong with this render".
+- Verify step of the fidelity loop (after `image-creator` generates).
+- Bootstrap a Canon Spec from an authoritative portrait (*image wins over text*).
+
+NOT for: scene/motion review (→ `video-director`), non-character art
+(→ `canvas-design`), cross-scene token locking (→ `character-consistency`,
+which consumes this skill's output).
+
+## Input
+
+- Image **path or public URL** (per the `vision-analyze` shape).
+- Optional: reference Canon Spec / character id (e.g.
+`agents/reference/ai-video/<project>/characters/<id>.json`) to diff against.
+- **Input gate** (per the `image-ocr` contract): refuse blurry / sub-resolution
+/ unreadable inputs with a clear reason rather than guessing.
+
+## Procedure
+
+1. **Read the image.** A vision-capable model views it directly. No new
+dependency; if a cloud-vision/OCR backend is wanted, ask first
+(`missing-tool-handling`).
+2. **Section-by-section extraction** (the "down to the smallest mole" pass) —
+one pass per section: `physique`, `face` (+ marks/scars/moles), `hair`
+(colour, split line, length, braids, shaved areas), `eyes` (per-eye colour,
+heterochromia, ring, kohl), `tattoos` (per body location: motif, style,
+and **text** if lettered), `jewelry`, `outfit`, cross-feature `asymmetry`.
+3. **OCR sub-pass for lettered tattoos** — read runic/block text exactly
+(knuckle runes, `S-U-S-I`, scalp runes, mic glyph), never approximate.
+4. **Hard-feature enhancement** — for a faint mole, an unclear hair-split line,
+or heterochromia in shadow: re-pass on a crop/zoom of that region **before**
+marking it. Only then mark genuinely unresolvable features `unverifiable`.
+5. **Emit the `observation` layer** (Layer 2 in `canon-spec.md`): observed value
++ `confidence` (high|medium|low) per feature + `unverifiable[]`. Confidence
+lives here, **never** written back onto the canon (Layer 1).
+6. **If a reference is given — diff + score** per the rubric: per-feature
+`match|partial|miss`, the **canon-breaking hard gate**, per-section scores,
+advisory roll-up, `low`-confidence misses flagged `needs-better-image`
+(not a hard fail). Emit concrete correction directives per miss.
+
+## The one rule that overrides everything
+
+**The image wins over the text.** Extracting from an authoritative portrait and
+the canon text disagrees → record what is *visible*. Verifying a candidate
+against the canon → the canon's `identity` is the truth. Never invent a feature
+the image does not show (per `direct-answers` — no invented facts); mark it
+`unverifiable` instead.
+
+## Output format
+
+1. **Observation JSON** (Layer 2) — observed features + per-feature confidence + `unverifiable[]`.
+2. **Diff table** (only if a reference was given): `feature · severity · expected · observed · verdict (match/partial/miss) · confidence · fix`.
+3. **Verdict line:** `GATE: pass|FAIL (canon-breaking misses: …)` + per-section scores + advisory roll-up.
+
+## Example (safe vs unsafe)
+
+- Safe: `eyes — canon-breaking — expected blue-left/green-right — observed both blue — MISS (high) — fix: regenerate with heterochromia anchor front-loaded`.
+- Unsafe: reporting `eyes — match` when the green eye is out of frame. If unseen → `unverifiable`, not `match`.
+
+## Gotchas
+
+- Hands/knuckles often out of frame → tattoo text `unverifiable`, not a miss.
+- A strong face must not mask a broken hair split — that is why scores are
+per-section, not one number.
+- Symmetric characters (Sigrún, Bjørn) vs the asymmetric one (Veikko): check
+the left/right invariant explicitly for the Loki-marked character.
+
+## Do NOT
+
+- Do NOT score an unseen feature as `match` — if it is out of frame or
+unresolvable, mark it `unverifiable` (per `direct-answers`, no invented facts).
+- Do NOT write `confidence` / `unverifiable` back onto the canon (Layer 1) — they
+are the analyser's epistemic state (Layer 2) and never mutate the truth layer.
+- Do NOT collapse the rubric to a single number — a strong face must never mask a
+canon-breaking hair/eye miss; scores stay per-section with a hard gate.
+- Do NOT approximate lettered tattoo text — OCR it exactly or mark it `unverifiable`.
+- Do NOT analyse a real-person likeness without routing through
+`media-governance-routing` first.
+
+## Policies
+
+Character images can carry a real person's likeness. Before analysing a
+real-person likeness, route through `media-governance-routing` and consult
+`agents/settings/policies/media/likeness.md` + `public-figures.md`. Fictional
+characters (e.g. the odins-beard trio) are exempt; the routing decision is the
+agent's, in-session.
+
+## Related skills
+
+- [`image-creator`](../image-creator/SKILL.md) — consumes the diff; the loop partner.
+- [`character-consistency`](../character-consistency/SKILL.md) — consumes the load-bearing token subset of the `identity` layer.
+- [`canon-spec.md`](canon-spec.md) — schema, rubric, fidelity loop.
@@ -0,0 +1,109 @@
|
|
|
1
|
+
# Character Canon Spec — schema, fidelity rubric, fidelity loop
|
|
2
|
+
|
|
3
|
+
Shared contract consumed by [`image-analyser`](SKILL.md) and
|
|
4
|
+
[`image-creator`](../image-creator/SKILL.md). Defines the structured truth a
|
|
5
|
+
character image is reconciled against, how a candidate is scored, and how the
|
|
6
|
+
create→analyse→regenerate loop converges.
|
|
7
|
+
|
|
8
|
+
> **Design lock (AI-council, anthropic/claude-sonnet-4-5 + openai/gpt-4o, 2-round
|
|
9
|
+
> debate, user-invoked):** keep **ontology and epistemology separate**.
|
|
10
|
+
> `confidence` is a property of a *verification attempt*, **not** of the
|
|
11
|
+
> character — so it never lives on a canon leaf. Three layers below. The rubric
|
|
12
|
+
> is a **vector + hard gate**, never one scalar. The loop uses **plateau +
|
|
13
|
+
> oscillation detection**, not a bare iteration count.
|
|
14
|
+
|
|
15
|
+
## The three layers
|
|
16
|
+
|
|
17
|
+
A character record is split so each layer changes for a different reason:
|
|
18
|
+
|
|
19
|
+
### Layer 1 — `identity` (immutable canon · the character truth)
|
|
20
|
+
|
|
21
|
+
What the character *is*. Per-leaf `severity` only — **no confidence**, no
|
|
22
|
+
verification state. Source of truth = the canon (the book + its authoritative
|
|
23
|
+
portraits; *the image wins over the text*).
|
|
24
|
+
|
|
25
|
+
```jsonc
|
|
26
|
+
{
|
|
27
|
+
"id": "veikko",
|
|
28
|
+
"identity": {
|
|
29
|
+
"physique": { "value": "lean wiry athletic, 1.82m, broad shoulders narrow hips", "severity": "major" },
|
|
30
|
+
"face": { "value": "...", "marks": [ { "value": "tiny scar above right mouth corner", "severity": "minor" } ] },
|
|
31
|
+
"hair": { "value": "vertical split: LEFT pitch-black / RIGHT platinum-blond, long open to chest, no shaved sides", "severity": "canon-breaking" },
|
|
32
|
+
"eyes": { "left": "ice-blue", "right": "forest-green", "heterochromia": true, "severity": "canon-breaking" },
|
|
33
|
+
"tattoos": [ { "location": "central chest", "motif": "Vegvisir compass", "style": "blackwork", "severity": "canon-breaking" },
|
|
34
|
+
{ "location": "left chest", "motif": "Loki serpent biting tail", "severity": "canon-breaking" },
|
|
35
|
+
{ "location": "knuckles", "motif": "block letters", "text": "S-U-S-I", "severity": "major" } ],
|
|
36
|
+
"jewelry": [ { "value": "massive round silver watch, RIGHT wrist", "severity": "major" } ],
|
|
37
|
+
"outfit_variants": [ { "name": "studio-casual", "value": "black sleeveless tank under open leather vest" } ]
|
|
38
|
+
},
|
|
39
|
+
"identity_anchors": ["hair", "eyes", "tattoos[central chest]", "tattoos[left chest]", "jewelry[watch]"],
|
|
40
|
+
"notes": "Loki = asymmetry everywhere (hair, eyes, tattoos)."
|
|
41
|
+
}
|
|
42
|
+
```
|
|
43
|
+
|
|
44
|
+
- **`severity`** ∈ `canon-breaking | major | minor` — how much a miss matters.
|
|
45
|
+
- **`identity_anchors`** — the must-never-drift list. **Derived rule:** every
|
|
46
|
+
`severity: canon-breaking` leaf MUST be an anchor; anchors MAY also name
|
|
47
|
+
cross-feature invariants (e.g. "asymmetry") that no single leaf captures.
|
|
48
|
+
- Relationship to [`character-consistency`](../character-consistency/SKILL.md):
|
|
49
|
+
its existing token JSON (`agents/reference/ai-video/<project>/characters/<id>.json`)
|
|
50
|
+
is the **load-bearing subset** of this `identity` layer. The Canon Spec is the
|
|
51
|
+
richer superset; `image-analyser` emits the token subset into that exact file
|
|
52
|
+
so there is **one** character record, not two.
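
The derived rule above (every `canon-breaking` leaf MUST be an anchor) can be checked mechanically. The following is a minimal sketch, not part of the package: the function names are hypothetical, and the anchor naming it assumes (a section key such as `hair`, or `section[location]` for list entries) is inferred from the example record rather than defined by the spec. Nested leaves such as `face.marks` are not traversed.

```python
from typing import Any


def canon_breaking_paths(identity: dict[str, Any]) -> list[str]:
    """Collect anchor-style paths for every leaf marked canon-breaking.

    Assumed naming (not defined by the spec): a dict section is addressed
    by its key ("hair"); a list entry by "section[location]".
    """
    paths: list[str] = []
    for section, node in identity.items():
        if isinstance(node, dict):
            if node.get("severity") == "canon-breaking":
                paths.append(section)
        elif isinstance(node, list):
            for entry in node:
                if isinstance(entry, dict) and entry.get("severity") == "canon-breaking":
                    # Fall back to other labels when an entry has no location.
                    label = entry.get("location") or entry.get("name") or entry.get("value", "?")
                    paths.append(f"{section}[{label}]")
    return paths


def missing_anchors(record: dict[str, Any]) -> list[str]:
    """Return canon-breaking leaves that are absent from identity_anchors."""
    anchors = set(record.get("identity_anchors", []))
    return [p for p in canon_breaking_paths(record.get("identity", {})) if p not in anchors]


# Trimmed-down record in the shape of the example above; "eyes" is
# deliberately left out of the anchor list.
record = {
    "identity": {
        "hair": {"value": "vertical split", "severity": "canon-breaking"},
        "eyes": {"left": "ice-blue", "right": "forest-green", "severity": "canon-breaking"},
        "tattoos": [
            {"location": "central chest", "motif": "Vegvisir compass", "severity": "canon-breaking"},
            {"location": "knuckles", "text": "S-U-S-I", "severity": "major"},
        ],
    },
    "identity_anchors": ["hair", "tattoos[central chest]"],
}

print(missing_anchors(record))  # ['eyes']
```

Anchors that name cross-feature invariants (such as "asymmetry") pass through untouched, since the check only looks for canon-breaking leaves that are missing from the list.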
|
|
53
|
+
|
|
54
|
+
### Layer 2 — `observation` (verification state · the analyser's output)
|
|
55
|
+
|
|
56
|
+
What a *specific image* shows, per attempt. **This** is where `confidence`
|
|
57
|
+
lives (`high | medium | low`, the `image-ocr` pattern) plus `unverifiable[]`
|
|
58
|
+
for features the image cannot resolve (occluded / low-res). Never written back
|
|
59
|
+
onto Layer 1.
|
|
60
|
+
|
|
61
|
+
```jsonc
|
|
62
|
+
{
|
|
63
|
+
"source": "agents/tmp/odins-beard/img_2.png", "character": "veikko",
|
|
64
|
+
"observed": { "hair": { "value": "near-uniform light/blond, split not distinct", "confidence": "high" },
|
|
65
|
+
"eyes": { "value": "both read blue; green not visible", "confidence": "medium" } },
|
|
66
|
+
"unverifiable": ["tattoos[knuckles].text (hands out of frame)"]
|
|
67
|
+
}
|
|
68
|
+
```
|
|
69
|
+
|
|
70
|
+
### Layer 3 — `generative_hints` (prompt-assembly guidance)
|
|
71
|
+
|
|
72
|
+
How to render the identity well: anchor ordering (hard-to-render anchors first
|
|
73
|
+
— heterochromia, hair-split), per-engine caveats, negative-prompt seeds. Read
|
|
74
|
+
by `image-creator`; never confused with the canon itself.
|
|
75
|
+
|
|
76
|
+
## Fidelity rubric — vector + hard gate (not one scalar)
|
|
77
|
+
|
|
78
|
+
A diff scores each observed feature `match | partial | miss`, then reports a
|
|
79
|
+
**vector**, not a single number:
|
|
80
|
+
|
|
81
|
+
1. **Canon-breaking gate (hard).** ANY `canon-breaking` leaf at `miss` → overall
|
|
82
|
+
**FAIL**, regardless of everything else. Non-negotiable.
|
|
83
|
+
2. **Per-section scores.** `face`, `hair`, `eyes`, `tattoos`, `outfit`, `jewelry`
|
|
84
|
+
each get their own 0–100 score (severity-weighted within the section). Surfaced
|
|
85
|
+
individually so a strong face can't mask a broken hair split.
|
|
86
|
+
3. **Headline.** A weighted roll-up is shown for convenience but is **advisory** —
|
|
87
|
+
the gate + per-section vector decide pass/fail, not the roll-up.
|
|
88
|
+
4. **Low-confidence discipline.** A `miss` on a `low`-confidence observation is
|
|
89
|
+
reported as `needs-better-image`, **not** counted as a hard miss — avoids
|
|
90
|
+
false-fail on an unresolvable feature (re-pass per SKILL § enhancement first).
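
The hard gate and the low-confidence discipline above can be sketched as follows. This is an illustration under stated assumptions, not the analyser's implementation: `DiffRow`, `effective_verdict`, `gate`, and `verdict_line` are hypothetical names, and per-section scoring is left out because the spec does not fix the severity weights.

```python
from dataclasses import dataclass


@dataclass
class DiffRow:
    feature: str
    severity: str    # "canon-breaking" | "major" | "minor"
    verdict: str     # "match" | "partial" | "miss"
    confidence: str  # "high" | "medium" | "low"


def effective_verdict(row: DiffRow) -> str:
    """Low-confidence discipline: a low-confidence miss is not a hard miss."""
    if row.verdict == "miss" and row.confidence == "low":
        return "needs-better-image"
    return row.verdict


def gate(rows: list[DiffRow]) -> tuple[bool, list[str]]:
    """Return (passed, canon-breaking features that are hard misses)."""
    breaking = [
        r.feature
        for r in rows
        if r.severity == "canon-breaking" and effective_verdict(r) == "miss"
    ]
    return (not breaking, breaking)


def verdict_line(rows: list[DiffRow]) -> str:
    """Render the gate part of the verdict line (scores omitted in this sketch)."""
    passed, breaking = gate(rows)
    if passed:
        return "GATE: pass"
    return f"GATE: FAIL (canon-breaking misses: {', '.join(breaking)})"


rows = [
    DiffRow("hair", "canon-breaking", "miss", "high"),
    DiffRow("eyes", "canon-breaking", "miss", "low"),   # becomes needs-better-image
    DiffRow("jewelry[watch]", "major", "miss", "high"),  # major misses never trip the gate
]

print(verdict_line(rows))  # GATE: FAIL (canon-breaking misses: hair)
```

Keeping the gate separate from any score means a headline roll-up, however it is weighted, cannot turn a canon-breaking miss into a pass.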
|
|
91
|
+
|
|
92
|
+
## Fidelity loop — plateau + oscillation detection
|
|
93
|
+
|
|
94
|
+
`image-creator` generates → `image-analyser` re-reads the output against the
|
|
95
|
+
character's Layer-1 identity → diff → feed `canon-breaking` + `major` misses back
|
|
96
|
+
as refined prompt directives → regenerate.
|
|
97
|
+
|
|
98
|
+
Stop conditions (first to fire):
|
|
99
|
+
|
|
100
|
+
- **PASS** — canon-breaking gate clear AND every per-section score ≥ its
|
|
101
|
+
threshold.
|
|
102
|
+
- **Plateau** — N consecutive rounds with no per-section score improvement →
|
|
103
|
+
stop; the prompt is not the bottleneck (provider/seed is).
|
|
104
|
+
- **Oscillation** — fixing feature X regresses a previously-passing feature Y
|
|
105
|
+
(tracked across rounds) → stop; surface the trade-off, do not thrash.
|
|
106
|
+
- **Budget** — hard ceiling on rounds as a backstop.
|
|
107
|
+
|
|
108
|
+
On any non-PASS stop → **halt and surface** the best candidate + its remaining
|
|
109
|
+
diff for human review. **Never silently accept drift** (per `verify-before-complete`).
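
The stop conditions above can be sketched as a single check run after each round. Everything here is an assumption made for illustration: the function name, the default `plateau_rounds` and `budget` values, and the simplified oscillation test (a section that met its threshold in the previous round and now falls below it). The spec itself leaves N, the thresholds, and the round ceiling open.

```python
from typing import Optional

Scores = dict[str, int]  # per-section 0-100 scores for one round


def stop_reason(
    history: list[Scores],
    gate_clear: bool,
    thresholds: Scores,
    plateau_rounds: int = 2,  # assumed value for N
    budget: int = 6,          # assumed hard ceiling on rounds
) -> Optional[str]:
    """Return the first stop condition that fires, or None to keep iterating."""
    current = history[-1]

    if gate_clear and all(current.get(s, 0) >= t for s, t in thresholds.items()):
        return "PASS"

    # Plateau: none of the last `plateau_rounds` rounds improved any section.
    if len(history) > plateau_rounds:
        recent = history[-(plateau_rounds + 1):]
        improved = any(
            later.get(s, 0) > earlier.get(s, 0)
            for earlier, later in zip(recent, recent[1:])
            for s in later
        )
        if not improved:
            return "PLATEAU"

    # Oscillation (simplified): a previously passing section now fails.
    if len(history) >= 2:
        previous = history[-2]
        if any(
            previous.get(s, 0) >= t and current.get(s, 0) < t
            for s, t in thresholds.items()
        ):
            return "OSCILLATION"

    if len(history) >= budget:
        return "BUDGET"
    return None


thresholds = {"hair": 80, "eyes": 80}
print(stop_reason([{"hair": 40, "eyes": 90}, {"hair": 85, "eyes": 60}], True, thresholds))  # OSCILLATION
print(stop_reason([{"hair": 40, "eyes": 70}] * 3, False, thresholds))                        # PLATEAU
```

Any return value other than `"PASS"` or `None` corresponds to the halt-and-surface path: the caller hands the best candidate and its remaining diff to a human instead of accepting the result.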
|
|
@@ -0,0 +1,16 @@
|
|
|
1
|
+
{
|
|
2
|
+
"skill": "image-analyser",
|
|
3
|
+
"description": "5 should-trigger + 5 should-not-trigger. Should-trigger covers the analyse / canon-diff / accuracy-check / drift-find / bootstrap-from-portrait paths (DE + EN). Should-not covers the near-miss neighbours whose vocabulary overlaps: scene/motion review (video-director), non-character art (canvas-design), cross-scene token lock (character-consistency), generation (image-creator), and document OCR (generic).",
|
|
4
|
+
"queries": [
|
|
5
|
+
{"q": "analyse this character image down to the last detail", "trigger": true},
|
|
6
|
+
{"q": "does img_2.png match the Charakterbuch canon?", "trigger": true},
|
|
7
|
+
{"q": "check this render for character accuracy — what drifted?", "trigger": true},
|
|
8
|
+
{"q": "vergleiche das Bild mit der Charakterbeschreibung, jedes Merkmal", "trigger": true},
|
|
9
|
+
{"q": "build a canon spec from this authoritative portrait", "trigger": true},
|
|
10
|
+
{"q": "turn this scene idea into a cinematic video prompt", "trigger": false, "note": "scene/motion → video-director"},
|
|
11
|
+
{"q": "design a poster for the band", "trigger": false, "note": "static non-character art → canvas-design"},
|
|
12
|
+
{"q": "lock this character's tokens so every scene reuses them", "trigger": false, "note": "cross-scene token lock → character-consistency"},
|
|
13
|
+
{"q": "generate Veikko in the forge scene to spec", "trigger": false, "note": "generation → image-creator"},
|
|
14
|
+
{"q": "extract the text from this scanned invoice", "trigger": false, "note": "plain document OCR, not character analysis"}
|
|
15
|
+
]
|
|
16
|
+
}
|