@milenyumai/film-kit 1.4.1 → 1.4.2

@@ -48,13 +48,14 @@ All rules, skills, and workflows are located under \`.agent/\`.
48
48
  - **Model Profile:** \`.agent/model-profile.md\` — Active model rules and constraints
49
49
  - **Agent:** \`.agent/agents/prompt-engineer.md\` — Senior prompt engineer agent
50
50
 
51
- ### Skills (8 modules)
51
+ ### Skills (9 modules)
52
52
  | Skill | Path | Priority |
53
53
  |-------|------|----------|
54
54
  | Safety Compliance | \`.agent/skills/safety-compliance/SKILL.md\` | P0 — ALWAYS |
55
55
  | Reference Locking | \`.agent/skills/reference-locking/SKILL.md\` | P1 — When refs provided |
56
56
  | Frame Chaining | \`.agent/skills/frame-chaining/SKILL.md\` | P2 — ALWAYS |
57
57
  | Spatial Blocking | \`.agent/skills/spatial-blocking/SKILL.md\` | P2 — Relational realism / gaze / depth |
58
+ | Semantic Consistency | \`.agent/skills/semantic-consistency/SKILL.md\` | P2 — ALWAYS, visual_world + physics gate |
58
59
  | Coverage System | \`.agent/skills/coverage-system/SKILL.md\` | P2 — ALWAYS (mandatory) |
59
60
  | Visual Modes | \`.agent/skills/visual-modes/SKILL.md\` | P4 — ALWAYS |
60
61
  | Audio Design | \`.agent/skills/audio-design/SKILL.md\` | P4 — When dialogue/SFX |
@@ -75,6 +76,7 @@ When the user asks \`/generate\`, convert the scenario into:
75
76
  - \`${config.outputDir}/shot-plan.json\` — Single-agent plan + policy + \`voiceCast\` contract
76
77
  - \`${config.outputDir}/shots/SHOT01.md, SHOT02.md, ...\` — Production shot files (with coverage included)
77
78
  - \`${config.outputDir}/reports/SAFETY-REPORT.md\` — Safety gate result
79
+ - \`${config.outputDir}/reports/SEMANTIC-REPORT.md\` — Semantic consistency gate result
78
80
  - \`${config.outputDir}/reports/DELIVERY-REPORT.md\` — Delivery gate result
79
81
  - \`${config.outputDir}/_index.md\` — Shot list with chain & status tracking
80
82
 
@@ -100,6 +102,7 @@ Each \`SHOTNN.md\` is a **single file** containing ALL shot details:
100
102
  - **Name Policy:** Visual prompts must stay anonymous. Dialogue naming follows \`shot-plan.json\` policy.
101
103
  - **AUTO-SAFETY:** Proactively reframe content that may trigger safety filters
102
104
  - **Frame Chaining:** Last frame of SHOT[N] = First frame of SHOT[N+1]
105
+ - **Semantic Consistency:** \`shot-plan.json.visual_world\` is canonical for perspective, named camera movement strategy, shadow vector, scale, reflection, physics, and seed strategy
103
106
  - **Coverage Mandatory:** Every main shot includes 2-3 coverage sub-shots in same file
104
107
  - **Voice Design:** \`shot-plan.json\` keeps top-level \`voiceCast\`; every speaking VIDEO section keeps \`Audio Plan\`
105
108
  - **Music: NONE** by default (user must explicitly request)
@@ -122,6 +125,7 @@ Read .agent/VOICE-DESIGN.md when dialogue, narrator VO, or reusable speaker iden
122
125
  | Reference Locking | .agent/skills/reference-locking/SKILL.md | When refs provided |
123
126
  | Frame Chaining | .agent/skills/frame-chaining/SKILL.md | Multi-shot projects |
124
127
  | Spatial Blocking | .agent/skills/spatial-blocking/SKILL.md | Multi-subject / gaze / scale-critical shots |
128
+ | Semantic Consistency | .agent/skills/semantic-consistency/SKILL.md | ALWAYS |
125
129
  | Coverage System | .agent/skills/coverage-system/SKILL.md | ALWAYS (mandatory) |
126
130
  | Visual Modes | .agent/skills/visual-modes/SKILL.md | All visual work |
127
131
  | Audio Design | .agent/skills/audio-design/SKILL.md | Dialogue/SFX needed |
@@ -137,9 +141,11 @@ Read .agent/VOICE-DESIGN.md when dialogue, narrator VO, or reusable speaker iden
137
141
  7. EVERY prompt must have an Avoid line. No exceptions.
138
142
  8. Coverage shots mandatory (2-3 per main shot, min 60 words each, included in same file).
139
143
  9. Frame chaining: Last frame of SHOT[N] = First frame of SHOT[N+1].
140
- 10. ILK/İLK FRAME section must contain a code block even for chained shots.
141
- 11. ONE FILE PER SHOT: Each SHOTNN.md contains main shot + all coverage shots.
142
- 12. Keep top-level \`voiceCast\` in ${config.outputDir}/shot-plan.json and \`Audio Plan\` in every speaking VIDEO section.
144
+ 10. Semantic consistency: \`${config.outputDir}/shot-plan.json\` must include \`visual_world\`; prompts must align camera, named movement strategy, light/shadow vector, scale, reflections, physics, anatomy risk, and contextual logic.
145
+ 11. ILK/İLK FRAME section must contain a code block even for chained shots.
146
+ 12. Chained ILK/İLK FRAME code blocks must contain only: \`Use SHOT[prev]_END as exact first frame\`; any new visual prompt is a CHAIN BREAK.
147
+ 13. ONE FILE PER SHOT: Each SHOTNN.md contains main shot + all coverage shots.
148
+ 14. Keep top-level \`voiceCast\` in ${config.outputDir}/shot-plan.json and \`Audio Plan\` in every speaking VIDEO section.
143
149
 
144
150
  ## WORKFLOWS
145
151
  - /generate → Read .agent/workflows/generate.md
@@ -174,7 +180,7 @@ Read \`.agent/model-profile.md\` for active model constraints.
174
180
 
175
181
  ## SKILL LOADING (MANDATORY)
176
182
  Before generating ANY prompts:
177
- 1. ALWAYS load: safety-compliance, frame-chaining, coverage-system, prompt-structure, visual-modes
183
+ 1. ALWAYS load: safety-compliance, frame-chaining, semantic-consistency, coverage-system, prompt-structure, visual-modes
178
184
  2. Load for relational realism: spatial-blocking
179
185
  3. Load if refs provided: reference-locking
180
186
  4. Load if dialogue/SFX: audio-design
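Reviewer sketch (not part of the package): the loading rules in this hunk can be read as a small resolver. The function name `resolveSkills` and its flag names are hypothetical; only the skill names and the `.agent/skills/[name]/SKILL.md` path pattern come from the diff.

```javascript
// Hypothetical helper illustrating the SKILL LOADING rules after 1.4.2.
// The flag names below are assumptions made for this sketch.
const ALWAYS_LOADED = [
  "safety-compliance",
  "frame-chaining",
  "semantic-consistency", // newly added to the ALWAYS list in 1.4.2
  "coverage-system",
  "prompt-structure",
  "visual-modes",
];

function resolveSkills({
  relationalRealism = false,
  refsProvided = false,
  dialogueOrSfx = false,
} = {}) {
  const names = [...ALWAYS_LOADED];
  if (relationalRealism) names.push("spatial-blocking");
  if (refsProvided) names.push("reference-locking");
  if (dialogueOrSfx) names.push("audio-design");
  // Path pattern taken from the diff: .agent/skills/[name]/SKILL.md
  return names.map((name) => `.agent/skills/${name}/SKILL.md`);
}
```

With no flags set, the resolver returns the six always-loaded skills; setting all three flags yields all nine.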
@@ -194,9 +200,11 @@ All skills at: \`.agent/skills/[name]/SKILL.md\`
194
200
  - AUTO-ANONYMOUS: Replace ALL real names with physical descriptions
195
201
  - Dialogue naming follows \`${config.outputDir}/shot-plan.json\` policy
196
202
  - \`shot-plan.json\` stores top-level \`voiceCast\`
203
+ - \`shot-plan.json\` stores top-level \`visual_world\` for camera/lens/camera-movement/light/shadow/scale/reflection/physics/seed strategy
197
204
  - Every speaking VIDEO section includes \`Audio Plan\`
198
205
  - AUTO-SAFETY: Proactively reframe sensitive content
199
206
  - Frame chaining: Last frame SHOT[N] = First frame SHOT[N+1]
207
+ - Chained ILK/İLK FRAME code block contains only \`Use SHOT[prev]_END as exact first frame\`; any new visual prompt requires CHAIN BREAK
200
208
  - Coverage: 2-3 sub-shots per main shot (min 60 words each, in same file)
201
209
  - Avoid line: MANDATORY on every prompt
202
210
  - Music: NONE by default
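Reviewer sketch (not part of the package): the new chain rule in this hunk could be checked roughly as follows. It assumes `[prev]` is substituted with a concrete shot id such as `SHOT03`, and that a trailing period is tolerated because the template note elsewhere in this diff writes the sentence with one. The function name is hypothetical.

```javascript
// Hypothetical check for a chained ILK/İLK FRAME code block.
// Assumption: prevShotId is a concrete id like "SHOT03".
function checkChainedFirstFrame(codeBlockText, prevShotId) {
  const expected = `Use ${prevShotId}_END as exact first frame`;
  // Ignore surrounding whitespace and a single trailing period.
  const actual = codeBlockText.trim().replace(/\.$/, "");
  if (actual === expected) {
    return { ok: true };
  }
  return {
    ok: false,
    reason: "Block contains more than the exact reuse text; mark the shot as CHAIN BREAK.",
  };
}
```

Any extra visual description inside the block fails the check, which matches the rule that a new visual prompt requires CHAIN BREAK.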
@@ -241,6 +249,7 @@ Before generating ANY prompts, read skills from \`.agent/skills/\`:
241
249
  - \`reference-locking/SKILL.md\` — When refs provided (P1)
242
250
  - \`frame-chaining/SKILL.md\` — ALWAYS for multi-shot (P2)
243
251
  - \`spatial-blocking/SKILL.md\` — when gaze / scale / compositing realism matters (P2)
252
+ - \`semantic-consistency/SKILL.md\` — ALWAYS, canonical \`visual_world\` + physics gate (P2)
244
253
  - \`coverage-system/SKILL.md\` — ALWAYS, mandatory (P2)
245
254
  - \`visual-modes/SKILL.md\` — ALWAYS (P4)
246
255
  - \`audio-design/SKILL.md\` — When dialogue/SFX (P4)
@@ -252,8 +261,9 @@ Before generating ANY prompts, read skills from \`.agent/skills/\`:
252
261
  - Kling preset: \`${config.klingPreset}\`
253
262
  - Create \`${config.outputDir}/project-info.md\`, \`${config.outputDir}/shot-plan.json\`, and \`${config.outputDir}/_index.md\`
254
263
  - Keep top-level \`voiceCast\` in \`${config.outputDir}/shot-plan.json\`
264
+ - Keep top-level \`visual_world\` in \`${config.outputDir}/shot-plan.json\`
255
265
  - Write \`${config.outputDir}/shots/SHOTNN.md\` per shot; coverage stays in the same file
256
- - Refresh \`${config.outputDir}/reports/SAFETY-REPORT.md\` and \`${config.outputDir}/reports/DELIVERY-REPORT.md\` before \`/finish\`
266
+ - Refresh \`${config.outputDir}/reports/SAFETY-REPORT.md\`, \`${config.outputDir}/reports/SEMANTIC-REPORT.md\`, and \`${config.outputDir}/reports/DELIVERY-REPORT.md\` before \`/finish\`
257
267
 
258
268
  ## Non-Negotiables
259
269
  1. **AUTO-ANONYMOUS:** Replace ALL real person names in visual prompts with physical descriptions.
@@ -266,10 +276,12 @@ Before generating ANY prompts, read skills from \`.agent/skills/\`:
266
276
  8. **Coverage:** 2-3 sub-shots within same SHOTNN.md file, min 70 words each.
267
277
  9. **Voice Design:** keep project-level \`voiceCast\` in \`${config.outputDir}/shot-plan.json\` and per-shot \`Audio Plan\` in each VIDEO section.
268
278
  10. **ILK/İLK FRAME:** Always include a fenced code block, even when chained.
269
- 11. **Quality Floor:** ILK >= 80, SON >= 80, VIDEO >= 120, coverage >= 70 words.
270
- 12. **Specificity Floor:** lens/framing, lighting, and foreground/midground/background action are mandatory.
271
- 13. **Spatial Realism Floor:** eyeline target, plane map, shared light source, and contact/depth cues are mandatory when relational staging matters.
272
- 14. **ONE FILE PER SHOT:** No separate coverage files.
279
+ 11. **Chained ILK/İLK FRAME:** code block contains only \`Use SHOT[prev]_END as exact first frame\`; any new visual prompt requires CHAIN BREAK.
280
+ 12. **Quality Floor:** ILK >= 80, SON >= 80, VIDEO >= 120, coverage >= 70 words.
281
+ 13. **Specificity Floor:** lens/framing, lighting, and foreground/midground/background action are mandatory.
282
+ 14. **Spatial Realism Floor:** eyeline target, plane map, shared light source, and contact/depth cues are mandatory when relational staging matters.
283
+ 15. **Semantic Consistency Floor:** \`visual_world\`, perspective/geometry, shadow vector, scale map, reflections, gravity/contact physics, anatomy risk, foreground/background coherence, contextual contradictions, and targeted semantic avoid terms are mandatory.
284
+ 16. **ONE FILE PER SHOT:** No separate coverage files.
273
285
 
274
286
  ## Workflows
275
287
  | Command | Workflow |
@@ -306,9 +318,11 @@ This workspace keeps high-level policy in \`CLAUDE.md\` and operational detail i
306
318
  - Keep one file per shot: \`${config.outputDir}/shots/SHOTNN.md\`
307
319
  - Maintain \`${config.outputDir}/shot-plan.json\` dialogue naming policy
308
320
  - Maintain \`${config.outputDir}/shot-plan.json\` top-level \`voiceCast\`
321
+ - Maintain \`${config.outputDir}/shot-plan.json\` top-level \`visual_world\`
309
322
  - Keep \`Audio Plan\` blocks aligned to \`voiceCast\`
310
323
  - Keep \`ILK/İLK FRAME\` in a fenced code block even when chained
311
324
  - Quality floor and specificity floor are hard gates, not suggestions
325
+ - Semantic consistency floor is a hard gate: camera/lens/camera-movement/light/shadow/scale/reflection/physics/anatomy/context must align to \`visual_world\`
312
326
  - Apply \`.agent/skills/spatial-blocking/SKILL.md\` whenever eyeline, compositing, or depth realism is critical
313
327
 
314
328
  ## Debugging
@@ -336,14 +350,17 @@ Use the Film-Kit core runtime.
336
350
  - draft and repair shot files under \`${config.outputDir}/shots/\`
337
351
  - apply \`${config.outputDir}/shot-plan.json\` dialogue naming policy
338
352
  - maintain top-level \`voiceCast\` inside \`${config.outputDir}/shot-plan.json\`
353
+ - maintain top-level \`visual_world\` inside \`${config.outputDir}/shot-plan.json\`
339
354
  - keep \`Audio Plan\` blocks valid against \`voiceCast\`
340
355
  - enforce AUTO-ANONYMOUS, AUTO-SAFETY, chaining, and coverage contracts
341
356
  - enforce quality floor: ILK >= 80, SON >= 80, VIDEO >= 120, coverage >= 70
342
357
  - enforce specificity floor: lens/framing, lighting, and foreground/midground/background action
343
358
  - enforce spatial realism: explicit eyeline target, plane map, shared light source, and contact/depth cues when needed
359
+ - enforce semantic consistency: \`visual_world\`, perspective/geometry, shadow vector, scale map, reflection handling, physics/anatomy risk, foreground/background coherence, contextual contradictions, and scene-specific avoid terms
344
360
 
345
361
  ## Boundaries
346
362
  - do not skip safety or delivery reports
363
+ - do not pass chained ILK/İLK FRAME blocks that contain anything besides exact reuse text
347
364
  - do not split coverage into separate files
348
365
  - if asked to review only, report issues instead of regenerating shots by default
349
366
  `;
@@ -367,6 +384,7 @@ If using the native Claude subagent, read \`.claude/agents/prompt-engineer.md\`
367
384
  - Create \`${config.outputDir}/project-info.md\`
368
385
  - Create \`${config.outputDir}/shot-plan.json\`
369
386
  - Add top-level \`voiceCast\` before writing speaking shots
387
+ - Add top-level \`visual_world\` before writing visual prompts
370
388
 
371
389
  2. **Batch Strategy:**
372
390
  - 1-10 shots → Generate all at once
@@ -378,9 +396,11 @@ If using the native Claude subagent, read \`.claude/agents/prompt-engineer.md\`
378
396
  - Generate main shot (İLK FRAME + SON FRAME + VİDEO)
379
397
  - Add machine-readable \`Audio Plan\` before every VIDEO section
380
398
  - Keep İLK FRAME as fenced code block even when chained
399
+ - If chained, keep İLK FRAME code block to exact reuse text only; new visual prompt means CHAIN BREAK
381
400
  - Enforce hard quality floor: ILK >= 80, SON >= 80, VIDEO >= 120, coverage >= 70
382
401
  - Enforce specificity floor: lens/framing + lighting + foreground/midground/background action
383
402
  - Enforce spatial realism floor: eyeline target + plane map + shared light source + contact/depth cues when applicable
403
+ - Enforce semantic consistency floor: perspective/geometry + shadow vector + scale map + reflections + gravity/contact physics + anatomy risk + contextual contradiction check
384
404
  - Generate 2-3 coverage shots (in same file)
385
405
  - Write to \`${config.outputDir}/shots/SHOT[NN].md\`
386
406
  - Update \`${config.outputDir}/_index.md\`
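Reviewer sketch (not part of the package): the hard quality floor in this hunk reads as per-section minimum word counts. The sketch below assumes words are counted by splitting on whitespace and that sections arrive as plain strings; both are assumptions, as are the function names. How the floor applies to chained İLK FRAME blocks is not stated in this diff and is not modeled here.

```javascript
// Thresholds copied from the diff: ILK >= 80, SON >= 80, VIDEO >= 120, coverage >= 70.
const QUALITY_FLOOR = { ILK: 80, SON: 80, VIDEO: 120, coverage: 70 };

// Assumption: a "word" is any whitespace-separated token.
function countWords(text) {
  return text.trim().split(/\s+/).filter(Boolean).length;
}

// sections: { ILK: string, SON: string, VIDEO: string, coverage: string[] }
function checkQualityFloor(sections) {
  const failures = [];
  for (const key of ["ILK", "SON", "VIDEO"]) {
    const count = countWords(sections[key] ?? "");
    if (count < QUALITY_FLOOR[key]) {
      failures.push(`${key}: ${count} < ${QUALITY_FLOOR[key]}`);
    }
  }
  (sections.coverage ?? []).forEach((text, index) => {
    const count = countWords(text);
    if (count < QUALITY_FLOOR.coverage) {
      failures.push(`coverage[${index}]: ${count} < ${QUALITY_FLOOR.coverage}`);
    }
  });
  return failures; // an empty array means the shot passes the floor
}
```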
@@ -388,6 +408,7 @@ If using the native Claude subagent, read \`.claude/agents/prompt-engineer.md\`
388
408
  4. **Validation Gates:**
389
409
  - Run /safety-check
390
410
  - Write \`${config.outputDir}/reports/SAFETY-REPORT.md\`
411
+ - Write \`${config.outputDir}/reports/SEMANTIC-REPORT.md\`
391
412
  - Write \`${config.outputDir}/reports/DELIVERY-REPORT.md\`
392
413
  - If any gate fails, run \`.agent/workflows/recover.md\`
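Reviewer sketch (not part of the package): after this change the validation step writes three reports instead of two. A minimal way to list which ones are still missing, assuming the caller supplies the set of paths that already exist (the function name and that calling convention are hypothetical):

```javascript
// Report file names as listed in the diff; SEMANTIC-REPORT.md is new in 1.4.2.
const REQUIRED_REPORTS = [
  "SAFETY-REPORT.md",
  "SEMANTIC-REPORT.md",
  "DELIVERY-REPORT.md",
];

// existingPaths: iterable of paths already written (assumed input for this sketch).
function missingReports(outputDir, existingPaths) {
  const existing = new Set(existingPaths);
  return REQUIRED_REPORTS
    .map((name) => `${outputDir}/reports/${name}`)
    .filter((reportPath) => !existing.has(reportPath));
}
```

A non-empty result would correspond to a failed gate, at which point the workflow text says to run `.agent/workflows/recover.md`.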
393
414
 
@@ -405,12 +426,13 @@ function buildClaudeRuleOutputContract(config) {
405
426
 
406
427
  ## Required Files
407
428
  - \`${config.outputDir}/project-info.md\` — Characters, settings, emotional arc mapping, tension levels
408
- - \`${config.outputDir}/shot-plan.json\` — Name policy, shot plan, validation contract, and top-level \`voiceCast\`
429
+ - \`${config.outputDir}/shot-plan.json\` — Name policy, shot plan, validation contract, top-level \`voiceCast\`, and top-level \`visual_world\`
409
430
  - \`.agent/model-profile.md\` — Active model constraints and presets
410
431
  - \`.agent/VOICE-DESIGN.md\` — Voice identity and shot audio contract
411
432
  - \`${config.outputDir}/_index.md\` — Shot tracking with chain & status
412
433
  - \`${config.outputDir}/shots/SHOT01.md ... SHOTNN.md\` — Individual shot files (one file per shot)
413
434
  - \`${config.outputDir}/reports/SAFETY-REPORT.md\` — Safety gate report
435
+ - \`${config.outputDir}/reports/SEMANTIC-REPORT.md\` — Semantic consistency gate report
414
436
  - \`${config.outputDir}/reports/DELIVERY-REPORT.md\` — Delivery gate report
415
437
 
416
438
  ## Prompt Flow Order (MANDATORY)
@@ -449,10 +471,11 @@ FIRST SHOT / CHAINED from SHOT[prev]_END / CHAIN BREAK - Reason
449
471
  ## Main Shot
450
472
 
451
473
  ### İLK FRAME (SHOTNN_START)
452
- [If chained: "Use SHOT[prev]_END as first frame"]
474
+ [If chained: the code block below must contain only "Use SHOT[prev]_END as exact first frame"]
453
475
 
454
476
  > NOTE: Even when chained, this section MUST contain a fenced code block.
455
- > If chained, include: "Use SHOT[prev]_END as exact first frame."
477
+ > If chained, the fenced code block must contain only: "Use SHOT[prev]_END as exact first frame."
478
+ > Any new visual prompt in a chained ILK FRAME section requires CHAIN BREAK.
456
479
 
457
480
  \\\`\\\`\\\`
458
481
  [Image prompt — min 60 words, following prompt flow order]
@@ -596,6 +619,11 @@ Character gaze directions must be spatially consistent between cuts.
596
619
  - Keep one motivated light source across subjects.
597
620
  - Add contact / weight / support cues to avoid pasted composite look.
598
621
 
622
+ ### Semantic Consistency
623
+ - \`shot-plan.json.visual_world\` is the canonical scene contract.
624
+ - Prompts must agree with its aspect ratio, camera height, lens family, horizon line, vanishing strategy, camera movement strategy, light source, shadow direction, color temperature, scale map, reflection risk, physics constraints, and seed strategy.
625
+ - Avoid contextual contradictions unless the prompt explicitly explains the unusual physics or style.
626
+
599
627
  ### Dramaturgy (for dialogue scenes)
600
628
  Analyze per character: Objective → Obstacle → Stakes → Subtext → Beat turns.
601
629
  Embed as physical behavior in prompts, NOT as metadata.
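Reviewer sketch (not part of the package): the Semantic Consistency section added in this hunk lists what `visual_world` must cover but does not show its schema. The key names below are hypothetical snake_case spellings of that list, chosen only for illustration; the real field names live in the package and may differ.

```javascript
// ASSUMED key names derived from the prose list in the diff; not the package's schema.
const ASSUMED_VISUAL_WORLD_KEYS = [
  "aspect_ratio",
  "camera_height",
  "lens_family",
  "horizon_line",
  "vanishing_strategy",
  "camera_movement_strategy",
  "light_source",
  "shadow_direction",
  "color_temperature",
  "scale_map",
  "reflection_risk",
  "physics_constraints",
  "seed_strategy",
];

// Returns the assumed keys that are absent from shotPlan.visual_world.
function missingVisualWorldKeys(shotPlan) {
  const world = shotPlan && shotPlan.visual_world;
  if (world === null || typeof world !== "object") {
    return [...ASSUMED_VISUAL_WORLD_KEYS];
  }
  return ASSUMED_VISUAL_WORLD_KEYS.filter((key) => !(key in world));
}
```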
@@ -632,10 +660,11 @@ Before generating ANY prompts, read skills from \`.agent/skills/\`:
632
660
  2. \`reference-locking/SKILL.md\` — When reference images provided
633
661
  3. \`frame-chaining/SKILL.md\` — ALWAYS for multi-shot continuity
634
662
  4. \`spatial-blocking/SKILL.md\` — When gaze / depth / scale realism is critical
635
- 5. \`coverage-system/SKILL.md\` — ALWAYS (mandatory coverage shots)
636
- 6. \`visual-modes/SKILL.md\` — ALWAYS (Ultra Realism default)
637
- 7. \`audio-design/SKILL.md\` — When dialogue or SFX needed
638
- 8. \`prompt-structure/SKILL.md\` — ALWAYS (prompt templates)
663
+ 5. \`semantic-consistency/SKILL.md\` — ALWAYS (visual_world + semantic QA)
664
+ 6. \`coverage-system/SKILL.md\` — ALWAYS (mandatory coverage shots)
665
+ 7. \`visual-modes/SKILL.md\` — ALWAYS (Ultra Realism default)
666
+ 8. \`audio-design/SKILL.md\` — When dialogue or SFX needed
667
+ 9. \`prompt-structure/SKILL.md\` — ALWAYS (prompt templates)
639
668
 
640
669
  ### When User Asks /generate
641
670
  1. Read \`.agent/workflows/generate.md\` for the full procedure
@@ -645,15 +674,18 @@ Before generating ANY prompts, read skills from \`.agent/skills/\`:
645
674
  5. Create project info: \`${config.outputDir}/project-info.md\`
646
675
  6. Create plan: \`${config.outputDir}/shot-plan.json\`
647
676
  7. Keep top-level \`voiceCast\` in the plan and \`Audio Plan\` in speaking VIDEO sections
648
- 8. Write reports: \`${config.outputDir}/reports/SAFETY-REPORT.md\`, \`${config.outputDir}/reports/DELIVERY-REPORT.md\`
677
+ 8. Keep top-level \`visual_world\` in the plan for camera/lens/camera-movement/light/shadow/scale/reflection/physics/seed rules
678
+ 9. Write reports: \`${config.outputDir}/reports/SAFETY-REPORT.md\`, \`${config.outputDir}/reports/SEMANTIC-REPORT.md\`, \`${config.outputDir}/reports/DELIVERY-REPORT.md\`
649
679
 
650
680
  ### Critical Rules
651
681
  - **AUTO-ANONYMOUS:** Replace ALL real names with physical descriptions
652
682
  - **Name Policy:** Dialogue naming follows \`${config.outputDir}/shot-plan.json\` policy
653
683
  - **AUTO-SAFETY:** Proactively reframe sensitive content
654
684
  - **Frame Chaining:** Last frame of SHOT[N] = First frame of SHOT[N+1]
685
+ - **Chain Hardening:** chained ILK/İLK FRAME code block contains only \`Use SHOT[prev]_END as exact first frame\`
655
686
  - **Coverage:** 2-3 sub-shots per main shot (in same file, min 60 words each)
656
687
  - **Spatial Realism:** eyeline targets, shared light, depth scale, and anti-cutout staging must agree when subjects share frame
688
+ - **Semantic Consistency:** \`visual_world\` controls perspective/geometry, shadow vector, scale map, reflections, physics, anatomy risk, background coherence, and contextual contradictions
657
689
  - **Avoid Line:** MANDATORY on every prompt
658
690
  - **Music:** NONE by default
659
691
  - **Voice Design:** keep \`voiceCast\` in \`${config.outputDir}/shot-plan.json\` and \`Audio Plan\` in speaking VIDEO sections
@@ -686,9 +718,9 @@ When request is /generate, follow the Film-Kit Hollywood production system:
686
718
  3. Load required skills from \`.agent/skills/\`
687
719
  4. Transform scenario into production shot package at \`${config.outputDir}\`
688
720
  5. Generate: project-info.md, shot-plan.json, _index.md, shots/SHOT01.md..SHOTNN.md
689
- 6. Keep top-level \`voiceCast\` in shot-plan.json
721
+ 6. Keep top-level \`voiceCast\` and \`visual_world\` in shot-plan.json
690
722
  7. Each SHOTNN.md: İLK FRAME + SON FRAME + AUDIO PLAN + VİDEO + 2-3 Coverage (ALL IN ONE FILE)
691
- 8. Enforce: auto-anonymous, dialogue name policy, auto-safety, frame chaining, avoid lines
723
+ 8. Enforce: auto-anonymous, dialogue name policy, auto-safety, frame chaining, semantic consistency, avoid lines
692
724
  9. Write reports to \`${config.outputDir}/reports/\` before /finish
693
725
  `;
694
726
  }
@@ -713,10 +745,11 @@ Before generating ANY prompts, read these skills:
713
745
  2. \`.agent/skills/reference-locking/SKILL.md\` — When refs provided
714
746
  3. \`.agent/skills/frame-chaining/SKILL.md\` — ALWAYS
715
747
  4. \`.agent/skills/spatial-blocking/SKILL.md\` — When gaze / depth / scale realism is critical
716
- 5. \`.agent/skills/coverage-system/SKILL.md\` — ALWAYS (mandatory)
717
- 6. \`.agent/skills/visual-modes/SKILL.md\` — ALWAYS
718
- 7. \`.agent/skills/audio-design/SKILL.md\` — When dialogue/SFX
719
- 8. \`.agent/skills/prompt-structure/SKILL.md\` — ALWAYS
748
+ 5. \`.agent/skills/semantic-consistency/SKILL.md\` — ALWAYS (visual_world + semantic QA)
749
+ 6. \`.agent/skills/coverage-system/SKILL.md\` — ALWAYS (mandatory)
750
+ 7. \`.agent/skills/visual-modes/SKILL.md\` — ALWAYS
751
+ 8. \`.agent/skills/audio-design/SKILL.md\` — When dialogue/SFX
752
+ 9. \`.agent/skills/prompt-structure/SKILL.md\` — ALWAYS
720
753
 
721
754
  ## Workflows
722
755
  | Command | Workflow |
@@ -739,7 +772,8 @@ Before generating ANY prompts, read these skills:
739
772
  - Project info: \`${config.outputDir}/project-info.md\`
740
773
  - Plan: \`${config.outputDir}/shot-plan.json\`
741
774
  - Voice contract: top-level \`voiceCast\` in \`${config.outputDir}/shot-plan.json\`
742
- - Reports: \`${config.outputDir}/reports/SAFETY-REPORT.md\`, \`${config.outputDir}/reports/DELIVERY-REPORT.md\`
775
+ - Semantic contract: top-level \`visual_world\` in \`${config.outputDir}/shot-plan.json\`
776
+ - Reports: \`${config.outputDir}/reports/SAFETY-REPORT.md\`, \`${config.outputDir}/reports/SEMANTIC-REPORT.md\`, \`${config.outputDir}/reports/DELIVERY-REPORT.md\`
743
777
 
744
778
  ## Critical Rules
745
779
  1. **AUTO-ANONYMOUS:** Replace ALL real person names with physical descriptions
@@ -753,8 +787,10 @@ Before generating ANY prompts, read these skills:
753
787
  9. **Ultra Realism** default visual mode
754
788
  10. **8s duration** default, slow burn pacing
755
789
  11. **ILK/İLK FRAME:** always keep fenced code block
756
- 12. **ONE FILE PER SHOT:** SHOTNN.md contains main shot + all coverage
757
- 13. **Relational Realism:** preserve eyeline targets, shared light, depth scale, and anti-cutout staging when multiple subjects share frame
790
+ 12. **Chained ILK/İLK FRAME:** code block contains only \`Use SHOT[prev]_END as exact first frame\`; any new visual prompt is CHAIN BREAK
791
+ 13. **ONE FILE PER SHOT:** SHOTNN.md contains main shot + all coverage
792
+ 14. **Relational Realism:** preserve eyeline targets, shared light, depth scale, and anti-cutout staging when multiple subjects share frame
793
+ 15. **Semantic Consistency:** preserve \`visual_world\` perspective, shadow vector, scale map, reflections, gravity/contact physics, anatomy risk, foreground/background coherence, and contextual logic
758
794
 
759
795
  ## Quality Floor (Hard Gate)
760
796
  Reject and regenerate any shot that fails:
@@ -767,6 +803,7 @@ Reject and regenerate any shot that fails:
767
803
  - missing explicit foreground/midground/background action details
768
804
  - missing explicit eyeline target or \`not camera\` instruction when gaze matters
769
805
  - missing explicit shared light source / depth / contact cues in multi-subject shots
806
+ - missing semantic consistency anchors: perspective/geometry, shadow vector, scale map, reflection handling, gravity/contact physics, anatomy risk, foreground/background coherence, contextual contradiction check
770
807
 
771
808
  ## Reject Weak Prompt Style
772
809
  Do not accept generic filler language:
@@ -869,7 +906,7 @@ The more aligned these are, the cleaner the transition:
869
906
  - Hand pose and finger count should be similar in both frames
870
907
  - Avoid end frames with extreme mouth positions if speech is not intended
871
908
 
872
- **Loop shortcut:** Set Start = End (same image). Prompt: "seamless loop" + simple camera movement (e.g., roll 360, slow push-in).
909
+ **Loop shortcut:** Set Start = End (same image). Prompt: "seamless loop" + simple camera movement (e.g., roll 360, Dolly In).
873
910
 
874
911
  ### Transformation Budget
875
912
 
@@ -924,10 +961,11 @@ These prevent the model from taking shortcuts.
924
961
  More complex camera = more warp risk.
925
962
 
926
963
  **Safest commands (highest success rate):**
927
- - slow push-in / pull-back
928
- - pan left/right
929
- - tilt up/down
930
- - gentle handheld micro-sway
964
+ - Dolly In / Dolly Out
965
+ - Pan Left / Pan Right
966
+ - Tilt Up / Tilt Down
967
+ - Tracking Shot or Steadicam Movement for smooth follow
968
+ - Handheld Movement with gentle micro-sway
931
969
  - roll 360 (especially for loops)
932
970
 
933
971
  **Stabilization trick:** Writing "tripod-locked" reduces background jitter.
@@ -1065,17 +1103,23 @@ Element Binding is Kling 3.0's built-in technology for maintaining character and
1065
1103
  - For **multi-shot** sequences: Prefer Element Binding when available
1066
1104
  - **Fallback:** If Element Binding is not available in your interface, manually repeat: character age, distinctive features, costume, and key proportions at each shot's start
1067
1105
 
1068
- ### Advanced Camera Vocabulary (Kling vCoT Triggers)
1106
+ ### Advanced Camera Vocabulary (24-Move Cinematic Lexicon)
1107
+
1108
+ These professional terms activate Kling's "Visual Chain-of-Thought" (vCoT) for more precise results. Use one named movement per shot unless a motivated compound move is required:
1069
1109
 
1070
- These professional terms activate Kling's "Visual Chain-of-Thought" (vCoT) for more precise results:
1110
+ | Group | Movements |
1111
+ |-------|-----------|
1112
+ | **Physical push/pull** | Dolly In, Dolly Out |
1113
+ | **Locked head rotation** | Pan Left, Pan Right, Tilt Up, Tilt Down |
1114
+ | **Physical lateral/vertical travel** | Truck Left, Truck Right, Pedestal Up, Pedestal Down |
1115
+ | **Arc/parallax** | Arc Left, Arc Right, Tracking Shot, Leading Shot, Following Shot |
1116
+ | **Dynamic stabilization** | Whip Pan, Handheld Movement, Steadicam Movement |
1117
+ | **Angle/subjective** | Canted Angle (Dutch Angle), Point of View (POV) |
1118
+ | **Optical/composite** | Zoom In, Zoom Out, Dolly Zoom (Vertigo Effect), Crane/Jib Shot |
1071
1119
 
1072
- | Category | Terms |
1073
- |----------|-------|
1074
- | **Angles** | Low-angle hero shot, Dutch angle (tilted horizon), POV (subjective), Bird's-eye view (top-down) |
1075
- | **Movements** | Dolly push-in, Orbit (360° rotation), Lateral pan, Tracking, Spiral up |
1076
- | **Hybrid** | Dolly Zoom (Vertigo effect — zoom in while pulling back), Move Left and Zoom In (simultaneous) |
1120
+ Aliases: Dolly push-in = Dolly In, Dolly pull-out = Dolly Out, Orbit = Arc Left/Right, Lateral slide = Truck Left/Right, Crane rise/descend = Crane/Jib Shot. Rack focus is a focus move, not camera travel.
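Reviewer sketch (not part of the package): the alias line above maps older wording onto the new named movements. A lookup could look like the following; the function name is hypothetical, and splitting "Crane rise/descend" into two alias keys is an interpretation made for this sketch.

```javascript
// Alias table transcribed from the "Aliases:" line in the diff.
// Keys are lowercased; some aliases map to a left/right pair.
const MOVEMENT_ALIASES = new Map([
  ["dolly push-in", ["Dolly In"]],
  ["dolly pull-out", ["Dolly Out"]],
  ["orbit", ["Arc Left", "Arc Right"]],
  ["lateral slide", ["Truck Left", "Truck Right"]],
  ["crane rise", ["Crane/Jib Shot"]],    // interpretation of "Crane rise/descend"
  ["crane descend", ["Crane/Jib Shot"]], // interpretation of "Crane rise/descend"
]);

// Returns the canonical movement name(s) for a term, or the term itself if it is not an alias.
function canonicalMovements(term) {
  const trimmed = term.trim();
  return MOVEMENT_ALIASES.get(trimmed.toLowerCase()) ?? [trimmed];
}
```

When an alias maps to two names, such as Orbit, the caller still has to pick a direction, since the lexicon asks for one named movement per shot.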
1077
1121
 
1078
- > **Tip:** Hybrid movements like Dolly Zoom trigger stronger vCoT processing and produce more cinematic results, but increase warp risk. Use with wider CFG (0.50-0.60).
1122
+ > **Tip:** Hybrid movements like Dolly Zoom, Truck plus Pan, or Crane/Jib plus Arc trigger stronger vCoT processing and produce more cinematic results, but increase warp risk. Use with wider CFG (0.50-0.60).
1079
1123
 
1080
1124
  ### Native Audio & Dialogue (Kling-Specific)
1081
1125
 
@@ -9,7 +9,7 @@
9
9
  Modular system consisting of:
10
10
 
11
11
  - **1 Specialist Agent** - Technical Prompt Engineer
12
- - **8 Skills** - Domain-specific knowledge modules
12
+ - **9 Skills** - Domain-specific knowledge modules
13
13
  - **4 Workflows** - Slash command procedures
14
14
 
15
15
  ---
@@ -30,6 +30,7 @@ Modular system consisting of:
30
30
  │ ├── frame-chaining/ # Shot continuity protocol
31
31
  │ ├── coverage-system/ # Mandatory coverage shots (NEW)
32
32
  │ ├── spatial-blocking/ # Eyeline, depth, scale, compositing realism
33
+ │ ├── semantic-consistency/ # Perspective, shadows, scale, physics, render QA
33
34
  │ ├── visual-modes/ # Ultra Realism & style modes
34
35
  │ ├── audio-design/ # Sound design rules
35
36
  │ └── prompt-structure/ # Prompt engineering patterns
@@ -47,11 +48,11 @@ Modular system consisting of:
47
48
 
48
49
  | Agent | Focus | Skills Used |
49
50
  |-------|-------|-------------|
50
- | `prompt-engineer` | Cinematic prompt generation for Veo 3.1 / Kling 3.0 | All 8 skills |
51
+ | `prompt-engineer` | Cinematic prompt generation for Veo 3.1 / Kling 3.0 | All 9 skills |
51
52
 
52
53
  ---
53
54
 
54
- ## 🧩 Skills (8)
55
+ ## 🧩 Skills (9)
55
56
 
56
57
  | Skill | Description |
57
58
  |-------|-------------|
@@ -60,6 +61,7 @@ Modular system consisting of:
60
61
  | `frame-chaining` | **Shot continuity**, last frame → first frame chaining, scene transition protocol (fade, dissolve, match cut) |
61
62
  | `coverage-system` | **Mandatory coverage shots** (Reaction, OTS, Insert, Cutaway, ECU, Wide) + L-cut/J-cut + 30° rule + **180° rule** + eyeline match + matching action + multi-character blocking |
62
63
  | `spatial-blocking` | **Relational realism**: eyeline targeting, plane mapping, body orientation, shared lighting, depth/scale integration, anti-cutout / anti-miniature cues |
64
+ | `semantic-consistency` | **Scene-level realism gate**: canonical `visual_world`, perspective/geometry, shadow vectors, scale map, reflections, gravity/contact physics, contextual contradiction checks, render QA |
63
65
  | `visual-modes` | **Ultra Realism** default, stylization triggers, anti-AI artifact rules + **color continuity** + magic hour + flashback/dream visual distinction |
64
66
  | `audio-design` | **Sound design** rules, voice realism, project-level `voiceCast`, shot-level `audioPlan`, audio direction block + diegetic/non-diegetic sound distinction |
65
67
  | `prompt-structure` | Image/video prompt templates, camera vocabulary, seed parameter, prompt rewriter, **re-take strategy**, coverage prompt writing standards (≥60 words) |
@@ -91,6 +93,7 @@ User Scenario → Agent Activated → Read model-profile → Load Required Skill
91
93
  reference-locking (if refs provided)
92
94
  frame-chaining (ALWAYS)
93
95
  spatial-blocking (when gaze/depth/scale realism matters)
96
+ semantic-consistency (ALWAYS - visual_world + physics gate)
94
97
  coverage-system (ALWAYS - mandatory)
95
98
  visual-modes (check for style triggers)
96
99
  audio-design (if dialogue/SFX needed)
package/content/MASTER.md CHANGED
@@ -26,6 +26,7 @@ Scenario Received → Check for elements:
26
26
  ├── Reference images provided? → READ reference-locking/SKILL.md
27
27
  ├── Multiple shots? → READ frame-chaining/SKILL.md (ALWAYS)
28
28
  ├── Multi-subject / gaze / depth realism? → READ spatial-blocking/SKILL.md
29
+ ├── ALWAYS READ → semantic-consistency/SKILL.md (visual_world + semantic QA)
29
30
  ├── Style keywords (anime, noir, etc.)? → READ visual-modes/SKILL.md
30
31
  ├── Dialogue/SFX needed? → READ audio-design/SKILL.md
31
32
  ├── ALWAYS READ → coverage-system/SKILL.md (MANDATORY)
@@ -531,8 +532,9 @@ For full coverage protocols → READ `skills/coverage-system/SKILL.md`
531
532
  > 🇹🇷 [Short summary in Turkish: what happens in this shot, 1 sentence]
532
533
 
533
534
  **İLK FRAME (SHOTNN_START):**
534
- [If CHAINED: "→ Use SHOT[prev]_END as first frame"]
535
- [If FIRST/BREAK: Generate code block]
535
+ [If CHAINED: "→ Use SHOT[prev]_END as first frame"]
536
+ [If CHAINED: the fenced code block must contain only "Use SHOT[prev]_END as exact first frame"; any new visual prompt requires CHAIN BREAK]
537
+ [If FIRST/BREAK: Generate code block]
536
538
 
537
539
  ```
538
540
  [Complete image prompt — MIN 60 words, MAX 100 words]
@@ -675,8 +677,9 @@ Before outputting, validate EVERY shot. **Bu kontrol otomatiktir, kullanıcı ha
675
677
  |-------|--------------|
676
678
  | [safety-compliance](skills/safety-compliance/SKILL.md) | ALWAYS before generating |
677
679
  | [reference-locking](skills/reference-locking/SKILL.md) | When reference images provided |
678
- | [frame-chaining](skills/frame-chaining/SKILL.md) | ALWAYS for multi-shot |
679
- | [coverage-system](skills/coverage-system/SKILL.md) | ALWAYS (mandatory for every shot) |
680
+ | [frame-chaining](skills/frame-chaining/SKILL.md) | ALWAYS for multi-shot |
681
+ | [semantic-consistency](skills/semantic-consistency/SKILL.md) | ALWAYS for `visual_world`, perspective, shadow, scale, physics, and render QA gates |
682
+ | [coverage-system](skills/coverage-system/SKILL.md) | ALWAYS (mandatory for every shot) |
680
683
  | [visual-modes](skills/visual-modes/SKILL.md) | Check for style triggers |
681
684
  | [audio-design](skills/audio-design/SKILL.md) | When dialogue/SFX needed |
682
685
  | [prompt-structure](skills/prompt-structure/SKILL.md) | ALWAYS |
@@ -710,18 +713,18 @@ Before outputting, validate EVERY shot. **Bu kontrol otomatiktir, kullanıcı ha
710
713
  |--------|-------------------|
711
714
  | **Composition** | Rule of thirds, golden ratio, leading lines, depth layers |
712
715
  | **Lighting** | Three-point setups, motivated sources, proper color temperature |
713
- | **Camera** | Professional movements (dolly, crane, Steadicam), intentional lens choices |
716
+ | **Camera** | 24-move cinematic lexicon from `prompt-structure`, intentional lens choices |
714
717
  | **Color** | Graded for mood, consistent palette throughout |
715
718
  | **Sound** | Layered design: dialogue, SFX, ambience, Foley |
716
719
  | **Continuity** | 180° rule, eyeline match, seamless cuts |
717
720
 
718
721
  ### Professional Cinematography Terms
719
722
 
720
- - **Camera:** dolly, crane, Steadicam, rack focus, deep focus, tracking
723
+ - **Camera:** Dolly In/Out, Pan Left/Right, Tilt Up/Down, Truck Left/Right, Pedestal Up/Down, Arc Left/Right, Whip Pan, Tracking Shot, Leading Shot, Following Shot, Canted Angle/Dutch Angle, Handheld Movement, Steadicam Movement, Zoom In/Out, Dolly Zoom, Crane/Jib Shot, Point of View (POV); rack focus and deep focus are focus tools, not travel moves
721
724
  - **Lighting:** key light, fill, rim/hair, practical, motivated, diffused, bounce
722
725
  - **Composition:** negative space, leading lines, frame within frame, Dutch angle
723
726
  - **Color:** LUT, color grade, desaturated, warm/cool palette, high/low key
724
- - **Movement:** push-in, pull-out, orbit, reveal, whip pan, crash zoom
727
+ - **Movement:** use canonical names from `prompt-structure` instead of vague aliases; push-in = Dolly In, pull-out = Dolly Out, orbit = Arc Left/Right, crash zoom = fast Zoom In/Out
725
728
 
726
729
  ---
727
730
 
package/content/RULES.md CHANGED
@@ -23,8 +23,9 @@ Film/Video request detected → Activate prompt-engineer agent
23
23
  ├── model-profile (ALWAYS FIRST)
24
24
  ├── safety-compliance (ALWAYS)
25
25
  ├── reference-locking (if refs provided)
26
- ├── frame-chaining (ALWAYS for multi-shot)
27
- ├── coverage-system (ALWAYS - mandatory)
26
+ ├── frame-chaining (ALWAYS for multi-shot)
27
+ ├── semantic-consistency (ALWAYS for visual_world + physics)
28
+ ├── coverage-system (ALWAYS - mandatory)
28
29
  ├── visual-modes (check for style triggers)
29
30
  ├── audio-design (if dialogue/SFX)
30
31
  └── prompt-structure (ALWAYS)
@@ -36,8 +37,9 @@ Film/Video request detected → Activate prompt-engineer agent
36
37
  |-------|------|--------------|
37
38
  | Safety & Celebrity Ban | `.agent/skills/safety-compliance/SKILL.md` | ALWAYS |
38
39
  | Reference Locking | `.agent/skills/reference-locking/SKILL.md` | When refs provided |
39
- | Frame Chaining | `.agent/skills/frame-chaining/SKILL.md` | Multi-shot projects |
40
- | Coverage System | `.agent/skills/coverage-system/SKILL.md` | ALWAYS (mandatory) |
40
+ | Frame Chaining | `.agent/skills/frame-chaining/SKILL.md` | Multi-shot projects |
41
+ | Semantic Consistency | `.agent/skills/semantic-consistency/SKILL.md` | ALWAYS |
42
+ | Coverage System | `.agent/skills/coverage-system/SKILL.md` | ALWAYS (mandatory) |
41
43
  | Visual Modes | `.agent/skills/visual-modes/SKILL.md` | All visual work |
42
44
  | Audio Design | `.agent/skills/audio-design/SKILL.md` | Dialogue/SFX needed |
43
45
  | Prompt Structure | `.agent/skills/prompt-structure/SKILL.md` | ALWAYS |
@@ -181,7 +183,7 @@ Shot'ları zenginleştiren 12 teknik: mini hedef, fiziksel aksiyon, katmanlı ı
181
183
  |--------|-------------------|
182
184
  | **Composition** | Rule of thirds, golden ratio, leading lines, depth layers |
183
185
  | **Lighting** | Three-point, motivated sources, contrast ratios, color temperature |
184
- | **Camera** | Professional movements, lens selection, aperture control |
186
+ | **Camera** | 24-move cinematic lexicon from `prompt-structure`, lens selection, aperture control |
185
187
  | **Color** | Graded for mood, consistent palette, period-appropriate |
186
188
  | **Sound** | Layered design, spatial audio, natural dynamics |
187
189
  | **Editing** | Motivated cuts, rhythm, pacing, continuity |
@@ -189,7 +191,7 @@ Shot'ları zenginleştiren 12 teknik: mini hedef, fiziksel aksiyon, katmanlı ı
189
191
  ### Professional Terms to Use
190
192
 
191
193
  ```
192
- Cinematography: dolly, crane, Steadicam, rack focus, deep focus
194
+ Cinematography: Dolly In/Out, Pan Left/Right, Tilt Up/Down, Truck Left/Right, Pedestal Up/Down, Arc Left/Right, Whip Pan, Tracking Shot, Leading Shot, Following Shot, Canted Angle/Dutch Angle, Handheld Movement, Steadicam Movement, Zoom In/Out, Dolly Zoom, Crane/Jib Shot, Point of View (POV); rack focus/deep focus are focus tools
193
195
  Lighting: key light, fill, rim, practical, motivated, diffused
194
196
  Composition: negative space, leading lines, frame within frame
195
197
  Color: LUT, grade, desaturated, warm/cool, high/low key
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  name: prompt-engineer
3
3
  description: Senior Technical Prompt Engineer for model-aware runtime profiles (Veo 3.1 / Kling 3.0). Converts shot lists and scenarios into production-grade cinematic prompts optimized for continuity and platform compliance.
4
- skills: safety-compliance, reference-locking, frame-chaining, coverage-system, spatial-blocking, visual-modes, audio-design, prompt-structure
4
+ skills: safety-compliance, reference-locking, frame-chaining, coverage-system, spatial-blocking, semantic-consistency, visual-modes, audio-design, prompt-structure
5
5
  ---
6
6
 
7
7
  # Prompt Engineer - Hollywood Standard Cinematic Video Generation
@@ -44,6 +44,7 @@ You are a senior Technical Prompt Engineer specialized in model-aware cinematic
44
44
  15. **Hard quality floor:** ILK >=80, SON >=80, VIDEO >=120, coverage >=70 words
45
45
  16. **Hard specificity floor:** Every prompt must include lens/framing, lighting, and FG/MG/BG action details
46
46
  17. **Spatial realism floor:** eyeline target, plane map, shared light source, contact/weight cues, and full-scale depth logic are mandatory when needed
47
+ 18. **Semantic consistency floor:** `shot-plan.json.visual_world` must be canonical; perspective/geometry, shadow vector, scale map, reflection handling, gravity/contact physics, anatomy risk, foreground/background coherence, and contextual contradiction checks must pass in every shot
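The word floors in rule 15 lend themselves to a mechanical check. The sketch below is illustrative only: it assumes words are whitespace-separated tokens (the source does not define how words are counted), and `check_quality_floor` and the section keys are hypothetical names, not part of the package.

```python
# Hypothetical helper: checks prompt sections against the word floors in rule 15.
# Assumption: a "word" is any whitespace-separated token.
QUALITY_FLOORS = {"ILK": 80, "SON": 80, "VIDEO": 120, "COVERAGE": 70}


def check_quality_floor(sections: dict[str, str]) -> dict[str, bool]:
    """Return pass/fail per section; section names without a floor are ignored."""
    results = {}
    for name, text in sections.items():
        floor = QUALITY_FLOORS.get(name.upper())
        if floor is not None:
            results[name] = len(text.split()) >= floor
    return results
```

A chained `ILK` block holds only a reuse instruction rather than a prompt, so a real checker would likely need to skip it before applying the floor.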
47
48
 
48
49
  ---
49
50
 
@@ -217,12 +218,13 @@ Before outputting ANY shot:
217
218
  - [ ] Quality floor passes? (ILK>=80, SON>=80, VIDEO>=120, coverage>=70)
218
219
  - [ ] Specificity floor passes? (lens + lighting + FG/MG/BG action)
219
220
  - [ ] Spatial realism passes? (eyeline target + plane map + shared light + contact/depth cues)
221
+ - [ ] Semantic consistency passes? (`visual_world` fields + perspective/geometry + shadow vector + scale map + reflection/physics/anatomy/context checks)
220
222
  - [ ] Model Control block exists? (`Model`, `Preset`, `CFG`, `Transition Mode`)
221
223
 
222
224
  ### 3. Kling-Specific Gates (when model is kling-3.0)
223
225
  - [ ] Motion timeline uses `first → then → finally` structure?
224
226
  - [ ] "What stays the same" explicitly stated? (identity, background, costume)
225
- - [ ] Camera movement is simple/safe? (no complex hybrid movements)
227
+ - [ ] Camera movement uses a name from the 24-move cinematic lexicon and stays simple/safe unless an advanced hybrid is motivated?
226
228
  - [ ] Negative prompt includes Kling cleanup set? (warping, rubbery, melted)
227
229
  - [ ] Duration matches transformation budget? (5s=1 change, 10s=2-3, 15s=complex)
228
230
  - [ ] Start/end frames are in same visual universe? (angle, scale, light, lens)
@@ -254,11 +256,11 @@ For EACH shot, output exactly one file (`SHOTNN.md`) containing Main Shot + Cove
254
256
 
255
257
  ## Main Shot
256
258
 
257
- ### İLK FRAME (SHOTNN_START)
258
- [Always provide a fenced code block]
259
- [If CHAINED: include "Use SHOT[prev]_END as exact first frame" inside code block]
260
- [If FIRST/BREAK: image prompt in same code block]
261
-
259
+ ### İLK FRAME (SHOTNN_START)
260
+ [Always provide a fenced code block]
261
+ [If CHAINED: code block must contain only "Use SHOT[prev]_END as exact first frame"; any new visual prompt requires CHAIN BREAK]
262
+ [If FIRST/BREAK: image prompt in same code block]
263
+
262
264
  ### SON FRAME (SHOTNN_END)
263
265
  ```
264
266
  [Image Prompt (Flow Order + Avoid Line)]
@@ -109,7 +109,7 @@ Each coverage shot receives:
109
109
  Camera: Close-up, 85mm lens, f/2.0, shallow DOF
110
110
  Subject: Character's face, eyes, micro-expressions
111
111
  Lighting: Soft key, subtle fill, rim for separation
112
- Movement: Static or very slow push-in
112
+ Movement: Static or very slow Dolly In
113
113
  Duration: 4-6 seconds
114
114
  Audio: Natural ambience only, no dialogue (listening)
115
115
  ```
@@ -386,5 +386,5 @@ When designing END frames, consider the transformation budget:
386
386
  For seamless loops with Kling:
387
387
  1. **Start = End** (use identical image for both)
388
388
  2. Prompt: `"seamless loop"` + simple camera movement
389
- 3. Best movements for loops: roll 360°, slow push-in/pull-back
389
+ 3. Best movements for loops: roll 360°, Dolly In/Dolly Out
390
390
  4. Avoid subject transformations — loops work best with camera-only motion
@@ -23,7 +23,8 @@ description: Image and video prompt templates, camera movement vocabulary, stabi
23
23
  | **Reference Commands First** | Always start with reference image instructions |
24
24
  | **Safety First** | Always consider filter implications |
25
25
  | **Short Sentence Rule** | Split long sentences across shots |
26
- | **Model Profile First** | Read `.agent/model-profile.md` before generating any prompts |
26
+ | **Model Profile First** | Read `.agent/model-profile.md` before generating any prompts |
27
+ | **Semantic World First** | Read `.agent/skills/semantic-consistency/SKILL.md` and align prompts to `shot-plan.json.visual_world` |
27
28
 
28
29
  ---
29
30
 
@@ -45,9 +46,9 @@ description: Image and video prompt templates, camera movement vocabulary, stabi
45
46
 
46
47
  > **The first sentence must be a one-sentence answer to the question "What is this shot?"**
47
48
 
48
- ### Standard Template
49
-
50
- ```
49
+ ### Standard Template
50
+
51
+ ```
51
52
  [REFERENCE LOCK section if applicable]
52
53
 
53
54
  Cinematic still frame of [subject with reference adherence] in [frozen pose].
@@ -56,8 +57,27 @@ Lighting: [specific setup].
56
57
  Camera: [framing], [lens mm], [aperture], photorealistic, crisp focus, no motion blur, no text.
57
58
  [Safety injection if needed]
58
59
 
59
- [Full avoid line]
60
- ```
60
+ [Full avoid line]
61
+ ```
62
+
63
+ ### Semantic Consistency Layer
64
+
65
+ Before writing any image prompt, lock the scene's `visual_world` values:
66
+
67
+ - camera height, lens family, horizon line, and vanishing-point strategy
68
+ - single motivated light source, color temperature, and shadow direction
69
+ - foreground/midground/background scale map
70
+ - reflection risk (`none`, `matte/non-reflective`, or accurate mirror/water/glass behavior)
71
+ - gravity/contact physics and anatomy risk
72
+ - contextual contradiction check
73
+
74
+ For chained shots, the `ILK/İLK FRAME` fenced code block must contain only:
75
+
76
+ ```text
77
+ Use SHOT[prev]_END as exact first frame.
78
+ ```
79
+
80
+ Any new start-frame visual prompt must be declared as `CHAIN BREAK - [reason]`.
61
81
 
62
82
  ### Example: Character in Environment
63
83
 
@@ -146,43 +166,70 @@ Avoid: distorted faces, morphing, bad anatomy, extra limbs/fingers, blurry, flic
146
166
 
147
167
  ---
148
168
 
149
- ## Camera Movement Vocabulary
150
-
151
- ### Primary Movements
152
-
153
- | Movement | Description | Use When |
154
- |----------|-------------|----------|
155
- | **Dolly push-in** | Camera moves toward subject | Building intensity, focus |
156
- | **Dolly pull-out** | Camera moves away from subject | Reveal, context, ending |
157
- | **Pan left/right** | Camera rotates horizontally | Following action, scanning |
158
- | **Tilt up/down** | Camera rotates vertically | Reveal height, power dynamics |
159
- | **Crane rise** | Camera elevates vertically | Grand reveal, establishing |
160
- | **Crane descend** | Camera lowers vertically | Intimate approach |
161
- | **Orbit** | Camera circles subject | Dramatic emphasis, 360 view |
162
- | **Tracking** | Camera follows alongside | Following movement |
163
- | **Rack focus** | Focus shifts between planes | Attention shift, reveal |
164
- | **Whip pan** | Camera rotates rapidly (snap) | Transition between subjects, hide cuts, energy burst |
165
- | **Lateral pan** | Camera slides horizontally | Reveal adjacent elements |
166
- | **Spiral up** | Camera rises while orbiting | Grand cinematic reveal |
167
-
168
- ### Advanced Angles (Kling vCoT Triggers)
169
-
170
- | Angle | Effect | Use When |
171
- |-------|--------|----------|
172
- | **Low-angle hero shot** | Subject towers over camera | Power, dominance, heroism |
173
- | **Dutch angle** | Tilted horizon | Unease, disorientation, tension |
174
- | **POV (subjective)** | Camera = character's eyes | Immersion, fear, discovery |
175
- | **Bird's-eye view** | Top-down perspective | Scale, geography, isolation |
176
-
177
- ### Hybrid Movements (Advanced Higher Risk, Higher Reward)
178
-
179
- | Movement | Effect | Risk Level |
180
- |----------|--------|------------|
181
- | **Dolly Zoom** (Vertigo) | Zoom in while pulling back | ⚠️ Medium — stunning but warp-prone |
182
- | **Move Left + Zoom In** | Simultaneous lateral + zoom | ⚠️ Medium — requires stable subject |
183
- | **Crane Rise + Pan** | Elevate while rotating | ⚠️ Low-Medium — smooth cinematic |
184
-
185
- > **Kling Note:** Professional terms activate Kling's "Visual Chain-of-Thought" (vCoT). Hybrid movements trigger stronger vCoT processing for more cinematic results, but increase warp risk. Use with CFG 0.50-0.60.
169
+ ## Camera Movement Vocabulary
170
+
171
+ Use one precise movement name per shot unless the beat explicitly needs a motivated compound move. Compound moves must preserve the same `visual_world` geometry, shadow vector, screen direction, and subject scale.
172
+
173
+ ### Canonical 24 Cinematic Camera Movements
174
+
175
+ | # | Movement | Description | Use When |
176
+ |---|----------|-------------|----------|
177
+ | 1 | **Dolly In** | Camera physically moves toward the subject | Build intensity, reveal emotion, enter a space |
178
+ | 2 | **Dolly Out** | Camera physically moves away from the subject | Reveal context, isolation, ending beat |
179
+ | 3 | **Pan Left** | Locked camera rotates left | Follow action, reveal off-screen space |
180
+ | 4 | **Pan Right** | Locked camera rotates right | Follow action, reveal off-screen space |
181
+ | 5 | **Tilt Up** | Locked camera rotates upward | Reveal height, power, scale |
182
+ | 6 | **Tilt Down** | Locked camera rotates downward | Reveal ground detail, vulnerability, aftermath |
183
+ | 7 | **Truck Left** | Whole camera slides left | Parallax, lateral reveal, follow blocking |
184
+ | 8 | **Truck Right** | Whole camera slides right | Parallax, lateral reveal, follow blocking |
185
+ | 9 | **Pedestal Up** | Camera physically rises vertically | Elevation reveal, character empowerment |
186
+ | 10 | **Pedestal Down** | Camera physically lowers vertically | Intimacy, compression, descent into detail |
187
+ | 11 | **Arc Left** | Camera moves around the subject toward screen-left | Relationship shift, dramatic parallax |
188
+ | 12 | **Arc Right** | Camera moves around the subject toward screen-right | Relationship shift, dramatic parallax |
189
+ | 13 | **Whip Pan** | Very fast pan with directional motion blur | Energetic transition, hidden cut, sudden discovery |
190
+ | 14 | **Tracking Shot** | Camera follows a moving subject at matched speed | Travel, pursuit, continuous blocking |
191
+ | 15 | **Leading Shot** | Camera moves ahead of a character, facing them | Emotional walk-and-talk, confrontation, dread |
192
+ | 16 | **Following Shot** | Camera follows behind a character | Discovery, pursuit, subjective tension |
193
+ | 17 | **Canted Angle (Dutch Angle)** | Camera rolls sideways, tilting the horizon | Unease, disorientation, psychological tension |
194
+ | 18 | **Handheld Movement** | Realistic handheld shake and micro-instability | Documentary realism, urgency, chaos |
195
+ | 19 | **Steadicam Movement** | Smooth stabilized walking movement | Long takes, premium follow shots, immersive movement |
196
+ | 20 | **Zoom In** | Lens optically tightens without camera travel | Attention shift, surveillance, emphasis |
197
+ | 21 | **Zoom Out** | Lens optically widens without camera travel | Reveal context, comedic or existential pullback |
198
+ | 22 | **Dolly Zoom (Vertigo Effect)** | Camera moves while zooming the opposite way | Spatial distortion, shock, realization |
199
+ | 23 | **Crane/Jib Shot** | Camera rises or drops on a crane/jib, often diagonally | Grand reveal, scale, transition from ground to overview |
200
+ | 24 | **Point of View (POV)** | Camera behaves as a character's eyes | Immersion, fear, discovery, subjective action |
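As a rough illustration of how the lexicon above could be enforced, the following sketch copies the 24 names from the table and tests a candidate string against them. The names `CANONICAL_MOVES` and `is_canonical_move` are hypothetical, and the case-insensitive exact match is an assumption rather than a rule stated in the skill.

```python
# The 24 canonical movement names, copied from the table above.
CANONICAL_MOVES = (
    "Dolly In", "Dolly Out", "Pan Left", "Pan Right", "Tilt Up", "Tilt Down",
    "Truck Left", "Truck Right", "Pedestal Up", "Pedestal Down",
    "Arc Left", "Arc Right", "Whip Pan", "Tracking Shot", "Leading Shot",
    "Following Shot", "Canted Angle (Dutch Angle)", "Handheld Movement",
    "Steadicam Movement", "Zoom In", "Zoom Out",
    "Dolly Zoom (Vertigo Effect)", "Crane/Jib Shot", "Point of View (POV)",
)


def is_canonical_move(name: str) -> bool:
    """Case-insensitive exact match against the canonical names (an assumed policy)."""
    wanted = name.strip().lower()
    return any(wanted == move.lower() for move in CANONICAL_MOVES)
```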
201
+
202
+ ### Legacy Aliases And Focus Moves
203
+
204
+ | Existing Term | Canonical Mapping |
205
+ |---------------|-------------------|
206
+ | **Dolly push-in** | Dolly In |
207
+ | **Dolly pull-out** | Dolly Out |
208
+ | **Orbit / 360 rotation** | Arc Left / Arc Right; specify direction and avoid full 360 unless required |
209
+ | **Lateral pan / slide** | Truck Left / Truck Right when the camera moves physically; Pan Left / Pan Right when it only rotates |
210
+ | **Crane rise / Crane descend** | Crane/Jib Shot; use Pedestal Up/Down for a purely vertical camera lift |
211
+ | **Spiral up** | Crane/Jib Shot + Arc Left/Right; higher warp risk |
212
+ | **Rack focus** | Focus shift, not a camera movement; use only when foreground/background attention must change |
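The alias table above can be read as a lookup from legacy terms to candidate canonical names. The sketch below is one possible encoding under that reading; `LEGACY_ALIASES` and `canonical_candidates` are hypothetical names. Where the table leaves a choice open (direction, or physical move versus rotation), the lookup returns several candidates instead of guessing, and the compound "Spiral up" mapping is left out for brevity.

```python
# Alias -> candidate canonical names, following the mapping table above.
# More than one candidate means the writer still has to pick a direction or mechanism.
LEGACY_ALIASES = {
    "dolly push-in": ("Dolly In",),
    "dolly pull-out": ("Dolly Out",),
    "orbit": ("Arc Left", "Arc Right"),
    "lateral pan": ("Truck Left", "Truck Right", "Pan Left", "Pan Right"),
    "crane rise": ("Crane/Jib Shot", "Pedestal Up"),
    "crane descend": ("Crane/Jib Shot", "Pedestal Down"),
    "rack focus": (),  # focus shift, not a camera movement
}


def canonical_candidates(term: str) -> tuple[str, ...]:
    """Return canonical candidates for a legacy term; raise KeyError if unknown."""
    return LEGACY_ALIASES[term.strip().lower()]
```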
213
+
214
+ ### Advanced Angles (Kling vCoT Triggers)
215
+
216
+ | Angle | Effect | Use When |
217
+ |-------|--------|----------|
218
+ | **Low-angle hero shot** | Subject towers over camera | Power, dominance, heroism |
219
+ | **Dutch angle / Canted Angle** | Tilted horizon | Unease, disorientation, tension |
220
+ | **POV / Point of View** | Camera = character's eyes | Immersion, fear, discovery |
221
+ | **Bird's-eye view** | Top-down perspective | Scale, geography, isolation |
222
+
223
+ ### Hybrid Movements (Advanced - Higher Risk, Higher Reward)
224
+
225
+ | Movement | Effect | Risk Level |
226
+ |----------|--------|------------|
227
+ | **Dolly Zoom (Vertigo Effect)** | Camera moves while the lens zooms the opposite way | Medium - stunning but warp-prone |
228
+ | **Truck Left/Right + Pan Left/Right** | Lateral camera move plus rotation for parallax | Medium - must preserve screen direction |
229
+ | **Crane/Jib Shot + Arc Left/Right** | Elevated diagonal move around subject | Low-Medium - smooth cinematic reveal |
230
+ | **Whip Pan + Match Cut** | Fast blurred rotation hides transition | Medium - use only when transition is intentional |
231
+
232
+ > **Kling Note:** Professional terms activate Kling's "Visual Chain-of-Thought" (vCoT). Hybrid movements trigger stronger vCoT processing for more cinematic results, but increase warp risk. Use with CFG 0.50-0.60.
186
233
 
187
234
  ---
188
235
 
@@ -0,0 +1,94 @@
1
+ ---
2
+ name: semantic-consistency
3
+ description: Enforces scene-level semantic consistency for generated image/video prompts and rendered media: perspective, shadows, scale, reflections, physics, anatomy, contextual contradictions, chain reuse, and render QA.
4
+ ---
5
+
6
+ # Semantic Consistency Skill
7
+
8
+ Load this skill for every Film-Kit generation, repair, safety check, and render QA pass.
9
+ It turns semantic realism into a hard production gate instead of a best-effort prompt style.
10
+
11
+ ## Required `visual_world` Contract
12
+
13
+ Every `shot-plan.json` or `team-plan.json` must carry a scene-level `visual_world` block.
14
+ Use it as the canonical source for all shots in the same scene:
15
+
16
+ ```json
17
+ {
18
+ "visual_world": {
19
+ "aspect_ratio": "16:9",
20
+ "camera_height": "eye-level / low-angle / high-angle / top-down",
21
+ "lens_family": "wide 24-35mm / normal 50mm / portrait 85mm / telephoto 135mm+ / macro",
22
+ "horizon_line": "low / center / high / not visible",
23
+ "vanishing_point_strategy": "single-point / two-point / flat telephoto / top-down no horizon",
24
+ "camera_movement_strategy": "static / one named 24-move cinematic movement / motivated compound move with parallax notes",
25
+ "light_source": "single motivated source, position, angle, softness",
26
+ "shadow_direction": "all shadows fall screen-left / screen-right / toward camera / away from camera / directly below",
27
+ "color_temperature": "warm 3500K / neutral daylight 5600K / cool 6500K",
28
+ "scale_map": "foreground, midground, background object sizes and distance logic",
29
+ "reflection_risk": "none / glass / mirror / water / metal, plus expected reflection behavior",
30
+ "physics_constraints": "gravity, contact points, support surfaces, cloth/hair/liquid behavior",
31
+ "seed_strategy": "locked seed for variants / new seed for structural repair / unknown platform seed"
32
+ }
33
+ }
34
+ ```
35
+
36
+ If any field is unknown, write an explicit value such as `not applicable: no reflective surfaces`.
37
+ Do not leave silent blanks.
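The "no silent blanks" rule can be sketched as a small validation step. This is a minimal example, not the package's own validator: the field names come from the JSON contract above, while `missing_visual_world_fields` is a hypothetical helper that assumes the plan has already been parsed into a Python dict and that every field value is a string.

```python
# Field names taken from the visual_world contract above.
REQUIRED_FIELDS = (
    "aspect_ratio", "camera_height", "lens_family", "horizon_line",
    "vanishing_point_strategy", "camera_movement_strategy", "light_source",
    "shadow_direction", "color_temperature", "scale_map", "reflection_risk",
    "physics_constraints", "seed_strategy",
)


def missing_visual_world_fields(plan: dict) -> list[str]:
    """List required fields that are absent or blank in plan['visual_world']."""
    world = plan.get("visual_world") or {}
    return [
        field for field in REQUIRED_FIELDS
        if not isinstance(world.get(field), str) or not world[field].strip()
    ]
```

An explicit value such as `not applicable: no reflective surfaces` passes this check, because it is a non-empty string.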
38
+
39
+ ## Hard Semantic Gates
40
+
41
+ Reject or repair a prompt package if any item fails:
42
+
43
+ - Perspective/geometry: camera angle, lens family, camera height, horizon line, vanishing logic, and camera movement strategy do not agree.
44
+ - Camera movement: a moving shot does not use one of the 24 canonical cinematic movement names from `prompt-structure`, or combines movements without motivated parallax/scale notes.
45
+ - Shadow vector: a single motivated light source and one consistent shadow direction are not stated.
46
+ - Scale map: foreground/midground/background object sizes and distances are ambiguous or physically impossible.
47
+ - Reflection handling: mirror, glass, water, or metal surfaces lack accurate reflection instructions or a `matte/non-reflective` reduction.
48
+ - Gravity/contact physics: objects, feet, furniture, cloth, hair, liquid, smoke, and debris do not have plausible support or behavior cues.
49
+ - Anatomy risk: visible humans lack pose simplicity, hand/face risk handling, or scene-specific anatomy avoid terms.
50
+ - Foreground/background coherence: background does not match foreground perspective, lighting, color temperature, and style.
51
+ - Contextual contradiction: prompt contains incompatible scene logic unless explicitly justified.
52
+ - Scene-specific avoid line: avoid terms are generic only, missing the scene's actual failure modes.
53
+
54
+ ## Chained `ILK FRAME` Rule
55
+
56
+ For a chained shot, the `ILK FRAME` fenced code block may contain only this reuse instruction:
57
+
58
+ ```text
59
+ Use SHOT[prev]_END as exact first frame.
60
+ ```
61
+
62
+ No new visual prompt, new lens, new camera angle, new lighting, or new composition is allowed inside a chained `ILK FRAME` block.
63
+ If a new `ILK FRAME` prompt is required, mark the shot as `CHAIN BREAK - [reason]` and generate a fresh start frame.
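A check for this rule might look like the sketch below. It assumes that `[prev]` is replaced by a shot number in real output (for example `SHOT03_END`), which the source implies through its `SHOTNN` naming but does not state outright; the literal template text is accepted as well. `chained_block_is_valid` is a hypothetical name.

```python
import re

# Assumption: "[prev]" becomes a shot number such as "03" in generated files.
# The trailing period is treated as optional.
_REUSE = re.compile(r"Use SHOT(?:\d+|\[prev\])_END as exact first frame\.?")


def chained_block_is_valid(block_text: str) -> bool:
    """True only if the block contains the reuse instruction and nothing else."""
    return _REUSE.fullmatch(block_text.strip()) is not None
```

Any extra line in the block, such as a lens or lighting note, makes the match fail, which corresponds to the case where the shot should be marked `CHAIN BREAK - [reason]`.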
64
+
65
+ ## Prompt Injection Template
66
+
67
+ Append the relevant semantic anchors to every image prompt:
68
+
69
+ ```text
70
+ Semantic consistency: [camera height], [lens family], [horizon/vanishing logic], camera movement [static or named 24-move lexicon term], single [light source] from [direction] at [angle], [color temperature], all shadows falling [direction], scale map [FG/MG/BG distances], contact points and gravity physically plausible, background perspective and lighting match the foreground, reflection handling [none/matte/accurate].
71
+ Avoid: improper perspective, wrong scale, inconsistent shadows, impossible geometry, conflicting light sources, unrealistic reflections, floating objects, disconnected elements, broken gravity, bad anatomy, extra fingers, deformed hands, foreground-background mismatch, contextual contradiction, [scene-specific terms].
72
+ ```
73
+
74
+ Keep avoid lines targeted. Prefer 15-25 concrete scene-relevant terms over long universal lists.
75
+ Do not put the positive goal in the avoid line.
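The 15-25 term guidance can be approximated by counting comma-separated entries in an avoid line. The sketch below treats that range as a soft preference, as the skill does; the helper names are hypothetical, and splitting on commas is an assumption about how terms are delimited.

```python
def avoid_term_count(avoid_line: str) -> int:
    """Count comma-separated terms after an optional 'Avoid:' prefix."""
    body = avoid_line.strip()
    if body.lower().startswith("avoid:"):
        body = body[len("avoid:"):]
    # Drop a trailing period or spaces, then ignore empty fragments.
    return len([term for term in body.rstrip(". ").split(",") if term.strip()])


def avoid_line_in_range(avoid_line: str, low: int = 15, high: int = 25) -> bool:
    """True if the term count falls inside the preferred range."""
    return low <= avoid_term_count(avoid_line) <= high
```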
76
+
77
+ ## Render QA Gate
78
+
79
+ After render, inspect the actual image/video outputs, not only the prompt text.
80
+ Write `SEMANTIC-RENDER-REPORT.md` or a `Semantic Render QA` section in `RENDER-REPORT.md` with these fields:
81
+
82
+ ```markdown
83
+ - semantic_render_status: pass/fail
84
+ - perspective_geometry_status: pass/fail
85
+ - shadow_vector_status: pass/fail
86
+ - scale_depth_status: pass/fail
87
+ - reflection_status: pass/fail/not_applicable
88
+ - anatomy_physics_status: pass/fail
89
+ - foreground_background_status: pass/fail
90
+ - chain_alignment_status: pass/fail
91
+ - rerender_or_recover_actions: [none or exact SHOTNN actions]
92
+ ```
93
+
94
+ Fail the render if the actual media contradicts the canonical `visual_world` or if a chained first frame is not an exact copy of the previous rendered end frame.
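One way to consume the report fields above is sketched here. It assumes each field sits on its own `- key: value` line exactly as in the template, and it accepts `not_applicable` only for `reflection_status`, since that is the only field the template lists it for. `parse_render_qa` and `render_passes` are hypothetical names.

```python
import re

# Matches markdown list lines of the form "- key: value".
_FIELD = re.compile(r"^\s*-\s*(\w+):\s*(.+?)\s*$")


def parse_render_qa(report_text: str) -> dict[str, str]:
    """Collect 'key: value' pairs from the report's list lines."""
    fields = {}
    for line in report_text.splitlines():
        match = _FIELD.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def render_passes(fields: dict[str, str]) -> bool:
    """Pass only if at least one *_status field exists and none of them fails."""
    statuses = {k: v for k, v in fields.items() if k.endswith("_status")}
    if not statuses:
        return False
    for key, value in statuses.items():
        allowed = ("pass", "not_applicable") if key == "reflection_status" else ("pass",)
        if value not in allowed:
            return False
    return True
```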
@@ -138,6 +138,7 @@ These cues reduce the pasted-PNG feeling.
138
138
  ## 4. Shared Light Map
139
139
 
140
140
  The fastest way to create a composite artifact is to light each subject as if they live in different scenes.
141
+ For the broader semantic gate, load `semantic-consistency/SKILL.md` and copy the canonical `visual_world.camera_movement_strategy`, `light_source`, `shadow_direction`, `color_temperature`, `scale_map`, `reflection_risk`, and `physics_constraints` into the shot language.
141
142
 
142
143
  ### Required Language
143
144
 
@@ -94,21 +94,25 @@ These artifacts identify AI-generated content. AVOID them absolutely:
94
94
 
95
95
  ## IMAGE AVOID LINE (Ultra Realism)
96
96
 
97
- Include at end of EVERY image prompt:
98
-
99
- ```
97
+ Include at end of EVERY image prompt:
98
+
99
+ ```
100
100
  Avoid: blurry, low-res, noise, jpeg artifacts, motion blur, out of focus, distorted faces, bad anatomy, extra limbs/fingers, deformed hands, mismatched eyes, warped perspective, inconsistent lighting, banding, over-sharpening, plastic skin, waxy skin, airbrushed skin, beauty filter, porcelain doll look, symmetrical face, duplicate people, floating artifacts, cutout edges, pasted composite look, toy-like scale, miniature effect, disconnected eyelines, on-screen text, captions/subtitles, watermark, logo, UI elements, cartoon/anime style, illustration style, CGI look, video game graphics, 3D render look, artificial lighting, synthetic appearance.
101
- ```
101
+ ```
102
+
103
+ Add scene-specific semantic terms from `semantic-consistency/SKILL.md` rather than blindly expanding the list. Prioritize perspective, shadow vector, scale, reflection, gravity/contact, anatomy, foreground/background, and contextual contradiction risks that actually exist in the shot.
102
104
 
103
105
  ---
104
106
 
105
107
  ## VIDEO AVOID LINE (Ultra Realism)
106
108
 
107
- Include at end of EVERY video prompt:
108
-
109
- ```
109
+ Include at end of EVERY video prompt:
110
+
111
+ ```
110
112
  Avoid: distorted faces, morphing, bad anatomy, extra limbs/fingers, blurry, flickering, frame drops, inconsistent lighting, unnatural motion, warping, rolling shutter artifacts, camera jitter, cutout edges, pasted composite look, toy-like scale, miniature effect, disconnected eyelines, on-screen text, captions/subtitles, watermark, logo, cartoon/anime style, CGI motion, synthetic appearance, robotic movement, puppet-like animation, uncanny valley expressions.
111
- ```
113
+ ```
114
+
115
+ For Start+End or image-to-video workflows, keep the semantic avoid line aligned with the same `visual_world` used by the still prompts.
112
116
 
113
117
  ---
114
118
 
@@ -50,7 +50,9 @@ If model is kling-3.0: keep Start+End transition mode and first/then/finally mot
50
50
  - Write each shot to `$OUTPUT_DIR/shots/SHOT[NN].md`
51
51
  - Keep one-file-per-shot contract
52
52
  - Ensure `ILK/İLK FRAME` code block exists even when chained
53
+ - For chained shots, the `ILK/İLK FRAME` code block must contain only `Use SHOT[prev]_END as exact first frame`; write `CHAIN BREAK` before any new start-frame prompt
53
54
  - Keep `Audio Plan` blocks aligned to the existing `voiceCast`
55
+ - Keep `shot-plan.json.visual_world` consistent or document a `CHAIN BREAK` scene-world reset
54
56
  - Update `$OUTPUT_DIR/_index.md`
55
57
 
56
58
  ### 5. Refresh Reports
@@ -59,6 +61,7 @@ After continuation batch:
59
61
 
60
62
  - run `/safety-check`
61
63
  - refresh `$OUTPUT_DIR/reports/SAFETY-REPORT.md`
64
+ - refresh `$OUTPUT_DIR/reports/SEMANTIC-REPORT.md`
62
65
  - refresh `$OUTPUT_DIR/reports/DELIVERY-REPORT.md`
63
66
 
64
67
  ---
@@ -34,6 +34,8 @@ All shot files follow `SHOTNN.md` naming.
34
34
  7. Verify `Model Control` block exists in every shot file.
35
35
  8. Verify `shot-plan.json` has `voiceCast` coverage for every speaker or narrator.
36
36
  9. Verify every speaking VIDEO section has an `Audio Plan` block with valid `activeSpeakerKey`.
37
+ 10. Verify `shot-plan.json.visual_world` exists and every shot follows its camera, lens, camera movement strategy, shadow vector, scale map, reflection, physics, and seed strategy.
38
+ 11. Verify chained `ILK/İLK FRAME` code blocks contain only `Use SHOT[prev]_END as exact first frame`; any competing prompt must be marked `CHAIN BREAK`.
37
39
  12. For `kling-3.0`, verify:
38
40
  - `Transition Mode: Start+End`
39
41
  - CFG value is documented
@@ -44,13 +46,14 @@ All shot files follow `SHOTNN.md` naming.
44
46
  Required reports:
45
47
 
46
48
  - `$OUTPUT_DIR/reports/SAFETY-REPORT.md`
49
+ - `$OUTPUT_DIR/reports/SEMANTIC-REPORT.md`
47
50
  - `$OUTPUT_DIR/reports/DELIVERY-REPORT.md`
48
51
 
49
52
  Gate rules:
50
53
 
51
54
  - reports must exist
52
55
  - reports must have non-contradictory status fields
53
- - both reports must be pass
56
+ - all reports must be pass
54
57
 
55
58
  If any rule fails, run `/recover` and do not finish.
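The gate rules above (reports must exist, and all reports must be pass) could be sketched as follows. This is an illustration under stated assumptions, not the package's code: it assumes each of the three reports carries a `- overall_status: pass` or `- overall_status: fail` line in the form shown for `SEMANTIC-REPORT.md`, and the function names are hypothetical.

```python
import re
from pathlib import Path

REQUIRED_REPORTS = ("SAFETY-REPORT.md", "SEMANTIC-REPORT.md", "DELIVERY-REPORT.md")


def overall_status(text):
    """Extract the overall_status value, or None when the field is absent."""
    # Assumption: the field is written as a bullet, "- overall_status: pass".
    match = re.search(r"^[ \t]*-[ \t]*overall_status:[ \t]*(\w+)[ \t]*$", text, re.MULTILINE)
    return match.group(1).lower() if match else None


def gate_failures(output_dir):
    """List every required report that is missing or not marked pass."""
    failures = []
    for name in REQUIRED_REPORTS:
        path = Path(output_dir) / "reports" / name
        if not path.is_file():
            failures.append(f"{name}: missing")
        elif overall_status(path.read_text(encoding="utf-8")) != "pass":
            failures.append(f"{name}: not pass")
    return failures
```

An empty result corresponds to the gate passing. Any entry corresponds to the case where the workflow says to run `/recover` and not finish.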
56
59
 
@@ -82,6 +85,7 @@ Do not declare completion unless:
82
85
 
83
86
  - all shot files pass structure and continuity
84
87
  - safety report is pass
88
+ - semantic report is pass
85
89
  - delivery report is pass
86
90
  - final summary says pass
87
91
  - model-specific checks pass (Kling or Veo profile)
@@ -38,6 +38,7 @@ $OUTPUT_DIR/
38
38
  │ └── ...
39
39
  ├── reports/
40
40
  │ ├── SAFETY-REPORT.md # Safety and policy validation
41
+ │ ├── SEMANTIC-REPORT.md # Perspective, shadow, scale, physics validation
41
42
  │ └── DELIVERY-REPORT.md # Final packaging validation
42
43
  └── _index.md # Shot list with status and report gate table
43
44
  ```
@@ -64,6 +65,7 @@ $OUTPUT_DIR/
64
65
  - `dialogue_name_policy` (`preserve-original-dialogue` or `anonymize-dialogue`)
65
66
  - top-level `voiceCast`
66
67
  - voice defaults (`single_active_speaker`, `music_default`, `subtitles_default`)
68
+ - top-level `visual_world` with `aspect_ratio`, `camera_height`, `lens_family`, `horizon_line`, `vanishing_point_strategy`, `camera_movement_strategy`, `light_source`, `shadow_direction`, `color_temperature`, `scale_map`, `reflection_risk`, `physics_constraints`, and `seed_strategy`
67
69
  10. Ensure every speaker or narrator has a stable `speakerKey` before shot writing.
68
70
  11. Ensure directories exist: `$OUTPUT_DIR/`, `$OUTPUT_DIR/shots/`, `$OUTPUT_DIR/reports/`.
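The top-level `visual_world` contract added in this hunk lists thirteen required keys. A minimal presence check might look like the sketch below. It only tests that the keys exist. The value types are not specified in this diff, so none are assumed. The function name `missing_visual_world_keys` is hypothetical.

```python
# Key names copied from the visual_world bullet in this hunk.
VISUAL_WORLD_KEYS = (
    "aspect_ratio",
    "camera_height",
    "lens_family",
    "horizon_line",
    "vanishing_point_strategy",
    "camera_movement_strategy",
    "light_source",
    "shadow_direction",
    "color_temperature",
    "scale_map",
    "reflection_risk",
    "physics_constraints",
    "seed_strategy",
)


def missing_visual_world_keys(shot_plan):
    """Return the required visual_world keys absent from a parsed shot-plan.json."""
    world = shot_plan.get("visual_world")
    if not isinstance(world, dict):
        # A missing or malformed visual_world means every key is missing.
        return list(VISUAL_WORLD_KEYS)
    return [key for key in VISUAL_WORLD_KEYS if key not in world]
```

In use, the plan would be loaded with `json.load` from `$OUTPUT_DIR/shot-plan.json` and a non-empty result treated as a `visual_world_status` failure.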
69
71
 
@@ -92,7 +94,7 @@ For EACH shot:
92
94
  4. Reuse or create the correct `voiceCast` entry for any speaking character or narrator.
93
95
  5. Generate main shot prompts (`ILK/İLK FRAME`, `SON FRAME`, `VIDEO`).
94
96
  5. `ILK/İLK FRAME` section MUST always include a fenced code block.
95
- 6. If chained, first-frame code block must explicitly state: `Use SHOT[prev]_END as exact first frame`.
97
+ 6. If chained, first-frame code block must contain only: `Use SHOT[prev]_END as exact first frame`. Any new visual prompt, camera, lens, lighting, or composition requires `CHAIN BREAK - [reason]`.
96
98
  7. Write a machine-readable `Audio Plan` JSON block for every VIDEO section.
97
99
  8. If dialogue or voiceover exists, require `activeSpeakerKey`, `dialogueLines`, and `performanceNote`.
98
100
  9. Keep one active speaker per shot. Split reply dialogue across multiple shots.
@@ -103,11 +105,19 @@ For EACH shot:
103
105
  - each coverage prompt: minimum 70 words
104
106
  8. Enforce specificity floor on every prompt:
105
107
  - explicit lens/framing/camera movement
108
+ - camera movement selected from the 24-move cinematic lexicon in `prompt-structure` when movement is present
106
109
  - explicit lighting direction/intensity/atmosphere
107
110
  - explicit foreground/midground/background action details
108
111
  - explicit eyeline target / body orientation when a subject looks at someone or something
109
112
  - explicit shared light source / bounce logic when multiple subjects share frame
110
113
  - explicit depth/scale integration when more than one plane is visible
114
+ 9. Enforce semantic consistency floor on every prompt:
115
+ - perspective/geometry matches `shot-plan.json -> visual_world`
116
+ - shadow vector follows the canonical light source and `shadow_direction`
117
+ - scale map and foreground/midground/background distances are physically plausible
118
+ - reflection handling is accurate or reflective surfaces are intentionally avoided
119
+ - gravity/contact physics, anatomy risk, foreground/background coherence, and contextual contradictions are resolved
120
+ - avoid line contains targeted scene-specific semantic failure terms
111
121
  10. Generate coverage prompts (2-3 per main shot, min 70 words each).
112
122
  11. Add Turkish summary for shot and each coverage section.
113
123
  12. Apply model-specific generation gates (see below).
@@ -126,7 +136,7 @@ Before writing prompts, design the Start→End transition:
126
136
  3. **Execution mode:** Default to `single-transition`; use `custom-storyboard` only when the shot truly has 2-3 meaningful internal phases.
127
137
  4. **Motion timeline:** Write 2-4 steps: `first → then → finally`.
128
138
  5. **Face/hands stability:** Match orientations between start and end — avoid >45° face rotation.
129
- 6. **Camera safety:** Use only safe movements (slow push-in, pan, tilt, micro-sway, tripod-locked).
139
+ 6. **Camera safety:** Use the 24-move cinematic lexicon; prefer safe simple moves (Dolly In/Out, Pan Left/Right, Tilt Up/Down, Truck Left/Right, Pedestal Up/Down, Tracking Shot, Steadicam Movement) before advanced hybrids.
130
140
  7. **Anti-fragmentation:** Do not turn one glance, gesture, or prop touch into separate micro-shots. If custom storyboard is used, cap it at 3 stages and make each stage editorially distinct.
131
141
 
132
142
  #### Veo Gate (when model is veo31)
@@ -139,7 +149,7 @@ Before writing prompts, design the Start→End transition:
139
149
 
140
150
  - [ ] `first → then → finally` motion timeline in VIDEO prompts
141
151
  - [ ] "What stays the same" explicitly stated (identity, background, costume)
142
- - [ ] Camera movement is simple and safe (no complex hybrid)
152
+ - [ ] Camera movement is named from the 24-move cinematic lexicon and kept simple/safe unless the beat requires an advanced hybrid
143
153
  - [ ] `stable background`, `no warping`, `physically plausible` constraints
144
154
  - [ ] Kling negative prompt set active (warping, rubbery, melted, deformed)
145
155
  - [ ] Duration matches transformation budget
@@ -162,8 +172,18 @@ Before writing prompts, design the Start→End transition:
162
172
  - `file_completeness_status`
163
173
  - `packaging_status`
164
174
  - `blockers`
165
- 4. Reject any shot output that passes while violating quality floor or specificity floor.
166
- 5. Do not finalize if any report is fail or missing.
175
+ 4. Write `$OUTPUT_DIR/reports/SEMANTIC-REPORT.md` with strict fields:
176
+ - `overall_status`
177
+ - `visual_world_status`
178
+ - `perspective_geometry_status`
179
+ - `shadow_vector_status`
180
+ - `scale_depth_status`
181
+ - `reflection_physics_anatomy_status`
182
+ - `contextual_contradiction_status`
183
+ - `chained_ilk_frame_status`
184
+ - `blockers`
185
+ 5. Reject any shot output that passes while violating quality floor, specificity floor, or semantic consistency floor.
186
+ 6. Do not finalize if any report is fail or missing.
167
187
 
168
188
  ---
169
189
 
@@ -192,8 +212,8 @@ FIRST SHOT / CHAINED from SHOT[prev]_END / CHAIN BREAK - Reason
192
212
  ### ILK FRAME (SHOTNN_START)
193
213
 
194
214
  ```text
195
- [Image prompt - min 80 words, Flow Order]
196
- [If chained: include "Use SHOT[prev]_END as exact first frame"]
215
+ [Image prompt - min 80 words, Flow Order for FIRST SHOT or CHAIN BREAK only]
216
+ [If chained: the entire code block must be only "Use SHOT[prev]_END as exact first frame"]
197
217
  Avoid: blurry, low-res, text, watermark, bad anatomy, distorted face ...
198
218
  ```
199
219
 
@@ -305,6 +325,7 @@ Report:
305
325
  [N] shot olusturuldu ve $OUTPUT_DIR/shots/ klasorune kaydedildi.
306
326
  Toplam: [N] ana shot + [M] coverage = [N+M] production shots.
307
327
  Safety report: $OUTPUT_DIR/reports/SAFETY-REPORT.md
328
+ Semantic report: $OUTPUT_DIR/reports/SEMANTIC-REPORT.md
308
329
  Delivery report: $OUTPUT_DIR/reports/DELIVERY-REPORT.md
309
330
 
310
331
  Devam etmek icin: 'devam et' veya '/chain'
@@ -12,9 +12,12 @@ $ARGUMENTS
12
12
 
13
13
  - safety report is fail
14
14
  - delivery report is fail
15
+ - semantic report is fail
15
16
  - missing required shot sections
16
17
  - continuity mismatch between neighboring shots
17
18
  - missing `voiceCast` entry or broken `activeSpeakerKey` binding
19
+ - missing or contradictory `shot-plan.json.visual_world`
20
+ - chained `ILK/İLK FRAME` block contains a new visual prompt instead of exact reuse
18
21
 
19
22
  ## Recovery Steps
20
23
 
@@ -24,11 +27,13 @@ $ARGUMENTS
24
27
  4. Repair `voiceCast` or `Audio Plan` bindings before rerunning reports.
25
28
  5. Re-run `/safety-check`.
26
29
  6. Regenerate `$OUTPUT_DIR/reports/DELIVERY-REPORT.md`.
27
- 7. Update `$OUTPUT_DIR/_index.md` with recovered status.
30
+ 7. Regenerate `$OUTPUT_DIR/reports/SEMANTIC-REPORT.md`.
31
+ 8. Update `$OUTPUT_DIR/_index.md` with recovered status.
28
32
 
29
33
  ## Exit Criteria
30
34
 
31
35
  Recovery is complete only when both are pass:
32
36
 
33
37
  - `$OUTPUT_DIR/reports/SAFETY-REPORT.md`
38
+ - `$OUTPUT_DIR/reports/SEMANTIC-REPORT.md`
34
39
  - `$OUTPUT_DIR/reports/DELIVERY-REPORT.md`
@@ -46,8 +46,18 @@ Validate all prompts before delivery to ensure platform compliance.
46
46
  3. Continuity checks
47
47
  - `SHOT[N]_END` aligns with `SHOT[N+1]_START`
48
48
  - chain breaks are explicitly declared when needed
49
-
50
- 4. Quality checks
49
+ - chained `ILK/İLK FRAME` code blocks contain only `Use SHOT[prev]_END as exact first frame`; competing new prompts are automatic fail
50
+
51
+ 4. Semantic consistency checks
52
+ - `shot-plan.json` contains top-level `visual_world`
53
+ - `visual_world` includes `aspect_ratio`, `camera_height`, `lens_family`, `horizon_line`, `vanishing_point_strategy`, `camera_movement_strategy`, `light_source`, `shadow_direction`, `color_temperature`, `scale_map`, `reflection_risk`, `physics_constraints`, and `seed_strategy`
54
+ - every prompt aligns perspective/geometry with the canonical lens/camera/horizon/vanishing logic
55
+ - every moving prompt uses one of the 24 canonical cinematic camera movement names from `prompt-structure`
56
+ - every prompt states one shadow vector from the canonical light source
57
+ - scale/depth, foreground/background coherence, reflection handling, gravity/contact physics, anatomy risk, and contextual contradictions are resolved
58
+ - avoid lines include targeted scene-specific semantic failure terms
59
+
60
+ 5. Quality checks
51
61
  - `ILK/İLK FRAME` prompt >= 80 words
52
62
  - `SON FRAME` prompt >= 80 words
53
63
  - `VIDEO/VİDEO` prompt >= 120 words
@@ -60,7 +70,7 @@ Validate all prompts before delivery to ensure platform compliance.
60
70
  - contact / weight / support cues exist when compositing realism is critical
61
71
  - depth / scale integration is explicit when multiple planes are visible
62
72
 
63
- 5. Kling-only checks (when model is `kling-3.0`)
73
+ 6. Kling-only checks (when model is `kling-3.0`)
64
74
  - `Model Control` block exists in each SHOT file
65
75
  - `Transition Mode: Start+End` is declared
66
76
  - VIDEO prompt contains explicit `first -> then -> finally` progression
@@ -90,6 +100,30 @@ Write `$OUTPUT_DIR/reports/SAFETY-REPORT.md` using strict fields:
90
100
  - [concise bullet list or none]
91
101
  ```
92
102
 
103
+ Write `$OUTPUT_DIR/reports/SEMANTIC-REPORT.md` using strict fields:
104
+
105
+ ```markdown
106
+ # SEMANTIC REPORT
107
+
108
+ - overall_status: pass/fail
109
+ - visual_world_status: pass/fail
110
+ - perspective_geometry_status: pass/fail
111
+ - shadow_vector_status: pass/fail
112
+ - scale_depth_status: pass/fail
113
+ - reflection_physics_anatomy_status: pass/fail
114
+ - foreground_background_status: pass/fail
115
+ - contextual_contradiction_status: pass/fail
116
+ - chained_ilk_frame_status: pass/fail
117
+ - blockers:
118
+ - none
119
+
120
+ ## Findings
121
+ - [concise bullet list]
122
+
123
+ ## Fixes Applied
124
+ - [concise bullet list or none]
125
+ ```
126
+
93
127
  Rules:
94
128
  - Status fields must not contradict each other.
95
129
  - If any status is fail, `overall_status` must be fail.
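The rule that any failing status forces `overall_status` to fail can be verified against a written report. The sketch below parses the `*_status` bullets in the format of the `SEMANTIC-REPORT.md` template and reports contradictions. It is illustrative only, and the names `parse_statuses` and `contradictions` are hypothetical.

```python
import re

# Matches bullets such as "- shadow_vector_status: pass".
STATUS_LINE = re.compile(
    r"^[ \t]*-[ \t]*(\w+_status):[ \t]*(pass|fail)[ \t]*$", re.MULTILINE
)


def parse_statuses(report_md):
    """Map each *_status field in the report to 'pass' or 'fail'."""
    return dict(STATUS_LINE.findall(report_md))


def contradictions(report_md):
    """Return problems where overall_status disagrees with a failing field."""
    statuses = parse_statuses(report_md)
    overall = statuses.pop("overall_status", None)
    problems = []
    if overall is None:
        problems.append("overall_status missing")
    failed = sorted(name for name, value in statuses.items() if value == "fail")
    if failed and overall == "pass":
        problems.append(
            "overall_status is pass but failing fields: " + ", ".join(failed)
        )
    return problems
```

The check covers only the one direction stated in the rules. A report with `overall_status: fail` and no failing sub-field is not flagged here, since a blocker could justify it.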
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@milenyumai/film-kit",
3
- "version": "1.4.1",
3
+ "version": "1.4.2",
4
4
  "description": "Hollywood-standard cinematic prompt engineering toolkit with model profiles (Veo 3.1 / Kling 3.0). Auto-configures AI agents (Cursor, Claude Code, VS Code Copilot, Antigravity) with production-grade shot generation system.",
5
5
  "type": "module",
6
6
  "main": "./build/index.js",