@milenyumai/film-kit 1.4.1 → 1.4.2
- package/build/lib/templates.js +84 -40
- package/content/ARCHITECTURE.md +6 -3
- package/content/MASTER.md +10 -7
- package/content/RULES.md +8 -6
- package/content/agents/prompt-engineer.md +9 -7
- package/content/skills/coverage-system/SKILL.md +1 -1
- package/content/skills/frame-chaining/SKILL.md +1 -1
- package/content/skills/prompt-structure/SKILL.md +90 -43
- package/content/skills/semantic-consistency/SKILL.md +94 -0
- package/content/skills/spatial-blocking/SKILL.md +1 -0
- package/content/skills/visual-modes/SKILL.md +12 -8
- package/content/workflows/chain.md +3 -0
- package/content/workflows/finish.md +5 -1
- package/content/workflows/generate.md +28 -7
- package/content/workflows/recover.md +6 -1
- package/content/workflows/safety-check.md +37 -3
- package/package.json +1 -1
package/build/lib/templates.js
CHANGED
|
@@ -48,13 +48,14 @@ All rules, skills, and workflows are located under \`.agent/\`.
|
|
|
48
48
|
- **Model Profile:** \`.agent/model-profile.md\` — Active model rules and constraints
|
|
49
49
|
- **Agent:** \`.agent/agents/prompt-engineer.md\` — Senior prompt engineer agent
|
|
50
50
|
|
|
51
|
-
### Skills (
|
|
51
|
+
### Skills (9 modules)
|
|
52
52
|
| Skill | Path | Priority |
|
|
53
53
|
|-------|------|----------|
|
|
54
54
|
| Safety Compliance | \`.agent/skills/safety-compliance/SKILL.md\` | P0 — ALWAYS |
|
|
55
55
|
| Reference Locking | \`.agent/skills/reference-locking/SKILL.md\` | P1 — When refs provided |
|
|
56
56
|
| Frame Chaining | \`.agent/skills/frame-chaining/SKILL.md\` | P2 — ALWAYS |
|
|
57
57
|
| Spatial Blocking | \`.agent/skills/spatial-blocking/SKILL.md\` | P2 — Relational realism / gaze / depth |
|
|
58
|
+
| Semantic Consistency | \`.agent/skills/semantic-consistency/SKILL.md\` | P2 — ALWAYS, visual_world + physics gate |
|
|
58
59
|
| Coverage System | \`.agent/skills/coverage-system/SKILL.md\` | P2 — ALWAYS (mandatory) |
|
|
59
60
|
| Visual Modes | \`.agent/skills/visual-modes/SKILL.md\` | P4 — ALWAYS |
|
|
60
61
|
| Audio Design | \`.agent/skills/audio-design/SKILL.md\` | P4 — When dialogue/SFX |
|
|
@@ -75,6 +76,7 @@ When the user asks \`/generate\`, convert the scenario into:
|
|
|
75
76
|
- \`${config.outputDir}/shot-plan.json\` — Single-agent plan + policy + \`voiceCast\` contract
|
|
76
77
|
- \`${config.outputDir}/shots/SHOT01.md, SHOT02.md, ...\` — Production shot files (with coverage included)
|
|
77
78
|
- \`${config.outputDir}/reports/SAFETY-REPORT.md\` — Safety gate result
|
|
79
|
+
- \`${config.outputDir}/reports/SEMANTIC-REPORT.md\` — Semantic consistency gate result
|
|
78
80
|
- \`${config.outputDir}/reports/DELIVERY-REPORT.md\` — Delivery gate result
|
|
79
81
|
- \`${config.outputDir}/_index.md\` — Shot list with chain & status tracking
|
|
80
82
|
|
|
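The three gate reports added to this list share one directory layout, so their paths can be derived from the output directory. A minimal sketch, assuming the `reports/` layout shown in the template; the helper names `reportPaths` and `missingReports` are hypothetical and not part of the package:

```javascript
// Gate report file names as listed in the template above.
const GATE_REPORTS = ["SAFETY-REPORT.md", "SEMANTIC-REPORT.md", "DELIVERY-REPORT.md"];

// Hypothetical helper: build the report paths for a given output directory.
function reportPaths(outputDir) {
  // Strip trailing slashes so the joined path never contains "//".
  const base = outputDir.replace(/\/+$/, "");
  return GATE_REPORTS.map((name) => `${base}/reports/${name}`);
}

// Hypothetical helper: list the reports that still have to be written,
// given the paths that already exist.
function missingReports(outputDir, existingPaths) {
  const present = new Set(existingPaths);
  return reportPaths(outputDir).filter((p) => !present.has(p));
}
```

A caller could refuse to run `/finish` while `missingReports(...)` is non-empty; that policy is an assumption drawn from the "before `/finish`" wording elsewhere in this diff.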
@@ -100,6 +102,7 @@ Each \`SHOTNN.md\` is a **single file** containing ALL shot details:
|
|
|
100
102
|
- **Name Policy:** Visual prompts must stay anonymous. Dialogue naming follows \`shot-plan.json\` policy.
|
|
101
103
|
- **AUTO-SAFETY:** Proactively reframe content that may trigger safety filters
|
|
102
104
|
- **Frame Chaining:** Last frame of SHOT[N] = First frame of SHOT[N+1]
|
|
105
|
+
- **Semantic Consistency:** \`shot-plan.json.visual_world\` is canonical for perspective, named camera movement strategy, shadow vector, scale, reflection, physics, and seed strategy
|
|
103
106
|
- **Coverage Mandatory:** Every main shot includes 2-3 coverage sub-shots in same file
|
|
104
107
|
- **Voice Design:** \`shot-plan.json\` keeps top-level \`voiceCast\`; every speaking VIDEO section keeps \`Audio Plan\`
|
|
105
108
|
- **Music: NONE** by default (user must explicitly request)
|
|
@@ -122,6 +125,7 @@ Read .agent/VOICE-DESIGN.md when dialogue, narrator VO, or reusable speaker iden
|
|
|
122
125
|
| Reference Locking | .agent/skills/reference-locking/SKILL.md | When refs provided |
|
|
123
126
|
| Frame Chaining | .agent/skills/frame-chaining/SKILL.md | Multi-shot projects |
|
|
124
127
|
| Spatial Blocking | .agent/skills/spatial-blocking/SKILL.md | Multi-subject / gaze / scale-critical shots |
|
|
128
|
+
| Semantic Consistency | .agent/skills/semantic-consistency/SKILL.md | ALWAYS |
|
|
125
129
|
| Coverage System | .agent/skills/coverage-system/SKILL.md | ALWAYS (mandatory) |
|
|
126
130
|
| Visual Modes | .agent/skills/visual-modes/SKILL.md | All visual work |
|
|
127
131
|
| Audio Design | .agent/skills/audio-design/SKILL.md | Dialogue/SFX needed |
|
|
@@ -137,9 +141,11 @@ Read .agent/VOICE-DESIGN.md when dialogue, narrator VO, or reusable speaker iden
|
|
|
137
141
|
7. EVERY prompt must have an Avoid line. No exceptions.
|
|
138
142
|
8. Coverage shots mandatory (2-3 per main shot, min 60 words each, included in same file).
|
|
139
143
|
9. Frame chaining: Last frame of SHOT[N] = First frame of SHOT[N+1].
|
|
140
|
-
10.
|
|
141
|
-
11.
|
|
142
|
-
12.
|
|
144
|
+
10. Semantic consistency: \`${config.outputDir}/shot-plan.json\` must include \`visual_world\`; prompts must align camera, named movement strategy, light/shadow vector, scale, reflections, physics, anatomy risk, and contextual logic.
|
|
145
|
+
11. ILK/İLK FRAME section must contain a code block even for chained shots.
|
|
146
|
+
12. Chained ILK/İLK FRAME code blocks must contain only: \`Use SHOT[prev]_END as exact first frame\`; any new visual prompt is a CHAIN BREAK.
|
|
147
|
+
13. ONE FILE PER SHOT: Each SHOTNN.md contains main shot + all coverage shots.
|
|
148
|
+
14. Keep top-level \`voiceCast\` in ${config.outputDir}/shot-plan.json and \`Audio Plan\` in every speaking VIDEO section.
|
|
143
149
|
|
|
144
150
|
## WORKFLOWS
|
|
145
151
|
- /generate → Read .agent/workflows/generate.md
|
|
@@ -174,7 +180,7 @@ Read \`.agent/model-profile.md\` for active model constraints.
|
|
|
174
180
|
|
|
175
181
|
## SKILL LOADING (MANDATORY)
|
|
176
182
|
Before generating ANY prompts:
|
|
177
|
-
1. ALWAYS load: safety-compliance, frame-chaining, coverage-system, prompt-structure, visual-modes
|
|
183
|
+
1. ALWAYS load: safety-compliance, frame-chaining, semantic-consistency, coverage-system, prompt-structure, visual-modes
|
|
178
184
|
2. Load for relational realism: spatial-blocking
|
|
179
185
|
3. Load if refs provided: reference-locking
|
|
180
186
|
4. Load if dialogue/SFX: audio-design
|
|
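The loading rules in this hunk amount to a fixed "always" set plus three conditional skills. A minimal sketch of that selection, assuming the `.agent/skills/[name]/SKILL.md` path pattern used throughout the template; the function name and the flag names (`relationalRealism`, `hasRefs`, `hasDialogueOrSfx`) are hypothetical:

```javascript
// Skills the 1.4.2 template lists under "ALWAYS load".
const ALWAYS_SKILLS = [
  "safety-compliance",
  "frame-chaining",
  "semantic-consistency",
  "coverage-system",
  "prompt-structure",
  "visual-modes",
];

// Hypothetical helper: resolve the SKILL.md files to read for one scenario.
function skillsToLoad({ relationalRealism = false, hasRefs = false, hasDialogueOrSfx = false } = {}) {
  const skills = [...ALWAYS_SKILLS];
  if (relationalRealism) skills.push("spatial-blocking");
  if (hasRefs) skills.push("reference-locking");
  if (hasDialogueOrSfx) skills.push("audio-design");
  return skills.map((name) => `.agent/skills/${name}/SKILL.md`);
}
```

With every flag set, the result covers all nine skill modules named in this release.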
@@ -194,9 +200,11 @@ All skills at: \`.agent/skills/[name]/SKILL.md\`
|
|
|
194
200
|
- AUTO-ANONYMOUS: Replace ALL real names with physical descriptions
|
|
195
201
|
- Dialogue naming follows \`${config.outputDir}/shot-plan.json\` policy
|
|
196
202
|
- \`shot-plan.json\` stores top-level \`voiceCast\`
|
|
203
|
+
- \`shot-plan.json\` stores top-level \`visual_world\` for camera/lens/camera-movement/light/shadow/scale/reflection/physics/seed strategy
|
|
197
204
|
- Every speaking VIDEO section includes \`Audio Plan\`
|
|
198
205
|
- AUTO-SAFETY: Proactively reframe sensitive content
|
|
199
206
|
- Frame chaining: Last frame SHOT[N] = First frame SHOT[N+1]
|
|
207
|
+
- Chained ILK/İLK FRAME code block contains only \`Use SHOT[prev]_END as exact first frame\`; any new visual prompt requires CHAIN BREAK
|
|
200
208
|
- Coverage: 2-3 sub-shots per main shot (min 60 words each, in same file)
|
|
201
209
|
- Avoid line: MANDATORY on every prompt
|
|
202
210
|
- Music: NONE by default
|
|
@@ -241,6 +249,7 @@ Before generating ANY prompts, read skills from \`.agent/skills/\`:
|
|
|
241
249
|
- \`reference-locking/SKILL.md\` — When refs provided (P1)
|
|
242
250
|
- \`frame-chaining/SKILL.md\` — ALWAYS for multi-shot (P2)
|
|
243
251
|
- \`spatial-blocking/SKILL.md\` — when gaze / scale / compositing realism matters (P2)
|
|
252
|
+
- \`semantic-consistency/SKILL.md\` — ALWAYS, canonical \`visual_world\` + physics gate (P2)
|
|
244
253
|
- \`coverage-system/SKILL.md\` — ALWAYS, mandatory (P2)
|
|
245
254
|
- \`visual-modes/SKILL.md\` — ALWAYS (P4)
|
|
246
255
|
- \`audio-design/SKILL.md\` — When dialogue/SFX (P4)
|
|
@@ -252,8 +261,9 @@ Before generating ANY prompts, read skills from \`.agent/skills/\`:
|
|
|
252
261
|
- Kling preset: \`${config.klingPreset}\`
|
|
253
262
|
- Create \`${config.outputDir}/project-info.md\`, \`${config.outputDir}/shot-plan.json\`, and \`${config.outputDir}/_index.md\`
|
|
254
263
|
- Keep top-level \`voiceCast\` in \`${config.outputDir}/shot-plan.json\`
|
|
264
|
+
- Keep top-level \`visual_world\` in \`${config.outputDir}/shot-plan.json\`
|
|
255
265
|
- Write \`${config.outputDir}/shots/SHOTNN.md\` per shot; coverage stays in the same file
|
|
256
|
-
- Refresh \`${config.outputDir}/reports/SAFETY-REPORT.md
|
|
266
|
+
- Refresh \`${config.outputDir}/reports/SAFETY-REPORT.md\`, \`${config.outputDir}/reports/SEMANTIC-REPORT.md\`, and \`${config.outputDir}/reports/DELIVERY-REPORT.md\` before \`/finish\`
|
|
257
267
|
|
|
258
268
|
## Non-Negotiables
|
|
259
269
|
1. **AUTO-ANONYMOUS:** Replace ALL real person names in visual prompts with physical descriptions.
|
|
@@ -266,10 +276,12 @@ Before generating ANY prompts, read skills from \`.agent/skills/\`:
|
|
|
266
276
|
8. **Coverage:** 2-3 sub-shots within same SHOTNN.md file, min 70 words each.
|
|
267
277
|
9. **Voice Design:** keep project-level \`voiceCast\` in \`${config.outputDir}/shot-plan.json\` and per-shot \`Audio Plan\` in each VIDEO section.
|
|
268
278
|
10. **ILK/İLK FRAME:** Always include a fenced code block, even when chained.
|
|
269
|
-
11. **
|
|
270
|
-
12. **
|
|
271
|
-
13. **
|
|
272
|
-
14. **
|
|
279
|
+
11. **Chained ILK/İLK FRAME:** code block contains only \`Use SHOT[prev]_END as exact first frame\`; any new visual prompt requires CHAIN BREAK.
|
|
280
|
+
12. **Quality Floor:** ILK >= 80, SON >= 80, VIDEO >= 120, coverage >= 70 words.
|
|
281
|
+
13. **Specificity Floor:** lens/framing, lighting, and foreground/midground/background action are mandatory.
|
|
282
|
+
14. **Spatial Realism Floor:** eyeline target, plane map, shared light source, and contact/depth cues are mandatory when relational staging matters.
|
|
283
|
+
15. **Semantic Consistency Floor:** \`visual_world\`, perspective/geometry, shadow vector, scale map, reflections, gravity/contact physics, anatomy risk, foreground/background coherence, contextual contradictions, and targeted semantic avoid terms are mandatory.
|
|
284
|
+
16. **ONE FILE PER SHOT:** No separate coverage files.
|
|
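The Quality Floor item above is a set of minimum word counts per section. A minimal sketch of such a check, assuming whitespace-separated words and a hypothetical `sections` object shape; the source does not say how the ILK floor applies to chained shots, so this sketch does not treat them specially:

```javascript
// Minimum word counts from the Quality Floor rule.
const WORD_FLOOR = { ILK: 80, SON: 80, VIDEO: 120, COVERAGE: 70 };

// Count whitespace-separated words; an empty or blank string counts as zero.
function countWords(text) {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Hypothetical shape: { ILK: string, SON: string, VIDEO: string, COVERAGE: string[] }.
// Returns one message per section that falls below its floor.
function checkQualityFloor(sections) {
  const failures = [];
  for (const key of ["ILK", "SON", "VIDEO"]) {
    const n = countWords(sections[key] ?? "");
    if (n < WORD_FLOOR[key]) failures.push(`${key}: ${n} < ${WORD_FLOOR[key]}`);
  }
  (sections.COVERAGE ?? []).forEach((text, i) => {
    const n = countWords(text);
    if (n < WORD_FLOOR.COVERAGE) failures.push(`COVERAGE[${i}]: ${n} < ${WORD_FLOOR.COVERAGE}`);
  });
  return failures;
}
```

An empty result means every section meets its floor; the template treats any failure as a hard gate rather than a warning.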
273
285
|
|
|
274
286
|
## Workflows
|
|
275
287
|
| Command | Workflow |
|
|
@@ -306,9 +318,11 @@ This workspace keeps high-level policy in \`CLAUDE.md\` and operational detail i
|
|
|
306
318
|
- Keep one file per shot: \`${config.outputDir}/shots/SHOTNN.md\`
|
|
307
319
|
- Maintain \`${config.outputDir}/shot-plan.json\` dialogue naming policy
|
|
308
320
|
- Maintain \`${config.outputDir}/shot-plan.json\` top-level \`voiceCast\`
|
|
321
|
+
- Maintain \`${config.outputDir}/shot-plan.json\` top-level \`visual_world\`
|
|
309
322
|
- Keep \`Audio Plan\` blocks aligned to \`voiceCast\`
|
|
310
323
|
- Keep \`ILK/İLK FRAME\` in a fenced code block even when chained
|
|
311
324
|
- Quality floor and specificity floor are hard gates, not suggestions
|
|
325
|
+
- Semantic consistency floor is a hard gate: camera/lens/camera-movement/light/shadow/scale/reflection/physics/anatomy/context must align to \`visual_world\`
|
|
312
326
|
- Apply \`.agent/skills/spatial-blocking/SKILL.md\` whenever eyeline, compositing, or depth realism is critical
|
|
313
327
|
|
|
314
328
|
## Debugging
|
|
@@ -336,14 +350,17 @@ Use the Film-Kit core runtime.
|
|
|
336
350
|
- draft and repair shot files under \`${config.outputDir}/shots/\`
|
|
337
351
|
- apply \`${config.outputDir}/shot-plan.json\` dialogue naming policy
|
|
338
352
|
- maintain top-level \`voiceCast\` inside \`${config.outputDir}/shot-plan.json\`
|
|
353
|
+
- maintain top-level \`visual_world\` inside \`${config.outputDir}/shot-plan.json\`
|
|
339
354
|
- keep \`Audio Plan\` blocks valid against \`voiceCast\`
|
|
340
355
|
- enforce AUTO-ANONYMOUS, AUTO-SAFETY, chaining, and coverage contracts
|
|
341
356
|
- enforce quality floor: ILK >= 80, SON >= 80, VIDEO >= 120, coverage >= 70
|
|
342
357
|
- enforce specificity floor: lens/framing, lighting, and foreground/midground/background action
|
|
343
358
|
- enforce spatial realism: explicit eyeline target, plane map, shared light source, and contact/depth cues when needed
|
|
359
|
+
- enforce semantic consistency: \`visual_world\`, perspective/geometry, shadow vector, scale map, reflection handling, physics/anatomy risk, foreground/background coherence, contextual contradictions, and scene-specific avoid terms
|
|
344
360
|
|
|
345
361
|
## Boundaries
|
|
346
362
|
- do not skip safety or delivery reports
|
|
363
|
+
- do not pass chained ILK/İLK FRAME blocks that contain anything besides exact reuse text
|
|
347
364
|
- do not split coverage into separate files
|
|
348
365
|
- if asked to review only, report issues instead of regenerating shots by default
|
|
349
366
|
`;
|
|
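The new boundary on chained ILK/İLK FRAME blocks is mechanical enough to check in code. A minimal sketch, assuming the previous shot is identified by a string such as `SHOT01` and that the first fenced block in the section is the one under test; the function name is hypothetical, and accepting a trailing period is an assumption based on one template line that quotes the reuse text with a period:

```javascript
// Build the fence marker at runtime so this example does not contain a literal fence.
const TICKS = "`".repeat(3);
const FENCE = new RegExp(TICKS + "[^\\n]*\\n([\\s\\S]*?)" + TICKS);

// Hypothetical check for a chained ILK/İLK FRAME section.
function checkChainedFirstFrame(sectionText, prevShotId) {
  const match = FENCE.exec(sectionText);
  if (!match) return { ok: false, reason: "missing fenced code block" };
  const body = match[1].trim();
  const expected = `Use ${prevShotId}_END as exact first frame`;
  // Assumption: tolerate one trailing period after the reuse text.
  if (body === expected || body === expected + ".") return { ok: true };
  return { ok: false, reason: "extra content in chained block; declare CHAIN BREAK" };
}
```

Any text beyond the reuse sentence fails the check, which matches the rule that a new visual prompt in a chained section requires a CHAIN BREAK.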
@@ -367,6 +384,7 @@ If using the native Claude subagent, read \`.claude/agents/prompt-engineer.md\`
|
|
|
367
384
|
- Create \`${config.outputDir}/project-info.md\`
|
|
368
385
|
- Create \`${config.outputDir}/shot-plan.json\`
|
|
369
386
|
- Add top-level \`voiceCast\` before writing speaking shots
|
|
387
|
+
- Add top-level \`visual_world\` before writing visual prompts
|
|
370
388
|
|
|
371
389
|
2. **Batch Strategy:**
|
|
372
390
|
- 1-10 shots → Generate all at once
|
|
@@ -378,9 +396,11 @@ If using the native Claude subagent, read \`.claude/agents/prompt-engineer.md\`
|
|
|
378
396
|
- Generate main shot (İLK FRAME + SON FRAME + VİDEO)
|
|
379
397
|
- Add machine-readable \`Audio Plan\` before every VIDEO section
|
|
380
398
|
- Keep İLK FRAME as fenced code block even when chained
|
|
399
|
+
- If chained, keep İLK FRAME code block to exact reuse text only; new visual prompt means CHAIN BREAK
|
|
381
400
|
- Enforce hard quality floor: ILK >= 80, SON >= 80, VIDEO >= 120, coverage >= 70
|
|
382
401
|
- Enforce specificity floor: lens/framing + lighting + foreground/midground/background action
|
|
383
402
|
- Enforce spatial realism floor: eyeline target + plane map + shared light source + contact/depth cues when applicable
|
|
403
|
+
- Enforce semantic consistency floor: perspective/geometry + shadow vector + scale map + reflections + gravity/contact physics + anatomy risk + contextual contradiction check
|
|
384
404
|
- Generate 2-3 coverage shots (in same file)
|
|
385
405
|
- Write to \`${config.outputDir}/shots/SHOT[NN].md\`
|
|
386
406
|
- Update \`${config.outputDir}/_index.md\`
|
|
@@ -388,6 +408,7 @@ If using the native Claude subagent, read \`.claude/agents/prompt-engineer.md\`
|
|
|
388
408
|
4. **Validation Gates:**
|
|
389
409
|
- Run /safety-check
|
|
390
410
|
- Write \`${config.outputDir}/reports/SAFETY-REPORT.md\`
|
|
411
|
+
- Write \`${config.outputDir}/reports/SEMANTIC-REPORT.md\`
|
|
391
412
|
- Write \`${config.outputDir}/reports/DELIVERY-REPORT.md\`
|
|
392
413
|
- If any gate fails, run \`.agent/workflows/recover.md\`
|
|
393
414
|
|
|
@@ -405,12 +426,13 @@ function buildClaudeRuleOutputContract(config) {
|
|
|
405
426
|
|
|
406
427
|
## Required Files
|
|
407
428
|
- \`${config.outputDir}/project-info.md\` — Characters, settings, emotional arc mapping, tension levels
|
|
408
|
-
- \`${config.outputDir}/shot-plan.json\` — Name policy, shot plan, validation contract, and top-level \`
|
|
429
|
+
- \`${config.outputDir}/shot-plan.json\` — Name policy, shot plan, validation contract, top-level \`voiceCast\`, and top-level \`visual_world\`
|
|
409
430
|
- \`.agent/model-profile.md\` — Active model constraints and presets
|
|
410
431
|
- \`.agent/VOICE-DESIGN.md\` — Voice identity and shot audio contract
|
|
411
432
|
- \`${config.outputDir}/_index.md\` — Shot tracking with chain & status
|
|
412
433
|
- \`${config.outputDir}/shots/SHOT01.md ... SHOTNN.md\` — Individual shot files (one file per shot)
|
|
413
434
|
- \`${config.outputDir}/reports/SAFETY-REPORT.md\` — Safety gate report
|
|
435
|
+
- \`${config.outputDir}/reports/SEMANTIC-REPORT.md\` — Semantic consistency gate report
|
|
414
436
|
- \`${config.outputDir}/reports/DELIVERY-REPORT.md\` — Delivery gate report
|
|
415
437
|
|
|
416
438
|
## Prompt Flow Order (MANDATORY)
|
|
@@ -449,10 +471,11 @@ FIRST SHOT / CHAINED from SHOT[prev]_END / CHAIN BREAK - Reason
|
|
|
449
471
|
## Main Shot
|
|
450
472
|
|
|
451
473
|
### İLK FRAME (SHOTNN_START)
|
|
452
|
-
[If chained: "
|
|
474
|
+
[If chained: the code block below must contain only "Use SHOT[prev]_END as exact first frame"]
|
|
453
475
|
|
|
454
476
|
> NOTE: Even when chained, this section MUST contain a fenced code block.
|
|
455
|
-
> If chained,
|
|
477
|
+
> If chained, the fenced code block must contain only: "Use SHOT[prev]_END as exact first frame."
|
|
478
|
+
> Any new visual prompt in a chained ILK FRAME section requires CHAIN BREAK.
|
|
456
479
|
|
|
457
480
|
\\\`\\\`\\\`
|
|
458
481
|
[Image prompt — min 60 words, following prompt flow order]
|
|
@@ -596,6 +619,11 @@ Character gaze directions must be spatially consistent between cuts.
|
|
|
596
619
|
- Keep one motivated light source across subjects.
|
|
597
620
|
- Add contact / weight / support cues to avoid pasted composite look.
|
|
598
621
|
|
|
622
|
+
### Semantic Consistency
|
|
623
|
+
- \`shot-plan.json.visual_world\` is the canonical scene contract.
|
|
624
|
+
- Prompts must agree with its aspect ratio, camera height, lens family, horizon line, vanishing strategy, camera movement strategy, light source, shadow direction, color temperature, scale map, reflection risk, physics constraints, and seed strategy.
|
|
625
|
+
- Avoid contextual contradictions unless the prompt explicitly explains the unusual physics or style.
|
|
626
|
+
|
|
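The Semantic Consistency section lists what `visual_world` must pin down but does not show its shape. A minimal sketch of a presence check, where every key name under `visual_world` is a hypothetical camelCase rendering of the listed items and not a schema confirmed by the package:

```javascript
// Hypothetical key names, one per item listed in the Semantic Consistency section.
const VISUAL_WORLD_FIELDS = [
  "aspectRatio",
  "cameraHeight",
  "lensFamily",
  "horizonLine",
  "vanishingStrategy",
  "cameraMovementStrategy",
  "lightSource",
  "shadowDirection",
  "colorTemperature",
  "scaleMap",
  "reflectionRisk",
  "physicsConstraints",
  "seedStrategy",
];

// Returns the fields that are absent or empty in a parsed shot-plan.json object.
function missingVisualWorldFields(shotPlan) {
  const world = shotPlan && shotPlan.visual_world;
  if (world === null || typeof world !== "object") return [...VISUAL_WORLD_FIELDS];
  return VISUAL_WORLD_FIELDS.filter(
    (field) => world[field] === undefined || world[field] === null || world[field] === ""
  );
}
```

A presence check like this only confirms that the contract exists; agreement between each prompt and the contract still has to be reviewed per shot.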
599
627
|
### Dramaturgy (for dialogue scenes)
|
|
600
628
|
Analyze per character: Objective → Obstacle → Stakes → Subtext → Beat turns.
|
|
601
629
|
Embed as physical behavior in prompts, NOT as metadata.
|
|
@@ -632,10 +660,11 @@ Before generating ANY prompts, read skills from \`.agent/skills/\`:
|
|
|
632
660
|
2. \`reference-locking/SKILL.md\` — When reference images provided
|
|
633
661
|
3. \`frame-chaining/SKILL.md\` — ALWAYS for multi-shot continuity
|
|
634
662
|
4. \`spatial-blocking/SKILL.md\` — When gaze / depth / scale realism is critical
|
|
635
|
-
5. \`
|
|
636
|
-
6. \`
|
|
637
|
-
7. \`
|
|
638
|
-
8. \`
|
|
663
|
+
5. \`semantic-consistency/SKILL.md\` — ALWAYS (visual_world + semantic QA)
|
|
664
|
+
6. \`coverage-system/SKILL.md\` — ALWAYS (mandatory coverage shots)
|
|
665
|
+
7. \`visual-modes/SKILL.md\` — ALWAYS (Ultra Realism default)
|
|
666
|
+
8. \`audio-design/SKILL.md\` — When dialogue or SFX needed
|
|
667
|
+
9. \`prompt-structure/SKILL.md\` — ALWAYS (prompt templates)
|
|
639
668
|
|
|
640
669
|
### When User Asks /generate
|
|
641
670
|
1. Read \`.agent/workflows/generate.md\` for the full procedure
|
|
@@ -645,15 +674,18 @@ Before generating ANY prompts, read skills from \`.agent/skills/\`:
|
|
|
645
674
|
5. Create project info: \`${config.outputDir}/project-info.md\`
|
|
646
675
|
6. Create plan: \`${config.outputDir}/shot-plan.json\`
|
|
647
676
|
7. Keep top-level \`voiceCast\` in the plan and \`Audio Plan\` in speaking VIDEO sections
|
|
648
|
-
8.
|
|
677
|
+
8. Keep top-level \`visual_world\` in the plan for camera/lens/camera-movement/light/shadow/scale/reflection/physics/seed rules
|
|
678
|
+
9. Write reports: \`${config.outputDir}/reports/SAFETY-REPORT.md\`, \`${config.outputDir}/reports/SEMANTIC-REPORT.md\`, \`${config.outputDir}/reports/DELIVERY-REPORT.md\`
|
|
649
679
|
|
|
650
680
|
### Critical Rules
|
|
651
681
|
- **AUTO-ANONYMOUS:** Replace ALL real names with physical descriptions
|
|
652
682
|
- **Name Policy:** Dialogue naming follows \`${config.outputDir}/shot-plan.json\` policy
|
|
653
683
|
- **AUTO-SAFETY:** Proactively reframe sensitive content
|
|
654
684
|
- **Frame Chaining:** Last frame of SHOT[N] = First frame of SHOT[N+1]
|
|
685
|
+
- **Chain Hardening:** chained ILK/İLK FRAME code block contains only \`Use SHOT[prev]_END as exact first frame\`
|
|
655
686
|
- **Coverage:** 2-3 sub-shots per main shot (in same file, min 60 words each)
|
|
656
687
|
- **Spatial Realism:** eyeline targets, shared light, depth scale, and anti-cutout staging must agree when subjects share frame
|
|
688
|
+
- **Semantic Consistency:** \`visual_world\` controls perspective/geometry, shadow vector, scale map, reflections, physics, anatomy risk, background coherence, and contextual contradictions
|
|
657
689
|
- **Avoid Line:** MANDATORY on every prompt
|
|
658
690
|
- **Music:** NONE by default
|
|
659
691
|
- **Voice Design:** keep \`voiceCast\` in \`${config.outputDir}/shot-plan.json\` and \`Audio Plan\` in speaking VIDEO sections
|
|
@@ -686,9 +718,9 @@ When request is /generate, follow the Film-Kit Hollywood production system:
|
|
|
686
718
|
3. Load required skills from \`.agent/skills/\`
|
|
687
719
|
4. Transform scenario into production shot package at \`${config.outputDir}\`
|
|
688
720
|
5. Generate: project-info.md, shot-plan.json, _index.md, shots/SHOT01.md..SHOTNN.md
|
|
689
|
-
6. Keep top-level \`voiceCast\` in shot-plan.json
|
|
721
|
+
6. Keep top-level \`voiceCast\` and \`visual_world\` in shot-plan.json
|
|
690
722
|
7. Each SHOTNN.md: İLK FRAME + SON FRAME + AUDIO PLAN + VİDEO + 2-3 Coverage (ALL IN ONE FILE)
|
|
691
|
-
8. Enforce: auto-anonymous, dialogue name policy, auto-safety, frame chaining, avoid lines
|
|
723
|
+
8. Enforce: auto-anonymous, dialogue name policy, auto-safety, frame chaining, semantic consistency, avoid lines
|
|
692
724
|
9. Write reports to \`${config.outputDir}/reports/\` before /finish
|
|
693
725
|
`;
|
|
694
726
|
}
|
|
@@ -713,10 +745,11 @@ Before generating ANY prompts, read these skills:
|
|
|
713
745
|
2. \`.agent/skills/reference-locking/SKILL.md\` — When refs provided
|
|
714
746
|
3. \`.agent/skills/frame-chaining/SKILL.md\` — ALWAYS
|
|
715
747
|
4. \`.agent/skills/spatial-blocking/SKILL.md\` — When gaze / depth / scale realism is critical
|
|
716
|
-
5. \`.agent/skills/
|
|
717
|
-
6. \`.agent/skills/
|
|
718
|
-
7. \`.agent/skills/
|
|
719
|
-
8. \`.agent/skills/
|
|
748
|
+
5. \`.agent/skills/semantic-consistency/SKILL.md\` — ALWAYS (visual_world + semantic QA)
|
|
749
|
+
6. \`.agent/skills/coverage-system/SKILL.md\` — ALWAYS (mandatory)
|
|
750
|
+
7. \`.agent/skills/visual-modes/SKILL.md\` — ALWAYS
|
|
751
|
+
8. \`.agent/skills/audio-design/SKILL.md\` — When dialogue/SFX
|
|
752
|
+
9. \`.agent/skills/prompt-structure/SKILL.md\` — ALWAYS
|
|
720
753
|
|
|
721
754
|
## Workflows
|
|
722
755
|
| Command | Workflow |
|
|
@@ -739,7 +772,8 @@ Before generating ANY prompts, read these skills:
|
|
|
739
772
|
- Project info: \`${config.outputDir}/project-info.md\`
|
|
740
773
|
- Plan: \`${config.outputDir}/shot-plan.json\`
|
|
741
774
|
- Voice contract: top-level \`voiceCast\` in \`${config.outputDir}/shot-plan.json\`
|
|
742
|
-
-
|
|
775
|
+
- Semantic contract: top-level \`visual_world\` in \`${config.outputDir}/shot-plan.json\`
|
|
776
|
+
- Reports: \`${config.outputDir}/reports/SAFETY-REPORT.md\`, \`${config.outputDir}/reports/SEMANTIC-REPORT.md\`, \`${config.outputDir}/reports/DELIVERY-REPORT.md\`
|
|
743
777
|
|
|
744
778
|
## Critical Rules
|
|
745
779
|
1. **AUTO-ANONYMOUS:** Replace ALL real person names with physical descriptions
|
|
@@ -753,8 +787,10 @@ Before generating ANY prompts, read these skills:
|
|
|
753
787
|
9. **Ultra Realism** default visual mode
|
|
754
788
|
10. **8s duration** default, slow burn pacing
|
|
755
789
|
11. **ILK/İLK FRAME:** always keep fenced code block
|
|
756
|
-
12. **
|
|
757
|
-
13. **
|
|
790
|
+
12. **Chained ILK/İLK FRAME:** code block contains only \`Use SHOT[prev]_END as exact first frame\`; any new visual prompt is CHAIN BREAK
|
|
791
|
+
13. **ONE FILE PER SHOT:** SHOTNN.md contains main shot + all coverage
|
|
792
|
+
14. **Relational Realism:** preserve eyeline targets, shared light, depth scale, and anti-cutout staging when multiple subjects share frame
|
|
793
|
+
15. **Semantic Consistency:** preserve \`visual_world\` perspective, shadow vector, scale map, reflections, gravity/contact physics, anatomy risk, foreground/background coherence, and contextual logic
|
|
758
794
|
|
|
759
795
|
## Quality Floor (Hard Gate)
|
|
760
796
|
Reject and regenerate any shot that fails:
|
|
@@ -767,6 +803,7 @@ Reject and regenerate any shot that fails:
|
|
|
767
803
|
- missing explicit foreground/midground/background action details
|
|
768
804
|
- missing explicit eyeline target or \`not camera\` instruction when gaze matters
|
|
769
805
|
- missing explicit shared light source / depth / contact cues in multi-subject shots
|
|
806
|
+
- missing semantic consistency anchors: perspective/geometry, shadow vector, scale map, reflection handling, gravity/contact physics, anatomy risk, foreground/background coherence, contextual contradiction check
|
|
770
807
|
|
|
771
808
|
## Reject Weak Prompt Style
|
|
772
809
|
Do not accept generic filler language:
|
|
@@ -869,7 +906,7 @@ The more aligned these are, the cleaner the transition:
|
|
|
869
906
|
- Hand pose and finger count should be similar in both frames
|
|
870
907
|
- Avoid end frames with extreme mouth positions if speech is not intended
|
|
871
908
|
|
|
872
|
-
**Loop shortcut:** Set Start = End (same image). Prompt: "seamless loop" + simple camera movement (e.g., roll 360,
|
|
909
|
+
**Loop shortcut:** Set Start = End (same image). Prompt: "seamless loop" + simple camera movement (e.g., roll 360, Dolly In).
|
|
873
910
|
|
|
874
911
|
### Transformation Budget
|
|
875
912
|
|
|
@@ -924,10 +961,11 @@ These prevent the model from taking shortcuts.
|
|
|
924
961
|
More complex camera = more warp risk.
|
|
925
962
|
|
|
926
963
|
**Safest commands (highest success rate):**
|
|
927
|
-
-
|
|
928
|
-
-
|
|
929
|
-
-
|
|
930
|
-
-
|
|
964
|
+
- Dolly In / Dolly Out
|
|
965
|
+
- Pan Left / Pan Right
|
|
966
|
+
- Tilt Up / Tilt Down
|
|
967
|
+
- Tracking Shot or Steadicam Movement for smooth follow
|
|
968
|
+
- Handheld Movement with gentle micro-sway
|
|
931
969
|
- roll 360 (especially for loops)
|
|
932
970
|
|
|
933
971
|
**Stabilization trick:** Writing "tripod-locked" reduces background jitter.
|
|
@@ -1065,17 +1103,23 @@ Element Binding is Kling 3.0's built-in technology for maintaining character and
|
|
|
1065
1103
|
- For **multi-shot** sequences: Prefer Element Binding when available
|
|
1066
1104
|
- **Fallback:** If Element Binding is not available in your interface, manually repeat: character age, distinctive features, costume, and key proportions at each shot's start
|
|
1067
1105
|
|
|
1068
|
-
### Advanced Camera Vocabulary (
|
|
1106
|
+
### Advanced Camera Vocabulary (24-Move Cinematic Lexicon)
|
|
1107
|
+
|
|
1108
|
+
These professional terms activate Kling's "Visual Chain-of-Thought" (vCoT) for more precise results. Use one named movement per shot unless a motivated compound move is required:
|
|
1069
1109
|
|
|
1070
|
-
|
|
1110
|
+
| Group | Movements |
|
|
1111
|
+
|-------|-----------|
|
|
1112
|
+
| **Physical push/pull** | Dolly In, Dolly Out |
|
|
1113
|
+
| **Locked head rotation** | Pan Left, Pan Right, Tilt Up, Tilt Down |
|
|
1114
|
+
| **Physical lateral/vertical travel** | Truck Left, Truck Right, Pedestal Up, Pedestal Down |
|
|
1115
|
+
| **Arc/parallax** | Arc Left, Arc Right, Tracking Shot, Leading Shot, Following Shot |
|
|
1116
|
+
| **Dynamic stabilization** | Whip Pan, Handheld Movement, Steadicam Movement |
|
|
1117
|
+
| **Angle/subjective** | Canted Angle (Dutch Angle), Point of View (POV) |
|
|
1118
|
+
| **Optical/composite** | Zoom In, Zoom Out, Dolly Zoom (Vertigo Effect), Crane/Jib Shot |
|
|
1071
1119
|
|
|
1072
|
-
|
|
1073
|
-
|----------|-------|
|
|
1074
|
-
| **Angles** | Low-angle hero shot, Dutch angle (tilted horizon), POV (subjective), Bird's-eye view (top-down) |
|
|
1075
|
-
| **Movements** | Dolly push-in, Orbit (360° rotation), Lateral pan, Tracking, Spiral up |
|
|
1076
|
-
| **Hybrid** | Dolly Zoom (Vertigo effect — zoom in while pulling back), Move Left and Zoom In (simultaneous) |
|
|
1120
|
+
Aliases: Dolly push-in = Dolly In, Dolly pull-out = Dolly Out, Orbit = Arc Left/Right, Lateral slide = Truck Left/Right, Crane rise/descend = Crane/Jib Shot. Rack focus is a focus move, not camera travel.
|
|
1077
1121
|
|
|
1078
|
-
> **Tip:** Hybrid movements like Dolly Zoom trigger stronger vCoT processing and produce more cinematic results, but increase warp risk. Use with wider CFG (0.50-0.60).
|
|
1122
|
+
> **Tip:** Hybrid movements like Dolly Zoom, Truck plus Pan, or Crane/Jib plus Arc trigger stronger vCoT processing and produce more cinematic results, but increase warp risk. Use with wider CFG (0.50-0.60).
|
|
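The lexicon table and alias line above map cleanly onto a lookup structure. A minimal sketch, with group and movement names taken from the table (parenthetical alternates dropped) and a hypothetical `resolveMovement` helper that normalizes a term to its canonical name or names:

```javascript
// Movement groups as listed in the 24-move lexicon table.
const CAMERA_LEXICON = {
  "Physical push/pull": ["Dolly In", "Dolly Out"],
  "Locked head rotation": ["Pan Left", "Pan Right", "Tilt Up", "Tilt Down"],
  "Physical lateral/vertical travel": ["Truck Left", "Truck Right", "Pedestal Up", "Pedestal Down"],
  "Arc/parallax": ["Arc Left", "Arc Right", "Tracking Shot", "Leading Shot", "Following Shot"],
  "Dynamic stabilization": ["Whip Pan", "Handheld Movement", "Steadicam Movement"],
  "Angle/subjective": ["Canted Angle", "Point of View"],
  "Optical/composite": ["Zoom In", "Zoom Out", "Dolly Zoom", "Crane/Jib Shot"],
};

const ALL_MOVES = Object.values(CAMERA_LEXICON).flat();

// Aliases from the line below the table; some map to a left/right pair.
const ALIASES = new Map([
  ["dolly push-in", ["Dolly In"]],
  ["dolly pull-out", ["Dolly Out"]],
  ["orbit", ["Arc Left", "Arc Right"]],
  ["lateral slide", ["Truck Left", "Truck Right"]],
  ["crane rise", ["Crane/Jib Shot"]],
  ["crane descend", ["Crane/Jib Shot"]],
]);

// Hypothetical helper: return canonical movement names for a term,
// or an empty array when the term is not a camera movement.
function resolveMovement(term) {
  const key = term.trim().toLowerCase();
  const canonical = ALL_MOVES.find((move) => move.toLowerCase() === key);
  if (canonical) return [canonical];
  return ALIASES.get(key) ?? [];
}
```

Terms such as "rack focus" resolve to nothing here, in line with the note that rack focus is a focus move and not camera travel.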
1079
1123
|
|
|
1080
1124
|
### Native Audio & Dialogue (Kling-Specific)
|
|
1081
1125
|
|
package/content/ARCHITECTURE.md
CHANGED
|
@@ -9,7 +9,7 @@
|
|
|
9
9
|
Modular system consisting of:
|
|
10
10
|
|
|
11
11
|
- **1 Specialist Agent** - Technical Prompt Engineer
|
|
12
|
-
- **
|
|
12
|
+
- **9 Skills** - Domain-specific knowledge modules
|
|
13
13
|
- **4 Workflows** - Slash command procedures
|
|
14
14
|
|
|
15
15
|
---
|
|
@@ -30,6 +30,7 @@ Modular system consisting of:
|
|
|
30
30
|
│ ├── frame-chaining/ # Shot continuity protocol
|
|
31
31
|
│ ├── coverage-system/ # Mandatory coverage shots (NEW)
|
|
32
32
|
│ ├── spatial-blocking/ # Eyeline, depth, scale, compositing realism
|
|
33
|
+
│ ├── semantic-consistency/ # Perspective, shadows, scale, physics, render QA
|
|
33
34
|
│ ├── visual-modes/ # Ultra Realism & style modes
|
|
34
35
|
│ ├── audio-design/ # Sound design rules
|
|
35
36
|
│ └── prompt-structure/ # Prompt engineering patterns
|
|
@@ -47,11 +48,11 @@ Modular system consisting of:
|
|
|
47
48
|
|
|
48
49
|
| Agent | Focus | Skills Used |
|
|
49
50
|
|-------|-------|-------------|
|
|
50
|
-
| `prompt-engineer` | Cinematic prompt generation for Veo 3.1 / Kling 3.0 | All
|
|
51
|
+
| `prompt-engineer` | Cinematic prompt generation for Veo 3.1 / Kling 3.0 | All 9 skills |
|
|
51
52
|
|
|
52
53
|
---
|
|
53
54
|
|
|
54
|
-
## 🧩 Skills (
|
|
55
|
+
## 🧩 Skills (9)
|
|
55
56
|
|
|
56
57
|
| Skill | Description |
|
|
57
58
|
|-------|-------------|
|
|
@@ -60,6 +61,7 @@ Modular system consisting of:
|
|
|
60
61
|
| `frame-chaining` | **Shot continuity**, last frame → first frame chaining, scene transition protocol (fade, dissolve, match cut) |
|
|
61
62
|
| `coverage-system` | **Mandatory coverage shots** (Reaction, OTS, Insert, Cutaway, ECU, Wide) + L-cut/J-cut + 30° rule + **180° rule** + eyeline match + matching action + multi-character blocking |
|
|
62
63
|
| `spatial-blocking` | **Relational realism**: eyeline targeting, plane mapping, body orientation, shared lighting, depth/scale integration, anti-cutout / anti-miniature cues |
|
|
64
|
+
| `semantic-consistency` | **Scene-level realism gate**: canonical `visual_world`, perspective/geometry, shadow vectors, scale map, reflections, gravity/contact physics, contextual contradiction checks, render QA |
|
|
63
65
|
| `visual-modes` | **Ultra Realism** default, stylization triggers, anti-AI artifact rules + **color continuity** + magic hour + flashback/dream visual distinction |
|
|
64
66
|
| `audio-design` | **Sound design** rules, voice realism, project-level `voiceCast`, shot-level `audioPlan`, audio direction block + diegetic/non-diegetic sound distinction |
|
|
65
67
|
| `prompt-structure` | Image/video prompt templates, camera vocabulary, seed parameter, prompt rewriter, **re-take strategy**, coverage prompt writing standards (≥60 words) |
|
|
@@ -91,6 +93,7 @@ User Scenario → Agent Activated → Read model-profile → Load Required Skill
|
|
|
91
93
|
reference-locking (if refs provided)
|
|
92
94
|
frame-chaining (ALWAYS)
|
|
93
95
|
spatial-blocking (when gaze/depth/scale realism matters)
|
|
96
|
+
semantic-consistency (ALWAYS - visual_world + physics gate)
|
|
94
97
|
coverage-system (ALWAYS - mandatory)
|
|
95
98
|
visual-modes (check for style triggers)
|
|
96
99
|
audio-design (if dialogue/SFX needed)
|
package/content/MASTER.md
CHANGED
|
@@ -26,6 +26,7 @@ Scenario Received → Check for elements:
|
|
|
26
26
|
├── Reference images provided? → READ reference-locking/SKILL.md
|
|
27
27
|
├── Multiple shots? → READ frame-chaining/SKILL.md (ALWAYS)
|
|
28
28
|
├── Multi-subject / gaze / depth realism? → READ spatial-blocking/SKILL.md
|
|
29
|
+
├── ALWAYS READ → semantic-consistency/SKILL.md (visual_world + semantic QA)
|
|
29
30
|
├── Style keywords (anime, noir, etc.)? → READ visual-modes/SKILL.md
|
|
30
31
|
├── Dialogue/SFX needed? → READ audio-design/SKILL.md
|
|
31
32
|
├── ALWAYS READ → coverage-system/SKILL.md (MANDATORY)
|
|
@@ -531,8 +532,9 @@ For full coverage protocols → READ `skills/coverage-system/SKILL.md`
|
|
|
531
532
|
> 🇹🇷 [Short Turkish summary: what happens in this shot, 1 sentence]
|
|
532
533
|
|
|
533
534
|
**İLK FRAME (SHOTNN_START):**
|
|
534
|
-
[If CHAINED: "→ Use SHOT[prev]_END as first frame"]
|
|
535
|
-
[If
|
|
535
|
+
[If CHAINED: "→ Use SHOT[prev]_END as first frame"]
|
|
536
|
+
[If CHAINED: the fenced code block must contain only "Use SHOT[prev]_END as exact first frame"; any new visual prompt requires CHAIN BREAK]
|
|
537
|
+
[If FIRST/BREAK: Generate code block]
|
|
536
538
|
|
|
537
539
|
```
|
|
538
540
|
[Complete image prompt — MIN 60 words, MAX 100 words]
|
|
@@ -675,8 +677,9 @@ Before outputting, validate EVERY shot. **Bu kontrol otomatiktir, kullanıcı ha
|
|
|
675
677
|
|-------|--------------|
|
|
676
678
|
| [safety-compliance](skills/safety-compliance/SKILL.md) | ALWAYS before generating |
|
|
677
679
|
| [reference-locking](skills/reference-locking/SKILL.md) | When reference images provided |
|
|
678
|
-
| [frame-chaining](skills/frame-chaining/SKILL.md) | ALWAYS for multi-shot |
|
|
679
|
-
| [
|
|
680
|
+
| [frame-chaining](skills/frame-chaining/SKILL.md) | ALWAYS for multi-shot |
|
|
681
|
+
| [semantic-consistency](skills/semantic-consistency/SKILL.md) | ALWAYS for `visual_world`, perspective, shadow, scale, physics, and render QA gates |
|
|
682
|
+
| [coverage-system](skills/coverage-system/SKILL.md) | ALWAYS (mandatory for every shot) |
|
|
680
683
|
| [visual-modes](skills/visual-modes/SKILL.md) | Check for style triggers |
|
|
681
684
|
| [audio-design](skills/audio-design/SKILL.md) | When dialogue/SFX needed |
|
|
682
685
|
| [prompt-structure](skills/prompt-structure/SKILL.md) | ALWAYS |
|
|
@@ -710,18 +713,18 @@ Before outputting, validate EVERY shot. **Bu kontrol otomatiktir, kullanıcı ha
|
|
|
710
713
|
|--------|-------------------|
|
|
711
714
|
| **Composition** | Rule of thirds, golden ratio, leading lines, depth layers |
|
|
712
715
|
| **Lighting** | Three-point setups, motivated sources, proper color temperature |
|
|
713
|
-
| **Camera** |
|
|
716
|
+
| **Camera** | 24-move cinematic lexicon from `prompt-structure`, intentional lens choices |
|
|
714
717
|
| **Color** | Graded for mood, consistent palette throughout |
|
|
715
718
|
| **Sound** | Layered design: dialogue, SFX, ambience, Foley |
|
|
716
719
|
| **Continuity** | 180° rule, eyeline match, seamless cuts |
|
|
717
720
|
|
|
718
721
|
### Professional Cinematography Terms
|
|
719
722
|
|
|
720
|
-
- **Camera:**
|
|
723
|
+
- **Camera:** Dolly In/Out, Pan Left/Right, Tilt Up/Down, Truck Left/Right, Pedestal Up/Down, Arc Left/Right, Whip Pan, Tracking Shot, Leading Shot, Following Shot, Canted Angle/Dutch Angle, Handheld Movement, Steadicam Movement, Zoom In/Out, Dolly Zoom, Crane/Jib Shot, Point of View (POV); rack focus and deep focus are focus tools, not travel moves
|
|
721
724
|
- **Lighting:** key light, fill, rim/hair, practical, motivated, diffused, bounce
|
|
722
725
|
- **Composition:** negative space, leading lines, frame within frame, Dutch angle
|
|
723
726
|
- **Color:** LUT, color grade, desaturated, warm/cool palette, high/low key
|
|
724
|
-
- **Movement:** push-in, pull-out, orbit
|
|
727
|
+
- **Movement:** use canonical names from `prompt-structure` instead of vague aliases; push-in = Dolly In, pull-out = Dolly Out, orbit = Arc Left/Right, crash zoom = fast Zoom In/Out
|
|
725
728
|
|
|
726
729
|
---
|
|
727
730
|
|
package/content/RULES.md
CHANGED
|
@@ -23,8 +23,9 @@ Film/Video request detected → Activate prompt-engineer agent
|
|
|
23
23
|
├── model-profile (ALWAYS FIRST)
|
|
24
24
|
├── safety-compliance (ALWAYS)
|
|
25
25
|
├── reference-locking (if refs provided)
|
|
26
|
-
├── frame-chaining (ALWAYS for multi-shot)
|
|
27
|
-
├──
|
|
26
|
+
├── frame-chaining (ALWAYS for multi-shot)
|
|
27
|
+
├── semantic-consistency (ALWAYS for visual_world + physics)
|
|
28
|
+
├── coverage-system (ALWAYS - mandatory)
|
|
28
29
|
├── visual-modes (check for style triggers)
|
|
29
30
|
├── audio-design (if dialogue/SFX)
|
|
30
31
|
└── prompt-structure (ALWAYS)
|
|
@@ -36,8 +37,9 @@ Film/Video request detected → Activate prompt-engineer agent
|
|
|
36
37
|
|-------|------|--------------|
|
|
37
38
|
| Safety & Celebrity Ban | `.agent/skills/safety-compliance/SKILL.md` | ALWAYS |
|
|
38
39
|
| Reference Locking | `.agent/skills/reference-locking/SKILL.md` | When refs provided |
|
|
39
|
-
| Frame Chaining | `.agent/skills/frame-chaining/SKILL.md` | Multi-shot projects |
|
|
40
|
-
|
|
|
40
|
+
| Frame Chaining | `.agent/skills/frame-chaining/SKILL.md` | Multi-shot projects |
|
|
41
|
+
| Semantic Consistency | `.agent/skills/semantic-consistency/SKILL.md` | ALWAYS |
|
|
42
|
+
| Coverage System | `.agent/skills/coverage-system/SKILL.md` | ALWAYS (mandatory) |
|
|
41
43
|
| Visual Modes | `.agent/skills/visual-modes/SKILL.md` | All visual work |
|
|
42
44
|
| Audio Design | `.agent/skills/audio-design/SKILL.md` | Dialogue/SFX needed |
|
|
43
45
|
| Prompt Structure | `.agent/skills/prompt-structure/SKILL.md` | ALWAYS |
|
|
@@ -181,7 +183,7 @@ Shot'ları zenginleştiren 12 teknik: mini hedef, fiziksel aksiyon, katmanlı ı
|
|
|
181
183
|
|--------|-------------------|
|
|
182
184
|
| **Composition** | Rule of thirds, golden ratio, leading lines, depth layers |
|
|
183
185
|
| **Lighting** | Three-point, motivated sources, contrast ratios, color temperature |
|
|
184
|
-
| **Camera** |
|
|
186
|
+
| **Camera** | 24-move cinematic lexicon from `prompt-structure`, lens selection, aperture control |
|
|
185
187
|
| **Color** | Graded for mood, consistent palette, period-appropriate |
|
|
186
188
|
| **Sound** | Layered design, spatial audio, natural dynamics |
|
|
187
189
|
| **Editing** | Motivated cuts, rhythm, pacing, continuity |
|
|
@@ -189,7 +191,7 @@ Shot'ları zenginleştiren 12 teknik: mini hedef, fiziksel aksiyon, katmanlı ı
|
|
|
189
191
|
### Professional Terms to Use
|
|
190
192
|
|
|
191
193
|
```
|
|
192
|
-
Cinematography:
|
|
194
|
+
Cinematography: Dolly In/Out, Pan Left/Right, Tilt Up/Down, Truck Left/Right, Pedestal Up/Down, Arc Left/Right, Whip Pan, Tracking Shot, Leading Shot, Following Shot, Canted Angle/Dutch Angle, Handheld Movement, Steadicam Movement, Zoom In/Out, Dolly Zoom, Crane/Jib Shot, Point of View (POV); rack focus/deep focus are focus tools
|
|
193
195
|
Lighting: key light, fill, rim, practical, motivated, diffused
|
|
194
196
|
Composition: negative space, leading lines, frame within frame
|
|
195
197
|
Color: LUT, grade, desaturated, warm/cool, high/low key
|
package/content/agents/prompt-engineer.md
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: prompt-engineer
|
|
3
3
|
description: Senior Technical Prompt Engineer for model-aware runtime profiles (Veo 3.1 / Kling 3.0). Converts shot lists and scenarios into production-grade cinematic prompts optimized for continuity and platform compliance.
|
|
4
|
-
skills: safety-compliance, reference-locking, frame-chaining, coverage-system, spatial-blocking, visual-modes, audio-design, prompt-structure
|
|
4
|
+
skills: safety-compliance, reference-locking, frame-chaining, coverage-system, spatial-blocking, semantic-consistency, visual-modes, audio-design, prompt-structure
|
|
5
5
|
---
|
|
6
6
|
|
|
7
7
|
# Prompt Engineer - Hollywood Standard Cinematic Video Generation
|
|
@@ -44,6 +44,7 @@ You are a senior Technical Prompt Engineer specialized in model-aware cinematic
|
|
|
44
44
|
15. **Hard quality floor:** ILK >=80, SON >=80, VIDEO >=120, coverage >=70 words
|
|
45
45
|
16. **Hard specificity floor:** Every prompt must include lens/framing, lighting, and FG/MG/BG action details
|
|
46
46
|
17. **Spatial realism floor:** eyeline target, plane map, shared light source, contact/weight cues, and full-scale depth logic are mandatory when needed
|
|
47
|
+
18. **Semantic consistency floor:** `shot-plan.json.visual_world` must be canonical; perspective/geometry, shadow vector, scale map, reflection handling, gravity/contact physics, anatomy risk, foreground/background coherence, and contextual contradiction checks must pass in every shot
|
|
47
48
|
|
|
48
49
|
---
|
|
49
50
|
|
|
@@ -217,12 +218,13 @@ Before outputting ANY shot:
|
|
|
217
218
|
- [ ] Quality floor passes? (ILK>=80, SON>=80, VIDEO>=120, coverage>=70)
|
|
218
219
|
- [ ] Specificity floor passes? (lens + lighting + FG/MG/BG action)
|
|
219
220
|
- [ ] Spatial realism passes? (eyeline target + plane map + shared light + contact/depth cues)
|
|
221
|
+
- [ ] Semantic consistency passes? (`visual_world` fields + perspective/geometry + shadow vector + scale map + reflection/physics/anatomy/context checks)
|
|
220
222
|
- [ ] Model Control block exists? (`Model`, `Preset`, `CFG`, `Transition Mode`)
|
|
221
223
|
|
|
222
224
|
### 3. Kling-Specific Gates (when model is kling-3.0)
|
|
223
225
|
- [ ] Motion timeline uses `first → then → finally` structure?
|
|
224
226
|
- [ ] "What stays the same" explicitly stated? (identity, background, costume)
|
|
225
|
-
- [ ] Camera movement is simple/safe
|
|
227
|
+
- [ ] Camera movement is named from the 24-move cinematic lexicon and simple/safe unless an advanced hybrid is motivated?
|
|
226
228
|
- [ ] Negative prompt includes Kling cleanup set? (warping, rubbery, melted)
|
|
227
229
|
- [ ] Duration matches transformation budget? (5s=1 change, 10s=2-3, 15s=complex)
|
|
228
230
|
- [ ] Start/end frames are in same visual universe? (angle, scale, light, lens)
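The quality floor in the checklist above (ILK>=80, SON>=80, VIDEO>=120, coverage>=70) could be checked mechanically. The following is a minimal sketch, not part of the package: `WORD_FLOORS` and `meets_quality_floor` are hypothetical names, and counting words by whitespace splitting is an assumption, since the source does not say how words are counted.

```python
# Word-count floors copied from the checklist above.
WORD_FLOORS = {"ILK": 80, "SON": 80, "VIDEO": 120, "COVERAGE": 70}


def meets_quality_floor(kind: str, prompt: str) -> bool:
    """Hypothetical helper: True when the prompt reaches the floor for its kind."""
    # Assumption: a "word" is any whitespace-separated token.
    return len(prompt.split()) >= WORD_FLOORS[kind]
```

A caller would run this once per prompt block in a `SHOTNN.md` file and send any failing shot back for repair.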
|
|
@@ -254,11 +256,11 @@ For EACH shot, output exactly one file (`SHOTNN.md`) containing Main Shot + Cove
|
|
|
254
256
|
|
|
255
257
|
## Main Shot
|
|
256
258
|
|
|
257
|
-
### İLK FRAME (SHOTNN_START)
|
|
258
|
-
[Always provide a fenced code block]
|
|
259
|
-
[If CHAINED:
|
|
260
|
-
[If FIRST/BREAK: image prompt in same code block]
|
|
261
|
-
|
|
259
|
+
### İLK FRAME (SHOTNN_START)
|
|
260
|
+
[Always provide a fenced code block]
|
|
261
|
+
[If CHAINED: code block must contain only "Use SHOT[prev]_END as exact first frame"; any new visual prompt requires CHAIN BREAK]
|
|
262
|
+
[If FIRST/BREAK: image prompt in same code block]
|
|
263
|
+
|
|
262
264
|
### SON FRAME (SHOTNN_END)
|
|
263
265
|
```
|
|
264
266
|
[Image Prompt (Flow Order + Avoid Line)]
|
package/content/skills/coverage-system/SKILL.md
CHANGED
|
@@ -109,7 +109,7 @@ Each coverage shot receives:
|
|
|
109
109
|
Camera: Close-up, 85mm lens, f/2.0, shallow DOF
|
|
110
110
|
Subject: Character's face, eyes, micro-expressions
|
|
111
111
|
Lighting: Soft key, subtle fill, rim for separation
|
|
112
|
-
Movement: Static or very slow
|
|
112
|
+
Movement: Static or very slow Dolly In
|
|
113
113
|
Duration: 4-6 seconds
|
|
114
114
|
Audio: Natural ambience only, no dialogue (listening)
|
|
115
115
|
```
|
package/content/skills/frame-chaining/SKILL.md
CHANGED
|
@@ -386,5 +386,5 @@ When designing END frames, consider the transformation budget:
|
|
|
386
386
|
For seamless loops with Kling:
|
|
387
387
|
1. **Start = End** (use identical image for both)
|
|
388
388
|
2. Prompt: `"seamless loop"` + simple camera movement
|
|
389
|
-
3. Best movements for loops: roll 360°,
|
|
389
|
+
3. Best movements for loops: roll 360°, Dolly In/Dolly Out
|
|
390
390
|
4. Avoid subject transformations — loops work best with camera-only motion
|
package/content/skills/prompt-structure/SKILL.md
CHANGED
|
@@ -23,7 +23,8 @@ description: Image and video prompt templates, camera movement vocabulary, stabi
|
|
|
23
23
|
| **Reference Commands First** | Always start with reference image instructions |
|
|
24
24
|
| **Safety First** | Always consider filter implications |
|
|
25
25
|
| **Short Sentence Rule** | Split long sentences across shots |
|
|
26
|
-
| **Model Profile First** | Read `.agent/model-profile.md` before generating any prompts |
|
|
26
|
+
| **Model Profile First** | Read `.agent/model-profile.md` before generating any prompts |
|
|
27
|
+
| **Semantic World First** | Read `.agent/skills/semantic-consistency/SKILL.md` and align prompts to `shot-plan.json.visual_world` |
|
|
27
28
|
|
|
28
29
|
---
|
|
29
30
|
|
|
@@ -45,9 +46,9 @@ description: Image and video prompt templates, camera movement vocabulary, stabi
|
|
|
45
46
|
|
|
46
47
|
> **The first sentence must be a one-sentence answer to the question "What is this shot?"**
|
|
47
48
|
|
|
48
|
-
### Standard Template
|
|
49
|
-
|
|
50
|
-
```
|
|
49
|
+
### Standard Template
|
|
50
|
+
|
|
51
|
+
```
|
|
51
52
|
[REFERENCE LOCK section if applicable]
|
|
52
53
|
|
|
53
54
|
Cinematic still frame of [subject with reference adherence] in [frozen pose].
|
|
@@ -56,8 +57,27 @@ Lighting: [specific setup].
|
|
|
56
57
|
Camera: [framing], [lens mm], [aperture], photorealistic, crisp focus, no motion blur, no text.
|
|
57
58
|
[Safety injection if needed]
|
|
58
59
|
|
|
59
|
-
[Full avoid line]
|
|
60
|
-
```
|
|
60
|
+
[Full avoid line]
|
|
61
|
+
```
|
|
62
|
+
|
|
63
|
+
### Semantic Consistency Layer
|
|
64
|
+
|
|
65
|
+
Before writing any image prompt, lock the scene's `visual_world` values:
|
|
66
|
+
|
|
67
|
+
- camera height, lens family, horizon line, and vanishing-point strategy
|
|
68
|
+
- single motivated light source, color temperature, and shadow direction
|
|
69
|
+
- foreground/midground/background scale map
|
|
70
|
+
- reflection risk (`none`, `matte/non-reflective`, or accurate mirror/water/glass behavior)
|
|
71
|
+
- gravity/contact physics and anatomy risk
|
|
72
|
+
- contextual contradiction check
|
|
73
|
+
|
|
74
|
+
For chained shots, the `ILK/İLK FRAME` fenced code block must contain only:
|
|
75
|
+
|
|
76
|
+
```text
|
|
77
|
+
Use SHOT[prev]_END as exact first frame.
|
|
78
|
+
```
|
|
79
|
+
|
|
80
|
+
Any new start-frame visual prompt must be declared as `CHAIN BREAK - [reason]`.
|
|
61
81
|
|
|
62
82
|
### Example: Character in Environment
|
|
63
83
|
|
|
@@ -146,43 +166,70 @@ Avoid: distorted faces, morphing, bad anatomy, extra limbs/fingers, blurry, flic
|
|
|
146
166
|
|
|
147
167
|
---
|
|
148
168
|
|
|
149
|
-
## Camera Movement Vocabulary
|
|
150
|
-
|
|
151
|
-
|
|
152
|
-
|
|
153
|
-
|
|
154
|
-
|
|
155
|
-
|
|
|
156
|
-
|
|
157
|
-
| **
|
|
158
|
-
| **
|
|
159
|
-
| **
|
|
160
|
-
| **
|
|
161
|
-
| **
|
|
162
|
-
| **
|
|
163
|
-
| **
|
|
164
|
-
| **
|
|
165
|
-
| **
|
|
166
|
-
| **
|
|
167
|
-
|
|
168
|
-
|
|
169
|
-
|
|
170
|
-
|
|
|
171
|
-
|
|
172
|
-
|
|
|
173
|
-
| **Dutch
|
|
174
|
-
| **
|
|
175
|
-
| **
|
|
176
|
-
|
|
177
|
-
|
|
178
|
-
|
|
179
|
-
|
|
|
180
|
-
|
|
181
|
-
|
|
182
|
-
|
|
183
|
-
|
|
184
|
-
|
|
185
|
-
|
|
169
|
+
## Camera Movement Vocabulary
|
|
170
|
+
|
|
171
|
+
Use one precise movement name per shot unless the beat explicitly needs a motivated compound move. Compound moves must preserve the same `visual_world` geometry, shadow vector, screen direction, and subject scale.
|
|
172
|
+
|
|
173
|
+
### Canonical 24 Cinematic Camera Movements
|
|
174
|
+
|
|
175
|
+
| # | Movement | Description | Use When |
|
|
176
|
+
|---|----------|-------------|----------|
|
|
177
|
+
| 1 | **Dolly In** | Camera physically moves toward the subject | Build intensity, reveal emotion, enter a space |
|
|
178
|
+
| 2 | **Dolly Out** | Camera physically moves away from the subject | Reveal context, isolation, ending beat |
|
|
179
|
+
| 3 | **Pan Left** | Locked camera rotates left | Follow action, reveal off-screen space |
|
|
180
|
+
| 4 | **Pan Right** | Locked camera rotates right | Follow action, reveal off-screen space |
|
|
181
|
+
| 5 | **Tilt Up** | Locked camera rotates upward | Reveal height, power, scale |
|
|
182
|
+
| 6 | **Tilt Down** | Locked camera rotates downward | Reveal ground detail, vulnerability, aftermath |
|
|
183
|
+
| 7 | **Truck Left** | Whole camera slides left | Parallax, lateral reveal, follow blocking |
|
|
184
|
+
| 8 | **Truck Right** | Whole camera slides right | Parallax, lateral reveal, follow blocking |
|
|
185
|
+
| 9 | **Pedestal Up** | Camera physically rises vertically | Elevation reveal, character empowerment |
|
|
186
|
+
| 10 | **Pedestal Down** | Camera physically lowers vertically | Intimacy, compression, descent into detail |
|
|
187
|
+
| 11 | **Arc Left** | Camera moves around the subject toward screen-left | Relationship shift, dramatic parallax |
|
|
188
|
+
| 12 | **Arc Right** | Camera moves around the subject toward screen-right | Relationship shift, dramatic parallax |
|
|
189
|
+
| 13 | **Whip Pan** | Very fast pan with directional motion blur | Energetic transition, hidden cut, sudden discovery |
|
|
190
|
+
| 14 | **Tracking Shot** | Camera follows a moving subject at matched speed | Travel, pursuit, continuous blocking |
|
|
191
|
+
| 15 | **Leading Shot** | Camera moves ahead of a character, facing them | Emotional walk-and-talk, confrontation, dread |
|
|
192
|
+
| 16 | **Following Shot** | Camera follows behind a character | Discovery, pursuit, subjective tension |
|
|
193
|
+
| 17 | **Canted Angle (Dutch Angle)** | Camera rolls sideways, tilting the horizon | Unease, disorientation, psychological tension |
|
|
194
|
+
| 18 | **Handheld Movement** | Realistic handheld shake and micro-instability | Documentary realism, urgency, chaos |
|
|
195
|
+
| 19 | **Steadicam Movement** | Smooth stabilized walking movement | Long takes, premium follow shots, immersive movement |
|
|
196
|
+
| 20 | **Zoom In** | Lens optically tightens without camera travel | Attention shift, surveillance, emphasis |
|
|
197
|
+
| 21 | **Zoom Out** | Lens optically widens without camera travel | Reveal context, comedic or existential pullback |
|
|
198
|
+
| 22 | **Dolly Zoom (Vertigo Effect)** | Camera moves while zooming the opposite way | Spatial distortion, shock, realization |
|
|
199
|
+
| 23 | **Crane/Jib Shot** | Camera rises or drops on a crane/jib, often diagonally | Grand reveal, scale, transition from ground to overview |
|
|
200
|
+
| 24 | **Point of View (POV)** | Camera behaves as a character's eyes | Immersion, fear, discovery, subjective action |
|
|
201
|
+
|
|
202
|
+
### Legacy Aliases And Focus Moves
|
|
203
|
+
|
|
204
|
+
| Existing Term | Canonical Mapping |
|
|
205
|
+
|---------------|-------------------|
|
|
206
|
+
| **Dolly push-in** | Dolly In |
|
|
207
|
+
| **Dolly pull-out** | Dolly Out |
|
|
208
|
+
| **Orbit / 360 rotation** | Arc Left / Arc Right; specify direction and avoid full 360 unless required |
|
|
209
|
+
| **Lateral pan / slide** | Truck Left / Truck Right when the camera moves physically; Pan Left / Pan Right when it only rotates |
|
|
210
|
+
| **Crane rise / Crane descend** | Crane/Jib Shot; use Pedestal Up/Down for a purely vertical camera lift |
|
|
211
|
+
| **Spiral up** | Crane/Jib Shot + Arc Left/Right; higher warp risk |
|
|
212
|
+
| **Rack focus** | Focus shift, not a camera movement; use only when foreground/background attention must change |
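The canonical list and the alias table above lend themselves to a small lookup. The sketch below is illustrative only: `CANONICAL_MOVEMENTS`, `UNAMBIGUOUS_ALIASES`, and `normalize_movement` are hypothetical names. Aliases that need a direction or a judgment call (orbit, lateral slide, crane rise, spiral up) are deliberately left unmapped so a person picks the canonical term.

```python
# The 24 canonical movement names, in table order.
CANONICAL_MOVEMENTS = (
    "Dolly In", "Dolly Out", "Pan Left", "Pan Right", "Tilt Up", "Tilt Down",
    "Truck Left", "Truck Right", "Pedestal Up", "Pedestal Down",
    "Arc Left", "Arc Right", "Whip Pan", "Tracking Shot", "Leading Shot",
    "Following Shot", "Canted Angle (Dutch Angle)", "Handheld Movement",
    "Steadicam Movement", "Zoom In", "Zoom Out", "Dolly Zoom (Vertigo Effect)",
    "Crane/Jib Shot", "Point of View (POV)",
)

# Only aliases with a single, direction-free canonical target.
UNAMBIGUOUS_ALIASES = {
    "dolly push-in": "Dolly In",
    "push-in": "Dolly In",
    "dolly pull-out": "Dolly Out",
    "pull-out": "Dolly Out",
}


def normalize_movement(term: str):
    """Return the canonical name for a term, or None when it cannot be resolved."""
    key = term.strip().lower()
    for name in CANONICAL_MOVEMENTS:
        if name.lower() == key:
            return name
    # Ambiguous legacy terms such as "orbit" fall through to None on purpose.
    return UNAMBIGUOUS_ALIASES.get(key)
```

Returning `None` for ambiguous terms keeps the direction choice (Arc Left versus Arc Right, Truck versus Pan) explicit instead of guessing it.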
|
|
213
|
+
|
|
214
|
+
### Advanced Angles (Kling vCoT Triggers)
|
|
215
|
+
|
|
216
|
+
| Angle | Effect | Use When |
|
|
217
|
+
|-------|--------|----------|
|
|
218
|
+
| **Low-angle hero shot** | Subject towers over camera | Power, dominance, heroism |
|
|
219
|
+
| **Dutch angle / Canted Angle** | Tilted horizon | Unease, disorientation, tension |
|
|
220
|
+
| **POV / Point of View** | Camera = character's eyes | Immersion, fear, discovery |
|
|
221
|
+
| **Bird's-eye view** | Top-down perspective | Scale, geography, isolation |
|
|
222
|
+
|
|
223
|
+
### Hybrid Movements (Advanced - Higher Risk, Higher Reward)
|
|
224
|
+
|
|
225
|
+
| Movement | Effect | Risk Level |
|
|
226
|
+
|----------|--------|------------|
|
|
227
|
+
| **Dolly Zoom (Vertigo Effect)** | Camera moves while the lens zooms the opposite way | Medium - stunning but warp-prone |
|
|
228
|
+
| **Truck Left/Right + Pan Left/Right** | Lateral camera move plus rotation for parallax | Medium - must preserve screen direction |
|
|
229
|
+
| **Crane/Jib Shot + Arc Left/Right** | Elevated diagonal move around subject | Low-Medium - smooth cinematic reveal |
|
|
230
|
+
| **Whip Pan + Match Cut** | Fast blurred rotation hides transition | Medium - use only when transition is intentional |
|
|
231
|
+
|
|
232
|
+
> **Kling Note:** Professional terms activate Kling's "Visual Chain-of-Thought" (vCoT). Hybrid movements trigger stronger vCoT processing for more cinematic results, but increase warp risk. Use with CFG 0.50-0.60.
|
|
186
233
|
|
|
187
234
|
---
|
|
188
235
|
|
package/content/skills/semantic-consistency/SKILL.md
ADDED
|
@@ -0,0 +1,94 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: semantic-consistency
|
|
3
|
+
description: Enforces scene-level semantic consistency for generated image/video prompts and rendered media: perspective, shadows, scale, reflections, physics, anatomy, contextual contradictions, chain reuse, and render QA.
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Semantic Consistency Skill
|
|
7
|
+
|
|
8
|
+
Load this skill for every Film-Kit generation, repair, safety check, and render QA pass.
|
|
9
|
+
It turns semantic realism into a hard production gate instead of a best-effort prompt style.
|
|
10
|
+
|
|
11
|
+
## Required `visual_world` Contract
|
|
12
|
+
|
|
13
|
+
Every `shot-plan.json` or `team-plan.json` must carry a scene-level `visual_world` block.
|
|
14
|
+
Use it as the canonical source for all shots in the same scene:
|
|
15
|
+
|
|
16
|
+
```json
|
|
17
|
+
{
|
|
18
|
+
"visual_world": {
|
|
19
|
+
"aspect_ratio": "16:9",
|
|
20
|
+
"camera_height": "eye-level / low-angle / high-angle / top-down",
|
|
21
|
+
"lens_family": "wide 24-35mm / normal 50mm / portrait 85mm / telephoto 135mm+ / macro",
|
|
22
|
+
"horizon_line": "low / center / high / not visible",
|
|
23
|
+
"vanishing_point_strategy": "single-point / two-point / flat telephoto / top-down no horizon",
|
|
24
|
+
"camera_movement_strategy": "static / one named 24-move cinematic movement / motivated compound move with parallax notes",
|
|
25
|
+
"light_source": "single motivated source, position, angle, softness",
|
|
26
|
+
"shadow_direction": "all shadows fall screen-left / screen-right / toward camera / away from camera / directly below",
|
|
27
|
+
"color_temperature": "warm 3500K / neutral daylight 5600K / cool 6500K",
|
|
28
|
+
"scale_map": "foreground, midground, background object sizes and distance logic",
|
|
29
|
+
"reflection_risk": "none / glass / mirror / water / metal, plus expected reflection behavior",
|
|
30
|
+
"physics_constraints": "gravity, contact points, support surfaces, cloth/hair/liquid behavior",
|
|
31
|
+
"seed_strategy": "locked seed for variants / new seed for structural repair / unknown platform seed"
|
|
32
|
+
}
|
|
33
|
+
}
|
|
34
|
+
```
|
|
35
|
+
|
|
36
|
+
If any field is unknown, write an explicit value such as `not applicable: no reflective surfaces`.
|
|
37
|
+
Do not leave silent blanks.
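As an illustration of the "no silent blanks" rule, the sketch below reports which contract fields are missing or empty. It assumes the plan has already been loaded into a Python dict (for example with `json.load`). `REQUIRED_FIELDS` and `find_blank_visual_world_fields` are hypothetical names, not part of Film-Kit.

```python
# Field names copied from the `visual_world` contract above.
REQUIRED_FIELDS = (
    "aspect_ratio", "camera_height", "lens_family", "horizon_line",
    "vanishing_point_strategy", "camera_movement_strategy", "light_source",
    "shadow_direction", "color_temperature", "scale_map", "reflection_risk",
    "physics_constraints", "seed_strategy",
)


def find_blank_visual_world_fields(plan: dict) -> list:
    """Hypothetical helper: list required fields that are missing or blank."""
    world = plan.get("visual_world")
    if not isinstance(world, dict):
        # No block at all means every field is missing.
        return list(REQUIRED_FIELDS)
    blanks = []
    for field in REQUIRED_FIELDS:
        value = world.get(field)
        # Assumption: every field is a non-empty string, as in the contract example.
        if not isinstance(value, str) or not value.strip():
            blanks.append(field)
    return blanks
```

An explicit value such as `not applicable: no reflective surfaces` passes this check, which matches the intent that unknowns are written out instead of omitted.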
|
|
38
|
+
|
|
39
|
+
## Hard Semantic Gates
|
|
40
|
+
|
|
41
|
+
Reject or repair a prompt package if any item fails:
|
|
42
|
+
|
|
43
|
+
- Perspective/geometry: camera angle, lens family, camera height, horizon line, vanishing logic, and camera movement strategy do not agree.
|
|
44
|
+
- Camera movement: a moving shot does not use one of the 24 canonical cinematic movement names from `prompt-structure`, or combines movements without motivated parallax/scale notes.
|
|
45
|
+
- Shadow vector: a single motivated light source and one consistent shadow direction are not stated.
|
|
46
|
+
- Scale map: foreground/midground/background object sizes and distances are ambiguous or physically impossible.
|
|
47
|
+
- Reflection handling: mirror, glass, water, or metal surfaces lack accurate reflection instructions or a `matte/non-reflective` reduction.
|
|
48
|
+
- Gravity/contact physics: objects, feet, furniture, cloth, hair, liquid, smoke, and debris do not have plausible support or behavior cues.
|
|
49
|
+
- Anatomy risk: visible humans lack pose simplicity, hand/face risk handling, or scene-specific anatomy avoid terms.
|
|
50
|
+
- Foreground/background coherence: background does not match foreground perspective, lighting, color temperature, and style.
|
|
51
|
+
- Contextual contradiction: prompt contains incompatible scene logic unless explicitly justified.
|
|
52
|
+
- Scene-specific avoid line: avoid terms are generic only, missing the scene's actual failure modes.
|
|
53
|
+
|
|
54
|
+
## Chained `ILK FRAME` Rule
|
|
55
|
+
|
|
56
|
+
For a chained shot, the `ILK FRAME` fenced code block may contain only this reuse instruction:
|
|
57
|
+
|
|
58
|
+
```text
|
|
59
|
+
Use SHOT[prev]_END as exact first frame.
|
|
60
|
+
```
|
|
61
|
+
|
|
62
|
+
No new visual prompt, new lens, new camera angle, new lighting, or new composition is allowed inside a chained `ILK FRAME` block.
|
|
63
|
+
If a new `ILK FRAME` prompt is required, mark the shot as `CHAIN BREAK - [reason]` and generate a fresh start frame.
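The chained-frame rule could be linted roughly as follows. This is a sketch under one stated assumption: `[prev]` is replaced in real shot files by the previous shot's number (for example `SHOT03_END`). `REUSE_PATTERN` and `chained_first_frame_is_valid` are hypothetical names.

```python
import re

# Assumption: "[prev]" becomes a shot number such as "03" in real files.
REUSE_PATTERN = re.compile(r"Use SHOT(\d+)_END as exact first frame\.?")


def chained_first_frame_is_valid(block_text: str) -> bool:
    """True when the block holds the reuse instruction and nothing else."""
    lines = [line.strip() for line in block_text.splitlines() if line.strip()]
    # Any extra non-empty line counts as a competing visual prompt.
    return len(lines) == 1 and REUSE_PATTERN.fullmatch(lines[0]) is not None
```

A block that fails this check would either be trimmed back to the reuse instruction or re-declared as `CHAIN BREAK - [reason]` with a fresh start frame.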
|
|
64
|
+
|
|
65
|
+
## Prompt Injection Template
|
|
66
|
+
|
|
67
|
+
Append the relevant semantic anchors to every image prompt:
|
|
68
|
+
|
|
69
|
+
```text
|
|
70
|
+
Semantic consistency: [camera height], [lens family], [horizon/vanishing logic], camera movement [static or named 24-move lexicon term], single [light source] from [direction] at [angle], [color temperature], all shadows falling [direction], scale map [FG/MG/BG distances], contact points and gravity physically plausible, background perspective and lighting match the foreground, reflection handling [none/matte/accurate].
|
|
71
|
+
Avoid: improper perspective, wrong scale, inconsistent shadows, impossible geometry, conflicting light sources, unrealistic reflections, floating objects, disconnected elements, broken gravity, bad anatomy, extra fingers, deformed hands, foreground-background mismatch, contextual contradiction, [scene-specific terms].
|
|
72
|
+
```
|
|
73
|
+
|
|
74
|
+
Keep avoid lines targeted. Prefer 15-25 concrete scene-relevant terms over long universal lists.
|
|
75
|
+
Do not put the positive goal in the avoid line.
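The 15-25 term guidance can be checked with a simple count. The sketch below assumes avoid terms are comma-separated, as in the template above. `count_avoid_terms` and `avoid_line_in_target_range` are hypothetical helpers.

```python
def count_avoid_terms(avoid_line: str) -> int:
    """Count comma-separated terms after an optional leading 'Avoid:' label."""
    body = avoid_line.strip()
    if body.lower().startswith("avoid:"):
        body = body[len("avoid:"):]
    # Drop surrounding spaces and the trailing period before counting.
    terms = [part.strip(" .") for part in body.split(",")]
    return len([term for term in terms if term])


def avoid_line_in_target_range(avoid_line: str, low: int = 15, high: int = 25) -> bool:
    """True when the term count falls inside the recommended 15-25 range."""
    return low <= count_avoid_terms(avoid_line) <= high
```

The count says nothing about whether the terms are scene-relevant; that part of the guidance still needs a human or model review.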
|
|
76
|
+
|
|
77
|
+
## Render QA Gate
|
|
78
|
+
|
|
79
|
+
After render, inspect the actual image/video outputs, not only the prompt text.
|
|
80
|
+
Write `SEMANTIC-RENDER-REPORT.md` or a `Semantic Render QA` section in `RENDER-REPORT.md` with these fields:
|
|
81
|
+
|
|
82
|
+
```markdown
|
|
83
|
+
- semantic_render_status: pass/fail
|
|
84
|
+
- perspective_geometry_status: pass/fail
|
|
85
|
+
- shadow_vector_status: pass/fail
|
|
86
|
+
- scale_depth_status: pass/fail
|
|
87
|
+
- reflection_status: pass/fail/not_applicable
|
|
88
|
+
- anatomy_physics_status: pass/fail
|
|
89
|
+
- foreground_background_status: pass/fail
|
|
90
|
+
- chain_alignment_status: pass/fail
|
|
91
|
+
- rerender_or_recover_actions: [none or exact SHOTNN actions]
|
|
92
|
+
```
|
|
93
|
+
|
|
94
|
+
Fail the render if the actual media contradicts the canonical `visual_world` or if a chained first frame is not an exact copy of the previous rendered end frame.
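One way to keep `semantic_render_status` consistent with the individual checks is to derive it from them. The sketch below is not the package's implementation. It assumes the report uses the `- key: value` lines shown above and that the overall status passes only when every sub-status passes (or reflection is `not_applicable`). The source does not state that aggregation rule, and the function names are hypothetical.

```python
# Sub-status keys copied from the report field list above.
STATUS_KEYS = (
    "perspective_geometry_status", "shadow_vector_status", "scale_depth_status",
    "reflection_status", "anatomy_physics_status", "foreground_background_status",
    "chain_alignment_status",
)


def parse_report_fields(markdown: str) -> dict:
    """Collect '- key: value' bullet lines into a dict."""
    fields = {}
    for line in markdown.splitlines():
        line = line.strip()
        if line.startswith("- ") and ":" in line:
            key, value = line[2:].split(":", 1)
            fields[key.strip()] = value.strip()
    return fields


def derive_semantic_render_status(fields: dict) -> str:
    """Assumed rule: 'pass' only if every sub-status passes."""
    for key in STATUS_KEYS:
        value = fields.get(key)
        if value == "pass":
            continue
        if key == "reflection_status" and value == "not_applicable":
            continue
        # Missing or unrecognized values are treated as failures.
        return "fail"
    return "pass"
```

Comparing the derived value with the written `semantic_render_status` is a cheap way to catch a report whose status fields contradict each other.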
|
package/content/skills/spatial-blocking/SKILL.md
CHANGED
|
@@ -138,6 +138,7 @@ These cues reduce the pasted-PNG feeling.
|
|
|
138
138
|
## 4. Shared Light Map
|
|
139
139
|
|
|
140
140
|
The fastest way to create a composite artifact is to light each subject as if they live in different scenes.
|
|
141
|
+
For the broader semantic gate, load `semantic-consistency/SKILL.md` and copy the canonical `visual_world.camera_movement_strategy`, `light_source`, `shadow_direction`, `color_temperature`, `scale_map`, `reflection_risk`, and `physics_constraints` into the shot language.
|
|
141
142
|
|
|
142
143
|
### Required Language
|
|
143
144
|
|
package/content/skills/visual-modes/SKILL.md
CHANGED
|
@@ -94,21 +94,25 @@ These artifacts identify AI-generated content. AVOID them absolutely:
|
|
|
94
94
|
|
|
95
95
|
## IMAGE AVOID LINE (Ultra Realism)
|
|
96
96
|
|
|
97
|
-
Include at end of EVERY image prompt:
|
|
98
|
-
|
|
99
|
-
```
|
|
97
|
+
Include at end of EVERY image prompt:
|
|
98
|
+
|
|
99
|
+
```
|
|
100
100
|
Avoid: blurry, low-res, noise, jpeg artifacts, motion blur, out of focus, distorted faces, bad anatomy, extra limbs/fingers, deformed hands, mismatched eyes, warped perspective, inconsistent lighting, banding, over-sharpening, plastic skin, waxy skin, airbrushed skin, beauty filter, porcelain doll look, symmetrical face, duplicate people, floating artifacts, cutout edges, pasted composite look, toy-like scale, miniature effect, disconnected eyelines, on-screen text, captions/subtitles, watermark, logo, UI elements, cartoon/anime style, illustration style, CGI look, video game graphics, 3D render look, artificial lighting, synthetic appearance.
|
|
101
|
-
```
|
|
101
|
+
```
|
|
102
|
+
|
|
103
|
+
Add scene-specific semantic terms from `semantic-consistency/SKILL.md` rather than blindly expanding the list. Prioritize perspective, shadow vector, scale, reflection, gravity/contact, anatomy, foreground/background, and contextual contradiction risks that actually exist in the shot.
|
|
102
104
|
|
|
103
105
|
---
|
|
104
106
|
|
|
105
107
|
## VIDEO AVOID LINE (Ultra Realism)
|
|
106
108
|
|
|
107
|
-
Include at end of EVERY video prompt:
|
|
108
|
-
|
|
109
|
-
```
|
|
109
|
+
Include at end of EVERY video prompt:
|
|
110
|
+
|
|
111
|
+
```
|
|
110
112
|
Avoid: distorted faces, morphing, bad anatomy, extra limbs/fingers, blurry, flickering, frame drops, inconsistent lighting, unnatural motion, warping, rolling shutter artifacts, camera jitter, cutout edges, pasted composite look, toy-like scale, miniature effect, disconnected eyelines, on-screen text, captions/subtitles, watermark, logo, cartoon/anime style, CGI motion, synthetic appearance, robotic movement, puppet-like animation, uncanny valley expressions.
|
|
111
|
-
```
|
|
113
|
+
```
|
|
114
|
+
|
|
115
|
+
For Start+End or image-to-video workflows, keep the semantic avoid line aligned with the same `visual_world` used by the still prompts.
|
|
112
116
|
|
|
113
117
|
---
|
|
114
118
|
|
package/content/workflows/chain.md
CHANGED
|
@@ -50,7 +50,9 @@ If model is kling-3.0: keep Start+End transition mode and first/then/finally mot
|
|
|
50
50
|
- Write each shot to `$OUTPUT_DIR/shots/SHOT[NN].md`
|
|
51
51
|
- Keep one-file-per-shot contract
|
|
52
52
|
- Ensure `ILK/İLK FRAME` code block exists even when chained
|
|
53
|
+
- For chained shots, the `ILK/İLK FRAME` code block must contain only `Use SHOT[prev]_END as exact first frame`; write `CHAIN BREAK` before any new start-frame prompt
|
|
53
54
|
- Keep `Audio Plan` blocks aligned to the existing `voiceCast`
|
|
55
|
+
- Keep `shot-plan.json.visual_world` consistent or document a `CHAIN BREAK` scene-world reset
|
|
54
56
|
- Update `$OUTPUT_DIR/_index.md`
|
|
55
57
|
|
|
56
58
|
### 5. Refresh Reports
|
|
@@ -59,6 +61,7 @@ After continuation batch:
|
|
|
59
61
|
|
|
60
62
|
- run `/safety-check`
|
|
61
63
|
- refresh `$OUTPUT_DIR/reports/SAFETY-REPORT.md`
|
|
64
|
+
- refresh `$OUTPUT_DIR/reports/SEMANTIC-REPORT.md`
|
|
62
65
|
- refresh `$OUTPUT_DIR/reports/DELIVERY-REPORT.md`
|
|
63
66
|
|
|
64
67
|
---
|
package/content/workflows/finish.md
CHANGED
|
@@ -34,6 +34,8 @@ All shot files follow `SHOTNN.md` naming.
|
|
|
34
34
|
7. Verify `Model Control` block exists in every shot file.
|
|
35
35
|
8. Verify `shot-plan.json` has `voiceCast` coverage for every speaker or narrator.
|
|
36
36
|
9. Verify every speaking VIDEO section has an `Audio Plan` block with valid `activeSpeakerKey`.
|
|
37
|
+
10. Verify `shot-plan.json.visual_world` exists and every shot follows its camera, lens, camera movement strategy, shadow vector, scale map, reflection, physics, and seed strategy.
|
|
38
|
+
11. Verify chained `ILK/İLK FRAME` code blocks contain only `Use SHOT[prev]_END as exact first frame`; any competing prompt must be marked `CHAIN BREAK`.
|
|
37
39
|
8. For `kling-3.0`, verify:
|
|
38
40
|
- `Transition Mode: Start+End`
|
|
39
41
|
- CFG value is documented
|
|
@@ -44,13 +46,14 @@ All shot files follow `SHOTNN.md` naming.
|
|
|
44
46
|
Required reports:
|
|
45
47
|
|
|
46
48
|
- `$OUTPUT_DIR/reports/SAFETY-REPORT.md`
|
|
49
|
+
- `$OUTPUT_DIR/reports/SEMANTIC-REPORT.md`
|
|
47
50
|
- `$OUTPUT_DIR/reports/DELIVERY-REPORT.md`
|
|
48
51
|
|
|
49
52
|
Gate rules:
|
|
50
53
|
|
|
51
54
|
- reports must exist
|
|
52
55
|
- reports must have non-contradictory status fields
|
|
53
|
-
-
|
|
56
|
+
- all reports must be pass
|
|
54
57
|
|
|
55
58
|
If any rule fails, run `/recover` and do not finish.
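The gate rules above reduce to a small predicate. This sketch assumes a hypothetical `statuses` mapping from report file name to that report's overall status, already read from the report files; the name `finish_gate` is likewise illustrative and not part of the package.

```python
REQUIRED_REPORTS = ("SAFETY-REPORT.md", "SEMANTIC-REPORT.md", "DELIVERY-REPORT.md")


def finish_gate(statuses: dict[str, str]) -> bool:
    """Return True only when every required report is present and marked "pass"."""
    # A missing report yields None from .get(), which is not "pass".
    return all(statuses.get(name) == "pass" for name in REQUIRED_REPORTS)
```

When the function returns `False`, the workflow text calls for running `/recover` instead of finishing.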
|
|
56
59
|
|
|
@@ -82,6 +85,7 @@ Do not declare completion unless:
|
|
|
82
85
|
|
|
83
86
|
- all shot files pass structure and continuity
|
|
84
87
|
- safety report is pass
|
|
88
|
+
- semantic report is pass
|
|
85
89
|
- delivery report is pass
|
|
86
90
|
- final summary says pass
|
|
87
91
|
- model-specific checks pass (Kling or Veo profile)
|
|
@@ -38,6 +38,7 @@ $OUTPUT_DIR/
|
|
|
38
38
|
│ └── ...
|
|
39
39
|
├── reports/
|
|
40
40
|
│ ├── SAFETY-REPORT.md # Safety and policy validation
|
|
41
|
+
│ ├── SEMANTIC-REPORT.md # Perspective, shadow, scale, physics validation
|
|
41
42
|
│ └── DELIVERY-REPORT.md # Final packaging validation
|
|
42
43
|
└── _index.md # Shot list with status and report gate table
|
|
43
44
|
```
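With the tree above, an existence check for the three report files might look like the sketch below. The helper name `missing_reports` and the choice to return a list of missing file names are assumptions made for illustration; only the directory layout and the file names come from the tree.

```python
from pathlib import Path

REPORT_NAMES = ("SAFETY-REPORT.md", "SEMANTIC-REPORT.md", "DELIVERY-REPORT.md")


def missing_reports(output_dir: str) -> list[str]:
    """List the required report files that are absent from <output_dir>/reports/."""
    reports_dir = Path(output_dir) / "reports"
    return [name for name in REPORT_NAMES if not (reports_dir / name).is_file()]
```

An empty list means all three reports exist; it says nothing about whether their status fields are pass.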
|
|
@@ -64,6 +65,7 @@ $OUTPUT_DIR/
|
|
|
64
65
|
- `dialogue_name_policy` (`preserve-original-dialogue` or `anonymize-dialogue`)
|
|
65
66
|
- top-level `voiceCast`
|
|
66
67
|
- voice defaults (`single_active_speaker`, `music_default`, `subtitles_default`)
|
|
68
|
+
- top-level `visual_world` with `aspect_ratio`, `camera_height`, `lens_family`, `horizon_line`, `vanishing_point_strategy`, `camera_movement_strategy`, `light_source`, `shadow_direction`, `color_temperature`, `scale_map`, `reflection_risk`, `physics_constraints`, and `seed_strategy`
|
|
67
69
|
10. Ensure every speaker or narrator has a stable `speakerKey` before shot writing.
|
|
68
70
|
11. Ensure directories exist: `$OUTPUT_DIR/`, `$OUTPUT_DIR/shots/`, `$OUTPUT_DIR/reports/`.
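The `visual_world` requirement in the list above can be checked before shot writing begins. The sketch below assumes `shot-plan.json` is available as a JSON string with `visual_world` as a top-level object; the function name `missing_visual_world_keys` is hypothetical. The thirteen key names are the ones listed in the workflow text.

```python
import json

REQUIRED_VISUAL_WORLD_KEYS = (
    "aspect_ratio", "camera_height", "lens_family", "horizon_line",
    "vanishing_point_strategy", "camera_movement_strategy", "light_source",
    "shadow_direction", "color_temperature", "scale_map", "reflection_risk",
    "physics_constraints", "seed_strategy",
)


def missing_visual_world_keys(shot_plan_json: str) -> list[str]:
    """Return the required visual_world keys that the shot plan does not define."""
    plan = json.loads(shot_plan_json)
    visual_world = plan.get("visual_world")
    # A missing or non-object visual_world means every key is missing.
    if not isinstance(visual_world, dict):
        return list(REQUIRED_VISUAL_WORLD_KEYS)
    return [key for key in REQUIRED_VISUAL_WORLD_KEYS if key not in visual_world]
```

This only confirms that the keys are present. Whether each shot actually follows the declared camera, shadow, and scale values still needs the per-prompt review described in the workflow.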
|
|
69
71
|
|
|
@@ -92,7 +94,7 @@ For EACH shot:
|
|
|
92
94
|
4. Reuse or create the correct `voiceCast` entry for any speaking character or narrator.
|
|
93
95
|
5. Generate main shot prompts (`ILK/İLK FRAME`, `SON FRAME`, `VIDEO`).
|
|
94
96
|
5. `ILK/İLK FRAME` section MUST always include a fenced code block.
|
|
95
|
-
6. If chained, first-frame code block must
|
|
97
|
+
6. If chained, first-frame code block must contain only: `Use SHOT[prev]_END as exact first frame`. Any new visual prompt, camera, lens, lighting, or composition requires `CHAIN BREAK - [reason]`.
|
|
96
98
|
7. Write a machine-readable `Audio Plan` JSON block for every VIDEO section.
|
|
97
99
|
8. If dialogue or voiceover exists, require `activeSpeakerKey`, `dialogueLines`, and `performanceNote`.
|
|
98
100
|
9. Keep one active speaker per shot. Split reply dialogue across multiple shots.
|
|
@@ -103,11 +105,19 @@ For EACH shot:
|
|
|
103
105
|
- each coverage prompt: minimum 70 words
|
|
104
106
|
8. Enforce specificity floor on every prompt:
|
|
105
107
|
- explicit lens/framing/camera movement
|
|
108
|
+
- camera movement selected from the 24-move cinematic lexicon in `prompt-structure` when movement is present
|
|
106
109
|
- explicit lighting direction/intensity/atmosphere
|
|
107
110
|
- explicit foreground/midground/background action details
|
|
108
111
|
- explicit eyeline target / body orientation when a subject looks at someone or something
|
|
109
112
|
- explicit shared light source / bounce logic when multiple subjects share frame
|
|
110
113
|
- explicit depth/scale integration when more than one plane is visible
|
|
114
|
+
9. Enforce semantic consistency floor on every prompt:
|
|
115
|
+
- perspective/geometry matches `shot-plan.json -> visual_world`
|
|
116
|
+
- shadow vector follows the canonical light source and `shadow_direction`
|
|
117
|
+
- scale map and foreground/midground/background distances are physically plausible
|
|
118
|
+
- reflection handling is accurate or reflective surfaces are intentionally avoided
|
|
119
|
+
- gravity/contact physics, anatomy risk, foreground/background coherence, and contextual contradictions are resolved
|
|
120
|
+
- avoid line contains targeted scene-specific semantic failure terms
|
|
111
121
|
10. Generate coverage prompts (2-3 per main shot, min 70 words each).
|
|
112
122
|
11. Add Turkish summary for shot and each coverage section.
|
|
113
123
|
12. Apply model-specific generation gates (see below).
|
|
@@ -126,7 +136,7 @@ Before writing prompts, design the Start→End transition:
|
|
|
126
136
|
3. **Execution mode:** Default to `single-transition`; use `custom-storyboard` only when the shot truly has 2-3 meaningful internal phases.
|
|
127
137
|
4. **Motion timeline:** Write 2-4 steps: `first → then → finally`.
|
|
128
138
|
5. **Face/hands stability:** Match orientations between start and end — avoid >45° face rotation.
|
|
129
|
-
6. **Camera safety:** Use
|
|
139
|
+
6. **Camera safety:** Use the 24-move cinematic lexicon; prefer safe simple moves (Dolly In/Out, Pan Left/Right, Tilt Up/Down, Truck Left/Right, Pedestal Up/Down, Tracking Shot, Steadicam Movement) before advanced hybrids.
|
|
130
140
|
7. **Anti-fragmentation:** Do not turn one glance, gesture, or prop touch into separate micro-shots. If custom storyboard is used, cap it at 3 stages and make each stage editorially distinct.
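The camera safety step names a subset of safe simple moves, which can be captured as a lookup. This is a sketch under two assumptions: that paired entries such as `Dolly In/Out` expand to two separate names, and that matching is case-insensitive. The full 24-move lexicon lives in `prompt-structure` and is not reproduced in this diff, so a `False` result here means only "not in the safe subset", not "invalid move".

```python
# Safe simple moves as named in the camera safety step, with slash pairs
# expanded into separate names (an assumption about the intended spelling).
SAFE_SIMPLE_MOVES = frozenset(
    name.lower()
    for name in (
        "Dolly In", "Dolly Out", "Pan Left", "Pan Right",
        "Tilt Up", "Tilt Down", "Truck Left", "Truck Right",
        "Pedestal Up", "Pedestal Down", "Tracking Shot", "Steadicam Movement",
    )
)


def is_safe_simple_move(name: str) -> bool:
    """Return True when the move is in the safe simple subset (case-insensitive)."""
    return name.strip().lower() in SAFE_SIMPLE_MOVES
```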
|
|
131
141
|
|
|
132
142
|
#### Veo Gate (when model is veo31)
|
|
@@ -139,7 +149,7 @@ Before writing prompts, design the Start→End transition:
|
|
|
139
149
|
|
|
140
150
|
- [ ] `first → then → finally` motion timeline in VIDEO prompts
|
|
141
151
|
- [ ] "What stays the same" explicitly stated (identity, background, costume)
|
|
142
|
-
- [ ] Camera movement is
|
|
152
|
+
- [ ] Camera movement is named from the 24-move cinematic lexicon and kept simple/safe unless the beat requires an advanced hybrid
|
|
143
153
|
- [ ] `stable background`, `no warping`, `physically plausible` constraints
|
|
144
154
|
- [ ] Kling negative prompt set active (warping, rubbery, melted, deformed)
|
|
145
155
|
- [ ] Duration matches transformation budget
|
|
@@ -162,8 +172,18 @@ Before writing prompts, design the Start→End transition:
|
|
|
162
172
|
- `file_completeness_status`
|
|
163
173
|
- `packaging_status`
|
|
164
174
|
- `blockers`
|
|
165
|
-
4.
|
|
166
|
-
|
|
175
|
+
4. Write `$OUTPUT_DIR/reports/SEMANTIC-REPORT.md` with strict fields:
|
|
176
|
+
- `overall_status`
|
|
177
|
+
- `visual_world_status`
|
|
178
|
+
- `perspective_geometry_status`
|
|
179
|
+
- `shadow_vector_status`
|
|
180
|
+
- `scale_depth_status`
|
|
181
|
+
- `reflection_physics_anatomy_status`
|
|
182
|
+
- `contextual_contradiction_status`
|
|
183
|
+
- `chained_ilk_frame_status`
|
|
184
|
+
- `blockers`
|
|
185
|
+
5. Reject any shot output that passes while violating quality floor, specificity floor, or semantic consistency floor.
|
|
186
|
+
6. Do not finalize if any report is fail or missing.
|
|
167
187
|
|
|
168
188
|
---
|
|
169
189
|
|
|
@@ -192,8 +212,8 @@ FIRST SHOT / CHAINED from SHOT[prev]_END / CHAIN BREAK - Reason
|
|
|
192
212
|
### ILK FRAME (SHOTNN_START)
|
|
193
213
|
|
|
194
214
|
```text
|
|
195
|
-
[Image prompt - min 80 words, Flow Order]
|
|
196
|
-
[If chained:
|
|
215
|
+
[Image prompt - min 80 words, Flow Order for FIRST SHOT or CHAIN BREAK only]
|
|
216
|
+
[If chained: the entire code block must be only "Use SHOT[prev]_END as exact first frame"]
|
|
197
217
|
Avoid: blurry, low-res, text, watermark, bad anatomy, distorted face ...
|
|
198
218
|
```
|
|
199
219
|
|
|
@@ -305,6 +325,7 @@ Report:
|
|
|
305
325
|
[N] shot olusturuldu ve $OUTPUT_DIR/shots/ klasorune kaydedildi.
|
|
306
326
|
Toplam: [N] ana shot + [M] coverage = [N+M] production shots.
|
|
307
327
|
Safety report: $OUTPUT_DIR/reports/SAFETY-REPORT.md
|
|
328
|
+
Semantic report: $OUTPUT_DIR/reports/SEMANTIC-REPORT.md
|
|
308
329
|
Delivery report: $OUTPUT_DIR/reports/DELIVERY-REPORT.md
|
|
309
330
|
|
|
310
331
|
Devam etmek icin: 'devam et' veya '/chain'
|
|
@@ -12,9 +12,12 @@ $ARGUMENTS
|
|
|
12
12
|
|
|
13
13
|
- safety report is fail
|
|
14
14
|
- delivery report is fail
|
|
15
|
+
- semantic report is fail
|
|
15
16
|
- missing required shot sections
|
|
16
17
|
- continuity mismatch between neighboring shots
|
|
17
18
|
- missing `voiceCast` entry or broken `activeSpeakerKey` binding
|
|
19
|
+
- missing or contradictory `shot-plan.json.visual_world`
|
|
20
|
+
- chained `ILK/İLK FRAME` block contains a new visual prompt instead of exact reuse
|
|
18
21
|
|
|
19
22
|
## Recovery Steps
|
|
20
23
|
|
|
@@ -24,11 +27,13 @@ $ARGUMENTS
|
|
|
24
27
|
4. Repair `voiceCast` or `Audio Plan` bindings before rerunning reports.
|
|
25
28
|
5. Re-run `/safety-check`.
|
|
26
29
|
6. Regenerate `$OUTPUT_DIR/reports/DELIVERY-REPORT.md`.
|
|
27
|
-
7.
|
|
30
|
+
7. Regenerate `$OUTPUT_DIR/reports/SEMANTIC-REPORT.md`.
|
|
31
|
+
8. Update `$OUTPUT_DIR/_index.md` with recovered status.
|
|
28
32
|
|
|
29
33
|
## Exit Criteria
|
|
30
34
|
|
|
31
35
|
Recovery is complete only when both are pass:
|
|
32
36
|
|
|
33
37
|
- `$OUTPUT_DIR/reports/SAFETY-REPORT.md`
|
|
38
|
+
- `$OUTPUT_DIR/reports/SEMANTIC-REPORT.md`
|
|
34
39
|
- `$OUTPUT_DIR/reports/DELIVERY-REPORT.md`
|
|
@@ -46,8 +46,18 @@ Validate all prompts before delivery to ensure platform compliance.
|
|
|
46
46
|
3. Continuity checks
|
|
47
47
|
- `SHOT[N]_END` aligns with `SHOT[N+1]_START`
|
|
48
48
|
- chain breaks are explicitly declared when needed
|
|
49
|
-
|
|
50
|
-
|
|
49
|
+
- chained `ILK/İLK FRAME` code blocks contain only `Use SHOT[prev]_END as exact first frame`; competing new prompts are automatic fail
|
|
50
|
+
|
|
51
|
+
4. Semantic consistency checks
|
|
52
|
+
- `shot-plan.json` contains top-level `visual_world`
|
|
53
|
+
- `visual_world` includes `aspect_ratio`, `camera_height`, `lens_family`, `horizon_line`, `vanishing_point_strategy`, `camera_movement_strategy`, `light_source`, `shadow_direction`, `color_temperature`, `scale_map`, `reflection_risk`, `physics_constraints`, and `seed_strategy`
|
|
54
|
+
- every prompt aligns perspective/geometry with the canonical lens/camera/horizon/vanishing logic
|
|
55
|
+
- every moving prompt uses one of the 24 canonical cinematic camera movement names from `prompt-structure`
|
|
56
|
+
- every prompt states one shadow vector from the canonical light source
|
|
57
|
+
- scale/depth, foreground/background coherence, reflection handling, gravity/contact physics, anatomy risk, and contextual contradictions are resolved
|
|
58
|
+
- avoid lines include targeted scene-specific semantic failure terms
|
|
59
|
+
|
|
60
|
+
5. Quality checks
|
|
51
61
|
- `ILK/İLK FRAME` prompt >= 80 words
|
|
52
62
|
- `SON FRAME` prompt >= 80 words
|
|
53
63
|
- `VIDEO/VİDEO` prompt >= 120 words
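The word-count floors above are simple to check in code. The sketch below counts whitespace-delimited words, which is an assumption about how "words" are measured; the dictionary keys and the function name `meets_word_floor` are illustrative. How chained first-frame blocks, which hold only the reuse line, interact with the 80-word floor is not stated in this diff and is not modeled here.

```python
# Minimum word counts per section, as listed in the quality checks.
WORD_FLOORS = {"ILK FRAME": 80, "SON FRAME": 80, "VIDEO": 120}


def meets_word_floor(section: str, prompt: str) -> bool:
    """Compare the whitespace-delimited word count with the section's floor."""
    return len(prompt.split()) >= WORD_FLOORS[section]
```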
|
|
@@ -60,7 +70,7 @@ Validate all prompts before delivery to ensure platform compliance.
|
|
|
60
70
|
- contact / weight / support cues exist when compositing realism is critical
|
|
61
71
|
- depth / scale integration is explicit when multiple planes are visible
|
|
62
72
|
|
|
63
|
-
|
|
73
|
+
6. Kling-only checks (when model is `kling-3.0`)
|
|
64
74
|
- `Model Control` block exists in each SHOT file
|
|
65
75
|
- `Transition Mode: Start+End` is declared
|
|
66
76
|
- VIDEO prompt contains explicit `first -> then -> finally` progression
|
|
@@ -90,6 +100,30 @@ Write `$OUTPUT_DIR/reports/SAFETY-REPORT.md` using strict fields:
|
|
|
90
100
|
- [concise bullet list or none]
|
|
91
101
|
```
|
|
92
102
|
|
|
103
|
+
Write `$OUTPUT_DIR/reports/SEMANTIC-REPORT.md` using strict fields:
|
|
104
|
+
|
|
105
|
+
```markdown
|
|
106
|
+
# SEMANTIC REPORT
|
|
107
|
+
|
|
108
|
+
- overall_status: pass/fail
|
|
109
|
+
- visual_world_status: pass/fail
|
|
110
|
+
- perspective_geometry_status: pass/fail
|
|
111
|
+
- shadow_vector_status: pass/fail
|
|
112
|
+
- scale_depth_status: pass/fail
|
|
113
|
+
- reflection_physics_anatomy_status: pass/fail
|
|
114
|
+
- foreground_background_status: pass/fail
|
|
115
|
+
- contextual_contradiction_status: pass/fail
|
|
116
|
+
- chained_ilk_frame_status: pass/fail
|
|
117
|
+
- blockers:
|
|
118
|
+
- none
|
|
119
|
+
|
|
120
|
+
## Findings
|
|
121
|
+
- [concise bullet list]
|
|
122
|
+
|
|
123
|
+
## Fixes Applied
|
|
124
|
+
- [concise bullet list or none]
|
|
125
|
+
```
|
|
126
|
+
|
|
93
127
|
Rules:
|
|
94
128
|
- Status fields must not contradict each other.
|
|
95
129
|
- If any status is fail, `overall_status` must be fail.
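The rule that any failing status forces `overall_status` to fail can be enforced by deriving the overall value instead of writing it by hand. The sketch below uses the status fields from the `SEMANTIC-REPORT.md` template in this file; the function name `derive_overall_status` is hypothetical, and treating a missing field as a failure is an assumption.

```python
CHECK_FIELDS = (
    "visual_world_status",
    "perspective_geometry_status",
    "shadow_vector_status",
    "scale_depth_status",
    "reflection_physics_anatomy_status",
    "foreground_background_status",
    "contextual_contradiction_status",
    "chained_ilk_frame_status",
)


def derive_overall_status(checks: dict[str, str]) -> str:
    """Return "pass" only when every check field is present and equals "pass"."""
    # Assumption: a field that is absent is treated the same as a failing one.
    if all(checks.get(field) == "pass" for field in CHECK_FIELDS):
        return "pass"
    return "fail"
```

Deriving the value this way makes it impossible for `overall_status` to contradict the individual fields.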
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@milenyumai/film-kit",
|
|
3
|
-
"version": "1.4.
|
|
3
|
+
"version": "1.4.2",
|
|
4
4
|
"description": "Hollywood-standard cinematic prompt engineering toolkit with model profiles (Veo 3.1 / Kling 3.0). Auto-configures AI agents (Cursor, Claude Code, VS Code Copilot, Antigravity) with production-grade shot generation system.",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"main": "./build/index.js",
|