@milenyumai/film-kit 2.0.0 → 2.1.0
- package/README.md +66 -1
- package/build/lib/templates.js +121 -22
- package/content/agents/prompt-engineer.md +4 -3
- package/content/skills/frame-chaining/SKILL.md +10 -1
- package/content/skills/prompt-structure/SKILL.md +33 -27
- package/content/skills/reference-locking/SKILL.md +70 -57
- package/content/skills/semantic-consistency/SKILL.md +26 -3
- package/content/workflows/finish.md +1 -1
- package/content/workflows/generate.md +28 -15
- package/content/workflows/safety-check.md +5 -4
- package/package.json +2 -2
- package/packages/hybrid/README.md +13 -17
- package/packages/hybrid/build/cli.js +7 -8
- package/packages/hybrid/build/lib/templates.js +117 -21
- package/packages/hybrid/content/HYBRID-OVERRIDES.md +6 -4
- package/packages/hybrid/content/skills/nano-banana-pro-image/SKILL.md +13 -8
- package/packages/hybrid/content/skills/prompt-structure/SKILL.md +15 -11
- package/packages/hybrid/content/workflows/generate.md +7 -4
- package/packages/hybrid-smart/README.md +14 -18
- package/packages/hybrid-smart/build/cli.js +7 -8
- package/packages/hybrid-smart/build/lib/templates.js +118 -22
- package/packages/hybrid-smart/content/SMART-HYBRID-OVERRIDES.md +4 -4
- package/packages/hybrid-smart/content/skills/nano-banana-pro-image/SKILL.md +13 -8
- package/packages/hybrid-smart/content/skills/prompt-structure/SKILL.md +11 -9
- package/packages/hybrid-smart/content/workflows/generate.md +7 -4
- package/packages/multi/README.md +15 -17
- package/packages/multi/build/cli.js +7 -8
- package/packages/multi/build/lib/configure.js +4 -2
- package/packages/multi/build/lib/templates.js +121 -13
- package/packages/multi/content/agents/character-consistency-auditor.md +9 -2
- package/packages/multi/content/agents/lead-director.md +1 -0
- package/packages/multi/content/agents/semantic-auditor.md +5 -1
- package/packages/multi/content/agents/shot-generator.md +9 -5
- package/packages/multi/content/workflows/generate-multi.md +1 -0
- package/packages/multi/content/workflows/generate-teammate.md +7 -4
- package/packages/multi/content/workflows/safety-check-multi.md +5 -4
- package/packages/studio/README.md +14 -17
- package/packages/studio/build/cli.js +10 -11
- package/packages/studio/build/lib/configure.js +4 -2
- package/packages/studio/build/lib/templates.js +137 -13
- package/packages/studio/content/agents/render-supervisor.md +7 -0
- package/packages/studio/content/skills/fal-render/SKILL.md +6 -0
- package/packages/studio/content/workflows/render.md +2 -1
package/README.md
CHANGED
@@ -51,6 +51,8 @@ npx @milenyumai/film-kit init --preset studio --model seedance-2.0 --refs-dir ./
 
 After the first run, Film-Kit writes `film-kit.config.json` and uses that file as the canonical project configuration.
 
+In Codex-enabled projects, all presets now include an optional `/codex-images` still phase after a successful `/generate`. It is always opt-in and never auto-starts.
+
 ## Presets
 
 | Preset | Purpose | Supported video model surface | Default notes |
@@ -238,18 +240,81 @@ Film-Kit writes repo-scoped runtime files for supported editors and agents. Depe
 - `.cursor/`
 - `.github/copilot-instructions.md`
 - `AGENTS.md`
-- runtime workflows such as `generate`, `chain`, `safety-check`, `recover`, `finish`
+- runtime workflows such as `generate`, `codex-images`, `chain`, `safety-check`, `recover`, `finish`
 
 Editorial/runtime contract shared across the package:
 
 - one file per shot: `SHOTNN.md`
 - coverage lives inside the same shot file
 - report files live under `reports/`
+- optional Codex still outputs live under `codex-images/` with `manifest.json` and `reports/CODEX-IMAGE-REPORT.md`
 - continuity and first-frame chaining are explicit
 - voice design uses top-level `voiceCast`
 - per-shot sound control uses `Audio Plan`
 - semantic consistency uses canonical `visual_world`
 
+## Reference Locking And Start/End Consistency
+
+When a reference image exists, Film-Kit now treats it as an immutable source, not a loose style hint.
+
+Default behavior across `single`, `multi`, `hybrid`, `hybrid-smart`, and `studio`:
+
+- face, hair, skin tone, body proportions, wardrobe, accessories, props, materials, patterns, and logos are treated as locked invariants
+- generated prompts should not re-describe identity details already visible in the reference
+- generated prompts should explicitly preserve the same person or object exactly as the reference
+- silent redesign, restyling, redressing, age shifting, or prop redesign is considered a failure
+
+Allowed movement inside a reference-locked shot is intentionally narrow:
+
+- pose can change only if it is explicitly declared
+- expression can change only if it is explicitly declared
+- camera or environment state can change only if it is explicitly declared
+- each start/end pair should carry only one major semantic delta unless the workflow marks a deliberate chain break or transition
+
+## Prompt Contract
+
+Start/end still prompts are now optimized for structure, not raw length.
+
+Image prompt contract:
+
+- `REFERENCE LOCK`
+- `Keep same`
+- `Change only`
+- short semantic anchor
+- targeted avoid line
+
+This means:
+
+- reference-anchored image prompts are intentionally shorter and more surgical
+- old image-quality heuristics based on minimum word counts are no longer the primary control surface
+- detailed prose is still expected for video prompts, where motion, timing, audio, and camera language need richer direction
+
+`hybrid` and `hybrid-smart` keep their compact still-prompt behavior. `single`, `multi`, and `studio` keep richer video prompt behavior for Veo, Kling, and Seedance.
+
+## Semantic QA And Render QA
+
+`visual_world` is the canonical continuity contract for semantic consistency and render review.
+
+Reference-aware runtime and reports now carry these concepts:
+
+- `identity_lock: "exact-reference"`
+- `shared_frame_invariants`
+- `allowed_change`
+- `reference_drift_forbidden: true`
+
+Generated specialist and render QA flows now explicitly fail on:
+
+- face, hair, skin, body, wardrobe, accessory, or prop drift
+- lens, angle, subject scale, light direction, or color-temperature drift
+- unexplained semantic changes between start and end frames
+- violations of declared `Keep same` invariants
+- changes outside the declared `Change only` budget
+
+Studio render QA also records render-level verdicts such as:
+
+- `reference_drift_status`
+- `start_end_contract_status`
+
 ## Seedance Runtime Coverage
 
 The package now carries Seedance-aware instructions in the generated runtime, not only in type definitions or CLI parsing.
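Reviewer sketch, not part of the package: the README's image prompt contract could be checked mechanically along these lines. The function name `checkImagePromptContract` is hypothetical, and the assumption that the "targeted avoid line" starts with `Avoid:` is borrowed from the shot templates later in this diff rather than from the README itself.

```javascript
// Hypothetical helper; nothing here is exported by @milenyumai/film-kit.
// The three markers are quoted from the README's "Image prompt contract" list.
const REQUIRED_MARKERS = ["REFERENCE LOCK", "Keep same", "Change only"];

function checkImagePromptContract(promptText) {
  // Collect every required marker that does not appear in the prompt.
  const missing = REQUIRED_MARKERS.filter((marker) => !promptText.includes(marker));
  // Assumption: the avoid line begins with "Avoid:" as in the shot templates.
  const hasAvoidLine = promptText
    .split("\n")
    .some((line) => line.trim().startsWith("Avoid:"));
  if (!hasAvoidLine) missing.push("Avoid:");
  return { ok: missing.length === 0, missing };
}

const compactPrompt = [
  "REFERENCE LOCK: same person exactly as the reference image.",
  "Keep same: wardrobe, lens feel, light direction.",
  "Change only: expression shifts from neutral to a faint smile.",
  "Quiet kitchen at dawn.",
  "Avoid: blurry, watermark, on-screen text.",
].join("\n");

console.log(checkImagePromptContract(compactPrompt));
console.log(checkImagePromptContract("A woman smiles in a kitchen."));
```

A substring check like this only confirms that the contract sections are present; it says nothing about whether their content respects the reference.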
package/build/lib/templates.js
CHANGED
@@ -112,6 +112,91 @@ model_reasoning_effort = "high"
 developer_instructions = ${tomlString(developerInstructions)}
 `;
 }
+function buildCodexImageSkill(workflowFile, outputDir) {
+return `---
+name: film-kit-codex-images
+description: Optional Codex-native still image phase for Film-Kit. Use after successful prompt generation when the user wants to create or iteratively edit accepted ILK FRAME and SON FRAME images inside Codex. Never auto-start; offer it after /generate and wait for user confirmation.
+---
+
+# Film-Kit Codex Images
+
+This skill is for **Codex only**. It adds a still-image phase after Film-Kit prompt generation.
+
+## Read First
+1. \`${workflowFile}\`
+2. \`.agent/model-profile.md\`
+3. the relevant shot files under \`${outputDir}/shots/\`
+4. \`${outputDir}/shot-plan.json\`
+5. report files under \`${outputDir}/reports/\` when the workflow requires them
+
+## Trigger Rules
+- Start only when the user explicitly asks for image generation/editing or accepts the \`/codex-images\` suggestion after \`/generate\`.
+- Do not auto-start this phase.
+- Do not use it for coverage stills or video generation in v1.
+
+## Core Behavior
+- Use Codex's built-in image generation/editing flow.
+- If a local image file must be edited, first bring it into the conversation context, then edit it.
+- Generate \`ILK FRAME\` from the accepted shot prompt plus references.
+- Generate \`SON FRAME\` by iterating from the accepted \`ILK FRAME\` while preserving \`REFERENCE LOCK\`, \`Keep same\`, and \`Change only\`.
+- For chained shots, default the next shot's \`ILK FRAME\` to the previously accepted \`SON FRAME\` unless the user explicitly requests a fresh start.
+- After every image, ask the user to accept it or request a targeted edit.
+
+## Output Contract
+- Save accepted stills under \`${outputDir}/codex-images/SHOTNN/\`
+- Canonical files: \`ilk-frame.png\`, \`son-frame.png\`
+- Iterations use sibling versioned filenames such as \`ilk-frame-v2.png\` and \`son-frame-v3.png\`
+- Keep \`${outputDir}/codex-images/manifest.json\` updated
+- Keep \`${outputDir}/reports/CODEX-IMAGE-REPORT.md\` updated
+`;
+}
+function buildCodexImagesWorkflow(outputDir) {
+return `---
+description: Optional Codex-native still image phase for Film-Kit single-agent output
+---
+
+# /codex-images - Film-Kit Codex Still Phase
+
+## Preconditions
+- \`/generate\` completed successfully
+- shot files already exist under \`${outputDir}/shots/\`
+- this phase is opt-in; start only after the user confirms
+
+## Reference Priority
+1. images already present in the active Codex conversation
+2. user-supplied local image paths
+3. repo-local \`refs/\` folders when present
+
+## Output Contract
+\`\`\`text
+${outputDir}/codex-images/
+├── SHOTNN/
+│ ├── ilk-frame.png
+│ ├── ilk-frame-v2.png
+│ ├── son-frame.png
+│ └── son-frame-v3.png
+├── manifest.json
+└── ../reports/CODEX-IMAGE-REPORT.md
+\`\`\`
+
+## Flow
+1. Ask which shot range to process. Default to sequential order.
+2. For a fresh shot, generate \`ILK FRAME\` from the accepted prompt plus references.
+3. Show the result and ask the user to accept it or request a targeted edit.
+4. Save the accepted start image as \`ilk-frame.png\`; keep previous attempts as versioned siblings.
+5. Generate \`SON FRAME\` by editing from the accepted \`ILK FRAME\` while preserving \`REFERENCE LOCK\`, \`Keep same\`, and \`Change only\`.
+6. Show the result and ask the user to accept it or request a targeted edit.
+7. Save the accepted end image as \`son-frame.png\`; keep previous attempts as versioned siblings.
+8. If the next shot is chained, default its \`ILK FRAME\` to the accepted previous \`SON FRAME\` unless the user explicitly asks for a fresh start.
+9. Update \`${outputDir}/codex-images/manifest.json\` and \`${outputDir}/reports/CODEX-IMAGE-REPORT.md\` after each accepted image.
+
+## Hard Rules
+- Never auto-start this workflow.
+- V1 covers only main-shot \`ILK FRAME\` and \`SON FRAME\`.
+- Do not generate coverage stills or video here.
+- Preserve chain continuity unless the user explicitly requests a reset.
+`;
+}
 export function buildProjectFiles(config) {
 const files = {};
 // Runtime model profile file is generated from selected model options.
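Reviewer sketch, not part of the package: the output contract in the two templates above implies a simple path scheme for accepted stills and their versioned siblings. The helper below is hypothetical. It assumes that `SHOTNN` means a zero-padded two-digit shot number and that omitting the version yields the canonical filename, neither of which the diff states explicitly.

```javascript
// Hypothetical helper; the package itself only describes these paths in prose.
function codexImagePath(outputDir, shotNumber, frame, version) {
  if (frame !== "ilk" && frame !== "son") {
    throw new Error(`unknown frame kind: ${frame}`);
  }
  // Assumption: "SHOTNN" is a zero-padded two-digit shot number.
  const shotDir = `SHOT${String(shotNumber).padStart(2, "0")}`;
  // No version means the canonical file; otherwise a "-vN" sibling.
  const suffix = version === undefined ? "" : `-v${version}`;
  return `${outputDir}/codex-images/${shotDir}/${frame}-frame${suffix}.png`;
}

console.log(codexImagePath("output", 3, "ilk"));
console.log(codexImagePath("output", 3, "son", 3));
```

Keeping the canonical name separate from the versioned siblings matches the workflow's wording that earlier attempts stay alongside the accepted image.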
@@ -143,8 +228,11 @@ export function buildProjectFiles(config) {
 files[".codex/config.toml"] = buildCodexConfig();
 files[".codex/agents/prompt-engineer.toml"] = buildCodexAgentToml("prompt_engineer", "Film-Kit prompt engineer for single-agent cinematic shot generation, repair, and delivery QA.", `Read AGENTS.md first, then .agent/model-profile.md, .agent/MASTER.md, .agent/VOICE-DESIGN.md, .agent/agents/prompt-engineer.md, and the requested .agent/workflows/*.md file before acting.
 Use .agents/skills for Codex App skill discovery; .agent remains the Film-Kit legacy runtime source.
+When /generate finishes successfully inside Codex, always offer /codex-images as an optional still-image phase. Never auto-start it.
 Keep outputs under ${config.outputDir} unless the user explicitly changes the output contract.
 For worktree runs, if node_modules or build outputs are missing and package.json exists, bootstrap with npm install and npm run build before running package commands.`);
+files[".agents/skills/film-kit-codex-images/SKILL.md"] = buildCodexImageSkill(".agent/workflows/codex-images.md", config.outputDir);
+files[".agent/workflows/codex-images.md"] = buildCodexImagesWorkflow(config.outputDir);
 if (config.platforms.includes("antigravity")) {
 files[".agents/skills/shotforge-generate/SKILL.md"] = buildAntigravitySkill(config);
 }
@@ -190,11 +278,13 @@ All rules, skills, and workflows are located under \`.agent/\`.
 | \`/safety-check\` | \`.agent/workflows/safety-check.md\` |
 | \`/recover\` | \`.agent/workflows/recover.md\` |
 | \`/finish\` | \`.agent/workflows/finish.md\` |
+| \`/codex-images\` | \`.agent/workflows/codex-images.md\` |
 
 ## OpenAI Codex App
 - Open the Git root as the Codex project root; Codex discovers \`AGENTS.md\`, \`.codex/config.toml\`, and \`.agents/skills/\` from that root.
 - Repo-scoped Codex config lives in \`.codex/config.toml\`; Codex may ignore it until the project is trusted in the app.
 - Codex skills live under \`.agents/skills/*/SKILL.md\`; \`.agent/skills\` remains the legacy Film-Kit runtime mirror for Claude/Cursor/Copilot/Antigravity.
+- Codex-native still phase lives in \`.agents/skills/film-kit-codex-images/SKILL.md\` and starts with \`/codex-images\` only after user confirmation.
 - Codex custom agents live in \`.codex/agents/*.toml\` and point back to the canonical \`.agent/agents/*.md\` role files.
 - Full-auto Codex profile: \`approval_policy = "never"\`, \`sandbox_mode = "danger-full-access"\`, \`[agents] max_threads = 6\`, \`max_depth = 1\`.
 - Worktree bootstrap: if \`node_modules\` or build output is missing, run \`npm install\` and \`npm run build\` before package commands.
@@ -208,6 +298,7 @@ When the user asks \`/generate\`, convert the scenario into:
 - \`${config.outputDir}/reports/SEMANTIC-REPORT.md\` — Semantic consistency gate result
 - \`${config.outputDir}/reports/DELIVERY-REPORT.md\` — Delivery gate result
 - \`${config.outputDir}/_index.md\` — Shot list with chain & status tracking
+- Optional Codex still phase: \`${config.outputDir}/codex-images/manifest.json\` + \`${config.outputDir}/reports/CODEX-IMAGE-REPORT.md\`
 
 ## Input
 - Preferred scenario file: \`${config.scenarioHint}\`
@@ -218,11 +309,11 @@ When the user asks \`/generate\`, convert the scenario into:
 ## Output Contract
 Each \`SHOTNN.md\` is a **single file** containing ALL shot details:
 - 🔗 Chain status (FIRST / CHAINED / CHAIN BREAK)
-- İLK FRAME (start frame image prompt —
-- SON FRAME (end frame image prompt —
+- İLK FRAME (start frame image prompt — compact reference-first still contract)
+- SON FRAME (end frame image prompt — same-visual-universe + one-delta still contract)
 - AUDIO PLAN (machine-readable shot audio contract)
-- VİDEO (video prompt with audio direction — min
-- COVERAGE SHOTS (2-3 coverage shots within same file —
+- VİDEO (video prompt with audio direction — min 120 words)
+- COVERAGE SHOTS (2-3 coverage shots within same file — compact stills + detailed coverage video when needed)
 - 🇹🇷 Turkish summary for each section
 - Avoid line on EVERY prompt
 
@@ -268,7 +359,7 @@ Read .agent/VOICE-DESIGN.md when dialogue, narrator VO, or reusable speaker iden
 5. SLOW BURN: 8s default duration. Split actions into multiple shots. "Ilmek ilmek islemek."
 6. Music: NONE by default. User must explicitly request.
 7. EVERY prompt must have an Avoid line. No exceptions.
-8. Coverage shots mandatory (2-3 per main shot,
+8. Coverage shots mandatory (2-3 per main shot, included in same file; stills stay compact, coverage videos stay detailed).
 9. Frame chaining: Last frame of SHOT[N] = First frame of SHOT[N+1].
 10. Semantic consistency: \`${config.outputDir}/shot-plan.json\` must include \`visual_world\`; prompts must align camera, named movement strategy, light/shadow vector, scale, reflections, physics, anatomy risk, and contextual logic.
 11. ILK/İLK FRAME section must contain a code block even for chained shots.
@@ -334,7 +425,7 @@ All skills at: \`.agent/skills/[name]/SKILL.md\`
 - AUTO-SAFETY: Proactively reframe sensitive content
 - Frame chaining: Last frame SHOT[N] = First frame SHOT[N+1]
 - Chained ILK/İLK FRAME code block contains only \`Use SHOT[prev]_END as exact first frame\`; any new visual prompt requires CHAIN BREAK
-- Coverage: 2-3 sub-shots per main shot (
+- Coverage: 2-3 sub-shots per main shot (same file; compact stills, detailed coverage videos)
 - Avoid line: MANDATORY on every prompt
 - Music: NONE by default
 - Duration: 8s default, slow burn pacing
@@ -406,7 +497,7 @@ Before generating ANY prompts, read skills from \`.agent/skills/\`:
 9. **Voice Design:** keep project-level \`voiceCast\` in \`${config.outputDir}/shot-plan.json\` and per-shot \`Audio Plan\` in each VIDEO section.
 10. **ILK/İLK FRAME:** Always include a fenced code block, even when chained.
 11. **Chained ILK/İLK FRAME:** code block contains only \`Use SHOT[prev]_END as exact first frame\`; any new visual prompt requires CHAIN BREAK.
-12. **
+12. **Image Prompt Contract:** reference-anchored ILK/SON use \`REFERENCE LOCK\` + \`Keep same\` + \`Change only\`; VIDEO >= 120; coverage video >= 70.
 13. **Specificity Floor:** lens/framing, lighting, and foreground/midground/background action are mandatory.
 14. **Spatial Realism Floor:** eyeline target, plane map, shared light source, and contact/depth cues are mandatory when relational staging matters.
 15. **Semantic Consistency Floor:** \`visual_world\`, perspective/geometry, shadow vector, scale map, reflections, gravity/contact physics, anatomy risk, foreground/background coherence, contextual contradictions, and targeted semantic avoid terms are mandatory.
@@ -420,6 +511,7 @@ Before generating ANY prompts, read skills from \`.agent/skills/\`:
 | \`/safety-check\` | \`.agent/workflows/safety-check.md\` |
 | \`/recover\` | \`.agent/workflows/recover.md\` |
 | \`/finish\` | \`.agent/workflows/finish.md\` |
+| \`/codex-images\` | \`.agent/workflows/codex-images.md\` |
 
 ## Claude Code Ops
 - Use \`/memory\` to verify which CLAUDE and rule files are loaded.
@@ -450,7 +542,7 @@ This workspace keeps high-level policy in \`CLAUDE.md\` and operational detail i
 - Maintain \`${config.outputDir}/shot-plan.json\` top-level \`visual_world\`
 - Keep \`Audio Plan\` blocks aligned to \`voiceCast\`
 - Keep \`ILK/İLK FRAME\` in a fenced code block even when chained
--
+- Image prompt contract, specificity floor, and semantic floor are hard gates, not suggestions
 - Semantic consistency floor is a hard gate: camera/lens/camera-movement/light/shadow/scale/reflection/physics/anatomy/context must align to \`visual_world\`
 - Apply \`.agent/skills/spatial-blocking/SKILL.md\` whenever eyeline, compositing, or depth realism is critical
 
@@ -482,7 +574,7 @@ Use the Film-Kit core runtime.
 - maintain top-level \`visual_world\` inside \`${config.outputDir}/shot-plan.json\`
 - keep \`Audio Plan\` blocks valid against \`voiceCast\`
 - enforce AUTO-ANONYMOUS, AUTO-SAFETY, chaining, and coverage contracts
-- enforce
+- enforce image prompt contract: compact reference-first ILK/SON, VIDEO >= 120, coverage video >= 70
 - enforce specificity floor: lens/framing, lighting, and foreground/midground/background action
 - enforce spatial realism: explicit eyeline target, plane map, shared light source, and contact/depth cues when needed
 - enforce semantic consistency: \`visual_world\`, perspective/geometry, shadow vector, scale map, reflection handling, physics/anatomy risk, foreground/background coherence, contextual contradictions, and scene-specific avoid terms
@@ -526,7 +618,7 @@ If using the native Claude subagent, read \`.claude/agents/prompt-engineer.md\`
 - Add machine-readable \`Audio Plan\` before every VIDEO section
 - Keep İLK FRAME as fenced code block even when chained
 - If chained, keep İLK FRAME code block to exact reuse text only; new visual prompt means CHAIN BREAK
-- Enforce
+- Enforce image prompt contract: compact reference-first ILK/SON, VIDEO >= 120, coverage video >= 70
 - Enforce specificity floor: lens/framing + lighting + foreground/midground/background action
 - Enforce spatial realism floor: eyeline target + plane map + shared light source + contact/depth cues when applicable
 - Enforce semantic consistency floor: perspective/geometry + shadow vector + scale map + reflections + gravity/contact physics + anatomy risk + contextual contradiction check
@@ -607,7 +699,7 @@ FIRST SHOT / CHAINED from SHOT[prev]_END / CHAIN BREAK - Reason
 > Any new visual prompt in a chained ILK FRAME section requires CHAIN BREAK.
 
 \\\`\\\`\\\`
-[Image prompt —
+[Image prompt — compact still prompt. If refs exist: REFERENCE LOCK + Keep same + Change only + semantic anchor + Avoid]
 Avoid: blurry, low-res, noise, distorted faces, bad anatomy, extra limbs/fingers,
 plastic skin, waxy skin, on-screen text, watermark, logo, cartoon style, CGI look.
 \\\`\\\`\\\`
@@ -615,7 +707,7 @@ plastic skin, waxy skin, on-screen text, watermark, logo, cartoon style, CGI loo
 ### SON FRAME (SHOTNN_END)
 
 \\\`\\\`\\\`
-[Image prompt —
+[Image prompt — compact still prompt. Preserve same visual universe and declare one allowed change]
 Avoid: blurry, low-res, noise, distorted faces, bad anatomy, extra limbs/fingers,
 plastic skin, waxy skin, on-screen text, watermark, logo, cartoon style, CGI look.
 \\\`\\\`\\\`
@@ -629,7 +721,7 @@ plastic skin, waxy skin, on-screen text, watermark, logo, cartoon style, CGI loo
 ### VİDEO
 
 \\\`\\\`\\\`
-[Video prompt — min
+[Video prompt — min 120 words, following prompt flow order]
 
 Audio direction:
 - Language: [TURKISH/ENGLISH/etc.]
@@ -654,11 +746,11 @@ inconsistent lighting, unnatural motion, on-screen text, watermark.
 🇹🇷 [Türkçe coverage özeti]
 
 \\\`\\\`\\\`
-[Coverage IMAGE prompt —
+[Coverage IMAGE prompt — compact standalone still contract (not chained)]
 Avoid: blurry, low-res, noise, distorted faces, bad anatomy, on-screen text, watermark.
 \\\`\\\`\\\`
 \\\`\\\`\\\`
-[Coverage VIDEO prompt — min
+[Coverage VIDEO prompt — min 70 words, with full audio direction block]
 
 Audio direction:
 - Language: [LANGUAGE]
@@ -812,7 +904,7 @@ Before generating ANY prompts, read skills from \`.agent/skills/\`:
 - **AUTO-SAFETY:** Proactively reframe sensitive content
 - **Frame Chaining:** Last frame of SHOT[N] = First frame of SHOT[N+1]
 - **Chain Hardening:** chained ILK/İLK FRAME code block contains only \`Use SHOT[prev]_END as exact first frame\`
-- **Coverage:** 2-3 sub-shots per main shot (in same file,
+- **Coverage:** 2-3 sub-shots per main shot (in same file; stills compact, coverage videos detailed)
 - **Spatial Realism:** eyeline targets, shared light, depth scale, and anti-cutout staging must agree when subjects share frame
 - **Semantic Consistency:** \`visual_world\` controls perspective/geometry, shadow vector, scale map, reflections, physics, anatomy risk, background coherence, and contextual contradictions
 - **Avoid Line:** MANDATORY on every prompt
@@ -888,6 +980,7 @@ Before generating ANY prompts, read these skills:
 | /safety-check | \`.agent/workflows/safety-check.md\` — Validate before delivery |
 | /recover | \`.agent/workflows/recover.md\` — Recover failed gates |
 | /finish | \`.agent/workflows/finish.md\` — Complete project, create summary |
+| /codex-images | \`.agent/workflows/codex-images.md\` — Optional Codex-native still phase |
 
 ## Input
 - Scenario source: \`${config.scenarioHint}\` (or selected file in editor)
@@ -909,7 +1002,7 @@ Before generating ANY prompts, read these skills:
 2. **Name Policy:** Dialogue naming follows \`${config.outputDir}/shot-plan.json\` policy
 3. **AUTO-SAFETY:** Proactively reframe sensitive content
 4. **Frame Chaining:** Last frame of SHOT[N] becomes first frame of SHOT[N+1]
-5. **Coverage Mandatory:** 2-3 sub-shots per main shot (in same file,
+5. **Coverage Mandatory:** 2-3 sub-shots per main shot (in same file; stills compact, coverage videos detailed)
 6. **Avoid Line:** MANDATORY on every prompt (image + video + coverage)
 7. **Voice Design:** keep \`voiceCast\` in \`${config.outputDir}/shot-plan.json\` and \`Audio Plan\` in every speaking VIDEO section
 8. **Music: NONE** by default
@@ -923,10 +1016,11 @@ Before generating ANY prompts, read these skills:
 
 ## Quality Floor (Hard Gate)
 Reject and regenerate any shot that fails:
-- ILK FRAME
-- SON FRAME
+- reference-anchored ILK FRAME missing \`REFERENCE LOCK\`, \`Keep same\`, or \`Change only\`
+- SON FRAME drifts from the same visual universe or changes more than one major beat
 - VIDEO < 120 words
-- any coverage prompt < 70 words
+- any coverage video prompt < 70 words
+- reference-visible face/hair/skin/body/wardrobe/accessories/prop design drift
 - missing explicit lens/framing/camera movement details
 - missing explicit lighting direction/intensity/atmosphere details
 - missing explicit foreground/midground/background action details
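Reviewer sketch, not part of the package: the two numeric floors that survive in the Quality Floor list (VIDEO at 120 words, coverage video at 70) are easy to check in isolation. The names `countWords` and `checkWordFloor` are hypothetical, and counting whitespace-separated tokens is an assumption, since the diff does not say how Film-Kit counts words.

```javascript
// Floors quoted from the Quality Floor list; everything else is illustrative.
const WORD_FLOORS = { video: 120, coverageVideo: 70 };

// Assumption: a "word" is any run of non-whitespace characters.
function countWords(text) {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

function checkWordFloor(kind, promptText) {
  const floor = WORD_FLOORS[kind];
  if (floor === undefined) {
    throw new Error(`no word floor defined for: ${kind}`);
  }
  const words = countWords(promptText);
  return { kind, words, floor, pass: words >= floor };
}

// A deliberately short prompt fails the VIDEO floor.
console.log(checkWordFloor("video", "Slow push-in on the doorway as rain starts."));
```

Still prompts are intentionally left out of the table, which mirrors the change in this hunk: ILK and SON frames are now gated on contract structure rather than length.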
@@ -977,13 +1071,15 @@ ${isKling
 - Visual prompts in English; dialogue kept in source language.
 - Main chain continuity: SHOT[N]_END -> SHOT[N+1]_START.
 - Default duration: 8s slow burn.
-- Prompt rewriter active — longer prompts (80-120 words) give more control.
+- Prompt rewriter active — longer prompts (80-120 words) give more control for VIDEO prompts; reference-anchored image prompts should stay compact and delta-only.
 
 ### Veo Image-to-Video
 When generating video from a reference image:
 - The model extracts lighting and style directly from the image
 - Do NOT repeat visual info already in the image (e.g., "warm light", "35mm film")
+- Keep the same identity, wardrobe, accessories, prop design, lens feel, light direction, and subject scale
 - Focus only on the **change**: motion, action, and audio
+- Use the same seed for same-scene reference variants and start/end pairs; change seed only for structural repair or \`CHAIN BREAK\`
 
 ### Veo Re-Take Strategy
 | Error Type | Fix | Seed |
@@ -992,7 +1088,7 @@ When generating video from a reference image:
 | Extra limbs | Add "extra limbs, extra fingers" to Avoid | New seed |
 | Motion inconsistency | Simplify movement, fewer actions | Same seed |
 | Light jumps | Specify light source more precisely | Same seed |
-| Character inconsistent | Strengthen reference lock,
+| Character inconsistent | Strengthen immutable reference lock, remove face/body/wardrobe redesign text | Same locked seed |
 | Background drift | Add background details | Same seed |
 | Robotic motion | Add "organic natural movement" + micro-behavior | New seed |
 
@@ -1038,6 +1134,9 @@ The more aligned these are, the cleaner the transition:
 - Light direction & white balance
 - Overall perspective
 
+**Keep same:** identity, face, hair, skin tone, body proportions, wardrobe, accessories, prop design, lens family, camera height, subject scale, light direction, color temperature, and location shell.
+**Change only:** one major semantic beat such as expression, hand position, body lean, prop position, or environmental state.
+
 > Large angle differences or mismatched lens feel forces the model to "guess" perspective → warp increases.
 
 **Human face/hands rules:**
@@ -42,10 +42,11 @@ You are a senior Technical Prompt Engineer specialized in model-aware cinematic
|
|
|
42
42
|
12. **An Avoid line is MANDATORY in EVERY prompt (❗ including coverage)**
|
|
43
43
|
13. **🇹🇷 Turkish summary: add a one-sentence Turkish summary for every shot and coverage**
|
|
44
44
|
14. Emotional arc: apply a tension curve (1-5) across the whole film
|
|
45
|
-
15. **Hard
|
|
45
|
+
15. **Hard prompt floor:** reference-anchored ILK/SON use `REFERENCE LOCK` + `Keep same` + `Change only`; VIDEO >=120; coverage video >=70
|
|
46
46
|
16. **Hard specificity floor:** lens/framing, lighting, and FG/MG/BG action details are mandatory in every prompt
|
|
47
47
|
17. **Spatial realism floor:** eyeline target, plane map, shared light source, contact/weight cues, and full-scale depth logic are mandatory whenever they apply
|
|
48
|
-
18. **Semantic consistency floor:** `shot-plan.json.visual_world` must be canonical; perspective/geometry, shadow vector, scale map, reflection handling, gravity/contact physics, anatomy risk, foreground/background coherence
|
|
48
|
+
18. **Semantic consistency floor:** `shot-plan.json.visual_world` must be canonical; perspective/geometry, shadow vector, scale map, reflection handling, gravity/contact physics, anatomy risk, foreground/background coherence, contextual contradiction, immutable reference lock, and allowed-change budget must pass in every shot
|
|
49
|
+
19. **Codex still phase rule:** if `/generate` finished successfully and the work is running inside Codex, suggest the `/codex-images` option; never start it automatically
|
|
49
50
|
|
|
50
51
|
---
|
|
51
52
|
|
|
@@ -271,7 +272,7 @@ Before outputting ANY shot:
|
|
|
271
272
|
- [ ] Quality floor passes? (ILK>=80, SON>=80, VIDEO>=120, coverage>=70)
|
|
272
273
|
- [ ] Specificity floor passes? (lens + lighting + FG/MG/BG action)
|
|
273
274
|
- [ ] Spatial realism passes? (eyeline target + plane map + shared light + contact/depth cues)
|
|
274
|
-
- [ ] Semantic consistency passes? (`visual_world` fields + perspective/geometry + shadow vector + scale map + reflection/physics/anatomy/context checks)
|
|
275
|
+
- [ ] Semantic consistency passes? (`visual_world` fields + perspective/geometry + shadow vector + scale map + reflection/physics/anatomy/context + reference drift + allowed-change checks)
|
|
275
276
|
- [ ] Model Control block exists? (`Model`, `Preset`, `CFG`, `Transition Mode`)
|
|
276
277
|
|
|
277
278
|
### 3. Kling-Specific Gates (when model is kling-3.0)
|
|
@@ -360,7 +360,16 @@ When preparing Start and End frames for Kling 3.0, both frames MUST share:
|
|
|
360
360
|
| **Light direction & WB** | Same source, same color temperature |
|
|
361
361
|
| **Overall perspective** | Same spatial depth and vanishing points |
|
|
362
362
|
|
|
363
|
-
> Mismatched elements force Kling to "guess" → warp, melting, and artifacts increase.
|
|
363
|
+
> Mismatched elements force Kling to "guess" → warp, melting, and artifacts increase.
|
|
364
|
+
|
|
365
|
+
### Keep Same / Change Only Contract
|
|
366
|
+
|
|
367
|
+
For a reference-locked start/end pair:
|
|
368
|
+
|
|
369
|
+
- `Keep same:` identity, face, hair, skin tone, body proportions, wardrobe, accessories, prop design, lens family, camera height, subject scale, light direction, color temperature, and location shell
|
|
370
|
+
- `Change only:` one major semantic beat such as expression, hand position, body lean, prop position, or environmental state
|
|
371
|
+
- if two or more major semantic changes are needed, declare `CHAIN BREAK` or split into additional shots
|
|
372
|
+
- do not redesign anything already visible in the reference image
|
|
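The contract above reduces to a small budget check: one major semantic change fits a start/end pair, two or more call for `CHAIN BREAK` or extra shots. A minimal sketch under that reading; `checkChangeBudget` is a hypothetical helper, and deciding what counts as a "major" change is left to the caller.

```javascript
// Hypothetical helper illustrating the "Change only" budget from the contract.
// The caller supplies the list of major semantic changes between start and end.
function checkChangeBudget(majorChanges) {
  // Ignore blank entries so stray empty strings do not consume the budget.
  const changes = majorChanges.map((c) => c.trim()).filter((c) => c.length > 0);
  if (changes.length >= 2) {
    return {
      ok: false,
      action: "declare CHAIN BREAK or split into additional shots",
      changes,
    };
  }
  return { ok: true, action: "keep as a single start/end pair", changes };
}
```

For instance, `checkChangeBudget(["expression", "prop position"])` reports `ok: false`, whereas a single `"body lean"` entry stays within budget.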
364
373
|
|
|
365
374
|
### Face & Hands Stability for Start+End
|
|
366
375
|
|
|
@@ -19,7 +19,7 @@ description: Image and video prompt templates, camera movement vocabulary, stabi
|
|
|
19
19
|
| **Positive Framing** | Describe what SHOULD happen (negative prompts less effective) |
|
|
20
20
|
| **Specific Action Verbs** | "strides" > "walks", "glides" > "moves" |
|
|
21
21
|
| **Professional Terms** | Use industry-standard cinematography language |
|
|
22
|
-
| **Ideal Length** | 80-120 words
|
|
22
|
+
| **Ideal Length** | Video prompts: 80-120 words. Reference-anchored image prompts: compact 35-70 words; hybrid still pairs can go 20-50 when anchors are strong |
|
|
23
23
|
| **Reference Commands First** | Always start with reference image instructions |
|
|
24
24
|
| **Safety First** | Always consider filter implications |
|
|
25
25
|
| **Short Sentence Rule** | Split long sentences across shots |
|
|
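The revised **Ideal Length** row gives three word ranges. The sketch below checks a prompt against them, assuming whitespace-separated tokens are an acceptable word count; the kind labels (`video`, `reference-image`, `hybrid-still-pair`) and the function names are made up for this example and are not film-kit identifiers.

```javascript
// Word ranges taken from the 2.1.0 "Ideal Length" row; the keys are
// hypothetical labels chosen for this sketch.
const LENGTH_RANGES = new Map([
  ["video", [80, 120]],
  ["reference-image", [35, 70]],
  ["hybrid-still-pair", [20, 50]],
]);

// Counts whitespace-separated tokens; an empty or blank prompt counts as 0.
function countWords(prompt) {
  const trimmed = prompt.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Reports the word count and whether it falls inside the range for `kind`.
function checkLength(kind, prompt) {
  const range = LENGTH_RANGES.get(kind);
  if (!range) throw new Error(`unknown prompt kind: ${kind}`);
  const [min, max] = range;
  const words = countWords(prompt);
  return { words, min, max, ok: words >= min && words <= max };
}
```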
@@ -36,28 +36,26 @@ description: Image and video prompt templates, camera movement vocabulary, stabi
|
|
|
36
36
|
|
|
37
37
|
| Order | Content | Example |
|
|
38
38
|
|------|--------|-------|
|
|
39
|
-
| **1** |
|
|
40
|
-
| **2** |
|
|
41
|
-
| **3** |
|
|
42
|
-
| **4** | Camera + Lens | "
|
|
43
|
-
| **5** |
|
|
44
|
-
| **6** | Audio direction | Video prompts only |
|
|
45
|
-
| **7** | Avoid line | Mandatory in EVERY prompt |
|
|
39
|
+
| **1** | Reference lock / frame intent | "Same person exactly as reference. Do not modify." |
|
|
40
|
+
| **2** | Keep same | Identity, wardrobe, prop design, lens feel, light direction, subject scale |
|
|
41
|
+
| **3** | Change only | One meaningful change: expression, hand position, action beat |
|
|
42
|
+
| **4** | Camera + Lens | "35mm lens feel, eye-level medium shot" |
|
|
43
|
+
| **5** | Semantic anchor | Light direction, shadow vector, scale/depth, location shell |
|
|
44
|
+
| **6** | Audio direction | Video prompts only |
|
|
45
|
+
| **7** | Avoid line | Mandatory in EVERY prompt |
|
|
46
46
|
|
|
47
47
|
> **First sentence = the one-sentence answer to "What is this shot?"**
|
|
48
48
|
|
|
49
49
|
### Standard Template
|
|
50
50
|
|
|
51
51
|
```
|
|
52
|
-
[REFERENCE LOCK
|
|
53
|
-
|
|
54
|
-
|
|
55
|
-
[
|
|
56
|
-
|
|
57
|
-
|
|
58
|
-
[
|
|
59
|
-
|
|
60
|
-
[Full avoid line]
|
|
52
|
+
[REFERENCE LOCK] Same person/object exactly as reference. Do not modify, enhance, reinterpret, restyle, redress, or redesign anything visible in the reference.
|
|
53
|
+
Keep same: identity, wardrobe/accessories, prop design, lens feel, camera height, subject scale, light direction, color temperature, location shell.
|
|
54
|
+
Change only: [one intentional beat].
|
|
55
|
+
[Frame intent + composition + brief action].
|
|
56
|
+
Semantic consistency: [shadow direction], [scale map], [reflection/physics note].
|
|
57
|
+
[Safety injection if needed]
|
|
58
|
+
Avoid: [targeted failures including reference drift / outfit drift / prop drift / lens drift / light-direction drift]
|
|
61
59
|
```
|
|
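The new Standard Template is a fixed line order, so it can be assembled mechanically from a handful of fields. A rough sketch, assuming the caller supplies each slot as plain text; `buildReferencePrompt` and its field names are hypothetical and only mirror the bracketed placeholders in the template.

```javascript
// Hypothetical builder that mirrors the line order of the 2.1.0 template.
// Field names are invented for this sketch; they are not a film-kit API.
function buildReferencePrompt({
  subject, // e.g. "person" or "object"
  keepSame, // array of invariants
  changeOnly, // the single intentional beat
  frameIntent, // frame intent + composition + brief action
  semantic, // array: shadow direction, scale map, reflection/physics note
  safety, // optional safety injection line
  avoid, // array of targeted failures
}) {
  const lines = [
    `[REFERENCE LOCK] Same ${subject} exactly as reference. Do not modify, enhance, reinterpret, restyle, redress, or redesign anything visible in the reference.`,
    `Keep same: ${keepSame.join(", ")}.`,
    `Change only: ${changeOnly}.`,
    `${frameIntent}.`,
    `Semantic consistency: ${semantic.join(", ")}.`,
  ];
  if (safety) lines.push(safety); // included only when needed
  lines.push(`Avoid: ${avoid.join(", ")}`);
  return lines.join("\n");
}
```

Keeping the invariants and the avoid list as arrays makes it easier to reuse the same `Keep same` set across a start/end pair and vary only `changeOnly`.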
62
60
|
|
|
63
61
|
### Semantic Consistency Layer
|
|
@@ -78,19 +76,27 @@ Use SHOT[prev]_END as exact first frame.
|
|
|
78
76
|
```
|
|
79
77
|
|
|
80
78
|
Any new start-frame visual prompt must be declared as `CHAIN BREAK - [reason]`.
|
|
79
|
+
|
|
80
|
+
### Reference-Anchored Image Prompt Contract
|
|
81
|
+
|
|
82
|
+
When a reference exists:
|
|
83
|
+
|
|
84
|
+
- do not restate face, hair, skin tone, body build, wardrobe, accessories, or prop design as freeform descriptive prose
|
|
85
|
+
- use `Keep same:` for invariants and `Change only:` for one meaningful delta
|
|
86
|
+
- do not re-explain style or lighting already visible in the reference image
|
|
87
|
+
- use compact prompts; density matters more than length
|
|
88
|
+
- if more than one major semantic change is needed, split the beat or declare `CHAIN BREAK`
|
|
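Parts of this contract are structural and can be checked on the finished prompt text. The sketch below is a hypothetical linter (`lintReferencePrompt` is not a film-kit function) that verifies only what is mechanically visible: the reference lock comes first, and `Keep same:`, `Change only:`, and `Avoid:` each appear exactly once. It cannot judge whether the prompt restates visual details already present in the reference.

```javascript
// Hypothetical structural lint for a reference-anchored image prompt.
// Returns a list of problems; an empty list means the structure looks right.
function lintReferencePrompt(prompt) {
  const lines = prompt
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const problems = [];

  // Reference instructions are expected at the very start of the prompt.
  if (lines.length === 0 || !lines[0].startsWith("[REFERENCE LOCK]")) {
    problems.push("first line should start with [REFERENCE LOCK]");
  }

  // Each labeled line should appear exactly once.
  for (const label of ["Keep same:", "Change only:", "Avoid:"]) {
    const count = lines.filter((line) => line.startsWith(label)).length;
    if (count !== 1) {
      problems.push(`expected exactly one "${label}" line, found ${count}`);
    }
  }
  return problems;
}
```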
81
89
|
|
|
82
90
|
### Example: Character in Environment
|
|
83
91
|
|
|
84
|
-
```
|
|
85
|
-
[REFERENCE LOCK]
|
|
86
|
-
|
|
87
|
-
|
|
88
|
-
|
|
89
|
-
|
|
90
|
-
|
|
91
|
-
|
|
92
|
-
Avoid: blurry, low-res, noise, distorted faces, bad anatomy, extra limbs/fingers, plastic skin, waxy skin, airbrushed skin, on-screen text, watermark, logo, cartoon style, CGI look, different face than reference.
|
|
93
|
-
```
|
|
92
|
+
```
|
|
93
|
+
[REFERENCE LOCK] Same soldier exactly as the uploaded reference. Do not modify face, hair, skin, body, uniform, fez, or cannon design.
|
|
94
|
+
Keep same: identity, uniform, fez, cannon design, medium-shot 35mm lens feel, eye-level camera, late-afternoon side light, smoke-filled fortification layout.
|
|
95
|
+
Change only: the soldier's expression shifts to determined while one hand rests on the cannon barrel.
|
|
96
|
+
Cinematic still frame for a reaction-hold handoff.
|
|
97
|
+
Semantic consistency: long shadows fall screen-right; scale map keeps soldier foreground, cannon midground, smoke depth behind.
|
|
98
|
+
Avoid: distorted face, outfit drift, prop drift, lens drift, light-direction drift, bad anatomy, extra fingers, on-screen text, watermark.
|
|
99
|
+
```
|
|
94
100
|
|
|
95
101
|
---
|
|
96
102
|
|