@milenyumai/film-kit 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,307 @@
+ ---
+ name: audio-design
+ description: Sound design rules for Veo 3.1. Voice realism, environmental sounds, SFX, ambience, and audio direction block formatting. Includes anti-synthetic audio guidelines.
+ ---
+
+ # Audio Design System
+
+ > **Philosophy:** Sound is half the experience. Audio must be as real as the visuals.
+ > **Core Principle:** Every sound must be recorded-quality. No synthetic giveaways.
+
+ ---
+
+ ## 🎯 Audio Realism Baseline
+
+ ### Human Voices (Ultra Realistic)
+
+ | Requirement | Description |
+ |-------------|-------------|
+ | **Natural delivery** | Sound like real human actors |
+ | **Breathing** | Natural breaths between phrases |
+ | **Micro-pauses** | Authentic hesitation, thinking |
+ | **Vocal imperfections** | Slight variations, not robotic |
+ | **Emotional authenticity** | Genuine feeling, not performed reading |
+
+ ### ❌ Voice Artifacts to Avoid
+
+ - Robotic TTS quality
+ - Monotone delivery
+ - Machine-like precision
+ - Flat affect
+ - Unnatural pacing
+ - Over-articulation
+
+ ---
+
+ ## Environmental Sounds
+
+ ### Acoustic Space Matching
+
+ | Environment | Acoustic Properties |
+ |-------------|---------------------|
+ | **Small room** | Tight reverb, close sounds |
+ | **Large hall** | Natural echo, distant ambience |
+ | **Outdoor open** | Minimal reverb, wind, distant sounds |
+ | **Enclosed bunker** | Muffled, resonant, claustrophobic |
+ | **Ship deck** | Wind, waves, metallic resonance |
+ | **Ship interior** | Engine hum, creaking, contained |
+
+ ### Surface-Matched Footsteps
+
+ ```
+ Wood: Creaking, hollow resonance
+ Stone: Sharp, echoing impact
+ Metal: Clanging, ringing
+ Gravel: Crunching, shifting
+ Dirt: Muffled, soft thuds
+ Wet ground: Splashing, squelching
+ ```
+
+ ---
+
+ ## Military/Historical Sound Design
+
+ ### Artillery Sounds
+
+ ```
+ Cannon fire: Deep boom, mechanical operation, recoil sounds
+ Shell loading: Heavy metallic clunk, scraping
+ Ammunition handling: Weight sounds, brass on metal
+ Distant bombardment: Rumbling, delayed impact
+ Near misses: Whistling, earth shaking, debris
+ ```
+
+ ### Naval Sounds
+
+ ```
+ Ship engines: Deep thrumming, mechanical rhythm
+ Hull creaking: Metal stress, wave impact
+ Bridge ambience: Telegraph, commands, equipment
+ Distant ships: Muffled horns, gunfire echoes
+ Water: Waves, spray, impacts
+ ```
+
+ ### Bunker/Trench Sounds
+
+ ```
+ Explosions above: Muffled booms, earth falling
+ Structural stress: Creaking, dust falling
+ Radio/telephone: Static, crackling, distant voices
+ Soldier activity: Equipment rattling, footsteps, breathing
+ Silence tension: Heartbeats, shallow breathing
+ ```
+
+ ---
+
+ ## Audio Direction Block Template
+
+ Include at the end of EVERY video prompt:
+
+ ```
+ Audio direction:
+ - Language: [TURKISH/ENGLISH/NONE]
+ - Type: [Dialogue/SFX/Mixed/SFX Only]
+ - Dialogue transcript: [Exact lines OR "NONE"]
+ - SFX: [Specific sound effects list]
+ - Ambience: [Environmental background]
+ - Music: NONE <-- (STRICT DEFAULT: Only change if user explicitly requests music)
+ - Mix target: [Percentages, e.g., "Dialogue 60%, Ambience 30%, SFX 10%"]
+ - No on-screen subtitles/captions.
+ ```
+
+ ### Example: Battle Scene
+
+ ```
+ Audio direction:
+ - Language: TURKISH
+ - Type: Mixed
+ - Dialogue transcript: "Şimdi tam zamanı. Bismillah, Ya Allah! Ateş serbest!"
+ - SFX: Cannon fire, shell loading, mechanical recoil, distant explosions
+ - Ambience: Wind, smoke whooshing, distant ship engines
+ - Music: NONE
+ - Mix target: Dialogue 50%, SFX 35%, Ambience 15%
+ - No on-screen subtitles/captions.
+ ```
+
+ ### Example: Tense Waiting
+
+ ```
+ Audio direction:
+ - Language: NONE
+ - Type: SFX Only
+ - Dialogue transcript: NONE
+ - SFX: Distant rumbling, earth shaking, debris falling, heartbeats
+ - Ambience: Muffled explosions, soldiers breathing, equipment rattling
+ - Music: NONE
+ - Mix target: Ambience 70%, SFX 30%
+ - No on-screen subtitles/captions.
+ ```
+
+ ### Example: Dialogue Scene
+
+ ```
+ Audio direction:
+ - Language: TURKISH
+ - Type: Dialogue
+ - Dialogue transcript: "Anladım paşam. Hemen harekete geçiyoruz."
+ - SFX: Telephone click, paper rustling
+ - Ambience: Bunker acoustics, distant explosions muffled
+ - Music: NONE
+ - Mix target: Dialogue 70%, Ambience 25%, SFX 5%
+ - No on-screen subtitles/captions.
+ ```
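The template and examples above can also be assembled programmatically. The following is a minimal Python sketch, not part of the package: `AudioDirection` and `render` are hypothetical names, and the check that the mix percentages sum to 100 is an assumption drawn from the examples rather than a stated rule.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AudioDirection:
    """Hypothetical container mirroring the fields of the template."""
    language: str = "NONE"        # TURKISH / ENGLISH / NONE
    kind: str = "SFX Only"        # Dialogue / SFX / Mixed / SFX Only
    transcript: str = "NONE"      # exact user lines, or "NONE"
    sfx: list[str] = field(default_factory=list)
    ambience: list[str] = field(default_factory=list)
    music: str = "NONE"           # strict default from the template
    mix: dict[str, int] = field(default_factory=dict)


def render(d: AudioDirection) -> str:
    """Render the block in the same line order as the template."""
    total = sum(d.mix.values())
    if total != 100:
        # Assumption: mix targets are shares of a 100% total.
        raise ValueError(f"mix percentages sum to {total}, expected 100")
    # Quote real dialogue; leave the NONE marker unquoted.
    transcript = "NONE" if d.transcript == "NONE" else f'"{d.transcript}"'
    mix = ", ".join(f"{stem} {pct}%" for stem, pct in d.mix.items())
    return "\n".join([
        "Audio direction:",
        f"- Language: {d.language}",
        f"- Type: {d.kind}",
        f"- Dialogue transcript: {transcript}",
        f"- SFX: {', '.join(d.sfx)}",
        f"- Ambience: {', '.join(d.ambience)}",
        f"- Music: {d.music}",
        f"- Mix target: {mix}",
        "- No on-screen subtitles/captions.",
    ])
```

Calling `render` with the fields of the dialogue scene example reproduces that block line for line, and a mix such as `{"SFX": 50}` raises `ValueError` instead of emitting an incomplete target.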
+
+ ---
+
+ ## Anti-Synthetic Audio Rules
+
+ ### ❌ AVOID
+
+ | Artifact | Description |
+ |----------|-------------|
+ | **Obviously synthesized** | AI-generated, robotic sounds |
+ | **Stock library generic** | Overused, recognizable loops |
+ | **Mismatched acoustics** | Sound doesn't match space |
+ | **Floating audio** | No spatial grounding |
+ | **Unnatural transitions** | Jarring sound cuts |
+ | **Uniform volume** | No natural dynamics |
+
+ ### ✅ REQUIRE
+
+ | Quality | Description |
+ |---------|-------------|
+ | **Recorded quality** | Sounds like on-set capture |
+ | **Spatial grounding** | Sound has location in scene |
+ | **Natural dynamics** | Volume variations realistic |
+ | **Acoustic matching** | Sound matches environment |
+ | **Organic imperfections** | Slight variations natural |
+
+ ---
+
+ ## Dialogue Handling
+
+ ### 🎯 NATIVE LANGUAGE RULE (CRITICAL!)
+
+ **NEVER translate the user's dialogue!**
+
+ | User Writes In | Audio Transcript Uses |
+ |----------------|----------------------|
+ | Turkish | Turkish (as-is) |
+ | English | English (as-is) |
+ | Any language | Same language (verbatim) |
+
+ ```
+ User: "Kalkın aslanlar! Düşman zırhlıları top menziline giriyorlar!"
+
+ ❌ WRONG: Dialogue transcript: "Rise lions! Enemy battleships are entering range!"
+ ✅ RIGHT: Dialogue transcript: "Kalkın aslanlar! Düşman zırhlıları top menziline giriyorlar!"
+ ```
+
+ ### When User Provides Dialogue
+
+ - Include VERBATIM in audio transcript
+ - Preserve original language (Turkish stays Turkish, English stays English)
+ - Note the emotional delivery required
+
+ ```
+ Dialogue transcript: "Kalkın aslanlar! Düşman zırhlıları top menziline giriyorlar!"
+ [Delivery: Excited, commanding]
+ ```
+
+ ### 🗣️ SPEAKER ISOLATION RULE (Prevent Mixed Dialogue)
+
+ **Problem:** If two people are in the frame and both speak, the AI often mixes up lip-sync or timing.
+ **Solution:** ONE active speaker per shot.
+
+ - **Shot A:** Character X speaks. Camera focuses on X (or over-the-shoulder of Y).
+ - **Shot B:** Character Y replies. Camera focuses on Y.
+
+ **EXCEPTION:** If both MUST be in frame (Two-Shot):
+ 1. Use a "Reaction Shot" for the listener (the listener nods while the speaker's voice is heard off-screen).
+ 2. OR ensure the prompt explicitly states: "[Character A] talking, [Character B] listening silently".
+
+ ### When No Dialogue
+
+ ```
+ Dialogue transcript: NONE
+ ```
+
+ ### Short Sentence Rule
+
+ **DO NOT** try to fit very long sentences into a single shot.
+
+ If dialogue is long:
+ - Split the sentence across shots
+ - OR split the shot in two
+ - This helps the AI with audio splitting
+
+ ```
+ ❌ Wrong: One 4s shot with 20 seconds of dialogue
+ ✅ Right: Split into multiple shots with shorter lines
+ ```
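One way to apply the short sentence rule is to budget words per shot. The sketch below is illustrative only: `split_dialogue` is a hypothetical helper, and the speaking rate of 2.5 words per second is an assumed figure, not a value from this document or from Veo. It packs whole sentences into a shot where they fit, cuts an overlong sentence at word boundaries, and never rewrites or translates the words.

```python
from __future__ import annotations

import re


def split_dialogue(text: str, shot_seconds: float = 4.0,
                   words_per_second: float = 2.5) -> list[str]:
    """Split dialogue into per-shot lines without altering any word."""
    # Assumed budget: how many spoken words fit into one shot.
    max_words = max(1, int(shot_seconds * words_per_second))
    # Break after sentence-ending punctuation, keeping the punctuation.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    shots: list[str] = []
    current: list[str] = []
    for sentence in sentences:
        words = sentence.split()
        # Start a new shot if this sentence would overflow the current one.
        if current and len(current) + len(words) > max_words:
            shots.append(" ".join(current))
            current = []
        # A sentence longer than one shot is cut at word boundaries.
        while len(words) > max_words:
            shots.append(" ".join(words[:max_words]))
            words = words[max_words:]
        current.extend(words)
    if current:
        shots.append(" ".join(current))
    return shots
```

With these defaults the budget is 10 words per 4-second shot, so the three short sentences of the battle scene transcript stay together in one shot, while a 25-word sentence is spread over three shots.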
+
+ ---
+
+ ## Audio Safety
+
+ | Rule | Requirement |
+ |------|-------------|
+ | **Minors on screen** | "Dialogue transcript: NONE" (platform requirement) |
+ | **Violence context** | Describe SFX as "mechanical operation" not "firing at targets" |
+ | **Threats** | Never script threatening dialogue |
+ | **Copyrighted music** | Never reference by name |
+
+ ---
+
+ ## 🔇 NO MUSIC POLICY
+
+ **Rule:** By default, generating music is BANNED.
+ **Why:** Cinematic realism relies on SFX and Ambience. Cheap stock music ruins immersion.
+
+ - **Default:** `Music: NONE`
+ - **Exception:** Only if the user explicitly asks for music, e.g. "Add sad music" or "Background score".
+ - **Strictness:** Even if the scene is emotional, use *Silence* or *Wind*, NOT music.
+
+ ---
+
+ ## Mix Targets by Scene Type
+
+ | Scene Type | Dialogue | SFX | Ambience | Music |
+ |------------|----------|-----|----------|-------|
+ | **Intense dialogue** | 70% | 5% | 25% | 0% |
+ | **Action/battle** | 20% | 50% | 30% | 0% |
+ | **Tension/waiting** | 0% | 30% | 70% | 0% |
+ | **Emotional moment** | 60% | 10% | 30% | 0% |
+ | **Establishing shot** | 0% | 20% | 80% | 0% |
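The table above maps directly onto a lookup that emits the `Mix target` line of the audio direction block. This is a hedged sketch: `MIX_TARGETS` and `mix_target_line` are hypothetical names, and listing the loudest stem first while dropping 0% stems is an assumption modeled on the worked examples in this file.

```python
# Percentages copied from the table; the dictionary name is illustrative.
MIX_TARGETS = {
    "Intense dialogue":  {"Dialogue": 70, "SFX": 5,  "Ambience": 25, "Music": 0},
    "Action/battle":     {"Dialogue": 20, "SFX": 50, "Ambience": 30, "Music": 0},
    "Tension/waiting":   {"Dialogue": 0,  "SFX": 30, "Ambience": 70, "Music": 0},
    "Emotional moment":  {"Dialogue": 60, "SFX": 10, "Ambience": 30, "Music": 0},
    "Establishing shot": {"Dialogue": 0,  "SFX": 20, "Ambience": 80, "Music": 0},
}


def mix_target_line(scene_type: str) -> str:
    """Format the mix target for a scene type, loudest stem first."""
    levels = MIX_TARGETS[scene_type]
    if sum(levels.values()) != 100:
        raise ValueError(f"{scene_type}: percentages must sum to 100")
    # Sort by share, highest first; stems at 0% are left out of the line.
    ranked = sorted(levels.items(), key=lambda kv: kv[1], reverse=True)
    parts = [f"{stem} {pct}%" for stem, pct in ranked if pct > 0]
    return "- Mix target: " + ", ".join(parts)


print(mix_target_line("Intense dialogue"))
# prints: - Mix target: Dialogue 70%, Ambience 25%, SFX 5%
```

Every row of the table sums to 100%, so the check only fires if the table is edited inconsistently.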
+
+ ---
+
+ ## Period-Specific Audio Notes
+
+ ### WWI/Çanakkale
+
+ ```
+ Artillery: Black powder era, deeper booms
+ Ships: Coal-powered, rhythmic engine sounds
+ Communication: Field telephone, no radio static
+ Commands: Shouted, no PA systems
+ Uniforms: Fabric rustling, leather creaking, metal equipment
+ ```
+
+ ---
+
+ ## 🔊 Diegetic vs Non-Diegetic Sound
+
+ Sound should be handled differently in the prompt depending on its source:
+
+ | Type | Definition | Examples | How to Write It in the Prompt |
+ |------|------------|----------|-------------------------------|
+ | **Diegetic** | Sound originating inside the scene | Radio, TV, speech, footsteps | Write as SFX or Dialogue |
+ | **Non-Diegetic** | Sound originating outside the scene | Film score, narrator, effects | Write as Music or voiceover |
+ | **Meta-Diegetic** | Sound inside the character's mind | Inner voice, memory, imagination | "Internal monologue" + echo effect |
+
+ **Rule:** In Veo 3.1, all sounds behave as diegetic. If non-diegetic music is to be added, the user must request it explicitly (Music: NONE is the default).
+
+ ---
+
+ > **Remember:** Audio sells the reality. A perfectly filmed scene fails if it sounds fake. Every sound must justify its existence and match its environment.