vargai 0.3.1 → 0.4.0-alpha.1
- package/README.md +1 -38
- package/biome.json +6 -1
- package/docs/index.html +1130 -0
- package/docs/prompting.md +326 -0
- package/docs/react.md +834 -0
- package/package.json +11 -6
- package/src/ai-sdk/index.ts +2 -21
- package/src/cli/commands/index.ts +1 -4
- package/src/cli/commands/render.tsx +71 -0
- package/src/cli/index.ts +2 -0
- package/src/react/cli.ts +52 -0
- package/src/react/elements.ts +146 -0
- package/src/react/examples/branching.tsx +66 -0
- package/src/react/examples/captions-demo.tsx +37 -0
- package/src/react/examples/character-video.tsx +84 -0
- package/src/react/examples/grid.tsx +53 -0
- package/src/react/examples/layouts-demo.tsx +57 -0
- package/src/react/examples/madi.tsx +60 -0
- package/src/react/examples/music-test.tsx +35 -0
- package/src/react/examples/onlyfans-1m/workflow.tsx +88 -0
- package/src/react/examples/orange-portrait.tsx +41 -0
- package/src/react/examples/split-element-demo.tsx +60 -0
- package/src/react/examples/split-layout-demo.tsx +60 -0
- package/src/react/examples/split.tsx +41 -0
- package/src/react/examples/video-grid.tsx +46 -0
- package/src/react/index.ts +43 -0
- package/src/react/layouts/grid.tsx +28 -0
- package/src/react/layouts/index.ts +2 -0
- package/src/react/layouts/split.tsx +20 -0
- package/src/react/react.test.ts +309 -0
- package/src/react/render.ts +21 -0
- package/src/react/renderers/animate.ts +59 -0
- package/src/react/renderers/captions.ts +297 -0
- package/src/react/renderers/clip.ts +248 -0
- package/src/react/renderers/context.ts +17 -0
- package/src/react/renderers/image.ts +109 -0
- package/src/react/renderers/index.ts +22 -0
- package/src/react/renderers/music.ts +60 -0
- package/src/react/renderers/packshot.ts +84 -0
- package/src/react/renderers/progress.ts +173 -0
- package/src/react/renderers/render.ts +243 -0
- package/src/react/renderers/slider.ts +69 -0
- package/src/react/renderers/speech.ts +53 -0
- package/src/react/renderers/split.ts +91 -0
- package/src/react/renderers/subtitle.ts +16 -0
- package/src/react/renderers/swipe.ts +75 -0
- package/src/react/renderers/title.ts +17 -0
- package/src/react/renderers/utils.ts +124 -0
- package/src/react/renderers/video.ts +127 -0
- package/src/react/runtime/jsx-dev-runtime.ts +43 -0
- package/src/react/runtime/jsx-runtime.ts +35 -0
- package/src/react/types.ts +232 -0
- package/src/studio/index.ts +26 -0
- package/src/studio/scanner.ts +102 -0
- package/src/studio/server.ts +554 -0
- package/src/studio/stages.ts +251 -0
- package/src/studio/step-renderer.ts +279 -0
- package/src/studio/types.ts +60 -0
- package/src/studio/ui/cache.html +303 -0
- package/src/studio/ui/index.html +1820 -0
- package/tsconfig.cli.json +8 -0
- package/tsconfig.json +3 -1
- package/bun.lock +0 -1255
- package/docs/plan.md +0 -66
- package/docs/todo.md +0 -14
- package/src/ai-sdk/middleware/index.ts +0 -25
- package/src/ai-sdk/middleware/placeholder.ts +0 -111
- package/src/ai-sdk/middleware/wrap-image-model.ts +0 -86
- package/src/ai-sdk/middleware/wrap-music-model.ts +0 -108
- package/src/ai-sdk/middleware/wrap-video-model.ts +0 -115
- /package/docs/{varg-sdk.md → sdk.md} +0 -0
- /package/src/ai-sdk/providers/{elevenlabs.ts → elevenlabs-provider.ts} +0 -0
- /package/src/ai-sdk/providers/{fal.ts → fal-provider.ts} +0 -0
@@ -0,0 +1,326 @@
# Prompting Kling & Nano-Banana-Pro

_Shared by Alex_

---

## Kling AI (Video Generation)

### Core Prompt Structure

A strong Kling prompt relies on four key elements: **subject**, **context**, **action**, and **style**. If you omit any of these, the model is forced to guess.

**Formula:**

```
[Subject] + [Action/Motion] + [Context/Setting] + [Camera/Style]
```
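
The formula can be sketched as a small helper that refuses to build a prompt when one of the four elements is missing. `buildKlingPrompt` and its field names are hypothetical, chosen for illustration; they are not part of the varg SDK or of Kling's API.

```javascript
// Hypothetical helper (not part of the varg SDK): joins the four elements
// in the order given by the formula.
function buildKlingPrompt({ subject, action, context, style }) {
  const parts = { subject, action, context, style };
  // Fail loudly if an element is missing, so the model is never left to guess.
  const missing = Object.keys(parts).filter(
    (key) => typeof parts[key] !== "string" || parts[key].trim() === ""
  );
  if (missing.length > 0) {
    throw new Error(`Missing prompt element(s): ${missing.join(", ")}`);
  }
  // [Subject] + [Action/Motion] + [Context/Setting] + [Camera/Style]
  return [subject, action, context, style].map((part) => part.trim()).join(", ");
}

const klingPrompt = buildKlingPrompt({
  subject: "Sleek red convertible",
  action: "driving at speed",
  context: "along a coastal highway at sunset",
  style: "aerial shot, cinematic",
});
console.log(klingPrompt);
```

Joining with commas is only one reasonable choice; full sentences work as well, as long as all four elements are present.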

### What Makes Kling Special

- Kling's greatest strength is **camera motion** and **character physics**
- Works best with prompts of at most **40–50 words**, using clear and structured formatting
- The best results come from guiding it visually, not just verbally: use **reference-first prompting** with images
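
A quick way to keep prompts inside the suggested word budget is to count words before sending them. The sketch below assumes a simple whitespace split; the function names and the default of 50 are illustrative, and the budget is a rule of thumb rather than a limit enforced by Kling.

```javascript
// Hypothetical helper: counts whitespace-separated words in a prompt.
function countWords(prompt) {
  const trimmed = prompt.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// The default of 50 mirrors the upper bound suggested in this guide.
function isWithinWordBudget(prompt, maxWords = 50) {
  return countWords(prompt) <= maxWords;
}

const shortPrompt = "Cat walking forward on the alien landscape, tail swaying gently";
console.log(countWords(shortPrompt), isWithinWordBudget(shortPrompt));
```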

### Pro Techniques

#### Camera Control

- Add detailed camera movements: `"slow zoom-in"`, `"quick pan"`, `"aerial shot"`
- Include technical specifications like `"Shot on virtual anamorphic lens, 24mm, f/2.8"`; these function as stylistic cues

#### Weight Elements

Wrap critical elements in `++` emphasis markers:

```
++sleek red convertible++ driving along coastal highway
```
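
If prompts are assembled in code, the emphasis markers can be applied by a tiny helper. `emphasize` is a hypothetical name, and the sketch assumes the `++phrase++` form shown in this guide.

```javascript
// Hypothetical helper: wraps a phrase in ++ emphasis markers.
function emphasize(phrase) {
  const trimmed = phrase.trim();
  // Avoid double-wrapping a phrase that is already marked.
  if (trimmed.length > 4 && trimmed.startsWith("++") && trimmed.endsWith("++")) {
    return trimmed;
  }
  return `++${trimmed}++`;
}

const weighted = `${emphasize("sleek red convertible")} driving along coastal highway`;
console.log(weighted);
```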

#### Negative Prompts

Specify what to avoid:

```
No people, no text overlays, no distortion in vehicle proportions
```

### Image-to-Video Formula

For Image-to-Video, the essential elements are **subject** and **movement**. Unlike Text-to-Video, which requires a scene description, Image-to-Video already has its scene provided by the input image.

**Example:**

```
Cat walking forward on the alien landscape, his tail swaying gently.
Vibrant meteor shower fills the sky, with meteors streaking across.
```

### Common Pitfalls

- Requesting `"360-degree rotation around subject while zooming in"` often produces warped geometry because it combines multiple simultaneous camera transformations
- Avoid relying on exact counts (e.g., "5 trees", "6 puppies"); the model may not render them consistently
- Mixing lighting terms like `"golden hour"` with `"studio lighting"` confuses the model's style interpretation
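
These pitfalls can be caught before generation with a rough prompt linter. The sketch below is heuristic: the keyword lists, regular expressions, and the name `lintKlingPrompt` are assumptions made for illustration, not rules published by Kling.

```javascript
// Heuristic patterns, chosen for illustration only.
const CAMERA_MOVES = {
  zoom: /\bzoom/,
  pan: /\bpan(s|ning)?\b/,
  tilt: /\btilt/,
  rotate: /\brotat/,
};
const NATURAL_LIGHT = ["golden hour", "sunset", "daylight"];
const STUDIO_LIGHT = ["studio lighting", "softbox"];
// A bare number followed by a noun, e.g. "5 trees"
// (skips ratios like "9:16" and angles like "45 degrees").
const EXACT_COUNT = /(?<![\d:./])\b\d+\s+(?!degrees?\b)[a-z]+/;

function lintKlingPrompt(prompt) {
  const text = prompt.toLowerCase();
  const warnings = [];

  const moves = Object.keys(CAMERA_MOVES).filter((name) => CAMERA_MOVES[name].test(text));
  if (moves.length > 1) {
    warnings.push(`Multiple camera moves (${moves.join(", ")}): pick one`);
  }
  if (EXACT_COUNT.test(text)) {
    warnings.push("Exact counts may not be rendered consistently");
  }
  const hasNatural = NATURAL_LIGHT.some((term) => text.includes(term));
  const hasStudio = STUDIO_LIGHT.some((term) => text.includes(term));
  if (hasNatural && hasStudio) {
    warnings.push("Conflicting lighting terms: natural and studio");
  }
  return warnings;
}

console.log(lintKlingPrompt("360-degree rotation around subject while zooming in"));
```

An empty array means no known pitfall was detected; it does not guarantee a good result.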

---

## Nano-Banana-Pro (Image Generation)

Google's latest image model, known internally as "Gemini 3 Pro Image."

### Key Paradigm Shift

Nano-Banana-Pro is a **"Thinking" model**. It doesn't just match keywords; it understands intent, physics, and composition. Stop using "tag soups" (e.g., `dog, park, 4k, realistic`) and start acting like a **Creative Director**.

### Core Prompt Formula

```
[Subject + Adjectives] doing [Action] in [Location/Context].
[Composition/Camera Angle]. [Lighting/Atmosphere]. [Style/Media].
[Specific Constraint/Text].
```
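
One way to apply this formula in code is to treat each bracketed slot as a field and emit one sentence per slot. `buildImagePrompt` and its field names are hypothetical; the sketch assumes the first sentence is required and the remaining slots are optional.

```javascript
// Hypothetical helper: field names mirror the bracketed slots in the formula.
function buildImagePrompt({ subject, action, location, composition, lighting, style, constraint }) {
  // Ensure each sentence ends with exactly one period.
  const sentence = (text) => text.trim().replace(/\.+$/, "") + ".";

  const required = { subject, action, location };
  for (const [name, value] of Object.entries(required)) {
    if (!value) throw new Error(`Missing required field: ${name}`);
  }

  const sentences = [sentence(`${subject} ${action} in ${location}`)];
  // Optional slots are appended only when provided.
  for (const extra of [composition, lighting, style, constraint]) {
    if (extra) sentences.push(sentence(extra));
  }
  return sentences.join(" ");
}

const imagePrompt = buildImagePrompt({
  subject: "A fluffy golden retriever",
  action: "catching a frisbee",
  location: "a sunlit city park",
  composition: "Low-angle wide shot",
  lighting: "Warm late-afternoon light",
  style: "Photorealistic",
  constraint: 'Write the text "FETCH" on the frisbee',
});
console.log(imagePrompt);
```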

**ICS Method:** Always specify:

1. **Image type**: blueprint, infographic, diagram
2. **Content**: source data or information
3. **Visual style**: survival guide, McKinsey presentation, comic

### Killer Features

#### Text Rendering

Don't just say "add text." Be specific:

```
Write the text "HELLO WORLD" in a bold, red, serif font on the sign.
```

Maximize text legibility by isolating string literals in double quotes, and explicitly define the font family.
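
The two rules, quote the literal and name the font, can be enforced with a small helper. `textInstruction` and its parameters are hypothetical names used only for this sketch.

```javascript
// Hypothetical helper: builds a text-rendering instruction with the literal
// string isolated in double quotes and an explicit font description.
function textInstruction({ text, font, target }) {
  if (text.includes('"')) {
    // A nested double quote would blur where the literal starts and ends.
    throw new Error("Text must not contain double quotes");
  }
  return `Write the text "${text}" in a ${font} font on the ${target}.`;
}

const signInstruction = textInstruction({
  text: "HELLO WORLD",
  font: "bold, red, serif",
  target: "sign",
});
console.log(signInstruction);
```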

#### Conversational Editing

The model excels at understanding conversational edits. If an image is 80% correct, don't regenerate from scratch; simply ask for the specific change you need.

#### Search-Powered Generation

Nano-Banana-Pro uses Google Search to generate imagery based on real-time data, current events, or verified facts.

#### Multi-Image Context

Supports a **14-image context window**. Upload images (style guides, logos, character sheets) and instruct:

```
Use the uploaded images as a strict style reference...
```

### Pro Tips

- Remove polite phrases like "please"; use command-line style syntax
- Steer the model toward photorealism by specifying camera gear: `"Shot on Arri Alexa"` pushes it to emulate that camera's look and grain
- You don't need `"4k, trending on artstation, masterpiece"` spam anymore; Nano-Banana-Pro understands natural language

### Limitations

- Small text, fine details, and exact spellings may not render perfectly
- Always verify the factual accuracy of data-driven visuals like diagrams and infographics

---

## Quick Comparison

| Aspect | Kling | Nano-Banana-Pro |
|--------|-------|-----------------|
| **Type** | Video gen | Image gen |
| **Sweet Spot** | Camera motion, physics | Text rendering, editing |
| **Prompting Style** | Cinematic director | Creative director |
| **Best Input** | Image + motion prompt | Natural language + refs |
| **Iteration** | Re-generate | Conversational edits |

---

## TikTok-Style Prompts for Kling

### General Guidelines

- **Image-to-Video** gives more control (upload frame → describe movement)
- Specify **vertical 9:16** aspect ratio
- Keep prompts short (**40–50 words max**)
- Use **one camera movement type** at a time (don't mix pan + zoom + rotate)

### Timeline Breakdown

| Time | Action/Angle | Notes |
|------|--------------|-------|
| 0–1s | Hook: unusual expression or text question, frontal close-up | Capture attention in the first 1–2 seconds |
| 1–4s | Switch to 45° medium shot; first grimace | Sharp angle change on the music beat |
| 4–7s | Low angle or extreme close-up for a funny angle; add text/emoji | Use a jump cut to remove pauses |
| 7–10s | High angle; new emotion, possible match cut or object cover | Insert a reaction cutaway for smoothness |
| 10–13s | Show all four angles in a 2×2 grid (optional) | Creates a powerful final effect |
| 13–15s | Finale and CTA: return to frontal shot; deliver joke or call-to-action | End on a strong beat; add text with CTA |
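
When the timeline drives a generation script, it helps to hold it as data and check that the segments are contiguous. The sketch below transcribes the segment boundaries from the table; the object shape and `validateTimeline` are hypothetical.

```javascript
// Segment boundaries (in seconds) and angles, transcribed from the timeline table.
const timeline = [
  { start: 0, end: 1, angle: "frontal close-up" },
  { start: 1, end: 4, angle: "45-degree medium shot" },
  { start: 4, end: 7, angle: "low angle or extreme close-up" },
  { start: 7, end: 10, angle: "high angle" },
  { start: 10, end: 13, angle: "2x2 grid (optional)" },
  { start: 13, end: 15, angle: "frontal close-up" },
];

// Returns the total duration, or throws if segments overlap or leave gaps.
function validateTimeline(segments) {
  let cursor = 0;
  for (const { start, end } of segments) {
    if (start !== cursor) {
      throw new Error(`Gap or overlap at ${start}s (expected ${cursor}s)`);
    }
    if (end <= start) {
      throw new Error(`Empty segment at ${start}s`);
    }
    cursor = end;
  }
  return cursor;
}

console.log(validateTimeline(timeline));
```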

---

### Prompt Examples by Segment

#### 0–1 sec: Hook, Frontal Close-up

**Text-to-Video:**

```
Young woman, surprised expression with wide eyes, looking directly at camera.
Extreme close-up face shot, static camera, soft ring light, vertical 9:16 format.
Cinematic, social media aesthetic, sharp focus on eyes.
```

**Image-to-Video:**

```
Woman's eyes widen in surprise, eyebrows raise slightly, subtle head tilt forward.
Static shot, no camera movement.
```

**Negative prompt:** `blur, camera shake, side profile, horizontal format`

---

#### 1–4 sec: 45° Medium Shot + Grimace

**Text-to-Video:**

```
Woman at 45-degree angle, medium shot showing torso and hands.
She makes an exaggerated funny face, hands gesture expressively.
Camera slowly pushes in, soft natural lighting from window, vertical format.
TikTok influencer style, playful mood.
```

**Image-to-Video:**

```
Woman turns head 45 degrees, makes exaggerated grimace,
hands move up near face expressively. Slow push-in camera movement.
```

**Tip:** Use `++grimace++` or `++expressive hands++` for emphasis

---

#### 4–7 sec: Low Angle or Extreme Close-up

**Option A (Low Angle):**

```
Woman filmed from below, low angle shot, looking down at camera with
confident smirk. Dramatic perspective, she appears powerful and playful.
Static camera, vertical 9:16, slight lens distortion for comic effect.
```

**Option B (Extreme Close-up, eyes/lips):**

```
Extreme close-up of woman's eyes, pupils dilate slightly,
eyebrow raises in surprise. Macro shot, shallow depth of field,
ring light reflection in eyes. Static, vertical format.
```

**Negative prompt:** `full body, wide shot, horizontal, distortion in face proportions`

---

#### 7–10 sec: High Angle + New Emotion

**Text-to-Video:**

```
Woman filmed from above, high angle shot, she looks up at camera
with playful vulnerability. Arms spread wide filling the frame.
Slow gentle tilt down, soft overhead lighting, vertical 9:16.
Cute, endearing mood, TikTok aesthetic.
```

**Image-to-Video:**

```
Woman looks up at camera, expression shifts from neutral to excited smile,
hands move into frame from sides. Subtle camera tilt, gentle movement.
```

---

#### 10–13 sec: Split Screen 2×2 (All 4 Angles)

Kling doesn't natively create grids; this is a post-production step. Options:

1. **Generate each angle separately** → combine in CapCut/Premiere
2. **Experimental mosaic prompt:**

```
Split screen 2x2 grid showing same woman from four angles simultaneously:
top-left frontal close-up, top-right 45-degree medium shot,
bottom-left low angle, bottom-right high angle.
All expressions different, synchronized movement, vertical format.
```

> ⚠️ Kling may struggle with consistency. It is better to generate each angle separately and combine them.

---

#### 13–15 sec: Finale + CTA, Return to Frontal

**Text-to-Video:**

```
Woman, frontal close-up, delivers punchline directly to camera
with confident smile. Slight head nod, wink at the end.
Static camera, punchy energy, vertical 9:16, TikTok creator vibe.
Sharp focus, professional ring lighting.
```

**Image-to-Video:**

```
Woman smiles confidently, delivers line to camera,
subtle wink, slight forward lean. Static shot, no camera movement.
```

---

### Universal TikTok-Style Modifiers

Add to the end of prompts:

| Goal | Modifier |
|------|----------|
| **Vertical** | `vertical 9:16 format, portrait orientation` |
| **TikTok vibe** | `TikTok creator aesthetic, social media style` |
| **Lighting** | `ring light, soft natural window light, even lighting` |
| **Sharpness** | `sharp focus, high clarity, 1080p quality` |
| **Energy** | `dynamic, punchy, energetic mood` |
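
Appending modifiers by hand is error-prone across many prompts, so a lookup table plus a small function can help. The modifier strings below are copied from the table; the keys and `withModifiers` are hypothetical.

```javascript
// Modifier strings copied from the modifiers table; the keys are hypothetical.
const TIKTOK_MODIFIERS = {
  vertical: "vertical 9:16 format, portrait orientation",
  vibe: "TikTok creator aesthetic, social media style",
  lighting: "ring light, soft natural window light, even lighting",
  sharpness: "sharp focus, high clarity, 1080p quality",
  energy: "dynamic, punchy, energetic mood",
};

function withModifiers(prompt, goals) {
  const extras = goals.map((goal) => {
    if (!Object.prototype.hasOwnProperty.call(TIKTOK_MODIFIERS, goal)) {
      throw new Error(`Unknown modifier: ${goal}`);
    }
    return TIKTOK_MODIFIERS[goal];
  });
  // Strip trailing punctuation so the modifiers read as a final clause.
  const base = prompt.trim().replace(/[.,\s]+$/, "");
  return [base, ...extras].join(", ") + ".";
}

const finalePrompt = withModifiers(
  "Woman smiles confidently, delivers line to camera.",
  ["vertical", "energy"]
);
console.log(finalePrompt);
```

Modifiers count toward the 40–50 word budget, so it is usually better to add only the ones a given shot needs.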

---

### Prompt Template for varg SDK

```javascript
// Replace each [PLACEHOLDER] with a concrete value before use.
const klingPromptTemplate = {
  subject: "Young woman, [EXPRESSION]",
  action: "[MOVEMENT/GESTURE]",
  camera: {
    angle: "[frontal/45-degree/low-angle/high-angle]",
    shot: "[extreme-close-up/close-up/medium-shot]",
    movement: "[static/slow-push-in/gentle-tilt]"
  },
  style: "TikTok creator aesthetic, vertical 9:16",
  lighting: "ring light, soft natural lighting",
  negative: "blur, horizontal format, camera shake"
};
```
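
A filled-in template still has to be flattened into prompt text. The sketch below shows one possible rendering; `renderKlingTemplate` is a hypothetical function, not a varg SDK API, and how the two resulting strings are passed to a model depends on the API you call.

```javascript
// A filled-in instance of the template shape (values are examples).
const hookShot = {
  subject: "Young woman, surprised expression with wide eyes",
  action: "looking directly at camera",
  camera: { angle: "frontal", shot: "extreme-close-up", movement: "static" },
  style: "TikTok creator aesthetic, vertical 9:16",
  lighting: "ring light, soft natural lighting",
  negative: "blur, horizontal format, camera shake",
};

// Hypothetical renderer: returns the prompt and the negative prompt
// as two separate strings.
function renderKlingTemplate(template) {
  const { angle, shot, movement } = template.camera;
  const camera = `${angle} ${shot}, ${movement} camera`;
  const prompt =
    [template.subject, template.action, camera, template.lighting, template.style].join(", ") + ".";
  if (prompt.includes("[")) {
    // An unfilled [PLACEHOLDER] would be sent to the model verbatim.
    throw new Error("Template still contains an unfilled placeholder");
  }
  return { prompt, negativePrompt: template.negative };
}

const rendered = renderKlingTemplate(hookShot);
console.log(rendered.prompt);
console.log(rendered.negativePrompt);
```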

---

## Pro Tips for Scale

- **Consistency via Image-to-Video**: Generate a base frame in Flux/SDXL, then animate it in Kling. This prevents the face from "drifting" between angles.
- **Batch prompts**: Create 3–4 expression variations per angle, then pick the best during editing.
- **Reference-first**: Kling works better when you provide an image and describe only the movement.
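
Batching can be sketched as a cross product of angles and expressions. The lists below are example values, and `batchPrompts` is a hypothetical helper; each record names the base frame to animate, while the prompt itself describes only the movement, in line with the reference-first advice.

```javascript
// Example values only; swap in your own angles and expressions.
const angles = ["frontal close-up", "45-degree medium shot", "low angle", "high angle"];
const expressions = ["surprised", "playful smirk", "excited smile"];

// Builds one Image-to-Video motion prompt per (angle, expression) pair.
// `angle` identifies which base frame to animate; it is not part of the prompt.
function batchPrompts(angleList, expressionList) {
  return angleList.flatMap((angle) =>
    expressionList.map((expression) => ({
      angle,
      expression,
      prompt: `Woman's expression shifts to ${expression}. Static shot, no camera movement.`,
    }))
  );
}

const batch = batchPrompts(angles, expressions);
console.log(batch.length);
console.log(batch[0].prompt);
```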