@renoise/video-maker 0.1.3 → 0.2.0
- package/.claude-plugin/marketplace.json +15 -0
- package/.claude-plugin/plugin.json +17 -3
- package/README.md +20 -33
- package/hooks/check-api-key.sh +28 -0
- package/hooks/hooks.json +3 -3
- package/openclaw.plugin.json +5 -3
- package/package.json +4 -9
- package/skills/director/SKILL.md +4 -7
- package/skills/file-upload/SKILL.md +79 -0
- package/skills/file-upload/scripts/upload.mjs +103 -0
- package/skills/gemini-gen/SKILL.md +232 -0
- package/skills/gemini-gen/scripts/gemini.mjs +220 -0
- package/skills/renoise-gen/SKILL.md +3 -1
- package/skills/short-film-editor/SKILL.md +23 -24
- package/skills/short-film-editor/references/continuity-guide.md +2 -2
- package/skills/tiktok-content-maker/SKILL.md +78 -81
- package/skills/tiktok-content-maker/examples/dress-demo.md +42 -42
- package/skills/tiktok-content-maker/references/ecom-prompt-guide.md +157 -152
- package/skills/video-download/SKILL.md +1 -1
- package/hooks/session-start.sh +0 -17
- package/lib/gemini.ts +0 -49
- package/skills/short-film-editor/scripts/generate-storyboard-html.ts +0 -714
- package/skills/tiktok-content-maker/scripts/analyze-images.ts +0 -122
package/.claude-plugin/marketplace.json
@@ -0,0 +1,15 @@
+{
+  "name": "renoise-plugins-official",
+  "owner": {
+    "name": "Renoise"
+  },
+  "description": "Official Claude Code plugins by Renoise",
+  "plugins": [
+    {
+      "name": "video-maker",
+      "version": "0.2.0",
+      "source": "./",
+      "description": "AI video production skills — creative direction, generation, editing, and e-commerce content"
+    }
+  ]
+}
package/.claude-plugin/plugin.json
@@ -1,5 +1,19 @@
 {
   "name": "video-maker",
-  "description": "
-  "version": "0.
-
+  "description": "AI video production skills — creative direction, generation, editing, and e-commerce content",
+  "version": "0.2.0",
+  "author": {
+    "name": "Renoise"
+  },
+  "homepage": "https://github.com/ArcoCodes/renoise-plugins-official",
+  "repository": "https://github.com/ArcoCodes/renoise-plugins-official",
+  "keywords": [
+    "skills",
+    "video",
+    "ai-video",
+    "creative-direction",
+    "ecommerce",
+    "tiktok",
+    "renoise"
+  ]
+}
package/README.md
CHANGED
@@ -1,6 +1,20 @@
-#
+# renoise-plugins-official
 
-AI video production
+AI video production skills by Renoise — creative direction, generation, editing, and e-commerce content.
+
+## Skills
+
+| Skill | Description |
+|-------|-------------|
+| **director** | Creative director — main entry point for all video requests |
+| **gemini-gen** | Visual understanding & multimodal analysis via Gemini 3.1 Pro |
+| **renoise-gen** | AI video & image generation engine (renoise-cli) |
+| **tiktok-content-maker** | TikTok & e-commerce short video specialist |
+| **scene-generate** | Background/environment image generation |
+| **product-sheet-generate** | Multi-angle product design sheet |
+| **short-film-editor** | Short film & drama editing |
+| **video-download** | Video downloader (yt-dlp) |
+| **file-upload** | Upload files to Renoise for use with gemini-gen |
 
 ## Installation
 
@@ -24,35 +38,8 @@ claude plugin install video-maker@renoise-plugins-official
 openclaw plugins install @renoise/video-maker
 ```
 
-##
-
-Just describe what you want — "make me a product video", "I want a short drama", "create a brand film" — and the **Director** skill takes over:
+## Environment Variables
 
-
-
-
-4. **Submits** to Renoise for AI video generation
-5. **Learns** your preferences over time for better suggestions
-
-## Skills
-
-| Skill | Purpose |
-|-------|---------|
-| **director** | Creative director — the main entry point for all video requests |
-| renoise-gen | AI video & image generation engine (CLI) |
-| content-maker | TikTok e-commerce short video specialist |
-| scene-generate | Background/environment image generation (Gemini) |
-| product-sheet-generate | Multi-angle product design sheet (Gemini) |
-| video-download | Video downloader (yt-dlp) |
-
-## Adding New Verticals
-
-Create a new skill directory with a `SKILL.md` that includes a `categories` field in its frontmatter. The Director automatically discovers and routes to it — no other changes needed.
-
-```yaml
----
-name: my-vertical
-description: What this vertical handles
-categories: [drama, storytelling]
----
-```
+| Variable | Required By | Description |
+|----------|------------|-------------|
+| `RENOISE_API_KEY` | All skills | Renoise API credential. Get one at https://www.renoise.ai |
package/hooks/check-api-key.sh
@@ -0,0 +1,28 @@
+#!/usr/bin/env bash
+
+# PreToolUse hook: block when neither RENOISE_API_KEY nor RENOISE_AUTH_TOKEN is set
+# and the tool invocation looks like a renoise-cli call.
+
+set -euo pipefail
+
+# Read the tool input from stdin
+INPUT=$(cat)
+
+# Extract the command being executed
+COMMAND=$(echo "$INPUT" | jq -r '.tool_input.command // empty')
+
+# Only check if the command involves renoise-cli
+if [[ "$COMMAND" != *renoise-cli* ]]; then
+  exit 0
+fi
+
+# If either credential is configured, allow
+if [ -n "${RENOISE_API_KEY:-}" ] || [ -n "${RENOISE_AUTH_TOKEN:-}" ]; then
+  exit 0
+fi
+
+# Block and guide user
+jq -n '{
+  decision: "block",
+  reason: "RENOISE_API_KEY or RENOISE_AUTH_TOKEN is not set. Add one to the env block in .claude/settings.local.json. Get your key at https://www.renoise.ai"
+}'
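Reviewer note (not part of the package): the branching in `check-api-key.sh` can be sketched in JavaScript for readers who want to reason about it without a shell. This is an illustrative port under stated assumptions: the function name `decideHook` and its `(payload, env)` signature are hypothetical, and the `reason` text is shortened. The `tool_input.command` field, the two variable names, and the `decision`/`reason` keys are taken from the script.

```javascript
// Hypothetical port of the hook's decision logic; not shipped in the package.
// Returns null to allow the tool call, or a block decision object.
function decideHook(payload, env) {
  // Mirrors `jq -r '.tool_input.command // empty'`: a missing field becomes "".
  const command = payload?.tool_input?.command ?? "";

  // Commands that do not mention renoise-cli are never blocked.
  if (!String(command).includes("renoise-cli")) return null;

  // Either credential is sufficient (non-empty string, like `[ -n ... ]`).
  if (env.RENOISE_API_KEY || env.RENOISE_AUTH_TOKEN) return null;

  return {
    decision: "block",
    reason: "RENOISE_API_KEY or RENOISE_AUTH_TOKEN is not set.",
  };
}

// Example calls with placeholder commands.
console.log(decideHook({ tool_input: { command: "ls -la" } }, {}));
console.log(decideHook({ tool_input: { command: "renoise-cli ..." } }, {}));
```

Because the check is a plain substring match, any Bash command whose text contains `renoise-cli` is gated, including ones that only mention it in an argument.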
package/hooks/hooks.json
CHANGED
@@ -1,12 +1,12 @@
 {
   "hooks": {
-    "
+    "PreToolUse": [
       {
-        "matcher": "
+        "matcher": "Bash",
         "hooks": [
           {
             "type": "command",
-            "command": "bash ${CLAUDE_PLUGIN_ROOT}/hooks/
+            "command": "bash ${CLAUDE_PLUGIN_ROOT}/hooks/check-api-key.sh",
             "timeout": 10
           }
         ]
package/openclaw.plugin.json
CHANGED
@@ -1,8 +1,8 @@
 {
   "id": "video-maker",
   "name": "Video Maker",
-  "description": "
-  "version": "0.
+  "description": "AI video production skills — creative direction, generation, editing, and e-commerce content",
+  "version": "0.2.0",
   "configSchema": {
     "type": "object",
     "additionalProperties": false,
@@ -10,11 +10,13 @@
   },
   "skills": [
     "skills/director",
+    "skills/gemini-gen",
     "skills/renoise-gen",
     "skills/tiktok-content-maker",
     "skills/scene-generate",
     "skills/product-sheet-generate",
+    "skills/short-film-editor",
     "skills/video-download",
-    "skills/
+    "skills/file-upload"
   ]
 }
package/package.json
CHANGED
@@ -1,13 +1,12 @@
 {
   "name": "@renoise/video-maker",
-  "version": "0.
+  "version": "0.2.0",
   "license": "Apache-2.0",
   "type": "module",
-  "description": "
+  "description": "AI video production skills — creative direction, generation, editing, and e-commerce content",
   "repository": {
     "type": "git",
-    "url": "https://github.com/ArcoCodes/renoise-plugins-official.git"
-    "directory": "video-maker"
+    "url": "https://github.com/ArcoCodes/renoise-plugins-official.git"
   },
   "openclaw": {
     "extensions": ["./index.mjs"]
@@ -17,11 +16,7 @@
     "openclaw.plugin.json",
     "index.mjs",
     "hooks/",
-    "lib/",
     "skills/",
     "README.md"
-  ]
-  "dependencies": {
-    "@google/generative-ai": "^0.24.1"
-  }
+  ]
 }
package/skills/director/SKILL.md
CHANGED
@@ -6,8 +6,7 @@ description: >
   directions, generates video prompts, and submits video tasks. Use when
   user says "make a video", "video idea", "creative direction", "help me
   shoot", "I want a video", "video script", "storyboard", "generate video",
-  "action sequence", "
-  "生成视频". Do NOT use for downloading videos or editing existing footage.
+  "action sequence", "generate video". Do NOT use for downloading videos or editing existing footage.
   This skill is the ONLY entry point for video creation in the Visiono project.
 allowed-tools: Bash, Read
 metadata:
@@ -23,6 +22,7 @@ You are a creative director for AI video production. You guide users from raw id
 
 ## Critical Rules
 
+- **The Renoise platform URL is https://www.renoise.ai** — NEVER say or link to "renoise.com". The correct domain is `renoise.ai`.
 - **You are the default entry point** for ALL video creation requests. Only route to specialized skills when `metadata.tags` clearly match.
 - **Video prompts must be in English** — the model understands English best.
 - **Dialogue must feel natural** — conversational American English, never salesy or translated.
@@ -53,10 +53,7 @@ You are a creative director for AI video production. You guide users from raw id
 - What materials does the user have? (product photos, character refs, scripts, nothing)
 - What's the intended platform/audience? (TikTok, Instagram, YouTube, general)
 
-5. **If user provided product images**, analyze them
-```bash
-cd ${CLAUDE_PLUGIN_ROOT} && npm install --silent && npx tsx ${CLAUDE_PLUGIN_ROOT}/skills/tiktok-content-maker/scripts/analyze-images.ts <product-image> [model-image]
-```
+5. **If user provided product images**, analyze them using the `gemini-gen` skill — send product image(s) with a prompt to extract product type, colors, material, selling points, brand tone, and scene suggestions.
 
 6. **Present a brief summary**: "Here's what I understand: [product/story/concept]. I'll use [capabilities]. Let me suggest some creative directions."
 
@@ -226,7 +223,7 @@ Read that skill's SKILL.md and follow its workflow from the appropriate phase (s
 
 ### Example 1: Product video (common)
 User: "I have photos of my new sneakers, help me make a video"
-1. Phase 1: Analyze sneaker images via
+1. Phase 1: Analyze sneaker images via `gemini-gen` skill → extract product type, colors, selling points
 2. Phase 2: Suggest Minimal Showcase / Dynamic Sports / Lifestyle Vlog with adapted descriptions
 3. User picks "Dynamic Sports"
 4. Phase 3: Generate 15s video prompt with fast tracking, high-energy BGM, beat-synced cuts
package/skills/file-upload/SKILL.md
@@ -0,0 +1,79 @@
+---
+name: file-upload
+description: >
+  Upload files (images, videos) to Renoise and get a file URI for use with gemini-gen.
+  Use when a file exceeds 20MB inline base64 limit, or when you need to reuse
+  the same file across multiple gemini-gen calls without re-encoding.
+allowed-tools: Bash, Read
+metadata:
+  author: renoise
+  version: 0.1.0
+  category: utility
+  tags: [upload, file, gemini]
+---
+
+# File Upload
+
+Upload files via the Renoise gateway and get back a file URI for use with `gemini-gen`.
+
+## When to Use
+
+- File exceeds 20MB (inline base64 limit for `gemini-gen`)
+- Same file needs to be referenced in multiple `gemini-gen` calls (upload once, reuse URI)
+
+## Prerequisites
+
+- `RENOISE_API_KEY` environment variable set
+
+## API
+
+```
+POST https://staging.renoise.ai/api/public/v1/llm/files/upload
+Header: X-API-Key: <RENOISE_API_KEY>
+Body: multipart/form-data with field "file"
+```
+
+## CLI Script
+
+```bash
+# Upload a file, get back a file URI
+node ${CLAUDE_SKILL_DIR}/scripts/upload.mjs <file-path>
+
+# Capture URI for use with gemini-gen
+FILE_URI=$(node ${CLAUDE_SKILL_DIR}/scripts/upload.mjs large-video.mp4)
+```
+
+Progress messages go to stderr, the file URL goes to stdout.
+
+## Response Format
+
+```json
+{
+  "previewUrl": "https://...r2.cloudflarestorage.com/.../filename?X-Amz-...",
+  "mimeType": "image/jpeg",
+  "size": 111198,
+  "expiresAt": "2026-03-26T11:49:07.328Z"
+}
+```
+
+- `previewUrl` — Signed URL, valid for **1 hour**
+- Upload once, use the URL immediately in downstream skills
+
+## Usage with gemini-gen
+
+```bash
+# Step 1: Upload
+FILE_URL=$(node ${CLAUDE_PLUGIN_ROOT}/skills/file-upload/scripts/upload.mjs large-video.mp4)
+
+# Step 2: Use with gemini-gen
+node ${CLAUDE_PLUGIN_ROOT}/skills/gemini-gen/scripts/gemini.mjs \
+  --file-uri "$FILE_URL" --file-mime video/mp4 \
+  "Analyze this video"
+```
+
+## Supported File Types
+
+Images: `.jpg`, `.jpeg`, `.png`, `.webp`, `.gif`
+Videos: `.mp4`, `.mov`, `.webm`
+Audio: `.mp3`, `.wav`
+Documents: `.pdf`
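Reviewer note (not part of the package): the "When to Use" rules in this SKILL.md amount to a small decision, sketched below. The helper name `chooseTransport` and the `reuseCount` parameter are hypothetical, and reading "20MB" as 20 × 1024 × 1024 bytes of raw file size is an assumption; the document does not say whether the limit applies before or after base64 encoding.

```javascript
// Assumption: "20MB" means 20 * 1024 * 1024 bytes of raw (pre-base64) file size.
const INLINE_LIMIT_BYTES = 20 * 1024 * 1024;

// Hypothetical helper: decide how a file should reach gemini-gen.
// Returns "inline" (base64 in the request) or "upload" (file-upload skill first).
function chooseTransport(sizeBytes, reuseCount = 1) {
  if (!Number.isFinite(sizeBytes) || sizeBytes < 0) {
    throw new RangeError("sizeBytes must be a non-negative number");
  }
  // Over the inline limit: uploading is the only option.
  if (sizeBytes > INLINE_LIMIT_BYTES) return "upload";
  // Referenced by several calls: upload once and reuse the URL.
  if (reuseCount > 1) return "upload";
  return "inline";
}

console.log(chooseTransport(111198));
console.log(chooseTransport(50 * 1024 * 1024));
```

If the limit turns out to apply to the encoded payload, the threshold would need to shrink by roughly a quarter, since base64 expands data by about 4/3.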
package/skills/file-upload/scripts/upload.mjs
@@ -0,0 +1,103 @@
+#!/usr/bin/env node
+
+/**
+ * Upload a file via Renoise gateway for use with gemini-gen.
+ * Outputs the file URI to stdout.
+ *
+ * Usage: node upload.mjs <file-path>
+ *
+ * Environment:
+ *   RENOISE_API_KEY  Required. Get one at https://www.renoise.ai
+ */
+
+import fs from "fs/promises";
+import path from "path";
+
+const RENOISE_API_KEY = process.env.RENOISE_API_KEY;
+if (!RENOISE_API_KEY) {
+  console.error("RENOISE_API_KEY not set. Get one at: https://www.renoise.ai");
+  process.exit(1);
+}
+
+const UPLOAD_ENDPOINT =
+  "https://staging.renoise.ai/api/public/v1/llm/files/upload";
+
+const MIME_MAP = {
+  ".jpg": "image/jpeg",
+  ".jpeg": "image/jpeg",
+  ".png": "image/png",
+  ".webp": "image/webp",
+  ".gif": "image/gif",
+  ".mp4": "video/mp4",
+  ".mov": "video/quicktime",
+  ".webm": "video/webm",
+  ".mp3": "audio/mpeg",
+  ".wav": "audio/wav",
+  ".pdf": "application/pdf",
+};
+
+function getMimeType(filePath) {
+  return (
+    MIME_MAP[path.extname(filePath).toLowerCase()] ?? "application/octet-stream"
+  );
+}
+
+async function main() {
+  const filePath = process.argv[2];
+  if (!filePath) {
+    console.error("Usage: node upload.mjs <file-path>");
+    process.exit(1);
+  }
+
+  const stat = await fs.stat(filePath).catch(() => {
+    console.error(`File not found: ${filePath}`);
+    process.exit(1);
+  });
+
+  const mimeType = getMimeType(filePath);
+  const fileData = await fs.readFile(filePath);
+  const fileName = path.basename(filePath);
+
+  console.error(
+    `Uploading ${fileName} (${(stat.size / 1024 / 1024).toFixed(1)}MB, ${mimeType})...`
+  );
+
+  // Build multipart form-data with native FormData + Blob
+  const blob = new Blob([fileData], { type: mimeType });
+  const form = new FormData();
+  form.append("file", blob, fileName);
+
+  const res = await fetch(UPLOAD_ENDPOINT, {
+    method: "POST",
+    headers: {
+      "X-API-Key": RENOISE_API_KEY,
+    },
+    body: form,
+  });
+
+  if (!res.ok) {
+    const errText = await res.text();
+    console.error(`Upload error ${res.status}: ${errText}`);
+    process.exit(1);
+  }
+
+  const data = await res.json();
+
+  // Response format: { previewUrl, mimeType, size, expiresAt }
+  const fileUrl = data?.previewUrl;
+
+  if (!fileUrl) {
+    console.error("No previewUrl in response:", JSON.stringify(data, null, 2));
+    process.exit(1);
+  }
+
+  const expires = data.expiresAt ? new Date(data.expiresAt).toLocaleString() : "unknown";
+  console.error(`Uploaded: ${(data.size / 1024).toFixed(0)}KB, expires ${expires}`);
+  // Print URL to stdout (stderr used for progress messages)
+  console.log(fileUrl);
+}
+
+main().catch((err) => {
+  console.error("ERROR:", err.message);
+  process.exit(1);
+});
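Reviewer note (not part of the package): `upload.mjs` prints only the signed URL, while the upload response also carries `expiresAt`. A caller that caches the response could check the expiry before reusing the URL, roughly as sketched below. The helper name `isUploadUsable` and the 60-second safety margin are assumptions for illustration; the `previewUrl` and `expiresAt` fields come from the response format shown in the diff.

```javascript
// Hypothetical helper: true when a cached upload response can still be reused.
// `marginMs` is an assumed safety margin, not something the API defines.
function isUploadUsable(upload, now = new Date(), marginMs = 60_000) {
  if (!upload?.previewUrl) return false;

  const expiresAt = Date.parse(upload.expiresAt);
  // Missing or unparseable expiry: treat the URL as unusable and re-upload.
  if (Number.isNaN(expiresAt)) return false;

  // Usable only if more than `marginMs` remains before the URL expires.
  return expiresAt - now.getTime() > marginMs;
}

// Sample response shaped like the one in SKILL.md (placeholder URL).
const cached = {
  previewUrl: "https://example.invalid/signed-url",
  mimeType: "image/jpeg",
  size: 111198,
  expiresAt: "2026-03-26T11:49:07.328Z",
};
console.log(isUploadUsable(cached, new Date("2026-03-26T11:00:00.000Z")));
```

Re-uploading when the check fails is cheaper than debugging a rejected signed URL inside a later `gemini-gen` call.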
package/skills/gemini-gen/SKILL.md
@@ -0,0 +1,232 @@
+---
+name: gemini-gen
+description: >
+  Visual understanding and multimodal analysis via Gemini 3.1 Pro.
+  Use when you need to understand, analyze, or extract information from images or videos:
+  analyze product photos, extract video scripts/dialogue, understand video content for replication,
+  compare visual assets, OCR/text extraction from images, describe scenes for prompt writing,
+  extract style/color/composition from reference footage.
+  Do NOT use for generating images or videos — use renoise-gen instead.
+allowed-tools: Bash, Read
+metadata:
+  author: renoise
+  version: 0.2.0
+  category: ai-foundation
+  tags: [vision, analysis, multimodal, gemini]
+---
+
+# Gemini Gen — Visual Understanding & Multimodal Analysis
+
+Gemini 3.1 Pro via Renoise gateway. Zero npm dependencies, native `fetch` only.
+
+## When to Use
+
+| Scenario | Example |
+|----------|---------|
+| **Analyze product photos** | Extract type, color, material, selling points, brand tone from product images |
+| **Understand video content** | "What happens in this video?", summarize scenes, identify actions and objects |
+| **Extract scripts from video** | Watch a reference video → output timestamped dialogue, scene descriptions, camera movements |
+| **Replicate a video style** | Analyze a reference clip → extract visual style, pacing, transitions, color grading for recreation |
+| **Compare visual assets** | Side-by-side analysis of two product photos, before/after comparison |
+| **OCR / text extraction** | Read text from screenshots, packaging, signage in images |
+| **Describe scenes for prompts** | Look at a reference image → write a detailed prompt for `renoise-gen` to recreate the style |
+| **Content review** | Check if generated output matches the creative brief |
+
+## When NOT to Use
+
+- **Generating images** → use `renoise-gen` with `nano-banana-2`
+- **Generating videos** → use `renoise-gen` with `renoise-2.0`
+- **Uploading large files** → use `file-upload` first, then pass the URL here
+
+## Authentication
+
+Use environment variable `RENOISE_API_KEY`. Get one at: https://www.renoise.ai
+
+## API Endpoint
+
+```
+POST https://staging--ujgsvru36x4korjj10nq.edgespark.app/api/public/llm/proxy/v1beta/models/{model}:generateContent?key={RENOISE_API_KEY}
+```
+
+Default model: `gemini-3.1-pro`
+
+## Request Format
+
+```json
+{
+  "contents": [
+    {
+      "role": "user",
+      "parts": [{ "text": "your prompt here" }]
+    }
+  ],
+  "generationConfig": {
+    "temperature": 1.0,
+    "maxOutputTokens": 8192
+  }
+}
+```
+
+## Multimodal Input (images & videos)
+
+Gemini supports images and videos as inline base64 parts alongside text. Multiple files can be sent in the same request.
+
+### Image Input
+
+```json
+{
+  "contents": [
+    {
+      "parts": [
+        {
+          "inlineData": {
+            "mimeType": "image/jpeg",
+            "data": "<base64-encoded-data>"
+          },
+          "mediaResolution": { "level": "media_resolution_high" }
+        },
+        { "text": "Describe this image" }
+      ]
+    }
+  ]
+}
+```
+
+`mediaResolution` controls token allocation per image/frame:
+
+| Level | Image Tokens | Video Frame Tokens |
+|-------|-------------|-------------------|
+| `media_resolution_low` | 280 | 70 |
+| `media_resolution_medium` | 560 | 140 |
+| `media_resolution_high` | 840 | 210 |
+| `media_resolution_ultra_high` | 1120 | 280 |
+
+Default: `media_resolution_medium`. Use `high` or `ultra_high` for detail-critical analysis (product photos, fine text). Use `low` for bulk/batch processing.
+
+### Video Input
+
+Same format — just use a video MIME type. Inline base64 has a **20MB limit**. For larger videos, use the dedicated file upload skill.
+
+```json
+{
+  "contents": [
+    {
+      "parts": [
+        {
+          "inlineData": {
+            "mimeType": "video/mp4",
+            "data": "<base64-encoded-data>"
+          },
+          "mediaResolution": { "level": "media_resolution_low" }
+        },
+        { "text": "Summarize what happens in this video" }
+      ]
+    }
+  ]
+}
+```
+
+> **Tip**: Use `media_resolution_low` for videos to reduce token consumption — video has many frames.
+
+### Multiple Files
+
+Send multiple images/videos in one request by adding more `inlineData` parts:
+
+```json
+{
+  "contents": [
+    {
+      "parts": [
+        { "inlineData": { "mimeType": "image/jpeg", "data": "<base64-image-1>" } },
+        { "inlineData": { "mimeType": "image/png", "data": "<base64-image-2>" } },
+        { "text": "Compare these two product photos" }
+      ]
+    }
+  ]
+}
+```
+
+### Large Files (> 20MB)
+
+Inline base64 has a **20MB limit**. For larger files, use the dedicated file upload skill to get a file URI, then reference it:
+
+```json
+{
+  "contents": [
+    {
+      "parts": [
+        {
+          "fileData": {
+            "mimeType": "video/mp4",
+            "fileUri": "<uploaded-file-uri>"
+          }
+        },
+        { "text": "Analyze this video" }
+      ]
+    }
+  ]
+}
+```
+
+## Supported MIME Types
+
+| Extension | MIME Type | Max Inline |
+| ---------- | --------------- | ---------- |
+| .jpg/.jpeg | image/jpeg | 20MB |
+| .png | image/png | 20MB |
+| .webp | image/webp | 20MB |
+| .gif | image/gif | 20MB |
+| .mp4 | video/mp4 | 20MB |
+| .mov | video/quicktime | 20MB |
+| .webm | video/webm | 20MB |
+
+## Response Parsing
+
+```javascript
+const data = await response.json();
+const text = data.candidates[0].content.parts[0].text;
+```
+
+## Error Handling
+
+- 400: Bad request (check prompt format)
+- 403: Invalid API key
+- 429: Rate limited (wait and retry)
+- 500: Server error (retry with backoff)
+
+## CLI Script
+
+`${CLAUDE_SKILL_DIR}/scripts/gemini.mjs` — zero-dependency Node.js script, other skills can directly call it.
+
+```bash
+# Text only
+node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs "Explain quantum computing"
+
+# Analyze an image (high resolution for product detail)
+node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs --file photo.jpg --resolution high "Describe this product"
+
+# Analyze a video (low resolution to save tokens)
+node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs --file clip.mp4 --resolution low "Summarize this clip"
+
+# Multiple images
+node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs --file a.jpg --file b.jpg "Compare these two"
+
+# Uploaded file URI (from file upload skill, for files > 20MB)
+node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs --file-uri "<uri>" --file-mime video/mp4 "Analyze this video"
+
+# JSON output mode
+node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs --json "Return a JSON object with name and age"
+```
+
+### Options
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--file <path>` | — | Attach local file (repeatable) |
+| `--file-uri <uri>` | — | Attach uploaded file by URI (requires `--file-mime`) |
+| `--file-mime <mime>` | — | MIME type for `--file-uri` |
+| `--resolution <level>` | `medium` | `low` / `medium` / `high` / `ultra_high` |
+| `--model <name>` | `gemini-3.1-pro` | Model name |
+| `--temperature <n>` | `1.0` | Temperature |
+| `--max-tokens <n>` | `8192` | Max output tokens |
+| `--json` | off | Request JSON response format |
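Reviewer note (not part of the package): the "Response Parsing" snippet in this SKILL.md indexes `candidates[0].content.parts[0]` directly, and the "Error Handling" list names which statuses to retry. A more defensive version of both might look like the sketch below. The helper names (`extractText`, `isRetryable`, `backoffDelayMs`), the base and cap delays, and treating every 5xx status like the listed 500 are assumptions for illustration, not the behaviour of `gemini.mjs`.

```javascript
// Join the text of every part in the first candidate; returns "" when absent.
function extractText(data) {
  const parts = data?.candidates?.[0]?.content?.parts ?? [];
  return parts.map((part) => part.text ?? "").join("");
}

// 429 and 5xx are retried; 400 and 403 are treated as final.
// Assumption: all 5xx statuses are handled like the 500 in the SKILL.md list.
function isRetryable(status) {
  return status === 429 || (status >= 500 && status < 600);
}

// Exponential backoff delay in milliseconds for a zero-based attempt number.
// The 500 ms base and 8000 ms cap are assumed values.
function backoffDelayMs(attempt, baseMs = 500, capMs = 8000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Example: a response whose answer is split across two parts.
const sample = {
  candidates: [{ content: { parts: [{ text: "Red " }, { text: "sneaker" }] } }],
};
console.log(extractText(sample));
console.log([0, 1, 2, 3].map((n) => backoffDelayMs(n)));
```

Joining all parts avoids silently dropping text when the model splits its answer, and the optional chaining keeps a blocked or empty response from throwing a `TypeError`.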