@mevdragon/vidfarm-devcli 0.2.1 → 0.2.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (40)
  1. package/.env.example +6 -39
  2. package/GETTING_STARTED.developers.md +87 -0
  3. package/README.md +94 -238
  4. package/SKILL.developer.md +430 -104
  5. package/dist/src/account-pages.js +1 -1
  6. package/dist/src/app.js +93 -5
  7. package/dist/src/cli.js +456 -8
  8. package/dist/src/config.js +3 -2
  9. package/dist/src/context.js +30 -11
  10. package/dist/src/db.js +2 -57
  11. package/dist/src/dev-app.js +0 -1
  12. package/dist/src/index.js +4 -2
  13. package/dist/src/lib/template-paths.js +21 -0
  14. package/dist/src/runtime.js +3 -1
  15. package/dist/src/services/auth.js +4 -4
  16. package/dist/src/services/job-logs.js +186 -0
  17. package/dist/src/services/jobs.js +3 -2
  18. package/dist/src/services/providers.js +14 -6
  19. package/dist/src/services/storage.js +85 -2
  20. package/dist/src/services/template-sources.js +29 -3
  21. package/dist/templates/template_0000/src/lib/images.js +46 -86
  22. package/dist/templates/template_0000/src/template.js +277 -53
  23. package/package.json +5 -6
  24. package/templates/template_0000/README.md +8 -52
  25. package/templates/template_0000/SKILL.md +35 -3
  26. package/templates/template_0000/package.json +3 -6
  27. package/templates/template_0000/src/lib/images.js +46 -86
  28. package/templates/template_0000/src/lib/images.ts +55 -98
  29. package/templates/template_0000/src/template-dna.js +9 -0
  30. package/templates/template_0000/src/template.js +523 -199
  31. package/templates/template_0000/src/template.ts +356 -61
  32. package/templates/template_0000/template.config.json +7 -12
  33. package/AWS_REMOTION_HANDOFF.md +0 -311
  34. package/PLATFORM_SPEC.md +0 -1039
  35. package/SKILL.director.md +0 -599
  36. package/dist/infra/cdk/bin/vidfarm-prod.js +0 -59
  37. package/dist/infra/cdk/lib/vidfarm-prod-stack.js +0 -212
  38. package/templates/template_0000/package-lock.json +0 -5505
  39. package/templates/template_0000/scripts/create-site.mjs +0 -27
  40. package/templates/template_0000/scripts/render-cloud.mjs +0 -72
package/PLATFORM_SPEC.md DELETED
@@ -1,1039 +0,0 @@
1
- # Vidfarm Platform Spec
2
-
3
- This document defines the initial platform architecture for Vidfarm running on a single Dockerized EC2 host.
4
-
5
- The goal is to let in-house developers build new video production templates quickly while the platform centrally owns auth, billing, job orchestration, customer state, and deployment.
6
-
7
- For template code distribution in v1.1, the platform should support GitHub-backed in-house template sources, but production activation must still be manual, admin-approved, and pinned to a specific commit SHA.
8
-
9
- ## Goals
10
-
11
- 1. Support multiple video production patterns behind one consistent API.
12
- 2. Treat every operation as an async job that immediately returns `job_id`.
13
- 3. Let template developers write normal TypeScript/Node code with normal npm dependencies.
14
- 4. Keep the first production deployment simple enough to run on one EC2 Docker host.
15
- 5. Preserve a clean path to later split heavy workloads into isolated workers or separate services.
16
-
17
- ## Non-Goals
18
-
19
- 1. Public marketplace for third-party user-submitted templates.
20
- 2. Multi-region or scale-to-zero serverless deployment in v1.
21
- 3. Full microservice isolation for every template.
22
- 4. Perfect cost attribution in v1. The first target is safe, conservative billing that protects gross margin.
23
-
24
- ## Core Decision
25
-
26
- Vidfarm should run as one shared platform container on EC2.
27
-
28
- Templates should be packaged as internal code modules loaded by that platform, not as separately deployed HTTP services by default.
29
- Platform runtime code should live under `src/*`, while template implementation code should live outside the platform tree under `templates/<template-folder>/*`.
30
-
31
- This gives us:
32
-
33
- 1. One auth and billing boundary.
34
- 2. One job table and one queueing system.
35
- 3. One deployment artifact for the normal case.
36
- 4. Full npm freedom for template developers.
37
- 5. A cleaner developer experience than forcing every template to become its own service.
38
-
39
- Templates may still opt into isolated execution later when they have special requirements such as native binaries, unusually high memory use, or independent scaling needs.
40
-
41
- ## Template Source Of Truth
42
-
43
- Template code may live in GitHub, but the platform should not auto-pull floating branch heads into production.
44
-
45
- Approved v1.1 release model:
46
-
47
- 1. developer keeps template code in GitHub
48
- 2. the default import branch is `production`
49
- 3. admin manually reviews the repo and chooses when to import
50
- 4. admin publishes the approved Remotion site bundle if the template needs Remotion
51
- 5. platform resolves the chosen branch head to a commit SHA
52
- 6. platform builds and certifies that exact commit
53
- 7. platform activates a pinned release record for that commit
54
- 8. admin rebuilds and redeploys the production Docker image with the approved release set
55
-
56
- Live platform state should therefore point to:
57
-
58
- - repo URL
59
- - branch name
60
- - exact commit SHA
61
- - certification result
62
- - active/inactive release state
63
-
64
- It should not point only to a floating branch name.
65
-
66
- Release authority should be centralized:
67
-
68
- - developers can push source code to GitHub
69
- - developers cannot directly publish to shared Remotion AWS
70
- - developers cannot directly promote templates into production Docker
71
- - shared AWS publish, activation, and production deployment are admin-only steps
72
-
73
- ## Initial Runtime Choice
74
-
75
- ### Production Runtime
76
-
77
- - Node.js 22
78
- - TypeScript
79
- - Hono for the HTTP API
80
- - SQLite for v1 job state and queue state
81
- - S3 for customer files and generated artifacts
82
- - Remotion Lambda for final render workloads where appropriate
83
-
84
- ### Why Hono
85
-
86
- Hono is a good fit for the control plane:
87
-
88
- 1. Small and fast.
89
- 2. Strong middleware model.
90
- 3. Good TypeScript ergonomics.
91
- 4. Easy to keep the HTTP layer thin while most complexity lives in jobs and template execution.
92
-
93
- ### Why Not Bun for v1 Runtime
94
-
95
- Bun is not forbidden, but it should not be the production baseline for v1.
96
-
97
- Reasons:
98
-
99
- 1. The platform already assumes Node-oriented Docker execution.
100
- 2. AI SDK compatibility and native package behavior are more predictable on Node.
101
- 3. Remotion and adjacent tooling are safer on the Node compatibility path.
102
- 4. The hard part of this system is orchestration correctness, not JavaScript runtime speed.
103
-
104
- If desired, Bun can be evaluated later as a local development runner or for specific internal tools.
105
-
106
- ## Supported Production Patterns
107
-
108
- The platform must support all of the following under one model:
109
-
110
- 1. Pure AI multi-stage production.
111
- 2. Remotion render pipelines.
112
- 3. Hybrid research plus render pipelines.
113
- 4. Animated storytelling pipelines.
114
-
115
- The common abstraction is:
116
-
117
- 1. Customer hits a template endpoint.
118
- 2. Platform validates auth and input.
119
- 3. Platform creates an async job.
120
- 4. Worker executes the requested operation.
121
- 5. Customer polls job state or receives a webhook.
122
-
123
- ## Architectural Overview
124
-
125
- ```txt
126
- Client
127
- -> Hono API
128
- -> Auth / Billing / Template Registry / Job Creation
129
- -> SQLite (jobs, logs, queue, rate-limit state, customer metadata pointers)
130
- -> Worker Loop in same container
131
- -> External providers (OpenAI, Gemini, OpenRouter, Perplexity, etc.)
132
- -> S3 (workspace files, stage artifacts, final outputs)
133
- -> Remotion Lambda (when render pipeline requires it)
134
- ```
135
-
136
- Initial deployment is one process image with two logical responsibilities:
137
-
138
- 1. API server
139
- 2. Background worker / dispatcher
140
-
141
- These can run in the same container in v1. If needed later, they can be split into separate process types using the same codebase.
142
-
143
- ## API Principles
144
-
145
- All template operations are async-first.
146
-
147
- Even if a task could finish quickly, the platform should still prefer job creation so the customer sees one consistent model.
148
-
149
- ### Base Path
150
-
151
- ```txt
152
- /templates/:templateId/*
153
- ```
154
-
155
- ### Core Endpoints
156
-
157
- ```txt
158
- GET /templates/:templateId
159
- GET /templates/:templateId/about/*
160
- GET /templates/:templateId/skill
161
- POST /templates/:templateId/config
162
- POST /templates/:templateId/operations/:operationName
163
- GET /templates/:templateId/jobs
164
- GET /templates/:templateId/jobs/:jobId
165
- GET /templates/:templateId/jobs/:jobId/logs
166
- POST /templates/:templateId/jobs/:jobId/cancel
167
- ```
168
-
169
- ### Request Headers
170
-
171
- ```txt
172
- vidfarm-user-id: string
173
- vidfarm-api-key: string
174
- ```
175
-
176
- ### Job Creation Request Shape
177
-
178
- ```json
179
- {
180
- "tracer": "client-generated-string",
181
- "payload": {}
182
- }
183
- ```
184
-
185
- ### Job Creation Response Shape
186
-
187
- ```json
188
- {
189
- "job_id": "job_xxx",
190
- "tracer": "client-generated-string",
191
- "status": "queued"
192
- }
193
- ```
194
-
195
- ### Template Metadata Response Shape
196
-
197
- ```json
198
- {
199
- "id": "4c7a7e1a-7f35-4f30-9f86-9c8a63c7f2db",
200
- "slug_id": "template_0000",
201
- "version": "1.0.0",
202
- "title": "Template 0000",
203
- "description": "Short-form slideshow pipeline",
204
- "viral_dna": "Fast TikTok slideshow hooks with mobile-native pacing.",
205
- "preview_media": [
206
- "https://api.example.com/templates/4c7a7e1a-7f35-4f30-9f86-9c8a63c7f2db/about/preview-01.jpg"
207
- ],
208
- "link_to_original": "https://www.tiktok.com/@example/video/1234567890",
209
- "skill_url": "https://api.example.com/templates/4c7a7e1a-7f35-4f30-9f86-9c8a63c7f2db/skill",
210
- "operations": [
211
- {
212
- "name": "create_slideshow",
213
- "description": "Generate slideshow frames.",
214
- "providerHint": "openrouter"
215
- }
216
- ]
217
- }
218
- ```
219
-
220
- `GET /templates/:templateId/about/*` should expose template metadata assets such as preview images or videos. In storage, these assets should use the stable logical prefix `templates/:templateId/about/*`, whether the backing store is local disk or S3.
221
-
222
- Template definitions should expose `preview_media` as absolute HTTPS URLs. When the backing store is S3, those entries should point directly at the S3 object for assets stored under `templates/:templateId/about/*`.
223
-
224
- ### Job List Filtering
225
-
226
- `GET /templates/:templateId/jobs` should support:
227
-
228
- - `tracer`
229
- - `start_time`
230
- - `end_time`
231
- - `limit`
232
-
233
- `GET /templates/:templateId/jobs/:jobId/logs` should use the same time window language:
234
-
235
- - `start_time`
236
- - `end_time`
237
- - `limit`
238
-
239
- `logs_from` is deprecated and should not be used in the standard.
240
-
241
- `GET /templates/:templateId` is the template "about" response and must also return:
242
-
243
- - `slug_id: string`
244
- - `title: string`
245
- - `viral_dna: string`
246
- - `preview_media: string[]`
247
- - `link_to_original: string`
248
-
249
- In the template definition standard, this response `title` and `description` should be sourced from `template.about.title` and `template.about.description`, not top-level template fields.
250
-
251
- ## Template Model
252
-
253
- Templates are normal TypeScript packages with unrestricted internal structure and should live under a repo-level `templates/` directory, for example `templates/template_0000/*`.
254
-
255
- They may:
256
-
257
- 1. Import npm libraries.
258
- 2. Define helper modules.
259
- 3. Bundle prompt files.
260
- 4. Include Remotion compositions.
261
- 5. Call provider SDKs.
262
- 6. Run arbitrary internal orchestration logic.
263
-
264
- The framework should not force templates into a single-file callback model.
265
-
266
- ### Template Contract
267
-
268
- The external API surface should be defined as operations, not just raw stage names.
269
-
270
- Suggested shape:
271
-
272
- ```ts
273
- export const myTemplate = defineTemplate({
274
- id: "123e4567-e89b-42d3-a456-426614174000",
275
- slugId: "ugc_voiceover_v1",
276
- version: "1.0.0",
277
- about: {
278
- title: "UGC Voiceover V1",
279
- description: "Short-form UGC voiceover pipeline",
280
- viral_dna: "Fast hooks, native pacing, and repeatable creator-style framing.",
281
- preview_media: ["https://cdn.example.com/templates/ugc-voiceover-v1/about/preview-01.mp4"],
282
- link_to_original: "https://www.tiktok.com/@example/video/1234567890"
283
- },
284
- configSchema: z.object({
285
- defaultProvider: z.enum(["openai", "gemini", "openrouter", "perplexity"]).default("openai")
286
- }),
287
-
288
- operations: {
289
- scaffold: {
290
- description: "Generate a script scaffold.",
291
- inputSchema: z.object({
292
- topic: z.string()
293
- }),
294
- workflow: "scaffoldWorkflow",
295
- providerHint: "openai",
296
- webhookSupport: true
297
- },
298
- render: {
299
- description: "Submit final render work.",
300
- inputSchema: z.object({
301
- storyboardId: z.string()
302
- }),
303
- workflow: "renderWorkflow",
304
- webhookSupport: true
305
- }
306
- },
307
-
308
- jobs: {
309
- async scaffoldWorkflow(ctx, input) {
310
- return {
311
- progress: 1,
312
- output: {}
313
- };
314
- },
315
- async renderWorkflow(ctx, input) {
316
- return {
317
- progress: 1,
318
- output: {}
319
- };
320
- }
321
- }
322
- });
323
- ```
324
-
325
- ### Why This Contract
326
-
327
- This separates:
328
-
329
- 1. Public API entrypoints.
330
- 2. Internal workflow implementation.
331
- 3. Template metadata and validation.
332
-
333
- It gives template developers full control over their workflow logic while keeping the platform contract stable.
334
-
335
- ## Template Execution Context
336
-
337
- Each operation or job should receive a framework-owned context object.
338
-
339
- Suggested capabilities:
340
-
341
- ```ts
342
- interface TemplateJobContext {
343
- env: "development" | "production";
344
- customer: CustomerContext;
345
- templateConfig: Record<string, unknown>;
346
- logger: {
347
- debug(message: string, metadata?: Record<string, unknown>): void;
348
- info(message: string, metadata?: Record<string, unknown>): void;
349
- warn(message: string, metadata?: Record<string, unknown>): void;
350
- error(message: string, metadata?: Record<string, unknown>): void;
351
- progress(progress: number, message: string, metadata?: Record<string, unknown>): void;
352
- };
353
- jobs: {
354
- enqueueChild(input: {
355
- operationName: string;
356
- workflowName: string;
357
- payload: Record<string, unknown>;
358
- providerHint?: ProviderType;
359
- }): Promise<{ jobId: string }>;
360
- };
361
- storage: {
362
- putJson(key: string, value: unknown): Promise<{ key: string; url: string | null }>;
363
- putText(key: string, value: string, contentType?: string): Promise<{ key: string; url: string | null }>;
364
- putBuffer(
365
- key: string,
366
- value: Uint8Array,
367
- options?: { contentType?: string; kind?: string; metadata?: Record<string, unknown> }
368
- ): Promise<{ key: string; url: string | null }>;
369
- getPublicUrl(key: string): string | null;
370
- };
371
- billing: {
372
- record(input: {
373
- type: "ai_generation" | "render" | "storage_write" | "cpu_estimate";
374
- costUsd: number;
375
- chargeUsd?: number;
376
- metadata?: Record<string, unknown>;
377
- }): Promise<void>;
378
- };
379
- providers: {
380
- generateText(input: {
381
- provider: ProviderType;
382
- model: string;
383
- prompt: string;
384
- temperature?: number;
385
- }): Promise<{ text: string }>;
386
- generateImage(input: {
387
- provider: ProviderType;
388
- model: string;
389
- prompt: string;
390
- size?: string;
391
- }): Promise<{ bytes: Uint8Array; contentType: string; revisedPrompt: string | null }>;
392
- analyzeImageLayout(input: {
393
- provider: ProviderType;
394
- model: string;
395
- imageUrl: string;
396
- overlayText: string;
397
- }): Promise<{
398
- zone: "top" | "center" | "bottom";
399
- align: "left" | "center" | "right";
400
- maxWidthPercent: number;
401
- justification: string;
402
- }>;
403
- };
404
- remotion: {
405
- render(input: {
406
- compositionId: string;
407
- serveUrl?: string;
408
- entryPoint?: string;
409
- outputKey?: string;
410
- inputProps: Record<string, unknown>;
411
- }): Promise<{ renderId: string; outputUrl: string | null; metadata: Record<string, unknown> }>;
412
- };
413
- }
414
- ```
415
-
416
- Framework-owned context capabilities should include:
417
-
418
- 1. Resolving customer AI keys safely.
419
- 2. Writing artifacts through a stable storage prefix.
420
- 3. Enqueuing child jobs.
421
- 4. Emitting logs and progress.
422
- 5. Recording billable events.
423
- 6. Submitting downstream renders through a Remotion adapter.
424
- 7. Calling provider adapters through centralized rate-limit enforcement.
425
-
426
- ## Environment Behavior
427
-
428
- The platform must clearly distinguish development from production.
429
-
430
- ### Development
431
-
432
- Developer-owned API keys from local `.env` are allowed.
433
-
434
- This is for:
435
-
436
- 1. Local testing.
437
- 2. Template development.
438
- 3. Dry-running internal workflows before deployment.
439
-
440
- Template authors should also be able to run the same certification harness locally through the developer CLI before admin review.
441
-
442
- ### Production
443
-
444
- The platform must use customer-owned provider keys stored in the customer profile when the template requests external AI inference on behalf of that customer.
445
-
446
- Platform-controlled keys may still exist for:
447
-
448
- 1. Platform-level fallback behavior.
449
- 2. Internal moderation or diagnostics.
450
- 3. Emergency operations.
451
-
452
- But customer-billed workloads should default to customer-owned keys when available.
453
-
454
- Admin-only template source import and activation are allowed in production. Developer template changes should not go live until an admin explicitly imports and activates a pinned release.
455
-
456
- ## Developer And Admin Auth Model
457
-
458
- The platform’s API auth remains customer-style `vidfarm-user-id` plus `vidfarm-api-key`, but the platform must also support an admin allowlist for template-source management endpoints.
459
-
460
- Minimum v1.1 rule:
461
-
462
- 1. all normal template execution uses standard platform auth
463
- 2. template source registration, import, and activation endpoints are admin-only
464
- 3. admin authorization may begin as an allowlist of trusted emails
465
-
466
- This keeps v1 simple while still distinguishing runtime customers from internal release operators.
467
-
468
- ## Customer Profile Model
469
-
470
- Each customer profile should support:
471
-
472
- 1. Multiple provider API keys.
473
- 2. Multiple keys per provider.
474
- 3. Workspace file storage references.
475
- 4. Webhook destinations.
476
- 5. Billing preferences and limits.
477
-
478
- Suggested provider key record:
479
-
480
- ```ts
481
- interface CustomerProviderKey {
482
- id: string;
483
- provider: "openai" | "gemini" | "openrouter" | "perplexity";
484
- encryptedSecret: string;
485
- label?: string;
486
- status: "active" | "paused" | "rate_limited" | "invalid";
487
- lastUsedAt?: string;
488
- cooldownUntil?: string;
489
- }
490
- ```
491
-
492
- Customers may store multiple keys for the same provider. The platform should treat those keys as a small pooled resource that jobs must acquire before making outbound AI requests.
493
-
494
- ## Queueing and Async Jobs
495
-
496
- The platform is async-native.
497
-
498
- Every operation should create a job record and return immediately.
499
-
500
- ### Job State
501
-
502
- Suggested states:
503
-
504
- ```txt
505
- queued
506
- running
507
- waiting_for_child
508
- waiting_for_human
509
- succeeded
510
- failed
511
- cancelled
512
- ```
513
-
514
- ### Job Data
515
-
516
- Each job record should track:
517
-
518
- 1. `job_id`
519
- 2. `template_id`
520
- 3. `operation_name`
521
- 4. `tracer`
522
- 5. `status`
523
- 6. `payload`
524
- 7. `result`
525
- 8. `error`
526
- 9. `progress`
527
- 10. `webhook_url`
528
- 11. `parent_job_id`
529
- 12. `customer_id`
530
- 13. `reservation_id` or billing reference
531
- 14. timestamps
532
-
533
- ### Logs
534
-
535
- Logs must be stored as structured job events, not just raw text.
536
-
537
- Each event should support:
538
-
539
- 1. timestamp
540
- 2. level
541
- 3. message
542
- 4. machine-readable metadata
543
- 5. progress update
544
- 6. artifact references
545
-
546
- This lets the client render a live job timeline later.
547
-
548
- ## SQLite-Backed AI Key Queue
549
-
550
- SQLite is not only the v1 job store. It is also the coordination layer for customer AI API key usage.
551
-
552
- The intended model is:
553
-
554
- 1. A job becomes runnable.
555
- 2. The worker identifies which provider and model the next step requires.
556
- 3. The worker attempts to lease one eligible customer key from SQLite.
557
- 4. If a lease is granted, the worker performs the outbound API call.
558
- 5. The worker records usage, updates cooldown state if needed, and releases the lease.
559
-
560
- This gives the platform a lightweight queue for AI key access without needing Redis, SQS, or a separate lock service.
561
-
562
- ### Why SQLite Is Acceptable in v1
563
-
564
- This is a reasonable design if all of the following remain true:
565
-
566
- 1. One EC2 host is the active source of truth.
567
- 2. The platform runs a moderate number of worker loops.
568
- 3. SQLite is configured in WAL mode.
569
- 4. Lease acquisition is done transactionally.
570
- 5. Jobs are retry-safe and can be rescheduled when no key is available.
571
-
572
- ### Core Idea
573
-
574
- The AI key queue is represented by a combination of:
575
-
576
- 1. Customer provider key records.
577
- 2. Active key lease records.
578
- 3. Key usage and error events.
579
- 4. Cooldown timestamps after rate-limit responses.
580
-
581
- There is no separate message broker for API key access. Eligibility is derived from database state at lease time.
582
-
583
- ### Suggested Tables
584
-
585
- ```sql
586
- create table customer_provider_keys (
587
- id text primary key,
588
- customer_id text not null,
589
- provider text not null,
590
- label text,
591
- encrypted_secret text not null,
592
- status text not null,
593
- weight integer not null default 1,
594
- last_used_at text,
595
- cooldown_until text,
596
- disabled_reason text,
597
- created_at text not null,
598
- updated_at text not null
599
- );
600
-
601
- create table provider_key_leases (
602
- key_id text primary key,
603
- lease_token text not null,
604
- worker_id text not null,
605
- job_id text not null,
606
- leased_at text not null,
607
- expires_at text not null
608
- );
609
-
610
- create table provider_key_usage_events (
611
- id text primary key,
612
- key_id text not null,
613
- job_id text not null,
614
- provider text not null,
615
- model text,
616
- event_type text not null,
617
- input_tokens integer,
618
- output_tokens integer,
619
- cost_usd real,
620
- created_at text not null
621
- );
622
- ```
623
-
624
- Optional model capability table:
625
-
626
- ```sql
627
- create table provider_key_capabilities (
628
- key_id text not null,
629
- model text not null,
630
- primary key (key_id, model)
631
- );
632
- ```
633
-
634
- ### Lease Acquisition
635
-
636
- Workers must acquire a key lease before making any outbound provider request on behalf of a customer.
637
-
638
- Lease acquisition should happen inside a transaction using `BEGIN IMMEDIATE`.
639
-
640
- The query should exclude keys that are:
641
-
642
- 1. Not active.
643
- 2. In cooldown.
644
- 3. Already leased and whose lease has not expired.
645
- 4. Incompatible with the requested provider or model.
646
-
647
- Preferred selection order in v1:
648
-
649
- 1. Least recently used eligible key.
650
- 2. Higher weight first when weights differ.
651
-
652
- Illustrative flow:
653
-
654
- ```txt
655
- BEGIN IMMEDIATE
656
- 1. Select one eligible key
657
- 2. Insert active lease row
658
- 3. Commit
659
- ```
660
-
661
- Illustrative query shape:
662
-
663
- ```sql
664
- select k.id
665
- from customer_provider_keys k
666
- left join provider_key_leases l
667
- on l.key_id = k.id
668
- and l.expires_at > datetime('now')
669
- where k.customer_id = ?
670
- and k.provider = ?
671
- and k.status = 'active'
672
- and (k.cooldown_until is null or k.cooldown_until <= datetime('now'))
673
- and l.key_id is null
674
- order by k.last_used_at asc nulls first, k.weight desc
675
- limit 1;
676
- ```
677
-
678
- If a key is found, the worker inserts a lease row with a short expiry such as 30 to 90 seconds.
679
-
680
- If no key is found, the worker must not busy-loop. It should reschedule the job for a future run.
681
-
682
- ### Lease Semantics
683
-
684
- Lease rows should contain:
685
-
686
- 1. `key_id`
687
- 2. `lease_token`
688
- 3. `worker_id`
689
- 4. `job_id`
690
- 5. `leased_at`
691
- 6. `expires_at`
692
-
693
- The lease token should be required for release or extension so one worker cannot accidentally release another worker's lease.
694
-
695
- ### Lease Expiry and Recovery
696
-
697
- If a worker crashes, its lease should naturally expire and the key should become eligible again.
698
-
699
- For long-running requests, the platform may optionally support lease extension heartbeats. This is useful when the provider call or downstream processing can exceed the default lease duration.
700
-
701
- ### Success Path
702
-
703
- After a successful provider call, the worker should:
704
-
705
- 1. Record a usage event.
706
- 2. Update `last_used_at`.
707
- 3. Clear any temporary rate-limit status when appropriate.
708
- 4. Release the lease.
709
-
710
- ### Rate-Limit Path
711
-
712
- If the provider returns a rate-limit response, the worker should:
713
-
714
- 1. Record a `rate_limit` usage event.
715
- 2. Put the key into cooldown by setting `cooldown_until`.
716
- 3. Release the lease.
717
- 4. Reschedule the job.
718
-
719
- Cooldown duration may initially be determined by:
720
-
721
- 1. Provider response headers if available.
722
- 2. Provider-specific backoff policy.
723
- 3. Conservative defaults when headers are absent.
724
-
725
- ### Auth Failure Path
726
-
727
- If the provider reports invalid credentials, the worker should:
728
-
729
- 1. Record an `auth_error` usage event.
730
- 2. Mark the key `invalid`.
731
- 3. Release the lease.
732
- 4. Retry with another key if one exists.
733
- 5. Fail the job clearly if no valid key remains.
734
-
735
- ### Scheduler Behavior
736
-
737
- The scheduler should treat jobs as runnable only when both are true:
738
-
739
- 1. The job itself is ready to run.
740
- 2. A compatible provider key can likely be leased now or soon.
741
-
742
- Recommended loop:
743
-
744
- 1. Fetch queued jobs ordered by `run_after`.
745
- 2. Attempt key lease acquisition for the next provider-dependent step.
746
- 3. If lease succeeds, run the step and mark job `running`.
747
- 4. If lease fails, move `run_after` forward instead of spinning.
748
- 5. Retry later.
749
-
750
- This is what makes the AI key queue effectively a SQLite-backed coordination system rather than a separate infrastructure dependency.
751
-
752
- ### Observability
753
-
754
- The platform should emit logs and metrics for:
755
-
756
- 1. Lease acquisition success rate.
757
- 2. Lease wait time.
758
- 3. Key cooldown frequency by provider.
759
- 4. Key invalidation frequency.
760
- 5. Job deferrals caused by unavailable keys.
761
-
762
- These signals will tell us when SQLite remains sufficient and when key coordination needs a stronger backend.
763
-
764
- ## Rate Limiting and Provider Routing
765
-
766
- Customer keys are not interchangeable infinite resources.
767
-
768
- The platform must route AI calls through a provider layer that understands:
769
-
770
- 1. Provider type.
771
- 2. Model name.
772
- 3. Key-level rate limits.
773
- 4. Backoff behavior.
774
- 5. Retry policy.
775
- 6. Temporary key disablement after provider errors.
776
-
777
- ### Initial Approach
778
-
779
- Use SQLite-backed leasing for queue and key selection in v1.
780
-
781
- This is acceptable if:
782
-
783
- 1. One EC2 host is the source of truth.
784
- 2. SQLite is configured in WAL mode.
785
- 3. Concurrency expectations remain moderate.
786
- 4. Jobs are idempotent enough to survive retries.
787
-
788
- ### Future Upgrade Path
789
-
790
- If platform concurrency or reliability needs outgrow SQLite, move the job and rate-limit state to Postgres before adopting many-worker horizontal scale.
791
-
792
- ## Billing Model
793
-
794
- The platform owns billing enforcement, but templates should emit billing events through framework APIs.
795
-
796
- Templates should not hand-roll their own pricing logic in arbitrary ways.
797
-
798
- ### Billing Principle
799
-
800
- Bill conservatively enough to avoid cloud-cost loss.
801
-
802
- Current target:
803
-
804
- ```txt
805
- customer_charge_usd ~= platform_cost_usd * 2
806
- ```
807
-
808
- This is a margin buffer, not a final finance system.
809
-
810
- ### Billing Event Types
811
-
812
- The framework should support at least:
813
-
814
- 1. External AI token usage.
815
- 2. Remotion render usage.
816
- 3. EC2 / CPU / memory approximation.
817
- 4. Storage writes.
818
- 5. Data egress or expensive file processing when relevant.
819
-
820
- ### Billing API for Templates
821
-
822
- Templates should call framework helpers like:
823
-
824
- ```ts
825
- await ctx.billing.record({
826
- type: "ai_generation",
827
- provider: "openai",
828
- model: "gpt-4.1",
829
- estimatedCostUsd: 0.024,
830
- metadata: {},
831
- });
832
- ```
833
-
834
- The framework should translate these events into customer-facing charges.
835
-
836
- ## Webhooks
837
-
838
- Every job may include an optional webhook destination.
839
-
840
- The platform should emit webhook events for:
841
-
842
- 1. `job.queued`
843
- 2. `job.running`
844
- 3. `job.progress`
845
- 4. `job.succeeded`
846
- 5. `job.failed`
847
- 6. `job.cancelled`
848
-
849
- Webhook delivery should be:
850
-
851
- 1. Signed
852
- 2. Retried with backoff
853
- 3. Persisted as delivery attempts
854
-
855
- ## File and Artifact Storage
856
-
857
- S3 is the system of record for customer-uploaded files and large generated outputs.
858
-
859
- ### Customer Workspace Convention
860
-
861
- Suggested logical prefix:
862
-
863
- ```txt
864
- s3://bucket/customers/:customerId/workspace/...
865
- ```
866
-
867
- ### Job Artifact Convention
868
-
869
- Suggested logical prefix:
870
-
871
- ```txt
872
- s3://bucket/templates/:templateId/users/:userId/jobs/:jobId/...
873
- ```
874
-
875
- Artifacts may include:
876
-
877
- 1. Prompt snapshots
878
- 2. Storyboards
879
- 3. Preview images
880
- 4. Audio assets
881
- 5. Subtitle files
882
- 6. Render manifests
883
- 7. Final video outputs
884
-
885
- This prefix is for template-generated outputs and intermediate artifacts. Keep it stable across storage backends so local development mirrors production object layout.
886
-
887
- ### Template Metadata Convention
888
-
889
- Suggested logical prefix:
890
-
891
- ```txt
892
- s3://bucket/templates/:templateId/about/...
893
- ```
894
-
895
- This prefix is for framework-owned template metadata assets such as preview images, preview videos, and other public "about" media referenced by `GET /templates/:templateId`.
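The three prefix conventions above can be captured in a few key builders, so the same object keys are produced whether the backend is S3 or a local directory. This is a sketch: the function names and the `JobRef` type are hypothetical, and rejecting `.` and `..` segments is an added assumption to keep callers inside their prefix.

```ts
// Hypothetical reference to a job; field names follow the path placeholders above.
interface JobRef {
  templateId: string;
  userId: string;
  jobId: string;
}

// Normalizes a relative path and rejects traversal segments.
function safeRelativePath(path: string): string {
  const parts = path.split("/").filter((p) => p.length > 0);
  if (parts.some((p) => p === "." || p === "..")) {
    throw new Error(`unsafe path: ${path}`);
  }
  return parts.join("/");
}

// customers/:customerId/workspace/...
function workspaceKey(customerId: string, path: string): string {
  return `customers/${customerId}/workspace/${safeRelativePath(path)}`;
}

// templates/:templateId/users/:userId/jobs/:jobId/...
function jobArtifactKey(job: JobRef, path: string): string {
  return `templates/${job.templateId}/users/${job.userId}/jobs/${job.jobId}/${safeRelativePath(path)}`;
}

// templates/:templateId/about/...
function templateAboutKey(templateId: string, path: string): string {
  return `templates/${templateId}/about/${safeRelativePath(path)}`;
}
```

For example, `jobArtifactKey({ templateId: "template_0000", userId: "u1", jobId: "j1" }, "renders/final.mp4")` yields `templates/template_0000/users/u1/jobs/j1/renders/final.mp4`.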
896
-
897
- ## Remotion Integration
898
-
899
- Remotion should be treated as a specialized downstream execution path, not the center of the platform.
900
-
901
- Templates can invoke Remotion through a framework adapter.
902
-
903
- Suggested flow:
904
-
905
- 1. Template job prepares structured render input.
906
- 2. Template writes required assets through framework storage.
907
- 3. Template calls `ctx.remotion.render(...)`.
908
- 4. The adapter renders locally or via Lambda depending on environment config.
909
- 5. Final artifact is attached back to the parent job result.
910
-
911
- This keeps Remotion as one implementation detail among several, rather than forcing the platform to be Remotion-first.
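One possible shape for such an adapter is sketched below. Only the `ctx.remotion.render(...)` entry point and the local-versus-Lambda switch come from the flow above; the `REMOTION_RENDER_MODE` variable, the types, and `createRemotionAdapter` are hypothetical, and the backends are injected as plain functions rather than calling any real Remotion API.

```ts
type RenderMode = "local" | "lambda";

// Hypothetical structured render input prepared by the template job.
interface RenderInput {
  compositionId: string;
  inputProps: Record<string, unknown>;
  outputKey: string; // storage key where the final video should land
}

interface RenderResult {
  outputKey: string;
  mode: RenderMode;
}

type RenderBackend = (input: RenderInput) => Promise<void>;

// Assumption: the mode is read from a REMOTION_RENDER_MODE variable, defaulting to local.
function resolveRenderMode(env: Record<string, string | undefined>): RenderMode {
  const value = env.REMOTION_RENDER_MODE ?? "local";
  if (value !== "local" && value !== "lambda") {
    throw new Error(`unknown render mode: ${value}`);
  }
  return value;
}

// Builds the object that would be exposed to templates as ctx.remotion.
function createRemotionAdapter(
  mode: RenderMode,
  backends: Record<RenderMode, RenderBackend>,
) {
  return {
    async render(input: RenderInput): Promise<RenderResult> {
      await backends[mode](input);
      return { outputKey: input.outputKey, mode };
    },
  };
}
```

Because templates only see `render`, switching between local and Lambda rendering stays a deployment concern rather than a template concern.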
912
-
913
- ## Isolation Policy
914
-
915
- Default mode is shared in-process execution inside the main platform runtime.
916
-
917
- ### Shared Execution Is Correct By Default
918
-
919
- Use shared execution when the template:
920
-
921
- 1. Uses standard Node dependencies.
922
- 2. Fits within normal memory and CPU budgets.
923
- 3. Does not need a custom OS image.
924
- 4. Can safely coexist with other templates.
925
-
926
- ### Isolated Execution Is an Escape Hatch
927
-
928
- Allow a template to declare isolated execution later if it needs:
929
-
930
- 1. Heavy FFmpeg or native binary workloads.
931
- 2. Custom Chromium or system library requirements.
932
- 3. A stricter security or resource boundary.
933
- 4. Independent scaling or scheduling.
934
-
935
- Even then, the main platform should still own:
936
-
937
- 1. Auth
938
- 2. Billing
939
- 3. Job creation
940
- 4. Customer state
941
- 5. Webhook delivery
942
-
943
- ## Suggested Repository Shape
944
-
945
- ```txt
946
- /src
947
- /templates/template_0000
948
- /templates/template_0001
949
- /AWS_REMOTION_HANDOFF.md
950
- /PLATFORM_SPEC.md
951
- /SKILL.director.md
952
- /SKILL.developer.md
953
- ```
954
-
955
- ## Suggested Internal Components
956
-
957
- ### Platform API
958
-
959
- Responsibilities:
960
-
961
- 1. Request validation
962
- 2. Auth
963
- 3. Template lookup
964
- 4. Config updates
965
- 5. Job creation
966
- 6. Job status reads
967
- 7. Webhook registration
968
-
969
- ### Worker / Dispatcher
970
-
971
- Responsibilities:
972
-
973
- 1. Pull queued jobs
974
- 2. Acquire provider key leases
975
- 3. Execute template jobs
976
- 4. Persist logs and artifacts
977
- 5. Update billing
978
- 6. Deliver completion webhooks
979
-
980
- ### Template Registry
981
-
982
- Responsibilities:
983
-
984
- 1. Register approved in-house templates
985
- 2. Expose metadata
986
- 3. Resolve operations and jobs
987
- 4. Enforce version compatibility
988
-
989
- ## Security Notes
990
-
991
- 1. Customer provider keys must be encrypted at rest.
992
- 2. API keys must be stored only as hashes and verified by hash comparison, never kept in plaintext.
993
- 3. Template code is trusted internal code in v1, not untrusted tenant code.
994
- 4. Webhook signatures must be mandatory.
995
- 5. Customer file access must always be scoped by customer identity.
996
- 6. Template source imports must pin an exact commit SHA before activation.
997
- 7. Templates must include a `SKILL.md` file so customer AI agents have a framework-owned usage contract.
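To make the API key note concrete, here is a minimal sketch of issuing and verifying a key so that only its hash is persisted. It assumes keys are long random strings, for which a plain SHA-256 digest is a reasonable choice; the `vf_` prefix and all function names are hypothetical.

```ts
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

// Hex SHA-256 of the key; this is the only value that would be stored.
function hashApiKey(apiKey: string): string {
  return createHash("sha256").update(apiKey).digest("hex");
}

// Generates a new key. The plaintext is shown to the customer once and never stored.
function issueApiKey(): { apiKey: string; storedHash: string } {
  const apiKey = `vf_${randomBytes(32).toString("base64url")}`; // "vf_" prefix is hypothetical
  return { apiKey, storedHash: hashApiKey(apiKey) };
}

// Compares the hash of the presented key with the stored hash in constant time.
function verifyApiKey(presented: string, storedHash: string): boolean {
  const a = Buffer.from(hashApiKey(presented), "hex");
  const b = Buffer.from(storedHash, "hex");
  return a.length === b.length && timingSafeEqual(a, b);
}
```

Low-entropy secrets such as passwords would need a slow, salted hash instead; the fast digest here relies on the key being 32 random bytes.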
998
-
999
- ## Operational Notes for v1
1000
-
1001
- 1. Run one Docker image on one EC2 host.
1002
- 2. Keep API and worker in the same deployable unit initially.
1003
- 3. Use SQLite in WAL mode.
1004
- 4. Back up SQLite and treat it as transitional infrastructure.
1005
- 5. Store all large artifacts in S3, not on container disk.
1006
- 6. Assume template code is centrally reviewed before deployment.
1007
- 7. Not every template requires Remotion. Remotion validation should only apply to templates that actually call the Remotion adapter.
1008
-
1009
- ## Template Certification Minimum
1010
-
1011
- A template cannot be activated unless all of the following checks pass:
1012
-
1013
- 1. Template metadata contract is valid.
1014
- 2. Operation-to-workflow references are valid.
1015
- 3. A template-local `SKILL.md` file exists.
1016
- 4. Every operation defines a smoke-test payload.
1017
- 5. The smoke-test harness passes.
1018
- 6. The release is associated with a pinned commit SHA.
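The six checks above could be gathered into one activation gate, sketched below. The `CertificationInput` shape and `certificationFailures` are hypothetical, and treating a pinned commit as a full 40-character hex SHA is an assumption.

```ts
// Hypothetical summary of a template release as seen by the certification step.
interface CertificationInput {
  metadataValid: boolean;
  workflows: string[];
  operations: { name: string; workflow: string; smokeTestPayload?: unknown }[];
  hasSkillMd: boolean;
  smokeTestsPassed: boolean;
  commitSha?: string;
}

// Returns every failed check; an empty array means the template may be activated.
function certificationFailures(t: CertificationInput): string[] {
  const failures: string[] = [];
  if (!t.metadataValid) failures.push("metadata contract invalid");
  for (const op of t.operations) {
    if (!t.workflows.includes(op.workflow)) {
      failures.push(`operation ${op.name} references unknown workflow ${op.workflow}`);
    }
    if (op.smokeTestPayload === undefined) {
      failures.push(`operation ${op.name} has no smoke-test payload`);
    }
  }
  if (!t.hasSkillMd) failures.push("SKILL.md missing");
  if (!t.smokeTestsPassed) failures.push("smoke-test harness failed");
  // Assumption: a pinned release means a full 40-character hex commit SHA.
  if (!/^[0-9a-f]{40}$/.test(t.commitSha ?? "")) {
    failures.push("release not pinned to a full commit SHA");
  }
  return failures;
}
```

Returning the full list of failures, rather than stopping at the first, gives template authors one complete report per certification run.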
1019
-
1020
- ## Known Limits of v1
1021
-
1022
- 1. SQLite is a reasonable starting point but not the final queueing backend for large-scale concurrency.
1023
- 2. Shared in-process template execution is simpler operationally but weaker as a hard isolation boundary.
1024
- 3. EC2 cost attribution will begin as an approximation combined with provider-cost tracking, not perfect real-time infrastructure metering.
1025
-
1026
- ## Recommendation Summary
1027
-
1028
- The recommended initial Vidfarm platform is:
1029
-
1030
- 1. One Node.js 22 Docker container on EC2.
1031
- 2. Hono as the HTTP API layer.
1032
- 3. SQLite as the initial jobs and rate-limit store.
1033
- 4. S3 for workspace and artifact storage.
1034
- 5. Remotion Lambda as a downstream rendering path.
1035
- 6. Templates implemented as normal internal TypeScript packages with full npm access.
1036
- 7. Public API defined as async operations that enqueue jobs.
1037
- 8. Optional isolated template execution added later only when justified by concrete workload needs.
1038
-
1039
- This is the simplest architecture that matches the product goals without prematurely turning each template into a separate service.