npm - @jiggai/recipes - Versions diffs - 0.4.34 → 0.4.35 - Mend

@jiggai/recipes 0.4.34 → 0.4.35

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (23)

package/docs/ARCHITECTURE.md +66 -1
package/docs/COMMANDS.md +12 -0
package/docs/MEDIA_DRIVERS.md +175 -0
package/docs/MEDIA_GENERATION.md +553 -0
package/docs/TEMPLATE_VARIABLES.md +196 -0
package/docs/WORKFLOW_APPROVALS.md +334 -0
package/docs/WORKFLOW_NODES.md +147 -0
package/docs/WORKFLOW_RUNS_FILE_FIRST.md +2 -0
package/index.ts +9 -0
package/openclaw.plugin.json +1 -1
package/package.json +1 -1
package/src/handlers/media-drivers.ts +49 -0
package/src/lib/workflows/media-drivers/generic.driver.ts +128 -0
package/src/lib/workflows/media-drivers/index.ts +22 -0
package/src/lib/workflows/media-drivers/kling-video.driver.ts +110 -0
package/src/lib/workflows/media-drivers/luma-video.driver.ts +59 -0
package/src/lib/workflows/media-drivers/nano-banana-pro.driver.ts +70 -0
package/src/lib/workflows/media-drivers/openai-image-gen.driver.ts +60 -0
package/src/lib/workflows/media-drivers/registry.ts +96 -0
package/src/lib/workflows/media-drivers/runway-video.driver.ts +59 -0
package/src/lib/workflows/media-drivers/types.ts +50 -0
package/src/lib/workflows/media-drivers/utils.ts +149 -0
package/src/lib/workflows/workflow-worker.ts +33 -137

package/docs/ARCHITECTURE.md CHANGED

@@ -89,10 +89,75 @@ flowchart LR
     HTickets --> LibTickets
 ```
-## Key decisions
+## Media Drivers Subsystem
+ClawRecipes implements AI media generation through a driver-based architecture:
+### Architecture
+```
+src/lib/workflows/media-drivers/
+├── types.ts              # MediaDriver interface and types
+├── registry.ts           # Driver registration and discovery
+├── utils.ts              # Skill discovery and script execution
+├── *.driver.ts           # Provider-specific drivers
+└── generic.driver.ts     # Auto-discovery for unlisted skills
+```
+### Components
+- **MediaDriver Interface**: Standardized interface for all providers (slug, mediaType, requiredEnvVars, invoke)
+- **Registry System**: Maps provider names to drivers, checks env var availability
+- **Skill Discovery**: Searches `~/.openclaw/skills/`, `~/.openclaw/workspace/skills/`, etc.
+- **Generic Driver**: Auto-creates drivers for any installed skill not explicitly registered
+### Integration Points
+- **Workflow Worker**: Executes `media-image`, `media-video`, `media-audio` nodes via drivers
+- **CLI Command**: `workflows media-drivers` lists available providers (used by ClawKitchen UI)
+- **Environment Loading**: Merges `process.env` + `~/.openclaw/openclaw.json` env vars
+### Supported Providers
+- **Image**: nano-banana-pro (Gemini), openai-image-gen (DALL-E)
+- **Video**: klingai (Kling AI), runway-video (Runway), luma-video (Luma AI)
+- **Audio**: Extensible via generic driver pattern
+## OutputFields and Schema Validation
+LLM nodes support structured output generation with runtime validation:
+### Configuration
+```json
+{
+  "config": {
+    "outputFields": [
+      {"name": "title", "type": "text"},
+      {"name": "tags", "type": "list"},
+      {"name": "metadata", "type": "json"}
+    ]
+  }
+}
+```
+### Runtime Behavior
+1. **Schema Generation**: outputFields converted to JSON Schema with required fields
+2. **LLM Invocation**: Schema passed to `llm-task` tool for structured generation
+3. **Validation**: Response validated against schema before saving
+4. **Template Variables**: Structured fields become available as `{{nodeId.fieldName}}`
+### Field Types
+- **text**: String values
+- **list**: Arrays of strings
+- **json**: Nested JSON objects
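The schema-generation step described in this hunk is not itself shown in the diff. A minimal sketch of how `outputFields` might be turned into a JSON Schema follows, assuming each field type maps to the obvious JSON Schema type and every field is marked required. The names `outputFieldsToSchema` and `fieldSchema`, and the `additionalProperties: false` setting, are hypothetical and not taken from `workflow-worker.ts`.

```ts
// Hypothetical sketch: not the actual ClawRecipes implementation.
type OutputFieldType = 'text' | 'list' | 'json';

interface OutputField {
  name: string;
  type: OutputFieldType;
}

interface JsonSchema {
  type: 'object';
  properties: Record<string, Record<string, unknown>>;
  required: string[];
  additionalProperties: boolean;
}

// Map each documented field type to a JSON Schema fragment.
function fieldSchema(type: OutputFieldType): Record<string, unknown> {
  switch (type) {
    case 'text':
      return { type: 'string' };
    case 'list':
      return { type: 'array', items: { type: 'string' } };
    case 'json':
      return { type: 'object' };
  }
}

// Build an object schema in which every declared field is required.
function outputFieldsToSchema(fields: OutputField[]): JsonSchema {
  const properties: Record<string, Record<string, unknown>> = {};
  for (const field of fields) {
    properties[field.name] = fieldSchema(field.type);
  }
  return {
    type: 'object',
    properties,
    required: fields.map((field) => field.name),
    additionalProperties: false, // assumption: reject undeclared fields
  };
}
```

Applied to the configuration example above, this would yield a schema with `title`, `tags`, and `metadata` all listed under `required`.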
+### Implementation
+- **Code Location**: `src/lib/workflows/workflow-worker.ts` in LLM node execution
+- **Variable Extraction**: JSON fields automatically exposed as template variables
+- **Error Handling**: Schema validation failures logged with detailed error messages
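How structured fields surface as `{{nodeId.fieldName}}` is not visible in this diff. The sketch below illustrates one plausible substitution scheme. The function name `renderTemplate`, the shape of `NodeOutputs`, and the choice to leave unresolved references untouched are all assumptions made for illustration.

```ts
// Hypothetical sketch of {{nodeId.fieldName}} substitution; the real resolver may differ.
type NodeOutputs = Record<string, Record<string, unknown>>;

function renderTemplate(template: string, outputs: NodeOutputs): string {
  return template.replace(
    /\{\{\s*([\w-]+)\.([\w-]+)\s*\}\}/g,
    (match: string, nodeId: string, field: string): string => {
      const value = outputs[nodeId]?.[field];
      // Assumption: references that cannot be resolved are left as-is.
      if (value === undefined) return match;
      // Strings are inserted directly; lists and objects are serialized as JSON.
      return typeof value === 'string' ? value : JSON.stringify(value);
    },
  );
}
```

For instance, with a node `draft` that produced `title` and `tags`, a template such as `Post: {{draft.title}}` would pick up the `title` string, while `{{draft.tags}}` would be rendered as a JSON array.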
+## Key Decisions
 - **Tool policy preservation**: When a recipe omits `tools`, the scaffold preserves the existing agent's tool policy (rather than resetting it). See scaffold logic and tests.
 - **`__internal` export**: Unit tests import handlers and lib helpers via `__internal`; these are not part of the public plugin API.
+- **Media driver registry**: Prefers explicitly registered drivers over auto-discovery for better performance and error messages.
+- **Schema validation**: OutputFields generate strict JSON schemas to ensure predictable LLM outputs for downstream template usage.
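The "registered drivers first, auto-discovery second" decision can be pictured with a small lookup function. This is a sketch under stated assumptions: `DriverLike`, `resolveDriver`, and the injected `discover` callback are hypothetical names, and the actual logic in `registry.ts` is not reproduced here.

```ts
// Hypothetical sketch; names are illustrative, not taken from registry.ts.
interface DriverLike {
  slug: string;
  displayName: string;
}

// Look up an explicitly registered driver first; only fall back to
// auto-discovery (the generic driver) when no registered driver matches.
function resolveDriver(
  slug: string,
  knownDrivers: DriverLike[],
  discover: (slug: string) => DriverLike | undefined,
): DriverLike {
  const known = knownDrivers.find((driver) => driver.slug === slug);
  if (known) return known;

  const discovered = discover(slug);
  if (!discovered) {
    throw new Error(`No media driver or installed skill found for "${slug}"`);
  }
  return discovered;
}
```

Checking the registered list first avoids a discovery pass for known providers, and the explicit error on a double miss is one way to get the clearer messages the decision mentions.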
 ## Quality automation

package/docs/COMMANDS.md CHANGED

@@ -268,6 +268,18 @@ openclaw recipes workflows worker-tick \
   --limit 10
 ```
+### Media driver commands
+List registered media generation drivers (and whether required API keys are present):
+```bash
+openclaw recipes workflows media-drivers
+```
+This is what ClawKitchen uses to populate the media provider dropdown.
+More: [MEDIA_DRIVERS.md](MEDIA_DRIVERS.md)
 ### Approval commands
 ```bash

package/docs/MEDIA_DRIVERS.md ADDED

@@ -0,0 +1,175 @@
+# Media Drivers (ClawRecipes)
+ClawRecipes implements media generation via a **driver architecture**. Drivers are used by the workflow worker when executing `media-image`, `media-video`, and `media-audio` nodes.
+This doc is for developers adding new providers.
+## Key Concepts
+- A **skill** is a folder on disk (usually installed from ClawHub) that contains scripts and docs.
+- A **driver** is a small TypeScript adapter that knows how to invoke a given skill reliably.
+- Drivers provide:
+  - Stable display name and slug
+  - Required env-var list (availability checks)
+  - Invocation details (stdin vs CLI args, scripts/ subdir, venv detection)
+## CLI: List drivers
+ClawKitchen uses this command to populate its provider dropdown:
+```bash
+openclaw recipes workflows media-drivers
+```
+It returns JSON like:
+```json
+[
+  {
+    "slug": "nano-banana-pro",
+    "displayName": "Nano Banana Pro (Gemini Image Generation)",
+    "mediaType": "image",
+    "requiredEnvVars": ["GEMINI_API_KEY"],
+    "available": true,
+    "missingEnvVars": []
+  }
+]
+```
+Availability is computed from **merged env**:
+- `process.env` (ClawRecipes process)
+- `~/.openclaw/openclaw.json` → `env.vars`
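The availability fields in the JSON above (`available`, `missingEnvVars`) could be derived roughly as follows. This is a minimal sketch, not the code in `registry.ts`: `checkAvailability` and `EnvMap` are hypothetical names, and treating an empty string as "missing" is an assumption.

```ts
// Hypothetical sketch; not the actual registry.ts logic.
type EnvMap = Record<string, string | undefined>;

interface Availability {
  available: boolean;
  missingEnvVars: string[];
}

// A variable counts as present when either source holds a non-empty value
// (treating empty strings as missing is an assumption).
function checkAvailability(
  requiredEnvVars: string[],
  processEnv: EnvMap,
  configEnvVars: EnvMap,
): Availability {
  const missingEnvVars = requiredEnvVars.filter(
    (name) => !processEnv[name] && !configEnvVars[name],
  );
  return { available: missingEnvVars.length === 0, missingEnvVars };
}
```

A caller would pass `process.env` as the second argument and the `env.vars` object read from `~/.openclaw/openclaw.json` as the third. Because the check only asks whether either source has a value, it does not depend on which source takes precedence when both define the same key.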
+## Where the code lives
+```
+src/lib/workflows/media-drivers/
+  types.ts
+  registry.ts
+  utils.ts
+  nano-banana-pro.driver.ts
+  openai-image-gen.driver.ts
+  runway-video.driver.ts
+  kling-video.driver.ts
+  luma-video.driver.ts
+  generic.driver.ts
+```
+## MediaDriver interface
+```ts
+export interface MediaDriver {
+  slug: string;                       // Skill folder name
+  mediaType: 'image' | 'video' | 'audio';
+  displayName: string;                // UI dropdown label
+  requiredEnvVars: string[];          // Availability check
+  invoke(opts: MediaDriverInvokeOpts): Promise<MediaDriverResult>;
+}
+```
+The `invoke()` method should:
+- Write output into `opts.outputDir`
+- Return `{ filePath }` pointing at the generated file
+- Throw on failure with a useful error message (stderr/stdout included when possible)
+## Adding a new driver
+### 1) Add or install the underlying skill
+A skill should live in one of:
+- `~/.openclaw/skills/<slug>`
+- `~/.openclaw/workspace/skills/<slug>`
+- `~/.openclaw/workspace/<slug>` (ClawHub sometimes installs here)
+The worker and driver utils search these roots via `findSkillDir(slug)`.
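The body of `findSkillDir` is not part of this doc. The search over the three roots can be sketched as below, assuming the roots are tried in the order listed and the first existing directory wins. `candidateSkillDirs` and `firstExistingSkillDir` are hypothetical helpers, and POSIX-style path separators are assumed.

```ts
// Hypothetical sketch; the real findSkillDir in utils.ts may differ.
// Candidate roots in the order the doc lists them (assumed search order).
function candidateSkillDirs(slug: string, home: string): string[] {
  return [
    `${home}/.openclaw/skills/${slug}`,
    `${home}/.openclaw/workspace/skills/${slug}`,
    `${home}/.openclaw/workspace/${slug}`,
  ];
}

// Return the first candidate that exists, or undefined when none does.
// `exists` is injected so the sketch stays independent of the file system;
// a caller could pass fs.existsSync from Node.
function firstExistingSkillDir(
  slug: string,
  home: string,
  exists: (dir: string) => boolean,
): string | undefined {
  return candidateSkillDirs(slug, home).find(exists);
}
```

Injecting the existence check keeps the search order easy to test without touching a real home directory.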
+### 2) Create the driver file
+Create:
+```
+src/lib/workflows/media-drivers/my-provider.driver.ts
+```
+Example pattern (stdin → `MEDIA:` output):
+```ts
+import * as path from 'path';
+import { MediaDriver, MediaDriverInvokeOpts, MediaDriverResult } from './types';
+import { findSkillDir, findVenvPython, runScript, parseMediaOutput } from './utils';
+export class MyProvider implements MediaDriver {
+  slug = 'my-provider';
+  mediaType = 'image' as const;
+  displayName = 'My Provider';
+  requiredEnvVars = ['MY_PROVIDER_API_KEY'];
+  async invoke(opts: MediaDriverInvokeOpts): Promise<MediaDriverResult> {
+    const { prompt, outputDir, env, timeout } = opts;
+    const skillDir = await findSkillDir(this.slug);
+    if (!skillDir) throw new Error(`Skill dir not found for ${this.slug}`);
+    const scriptPath = path.join(skillDir, 'generate_image.py');
+    const runner = await findVenvPython(skillDir);
+    const stdout = runScript({
+      runner,
+      script: scriptPath,
+      stdin: prompt,
+      env: { ...env, HOME: process.env.HOME || '/home/control' },
+      cwd: outputDir,
+      timeout,
+    });
+    const filePath = parseMediaOutput(stdout);
+    if (!filePath) throw new Error(`No MEDIA: path in output: ${stdout}`);
+    return { filePath };
+  }
+}
+```
+Example pattern (CLI args → script prints direct path):
+- See `nano-banana-pro.driver.ts`.
+### 3) Register it
+Add the driver to `registry.ts` in `knownDrivers`:
+```ts
+import { MyProvider } from './my-provider.driver';
+const knownDrivers: MediaDriver[] = [
+  // ...
+  new MyProvider(),
+];
+```
+### 4) Done
+- The workflow worker can now invoke it by setting `provider` to `skill-my-provider`.
+- The CLI `workflows media-drivers` will list it.
+- ClawKitchen will show it automatically (Kitchen pulls the list from the CLI).
+## Script contract (skills)
+Drivers can invoke scripts in different ways, but the recommended contract is:
+- Prompt via stdin
+- Print `MEDIA:/absolute/or/relative/path` to stdout
+- Write the file into `MEDIA_OUTPUT_DIR` if provided
+Notes:
+- Some ClawHub skills place scripts under `scripts/`. Use `findScriptInSkill()` or direct paths accordingly.
+- Venv support: worker/driver utils will prefer `.venv/bin/python` when present.
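On the driver side, the `MEDIA:` line from the script contract has to be extracted from stdout. The sketch below shows one way to do that. It is not the `parseMediaOutput` helper from `utils.ts`; `parseMediaLine` is a hypothetical name, and returning the first matching line is an assumption.

```ts
// Hypothetical sketch; the real parseMediaOutput in utils.ts may differ.
// Scan stdout line by line and return the path from the first "MEDIA:" line.
function parseMediaLine(stdout: string): string | undefined {
  for (const line of stdout.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed.startsWith('MEDIA:')) {
      const mediaPath = trimmed.slice('MEDIA:'.length).trim();
      if (mediaPath) return mediaPath;
    }
  }
  return undefined; // no usable MEDIA: line was printed
}
```

Scanning line by line lets a script print progress or log output before the `MEDIA:` line without confusing the driver.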
+## Troubleshooting
+- If the dropdown says a driver is unavailable, run:
+  ```bash
+  openclaw recipes workflows media-drivers
+  ```
+  and check `missingEnvVars`.
+- If a skill isn't found, verify the folder name matches `slug` and is inside a scanned root.