npm - @sogni-ai/sogni-client - Versions diffs - 4.2.0-alpha.3 → 4.2.0-alpha.5 - Mend

@sogni-ai/sogni-client 4.2.0-alpha.3 → 4.2.0-alpha.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (22)

package/CHANGELOG.md +14 -0
package/README.md +16 -9
package/dist/Chat/ChatTools.d.ts +5 -49
package/dist/Chat/ChatTools.js +357 -82
package/dist/Chat/ChatTools.js.map +1 -1
package/dist/Chat/index.js +36 -2
package/dist/Chat/index.js.map +1 -1
package/dist/Chat/tools.d.ts +8 -55
package/dist/Chat/tools.js +316 -98
package/dist/Chat/tools.js.map +1 -1
package/dist/Chat/types.d.ts +1 -1
package/dist/lib/mediaValidation.d.ts +16 -0
package/dist/lib/mediaValidation.js +283 -0
package/dist/lib/mediaValidation.js.map +1 -0
package/llms-full.txt +32 -10
package/llms.txt +25 -6
package/package.json +1 -1
package/src/Chat/ChatTools.ts +433 -85
package/src/Chat/index.ts +48 -2
package/src/Chat/tools.ts +373 -102
package/src/Chat/types.ts +1 -1
package/src/lib/mediaValidation.ts +355 -0

package/CHANGELOG.md CHANGED

@@ -1,3 +1,17 @@
+# [4.2.0-alpha.5](https://github.com/Sogni-AI/sogni-client/compare/v4.2.0-alpha.4...v4.2.0-alpha.5) (2026-04-22)
+### Bug Fixes
+* validate inline media inputs for tools and vision requests ([c4bd7c1](https://github.com/Sogni-AI/sogni-client/commit/c4bd7c15cd530896065cf5ed41cd280b5db63938))
+# [4.2.0-alpha.4](https://github.com/Sogni-AI/sogni-client/compare/v4.2.0-alpha.3...v4.2.0-alpha.4) (2026-04-22)
+### Features
+* expand chat sogni media tools ([fb0d6c7](https://github.com/Sogni-AI/sogni-client/commit/fb0d6c77e4b73fbbe57d8b64b4a37bb610eb8c68))
 # [4.2.0-alpha.3](https://github.com/Sogni-AI/sogni-client/compare/v4.2.0-alpha.2...v4.2.0-alpha.3) (2026-04-20)

package/README.md CHANGED

@@ -18,7 +18,7 @@ Behind the scenes this SDK uses a WebSocket connection for communication between
 - 🎯 **Advanced Controls** - Fine-tune generation with samplers, schedulers, ControlNets, and more
 - 🤖 **LLM Text Generation** - Chat completions with streaming, multi-turn conversations, and thinking/reasoning mode via OpenAI-compatible API
 - 🔧 **LLM Tool Calling** - Define custom tools (functions) that the LLM can invoke during conversations for real-time data and actions
-- 🎨🎬🎵 **Sogni Platform Tools** - Generate images, videos, and music through natural language chat — the LLM detects media intent, enhances prompts, and calls Sogni's generation APIs automatically
+- 🎨🎬🎵 **Sogni Platform Tools** - Generate images, reference-guided image edits, videos, audio-driven videos, video transforms, and music through natural language chat
 - 👁️ **Vision Chat** - Multimodal image understanding with scene description, OCR, object detection, visual analysis, and multi-image comparison via Qwen3.6 VLM
 ## Migration notes
 ### v3.x.x to v4.x.x
@@ -691,13 +691,20 @@ const response = await sogni.chat.completions.create({
 ### Sogni Platform Tools — Generate Media via Chat
-Combine LLM intelligence with Sogni's media generation capabilities. The LLM detects when a user wants to create an image, video, or music, enhances the prompt, and calls Sogni's generation APIs — turning natural language into creative output:
+Combine LLM intelligence with Sogni's media generation capabilities. The SDK exposes six built-in public platform tools for chat completions:
-- **Image Generation** via tool call — "Create an image of a cyberpunk city at night"
-- **Video Generation** via tool call — "Generate a video of ocean waves at sunset"
-- **Music Generation** via tool call — "Compose a jazz song about the rain"
+- **`sogni_generate_image`** — text-to-image generation
+- **`sogni_edit_image`** — reference-guided image editing using `source_image_url` and `reference_image_urls`
+- **`sogni_generate_video`** — text-to-video and image-to-video generation
+- **`sogni_sound_to_video`** — audio-driven video generation using `reference_audio_url`
+- **`sogni_video_to_video`** — video transformation / motion transfer using `reference_video_url`
+- **`sogni_generate_music`** — music generation with optional lyrics and advanced controls
-See the `workflow_text_chat_sogni_tools.mjs` example for a complete implementation that wires LLM tool calling to Sogni's image, video, and audio generation APIs.
+Use `SogniTools.all` to expose the full tool surface, then execute tool calls with `sogni.chat.tools.execute()` / `executeAll()` or `autoExecuteTools: true` for non-streaming flows.
+Media-conditioned workflows use explicit inline base64 `data:` URIs, including `source_image_url`, `reference_image_url`, `reference_audio_url`, `reference_audio_identity_url`, and `reference_video_url`. Remote `http(s)` URLs are not allowed for these tool inputs. Tool image inputs accept PNG or JPEG only, tool audio inputs accept MP3/M4A/WAV only, and tool video inputs accept MP4 or MOV/QuickTime only.
+The `workflow_text_chat_sogni_tools.mjs` example demonstrates the core text-to-image, text-to-video, and text-to-music composition flows. Dedicated workflow examples like `workflow_image_edit.mjs`, `workflow_sound_to_video.mjs`, and `workflow_video_to_video.mjs` cover the asset-backed workflows directly.
 ## Code Examples
@@ -719,7 +726,7 @@ The [examples](https://github.com/Sogni-AI/sogni-client/tree/main/examples) dire
 - **`workflow_text_chat_multi_turn.mjs`** - Multi-turn conversation with history, in-chat commands, and session stats
 - **`workflow_text_chat_vision.mjs`** - Vision chat with multimodal image understanding (scene description, OCR, object detection, visual analysis, multi-image comparison)
 - **`workflow_text_chat_tool_calling.mjs`** - LLM tool calling with built-in tools (weather, time, unit conversion, math)
-- **`workflow_text_chat_sogni_tools.mjs`** - Generate images, videos, and music through natural language via LLM tool calling
+- **`workflow_text_chat_sogni_tools.mjs`** - Core image/video/music generation through natural language via LLM tool calling
 ### Basic Examples
 - **`promise_based.mjs`** - Image generation using promises/async-await
@@ -777,8 +784,8 @@ When helping users generate images, videos, or use LLM features with Sogni:
 2. **Video generation**: Use `type: 'video'` with `network: 'fast'` (required)
 3. **Audio generation**: Use `type: 'audio'` with ACE-Step 1.5 models
 4. **LLM text chat**: Use `sogni.projects.chatCompletion()` for text generation with streaming and tool calling
-5. **Sogni Platform Tools**: Combine LLM tool calling with Sogni media generation to create images, videos, and music from natural language
-6. **Vision chat**: Use `qwen3.6-35b-a3b-gguf-iq4xs` VLM for multimodal image understanding with `image_url` content type
+5. **Sogni Platform Tools**: Combine LLM tool calling with Sogni media generation to create images, image edits, videos, audio-driven videos, video transforms, and music from natural language
+6. **Vision chat**: Use `qwen3.6-35b-a3b-gguf-iq4xs` VLM for multimodal image understanding with `image_url` content parts carrying inline base64 JPEG/PNG `data:` URIs. Vision requests allow up to 20 images, 10MB each, with longest side capped at 1024px. This 1024px dimension cap applies only to the vision `image_url` path, not to media-generation tool image inputs.
 7. **WAN 2.2 vs LTX-2.3**: These model families have different FPS behaviors - see `llms-full.txt` for details
 ## API Documentation
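
The README hunk above states that media-conditioned tool inputs must be inline base64 `data:` URIs, that remote `http(s)` URLs are rejected, and that only certain formats are accepted per media kind. A minimal sketch of a caller-side pre-check along those lines follows. It is not the package's `mediaValidation` module, whose contents are not shown here; the names `checkInlineMedia`, `MediaKind`, and `ALLOWED_MIME` are hypothetical, and the MIME strings are an assumed mapping of the formats the README lists.

```typescript
type MediaKind = 'image' | 'audio' | 'video';

// Assumed MIME spellings for PNG/JPEG, MP3/M4A/WAV, and MP4/MOV.
// The SDK's own validator may accept other aliases (for example audio/x-wav).
const ALLOWED_MIME: Record<MediaKind, readonly string[]> = {
  image: ['image/png', 'image/jpeg'],
  audio: ['audio/mpeg', 'audio/mp4', 'audio/wav'],
  video: ['video/mp4', 'video/quicktime'],
};

type CheckResult = { ok: true; mime: string } | { ok: false; reason: string };

// Hypothetical pre-check: rejects remote URLs, requires a base64 data: URI,
// and compares the declared MIME type against the allow-list for the kind.
// It does not decode the payload or inspect the actual file bytes.
function checkInlineMedia(kind: MediaKind, value: string): CheckResult {
  if (/^https?:\/\//i.test(value)) {
    return { ok: false, reason: 'remote http(s) URLs are not allowed' };
  }
  const match = /^data:([^;,]+);base64,([A-Za-z0-9+/]+={0,2})$/.exec(value);
  if (!match) {
    return { ok: false, reason: 'expected an inline base64 data: URI' };
  }
  const mime = match[1].toLowerCase();
  if (!ALLOWED_MIME[kind].includes(mime)) {
    return { ok: false, reason: `unsupported ${kind} type: ${mime}` };
  }
  return { ok: true, mime };
}
```

Running a check like this before building tool arguments such as `source_image_url` or `reference_audio_url` can surface format problems before a request is sent. The declared MIME type is only a label, so a stricter check would also inspect the decoded bytes.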

package/dist/Chat/ChatTools.d.ts CHANGED

@@ -1,65 +1,21 @@
 import type ProjectsApi from '../Projects';
 import { ToolCall, ToolExecutionOptions, ToolExecutionProgress, ToolExecutionResult } from './types';
-/**
- * API for executing Sogni platform tool calls (image, video, music generation).
- *
- * Accessed via `sogni.chat.tools`. Provides methods to execute tool calls returned
- * by the LLM, mapping them to `sogni.projects.create()` calls automatically.
- *
- * @example
- * ```typescript
- * // Execute a single tool call
- * const result = await sogni.chat.tools.execute(toolCall, {
- *   tokenType: 'sogni',
- *   onProgress: (p) => console.log(`${p.status}: ${p.percent}%`),
- * });
- *
- * // Execute all tool calls from a completion
- * const results = await sogni.chat.tools.executeAll(result.tool_calls, {
- *   onToolCall: async (tc) => myCustomHandler(tc), // for non-Sogni tools
- * });
- * ```
- */
 declare class ChatToolsApi {
     private projects;
     constructor(projects: ProjectsApi);
-    /**
-     * Execute a single Sogni platform tool call.
-     *
-     * Maps tool call arguments to `sogni.projects.create()`, waits for the media
-     * generation to complete, and returns the result URLs.
-     *
-     * @throws Error if the tool call is not a Sogni tool (use `isSogniToolCall()` to check first)
-     */
     execute(toolCall: ToolCall, options?: ToolExecutionOptions): Promise<ToolExecutionResult>;
-    /**
-     * Execute multiple tool calls from a single LLM response.
-     *
-     * Sogni tool calls (prefixed with `sogni_`) are executed automatically via
-     * `projects.create()`. Non-Sogni tool calls are delegated to the `onToolCall`
-     * callback if provided, or returned as errors.
-     *
-     * @param toolCalls - Array of tool calls from `result.tool_calls`
-     * @param options - Execution options plus optional handler for non-Sogni tools
-     */
     executeAll(toolCalls: ToolCall[], options?: ToolExecutionOptions & {
-        /** Handler for non-Sogni tool calls. Must return the tool result content string. */
         onToolCall?: (toolCall: ToolCall) => Promise<string>;
-        /** Per-tool progress callback (wraps the per-tool onProgress with tool identity). */
         onToolProgress?: (toolCall: ToolCall, progress: ToolExecutionProgress) => void;
     }): Promise<ToolExecutionResult[]>;
-    /**
-     * Get the default model for a media type. For video, prefers LTX-2.3 t2v models.
-     * Falls back to the model with the most available workers.
-     */
-    private getDefaultModel;
-    /**
-     * Create a project, wait for completion with timeout, track per-job progress,
-     * and clean up on failure or timeout.
-     */
+    private getAvailableModels;
+    private selectModel;
     private executeProject;
     private executeImageGeneration;
+    private executeImageEdit;
     private executeVideoGeneration;
+    private executeSoundToVideo;
+    private executeVideoToVideo;
     private executeMusicGeneration;
     private makeErrorResult;
 }
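
The doc comment removed from `executeAll` in the hunk above described its routing: tool calls prefixed with `sogni_` are executed automatically, other calls are delegated to `onToolCall` if provided, and calls with no handler are returned as errors. A minimal sketch of that routing rule follows. It uses simplified stand-in types (`SketchToolCall`, `SketchResult`) and hypothetical helpers (`pickHandler`, `dispatchAll`) rather than the SDK's real `ToolCall` and `ToolExecutionResult`, which live in `./types` and are not shown here. Sequential execution is an assumption, and the diff does not confirm that 4.2.0-alpha.5 routes calls exactly this way.

```typescript
// Simplified stand-ins, not the SDK's actual types.
interface SketchToolCall {
  id: string;
  name: string;
}

interface SketchResult {
  toolCallId: string;
  content: string;
  isError: boolean;
}

type Handler = (tc: SketchToolCall) => Promise<string>;

// Routing rule: sogni_-prefixed tools go to the built-in runner,
// everything else goes to the optional custom handler.
function pickHandler(name: string, runSogniTool: Handler, onToolCall?: Handler): Handler | undefined {
  return name.startsWith('sogni_') ? runSogniTool : onToolCall;
}

// Runs each call in order and records failures as error results
// instead of throwing, so one bad call does not abort the batch.
async function dispatchAll(
  toolCalls: SketchToolCall[],
  runSogniTool: Handler,
  onToolCall?: Handler
): Promise<SketchResult[]> {
  const results: SketchResult[] = [];
  for (const tc of toolCalls) {
    const handler = pickHandler(tc.name, runSogniTool, onToolCall);
    if (!handler) {
      results.push({ toolCallId: tc.id, content: `No handler for tool "${tc.name}"`, isError: true });
      continue;
    }
    try {
      results.push({ toolCallId: tc.id, content: await handler(tc), isError: false });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ toolCallId: tc.id, content: message, isError: true });
    }
  }
  return results;
}
```

With the real SDK, the equivalent call would be `sogni.chat.tools.executeAll(toolCalls, { onToolCall })`, as the declaration above shows.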