npm - @sogni-ai/sogni-client - Versions diffs - 4.2.0-alpha.2 → 4.2.0-alpha.4 - Mend

@sogni-ai/sogni-client 4.2.0-alpha.2 → 4.2.0-alpha.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (19)

package/CHANGELOG.md +14 -0
package/README.md +16 -9
package/dist/Chat/ChatTools.d.ts +5 -49
package/dist/Chat/ChatTools.js +336 -81
package/dist/Chat/ChatTools.js.map +1 -1
package/dist/Chat/index.d.ts +5 -1
package/dist/Chat/index.js +21 -2
package/dist/Chat/index.js.map +1 -1
package/dist/Chat/tools.d.ts +8 -55
package/dist/Chat/tools.js +316 -98
package/dist/Chat/tools.js.map +1 -1
package/dist/Chat/types.d.ts +1 -0
package/llms-full.txt +30 -9
package/llms.txt +24 -7
package/package.json +1 -1
package/src/Chat/ChatTools.ts +408 -85
package/src/Chat/index.ts +32 -2
package/src/Chat/tools.ts +373 -102
package/src/Chat/types.ts +1 -0

package/CHANGELOG.md CHANGED

@@ -1,3 +1,17 @@
+# [4.2.0-alpha.4](https://github.com/Sogni-AI/sogni-client/compare/v4.2.0-alpha.3...v4.2.0-alpha.4) (2026-04-22)
+### Features
+* expand chat sogni media tools ([fb0d6c7](https://github.com/Sogni-AI/sogni-client/commit/fb0d6c77e4b73fbbe57d8b64b4a37bb610eb8c68))
+# [4.2.0-alpha.3](https://github.com/Sogni-AI/sogni-client/compare/v4.2.0-alpha.2...v4.2.0-alpha.3) (2026-04-20)
+### Bug Fixes
+* use thinkingComplexDefault in LLMModelInfo cost estimates ([162011f](https://github.com/Sogni-AI/sogni-client/commit/162011f698b56229629cea8d09fad1b3e313294f))
 # [4.2.0-alpha.2](https://github.com/Sogni-AI/sogni-client/compare/v4.2.0-alpha.1...v4.2.0-alpha.2) (2026-04-20)

package/README.md CHANGED

@@ -18,7 +18,7 @@ Behind the scenes this SDK uses a WebSocket connection for communication between
 - 🎯 **Advanced Controls** - Fine-tune generation with samplers, schedulers, ControlNets, and more
 - 🤖 **LLM Text Generation** - Chat completions with streaming, multi-turn conversations, and thinking/reasoning mode via OpenAI-compatible API
 - 🔧 **LLM Tool Calling** - Define custom tools (functions) that the LLM can invoke during conversations for real-time data and actions
-- 🎨🎬🎵 **Sogni Platform Tools** - Generate images, videos, and music through natural language chat — the LLM detects media intent, enhances prompts, and calls Sogni's generation APIs automatically
+- 🎨🎬🎵 **Sogni Platform Tools** - Generate images, reference-guided image edits, videos, audio-driven videos, video transforms, and music through natural language chat
 - 👁️ **Vision Chat** - Multimodal image understanding with scene description, OCR, object detection, visual analysis, and multi-image comparison via Qwen3.6 VLM
 ## Migration notes
 ### v3.x.x to v4.x.x
@@ -691,13 +691,20 @@ const response = await sogni.chat.completions.create({
 ### Sogni Platform Tools — Generate Media via Chat
-Combine LLM intelligence with Sogni's media generation capabilities. The LLM detects when a user wants to create an image, video, or music, enhances the prompt, and calls Sogni's generation APIs — turning natural language into creative output:
+Combine LLM intelligence with Sogni's media generation capabilities. The SDK exposes six built-in public platform tools for chat completions:
-- **Image Generation** via tool call — "Create an image of a cyberpunk city at night"
-- **Video Generation** via tool call — "Generate a video of ocean waves at sunset"
-- **Music Generation** via tool call — "Compose a jazz song about the rain"
+- **`sogni_generate_image`** — text-to-image generation
+- **`sogni_edit_image`** — reference-guided image editing using `source_image_url` and `reference_image_urls`
+- **`sogni_generate_video`** — text-to-video and image-to-video generation
+- **`sogni_sound_to_video`** — audio-driven video generation using `reference_audio_url`
+- **`sogni_video_to_video`** — video transformation / motion transfer using `reference_video_url`
+- **`sogni_generate_music`** — music generation with optional lyrics and advanced controls
-See the `workflow_text_chat_sogni_tools.mjs` example for a complete implementation that wires LLM tool calling to Sogni's image, video, and audio generation APIs.
+Use `SogniTools.all` to expose the full tool surface, then execute tool calls with `sogni.chat.tools.execute()` / `executeAll()` or `autoExecuteTools: true` for non-streaming flows.
+Media-conditioned workflows use explicit asset URLs or data URIs, including `source_image_url`, `reference_image_url`, `reference_audio_url`, `reference_audio_identity_url`, and `reference_video_url`.
+The `workflow_text_chat_sogni_tools.mjs` example demonstrates the core text-to-image, text-to-video, and text-to-music composition flows. Dedicated workflow examples like `workflow_image_edit.mjs`, `workflow_sound_to_video.mjs`, and `workflow_video_to_video.mjs` cover the asset-backed workflows directly.
 ## Code Examples
@@ -719,7 +726,7 @@ The [examples](https://github.com/Sogni-AI/sogni-client/tree/main/examples) dire
 - **`workflow_text_chat_multi_turn.mjs`** - Multi-turn conversation with history, in-chat commands, and session stats
 - **`workflow_text_chat_vision.mjs`** - Vision chat with multimodal image understanding (scene description, OCR, object detection, visual analysis, multi-image comparison)
 - **`workflow_text_chat_tool_calling.mjs`** - LLM tool calling with built-in tools (weather, time, unit conversion, math)
-- **`workflow_text_chat_sogni_tools.mjs`** - Generate images, videos, and music through natural language via LLM tool calling
+- **`workflow_text_chat_sogni_tools.mjs`** - Core image/video/music generation through natural language via LLM tool calling
 ### Basic Examples
 - **`promise_based.mjs`** - Image generation using promises/async-await
@@ -736,7 +743,7 @@ The workflow examples showcase a few powerful open-source frontier models suppor
 | `qwen_image_edit_2511_fp8_lightning` | **Qwen Image Edit Lightning** - Fast 4-step editing | Rapid reference-based image generation |
 | `qwen_image_edit_2511_fp8` | **Qwen Image Edit** - High quality 20-step editing | Professional image editing with context awareness |
 | `wan_v2.2-14b-fp8_t2v_lightx2v` | **Wan 2.2 T2V** - Text-to-video | Generate videos from text prompts |
-| `qwen3.6-35b-a3b-gguf-iq4xs` | **Qwen3.6 35B VLM** - LLM chat, tool calling & vision | Latest model with 128K context target, reasoning, tool calling, and multimodal image understanding |
+| `qwen3.6-35b-a3b-gguf-iq4xs` | **Qwen3.6 35B VLM** - LLM chat, tool calling & vision | Latest model with 262,144 native context length, reasoning, tool calling, and multimodal image understanding |
 All workflow examples include:
 - Interactive model and parameter selection
@@ -777,7 +784,7 @@ When helping users generate images, videos, or use LLM features with Sogni:
 2. **Video generation**: Use `type: 'video'` with `network: 'fast'` (required)
 3. **Audio generation**: Use `type: 'audio'` with ACE-Step 1.5 models
 4. **LLM text chat**: Use `sogni.projects.chatCompletion()` for text generation with streaming and tool calling
-5. **Sogni Platform Tools**: Combine LLM tool calling with Sogni media generation to create images, videos, and music from natural language
+5. **Sogni Platform Tools**: Combine LLM tool calling with Sogni media generation to create images, image edits, videos, audio-driven videos, video transforms, and music from natural language
 6. **Vision chat**: Use `qwen3.6-35b-a3b-gguf-iq4xs` VLM for multimodal image understanding with `image_url` content type
 7. **WAN 2.2 vs LTX-2.3**: These model families have different FPS behaviors - see `llms-full.txt` for details
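
As a reading aid for the README hunks above, here is a minimal sketch of how the six tool names relate to the asset parameters the new README text mentions. Only the tool names and parameter names come from the diff. The `ASSET_PARAMS` table, `isSogniToolName`, and `missingAssetParams` are hypothetical helpers and not part of the SDK, and treating every listed parameter as something a caller should supply is an assumption.

```typescript
// Tool names as listed in the 4.2.0-alpha.4 README hunk.
const SOGNI_TOOL_NAMES = [
  'sogni_generate_image',
  'sogni_edit_image',
  'sogni_generate_video',
  'sogni_sound_to_video',
  'sogni_video_to_video',
  'sogni_generate_music',
] as const;

type SogniToolName = (typeof SOGNI_TOOL_NAMES)[number];

// Asset parameters the README associates with each tool.
// Assumption: the SDK may accept fewer or more than what is listed here.
const ASSET_PARAMS: Record<SogniToolName, readonly string[]> = {
  sogni_generate_image: [],
  sogni_edit_image: ['source_image_url', 'reference_image_urls'],
  sogni_generate_video: [],
  sogni_sound_to_video: ['reference_audio_url'],
  sogni_video_to_video: ['reference_video_url'],
  sogni_generate_music: [],
};

// Hypothetical type guard: true when the name is one of the six listed tools.
function isSogniToolName(name: string): name is SogniToolName {
  return (SOGNI_TOOL_NAMES as readonly string[]).includes(name);
}

// Hypothetical pre-flight hint: which associated asset parameters are absent
// from a tool call's parsed arguments. Unknown tool names yield an empty list.
function missingAssetParams(name: string, args: Record<string, unknown>): string[] {
  if (!isSogniToolName(name)) return [];
  return ASSET_PARAMS[name].filter((param) => {
    const value = args[param];
    return value === undefined || value === null || value === '';
  });
}
```

A caller could run `missingAssetParams` on the parsed arguments of a tool call before executing it, to surface a missing URL or data URI early instead of waiting for the generation request to fail.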

package/dist/Chat/ChatTools.d.ts CHANGED

@@ -1,65 +1,21 @@
 import type ProjectsApi from '../Projects';
 import { ToolCall, ToolExecutionOptions, ToolExecutionProgress, ToolExecutionResult } from './types';
-/**
- * API for executing Sogni platform tool calls (image, video, music generation).
- *
- * Accessed via `sogni.chat.tools`. Provides methods to execute tool calls returned
- * by the LLM, mapping them to `sogni.projects.create()` calls automatically.
- *
- * @example
- * ```typescript
- * // Execute a single tool call
- * const result = await sogni.chat.tools.execute(toolCall, {
- *   tokenType: 'sogni',
- *   onProgress: (p) => console.log(`${p.status}: ${p.percent}%`),
- * });
- *
- * // Execute all tool calls from a completion
- * const results = await sogni.chat.tools.executeAll(result.tool_calls, {
- *   onToolCall: async (tc) => myCustomHandler(tc), // for non-Sogni tools
- * });
- * ```
- */
 declare class ChatToolsApi {
     private projects;
     constructor(projects: ProjectsApi);
-    /**
-     * Execute a single Sogni platform tool call.
-     *
-     * Maps tool call arguments to `sogni.projects.create()`, waits for the media
-     * generation to complete, and returns the result URLs.
-     *
-     * @throws Error if the tool call is not a Sogni tool (use `isSogniToolCall()` to check first)
-     */
     execute(toolCall: ToolCall, options?: ToolExecutionOptions): Promise<ToolExecutionResult>;
-    /**
-     * Execute multiple tool calls from a single LLM response.
-     *
-     * Sogni tool calls (prefixed with `sogni_`) are executed automatically via
-     * `projects.create()`. Non-Sogni tool calls are delegated to the `onToolCall`
-     * callback if provided, or returned as errors.
-     *
-     * @param toolCalls - Array of tool calls from `result.tool_calls`
-     * @param options - Execution options plus optional handler for non-Sogni tools
-     */
     executeAll(toolCalls: ToolCall[], options?: ToolExecutionOptions & {
-        /** Handler for non-Sogni tool calls. Must return the tool result content string. */
         onToolCall?: (toolCall: ToolCall) => Promise<string>;
-        /** Per-tool progress callback (wraps the per-tool onProgress with tool identity). */
         onToolProgress?: (toolCall: ToolCall, progress: ToolExecutionProgress) => void;
     }): Promise<ToolExecutionResult[]>;
-    /**
-     * Get the default model for a media type. For video, prefers LTX-2.3 t2v models.
-     * Falls back to the model with the most available workers.
-     */
-    private getDefaultModel;
-    /**
-     * Create a project, wait for completion with timeout, track per-job progress,
-     * and clean up on failure or timeout.
-     */
+    private getAvailableModels;
+    private selectModel;
     private executeProject;
     private executeImageGeneration;
+    private executeImageEdit;
     private executeVideoGeneration;
+    private executeSoundToVideo;
+    private executeVideoToVideo;
     private executeMusicGeneration;
     private makeErrorResult;
 }
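
The doc comments removed in this hunk described how `executeAll` routes calls: names prefixed with `sogni_` are executed by the SDK, other calls go to the `onToolCall` callback if one is provided, and otherwise come back as errors. The sketch below illustrates that routing rule in isolation. It is not the SDK's implementation: `ToolCallSketch`, `ToolResultSketch`, `dispatchAll`, and `runSogniTool` are hypothetical names, the tool call shape is assumed to be OpenAI-compatible, and sequential execution is an assumption.

```typescript
// Assumed OpenAI-compatible shape; the SDK's real ToolCall type lives in ./types.
interface ToolCallSketch {
  id: string;
  type: 'function';
  function: { name: string; arguments: string };
}

// Hypothetical result shape, not the SDK's ToolExecutionResult.
interface ToolResultSketch {
  toolCallId: string;
  content: string;
  isError: boolean;
}

type ToolHandler = (toolCall: ToolCallSketch) => Promise<string>;

// Route each call: `sogni_`-prefixed names go to runSogniTool, everything else
// goes to the optional onToolCall handler, and unhandled calls become errors.
async function dispatchAll(
  toolCalls: ToolCallSketch[],
  runSogniTool: ToolHandler,
  onToolCall?: ToolHandler
): Promise<ToolResultSketch[]> {
  const results: ToolResultSketch[] = [];
  for (const toolCall of toolCalls) {
    const name = toolCall.function.name;
    const handler = name.startsWith('sogni_') ? runSogniTool : onToolCall;
    if (!handler) {
      results.push({
        toolCallId: toolCall.id,
        content: `No handler for tool "${name}"`,
        isError: true,
      });
      continue;
    }
    try {
      results.push({ toolCallId: toolCall.id, content: await handler(toolCall), isError: false });
    } catch (err) {
      // A failing handler is reported as an error result instead of aborting the batch.
      const message = err instanceof Error ? err.message : String(err);
      results.push({ toolCallId: toolCall.id, content: message, isError: true });
    }
  }
  return results;
}
```

In real code the `runSogniTool` role would be played by `sogni.chat.tools.execute()`, whose signature appears in the declaration above; the sketch keeps it as a parameter so the routing can be read and tested without a network connection.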