@mindstudio-ai/remy 0.1.119 → 0.1.120
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/dist/headless.js +12 -10
- package/dist/index.js +12 -10
- package/dist/prompt/compiled/sdk-actions.md +4 -0
- package/dist/prompt/compiled/task-agents.md +187 -0
- package/dist/prompt/static/coding.md +2 -0
- package/dist/prompt/static/intake.md +1 -1
- package/dist/subagents/codeSanityCheck/prompt.md +4 -0
- package/package.json +1 -1
package/dist/headless.js
CHANGED
@@ -382,6 +382,8 @@ Current date: ${now}
 
 <mindstudio_agent_sdk_docs>
 {{compiled/sdk-actions.md}}
+
+{{compiled/task-agents.md}}
 </mindstudio_agent_sdk_docs>
 
 <mindstudio_flavored_markdown_spec_docs>
@@ -5269,16 +5271,6 @@ async function runTurn(params) {
       apiConfig,
       getContext: () => {
         const parts = [];
-        if (userMessage) {
-          parts.push(`User message: ${userMessage.slice(-200)}`);
-        }
-        if (onboardingState) {
-          parts.push(`Build phase: ${onboardingState}`);
-        }
-        const text = subAgentText || getTextContent(contentBlocks).slice(-500);
-        if (text) {
-          parts.push(`Assistant text: ${text}`);
-        }
         const toolName = currentToolNames || getToolCalls(contentBlocks).filter((tc) => !STATUS_EXCLUDED_TOOLS.has(tc.name)).at(-1)?.name || lastCompletedTools;
         if (toolName) {
           parts.push(`Tool: ${toolName}`);
@@ -5289,6 +5281,16 @@ async function runTurn(params) {
         if (lastCompletedResult) {
           parts.push(`Tool result: ${lastCompletedResult.slice(-200)}`);
         }
+        const text = subAgentText || getTextContent(contentBlocks).slice(-500);
+        if (text) {
+          parts.push(`Assistant text: ${text}`);
+        }
+        if (onboardingState && onboardingState !== "onboardingFinished") {
+          parts.push(`Build phase: ${onboardingState}`);
+        }
+        if (userMessage) {
+          parts.push(`User request: ${userMessage.slice(-100)}`);
+        }
         return parts.join("\n");
       },
       onStatus: (label) => onEvent({ type: "status", message: label }),

package/dist/index.js
CHANGED
@@ -5319,16 +5319,6 @@ async function runTurn(params) {
       apiConfig,
       getContext: () => {
         const parts = [];
-        if (userMessage) {
-          parts.push(`User message: ${userMessage.slice(-200)}`);
-        }
-        if (onboardingState) {
-          parts.push(`Build phase: ${onboardingState}`);
-        }
-        const text = subAgentText || getTextContent(contentBlocks).slice(-500);
-        if (text) {
-          parts.push(`Assistant text: ${text}`);
-        }
         const toolName = currentToolNames || getToolCalls(contentBlocks).filter((tc) => !STATUS_EXCLUDED_TOOLS.has(tc.name)).at(-1)?.name || lastCompletedTools;
         if (toolName) {
           parts.push(`Tool: ${toolName}`);
@@ -5339,6 +5329,16 @@ async function runTurn(params) {
         if (lastCompletedResult) {
           parts.push(`Tool result: ${lastCompletedResult.slice(-200)}`);
         }
+        const text = subAgentText || getTextContent(contentBlocks).slice(-500);
+        if (text) {
+          parts.push(`Assistant text: ${text}`);
+        }
+        if (onboardingState && onboardingState !== "onboardingFinished") {
+          parts.push(`Build phase: ${onboardingState}`);
+        }
+        if (userMessage) {
+          parts.push(`User request: ${userMessage.slice(-100)}`);
+        }
         return parts.join("\n");
       },
       onStatus: (label) => onEvent({ type: "status", message: label }),
@@ -5942,6 +5942,8 @@ Current date: ${now}
 
 <mindstudio_agent_sdk_docs>
 {{compiled/sdk-actions.md}}
+
+{{compiled/task-agents.md}}
 </mindstudio_agent_sdk_docs>
 
 <mindstudio_flavored_markdown_spec_docs>

package/dist/prompt/compiled/sdk-actions.md
CHANGED

@@ -157,3 +157,7 @@ MindStudio SDK allows us to build all kinds of amazing AI experiences in apps, i
 - Detailed image and video analysis
 
 Consider the ways in which AI can be incorporated into backend methods to solve problems and be creative.
+
+### Task Agents
+
+For multi-step tasks where the model needs to autonomously compose actions (research + scrape + generate, enrichment pipelines, content creation), use `runTask()` instead of chaining actions manually. It runs an agent loop with the SDK actions as tools and returns structured JSON. See the task agents reference for full details.

package/dist/prompt/compiled/task-agents.md
ADDED

# Task Agents (`MindStudioAgent runTask`)

A user types the name of a restaurant into your app, or uploads a photo of a storefront. The API call returns early, and in the background, a task agent searches Google, finds the official website, scrapes the address, gets the official social media accounts, and generates a stylized watercolor postcard of the exterior from images it found online. The user gets back a rich, illustrated card with the canonical name, website, address, and a custom image. A few tool calls (some in parallel), fully autonomous.

`runTask()` makes this possible. It runs a multi-step, tool-use agent loop: give it a prompt, a set of SDK actions as tools, and an example of the structured output you want. The platform runs the loop (calling the model, executing tool calls, feeding results back) until the model produces JSON matching your output example. The model decides what to do next based on intermediate results — retrying searches with different terms, working around failed tools, batching independent calls in parallel.

This is one of the most powerful pieces of the MindStudio SDK and can turn apps from amazing into truly magical. Use `askMindStudioSdk` to help construct the perfect agent for a task.

## When to Use

This is the tool to reach for whenever a feature would be dramatically more compelling if the app could autonomously research, enrich, or create on behalf of the user. Think about the difference between "user enters a restaurant name and it gets saved" vs. "user enters a restaurant name and gets back a fully researched, illustrated card." Task agents close that gap.

Run tasks in the background — depending on complexity they can take time to complete. Return an early partial result to the user and upsert later with the final result when the agent finishes (a sketch of this pattern follows the list below).

- **Research and enrichment:** "Given this email, find the person's LinkedIn, role, company, and a headshot" — the model searches, scrapes, extracts, and assembles structured data.
- **Content creation pipelines:** "Write SEO copy for this product in 3 languages, generate a hero image, extract keywords" — the model calls text generation, image generation, and analysis actions as needed.
- **Data processing with judgment:** "Given this restaurant name, find the canonical name, website, address, and create a stylized illustration" — the model searches, verifies, generates, and returns clean structured output.
- **Any multi-step task with branching logic:** If the model might need to retry a search with different terms, try a different approach when one fails, or make decisions based on intermediate results.
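
A minimal sketch of that background pattern. The `runTask()` call follows the documented API; the method shape and the `Table.insert`/`Table.update` helpers are illustrative assumptions, so adapt them to your app's backend conventions:

```typescript
import { MindStudioAgent } from '@mindstudio-ai/agent';

export async function addRestaurant({ name }: { name: string }) {
  // Save a partial record and return immediately so the UI stays responsive.
  const id = await Table.insert({ name, status: 'enriching' }); // hypothetical helper

  // Kick off the task agent without awaiting it.
  void (async () => {
    const agent = new MindStudioAgent();
    const result = await agent.runTask<{ name: string; url: string; address: string }>({
      prompt: 'Research this restaurant and return its canonical name, website URL, and full address.',
      input: { restaurantName: name },
      tools: ['searchGoogle', 'fetchUrl'],
      structuredOutputExample: {
        name: 'Tartine Bakery',
        url: 'https://tartinebakery.com',
        address: '600 Guerrero St, San Francisco, CA 94110',
      },
      model: 'claude-4-6-sonnet',
    });

    // Upsert the final result once the agent finishes.
    if (result.parsedSuccessfully) {
      await Table.update(id, { ...result.output, status: 'ready' });
    } else {
      await Table.update(id, { status: 'enrichment_failed' });
    }
  })();

  return { id, status: 'enriching' };
}
```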

## When NOT to Use

- **Simple linear pipelines (2-3 steps, no branching):** Just call the SDK actions directly in sequence (see the sketch below). `runTask()` adds overhead from the model reasoning about what to do next.
- **Chat/conversation:** Use an Agent interface instead. Task agents are single-shot, with no persistent conversation history.
- **One-off text generation:** Just use `generateText()` directly.
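
For contrast, a fixed two-step pipeline written as direct calls. The action names come from the SDK actions reference, but the exact method signatures here are assumptions, not confirmed API:

```typescript
// No agent loop: the steps and their order are known in advance.
const page = await agent.scrapeUrl({ url: 'https://example.com/product' }); // assumed signature
const blurb = await agent.generateText({                                    // assumed signature
  prompt: `Write a one-paragraph product blurb based on:\n${page.text}`,
});
```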

## Usage

```typescript
import { MindStudioAgent } from '@mindstudio-ai/agent';

const agent = new MindStudioAgent();

const result = await agent.runTask<{
  name: string;
  url: string;
  address: string;
  photoUrl: string;
}>({
  prompt: `You are a restaurant research assistant. Given a restaurant name,
find its canonical name, website URL, full address, and create a stylized
watercolor illustration of the restaurant exterior.`,

  input: { restaurantName: 'Tartine Bakery SF' },

  tools: [
    'searchGoogle',
    'fetchUrl',
    { method: 'generateImage', defaults: { imageModelOverride: { model: 'seedream-4.5' } } },
  ],

  structuredOutputExample: {
    name: 'Tartine Bakery',
    url: 'https://tartinebakery.com',
    address: '600 Guerrero St, San Francisco, CA 94110',
    photoUrl: 'https://cdn.mindstudio.ai/...',
  },

  model: 'claude-4-6-sonnet',
  maxTurns: 15,
});

// Always validate before using output
if (!result.parsedSuccessfully) {
  console.error('Task failed to produce structured output:', result.outputRaw);
  throw new Error('Task agent failed');
}

console.log(result.output.name);     // 'Tartine Bakery'
console.log(result.output.photoUrl); // URL to the generated illustration
```

## Always Validate Output

`runTask()` can return successfully with garbage output — fields null, data echoed back, or raw text instead of JSON. The result includes `parsedSuccessfully` to make this explicit. Always check it before using the output:

```typescript
const result = await agent.runTask<MyType>({ ... });

if (!result.parsedSuccessfully) {
  console.error('Task output was not valid JSON:', result.outputRaw);
  throw new Error('Task agent failed to produce structured output');
}

// Now safe to use result.output
await Table.update(id, result.output);
```

## Tool Configuration

Tools are SDK action names. The model gets the full input schema for each tool so it knows what parameters to pass. Only include tools the task actually needs — the model may use extra tools unnecessarily.

Use tool defaults for model/config choices. Use the prompt for task-level instructions.

```typescript
tools: [
  // Simple — just the action name
  'searchGoogle',
  'fetchUrl',
  'scrapeUrl',

  // With defaults — override specific input fields while letting the model control the rest
  { method: 'generateImage', defaults: { imageModelOverride: { model: 'seedream-4.5' } } },
  { method: 'analyzeImage', defaults: { visionModelOverride: { model: 'gemini-3-flash' } } },
]
```

When the model calls a tool, the platform deep-merges the model's arguments with the developer's defaults: the model decides what to do (prompt, query, parameters), while the developer controls which model/config to use. If the model needs to search and generate an image and those are independent, it will call both tools in the same turn (parallel execution server-side).
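
A worked illustration of that merge (the `aspectRatio` field is a stand-in parameter for illustration, not a confirmed input):

```typescript
const developerDefaults = { imageModelOverride: { model: 'seedream-4.5' } };
const modelArguments = { prompt: 'Watercolor painting of a bakery exterior', aspectRatio: '3:2' };

// The platform deep-merges the two, so the action receives roughly:
const mergedInput = { ...modelArguments, ...developerDefaults };
// => { prompt: '...', aspectRatio: '3:2', imageModelOverride: { model: 'seedream-4.5' } }
```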

## Options

| Field | Required | Default | Description |
|-------|----------|---------|-------------|
| `prompt` | Yes | — | System prompt defining the agent's behavior |
| `input` | Yes | — | Structured input (passed as the user message) |
| `tools` | Yes | — | SDK action names with optional defaults |
| `structuredOutputExample` | Yes | — | Object or JSON string showing the expected output shape. Use realistic example values, not placeholders like `'string'` |
| `model` | Yes | — | Model ID (must support tool use) |
| `maxTurns` | No | 10 | Max loop iterations (capped at 25) |
| `onEvent` | No | — | SSE event callback for real-time streaming |

## Models

Use `askMindStudioSdk` to pick an appropriate model for the task and its complexity.

## Return Value

```typescript
interface RunTaskResult<T> {
  output: T;                   // Parsed structured output matching your example
  outputRaw: string;           // Raw model text before JSON parse
  parsedSuccessfully: boolean; // Whether output was valid JSON
  turns: number;               // Number of loop iterations used
  usage: {
    inputTokens: number;
    outputTokens: number;
    totalBillingCost: number;
  };
  toolCalls: Array<{           // Execution log for debugging
    name: string;
    success: boolean;
    durationMs: number;
  }>;
}
```

When something goes wrong, `toolCalls` is the first thing to check. If it's empty, the model never used any tools (the prompt probably isn't clear enough). If a tool failed, the model may have worked around it or produced garbage.
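
A quick triage sketch using those fields:

```typescript
for (const call of result.toolCalls) {
  console.log(`${call.name}: ${call.success ? 'ok' : 'FAILED'} (${call.durationMs}ms)`);
}
if (result.toolCalls.length === 0) {
  console.warn('No tools were called; consider making the prompt more explicit about using them.');
}
```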

## Streaming

Pass an `onEvent` callback to get real-time events:

```typescript
const result = await agent.runTask({
  // ... same options ...
  onEvent: (event) => {
    if (event.type === 'text') console.log('Agent:', event.text);
    if (event.type === 'tool_call_start') console.log(`Calling ${event.name}...`);
    if (event.type === 'tool_call_result') console.log('Result:', event.output);
  },
});
```

Event types: `text`, `thinking`, `thinking_complete`, `tool_use`, `tool_input_delta`, `tool_input_args`, `tool_call_start`, `tool_call_result`, `error`, `done`.

Without `onEvent`, the SDK uses async polling (returns silently when complete). In dev mode (via the dev tunnel), progress and results are automatically logged to the console with no setup needed.

## Error Handling

- Model produces non-JSON output: retried automatically if turns remain
- Tool execution fails: the error is fed back to the model, which can retry or work around it
- Max turns exceeded: one final forced output attempt with tools disabled
- If the output still can't be parsed: `parsedSuccessfully` will be `false`, with the raw text available in `outputRaw`

```typescript
try {
  const result = await agent.runTask({ ... });
  if (!result.parsedSuccessfully) {
    // Task completed but output wasn't valid JSON
    console.error('Raw output:', result.outputRaw);
    console.error('Tool calls:', result.toolCalls);
  }
} catch (err) {
  if (err instanceof MindStudioError) {
    // err.code: 'task_execution_error' | 'poll_token_expired' | 'stream_error'
  }
}
```

package/dist/prompt/static/coding.md
CHANGED

@@ -28,6 +28,8 @@ Process logs are available at .logs/ in NDJSON format (one JSON object per line)
 ### MindStudio SDK
 For any work involving AI models, external actions (web scraping, email, SMS), or third-party API/OAuth connections, prefer the `@mindstudio-ai/agent` SDK. It removes the need to research API methods, configure keys and tokens, or require the user to set up developer accounts.
 
+For multi-step tasks with branching logic (research, enrichment, content pipelines), use `runTask()` instead of manually chaining SDK actions. It runs an autonomous agent loop that composes actions, retries on failure, and returns structured JSON. See the task agents reference for details.
+
 ### Auth
 - Not every app needs auth, and even for apps that do need auth, not every screen needs auth. Think intentionally about places where auth is required. Don't make auth be the first thing a user sees - that's jarring. Only show auth at intuitive and natural moments in the user's journey - be thoughtful about how to implement auth in the UI.
 - Frontend interfaces are always untrusted. Always enforce auth in backend methods. Use frontend auth and role information as a hint to conditionally show/hide UI to make the experience pleasant and seamless for users depending on their state, but remember to always use backend methods for gating data that is conditional on auth.

package/dist/prompt/static/intake.md
CHANGED

@@ -10,7 +10,7 @@ MindStudio apps are full-stack TypeScript projects. You have a lot to work with:
 
 - **Backend (Methods):** TypeScript in a sandboxed runtime. Any npm package. Managed SQLite database with typed schemas and automatic migrations. Built-in app-managed auth with email/SMS verification, cookie sessions, and role enforcement. None of these are required — use what the app needs.
 - **Frontend (Web Interface):** Starts as Vite + React, but any TypeScript project with a build command works. Any framework, any library, or no framework at all.
-- **AI & integrations:** The `@mindstudio-ai/agent` SDK gives access to 200+ AI models (OpenAI, Anthropic, Google, Meta, Mistral, and more) and 1000+ integrations (email, SMS, Slack, HubSpot, Google Workspace, web scraping, image/video generation, media processing) with zero configuration — credentials are handled automatically. No API keys needed.
+- **AI & integrations:** The `@mindstudio-ai/agent` SDK gives access to 200+ AI models (OpenAI, Anthropic, Google, Meta, Mistral, and more) and 1000+ integrations (email, SMS, Slack, HubSpot, Google Workspace, web scraping, image/video generation, media processing) with zero configuration — credentials are handled automatically. No API keys needed. Beyond individual actions, `runTask()` lets you spin up lightweight autonomous task agents that chain these actions together with judgment — e.g., a user types a restaurant name and the backend autonomously researches it in the background, finds the address, and generates a custom illustration. Think about where this kind of enrichment would make a feature go from functional to magical.
 - **Interfaces:** Web UI, REST API, cron jobs, webhooks, Discord bots, Telegram bots, MCP tool servers, email processors, conversational AI agents — all backed by the same methods. An app can use any combination.
 
 This is a capable, stable platform. Build with confidence; you're building production-grade apps, not fragile prototypes.

package/dist/subagents/codeSanityCheck/prompt.md
CHANGED

@@ -58,6 +58,10 @@ When a plan includes multiple screens/API calls, always note this item for the d
 
 - **Auth state read once instead of subscribed.** If the plan reads `auth.currentUser` or `auth.getCurrentUser()` in a `useState` initializer, at component top-level, or in a one-time check, the UI won't update after login/logout. The correct pattern is `auth.onAuthStateChanged(cb)` which fires immediately and on every auth transition. Flag if you see auth state read without a subscription.
 
+- **Manual multi-step MindStudio SDK action chains that should be `runTask()`.** If a method chains AI-driven SDK actions with branching logic (search, then scrape based on results, then generate based on what was scraped), that's a `runTask()` use case. `runTask()` runs an agent loop that autonomously calls SDK actions as tools and returns structured JSON; the developer writes a prompt and an output example instead of imperative code. Flag methods with complex sequential/branching SDK action chains — especially research, enrichment, or content generation pipelines. Similarly, flag opportunities where the developer might not have realized they could get better, richer data via `runTask()`: it's a really powerful lever for working with data (e.g., the user provides some fragment and a task agent goes off and enriches it) that the developer might not have remembered when planning their work.
+
+- **MindStudio SDK `runTask()` output used without validation.** `runTask()` can return successfully with garbage output (null fields, echoed input, raw text). The result includes `parsedSuccessfully` — if the plan uses `result.output` without checking `result.parsedSuccessfully` first, flag it. This is the #1 footgun with task agents.
+
 - **Layout shift with dynamic data or AI generated text** If the plan includes dynamically-sized data (e.g., a wizard form with questions of differing lengths) or AI generated text (where text stream length is unpredictable), make sure to flag concerns about layout stability. Everything must either be a fixed size or smoothly animate between sizes. Text can never be clipped by a container or cause layout to jump around or grow in snappy/janky ways. Make sure to remind the developer that this is important to pay attention to.
 
 ## When to stay quiet