@mindstudio-ai/remy 0.1.14 → 0.1.16
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/dist/headless.js
CHANGED

@@ -1245,13 +1245,20 @@ import { exec } from "child_process";
 var askMindStudioSdkTool = {
   definition: {
     name: "askMindStudioSdk",
-    description:
+    description: `An expert consultant on building with the MindStudio SDK. Knows every action, model, connector, and configuration option. Use this as an architect, not just a docs lookup:
+
+- Describe what you're trying to build at the method level ("I need a method that takes user text, generates a summary with GPT, extracts entities, and returns structured JSON") and get back architectural guidance + working code.
+- Ask about AI orchestration patterns: structured output, chaining model calls, batch processing, streaming, error handling.
+- Ask about connectors and integrations: what's available, whether the user has configured it, how to use it.
+- Always use this before writing SDK code. Model IDs, config options, and action signatures change frequently. Don't guess.
+
+Batch related questions into a single query. This runs its own LLM call so it has a few seconds of latency.`,
     inputSchema: {
       type: "object",
       properties: {
         query: {
           type: "string",
-          description: "
+          description: "Describe what you want to build or what you need to know. Be specific about the goal, not just the API method."
         }
       },
       required: ["query"]
package/dist/index.js
CHANGED

@@ -1021,13 +1021,20 @@ var init_askMindStudioSdk = __esm({
     askMindStudioSdkTool = {
       definition: {
         name: "askMindStudioSdk",
-        description:
+        description: `An expert consultant on building with the MindStudio SDK. Knows every action, model, connector, and configuration option. Use this as an architect, not just a docs lookup:
+
+- Describe what you're trying to build at the method level ("I need a method that takes user text, generates a summary with GPT, extracts entities, and returns structured JSON") and get back architectural guidance + working code.
+- Ask about AI orchestration patterns: structured output, chaining model calls, batch processing, streaming, error handling.
+- Ask about connectors and integrations: what's available, whether the user has configured it, how to use it.
+- Always use this before writing SDK code. Model IDs, config options, and action signatures change frequently. Don't guess.
+
+Batch related questions into a single query. This runs its own LLM call so it has a few seconds of latency.`,
         inputSchema: {
           type: "object",
           properties: {
            query: {
              type: "string",
-              description: "
+              description: "Describe what you want to build or what you need to know. Be specific about the goal, not just the API method."
            }
          },
          required: ["query"]

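The tool declares a single required `query` string in its `inputSchema` (identical in both bundles above). A minimal sketch of a conforming tool-call payload, with a hypothetical validation helper that is not part of the package, only an illustration of what the schema accepts:

```javascript
// Input schema as declared in both dist bundles in the diff above.
const inputSchema = {
  type: "object",
  properties: {
    query: {
      type: "string",
      description:
        "Describe what you want to build or what you need to know. " +
        "Be specific about the goal, not just the API method.",
    },
  },
  required: ["query"],
};

// Hypothetical helper (not in the package): checks that a payload
// supplies every required property with the declared primitive type.
function isValidToolInput(schema, input) {
  if (typeof input !== "object" || input === null) return false;
  return schema.required.every(
    (key) => typeof input[key] === schema.properties[key].type
  );
}

// Per the new description, batch related questions into one query.
const payload = {
  query:
    "I need a method that takes user text, generates a summary with GPT, " +
    "extracts entities, and returns structured JSON. Which models and " +
    "structured-output options should I use?",
};

console.log(isValidToolInput(inputSchema, payload)); // true
console.log(isValidToolInput(inputSchema, {})); // false
```

The helper only mirrors the schema shown in the diff; real callers would rely on whatever JSON Schema validation their tool runtime performs.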
@@ -2,7 +2,9 @@
 
 `@mindstudio-ai/agent` provides access to 200+ AI models and 1,000+ actions through a single API key. No separate provider keys needed. MindStudio routes to the correct provider (OpenAI, Anthropic, Google, etc.) server-side.
 
-There is a huge amount of capability here: hundreds of text generation models (OpenAI, Anthropic, Google, Meta, Mistral, and more), dozens of image generation models (FLUX, DALL-E, Stable Diffusion, Ideogram, and more), video generation, text-to-speech, music generation, vision analysis, web scraping, 850+ OAuth connectors, and much more. The tables below are a summary.
+There is a huge amount of capability here: hundreds of text generation models (OpenAI, Anthropic, Google, Meta, Mistral, and more), dozens of image generation models (FLUX, DALL-E, Stable Diffusion, Ideogram, and more), video generation, text-to-speech, music generation, vision analysis, web scraping, 850+ OAuth connectors, and much more. The tables below are a summary.
+
+**Always use `askMindStudioSdk` before writing code that uses the SDK.** Treat it as an expert consultant, not a docs search. Describe what you're trying to build at the method level — the full workflow, not just "how do I call generateText." The assistant knows every action, model, connector, configuration option, and the user's configured OAuth connections. It can advise on AI orchestration patterns (structured output, chaining calls, batch processing), help you avoid common mistakes (like manually parsing JSON when the SDK has structured output options), and provide complete working code for your use case.
 
 ## Usage in Methods
 
@@ -16,35 +16,20 @@ Do not provide images as "references" - images must be ready-to-use assets that
 
 ### Writing good generation prompts
 
-
+Lead with the visual style, then describe the content. This order helps the model establish the look before filling in details.
 
-**Structure:**
+**Structure:** Style/medium first, then subject, then details.
+- "Digital photography, soft natural window light, shallow depth of field. A ceramic coffee cup on a marble countertop, morning light casting long shadows, warm tones."
+- "Flat vector illustration, clean lines, limited color palette. An isometric view of a workspace with a laptop, plant, and notebook."
+- "Abstract digital art, fluid gradients, high contrast. Deep navy flowing into warm amber, organic liquid shapes, editorial feel."
 
-
-- "An overhead view of a cluttered designer's desk with fabric swatches, sketches, and a coffee cup. Natural window light from the left, slightly desaturated tones, Canon 5D with 35mm lens. For an about page."
-- "Smooth organic shapes in deep navy and warm amber, flowing liquid forms with subtle grain texture. Abstract digital art, high contrast, editorial feel."
-
-**Photography vocabulary produces the best results.** The model responds strongly to specific references:
-- Film stocks: Kodak Portra, Fuji Superia, Cinestill 800T, expired film
-- Lenses: 85mm f/1.4, 35mm wide angle, 50mm Summilux, macro
-- Lighting: golden hour, chiaroscuro, tungsten warmth, soft diffused studio light, direct flash
-- Shot types: close-up, overhead flat lay, low angle, eye-level candid, aerial
-- Techniques: shallow depth of field, halation around highlights, film grain, motion blur
-
-**Declare the medium early.** Saying "editorial photograph" vs "watercolor painting" vs "3D render" doesn't just change style — it changes the model's entire approach to composition, color, and detail. Set this expectation in the first sentence.
-
-**For text in images**, wrap the exact text in double quotes and specify the style: `A neon sign reading "OPEN" in cursive pink lettering against a dark brick wall.`
-
-**Compose for the layout.** If you know the image will have text overlaid, request space for it: "negative space in the upper left for headline text" or "clean sky area above the subject." If it's a background, consider "centered subject with clean margins." The first few words of the prompt carry the most weight — lead with the medium and subject.
+**For photorealistic images:** Specify the photography style (editorial, portrait, product, aerial), lighting (natural, studio, golden hour, direct flash), and camera characteristics (close-up, wide angle, shallow depth of field, slightly grainy texture).
 
 **Avoid:**
 - Hex codes in prompts — the model renders them as visible text. Describe colors by name instead.
-- Keyword lists separated by commas — write sentences.
 - Describing positions of arms, legs, or specific limb arrangements.
 - Conflicting style instructions ("photorealistic cartoon").
 - Describing what you don't want — say "empty street" not "street with no cars."
-- UI component language — "glass morphism effect", "card design", "button with hover state". Write prompts as if briefing a photographer or artist, not describing CSS.
-- Generating text that should be HTML. Headlines, body copy, CTAs, and any text the user needs to read or interact with belongs in the markup, not baked into an image. Text *within a scene* is fine — a neon sign, a logo on a t-shirt, text on a billboard in a cityscape, an app screen in a device mockup. That's part of the visual content.
 
 ### How generated images work in the UI
 
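The new prompt guidance (style/medium first, then subject, then details) can be sketched as a tiny composition helper. The function below is hypothetical, not part of the package; it only illustrates the recommended ordering:

```javascript
// Hypothetical helper (not in the package): composes an image-generation
// prompt in the README's recommended order — style/medium first, then
// subject, then details.
function buildImagePrompt({ style, subject, details = [] }) {
  // Lead with the visual style so the model establishes the look
  // before filling in content details.
  return [style, subject, ...details].filter(Boolean).join(" ").trim();
}

const prompt = buildImagePrompt({
  style: "Digital photography, soft natural window light, shallow depth of field.",
  subject: "A ceramic coffee cup on a marble countertop,",
  details: ["morning light casting long shadows, warm tones."],
});

console.log(prompt);
// "Digital photography, soft natural window light, shallow depth of field.
//  A ceramic coffee cup on a marble countertop, morning light casting long
//  shadows, warm tones."
```

Writing prompts as full sentences in this order matches the worked examples in the diff above; the helper simply makes the ordering explicit.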