@mindstudio-ai/remy 0.1.107 → 0.1.109

@@ -4,7 +4,7 @@
 
  This is an automated action triggered by the user pressing "Build" in the editor after reviewing the spec.
 
- The user has reviewed the spec and is ready to build. There are four phases to building: planning, coding, polishing, verifying.
+ The user has reviewed the spec and is ready to build. There are four phases to building: planning, coding, polishing, verifying. Execute each phase in order in a single turn.
 
  ## Planning
  Think about your approach and then get a quick sanity check from `codeSanityCheck` to make sure you aren't missing anything.
package/dist/headless.js CHANGED
@@ -2454,8 +2454,6 @@ async function analyzeImage(params) {
  // src/tools/_helpers/screenshot.ts
  var SCREENSHOT_ANALYSIS_PROMPT = `Describe everything visible on screen from top to bottom \u2014 every element, its position, its size relative to the viewport, its colors, its content. Be comprehensive, thorough, and spatial. After the inventory, note anything that looks visually broken (overlapping elements, clipped text, misaligned components).
 
- Note: ignore text wrapping issues. Screenshots occasionally show text wrapping onto an extra line compared to the live page \u2014 most noticeable in buttons, badges, and headings. This is a known limitation of SVG foreignObject rendering used the DOM-to-image capture library that took the screenshot. The browser's SVG renderer computes slightly wider text metrics than the HTML layout engine, so text that fits on one line in the live DOM can overflow by a fraction of a pixel in the capture - this is not a real issue.
-
  Respond only with your analysis as Markdown and absolutely no other text. Do not use emojis - use unicode if you need symbols.`;
  async function captureAndAnalyzeScreenshot(promptOrOptions) {
  let prompt;
@@ -2489,7 +2487,10 @@ async function captureAndAnalyzeScreenshot(promptOrOptions) {
  if (prompt === false) {
  return url;
  }
- const analysisPrompt = prompt || SCREENSHOT_ANALYSIS_PROMPT;
+ let analysisPrompt = prompt || SCREENSHOT_ANALYSIS_PROMPT;
+ analysisPrompt += `
+ Note: ignore text wrapping issues. Screenshots occasionally show text wrapping onto an extra line compared to the live page \u2014 most noticeable in buttons, badges, and headings. This is a known limitation of SVG foreignObject rendering used the DOM-to-image capture library that took the screenshot. The browser's SVG renderer computes slightly wider text metrics than the HTML layout engine, so text that fits on one line in the live DOM can overflow by a fraction of a pixel in the capture - this is not a real issue.
+ `;
  const analysis = await analyzeImage({
  prompt: analysisPrompt,
  imageUrl: url,
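Reviewer note on the hunk above: the text-wrapping note moves out of the default prompt constant and is appended after the prompt is chosen, so a caller-supplied prompt now appears to receive it as well. The sketch below illustrates that pattern only. `BASE_PROMPT`, `WRAPPING_NOTE`, `buildPromptBefore`, and `buildPromptAfter` are hypothetical names with abbreviated text, not identifiers from the package.

```javascript
// Hypothetical, abbreviated stand-ins for the package's prompt text.
const BASE_PROMPT = "Describe everything visible on screen.";
const WRAPPING_NOTE = "Note: ignore text wrapping issues.";

// 0.1.107 pattern: the note lives inside the default prompt,
// so a custom prompt replaces it entirely.
function buildPromptBefore(customPrompt) {
  const defaultWithNote = `${BASE_PROMPT}\n\n${WRAPPING_NOTE}`;
  return customPrompt || defaultWithNote;
}

// 0.1.109 pattern: choose the prompt first, then always append the note.
function buildPromptAfter(customPrompt) {
  let prompt = customPrompt || BASE_PROMPT;
  prompt += `\n${WRAPPING_NOTE}\n`;
  return prompt;
}

console.log(buildPromptBefore("Check the header").includes(WRAPPING_NOTE));
console.log(buildPromptAfter("Check the header").includes(WRAPPING_NOTE));
```

One visible side effect in the real diff: for the default prompt, the note now lands after the "Respond only with your analysis as Markdown" sentence instead of before it.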
@@ -3462,8 +3463,6 @@ __export(analyzeImage_exports, {
  var DEFAULT_PROMPT = `
  Describe everything visible in this image \u2014 every element, its position, its size relative to the frame, its colors, its content. Be comprehensive, thorough and spatial. After the inventory, note anything that looks visually broken (overlapping elements, clipped text, misaligned components).
 
- Note: ignore text wrapping issues. Screenshots occasionally show text wrapping onto an extra line compared to the live page \u2014 most noticeable in buttons, badges, and headings. This is a known limitation of SVG foreignObject rendering used the DOM-to-image capture library that took the screenshot. The browser's SVG renderer computes slightly wider text metrics than the HTML layout engine, so text that fits on one line in the live DOM can overflow by a fraction of a pixel in the capture - this is not a real issue.
-
  Respond only with your analysis as Markdown and absolutely no other text. Do not use emojis - use unicode if you need symbols.`;
  var definition4 = {
  clearable: true,
@@ -3486,7 +3485,11 @@ var definition4 = {
  };
  async function execute4(input, onLog) {
  const imageUrl = input.imageUrl;
- const prompt = input.prompt || DEFAULT_PROMPT;
+ let prompt = input.prompt || DEFAULT_PROMPT;
+ prompt += `
+
+ Note: ignore text wrapping issues. Screenshots occasionally show text wrapping onto an extra line compared to the live page \u2014 most noticeable in buttons, badges, and headings. This is a known limitation of SVG foreignObject rendering used the DOM-to-image capture library that took the screenshot. The browser's SVG renderer computes slightly wider text metrics than the HTML layout engine, so text that fits on one line in the live DOM can overflow by a fraction of a pixel in the capture - this is not a real issue.
+ `;
  const analysis = await analyzeImage({
  prompt,
  imageUrl,
@@ -4576,7 +4579,7 @@ var SANITY_CHECK_TOOLS = [
  },
  {
  name: "askMindStudioSdk",
- description: "Check if the MindStudio SDK has a managed action for something before writing custom code.",
+ description: "Check if the MindStudio SDK has a managed action for something before writing custom code. Use bullet points to ask many questions at once.",
  inputSchema: {
  type: "object",
  properties: {
package/dist/index.js CHANGED
@@ -2186,7 +2186,10 @@ async function captureAndAnalyzeScreenshot(promptOrOptions) {
  if (prompt === false) {
  return url;
  }
- const analysisPrompt = prompt || SCREENSHOT_ANALYSIS_PROMPT;
+ let analysisPrompt = prompt || SCREENSHOT_ANALYSIS_PROMPT;
+ analysisPrompt += `
+ Note: ignore text wrapping issues. Screenshots occasionally show text wrapping onto an extra line compared to the live page \u2014 most noticeable in buttons, badges, and headings. This is a known limitation of SVG foreignObject rendering used the DOM-to-image capture library that took the screenshot. The browser's SVG renderer computes slightly wider text metrics than the HTML layout engine, so text that fits on one line in the live DOM can overflow by a fraction of a pixel in the capture - this is not a real issue.
+ `;
  const analysis = await analyzeImage({
  prompt: analysisPrompt,
  imageUrl: url,
@@ -2202,8 +2205,6 @@ var init_screenshot = __esm({
  init_analyzeImage();
  SCREENSHOT_ANALYSIS_PROMPT = `Describe everything visible on screen from top to bottom \u2014 every element, its position, its size relative to the viewport, its colors, its content. Be comprehensive, thorough, and spatial. After the inventory, note anything that looks visually broken (overlapping elements, clipped text, misaligned components).
 
- Note: ignore text wrapping issues. Screenshots occasionally show text wrapping onto an extra line compared to the live page \u2014 most noticeable in buttons, badges, and headings. This is a known limitation of SVG foreignObject rendering used the DOM-to-image capture library that took the screenshot. The browser's SVG renderer computes slightly wider text metrics than the HTML layout engine, so text that fits on one line in the live DOM can overflow by a fraction of a pixel in the capture - this is not a real issue.
-
  Respond only with your analysis as Markdown and absolutely no other text. Do not use emojis - use unicode if you need symbols.`;
  }
  });
@@ -3289,7 +3290,11 @@ __export(analyzeImage_exports, {
  });
  async function execute4(input, onLog) {
  const imageUrl = input.imageUrl;
- const prompt = input.prompt || DEFAULT_PROMPT;
+ let prompt = input.prompt || DEFAULT_PROMPT;
+ prompt += `
+
+ Note: ignore text wrapping issues. Screenshots occasionally show text wrapping onto an extra line compared to the live page \u2014 most noticeable in buttons, badges, and headings. This is a known limitation of SVG foreignObject rendering used the DOM-to-image capture library that took the screenshot. The browser's SVG renderer computes slightly wider text metrics than the HTML layout engine, so text that fits on one line in the live DOM can overflow by a fraction of a pixel in the capture - this is not a real issue.
+ `;
  const analysis = await analyzeImage({
  prompt,
  imageUrl,
@@ -3305,8 +3310,6 @@ var init_analyzeImage2 = __esm({
  DEFAULT_PROMPT = `
  Describe everything visible in this image \u2014 every element, its position, its size relative to the frame, its colors, its content. Be comprehensive, thorough and spatial. After the inventory, note anything that looks visually broken (overlapping elements, clipped text, misaligned components).
 
- Note: ignore text wrapping issues. Screenshots occasionally show text wrapping onto an extra line compared to the live page \u2014 most noticeable in buttons, badges, and headings. This is a known limitation of SVG foreignObject rendering used the DOM-to-image capture library that took the screenshot. The browser's SVG renderer computes slightly wider text metrics than the HTML layout engine, so text that fits on one line in the live DOM can overflow by a fraction of a pixel in the capture - this is not a real issue.
-
  Respond only with your analysis as Markdown and absolutely no other text. Do not use emojis - use unicode if you need symbols.`;
  definition4 = {
  clearable: true,
@@ -4562,7 +4565,7 @@ var init_tools4 = __esm({
  },
  {
  name: "askMindStudioSdk",
- description: "Check if the MindStudio SDK has a managed action for something before writing custom code.",
+ description: "Check if the MindStudio SDK has a managed action for something before writing custom code. Use bullet points to ask many questions at once.",
  inputSchema: {
  type: "object",
  properties: {
@@ -35,7 +35,7 @@ Layout shift is never acceptable. Elements jumping around as content loads or st
  - Loading-to-loaded transitions should swap content in-place without changing container size.
  - Buttons must not change size during loading states. Use a fixed width or `min-width`, and swap the label for a spinner or short text that fits the same space. "Submit" becoming "Submitting..." should not make the button wider and push adjacent elements around.
  - Conditional UI should use opacity/overlay transitions, not insertion into flow that displaces existing content.
- - This is especially important to keep in mind when building things that display AI generated text, especially if the text streams in. Make sure to never shift layout because of streaming AI text.
+ - This is especially important to keep in mind when building things that display AI generated text, especially if the text streams in. Make sure to never shift layout because of streaming AI text. If an element changes height or width, it must be fixed size or a smooth transition - never snap or jump to different dimensions.
 
  ## Responsive Design
 
@@ -17,7 +17,7 @@
  - Pushing to main branch will trigger a deploy. The user presses the publish button in the interface to request publishing.
 
  ### Build Notes
- For complex tasks — especially an initial buildout from a spec or making multiple changes in a single turn — write a `.remy-notes.md` scratchpad in the project root. Use it to track progress: a checklist of what's been built and what's remaining. Do not include implementation details in th notes. Read the spec files directly when you need reference data. Delete the notes file when your work is done.
+ For complex tasks — especially an initial buildout from a spec or making multiple changes in a single turn — write a `.remy-notes.md` scratchpad in the project root. Use it to track progress: a checklist of what's been built and what's remaining. Never include implementation details or other decisions in the notes - it is solely for keeping track of tasks. Read the spec files directly when you need design details, implementation decisions, or other reference materials. Delete the notes file when your work is done.
 
  ## Communication
  The user can already see your tool calls, so most of your work is visible without narration. Focus text output on three things:
@@ -25,7 +25,7 @@ Don't recite this list to users. Use it to calibrate your sense of what's possib
  - **Automations** — cron jobs that monitor competitors and send alerts, webhook handlers that sync data between services, email processors that triage support requests — no UI needed
  - **Conversational AI agents** — custom chat UIs backed by any model, with tool access to the app's methods. Full control over what the agent can do and who can use it
  - **Bots & agent tools** — Discord slash-command bots, Telegram bots, MCP tool servers for AI assistants
- - **Creative projects** — browser games with p5.js or Three.js, interactive visualizations, generative art, portfolio sites with dynamic backends
+ - **Creative projects** — browser games with p5.js or Three.js, interactive visualizations, 3D things, generative art, portfolio sites with dynamic backends
  - **Marketing & launch pages** — landing pages, waitlist pages with referral mechanics, product sites with scroll animations — visual polish is a strength here
  - **API services** — backend logic exposed as REST endpoints
  - **Simple static sites** — no backend needed, just a web interface with a build step
@@ -53,6 +53,8 @@ When a plan includes multiple screens/API calls, always note this item for the d
 
  - **Image preloading for detail views.** If the plan has a grid/list of thumbnails that link to detail views with full-size images, flag it if there's no preloading strategy. The fix: preload full-size images in the background (`new Image().src = url`) so they're in the browser cache by the time the user taps. This makes transitions feel instant.
 
+ - **Layout shift with dynamic data or AI generatd text** If the plan includes dynamically-sized data (e.g., a wizard form with questions of differing lengths) or AI generated text (where text stream length is unpredictable), make sure to flag concerns about layout stability. Everything must either be a fixed size or smoothly animate between sizes. Text can never be clipped by a container or cause layout to jump around or grow in snappy/janky ways. Make sure to remind the developer that this is important to pay attention to.
+
  ## When to stay quiet
 
  Nits, style preferences, missing edge cases, things the agent will figure out as it goes, patterns that are "not ideal but fine," minor code smells. Let them slide. The agent is busy.
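Reviewer note on the image-preloading item in the hunk above: the `new Image().src = url` idiom can be wrapped in a small helper, sketched below. This is an illustration only, not code from the package. `preloadImages` and its injectable `ImageCtor` parameter are hypothetical, added so the helper can also run outside a browser.

```javascript
// Preload full-size images so they are likely cached before a detail view opens.
// `ImageCtor` (hypothetical parameter) defaults to the browser's Image constructor.
function preloadImages(urls, ImageCtor = globalThis.Image) {
  // Outside a browser there is no Image constructor, so do nothing.
  if (typeof ImageCtor !== "function") return [];
  const unique = [...new Set(urls)]; // avoid requesting the same URL twice
  return unique.map((url) => {
    const img = new ImageCtor();
    img.src = url; // in a browser, assigning src starts the download
    return img; // keep references so callers can attach load/error handlers
  });
}
```

In a browser this might be called once the thumbnail grid has rendered, for example `preloadImages(items.map((item) => item.fullSizeUrl))`, where `items` and `fullSizeUrl` are placeholder names.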
@@ -12,7 +12,7 @@ Think about the ways you can truly elevate the design. Use image generation to c
 
  ## Tool Usage
  - When multiple tool calls are independent, make them all in a single turn. Searching for three different products, or fetching two reference sites: batch them instead of doing one per turn.
- - The screenshot tool supports an `instructions` parameter for taking screenshots that require interaction first. If you need to screenshot a state that's behind a modal, a specific tab, or a multi-step flow, pass `instructions` describing how to get there (e.g., "dismiss the welcome modal, then click XYZ"). A browser automation agent will follow your instructions and capture the screenshot for you.
+ - The screenshot tool supports an `instructions` parameter for taking screenshots that require interaction first. If you need to screenshot a state that's behind a modal, a specific tab, or a multi-step flow, pass `instructions` describing how to get there (e.g., "dismiss the welcome modal, then click XYZ"). A browser automation agent will follow your instructions and capture the screenshot for you. You can not use this to scroll - you will always receive a full page screenshot. Only use this if you need to trigger stateful changes within the app to get the full-page screenshot.
  - After you've taken a screenshot, use analyze image to ask different questions about it - don't re-screenshot the page unnecessarily.
 
  ## Voice
@@ -22,7 +22,7 @@ With the spatial architecture established, make it visually compelling. This is
 
  Layout is where interfaces fail most visibly. Generic patterns like centered content, three equal columns, card grids, symmetric everything feel tired and bland. Fight the use of generic layouts actively. Source layout inspiration from <visual_design_references> — these sites were hand-picked because they are doing something compelling.
 
- If the design calls for it, take risks and be bold. It might not always work out and that's okay — the user always has the opportunity to refine.
+ If the design calls for it, take risks and be bold. It might not always work out and that's okay — the user always has the opportunity to refine and will be impressed and pleased with your having taken risks.
 
  #### Things that make layouts interesting
  - Asymmetry — varied column widths, off-center compositions
@@ -42,7 +42,7 @@ If the design calls for it, take risks and be bold. It might not always work out
 
  ### Backgrounds
 
- Backgrounds create atmosphere. Solid white or solid gray is the safe default and the enemy of distinctiveness. Layer subtle gradients, use warm or cool tints, add geometric patterns or contextual textures. The background sets the mood before the user reads a single word. If a background uses images, it must scale to fill and blend nicely. It can never cut off abruptly or be overly visible. Use blend modes, masks, and other effects.
+ Backgrounds create atmosphere. Avoid boring solid-color backgrounds and instead layer subtle gradients, use warm or cool tints, add geometric patterns or contextual textures. The background sets the mood before the user reads a single word. Do not use images as backgrounds.
 
  ### Output
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@mindstudio-ai/remy",
- "version": "0.1.107",
+ "version": "0.1.109",
  "description": "MindStudio coding agent",
  "repository": {
  "type": "git",