@mindstudio-ai/remy 0.1.124 → 0.1.125
- package/dist/automatedActions/buildFromInitialSpec.md +7 -3
- package/dist/automatedActions/buildFromRoadmap.md +1 -1
- package/dist/automatedActions/codeCleanup.md +11 -0
- package/dist/automatedActions/continue.md +1 -1
- package/dist/automatedActions/familiarize.md +13 -0
- package/dist/automatedActions/publish.md +1 -1
- package/dist/automatedActions/reviseFromAnnotatedImage.md +1 -1
- package/dist/headless.js +90 -46
- package/dist/index.js +99 -53
- package/dist/prompt/static/coding.md +1 -0
- package/dist/prompt/static/team.md +1 -1
- package/dist/subagents/codeSanityCheck/prompt.md +10 -4
- package/dist/subagents/designExpert/prompts/images.md +2 -0
- package/package.json +1 -1
package/dist/automatedActions/buildFromInitialSpec.md
CHANGED
|
@@ -11,17 +11,21 @@ Think about your approach and then get a quick sanity check from `codeSanityChec
|
|
|
11
11
|
|
|
12
12
|
If you are building a web frontend, consult `visualDesignExpert` for guidance and ideas on specific component design, UI patterns, and interactions - it has access to a deep repository of design inspiration and will be able to give you great ideas to work with while building. Don't ask it to design full screens - focus on specific components, moments, and concepts where its ideas can be additive and transformative, you already have the basic design and layout guidance from the spec.
|
|
13
13
|
|
|
14
|
+
Use your remy-notes.md file to make a checklist of the work that needs to be done. Don't store implementation details in it - it is solely for keeping track of tasks.
|
|
15
|
+
|
|
14
16
|
## Building
|
|
15
|
-
Then, build everything in one turn:
|
|
17
|
+
Then, build everything in one turn: tables, methods, interfaces, manifest updates, scenarios, and anything else, using the spec as the master plan. Be sure to delete any unnecessary files from the "Hello World" scaffold that already exist in the project, and don't forget to update the page metadata on index.html too.
|
|
16
18
|
|
|
17
19
|
## Verifying
|
|
18
|
-
- First, use `runScenario` to seed test data, then use `runMethod` to confirm important methods work
|
|
20
|
+
- First, use `runScenario` to seed test data, then use `runMethod` to confirm important methods work.
|
|
19
21
|
- If the app has a web frontend, check the browser logs to make sure there are no errors rendering it.
|
|
20
22
|
- Use `runAutomatedBrowserTest` to smoke-test the main UI flow. The dev database is a disposable snapshot, so don't worry about being destructive. Fix any errors before finishing.
|
|
21
23
|
- If there is a scenario that seeds the app with mock data, use it to present the app to the user with initial data seeded, so they can see and play with the real app. Let the user know they can reset the app using a scenario to empty it if they wish. Showing the user something they can play with immediately is important when it comes to landing a strong first impression.
|
|
22
24
|
|
|
23
25
|
## Polishing
|
|
24
|
-
When
|
|
26
|
+
When verification is complete, take a step back and do an explicit polish pass before verifying. Re-read the spec files and the design expert's guidance, then walk through each frontend file looking for design details that got skipped in the initial build: animations, transitions, hover states, micro-interactions, spring physics, entrance reveals, gesture handling, layout issues, and anything else.
|
|
27
|
+
|
|
28
|
+
The initial build prioritizes getting everything connected and functional, but this pass closes the gap between "it works" and "it feels great." In many ways this is *the* most important part of the initial build, as the user's first experience of the deliverable will set their expectations for every iteration that follows. Don't mess this up.
|
|
25
29
|
|
|
26
30
|
Then, ask the `visualDesignExpert` to take a screenshot and verify that the visual design looks correct. Fix any issues it flags - we want the user's first time seeing the finished product to truly wow them.
|
|
27
31
|
|
package/dist/automatedActions/buildFromRoadmap.md
CHANGED
|
@@ -6,7 +6,7 @@ This is an automated action triggered by the user pressing "Build Now" on the ro
|
|
|
6
6
|
|
|
7
7
|
First, review the specific item and think about how it fits with the existing spec and the existing codebase.
|
|
8
8
|
|
|
9
|
-
Then, ask the user clarifying questions about anything that is ambiguous or requires additional input. Consult the team for any design work, architecture review, or SDK guidance - even if they're just quick questions (that's what the team is there for - they want to help and feel valuable!).
|
|
9
|
+
Then, ask the user any clarifying questions about anything that is ambiguous or requires additional input. Consult the team for any design work, architecture review, or SDK guidance - even if they're just quick questions (that's what the team is there for - they want to help and feel valuable!). Additive feature work is the most fun to build, but make sure you take a step back first and consider any technical debt or organization implications - we need to keep the codebase clean, tidy, bug-free, and easy/intuitive to manage. When adding new features, this might mean extracting shared helpers to separate files, breaking apart components into multiple files, making new folders, etc. - Don't go overboard, but also don't keep adding to one giant file until it ends up unmanageable. Consider organization and structure when building the plan.
|
|
10
10
|
|
|
11
11
|
Then, put together a plan to build out the feature. Present the plan to the user for their approval.
|
|
12
12
|
|
package/dist/automatedActions/codeCleanup.md
ADDED
|
@@ -0,0 +1,11 @@
|
|
|
1
|
+
---
|
|
2
|
+
trigger: codeCleanup
|
|
3
|
+
---
|
|
4
|
+
|
|
5
|
+
This is an automated action triggered by the user pressing "Code Cleanup" in the editor. The user cannot see this message, so keep that in mind when replying.
|
|
6
|
+
|
|
7
|
+
First, explore the project and get a sense of what has been built.
|
|
8
|
+
|
|
9
|
+
Read specific files and trace paths, don't just guess at how something works based on partial information. When you are finished exploring, identify any high-impact areas for cleaning up the code. This can include things like directory organization, splitting things into separate files to make the project easier to scan, breaking up large components or screens, deduplicating copy-pasted code, removing dead code, and anything else that will leave the project more robust, reliable, and easier to scan/work on for developers. Do not optimize for the sake of optimizing, only focus on reducing real technical debt and leaving the product better and a more pleasant experience for other developers working on it.
|
|
10
|
+
|
|
11
|
+
When you have a plan, run it by the `codeSanityCheck` to get a second set of eyes. Then, present the plan to the user. The technical detail is important so the user has a sense of what the plan entails, but remember that the user is not very technical, so it is equally important to help them understand the why behind any refactors.
|
package/dist/automatedActions/continue.md
CHANGED
|
@@ -2,4 +2,4 @@
|
|
|
2
2
|
trigger: continue
|
|
3
3
|
---
|
|
4
4
|
|
|
5
|
-
This is an automated message triggered by the user pressing "Try Again" after an error in the
|
|
5
|
+
This is an automated message triggered by the user pressing "Try Again" after an error in the agent's turn. The error was likely a temporary network hiccup or something transient. Briefly acknowledge that you are continuing the work, and then immediately pick up from where you left off in the last conversation turn.
|
package/dist/automatedActions/familiarize.md
ADDED
|
@@ -0,0 +1,13 @@
|
|
|
1
|
+
---
|
|
2
|
+
trigger: familiarize
|
|
3
|
+
---
|
|
4
|
+
|
|
5
|
+
This is an automated action triggered by the user pressing "Familiarize" in the editor. The user cannot see this message, so keep that in mind when replying.
|
|
6
|
+
|
|
7
|
+
Explore the project and get a sense of what has been built. Start with the manifest, then the spec in src/, then any code that has been generated in dist/.
|
|
8
|
+
|
|
9
|
+
Get a sense of the complete project - the intent behind the product, the work that has been done, work that is left to do, places for improvement, anything that feels incomplete, etc.
|
|
10
|
+
|
|
11
|
+
Read specific files and trace paths, don't just guess at how something works based on partial information. After this turn, you need to be able to dive right in and get to work, so make sure you are ready to do that and are in a strong, confident, evidence-based position to engage with the user.
|
|
12
|
+
|
|
13
|
+
When you are finished, provide a summary to the user of everything you found.
|
package/dist/automatedActions/publish.md
CHANGED
|
@@ -12,6 +12,6 @@ If approved:
|
|
|
12
12
|
- Stage and commit any uncommitted changes with a clean, descriptive commit message
|
|
13
13
|
- Push to main
|
|
14
14
|
- Use `mindstudio-prod releases status --wait` to poll the build until it completes. Let the user know it's deploying, then report back when it's live.
|
|
15
|
-
- Once deployed, offer to help with next steps: setting up a custom domain (`mindstudio-prod domains`), checking for errors (`mindstudio-prod requests stats`), seeding production data (`mindstudio-prod db`), or anything else they need for launch.
|
|
15
|
+
- Once deployed, offer to help with next steps: setting up a custom domain (`mindstudio-prod domains`), checking for errors (`mindstudio-prod requests stats`), seeding production data (`mindstudio-prod db`), managing env vars/secrets, or anything else they need for launch.
|
|
16
16
|
|
|
17
17
|
If dismissed, acknowledge and do nothing.
|
package/dist/automatedActions/reviseFromAnnotatedImage.md
CHANGED
|
@@ -2,4 +2,4 @@
|
|
|
2
2
|
trigger: reviseFromAnnotatedImage
|
|
3
3
|
---
|
|
4
4
|
|
|
5
|
-
This is an automated message
|
|
5
|
+
This is an automated message triggered by the user submitting a screenshot of the app that they have annotated with text notes. Please analyze the provided image and make the requested revisions to the web interface.
|
package/dist/headless.js
CHANGED
|
@@ -2480,6 +2480,15 @@ var runMethodTool = {
|
|
|
2480
2480
|
input: {
|
|
2481
2481
|
type: "object",
|
|
2482
2482
|
description: "The input payload to pass to the method. Omit for methods that take no input."
|
|
2483
|
+
},
|
|
2484
|
+
roles: {
|
|
2485
|
+
type: "array",
|
|
2486
|
+
items: { type: "string" },
|
|
2487
|
+
description: 'Optional. Role names for this request (e.g. ["admin"]). Can be used without userId to test role-gated logic. Overrides session-level impersonation for this call only.'
|
|
2488
|
+
},
|
|
2489
|
+
userId: {
|
|
2490
|
+
type: "string",
|
|
2491
|
+
description: "Optional. User ID for this request \u2014 use a managed user's ID to simulate their identity. Overrides session-level impersonation for this call only."
|
|
2483
2492
|
}
|
|
2484
2493
|
},
|
|
2485
2494
|
required: ["method"]
|
|
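Note on the hunk above: it adds two optional properties, `roles` (array of strings) and `userId` (string), next to `input`, while `method` stays the only required field. The sketch below shows what an argument object matching those fields might look like. The helper `buildRunMethodArgs` and the method name `listInvoices` are hypothetical and are not part of the package; only the property names and types come from the diff.

```javascript
// Hypothetical helper (not in the package): assembles an argument object
// using only the schema fields visible in the diff: method, input, roles, userId.
function buildRunMethodArgs(method, options = {}) {
  if (typeof method !== "string" || method.length === 0) {
    throw new TypeError("method is required and must be a non-empty string");
  }
  const args = { method };
  if (options.input !== undefined) {
    args.input = options.input;
  }
  if (options.roles !== undefined) {
    // The schema declares roles as an array whose items are strings.
    const valid =
      Array.isArray(options.roles) &&
      options.roles.every((role) => typeof role === "string");
    if (!valid) {
      throw new TypeError("roles must be an array of strings");
    }
    args.roles = options.roles;
  }
  if (options.userId !== undefined) {
    if (typeof options.userId !== "string") {
      throw new TypeError("userId must be a string");
    }
    args.userId = options.userId;
  }
  return args;
}

// Per the schema description, roles can be supplied without a userId
// to exercise role-gated logic. "listInvoices" is a made-up method name.
const adminOnly = buildRunMethodArgs("listInvoices", { roles: ["admin"] });
console.log(JSON.stringify(adminOnly));
```

According to the property descriptions in the diff, both fields override session-level impersonation for that single call only; how the runtime enforces this is not visible in this hunk.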
@@ -2593,50 +2602,6 @@ async function captureAndAnalyzeScreenshot(promptOrOptions) {
|
|
|
2593
2602
|
return JSON.stringify({ url, analysis, ...styleMap ? { styleMap } : {} });
|
|
2594
2603
|
}
|
|
2595
2604
|
|
|
2596
|
-
// src/tools/code/screenshot.ts
|
|
2597
|
-
var screenshotTool = {
|
|
2598
|
-
clearable: true,
|
|
2599
|
-
definition: {
|
|
2600
|
-
name: "screenshot",
|
|
2601
|
-
description: "Capture a full-height screenshot of the app preview and get a description of what's on screen. Provides static image analysis only, will not capture animations or video. Optionally provide specific questions about what you're looking for. Use a bulleted list to ask many questions at once. To ask additional questions about a screenshot you have already captured, pass its URL as imageUrl to skip recapture.",
|
|
2602
|
-
inputSchema: {
|
|
2603
|
-
type: "object",
|
|
2604
|
-
properties: {
|
|
2605
|
-
prompt: {
|
|
2606
|
-
type: "string",
|
|
2607
|
-
description: "Optional question about the screenshot. If omitted, returns a general description of what's visible."
|
|
2608
|
-
},
|
|
2609
|
-
imageUrl: {
|
|
2610
|
-
type: "string",
|
|
2611
|
-
description: "URL of an existing screenshot to analyze instead of capturing a new one. Use this for additional questions about a previous screenshot."
|
|
2612
|
-
},
|
|
2613
|
-
path: {
|
|
2614
|
-
type: "string",
|
|
2615
|
-
description: 'Navigate to this path before capturing (e.g. "/settings", "/dashboard"). If omitted, screenshots the current page.'
|
|
2616
|
-
}
|
|
2617
|
-
}
|
|
2618
|
-
}
|
|
2619
|
-
},
|
|
2620
|
-
async execute(input, context) {
|
|
2621
|
-
try {
|
|
2622
|
-
if (input.imageUrl) {
|
|
2623
|
-
return await captureAndAnalyzeScreenshot({
|
|
2624
|
-
prompt: input.prompt,
|
|
2625
|
-
imageUrl: input.imageUrl,
|
|
2626
|
-
onLog: context?.onLog
|
|
2627
|
-
});
|
|
2628
|
-
}
|
|
2629
|
-
return await captureAndAnalyzeScreenshot({
|
|
2630
|
-
prompt: input.prompt,
|
|
2631
|
-
path: input.path,
|
|
2632
|
-
onLog: context?.onLog
|
|
2633
|
-
});
|
|
2634
|
-
} catch (err) {
|
|
2635
|
-
return `Error taking screenshot: ${err.message}`;
|
|
2636
|
-
}
|
|
2637
|
-
}
|
|
2638
|
-
};
|
|
2639
|
-
|
|
2640
2605
|
// src/statusWatcher.ts
|
|
2641
2606
|
function startStatusWatcher(config) {
|
|
2642
2607
|
const { apiConfig, getContext, onStatus, interval = 3e3, signal } = config;
|
|
@@ -3467,6 +3432,85 @@ var browserAutomationTool = {
|
|
|
3467
3432
|
}
|
|
3468
3433
|
};
|
|
3469
3434
|
|
|
3435
|
+
// src/tools/code/screenshot.ts
|
|
3436
|
+
var screenshotTool = {
|
|
3437
|
+
clearable: true,
|
|
3438
|
+
definition: {
|
|
3439
|
+
name: "screenshot",
|
|
3440
|
+
description: "Capture a full-height screenshot of the app preview and get a description of what's on screen. Provides static image analysis only, will not capture animations or video. Optionally provide specific questions about what you're looking for. Use a bulleted list to ask many questions at once. To ask additional questions about a screenshot you have already captured, pass its URL as imageUrl to skip recapture. If the screenshot requires interaction first (logging in, clicking a tab, dismissing a modal), use the instructions param to describe the steps.",
|
|
3441
|
+
inputSchema: {
|
|
3442
|
+
type: "object",
|
|
3443
|
+
properties: {
|
|
3444
|
+
prompt: {
|
|
3445
|
+
type: "string",
|
|
3446
|
+
description: "Optional question about the screenshot. If omitted, returns a general description of what's visible."
|
|
3447
|
+
},
|
|
3448
|
+
imageUrl: {
|
|
3449
|
+
type: "string",
|
|
3450
|
+
description: "URL of an existing screenshot to analyze instead of capturing a new one. Use this for additional questions about a previous screenshot."
|
|
3451
|
+
},
|
|
3452
|
+
path: {
|
|
3453
|
+
type: "string",
|
|
3454
|
+
description: 'Navigate to this path before capturing (e.g. "/settings", "/dashboard"). If omitted, screenshots the current page.'
|
|
3455
|
+
},
|
|
3456
|
+
instructions: {
|
|
3457
|
+
type: "string",
|
|
3458
|
+
description: "If the screenshot you need requires interaction first (dismissing a modal, clicking a tab, filling out a form, navigating a flow, getting through a login/auth checkpoint), describe the steps to get there. A browser automation agent will follow these instructions before capturing the screenshot - it can bypass auth and get right to where it needs to be if you tell it to authenticate as a test user and give it the path/screen to start its test at. You will always get back a full-height screenshot of the entire page. Do not attempt to scroll or capture specific areas. Only use instructions when you need to trigger stateful changes. Never describe what names or values to use when applying the isntructions - the browser automation agent must use its own values for it to work properly. If a specific auth role is required to access the content, be sure to note that - it can automatically assume it for the purpose of testing."
|
|
3459
|
+
}
|
|
3460
|
+
}
|
|
3461
|
+
}
|
|
3462
|
+
},
|
|
3463
|
+
async execute(input, context) {
|
|
3464
|
+
try {
|
|
3465
|
+
if (input.imageUrl) {
|
|
3466
|
+
return await captureAndAnalyzeScreenshot({
|
|
3467
|
+
prompt: input.prompt,
|
|
3468
|
+
imageUrl: input.imageUrl,
|
|
3469
|
+
onLog: context?.onLog
|
|
3470
|
+
});
|
|
3471
|
+
}
|
|
3472
|
+
if (input.instructions && context) {
|
|
3473
|
+
const task = input.path ? `Navigate to "${input.path}", then: ${input.instructions}. After completing these steps, take a full-page screenshot.` : `${input.instructions}. After completing these steps, take a full-page screenshot.`;
|
|
3474
|
+
const result = await browserAutomationTool.execute({ task }, context);
|
|
3475
|
+
const urlMatch = result.match(
|
|
3476
|
+
/https:\/\/[^\s"')]+\.(?:png|jpg|jpeg|webp)/i
|
|
3477
|
+
);
|
|
3478
|
+
if (!urlMatch) {
|
|
3479
|
+
return `Error: browser navigation completed but no screenshot URL was returned. Agent output: ${result}`;
|
|
3480
|
+
}
|
|
3481
|
+
const url = urlMatch[0];
|
|
3482
|
+
let styleMap;
|
|
3483
|
+
try {
|
|
3484
|
+
const parsed = JSON.parse(result);
|
|
3485
|
+
styleMap = parsed?.styleMap;
|
|
3486
|
+
} catch {
|
|
3487
|
+
}
|
|
3488
|
+
const analysisPrompt = buildScreenshotAnalysisPrompt({
|
|
3489
|
+
prompt: input.prompt,
|
|
3490
|
+
styleMap
|
|
3491
|
+
});
|
|
3492
|
+
const analysis = await analyzeImage({
|
|
3493
|
+
prompt: analysisPrompt,
|
|
3494
|
+
imageUrl: url,
|
|
3495
|
+
onLog: context?.onLog
|
|
3496
|
+
});
|
|
3497
|
+
return JSON.stringify({
|
|
3498
|
+
url,
|
|
3499
|
+
analysis,
|
|
3500
|
+
...styleMap ? { styleMap } : {}
|
|
3501
|
+
});
|
|
3502
|
+
}
|
|
3503
|
+
return await captureAndAnalyzeScreenshot({
|
|
3504
|
+
prompt: input.prompt,
|
|
3505
|
+
path: input.path,
|
|
3506
|
+
onLog: context?.onLog
|
|
3507
|
+
});
|
|
3508
|
+
} catch (err) {
|
|
3509
|
+
return `Error taking screenshot: ${err.message}`;
|
|
3510
|
+
}
|
|
3511
|
+
}
|
|
3512
|
+
};
|
|
3513
|
+
|
|
3470
3514
|
// src/subagents/designExpert/tools/searchGoogle.ts
|
|
3471
3515
|
var searchGoogle_exports = {};
|
|
3472
3516
|
__export(searchGoogle_exports, {
|
|
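Note on the hunk above: when `instructions` is set, the new `execute` branch hands a task to `browserAutomationTool`, pulls the first image URL out of the returned text with a regular expression, and then tries to read `styleMap` from the same text as JSON. The sketch below isolates just that post-processing step so its behavior is easier to see. The function name `parseAutomationResult`, its return shape, and the example URLs are assumptions for illustration; the regular expression is copied from the diff.

```javascript
// Same pattern as in the diff: the first https URL that ends in an image extension.
const SCREENSHOT_URL_PATTERN = /https:\/\/[^\s"')]+\.(?:png|jpg|jpeg|webp)/i;

// Hypothetical helper mirroring the post-processing inside execute().
function parseAutomationResult(result) {
  const urlMatch = result.match(SCREENSHOT_URL_PATTERN);
  if (!urlMatch) {
    return { ok: false, error: "no screenshot URL found" };
  }
  let styleMap;
  try {
    // Only succeeds when the whole output is valid JSON.
    styleMap = JSON.parse(result)?.styleMap;
  } catch {
    // Output was plain text; continue without a style map.
  }
  return { ok: true, url: urlMatch[0], ...(styleMap ? { styleMap } : {}) };
}

// Example inputs (made-up URLs on example.com).
const fromJson = parseAutomationResult(
  JSON.stringify({
    url: "https://cdn.example.com/shots/home.png",
    styleMap: { body: "16px" },
  })
);
const fromText = parseAutomationResult(
  "Done (see https://cdn.example.com/shots/tab.webp)"
);
const missing = parseAutomationResult("Finished, but nothing was captured.");
console.log(fromJson.url, fromText.url, missing.ok);
```

Two properties of the pattern are worth keeping in mind: a URL without one of the four listed extensions is not matched at all, and anything after the extension (such as a query string) is dropped from the match.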
@@ -3651,7 +3695,7 @@ var definition5 = {
|
|
|
3651
3695
|
properties: {
|
|
3652
3696
|
prompt: {
|
|
3653
3697
|
type: "string",
|
|
3654
|
-
description: "Optional specific question about the screenshot."
|
|
3698
|
+
description: "Optional specific question about the screenshot. Use a bulleted list to ask many questions at once."
|
|
3655
3699
|
},
|
|
3656
3700
|
path: {
|
|
3657
3701
|
type: "string",
|
|
@@ -3659,7 +3703,7 @@ var definition5 = {
|
|
|
3659
3703
|
},
|
|
3660
3704
|
instructions: {
|
|
3661
3705
|
type: "string",
|
|
3662
|
-
description: "If the screenshot you need requires interaction first (dismissing a modal, clicking a tab, filling out a form, navigating a flow, getting through a login/auth checkpoint), describe the steps to get there. A browser automation agent will follow these instructions before capturing the screenshot - it can bypass auth and get right to where it needs to be if you tell it to authenticate as a test user and give it the path/screen to start its test at. You will always get back a full-height screenshot of the entire page. Do not attempt to scroll or capture specific areas. Only use instructions when you need to trigger stateful changes. Never describe what names or values to use when applying the isntructions - the browser automation agent must use its own values for it to work properly."
|
|
3706
|
+
description: "If the screenshot you need requires interaction first (dismissing a modal, clicking a tab, filling out a form, navigating a flow, getting through a login/auth checkpoint), describe the steps to get there. A browser automation agent will follow these instructions before capturing the screenshot - it can bypass auth and get right to where it needs to be if you tell it to authenticate as a test user and give it the path/screen to start its test at. You will always get back a full-height screenshot of the entire page. Do not attempt to scroll or capture specific areas. Only use instructions when you need to trigger stateful changes. Never describe what names or values to use when applying the isntructions - the browser automation agent must use its own values for it to work properly. If a specific auth role is required to access the content, be sure to note that - it can automatically assume it for the purpose of testing."
|
|
3663
3707
|
}
|
|
3664
3708
|
}
|
|
3665
3709
|
}
|
package/dist/index.js
CHANGED
|
@@ -2161,6 +2161,15 @@ var init_runMethod = __esm({
|
|
|
2161
2161
|
input: {
|
|
2162
2162
|
type: "object",
|
|
2163
2163
|
description: "The input payload to pass to the method. Omit for methods that take no input."
|
|
2164
|
+
},
|
|
2165
|
+
roles: {
|
|
2166
|
+
type: "array",
|
|
2167
|
+
items: { type: "string" },
|
|
2168
|
+
description: 'Optional. Role names for this request (e.g. ["admin"]). Can be used without userId to test role-gated logic. Overrides session-level impersonation for this call only.'
|
|
2169
|
+
},
|
|
2170
|
+
userId: {
|
|
2171
|
+
type: "string",
|
|
2172
|
+
description: "Optional. User ID for this request \u2014 use a managed user's ID to simulate their identity. Overrides session-level impersonation for this call only."
|
|
2164
2173
|
}
|
|
2165
2174
|
},
|
|
2166
2175
|
required: ["method"]
|
|
@@ -2297,57 +2306,6 @@ Respond only with your analysis as Markdown and absolutely no other text. Do not
|
|
|
2297
2306
|
}
|
|
2298
2307
|
});
|
|
2299
2308
|
|
|
2300
|
-
// src/tools/code/screenshot.ts
|
|
2301
|
-
var screenshotTool;
|
|
2302
|
-
var init_screenshot2 = __esm({
|
|
2303
|
-
"src/tools/code/screenshot.ts"() {
|
|
2304
|
-
"use strict";
|
|
2305
|
-
init_screenshot();
|
|
2306
|
-
screenshotTool = {
|
|
2307
|
-
clearable: true,
|
|
2308
|
-
definition: {
|
|
2309
|
-
name: "screenshot",
|
|
2310
|
-
description: "Capture a full-height screenshot of the app preview and get a description of what's on screen. Provides static image analysis only, will not capture animations or video. Optionally provide specific questions about what you're looking for. Use a bulleted list to ask many questions at once. To ask additional questions about a screenshot you have already captured, pass its URL as imageUrl to skip recapture.",
|
|
2311
|
-
inputSchema: {
|
|
2312
|
-
type: "object",
|
|
2313
|
-
properties: {
|
|
2314
|
-
prompt: {
|
|
2315
|
-
type: "string",
|
|
2316
|
-
description: "Optional question about the screenshot. If omitted, returns a general description of what's visible."
|
|
2317
|
-
},
|
|
2318
|
-
imageUrl: {
|
|
2319
|
-
type: "string",
|
|
2320
|
-
description: "URL of an existing screenshot to analyze instead of capturing a new one. Use this for additional questions about a previous screenshot."
|
|
2321
|
-
},
|
|
2322
|
-
path: {
|
|
2323
|
-
type: "string",
|
|
2324
|
-
description: 'Navigate to this path before capturing (e.g. "/settings", "/dashboard"). If omitted, screenshots the current page.'
|
|
2325
|
-
}
|
|
2326
|
-
}
|
|
2327
|
-
}
|
|
2328
|
-
},
|
|
2329
|
-
async execute(input, context) {
|
|
2330
|
-
try {
|
|
2331
|
-
if (input.imageUrl) {
|
|
2332
|
-
return await captureAndAnalyzeScreenshot({
|
|
2333
|
-
prompt: input.prompt,
|
|
2334
|
-
imageUrl: input.imageUrl,
|
|
2335
|
-
onLog: context?.onLog
|
|
2336
|
-
});
|
|
2337
|
-
}
|
|
2338
|
-
return await captureAndAnalyzeScreenshot({
|
|
2339
|
-
prompt: input.prompt,
|
|
2340
|
-
path: input.path,
|
|
2341
|
-
onLog: context?.onLog
|
|
2342
|
-
});
|
|
2343
|
-
} catch (err) {
|
|
2344
|
-
return `Error taking screenshot: ${err.message}`;
|
|
2345
|
-
}
|
|
2346
|
-
}
|
|
2347
|
-
};
|
|
2348
|
-
}
|
|
2349
|
-
});
|
|
2350
|
-
|
|
2351
2309
|
// src/statusWatcher.ts
|
|
2352
2310
|
function startStatusWatcher(config) {
|
|
2353
2311
|
const { apiConfig, getContext, onStatus, interval = 3e3, signal } = config;
|
|
@@ -3267,6 +3225,94 @@ var init_browserAutomation = __esm({
|
|
|
3267
3225
|
}
|
|
3268
3226
|
});
|
|
3269
3227
|
|
|
3228
|
+
// src/tools/code/screenshot.ts
|
|
3229
|
+
var screenshotTool;
|
|
3230
|
+
var init_screenshot2 = __esm({
|
|
3231
|
+
"src/tools/code/screenshot.ts"() {
|
|
3232
|
+
"use strict";
|
|
3233
|
+
init_screenshot();
|
|
3234
|
+
init_analyzeImage();
|
|
3235
|
+
init_browserAutomation();
|
|
3236
|
+
screenshotTool = {
|
|
3237
|
+
clearable: true,
|
|
3238
|
+
definition: {
|
|
3239
|
+
name: "screenshot",
|
|
3240
|
+
description: "Capture a full-height screenshot of the app preview and get a description of what's on screen. Provides static image analysis only, will not capture animations or video. Optionally provide specific questions about what you're looking for. Use a bulleted list to ask many questions at once. To ask additional questions about a screenshot you have already captured, pass its URL as imageUrl to skip recapture. If the screenshot requires interaction first (logging in, clicking a tab, dismissing a modal), use the instructions param to describe the steps.",
|
|
3241
|
+
inputSchema: {
|
|
3242
|
+
type: "object",
|
|
3243
|
+
properties: {
|
|
3244
|
+
prompt: {
|
|
3245
|
+
type: "string",
|
|
3246
|
+
description: "Optional question about the screenshot. If omitted, returns a general description of what's visible."
|
|
3247
|
+
},
|
|
3248
|
+
imageUrl: {
|
|
3249
|
+
type: "string",
|
|
3250
|
+
description: "URL of an existing screenshot to analyze instead of capturing a new one. Use this for additional questions about a previous screenshot."
|
|
3251
|
+
},
|
|
3252
|
+
path: {
|
|
3253
|
+
type: "string",
|
|
3254
|
+
description: 'Navigate to this path before capturing (e.g. "/settings", "/dashboard"). If omitted, screenshots the current page.'
|
|
3255
|
+
},
|
|
3256
|
+
instructions: {
|
|
3257
|
+
type: "string",
|
|
3258
|
+
description: "If the screenshot you need requires interaction first (dismissing a modal, clicking a tab, filling out a form, navigating a flow, getting through a login/auth checkpoint), describe the steps to get there. A browser automation agent will follow these instructions before capturing the screenshot - it can bypass auth and get right to where it needs to be if you tell it to authenticate as a test user and give it the path/screen to start its test at. You will always get back a full-height screenshot of the entire page. Do not attempt to scroll or capture specific areas. Only use instructions when you need to trigger stateful changes. Never describe what names or values to use when applying the isntructions - the browser automation agent must use its own values for it to work properly. If a specific auth role is required to access the content, be sure to note that - it can automatically assume it for the purpose of testing."
|
|
3259
|
+
}
|
|
3260
|
+
}
|
|
3261
|
+
}
|
|
3262
|
+
},
|
|
3263
|
+
async execute(input, context) {
|
|
3264
|
+
try {
|
|
3265
|
+
if (input.imageUrl) {
|
|
3266
|
+
return await captureAndAnalyzeScreenshot({
|
|
3267
|
+
prompt: input.prompt,
|
|
3268
|
+
imageUrl: input.imageUrl,
|
|
3269
|
+
onLog: context?.onLog
|
|
3270
|
+
});
|
|
3271
|
+
}
|
|
3272
|
+
if (input.instructions && context) {
|
|
3273
|
+
const task = input.path ? `Navigate to "${input.path}", then: ${input.instructions}. After completing these steps, take a full-page screenshot.` : `${input.instructions}. After completing these steps, take a full-page screenshot.`;
|
|
3274
|
+
const result = await browserAutomationTool.execute({ task }, context);
|
|
3275
|
+
const urlMatch = result.match(
|
|
3276
|
+
/https:\/\/[^\s"')]+\.(?:png|jpg|jpeg|webp)/i
|
|
3277
|
+
);
|
|
3278
|
+
if (!urlMatch) {
|
|
3279
|
+
return `Error: browser navigation completed but no screenshot URL was returned. Agent output: ${result}`;
|
|
3280
|
+
}
|
|
3281
|
+
const url = urlMatch[0];
|
|
3282
|
+
let styleMap;
|
|
3283
|
+
try {
|
|
3284
|
+
const parsed = JSON.parse(result);
|
|
3285
|
+
styleMap = parsed?.styleMap;
|
|
3286
|
+
} catch {
|
|
3287
|
+
}
|
|
3288
|
+
const analysisPrompt = buildScreenshotAnalysisPrompt({
|
|
3289
|
+
prompt: input.prompt,
|
|
3290
|
+
styleMap
|
|
3291
|
+
});
|
|
3292
|
+
const analysis = await analyzeImage({
|
|
3293
|
+
prompt: analysisPrompt,
|
|
3294
|
+
imageUrl: url,
|
|
3295
|
+
onLog: context?.onLog
|
|
3296
|
+
});
|
|
3297
|
+
return JSON.stringify({
|
|
3298
|
+
url,
|
|
3299
|
+
analysis,
|
|
3300
|
+
...styleMap ? { styleMap } : {}
|
|
3301
|
+
});
|
|
3302
|
+
}
|
|
3303
|
+
return await captureAndAnalyzeScreenshot({
|
|
3304
|
+
prompt: input.prompt,
|
|
3305
|
+
path: input.path,
|
|
3306
|
+
onLog: context?.onLog
|
|
3307
|
+
});
|
|
3308
|
+
} catch (err) {
|
|
3309
|
+
return `Error taking screenshot: ${err.message}`;
|
|
3310
|
+
}
|
|
3311
|
+
}
|
|
3312
|
+
};
|
|
3313
|
+
}
|
|
3314
|
+
});
|
|
3315
|
+
|
|
3270
3316
|
// src/subagents/designExpert/tools/searchGoogle.ts
|
|
3271
3317
|
var searchGoogle_exports = {};
|
|
3272
3318
|
__export(searchGoogle_exports, {
|
|
@@ -3534,7 +3580,7 @@ var init_screenshot3 = __esm({
|
|
|
3534
3580
|
properties: {
|
|
3535
3581
|
prompt: {
|
|
3536
3582
|
type: "string",
|
|
3537
|
-
description: "Optional specific question about the screenshot."
|
|
3583
|
+
description: "Optional specific question about the screenshot. Use a bulleted list to ask many questions at once."
|
|
3538
3584
|
},
|
|
3539
3585
|
path: {
|
|
3540
3586
|
type: "string",
|
|
@@ -3542,7 +3588,7 @@ var init_screenshot3 = __esm({
|
|
|
3542
3588
|
},
|
|
3543
3589
|
instructions: {
|
|
3544
3590
|
type: "string",
|
|
3545
|
-
description: "If the screenshot you need requires interaction first (dismissing a modal, clicking a tab, filling out a form, navigating a flow, getting through a login/auth checkpoint), describe the steps to get there. A browser automation agent will follow these instructions before capturing the screenshot - it can bypass auth and get right to where it needs to be if you tell it to authenticate as a test user and give it the path/screen to start its test at. You will always get back a full-height screenshot of the entire page. Do not attempt to scroll or capture specific areas. Only use instructions when you need to trigger stateful changes. Never describe what names or values to use when applying the isntructions - the browser automation agent must use its own values for it to work properly."
|
|
3591
|
+
description: "If the screenshot you need requires interaction first (dismissing a modal, clicking a tab, filling out a form, navigating a flow, getting through a login/auth checkpoint), describe the steps to get there. A browser automation agent will follow these instructions before capturing the screenshot - it can bypass auth and get right to where it needs to be if you tell it to authenticate as a test user and give it the path/screen to start its test at. You will always get back a full-height screenshot of the entire page. Do not attempt to scroll or capture specific areas. Only use instructions when you need to trigger stateful changes. Never describe what names or values to use when applying the isntructions - the browser automation agent must use its own values for it to work properly. If a specific auth role is required to access the content, be sure to note that - it can automatically assume it for the purpose of testing."
|
|
3546
3592
|
}
|
|
3547
3593
|
}
|
|
3548
3594
|
}
|
package/dist/prompt/static/coding.md
CHANGED
|
@@ -43,6 +43,7 @@ For multi-step tasks with branching logic (research, enrichment, content pipelin
|
|
|
43
43
|
- Use `dvh`/`svh` for mobile viewport heights.
|
|
44
44
|
- Use `clamp()`, `min()`, `max()` for fluid sizing instead of fixed pixel values with media query breakpoints.
|
|
45
45
|
- Use container queries for components that need to adapt to their container rather than the viewport.
|
|
46
|
+
- For canvas-based UIs (games, visualizations, interactive graphics): size the canvas to fill its container, account for `devicePixelRatio` for Retina sharpness, and scale all objects relative to the viewport — not in fixed pixel sizes - otherwise they are going to be tiny and unusable.
|
|
46
47
|
|
|
47
48
|
### State Management
|
|
48
49
|
- Calls to methods introduce latency. When building web frontends that load data from methods, front-load as much data as you can in a single API request - e.g., when possible, load a large data object into a central store and use that to render sub-screens in an app, rather than an API call on every screen. User experience and perceived speed/performance are far more valuable than normalization and good REST API design.
|
|
@@ -40,7 +40,7 @@ Always consult the code sanity check before writing code in initialCodegen with
|
|
|
40
40
|
|
|
41
41
|
For verifying complex stateful interactions: multi-step form submissions, auth flows, real-time updates, flows that require specific data/role setup. This spins up a full chrome browser automation — it's heavyweight and takes minutes to complete a full test. Do not use it for basic rendering or navigation checks. If you can verify something with a screenshot or by reading the code, do that instead. Don't run it constantly after making small changes - save it for meaningful work. Run a scenario first to seed test data and set user roles. The user is able to watch QA work on their screen via a live browser preview - the cursor will move, type, etc - so you can also use this to demo functionality to the user and help them understand how to use their app.
|
|
42
42
|
|
|
43
|
-
The QA agent can see the screen. Describe what to test, not how — it will figure out what to click, what to check, and what values to use. By default, it always starts its tests logged out/unauthenticated on "/" root, but if you want to test a deeper piece of the app it can bypass auth and automatically authenticate itself as any user/role - just tell it to authenticate as the test user and navigate to X to start the test. After every test session, the browser is reset to / and any authentication used or created by the tester is cleared and reset.
|
|
43
|
+
The QA agent can see the screen. Describe what to test, not how — it will figure out what to click, what to check, and what values to use. By default, it always starts its tests logged out/unauthenticated on "/" root, but if you want to test a deeper piece of the app it can bypass auth and automatically authenticate itself as any user/role - just tell it to authenticate as the test user and navigate to X to start the test. If a specific auth role is required to access the content, be sure to note that - it can automatically assume it for the purpose of testing. After every test session, the browser is reset to / and any authentication used or created by the tester is cleared and reset.
|
|
44
44
|
|
|
45
45
|
Never tell QA what names to use when testing or what values to input - it will use its own judgment.
|
|
46
46
|
|
|
@@ -33,6 +33,7 @@ These are things we already know about and have decided to accept:
|
|
|
33
33
|
- @tabler/icons-react
|
|
34
34
|
- streamdown
|
|
35
35
|
- react-textarea-autosize
|
|
36
|
+
- phaser
|
|
36
37
|
- Preferences:
|
|
37
38
|
- use [wouter](https://github.com/molefrog/wouter) for React routing instead of reaching for react-router
|
|
38
39
|
- uploading user files should always happen via `platform.uploadFile()` from `@mindstudio-ai/interface` — not custom S3 code, not FormData to a method endpoint
|
|
@@ -40,10 +41,6 @@ These are things we already know about and have decided to accept:
|
|
|
40
41
|
|
|
41
42
|
### Common pitfalls (always flag these)
|
|
42
43
|
|
|
43
|
-
- **External redirects in iframe.** If the plan involves redirecting to a third-party domain (payment checkout, OAuth login, external auth), flag that `window.location.href = url` will break in the preview iframe. Use `window.open(url, '_top')` for same-tab navigation or `window.open(url, '_blank')` for a new tab. This applies to any external redirect — Stripe, OAuth providers, third-party login pages.
|
|
44
|
-
|
|
45
|
-
- **Hardcoded credentials.** If the plan or code contains API keys, tokens, or connection strings inline, flag it — these should be `process.env` secrets managed via the dashboard. Also flag if the plan uses `process.env` for something the MindStudio SDK already handles (AI model keys, email/SMS sending, etc.).
|
|
46
|
-
|
|
47
44
|
These are recurring mistakes the coding agent makes. If you see the conditions for any of these, flag it proactively:
|
|
48
45
|
|
|
49
46
|
- **CSS Module animation scoping.** If the agent defines `@keyframes` in a global CSS file but references the animation name from a CSS Module, the animation will silently fail. CSS Modules scope animation names, so a keyframe defined globally can't be found by a scoped class. The fix: define keyframes in the same CSS Module that uses them, or use `:global()` to escape the scoping.
|
|
@@ -72,6 +69,15 @@ When a plan includes multiple screens/API calls, always note this item for the d
|
|
|
72
69
|
|
|
73
70
|
- **Layout shift with dynamic data or AI generated text** If the plan includes dynamically-sized data (e.g., a wizard form with questions of differing lengths) or AI generated text (where text stream length is unpredictable), make sure to flag concerns about layout stability. Everything must either be a fixed size or smoothly animate between sizes. Text can never be clipped by a container or cause layout to jump around or grow in snappy/janky ways. Make sure to remind the developer that this is important to pay attention to.
|
|
74
71
|
|
|
72
|
+
- **Games** If the plan involves building a browser game, make sure to remind the developer of the following:
|
|
73
|
+
- Use the correct tool for the scope: plain Canvas API for simple games, Phaser for 2D games that need physics/sprites/scene management, Three.js for 3D.
|
|
74
|
+
- The design expert can generate beautiful assets — characters, objects, backgrounds, UI elements — with transparent backgrounds at proper resolution.
|
|
75
|
+
- Most importantly, make sure all UI and assets are responsively sized, especially for modern mobile retina devices. This is critical - things with assets too small are literally unplayable on modern devices and the user will be disappointed.
|
|
76
|
+
|
|
77
|
+
- **External redirects in iframe.** If the plan involves redirecting to a third-party domain (payment checkout, OAuth login, external auth), flag that `window.location.href = url` will break in the preview iframe. Use `window.open(url, '_top')` for same-tab navigation or `window.open(url, '_blank')` for a new tab. This applies to any external redirect — Stripe, OAuth providers, third-party login pages.
|
|
78
|
+
|
|
79
|
+
- **Hardcoded credentials.** If the plan or code contains API keys, tokens, or connection strings inline, flag it — these should be `process.env` secrets managed via the dashboard. Also flag if the plan uses `process.env` for something the MindStudio SDK already handles (AI model keys, email/SMS sending, etc.).
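Reviewer note: the external-redirect pitfall listed above can be sketched as follows. This is an illustration under stated assumptions, not code from the package: `externalRedirect`, `redirectTarget`, `RedirectMode`, and `WindowLike` are hypothetical names, and the only real API relied on is `window.open(url, target)` with the `_top` and `_blank` targets named in the prompt text.

```typescript
// Hypothetical wrapper for illustration only; not part of the package.

// Minimal structural type so the sketch can be exercised without a real browser window.
interface WindowLike {
  open(url: string, target?: string): unknown;
}

type RedirectMode = "same-tab" | "new-tab";

// Map the intent to the window.open target named in the prompt text.
function redirectTarget(mode: RedirectMode): "_top" | "_blank" {
  return mode === "new-tab" ? "_blank" : "_top";
}

// Navigate to an external URL (checkout, OAuth, third-party login) without
// assigning window.location.href, which the prompt says breaks in the preview iframe.
function externalRedirect(win: WindowLike, url: string, mode: RedirectMode = "same-tab"): void {
  win.open(url, redirectTarget(mode));
}
```

In application code the first argument would be the global `window`, for example `externalRedirect(window, checkoutUrl)`; taking it as a parameter is only a design choice that keeps the helper testable.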
|
|
80
|
+
|
|
75
81
|
## When to stay quiet
|
|
76
82
|
|
|
77
83
|
Nits, style preferences, missing edge cases, things the agent will figure out as it goes, patterns that are "not ideal but fine," minor code smells. Let them slide. The agent is busy.
|
|
@@ -85,6 +85,8 @@ Include image recommendations in your designs when the product calls for it. A l
|
|
|
85
85
|
|
|
86
86
|
Transparent assets open up layout possibilities: a mascot or other object, images for empty states, onboarding flows. When the design calls for layered compositions, generate the elements separately with transparent backgrounds rather than trying to compose everything into a single flat image.
|
|
87
87
|
|
|
88
|
+
For games and other interactive projects, generate real assets — characters, objects, enemies, backgrounds, UI elements — with `transparentBackground: true`.
|
|
89
|
+
|
|
88
90
|
### CDN image transforms
|
|
89
91
|
|
|
90
92
|
Generated images and uploaded images are hosted on `i.mscdn.ai`. Use query string parameters to request appropriately sized images rather than CSS-scaling full-resolution originals:
|