@mindstudio-ai/remy 0.1.34 → 0.1.35
- package/dist/headless.js +578 -393
- package/dist/index.js +652 -385
- package/dist/prompt/sources/llms.txt +1618 -0
- package/dist/prompt/static/instructions.md +1 -1
- package/dist/prompt/static/team.md +1 -1
- package/dist/subagents/.notes-background-agents.md +60 -48
- package/dist/subagents/browserAutomation/prompt.md +14 -11
- package/dist/subagents/designExpert/data/sources/dev/index.html +901 -0
- package/dist/subagents/designExpert/data/sources/dev/serve.mjs +244 -0
- package/dist/subagents/designExpert/data/sources/dev/specimens-fonts.html +126 -0
- package/dist/subagents/designExpert/data/sources/dev/specimens-pairings.html +114 -0
- package/dist/subagents/designExpert/data/{fonts.json → sources/fonts.json} +0 -97
- package/dist/subagents/designExpert/data/sources/inspiration.json +392 -0
- package/dist/subagents/designExpert/prompt.md +36 -12
- package/dist/subagents/designExpert/prompts/animation.md +14 -6
- package/dist/subagents/designExpert/prompts/color.md +25 -5
- package/dist/subagents/designExpert/prompts/{icons.md → components.md} +17 -5
- package/dist/subagents/designExpert/prompts/frontend-design-notes.md +17 -122
- package/dist/subagents/designExpert/prompts/identity.md +15 -61
- package/dist/subagents/designExpert/prompts/images.md +35 -10
- package/dist/subagents/designExpert/prompts/layout.md +14 -9
- package/dist/subagents/designExpert/prompts/typography.md +39 -0
- package/package.json +2 -2
- package/dist/actions/buildFromInitialSpec.md +0 -15
- package/dist/actions/publish.md +0 -12
- package/dist/actions/sync.md +0 -19
- package/dist/compiled/README.md +0 -100
- package/dist/compiled/auth.md +0 -77
- package/dist/compiled/design.md +0 -251
- package/dist/compiled/dev-and-deploy.md +0 -69
- package/dist/compiled/interfaces.md +0 -238
- package/dist/compiled/manifest.md +0 -107
- package/dist/compiled/media-cdn.md +0 -51
- package/dist/compiled/methods.md +0 -225
- package/dist/compiled/msfm.md +0 -222
- package/dist/compiled/platform.md +0 -105
- package/dist/compiled/scenarios.md +0 -103
- package/dist/compiled/sdk-actions.md +0 -146
- package/dist/compiled/tables.md +0 -263
- package/dist/static/authoring.md +0 -101
- package/dist/static/coding.md +0 -29
- package/dist/static/identity.md +0 -1
- package/dist/static/instructions.md +0 -31
- package/dist/static/intake.md +0 -44
- package/dist/static/lsp.md +0 -4
- package/dist/static/projectContext.ts +0 -160
- package/dist/static/team.md +0 -39
- package/dist/subagents/designExpert/data/inspiration.json +0 -392
- package/dist/subagents/designExpert/prompts/instructions.md +0 -18
- package/dist/subagents/designExpert/data/{compile-font-descriptions.sh → sources/compile-font-descriptions.sh} +0 -0
- package/dist/subagents/designExpert/data/{compile-inspiration.sh → sources/compile-inspiration.sh} +0 -0
- package/dist/subagents/designExpert/data/{inspiration.raw.json → sources/inspiration.raw.json} +0 -0
- package/dist/subagents/designExpert/{prompts/tool-prompts → data/sources/prompts}/design-analysis.md +0 -0
- package/dist/subagents/designExpert/{prompts/tool-prompts → data/sources/prompts}/font-analysis.md +0 -0
|
@@ -18,7 +18,7 @@
|
|
|
18
18
|
## Communication
|
|
19
19
|
The user can already see your tool calls, so most of your work is visible without narration. Focus text output on three things:
|
|
20
20
|
- **Decisions that need input.** Questions, tradeoffs, ambiguity that blocks progress.
|
|
21
|
-
- **Milestones.** What you built, what
|
|
21
|
+
- **Milestones.** What you built, what changed. Summarize in plain language rather than listing a per-file changelog.
|
|
22
22
|
- **Errors or blockers.** Something failed or the approach needs to shift.
|
|
23
23
|
|
|
24
24
|
Skip the rest: narrating what you're about to do, restating what the user asked, explaining tool calls they can already see.
|
|
@@ -8,7 +8,7 @@ Note: when you talk about the team to the user, refer to them by their name or a
|
|
|
8
8
|
|
|
9
9
|
### Design Expert (`visualDesignExpert`)
|
|
10
10
|
|
|
11
|
-
Your designer. Consult for any visual decision — choosing a color, picking fonts, proposing a layout, generating images, reviewing whether something looks good. Not just during intake or big design moments. If you're about to write CSS and you're not sure about a color, ask. If you just built a page and want a gut check,
|
|
11
|
+
Your designer. Consult for any visual decision — choosing a color, picking fonts, proposing a layout, generating images, reviewing whether something looks good. Not just during intake or big design moments. If you're about to write CSS and you're not sure about a color, ask. If you just built a page and want a gut check, ask the designer to take a quick look. If the user says "I don't like how this looks," ask the design expert what to change rather than guessing yourself, or if they say "I want a different image," that's the designer's problem, not yours.
|
|
12
12
|
|
|
13
13
|
The design expert cannot see your conversation with the user, so include all relevant context and requirements in your task. It can take screenshots of the app preview on its own — just ask it to review what's been built.
|
|
14
14
|
|
|
@@ -1,80 +1,92 @@
|
|
|
1
1
|
# Background Agent Execution — Design Doc
|
|
2
2
|
|
|
3
|
-
Draft design for allowing sub-agents to
|
|
3
|
+
Draft design for allowing sub-agents to run in the background without blocking Remy's turn.
|
|
4
4
|
|
|
5
5
|
## The problem
|
|
6
6
|
|
|
7
7
|
Some sub-agent tasks don't need to block Remy's turn. Product vision seeding roadmap items, for example — Remy needs the high-level plan to continue, but doesn't need to wait for all 15 files to be written. Currently, Remy blocks until the sub-agent finishes completely.
|
|
8
8
|
|
|
9
|
-
## Design
|
|
9
|
+
## Design principles
|
|
10
10
|
|
|
11
|
-
|
|
11
|
+
- **The parent decides.** Remy chooses at dispatch time whether a sub-agent runs in foreground or background. The sub-agent doesn't know or care — it just runs normally to completion. This avoids sub-agents misjudging urgency and keeps the complexity out of sub-agent prompts/tools.
|
|
12
|
+
- **Simple result delivery.** When a background agent finishes, it delivers results via a synthetic user message. No silent/non-silent distinction — all completions use the same mechanism, just with smart timing.
|
|
13
|
+
- **v1 keeps it minimal.** No checkpointing, no speculative execution, no resource budgets. Those can come later if needed.
|
|
12
14
|
|
|
13
|
-
|
|
14
|
-
- Input: `{ response: string }` — the text to return to Remy immediately
|
|
15
|
-
- Called mid-loop by the sub-agent when it has enough to unblock Remy
|
|
16
|
-
- Resolves the parent tool promise with the response text
|
|
17
|
-
- The sub-agent loop continues running in the background
|
|
18
|
-
- All subsequent events emitted with `background: true` flag
|
|
15
|
+
## How it works
|
|
19
16
|
|
|
20
|
-
|
|
21
|
-
- Input: `{ result: string, silent: boolean }` — final outcome report
|
|
22
|
-
- Called at the end of background work
|
|
23
|
-
- `silent: true` — queue a notification for Remy's next turn (hidden message)
|
|
24
|
-
- `silent: false` — trigger an automated message to wake Remy immediately
|
|
25
|
-
- Failures should generally use `silent: false` so Remy can address them
|
|
17
|
+
### Parent dispatches with background flag
|
|
26
18
|
|
|
27
|
-
|
|
19
|
+
The parent agent's tool call includes a signal that this should run in background. Two options (TBD which is cleaner):
|
|
28
20
|
|
|
29
|
-
|
|
21
|
+
1. **Per-tool input field** — `visualDesignExpert({ task: "...", background: true })`
|
|
22
|
+
2. **Runner-level config** — the tool's `execute()` decides based on context and passes `background: true` to `runSubAgent()`
|
|
30
23
|
|
|
31
|
-
|
|
32
|
-
|
|
33
|
-
|
|
34
|
-
|
|
35
|
-
|
|
36
|
-
|
|
37
|
-
|
|
38
|
-
|
|
24
|
+
Either way, the sub-agent's prompt and tools are identical to foreground. It doesn't know it's backgrounded.
|
|
25
|
+
|
|
26
|
+
### Runner split-lifecycle
|
|
27
|
+
|
|
28
|
+
When `background: true` is set on the sub-agent config:
|
|
29
|
+
|
|
30
|
+
1. Runner resolves the parent's promise immediately with a short acknowledgment (e.g., "Working on design recommendations in background...")
|
|
31
|
+
2. The sub-agent loop continues in a detached async context with its own AbortController (not tied to Remy's turn signal)
|
|
32
|
+
3. Events after the split point are emitted with `background: true` so the frontend can render them differently (collapsed, subtle indicator)
|
|
33
|
+
4. When the sub-agent finishes naturally, the result is handed to the notification queue
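
The four steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the package's implementation: the doc mentions `runSubAgent()` and a `background` flag on the sub-agent config but gives no signatures, so `runSubAgentSketch`, `SubAgentConfig.run`, `RunnerHooks`, and `onBackgroundDone` are hypothetical names.

```typescript
// Hypothetical types; only the `background` flag, the detached
// AbortController, and the hand-off to a queue come from the design doc.
type SubAgentEvent = { type: string; background?: boolean };

interface SubAgentConfig {
  name: string;
  background?: boolean;
  // Runs the sub-agent loop to completion and resolves with its final text.
  run: (signal: AbortSignal, emit: (e: SubAgentEvent) => void) => Promise<string>;
}

interface RunnerHooks {
  emit: (e: SubAgentEvent) => void;
  // Stand-in for the notification queue that receives background results.
  onBackgroundDone: (name: string, result: string) => void;
}

async function runSubAgentSketch(
  config: SubAgentConfig,
  turnSignal: AbortSignal,
  hooks: RunnerHooks,
): Promise<string> {
  if (!config.background) {
    // Foreground: the parent blocks until the sub-agent finishes.
    return config.run(turnSignal, hooks.emit);
  }

  // Step 2: a separate controller, not tied to the parent's turn signal.
  const controller = new AbortController();

  // Step 3: tag every event emitted after the split point.
  const emitBackground = (e: SubAgentEvent): void =>
    hooks.emit({ ...e, background: true });

  // Deliberately not awaited, so the loop keeps running after we return.
  void config.run(controller.signal, emitBackground).then(
    // Step 4: hand the final result (or failure) to the queue.
    (result) => hooks.onBackgroundDone(config.name, result),
    (err) => hooks.onBackgroundDone(config.name, `Failed: ${String(err)}`),
  );

  // Step 1: resolve the parent's promise with a short acknowledgment.
  return `Working on ${config.name} in background...`;
}
```

The sub-agent's `run` function is the same in both branches, which matches the principle that the sub-agent does not know whether it has been backgrounded.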
|
|
34
|
+
|
|
35
|
+
### Result delivery
|
|
36
|
+
|
|
37
|
+
A single mechanism: synthetic user message, delivered at the right time.
|
|
38
|
+
|
|
39
|
+
- **If Remy is idle** (between turns) — deliver immediately as an automated message that triggers a new turn
|
|
40
|
+
- **If Remy is mid-turn** — queue the result, deliver immediately after the current turn completes
|
|
41
|
+
- **Multiple completions** — batch into a single message (e.g., "Background work completed:\n\n**Design expert:** ...\n\n**Product vision:** ...")
|
|
42
|
+
|
|
43
|
+
This means the sub-agent's result always reaches Remy in a natural way — as a user message that kicks off a new turn where Remy can react to it.
|
|
39
44
|
|
|
40
45
|
### AgentEvent changes
|
|
41
46
|
|
|
42
|
-
Add optional `background?: boolean` to all event types that have `parentToolId`. The frontend uses this to render background work differently
|
|
47
|
+
Add optional `background?: boolean` to all event types that have `parentToolId`. The frontend uses this to render background work differently.
|
|
43
48
|
|
|
44
49
|
### History / subAgentMessages
|
|
45
50
|
|
|
46
51
|
The `subAgentMessages` array on the tool content block gets updated in two phases:
|
|
47
|
-
1. At
|
|
52
|
+
1. At dispatch time — empty or partial messages attached (the early return acknowledgment)
|
|
48
53
|
2. At background completion — the full message array replaces the partial one, session is saved
|
|
49
54
|
|
|
50
|
-
A `backgroundStartIndex` on the tool content block marks where the early return happened
|
|
55
|
+
A `backgroundStartIndex` on the tool content block marks where the early return happened, so the frontend knows which messages were "live" vs "background."
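
As a rough illustration of how a frontend might use that index, here is a small sketch. The block shape and the `splitByPhase` helper are assumptions; only `subAgentMessages` and `backgroundStartIndex` are named in the doc.

```typescript
// Hypothetical block shape for illustration only.
interface ToolContentBlock<M> {
  subAgentMessages: M[];
  backgroundStartIndex?: number;
}

function splitByPhase<M>(block: ToolContentBlock<M>): { live: M[]; background: M[] } {
  // Without an index, treat every message as live (a plain foreground run).
  const split = block.backgroundStartIndex ?? block.subAgentMessages.length;
  return {
    live: block.subAgentMessages.slice(0, split),
    background: block.subAgentMessages.slice(split),
  };
}
```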
|
|
51
56
|
|
|
52
|
-
### Notification queue
|
|
57
|
+
### Notification queue (headless layer)
|
|
53
58
|
|
|
54
|
-
The headless layer maintains a
|
|
55
|
-
- Background agents push
|
|
56
|
-
-
|
|
57
|
-
- If
|
|
59
|
+
The headless layer maintains a simple queue:
|
|
60
|
+
- Background agents push `{ agentId, name, result, completedAt }` when they finish
|
|
61
|
+
- After each `turn_done`, headless checks the queue and flushes as a single synthetic user message
|
|
62
|
+
- If Remy is idle when a result arrives, headless sends the message immediately
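
A minimal sketch of the queue behavior described above, assuming a hypothetical `NotificationQueue` class and a `send` callback that stands in for whatever delivers a synthetic user message. The entry fields and the batched message wording follow the doc; the class, method names, and busy-tracking are illustrative.

```typescript
// Entry shape taken from the design doc.
interface BackgroundResult {
  agentId: string;
  name: string;
  result: string;
  completedAt: number;
}

class NotificationQueue {
  private pending: BackgroundResult[] = [];
  private busy = false;
  private send: (message: string) => void;

  // `send` is a hypothetical hook for delivering a synthetic user message.
  constructor(send: (message: string) => void) {
    this.send = send;
  }

  // Call when a turn begins so results arriving mid-turn are held back.
  turnStarted(): void {
    this.busy = true;
  }

  // Call after each `turn_done`: flush anything queued during the turn.
  turnDone(): void {
    this.busy = false;
    this.flush();
  }

  push(entry: BackgroundResult): void {
    this.pending.push(entry);
    if (!this.busy) this.flush(); // idle: deliver immediately
  }

  private flush(): void {
    if (this.pending.length === 0) return;
    // Multiple completions are batched into one message.
    const body = this.pending.map((r) => `**${r.name}:** ${r.result}`).join("\n\n");
    this.pending = [];
    this.send(`Background work completed:\n\n${body}`);
  }
}
```

In this sketch the caller is responsible for calling `turnStarted()` when the delivered message kicks off a new turn, so later results queue up rather than interrupting it.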
|
|
58
63
|
|
|
59
|
-
### Process management
|
|
64
|
+
### Process management (headless layer)
|
|
60
65
|
|
|
61
66
|
The headless layer tracks active background agents:
|
|
62
67
|
- `get_background_agents` action → returns list with id, name, startedAt, status
|
|
63
|
-
- `cancel_background_agent` action → aborts a specific background agent
|
|
64
|
-
- The frontend can show active background work and let users
|
|
68
|
+
- `cancel_background_agent` action → aborts a specific background agent via its AbortController
|
|
69
|
+
- The frontend can show active background work and let users cancel dangling agents
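
The tracking described above might look something like the following. The action names `get_background_agents` and `cancel_background_agent` come from the doc; the `BackgroundAgentRegistry` class, its methods, and the status values are assumptions made for this sketch.

```typescript
// Hypothetical status values; the doc only says a `status` field is returned.
type AgentStatus = "running" | "cancelled" | "done";

interface TrackedAgent {
  id: string;
  name: string;
  startedAt: number;
  status: AgentStatus;
  controller: AbortController;
}

type AgentSummary = Pick<TrackedAgent, "id" | "name" | "startedAt" | "status">;

class BackgroundAgentRegistry {
  private agents = new Map<string, TrackedAgent>();

  // Registers an agent and returns the signal its loop should observe.
  register(id: string, name: string): AbortSignal {
    const controller = new AbortController();
    this.agents.set(id, { id, name, startedAt: Date.now(), status: "running", controller });
    return controller.signal;
  }

  markDone(id: string): void {
    const agent = this.agents.get(id);
    if (agent && agent.status === "running") agent.status = "done";
  }

  // Could back a `get_background_agents` action.
  list(): AgentSummary[] {
    return Array.from(this.agents.values()).map(({ id, name, startedAt, status }) => ({
      id,
      name,
      startedAt,
      status,
    }));
  }

  // Could back `cancel_background_agent`; false if unknown or not running.
  cancel(id: string): boolean {
    const agent = this.agents.get(id);
    if (!agent || agent.status !== "running") return false;
    agent.controller.abort();
    agent.status = "cancelled";
    return true;
  }
}
```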
|
|
70
|
+
|
|
71
|
+
## Which sub-agents would use this?
|
|
72
|
+
|
|
73
|
+
- **productVision** — return lane summary immediately, write roadmap files in background
|
|
74
|
+
- **designExpert** — return font/color/layout recommendations immediately, generate images in background
|
|
75
|
+
- **codeSanityCheck** — NOT a candidate, Remy needs the advice before proceeding
|
|
76
|
+
- **browserAutomation** — NOT a candidate, results inform Remy's next action
|
|
65
77
|
|
|
66
|
-
|
|
78
|
+
## What to build (ordered)
|
|
67
79
|
|
|
68
|
-
-
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
|
|
80
|
+
1. Runner split-lifecycle support (`background` flag on SubAgentConfig, detached async continuation)
|
|
81
|
+
2. `background: true` flag on AgentEvent types
|
|
82
|
+
3. Notification queue in headless layer (with idle-vs-busy delivery logic)
|
|
83
|
+
4. Background agent process tracking in headless layer
|
|
84
|
+
5. Wire up parent agent tools (add `background` input field to candidate sub-agent tools)
|
|
85
|
+
6. Update parent agent prompt to teach Remy when to use background dispatch
|
|
72
86
|
|
|
73
|
-
|
|
87
|
+
## Future considerations (not v1)
|
|
74
88
|
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
|
|
78
|
-
|
|
79
|
-
5. Background agent process tracking in headless layer
|
|
80
|
-
6. Update productVision prompt to use `returnAndContinueInBackground`
|
|
89
|
+
- **Resource budgets** — token/cost ceilings for background agents running unattended
|
|
90
|
+
- **Checkpoint/resume** — serialized state for surviving process restarts
|
|
91
|
+
- **Speculative execution** — start work optimistically, cancel if the parent's reasoning goes a different direction
|
|
92
|
+
- **Fan-out** — dispatch multiple background agents in parallel, collect results
|
|
@@ -1,9 +1,10 @@
|
|
|
1
1
|
You are a browser smoke test agent. You verify that features work end to end by interacting with the live preview. Focus on outcomes: does the feature work? Did the expected content appear? Just do the thing and see if it worked.
|
|
2
2
|
|
|
3
|
-
##
|
|
4
|
-
The user is watching the automation happen on their screen in real-time. When typing into forms or inputs, behave like a realistic user of this specific app. Use the app context (if provided) to understand the audience and tone. Type the way that audience would actually type — not formal, not robotic. The
|
|
3
|
+
## Tester Persona
|
|
4
|
+
The user is watching the automation happen on their screen in real-time. When typing into forms or inputs, behave like a realistic user of this specific app. Use the app context (if provided) to understand the audience and tone. Type the way that audience would actually type — not formal, not robotic. The app developer's name is Remy, so use that and the email remy@mindstudio.ai as the basis for any testing that requires a persona.
|
|
5
5
|
|
|
6
|
-
##
|
|
6
|
+
## Browser Commands
|
|
7
|
+
### Snapshot format
|
|
7
8
|
|
|
8
9
|
The snapshot command returns a compact accessibility tree:
|
|
9
10
|
|
|
@@ -17,7 +18,7 @@ paragraph "No results found"
|
|
|
17
18
|
|
|
18
19
|
Each interactive element has a `[ref=eN]` you can use to target it.
|
|
19
20
|
|
|
20
|
-
|
|
21
|
+
### Commands
|
|
21
22
|
|
|
22
23
|
- `snapshot`: Get the current page state. Always do this first and after action batches to verify results. Waits for network requests to settle.
|
|
23
24
|
- `click`: Click an element. The cursor animates to it, then dispatches full pointer/mouse/click events.
|
|
@@ -27,9 +28,9 @@ Each interactive element has a `[ref=eN]` you can use to target it.
|
|
|
27
28
|
- `navigate`: Navigate to a new URL within the app. Waits for the new page to load before continuing with subsequent steps. Use this instead of evaluate with `window.location.href` when you need to navigate and then continue interacting with the new page. Steps after navigate execute on the new page automatically.
|
|
28
29
|
- `evaluate`: Run arbitrary JavaScript in the page and return the result.
|
|
29
30
|
- `styles`: Read computed CSS styles from page elements. Pass a `properties` array with camelCase CSS property names (e.g., `["backgroundColor", "borderRadius", "fontSize"]`). Omit `properties` for a default set covering colors, typography, spacing, borders, shadows, dimensions, and layout. Uses the same targeting as click/type (ref, text, role, label, selector). Omit the target to get styles for all elements from the last snapshot.
|
|
30
|
-
- `
|
|
31
|
+
- `screenshotViewport`: Take a screenshot of the current viewport. Returns CDN url with full text analysis and dimensions. Useful at the end of an action batch to visually see things like layout shift or overflow. Do not use if you can get what you need with other tools - only use when you need to visually see the viewport.
|
|
31
32
|
|
|
32
|
-
|
|
33
|
+
### Element targeting (tried in order)
|
|
33
34
|
|
|
34
35
|
1. `ref`: From the last snapshot. Most reliable.
|
|
35
36
|
2. `text`: Match by accessible name or visible text.
|
|
@@ -39,7 +40,7 @@ Each interactive element has a `[ref=eN]` you can use to target it.
|
|
|
39
40
|
|
|
40
41
|
Prefer ref when available. Use text/role for elements that are stable across snapshots.
|
|
41
42
|
|
|
42
|
-
|
|
43
|
+
### Result format
|
|
43
44
|
|
|
44
45
|
Each browserCommand returns:
|
|
45
46
|
- `steps`: array with each step's result (or error if it failed)
|
|
@@ -49,7 +50,7 @@ Each browserCommand returns:
|
|
|
49
50
|
|
|
50
51
|
On error, the failing step has an `error` field and execution stops. Remaining steps are skipped.
|
|
51
52
|
|
|
52
|
-
|
|
53
|
+
### Workflow
|
|
53
54
|
|
|
54
55
|
1. Take a snapshot to see the current state
|
|
55
56
|
2. Batch as many steps as you can into each browserCommand call. If you know the full sequence, do it all in one call. If you need to see intermediate state (e.g., what's inside a modal after it opens), that's fine, just don't make a separate call for every single action.
|
|
@@ -87,7 +88,7 @@ Select a dropdown option and screenshot the result:
|
|
|
87
88
|
{
|
|
88
89
|
"steps": [
|
|
89
90
|
{ "command": "select", "label": "Country", "option": "United States" },
|
|
90
|
-
{ "command": "
|
|
91
|
+
{ "command": "screenshotViewport" }
|
|
91
92
|
]
|
|
92
93
|
}
|
|
93
94
|
```
|
|
@@ -99,7 +100,6 @@ Navigate to a sub-page and interact with it:
|
|
|
99
100
|
{ "command": "navigate", "url": "/quiz" },
|
|
100
101
|
{ "command": "wait", "text": "what's your aura?", "timeout": 8000 },
|
|
101
102
|
{ "command": "type", "ref": "e3", "text": "blue" },
|
|
102
|
-
{ "command": "screenshot" }
|
|
103
103
|
]
|
|
104
104
|
}
|
|
105
105
|
```
|
|
@@ -123,11 +123,14 @@ Check a count with evaluate:
|
|
|
123
123
|
```
|
|
124
124
|
</examples>
|
|
125
125
|
|
|
126
|
+
### Full Page Screenshot
|
|
127
|
+
You can use the `screenshotFullPage` tool to take a full-height screenshot of the current page. It returns the screenshot URL, as well as a full-text description of everything on the page.
|
|
128
|
+
|
|
126
129
|
<rules>
|
|
127
130
|
- Always batch steps into a single browserCommand call. Don't send one step per turn. Type + click + wait should be one call, not three separate turns.
|
|
128
131
|
- Every response includes a fresh snapshot automatically in the `snapshot` field. You don't need explicit snapshot steps between actions.
|
|
129
132
|
- Prefer text and ref for targeting, not selector. CSS selectors are brittle with styled-components and CSS-in-JS. Refs are stable within a session as long as the DOM hasn't changed.
|
|
130
|
-
- Use generous timeouts for wait after actions that trigger API calls. Method executions can take several seconds. Use `"timeout":
|
|
133
|
+
- Use generous timeouts for wait after actions that trigger API calls. Method executions can take several seconds. Use `"timeout": 5000` or `"timeout": 10000` for waits after form submissions or data loading.
|
|
131
134
|
- wait uses the same targeting fields as click. You can wait for text, role, ref, label, or selector.
|
|
132
135
|
- evaluate auto-returns simple expressions. `"script": "document.title"` works directly. For multi-statement scripts, use explicit return.
|
|
133
136
|
- The snapshot in the response is always the most current page state. Even if a wait times out, check the snapshot field; the content you were waiting for may have appeared by then.
|