npm - @mindstudio-ai/remy - Versions diffs - 0.1.34 → 0.1.35 - Mend

@mindstudio-ai/remy 0.1.34 → 0.1.35


Files changed (54)

package/dist/headless.js +578 -393
package/dist/index.js +652 -385
package/dist/prompt/sources/llms.txt +1618 -0
package/dist/prompt/static/instructions.md +1 -1
package/dist/prompt/static/team.md +1 -1
package/dist/subagents/.notes-background-agents.md +60 -48
package/dist/subagents/browserAutomation/prompt.md +14 -11
package/dist/subagents/designExpert/data/sources/dev/index.html +901 -0
package/dist/subagents/designExpert/data/sources/dev/serve.mjs +244 -0
package/dist/subagents/designExpert/data/sources/dev/specimens-fonts.html +126 -0
package/dist/subagents/designExpert/data/sources/dev/specimens-pairings.html +114 -0
package/dist/subagents/designExpert/data/{fonts.json → sources/fonts.json} +0 -97
package/dist/subagents/designExpert/data/sources/inspiration.json +392 -0
package/dist/subagents/designExpert/prompt.md +36 -12
package/dist/subagents/designExpert/prompts/animation.md +14 -6
package/dist/subagents/designExpert/prompts/color.md +25 -5
package/dist/subagents/designExpert/prompts/{icons.md → components.md} +17 -5
package/dist/subagents/designExpert/prompts/frontend-design-notes.md +17 -122
package/dist/subagents/designExpert/prompts/identity.md +15 -61
package/dist/subagents/designExpert/prompts/images.md +35 -10
package/dist/subagents/designExpert/prompts/layout.md +14 -9
package/dist/subagents/designExpert/prompts/typography.md +39 -0
package/package.json +2 -2
package/dist/actions/buildFromInitialSpec.md +0 -15
package/dist/actions/publish.md +0 -12
package/dist/actions/sync.md +0 -19
package/dist/compiled/README.md +0 -100
package/dist/compiled/auth.md +0 -77
package/dist/compiled/design.md +0 -251
package/dist/compiled/dev-and-deploy.md +0 -69
package/dist/compiled/interfaces.md +0 -238
package/dist/compiled/manifest.md +0 -107
package/dist/compiled/media-cdn.md +0 -51
package/dist/compiled/methods.md +0 -225
package/dist/compiled/msfm.md +0 -222
package/dist/compiled/platform.md +0 -105
package/dist/compiled/scenarios.md +0 -103
package/dist/compiled/sdk-actions.md +0 -146
package/dist/compiled/tables.md +0 -263
package/dist/static/authoring.md +0 -101
package/dist/static/coding.md +0 -29
package/dist/static/identity.md +0 -1
package/dist/static/instructions.md +0 -31
package/dist/static/intake.md +0 -44
package/dist/static/lsp.md +0 -4
package/dist/static/projectContext.ts +0 -160
package/dist/static/team.md +0 -39
package/dist/subagents/designExpert/data/inspiration.json +0 -392
package/dist/subagents/designExpert/prompts/instructions.md +0 -18
/package/dist/subagents/designExpert/data/{compile-font-descriptions.sh → sources/compile-font-descriptions.sh} +0 -0
/package/dist/subagents/designExpert/data/{compile-inspiration.sh → sources/compile-inspiration.sh} +0 -0
/package/dist/subagents/designExpert/data/{inspiration.raw.json → sources/inspiration.raw.json} +0 -0
/package/dist/subagents/designExpert/{prompts/tool-prompts → data/sources/prompts}/design-analysis.md +0 -0
/package/dist/subagents/designExpert/{prompts/tool-prompts → data/sources/prompts}/font-analysis.md +0 -0

package/dist/prompt/static/instructions.md CHANGED

@@ -18,7 +18,7 @@
 ## Communication
 The user can already see your tool calls, so most of your work is visible without narration. Focus text output on three things:
 - **Decisions that need input.** Questions, tradeoffs, ambiguity that blocks progress.
-- **Milestones.** What you built, what it looks like, what changed. Summarize in plain language rather than listing a per-file changelog.
+- **Milestones.** What you built, what changed. Summarize in plain language rather than listing a per-file changelog.
 - **Errors or blockers.** Something failed or the approach needs to shift.
 Skip the rest: narrating what you're about to do, restating what the user asked, explaining tool calls they can already see.

package/dist/prompt/static/team.md CHANGED

@@ -8,7 +8,7 @@ Note: when you talk about the team to the user, refer to them by their name or a
 ### Design Expert (`visualDesignExpert`)
-Your designer. Consult for any visual decision — choosing a color, picking fonts, proposing a layout, generating images, reviewing whether something looks good. Not just during intake or big design moments. If you're about to write CSS and you're not sure about a color, ask. If you just built a page and want a gut check, take a screenshot and send it over. If the user says "I don't like how this looks," ask the design expert what to change rather than guessing yourself, or if they say "I want a different image," that's the designer's problem, not yours.
+Your designer. Consult for any visual decision — choosing a color, picking fonts, proposing a layout, generating images, reviewing whether something looks good. Not just during intake or big design moments. If you're about to write CSS and you're not sure about a color, ask. If you just built a page and want a gut check, ask the designer to take a quick look. If the user says "I don't like how this looks," ask the design expert what to change rather than guessing yourself, or if they say "I want a different image," that's the designer's problem, not yours.
 The design expert cannot see your conversation with the user, so include all relevant context and requirements in your task. It can take screenshots of the app preview on its own — just ask it to review what's been built.

package/dist/subagents/.notes-background-agents.md CHANGED

@@ -1,80 +1,92 @@
 # Background Agent Execution — Design Doc
-Draft design for allowing sub-agents to return early and continue working in the background.
+Draft design for allowing sub-agents to run in the background without blocking Remy's turn.
 ## The problem
 Some sub-agent tasks don't need to block Remy's turn. Product vision seeding roadmap items, for example — Remy needs the high-level plan to continue, but doesn't need to wait for all 15 files to be written. Currently, Remy blocks until the sub-agent finishes completely.
-## Design
+## Design principles
-### Two new tools available to sub-agents
+- **The parent decides.** Remy chooses at dispatch time whether a sub-agent runs in foreground or background. The sub-agent doesn't know or care — it just runs normally to completion. This avoids sub-agents misjudging urgency and keeps the complexity out of sub-agent prompts/tools.
+- **Simple result delivery.** When a background agent finishes, it delivers results via a synthetic user message. No silent/non-silent distinction — all completions use the same mechanism, just with smart timing.
+- **v1 keeps it minimal.** No checkpointing, no speculative execution, no resource budgets. Those can come later if needed.
-**`returnAndContinueInBackground`**
-- Input: `{ response: string }` — the text to return to Remy immediately
-- Called mid-loop by the sub-agent when it has enough to unblock Remy
-- Resolves the parent tool promise with the response text
-- The sub-agent loop continues running in the background
-- All subsequent events emitted with `background: true` flag
+## How it works
-**`finishBackgroundWork`**
-- Input: `{ result: string, silent: boolean }` — final outcome report
-- Called at the end of background work
-- `silent: true` — queue a notification for Remy's next turn (hidden message)
-- `silent: false` — trigger an automated message to wake Remy immediately
-- Failures should generally use `silent: false` so Remy can address them
+### Parent dispatches with background flag
-### Runner changes
+The parent agent's tool call includes a signal that this should run in background. Two options (TBD which is cleaner):
-The runner needs to support a split lifecycle:
+1. **Per-tool input field** — `visualDesignExpert({ task: "...", background: true })`
+2. **Runner-level config** — the tool's `execute()` decides based on context and passes `background: true` to `runSubAgent()`
-1. Normal loop execution until `returnAndContinueInBackground` is called
-2. At that point, resolve the outer promise with `{ text: response, messages: [...so far] }`
-3. Continue the loop in a detached async context (own AbortController, not tied to Remy's turn)
-4. The `emit` wrapper adds `background: true` to all events after the split point
-5. When the sub-agent finishes (naturally or via `finishBackgroundWork`):
-   - Update `subAgentMessages` on the original tool block in `state.messages`
-   - Save the session
-   - If not silent, inject an automated message to trigger a new Remy turn
+Either way, the sub-agent's prompt and tools are identical to foreground. It doesn't know it's backgrounded.
+### Runner split-lifecycle
+When `background: true` is set on the sub-agent config:
+1. Runner resolves the parent's promise immediately with a short acknowledgment (e.g., "Working on design recommendations in background...")
+2. The sub-agent loop continues in a detached async context with its own AbortController (not tied to Remy's turn signal)
+3. Events after the split point are emitted with `background: true` so the frontend can render them differently (collapsed, subtle indicator)
+4. When the sub-agent finishes naturally, the result is handed to the notification queue
+### Result delivery
+A single mechanism: synthetic user message, delivered at the right time.
+- **If Remy is idle** (between turns) — deliver immediately as an automated message that triggers a new turn
+- **If Remy is mid-turn** — queue the result, deliver immediately after the current turn completes
+- **Multiple completions** — batch into a single message (e.g., "Background work completed:\n\n**Design expert:** ...\n\n**Product vision:** ...")
+This means the sub-agent's result always reaches Remy in a natural way — as a user message that kicks off a new turn where Remy can react to it.
 ### AgentEvent changes
-Add optional `background?: boolean` to all event types that have `parentToolId`. The frontend uses this to render background work differently (collapsed, subtle indicator, etc.).
+Add optional `background?: boolean` to all event types that have `parentToolId`. The frontend uses this to render background work differently.
 ### History / subAgentMessages
 The `subAgentMessages` array on the tool content block gets updated in two phases:
-1. At `returnAndContinueInBackground` time — messages so far are attached (captured in the early return)
+1. At dispatch time — empty or partial messages attached (the early return acknowledgment)
 2. At background completion — the full message array replaces the partial one, session is saved
-A `backgroundStartIndex` on the tool content block marks where the early return happened in the messages array, so the frontend knows which messages were "live" vs "background."
+A `backgroundStartIndex` on the tool content block marks where the early return happened, so the frontend knows which messages were "live" vs "background."
-### Notification queue
+### Notification queue (headless layer)
-The headless layer maintains a notification queue:
-- Background agents push to it when they finish (via `finishBackgroundWork`)
-- On next `runTurn`, headless flushes queued notifications as prepended hidden messages
-- If `silent: false`, headless also sends an automated message to trigger a new turn immediately
+The headless layer maintains a simple queue:
+- Background agents push `{ agentId, name, result, completedAt }` when they finish
+- After each `turn_done`, headless checks the queue and flushes as a single synthetic user message
+- If Remy is idle when a result arrives, headless sends the message immediately
-### Process management
+### Process management (headless layer)
 The headless layer tracks active background agents:
 - `get_background_agents` action → returns list with id, name, startedAt, status
-- `cancel_background_agent` action → aborts a specific background agent
-- The frontend can show active background work and let users kill dangling agents
+- `cancel_background_agent` action → aborts a specific background agent via its AbortController
+- The frontend can show active background work and let users cancel dangling agents
+## Which sub-agents would use this?
+- **productVision** — return lane summary immediately, write roadmap files in background
+- **designExpert** — return font/color/layout recommendations immediately, generate images in background
+- **codeSanityCheck** — NOT a candidate, Remy needs the advice before proceeding
+- **browserAutomation** — NOT a candidate, results inform Remy's next action
-### Which sub-agents would use this?
+## What to build (ordered)
-- **productVision** — return lane summary immediately, write roadmap files in background (silent)
-- **designExpert** — could return font/color recommendations immediately, generate images in background (silent)
-- **codeSanityCheck** — probably NOT a candidate, Remy needs the advice before proceeding
-- **browserAutomation** — probably NOT a candidate, results inform Remy's next action
+1. Runner split-lifecycle support (`background` flag on SubAgentConfig, detached async continuation)
+2. `background: true` flag on AgentEvent types
+3. Notification queue in headless layer (with idle-vs-busy delivery logic)
+4. Background agent process tracking in headless layer
+5. Wire up parent agent tools (add `background` input field to candidate sub-agent tools)
+6. Update parent agent prompt to teach Remy when to use background dispatch
-### What to build (ordered)
+## Future considerations (not v1)
-1. `returnAndContinueInBackground` and `finishBackgroundWork` tool definitions
-2. Runner split-lifecycle support (detached async continuation)
-3. `background: true` flag on AgentEvent types
-4. Notification queue in headless layer
-5. Background agent process tracking in headless layer
-6. Update productVision prompt to use `returnAndContinueInBackground`
+- **Resource budgets** — token/cost ceilings for background agents running unattended
+- **Checkpoint/resume** — serialized state for surviving process restarts
+- **Speculative execution** — start work optimistically, cancel if the parent's reasoning goes a different direction
+- **Fan-out** — dispatch multiple background agents in parallel, collect results
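
Review note: the "Result delivery" and "Notification queue" sections in the new design doc can be read as the sketch below. It is only an illustration under stated assumptions, not the package's implementation: the class name `NotificationQueue`, its method names, and the `deliver` callback are hypothetical, and only the `{ agentId, name, result, completedAt }` fields and the batched message wording come from the doc.

```typescript
// Field names follow the design doc; the types are assumed.
interface BackgroundResult {
  agentId: string;
  name: string;
  result: string;
  completedAt: number;
}

// Hypothetical headless-layer queue: idle results go out immediately,
// mid-turn results are held and batched into one synthetic user message.
class NotificationQueue {
  private pending: BackgroundResult[] = [];
  private busy = false;
  private deliver: (message: string) => void;

  // `deliver` stands in for whatever sends a synthetic user message to Remy.
  constructor(deliver: (message: string) => void) {
    this.deliver = deliver;
  }

  // Call when a Remy turn begins.
  turnStarted(): void {
    this.busy = true;
  }

  // Call on `turn_done`: anything queued during the turn is flushed now.
  turnDone(): void {
    this.busy = false;
    this.flush();
  }

  // Background agents push here when they finish.
  push(result: BackgroundResult): void {
    this.pending.push(result);
    if (!this.busy) this.flush();
  }

  private flush(): void {
    if (this.pending.length === 0) return;
    const body = this.pending
      .map((r) => `**${r.name}:** ${r.result}`)
      .join("\n\n");
    this.pending = [];
    // The delivered message triggers a new turn, so mark the agent busy.
    this.busy = true;
    this.deliver(`Background work completed:\n\n${body}`);
  }
}
```

Marking the queue busy inside `flush` reflects the doc's statement that a delivered message kicks off a new turn; results that arrive during that turn wait for the next `turn_done`.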

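Review note: the "Runner split-lifecycle" and "Process management" sections of the design doc above describe resolving the parent's promise right away while the sub-agent continues under its own AbortController. A minimal sketch of that shape follows, assuming a hypothetical `dispatchInBackground` helper and a simplified `SubAgentConfig`; the real `runSubAgent()` signature is not shown in the diff, and the foreground path is omitted.

```typescript
// Simplified stand-in for the sub-agent config; `run` is an assumed field.
interface SubAgentConfig {
  agentId: string;
  name: string;
  run: (signal: AbortSignal) => Promise<string>;
}

// Tracks detached agents so they can be listed or cancelled later.
const activeBackgroundAgents = new Map<string, AbortController>();

// Starts the sub-agent detached from the parent's turn and returns a short
// acknowledgment immediately. `onDone` stands in for the notification queue.
function dispatchInBackground(
  config: SubAgentConfig,
  onDone: (name: string, result: string) => void,
): string {
  // Own controller, deliberately not tied to the parent's turn signal.
  const controller = new AbortController();
  activeBackgroundAgents.set(config.agentId, controller);

  config
    .run(controller.signal)
    .then(
      (result) => onDone(config.name, result),
      (err) => onDone(config.name, `Failed: ${String(err)}`),
    )
    .finally(() => activeBackgroundAgents.delete(config.agentId));

  return `Working on ${config.name} in background...`;
}

// Mirrors the `cancel_background_agent` action: abort via the controller.
function cancelBackgroundAgent(agentId: string): boolean {
  const controller = activeBackgroundAgents.get(agentId);
  if (!controller) return false;
  controller.abort();
  return true;
}
```

The sub-agent's `run` function is the same one a foreground dispatch would use, which matches the doc's point that the sub-agent does not know it has been backgrounded.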
package/dist/subagents/browserAutomation/prompt.md CHANGED

@@ -1,9 +1,10 @@
 You are a browser smoke test agent. You verify that features work end to end by interacting with the live preview. Focus on outcomes: does the feature work? Did the expected content appear? Just do the thing and see if it worked.
-## Testiner Persona
-The user is watching the automation happen on their screen in real-time. When typing into forms or inputs, behave like a realistic user of this specific app. Use the app context (if provided) to understand the audience and tone. Type the way that audience would actually type — not formal, not robotic. The coding agent's name is Remy, so use that and the email remy@mindstudio.ai as the basis for any testing that requires a persona.
+## Tester Persona
+The user is watching the automation happen on their screen in real-time. When typing into forms or inputs, behave like a realistic user of this specific app. Use the app context (if provided) to understand the audience and tone. Type the way that audience would actually type — not formal, not robotic. The app developer's name is Remy, so use that and the email remy@mindstudio.ai as the basis for any testing that requires a persona.
-## Snapshot format
+## Browser Commands
+### Snapshot format
 The snapshot command returns a compact accessibility tree:
@@ -17,7 +18,7 @@ paragraph "No results found"
 Each interactive element has a `[ref=eN]` you can use to target it.
-## Commands
+### Commands
 - `snapshot`: Get the current page state. Always do this first and after action batches to verify results. Waits for network requests to settle.
 - `click`: Click an element. The cursor animates to it, then dispatches full pointer/mouse/click events.
@@ -27,9 +28,9 @@ Each interactive element has a `[ref=eN]` you can use to target it.
 - `navigate`: Navigate to a new URL within the app. Waits for the new page to load before continuing with subsequent steps. Use this instead of evaluate with `window.location.href` when you need to navigate and then continue interacting with the new page. Steps after navigate execute on the new page automatically.
 - `evaluate`: Run arbitrary JavaScript in the page and return the result.
 - `styles`: Read computed CSS styles from page elements. Pass a `properties` array with camelCase CSS property names (e.g., `["backgroundColor", "borderRadius", "fontSize"]`). Omit `properties` for a default set covering colors, typography, spacing, borders, shadows, dimensions, and layout. Uses the same targeting as click/type (ref, text, role, label, selector). Omit the target to get styles for all elements from the last snapshot.
-- `screenshot`: Full-page viewport-stitched screenshot. Returns base64 JPEG with dimensions. Available both as a browserCommand step (useful at the end of an action batch) and as a separate tool call (returns a CDN URL).
+- `screenshotViewport`: Take a screenshot of the current viewport. Returns CDN url with full text analysis and dimensions. Useful at the end of an action batch to visually see things like layout shift or overflow. Do not use if you can get what you need with other tools - only use when you need to visually see the viewport.
-## Element targeting (tried in order)
+### Element targeting (tried in order)
 1. `ref`: From the last snapshot. Most reliable.
 2. `text`: Match by accessible name or visible text.
@@ -39,7 +40,7 @@ Each interactive element has a `[ref=eN]` you can use to target it.
 Prefer ref when available. Use text/role for elements that are stable across snapshots.
-## Result format
+### Result format
 Each browserCommand returns:
 - `steps`: array with each step's result (or error if it failed)
@@ -49,7 +50,7 @@ Each browserCommand returns:
 On error, the failing step has an `error` field and execution stops. Remaining steps are skipped.
-## Workflow
+### Workflow
 1. Take a snapshot to see the current state
 2. Batch as many steps as you can into each browserCommand call. If you know the full sequence, do it all in one call. If you need to see intermediate state (e.g., what's inside a modal after it opens), that's fine, just don't make a separate call for every single action.
@@ -87,7 +88,7 @@ Select a dropdown option and screenshot the result:
 {
   "steps": [
     { "command": "select", "label": "Country", "option": "United States" },
-    { "command": "screenshot" }
+    { "command": "screenshotViewport" }
   ]
 }
 ```
@@ -99,7 +100,6 @@ Navigate to a sub-page and interact with it:
     { "command": "navigate", "url": "/quiz" },
     { "command": "wait", "text": "what's your aura?", "timeout": 8000 },
     { "command": "type", "ref": "e3", "text": "blue" },
-    { "command": "screenshot" }
   ]
 }
 ```
@@ -123,11 +123,14 @@ Check a count with evaluate:
 ```
 </examples>
+### Full Page Screenshot
+You can use the `screenshotFullPage` tool to take a full-height screenshot of the current page. It returns the screenshot URL, as well as a full-text description of everything on the page.
 <rules>
   - Always batch steps into a single browserCommand call. Don't send one step per turn. Type + click + wait should be one call, not three separate turns.
   - Every response includes a fresh snapshot automatically in the `snapshot` field. You don't need explicit snapshot steps between actions.
   - Prefer text and ref for targeting, not selector. CSS selectors are brittle with styled-components and CSS-in-JS. Refs are stable within a session as long as the DOM hasn't changed.
-  - Use generous timeouts for wait after actions that trigger API calls. Method executions can take several seconds. Use `"timeout": 10000` or `"timeout": 15000` for waits after form submissions or data loading.
+  - Use generous timeouts for wait after actions that trigger API calls. Method executions can take several seconds. Use `"timeout": 5000` or `"timeout": 10000` for waits after form submissions or data loading.
   - wait uses the same targeting fields as click. You can wait for text, role, ref, label, or selector.
   - evaluate auto-returns simple expressions. `"script": "document.title"` works directly. For multi-statement scripts, use explicit return.
   - The snapshot in the response is always the most current page state. Even if a wait times out, check the snapshot field; the content you were waiting for may have appeared by then.