@mindstudio-ai/remy 0.1.0

package/README.md ADDED
@@ -0,0 +1,314 @@
1
+ # Remy
2
+
3
+ A spec-building and coding agent for MindStudio apps.
4
+
5
+ Remy helps users design, spec, build, and iterate on MindStudio projects. It runs locally in a terminal or as a headless subprocess in the MindStudio sandbox. It has tools for reading/writing specs and code, running shell commands, searching code, prompting users with structured forms, and (in the sandbox) TypeScript language server integration. LLM calls are routed through the MindStudio platform for billing and model routing.
6
+
7
+ ## Quick Start
8
+
9
+ ```bash
10
+ # Make sure you're logged in (shares credentials with @mindstudio-ai/agent)
11
+ mindstudio login
12
+
13
+ # Navigate to your project
14
+ cd my-mindstudio-app
15
+
16
+ # Run remy
17
+ npx remy
18
+ ```
19
+
20
+ ## Usage
21
+
22
+ ```
23
+ $ remy [options]
24
+
25
+ Options:
26
+ --api-key <key> API key (overrides env/config)
27
+ --base-url <url> Platform API base URL
28
+ --model <id> Model ID (defaults to org's default model)
29
+ --headless Run in headless mode (stdin/stdout JSON protocol)
30
+ --lsp-url <url> LSP sidecar URL (enables LSP tools when set)
31
+ ```
32
+
33
+ ### Slash Commands
34
+
35
+ | Command | Description |
36
+ |---------|-------------|
37
+ | `/clear` | Clear conversation history and start a fresh session |
38
+ | `Escape` | Cancel the current turn (while agent is running) |
39
+
40
+ ### Session Persistence
41
+
42
+ Remy saves conversation history to `.remy-session.json` in the working directory after each turn and before blocking on external tools. On restart, it picks up where you left off. Use `/clear` to start fresh.
43
+
44
+ ## Tools
45
+
46
+ Remy's tool set depends on the project state. In the sandbox, each message carries a `projectHasCode` field that tells Remy whether the project has generated code in `dist/`.
47
+
48
+ ### Always Available
49
+
50
+ | Tool | Description |
51
+ |------|-------------|
52
+ | `setViewMode` | Switch the IDE view (intake, preview, spec, code, databases, scenarios, logs) |
53
+ | `promptUser` | Ask the user structured questions (form or inline display) |
54
+ | `clearSyncStatus` | Clear sync flags after syncing spec and code |
55
+
56
+ ### Spec Tools
57
+
58
+ Available in all sessions. Used for authoring and editing MSFM specs in `src/`.
59
+
60
+ | Tool | Description |
61
+ |------|-------------|
62
+ | `readSpec` | Read a spec file with line numbers (paths must start with `src/`) |
63
+ | `writeSpec` | Create or overwrite a spec file (creates parent dirs) |
64
+ | `editSpec` | Heading-addressed edits (replace, insert, delete by heading path) |
65
+ | `listSpecFiles` | List all files in the `src/` directory tree |
66
+
67
+ ### Code Tools
68
+
69
+ Available when the project has generated code (`projectHasCode: true`).
70
+
71
+ | Tool | Description |
72
+ |------|-------------|
73
+ | `readFile` | Read a file with line numbers |
74
+ | `writeFile` | Create or overwrite a file (creates parent dirs) |
75
+ | `editFile` | Targeted string replacement (must be unique match) |
76
+ | `bash` | Run a shell command |
77
+ | `grep` | Search file contents |
78
+ | `glob` | Find files by pattern |
79
+ | `listDir` | List directory contents |
80
+ | `editsFinished` | Signal that file edits are complete for live preview |
81
+
82
+ ### LSP Tools (sandbox only)
83
+
84
+ Available when `--lsp-url` is passed.
85
+
86
+ | Tool | Description |
87
+ |------|-------------|
88
+ | `lspDiagnostics` | Type errors and warnings for a file, with suggested quick fixes |
89
+ | `restartProcess` | Restart a managed sandbox process (e.g., dev server after npm install) |
90
+
91
+ ### Sync Tools (sync turns only)
92
+
93
+ Available when the sandbox sends a `runCommand: "sync"` message.
94
+
95
+ | Tool | Description |
96
+ |------|-------------|
97
+ | `presentSyncPlan` | Present a markdown sync plan to the user for approval (streams content) |
98
+
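The availability rules in the tables above can be summarized as a small selection function. This is only a sketch: the `ToolContext` shape and the `availableTools` function are hypothetical names for illustration, not Remy's actual registry code. The tool names and the three conditions (`projectHasCode`, `--lsp-url`, `runCommand: "sync"`) are taken from this README.

```typescript
// Hypothetical state that drives tool selection. The fields mirror the README
// (projectHasCode, --lsp-url, runCommand), but this is not Remy's real code.
interface ToolContext {
  projectHasCode: boolean;
  lspUrl?: string;
  runCommand?: string;
}

const ALWAYS_TOOLS = ["setViewMode", "promptUser", "clearSyncStatus"];
const SPEC_TOOLS = ["readSpec", "writeSpec", "editSpec", "listSpecFiles"];
const CODE_TOOLS = [
  "readFile", "writeFile", "editFile", "bash",
  "grep", "glob", "listDir", "editsFinished",
];
const LSP_TOOLS = ["lspDiagnostics", "restartProcess"];
const SYNC_TOOLS = ["presentSyncPlan"];

function availableTools(ctx: ToolContext): string[] {
  // Always-available and spec tools are present in every session.
  const tools = [...ALWAYS_TOOLS, ...SPEC_TOOLS];
  // Code tools require generated code in dist/.
  if (ctx.projectHasCode) tools.push(...CODE_TOOLS);
  // LSP tools are enabled by passing --lsp-url.
  if (ctx.lspUrl) tools.push(...LSP_TOOLS);
  // Sync tools only appear on sync turns.
  if (ctx.runCommand === "sync") tools.push(...SYNC_TOOLS);
  return tools;
}
```

For example, `availableTools({ projectHasCode: false })` would return only the always-available and spec tools.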
99
+ ### Tool Streaming
100
+
101
+ Tools can opt into streaming via a `streaming` config on the tool definition:
102
+
103
+ - **Content streaming** (writeSpec, writeFile, presentSyncPlan): Streams `tool_input_delta` events with progressive content as the LLM generates tool arguments. Tools can provide a `transform` function to customize the streamed output (e.g., writeSpec/writeFile compute a progressive diff).
104
+ - **Input streaming** (promptUser): Streams progressive `tool_start` events with `partial: true` as structured input (like a questions array) builds up.
105
+ - **No streaming** (all other tools): `tool_start` fires once when the complete tool arguments are available.
106
+
107
+ Streaming is driven by `tool_input_delta` (Anthropic) or `tool_input_args` (Gemini) SSE events from the platform.
108
+
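As a rough illustration of the three streaming modes, the sketch below decides which progressive event to emit for a tool, given a best-effort parse of the arguments received so far. The `StreamingConfig` field names (`mode`, `contentField`) and the `progressiveEvent` helper are assumptions made for this example; the README only states that tools carry a `streaming` config with an optional `transform`.

```typescript
// Hypothetical config shape; only `transform` is named in the README.
type StreamingConfig =
  | { mode: "content"; contentField: string; transform?: (partial: string) => string }
  | { mode: "input" };

interface ToolDef {
  name: string;
  streaming?: StreamingConfig;
}

type ProgressiveEvent =
  | { event: "tool_input_delta"; id: string; name: string; result: string }
  | { event: "tool_start"; id: string; name: string; input: Record<string, unknown>; partial: true };

// Returns the progressive event for this tool, or null when nothing should be
// emitted yet (non-streaming tools wait for the complete arguments).
function progressiveEvent(
  tool: ToolDef,
  id: string,
  partialInput: Record<string, unknown>,
): ProgressiveEvent | null {
  const cfg = tool.streaming;
  if (!cfg) return null;
  if (cfg.mode === "input") {
    // Input streaming: re-emit tool_start with partial: true as input builds up.
    return { event: "tool_start", id, name: tool.name, input: partialInput, partial: true };
  }
  // Content streaming: forward the content received so far, optionally transformed.
  const raw = partialInput[cfg.contentField];
  if (typeof raw !== "string") return null; // content has not started arriving
  const result = cfg.transform ? cfg.transform(raw) : raw;
  return { event: "tool_input_delta", id, name: tool.name, result };
}
```

A `transform` here could, for instance, turn the partial file content into a progressive diff, as the README describes for `writeSpec` and `writeFile`.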
109
+ ## Architecture
110
+
111
+ ```
112
+ User input
113
+ → Agent loop (src/agent.ts)
114
+ → POST /_internal/v2/agent/chat (SSE stream)
115
+ ← text, thinking, tool_input_delta, tool_input_args, tool_use events
116
+ → Execute tools locally in parallel
117
+ → External tools (promptUser, setViewMode, etc.) wait for sandbox response
118
+ → Send tool results back
119
+ → Loop until done
120
+ → Save session to .remy-session.json
121
+ ```
122
+
123
+ The agent core (`src/agent.ts`) is a pure async function with no UI dependencies. The TUI (`src/tui/`) is an Ink + React layer on top. Headless mode (`src/headless.ts`) provides the same agent over a stdin/stdout JSON protocol for the sandbox.
124
+
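The loop in the diagram can be sketched as a single async function with its dependencies injected. All names below (`ToolCall`, `ModelReply`, `AgentDeps`, `runTurn`) are illustrative and are not taken from `src/agent.ts`; the sketch also saves the session only once at the end of the turn, which is a simplification.

```typescript
interface ToolCall { id: string; name: string; input: unknown }
interface ToolResult { id: string; result: string; isError: boolean }
interface ModelReply { text: string; toolCalls: ToolCall[] }

// Hypothetical dependencies, injected so the loop has no UI or I/O of its own.
interface AgentDeps {
  // Stands in for the streaming request to the platform.
  callModel(messages: unknown[]): Promise<ModelReply>;
  // Runs a local tool, or waits on the sandbox for an external one.
  runTool(call: ToolCall): Promise<ToolResult>;
  // Stands in for writing .remy-session.json.
  saveSession(messages: unknown[]): Promise<void>;
}

async function runTurn(userText: string, messages: unknown[], deps: AgentDeps): Promise<string> {
  messages.push({ role: "user", content: userText });
  for (;;) {
    const reply = await deps.callModel(messages);
    messages.push({ role: "assistant", content: reply.text, toolCalls: reply.toolCalls });
    if (reply.toolCalls.length === 0) {
      // No more tool calls: the turn is done, so persist and return.
      await deps.saveSession(messages);
      return reply.text;
    }
    // Tool calls from one model reply run concurrently.
    const results = await Promise.all(reply.toolCalls.map((call) => deps.runTool(call)));
    messages.push({ role: "user", toolResults: results });
  }
}
```

Because the function depends only on the injected `AgentDeps`, the same loop can sit behind both a terminal UI and a JSON protocol, which matches the split described above.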
125
+ ### Project Structure
126
+
127
+ ```
128
+ src/
129
+ index.tsx CLI entry point
130
+ agent.ts Core tool-call loop (pure async, no UI)
131
+ api.ts SSE streaming client for platform API
132
+ parsePartialJson.ts Partial JSON parser for streaming tool input
133
+ session.ts .remy-session.json persistence
134
+ config.ts API key/URL resolution
135
+ logger.ts Structured logging
136
+ headless.ts stdin/stdout JSON protocol for sandbox
137
+
138
+ prompt/
139
+ index.ts System prompt builder (mode-aware)
140
+ actions/ Built-in prompts for runCommand actions
141
+ sync.md
142
+ static/ Behavioral instruction fragments
143
+ identity.md
144
+ intake.md
145
+ authoring.md
146
+ instructions.md
147
+ lsp.md
148
+ projectContext.ts Reads manifest, spec metadata, file listing at runtime
149
+ compiled/ Platform docs distilled for agent consumption
150
+ sources/ Raw source docs (fetched + manual)
151
+
152
+ tools/
153
+ index.ts Tool registry with streaming config interface
154
+ _helpers/
155
+ diff.ts Unified diff generator
156
+ lsp.ts LSP sidecar HTTP client
157
+ spec/ Spec and external tools
158
+ readSpec.ts
159
+ writeSpec.ts
160
+ editSpec.ts
161
+ listSpecFiles.ts
162
+ setViewMode.ts
163
+ promptUser.ts
164
+ clearSyncStatus.ts
165
+ presentSyncPlan.ts
166
+ _helpers.ts Heading resolution, path validation
167
+ code/ Code tools (file editing, shell, search)
168
+ readFile.ts
169
+ writeFile.ts
170
+ editFile/
171
+ index.ts
172
+ _helpers.ts
173
+ bash.ts
174
+ grep.ts
175
+ glob.ts
176
+ listDir.ts
177
+ editsFinished.ts
178
+ lspDiagnostics.ts
179
+ restartProcess.ts
180
+
181
+ tui/ Interactive terminal UI (Ink + React)
182
+ App.tsx
183
+ InputPrompt.tsx
184
+ MessageList.tsx
185
+ ThinkingBlock.tsx
186
+ ToolCall.tsx
187
+ ```
188
+
189
+ ### External Tools
190
+
191
+ Some tools are resolved by the sandbox rather than executed locally. Remy emits `tool_start`, then waits for the sandbox to send back a `tool_result` via stdin. This is used for tools that require sandbox/user interaction:
192
+
193
+ - `promptUser` — renders a form or inline prompt, blocks until user responds
194
+ - `setViewMode` — switches the IDE view mode
195
+ - `clearSyncStatus` — clears sync dirty flags and updates git sync ref
196
+ - `presentSyncPlan` — renders a full-screen markdown plan for user approval
197
+
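One way to model the wait for a sandbox reply is a registry keyed by tool call id. The `PendingExternalTools` class below is a hypothetical sketch of that idea and is not Remy's implementation; it uses plain callbacks to keep the example small.

```typescript
type ResultHandler = (result: string) => void;

class PendingExternalTools {
  private waiting: Record<string, ResultHandler> = {};

  // Call after emitting tool_start; the handler fires when the result arrives.
  wait(id: string, onResult: ResultHandler): void {
    this.waiting[id] = onResult;
  }

  // Call for each {"action": "tool_result"} line read from stdin.
  // Returns false when no tool call with that id is waiting.
  resolve(id: string, result: string): boolean {
    const handler = this.waiting[id];
    if (!handler) return false;
    delete this.waiting[id];
    handler(result);
    return true;
  }
}
```

Returning `false` for an unknown id lets the caller log or ignore stray results, for example one that arrives after the turn was cancelled.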
198
+ ### Project Instructions
199
+
200
+ Remy automatically loads project-level agent instructions on startup. It checks for these files in order (first match wins):
201
+
202
+ `CLAUDE.md`, `claude.md`, `.claude/instructions.md`, `AGENTS.md`, `agents.md`, `.agents.md`, `COPILOT.md`, `copilot.md`, `.copilot-instructions.md`, `.github/copilot-instructions.md`, `REMY.md`, `remy.md`, `.cursorrules`, `.cursorules`
203
+
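The "first match wins" lookup can be sketched as a scan over the candidate list in the order given above. The `findInstructionFile` helper and its injected `exists` predicate are assumptions for illustration; how Remy actually checks the file system is not shown in this README.

```typescript
// Candidate paths in the order listed in this README.
const INSTRUCTION_FILES = [
  "CLAUDE.md", "claude.md", ".claude/instructions.md",
  "AGENTS.md", "agents.md", ".agents.md",
  "COPILOT.md", "copilot.md", ".copilot-instructions.md",
  ".github/copilot-instructions.md",
  "REMY.md", "remy.md",
  ".cursorrules", ".cursorules",
];

// `exists` is injected so the sketch does not depend on a file system API.
function findInstructionFile(exists: (path: string) => boolean): string | null {
  for (const candidate of INSTRUCTION_FILES) {
    if (exists(candidate)) return candidate; // first match wins
  }
  return null;
}
```

With this ordering, a project that contains both `AGENTS.md` and `REMY.md` would load `AGENTS.md`.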
204
+ ## Headless Mode
205
+
206
+ Run `remy --headless` for programmatic control via newline-delimited JSON on stdin and stdout. This is how the sandbox's command-and-control (C&C) server runs Remy as a managed child process.
207
+
208
+ ### Input Actions (stdin)
209
+
210
+ Send JSON commands, one per line.
211
+
212
+ #### `message`
213
+
214
+ Send a user message to the agent.
215
+
216
+ ```json
217
+ {"action": "message", "text": "fix the bug in auth.ts", "projectHasCode": true}
218
+ ```
219
+
220
+ Fields:
221
+ - `text` — the user message (required unless `runCommand` is set)
222
+ - `projectHasCode` — controls tool availability (default: `true`)
223
+ - `viewContext` — `{ mode, openFiles?, activeFile? }` for prompt context
224
+ - `attachments` — array of `{ url, extractedTextUrl? }` for file attachments
225
+ - `runCommand` — triggers a built-in action prompt (e.g., `"sync"`)
226
+
227
+ When `runCommand` is set, the message text is replaced with a built-in prompt and the user message is marked as `hidden` in conversation history (sent to the LLM but not shown in the UI).
228
+
229
+ #### `tool_result`
230
+
231
+ Send the result of an external tool back to the agent.
232
+
233
+ ```json
234
+ {"action": "tool_result", "id": "toolu_abc123", "result": "ok"}
235
+ ```
236
+
237
+ #### `get_history`
238
+
239
+ Return the full conversation history.
240
+
241
+ ```json
242
+ {"action": "get_history"}
243
+ ```
244
+
245
+ Messages with `hidden: true` were generated by `runCommand` actions and should not be displayed in the UI.
246
+
247
+ #### `cancel`
248
+
249
+ Cancel the current turn.
250
+
251
+ ```json
252
+ {"action": "cancel"}
253
+ ```
254
+
255
+ #### `clear`
256
+
257
+ Clear conversation history and delete the session file.
258
+
259
+ ```json
260
+ {"action": "clear"}
261
+ ```
262
+
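A client or test harness might model these commands as a discriminated union and validate each incoming line before dispatching it. The sketch below is an assumption about how that could look: the `InputAction` type covers only the fields needed for the example, and `parseInputLine` is a hypothetical helper, not part of Remy.

```typescript
type InputAction =
  | { action: "message"; text?: string; projectHasCode: boolean; runCommand?: string }
  | { action: "tool_result"; id: string; result: unknown }
  | { action: "get_history" }
  | { action: "cancel" }
  | { action: "clear" };

// Parses one line of newline-delimited JSON into a typed command.
function parseInputLine(line: string): InputAction {
  const raw: unknown = JSON.parse(line);
  if (typeof raw !== "object" || raw === null) throw new Error("expected a JSON object");
  const obj = raw as Record<string, unknown>;
  const text = obj["text"];
  const runCommand = obj["runCommand"];
  const hasCode = obj["projectHasCode"];
  const id = obj["id"];
  switch (obj["action"]) {
    case "message":
      // `text` is required unless `runCommand` is set.
      if (typeof text !== "string" && typeof runCommand !== "string") {
        throw new Error("message needs text unless runCommand is set");
      }
      return {
        action: "message",
        text: typeof text === "string" ? text : undefined,
        projectHasCode: typeof hasCode === "boolean" ? hasCode : true, // documented default
        runCommand: typeof runCommand === "string" ? runCommand : undefined,
      };
    case "tool_result":
      if (typeof id !== "string") throw new Error("tool_result needs an id");
      return { action: "tool_result", id, result: obj["result"] };
    case "get_history":
      return { action: "get_history" };
    case "cancel":
      return { action: "cancel" };
    case "clear":
      return { action: "clear" };
    default:
      throw new Error("unknown action");
  }
}
```

Rejecting malformed lines up front keeps the dispatch code free of defensive checks.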
263
+ ### Output Events (stdout)
264
+
265
+ Events are emitted as newline-delimited JSON.
266
+
267
+ #### Lifecycle Events
268
+
269
+ | Event | Fields | Description |
270
+ |-------|--------|-------------|
271
+ | `ready` | | Headless mode initialized, ready for input |
272
+ | `session_restored` | `messageCount` | Previous session loaded |
273
+ | `session_cleared` | | Session history cleared |
274
+ | `stopping` | | Shutdown initiated |
275
+ | `stopped` | | Shutdown complete |
276
+
277
+ #### Agent Events (streamed during message processing)
278
+
279
+ | Event | Fields | Description |
280
+ |-------|--------|-------------|
281
+ | `turn_started` | | Agent began processing a message |
282
+ | `text` | `text` | Streaming text chunk |
283
+ | `thinking` | `text` | Agent's internal reasoning |
284
+ | `tool_start` | `id`, `name`, `input`, `partial?` | Tool execution started. `partial: true` means more `tool_start` events will follow for this id (progressive input streaming). |
285
+ | `tool_input_delta` | `id`, `name`, `result` | Progressive tool content (streaming tools only) |
286
+ | `tool_done` | `id`, `name`, `result`, `isError` | Tool execution completed |
287
+ | `turn_done` | | Agent finished responding |
288
+ | `turn_cancelled` | | Turn was cancelled |
289
+ | `error` | `error` | Error message |
290
+ | `history` | `messages` | Response to `get_history` |
291
+
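Since stdout arrives in arbitrary chunks, a consumer needs to buffer until it sees a full line before parsing. The `NdjsonDecoder` class below is a minimal, hypothetical sketch of that buffering. The README does not state which JSON key carries the event name, so the `event` key used in the usage note is an assumption.

```typescript
// Buffers raw stdout chunks and returns one parsed object per complete line.
class NdjsonDecoder {
  private buffer = "";

  push(chunk: string): Array<Record<string, unknown>> {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line ("" if the chunk ended on a newline).
    this.buffer = lines.pop() ?? "";
    const events: Array<Record<string, unknown>> = [];
    for (const line of lines) {
      if (line.trim() !== "") events.push(JSON.parse(line));
    }
    return events;
  }
}
```

Feeding it `'{"event":"rea'` followed by `'dy"}\n'` yields no events for the first chunk and one event for the second.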
292
+ ### Logging
293
+
294
+ In headless mode, structured logs go to **stderr**. Stdout is reserved for the JSON protocol. Log levels: `error`, `warn`, `info`, `debug`.
295
+
296
+ In interactive mode, logs go to `.remy-debug.log` in the working directory (default level: `error`). Override with `--log-level`.
297
+
298
+ ## Development
299
+
300
+ ```bash
301
+ npm install
302
+ npm run build # Build with tsup
303
+ npm run dev # Watch mode
304
+ npm run typecheck # Type check only
305
+ ```
306
+
307
+ ## Config
308
+
309
+ Remy reads credentials from `~/.mindstudio-local-tunnel/config.json`, using the active environment's `apiKey` and `apiBaseUrl`.
310
+
311
+ Resolution order for API key:
312
+ 1. `--api-key` flag
313
+ 2. `MINDSTUDIO_API_KEY` environment variable
314
+ 3. `~/.mindstudio-local-tunnel/config.json` (active environment)
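The resolution order amounts to taking the first source that provides a value. A minimal sketch, assuming the three values have already been read by the caller; `ApiKeySources` and `resolveApiKey` are hypothetical names, not exports of this package:

```typescript
interface ApiKeySources {
  flag?: string;       // value of --api-key
  env?: string;        // value of MINDSTUDIO_API_KEY
  configFile?: string; // apiKey of the active environment in config.json
}

// Returns the first non-empty key in priority order, or undefined if none is set.
function resolveApiKey(sources: ApiKeySources): string | undefined {
  const candidates = [sources.flag, sources.env, sources.configFile];
  for (const candidate of candidates) {
    if (candidate !== undefined && candidate !== "") return candidate;
  }
  return undefined;
}
```

Treating an empty string as unset is a choice made for this sketch, so that an empty environment variable does not shadow the config file.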
@@ -0,0 +1,12 @@
1
+ This is an automated action triggered by the user pressing "Publish" in the editor.
2
+
3
+ The user wants to deploy their app. Pushing to the `main` branch triggers a production deploy.
4
+
5
+ Review the current state of the working tree — what has changed since the last commit, what's been committed since the last push, and the overall shape of recent work. Write a user-friendly changelog with `presentPublishPlan` — summarize what changed in plain language ("added vendor approval workflow", "fixed invoice totals", "updated the dashboard layout"). Reference specific code or file paths only when it helps clarity. This is what the user will see before deploying.
6
+
7
+ If approved:
8
+ - Stage and commit any uncommitted changes with a clean, descriptive commit message
9
+ - Push to main
10
+ - Let the user know their app is deploying
11
+
12
+ If dismissed, acknowledge and do nothing.
@@ -0,0 +1,19 @@
1
+ This is an automated action triggered by the user pressing "Sync" in the editor.
2
+
3
+ The user has manually edited files since the last sync. The `refs/sync-point` git ref marks the last known-good sync state. It's created using a temporary git index that captures the full working tree (including unstaged changes) as a tree object — so it represents exactly what the files looked like at sync time, not just what was committed.
4
+
5
+ To see what the user changed, run: `git diff refs/sync-point -- src/ dist/`
6
+
7
+ This compares the sync-point tree against the current working tree. Do not add `HEAD` or any other ref — the command as written diffs directly against the working tree, which is what you want.
8
+
9
+ In the diff output: lines prefixed with `-` are what was in the file at last sync. Lines prefixed with `+` are the user's current edits. Sync should bring the other side in line with the `+` side.
10
+
11
+ Analyze the changes and write a sync plan with `presentSyncPlan` — a clear markdown summary of what changed and what you intend to update. Write it for a human: describe changes in plain language ("renamed the greeting field", "added a note about error handling"), not as a list of file paths and code diffs. Reference specific code or file paths only when it helps clarity. The user will review and approve before you make changes.
12
+
13
+ If approved:
14
+ - If spec files (`src/`) changed, update the corresponding code in `dist/` to match
15
+ - If code files (`dist/`) changed, update the corresponding spec in `src/` to match
16
+ - If both changed, reconcile — spec is the source of truth for intent, but respect code changes that add implementation detail
17
+ - When all files are synced, call `clearSyncStatus`
18
+
19
+ If dismissed, acknowledge and do nothing.
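The three approval rules above depend only on which side of the project changed. As a rough sketch, assuming the changed paths are repo-relative (for example from `git diff --name-only refs/sync-point -- src/ dist/`), the decision could be expressed as below; `SyncDirection` and `syncDirection` are hypothetical names for illustration.

```typescript
type SyncDirection = "update-code" | "update-spec" | "reconcile" | "nothing";

// Decides which side needs updating from the list of changed, repo-relative paths.
function syncDirection(changedPaths: string[]): SyncDirection {
  const specChanged = changedPaths.some((p) => p.indexOf("src/") === 0);
  const codeChanged = changedPaths.some((p) => p.indexOf("dist/") === 0);
  if (specChanged && codeChanged) return "reconcile"; // both sides edited
  if (specChanged) return "update-code";              // spec edits drive dist/
  if (codeChanged) return "update-spec";              // code edits drive src/
  return "nothing";
}
```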
@@ -0,0 +1,100 @@
1
+ # Compiled Prompt Fragments
2
+
3
+ This directory contains distilled prompt fragments generated from the source
4
+ docs in `docs/developer-guide/` (project root). These are loaded by `../index.ts` and injected
5
+ into Remy's system prompt at runtime.
6
+
7
+ ## How to compile
8
+
9
+ The compilation is done manually in a session with an LLM (Claude Code or
10
+ similar). Work through the source docs and compile them into prompt-ready
11
+ fragments.
12
+
13
+ ### Step 1: Compile with an LLM
14
+
15
+ Open a session and ask it to work through the compilation. Give it these
16
+ instructions:
17
+
18
+ ---
19
+
20
+ **You will compile source docs into prompt fragments for Remy, a coding agent
21
+ that builds MindStudio apps. The compiled fragments go in `src/prompt/compiled/`
22
+ and are loaded into the agent's system prompt at runtime.**
23
+
24
+ **Work through this one source file at a time, sequentially.** For each one:
25
+ 1. Read the source doc thoroughly
26
+ 2. Decide whether it should become its own fragment, be merged with a related
27
+ source, or be skipped entirely
28
+ 3. Present your draft of the compiled fragment
29
+ 4. Wait for review and feedback before moving to the next one
30
+
31
+ Do not parallelize this work. Do not generate multiple fragments at once. Each
32
+ fragment deserves careful attention — these are the instructions a coding agent
33
+ will follow to build real products, and mistakes here propagate into every app
34
+ it builds.
35
+
36
+ Source files are in `docs/developer-guide/` at the project root.
37
+
38
+ ## How to think about compilation
39
+
40
+ **Your audience is an LLM acting as a coding agent.** It needs to produce
41
+ correct code, not learn concepts. Everything you write should be optimized
42
+ for an agent that is actively building a MindStudio app and needs to get
43
+ the details right.
44
+
45
+ ### What to keep
46
+
47
+ - **API signatures, parameter types, return types, and code examples.**
48
+ These must be exactly right. The agent will copy these patterns directly
49
+ into the code it writes. A wrong type or a missing parameter means broken
50
+ code in production.
51
+ - **Concrete examples, specific error cases, explicit constraints, enumerated
52
+ edge cases.** These are the highest-value content. A source doc that says
53
+ "ensure data integrity, including checking for duplicate keys, null foreign
54
+ references, and orphaned records" — the specific checks ARE the value.
55
+ Collapsing that to "ensure data integrity" loses the actionable detail.
56
+ - **Tables and structured reference data.** Manifest fields, db predicates,
57
+ interface config schemas, role API methods — these are lookup references
58
+ the agent will consult while writing code. Keep them complete.
59
+ - **Rules and constraints that affect correctness.** "Only packages declared
60
+ in package.json are available at runtime" is the kind of detail that
61
+ prevents hard-to-debug errors.
62
+
63
+ ### What to strip
64
+
65
+ - **Setup instructions, installation steps, CLI commands.** The agent isn't
66
+ setting up a dev environment — it's writing code inside one.
67
+ - **Platform internals and deployment pipeline details.** How the platform
68
+ builds and deploys is not the agent's concern.
69
+ - **Conceptual explanations and philosophy.** "Why" something was designed
70
+ a certain way is rarely useful mid-task. Keep the "what" and "how."
71
+ - **Marketing language, feature pitches, comparative positioning.**
72
+ - **Cross-references to other docs** ("see Section X for details"). The
73
+ fragment should be self-contained.
74
+
75
+ ### Fragment format
76
+
77
+ ```markdown
78
+ # Fragment Title
79
+
80
+ Brief one-line context.
81
+
82
+ ## Section
83
+ ...content...
84
+ ```
85
+
86
+ No YAML frontmatter. No meta-commentary. Just the reference content the
87
+ agent needs. Each fragment should make sense on its own — the agent may
88
+ not see all fragments in every session.
89
+
90
+ ---
91
+
92
+ ### Step 2: Review
93
+
94
+ Read through the compiled fragments and verify code examples are accurate.
95
+ The LLM may hallucinate API details — cross-check against the source docs.
96
+
97
+ ### Step 3: Commit
98
+
99
+ The compiled fragments are committed to git. They're the snapshot the agent
100
+ uses at runtime.
@@ -0,0 +1,77 @@
1
+ # Roles & Auth
2
+
3
+ MindStudio apps use role-based access control. Roles are defined in the manifest, assigned to users in the editor, and enforced in methods. The backend is the authority — methods enforce access control via `auth.requireRole()`. The frontend can read roles for conditional rendering, but enforcement always happens server-side.
4
+
5
+ **Roles are optional.** Many apps don't need them — single-user apps, internal tools, simple utilities. If the app doesn't have multiple user types with different permissions, skip roles entirely. Only add them when the app explicitly needs to distinguish who can do what.
6
+
7
+ ## Defining Roles
8
+
9
+ In `mindstudio.json`:
10
+
11
+ ```json
12
+ {
13
+ "roles": [
14
+ { "id": "requester", "name": "Requester", "description": "Can submit vendor requests and purchase orders." },
15
+ { "id": "approver", "name": "Approver", "description": "Reviews and approves purchase orders." },
16
+ { "id": "admin", "name": "Administrator", "description": "Full access to all app functions." },
17
+ { "id": "ap", "name": "Accounts Payable", "description": "Processes invoices and payments." }
18
+ ]
19
+ }
20
+ ```
21
+
22
+ - `id` — kebab-case, used in code (`auth.requireRole('admin')`)
23
+ - `name` — display name shown in the editor
24
+ - `description` — what this role can do (useful for the agent and for users in the role assignment UI)
25
+
26
+ Roles are synced to the platform on deploy. Adding or removing roles in the manifest creates or deletes them on the next push.
27
+
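A quick check that role ids follow the kebab-case convention might look like the sketch below. The `RoleDef` type mirrors the manifest fields shown above; the `invalidRoleIds` helper and its regular expression are assumptions for illustration and are not a platform API.

```typescript
interface RoleDef {
  id: string;
  name: string;
  description: string;
}

// Lowercase letters and digits, with single hyphens between segments.
const KEBAB_CASE = /^[a-z0-9]+(-[a-z0-9]+)*$/;

// Returns the ids that do not follow the kebab-case convention.
function invalidRoleIds(roles: RoleDef[]): string[] {
  return roles.filter((role) => !KEBAB_CASE.test(role.id)).map((role) => role.id);
}
```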
28
+ ## Backend Auth API
29
+
30
+ ```typescript
31
+ import { auth } from '@mindstudio-ai/agent';
32
+ ```
33
+
34
+ ### `auth.requireRole(...roles)`
35
+
36
+ Throws a 403 error unless the current user has **at least one** of the specified roles. Use at the top of methods to gate access.
37
+
38
+ ```typescript
39
+ auth.requireRole('admin'); // single role
40
+ auth.requireRole('admin', 'approver'); // any of these
41
+ ```
42
+
43
+ ### `auth.hasRole(...roles)`
44
+
45
+ Returns `boolean`. Same logic as `requireRole` but doesn't throw. Use for conditional behavior within a method.
46
+
47
+ ### `auth.userId`
48
+
49
+ The current user's UUID. Always available.
50
+
51
+ ### `auth.roles`
52
+
53
+ Array of role names assigned to the current user.
54
+
55
+ ### `auth.getUsersByRole(role)`
56
+
57
+ Returns an array of user IDs that have the specified role. Useful for things like "notify all admins."
58
+
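To make the "any of these roles" semantics concrete, here is a local stand-in that mimics the documented behavior of `hasRole` and `requireRole`. It is not the `@mindstudio-ai/agent` implementation: `makeAuth` is a hypothetical helper, and the `status` property on the thrown error is an assumption about how a 403 might be represented.

```typescript
// Local stub for illustration only; the real `auth` object comes from the SDK.
function makeAuth(userRoles: string[]) {
  // True if the user holds at least one of the given roles.
  function hasRole(...roles: string[]): boolean {
    return roles.some((role) => userRoles.indexOf(role) !== -1);
  }

  // Throws a 403-style error unless the user holds at least one of the roles.
  function requireRole(...roles: string[]): void {
    if (!hasRole(...roles)) {
      const err = new Error("Forbidden") as Error & { status: number };
      err.status = 403; // assumed representation of the 403
      throw err;
    }
  }

  return { roles: userRoles, hasRole, requireRole };
}
```

With `makeAuth(["approver"])`, a call to `requireRole("admin", "approver")` passes, while `requireRole("admin")` throws.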
59
+ ## Frontend Auth
60
+
61
+ ```typescript
62
+ import { auth } from '@mindstudio-ai/interface';
63
+
64
+ auth.userId; // current user's ID
65
+ auth.name; // display name
66
+ auth.email; // email address
67
+ auth.profilePictureUrl; // URL or null
68
+ ```
69
+
70
+ The frontend SDK provides display-only auth context. Role checking for UI purposes (showing/hiding elements) is done by reading role data from the backend:
71
+
72
+ ```typescript
73
+ const { isAdmin, pendingCount } = await api.getDashboard();
74
+ {isAdmin && <AdminPanel />}
75
+ ```
76
+
77
+ The frontend is untrusted — anyone can modify JavaScript in the browser. Access control must be enforced server-side in methods. The frontend shows or hides UI based on role data from the backend, but the backend is the authority.