@letta-ai/letta-code 0.27.7 → 0.27.8
- package/letta.js +1095 -534
- package/package.json +2 -1
- package/scripts/check-bundled-skill-scripts.js +169 -0
- package/scripts/check.js +1 -0
- package/skills/converting-mcps-to-skills/SKILL.md +1 -12
- package/skills/converting-mcps-to-skills/scripts/mcp-stdio.ts +192 -57
- package/skills/creating-extensions/references/plan-mode.md +63 -24
- package/skills/creating-skills/scripts/validate-skill.ts +129 -5
- package/skills/image-generation/SKILL.md +110 -0
- package/skills/converting-mcps-to-skills/scripts/package.json +0 -13
--- package/skills/creating-extensions/references/plan-mode.md
+++ package/skills/creating-extensions/references/plan-mode.md
@@ -24,13 +24,15 @@ This is a pattern reference, not a full product implementation. Keep local exten
 -> remind the agent that only read-only tools and plan-file writes are allowed
 -> permission overlay denies mutations outside ~/.letta/plans/*.md
 -> agent writes the plan with normal Write/Edit/ApplyPatch tools
--> agent reads the plan and calls AskUserQuestion with Approve / Revise
+-> agent reads the plan and calls AskUserQuestion with the full current plan text and Approve / Revise
 -> if approved, agent calls exit_plan_mode
 -> exit_plan_mode clears state and returns the approved-plan execution handoff
 ```
 
 Plan files are normal markdown files. Do not add a special `update_plan_file` tool unless the user explicitly wants that abstraction. Let the agent use normal write tools and constrain those tools with permissions.
 
+Plan approval must show the user the full current plan text. Do not ask "does this look right?" with only a summary. After every revision, read the plan file again and present the full revised plan in the `AskUserQuestion.question` body before exiting plan mode.
+
 ## Capabilities used
 
 Guard each registration with the matching capability:
@@ -38,7 +40,7 @@ Guard each registration with the matching capability:
 - `commands`: `/plan` for explicit human entry
 - `tools`: `enter_plan_mode` and `exit_plan_mode` for model-driven entry/exit
 - `events.turns`: append a focused plan-mode reminder while active
-- `permissions`: block mutating tools except plan-file writes
+- `permissions`: block mutating tools except planning coordination tools and plan-file writes
 
 Do not use panels for persistent mode state. Panels are transient UI and can be noisy/fragile for mode indicators. Do not add a custom statusline renderer just to show plan mode; `setStatuslineRenderer` is a single global renderer, not an additive slot. This example intentionally keeps visible mode state out of scope.
 
@@ -99,9 +101,10 @@ In plan mode, you should:
 1. Thoroughly explore the codebase to understand existing patterns
 2. Identify similar features and architectural approaches
 3. Consider multiple approaches and their trade-offs
-4. Use
-5.
-6.
+4. Use direct read-only tools for exploration. Do not launch coding, general-purpose, or fork subagents in plan mode; they may mutate files and should be denied. Only recall-style subagents are allowed if available.
+5. Use AskUserQuestion if you need to clarify the approach
+6. Design a concrete implementation strategy
+7. When ready, write the plan to the plan file, read the plan file, use AskUserQuestion to present the full current plan text for approval, and call exit_plan_mode after the user approves
 
 Remember: DO NOT write or edit any files except the plan file. This is a read-only exploration and planning phase.
 
@@ -159,8 +162,9 @@ Plan mode is active. The user indicated that they do not want you to execute yet
 1. Answer the user's query comprehensively, using the AskUserQuestion tool if you need to ask the user clarifying questions.
 2. Write your implementation plan to the plan file. Plan file path: ${session.planFilePath}
 3. If using apply_patch, use this exact relative path in patch headers: ${relativePatchPath}
-4.
-5.
+4. Use direct read-only tools for exploration. Do not launch coding, general-purpose, or fork subagents in plan mode; they may mutate files and should be denied. Only recall-style subagents are allowed if available.
+5. When the plan is complete, read the plan file and present the full current plan text to the user with AskUserQuestion. The question body must include the entire plan, not a summary. The question should offer at least "Approve" and "Revise" options.
+6. If the user approves, call exit_plan_mode immediately. If the user asks to revise, stay in plan mode, update the plan file, then read and present the full revised plan again.
 Do NOT make any file changes outside the plan file or run any tools that modify the system state until the user has approved the plan and you have called exit_plan_mode.
 </system-reminder>`;
 }
@@ -176,35 +180,58 @@ if (letta.capabilities.events.turns) {
 
 ## Permission overlay
 
-Use a permission overlay, not `tool_start`, for policy. Normalize tool names by family; UI display names and provider-specific tool names drift (`Read`, `read`, `read_file`, `ReadFile`, `SearchFileContent`, etc.).
+Use a permission overlay, not `tool_start`, for policy. Normalize tool names by family; UI display names and provider-specific tool names drift (`Read`, `read`, `read_file`, `ReadFile`, `SearchFileContent`, etc.). Keep pure read-only tools separate from planning coordination tools like `AskUserQuestion` and todo/plan updates so the policy stays honest.
 
 ```ts
 const readOnlyToolNames = new Set([
-  "askuserquestion",
-  "ask_user_question",
   "glob",
+  "globgemini",
   "grep",
+  "grepfiles",
+  "list",
   "listdir",
-  "
+  "listdirectory",
   "ls",
+  "notebookread",
   "read",
-  "read_file",
   "readfile",
+  "readfilegemini",
+  "readlsp",
+  "readmanyfiles",
   "search",
-  "
+  "searchfilecontent",
+  "searchfiles",
   "skill",
   "taskoutput",
-  "
-  "view_image",
+  "viewimage",
 ]);
 
+const planningToolNames = new Set([
+  "askuserquestion",
+  "enterplanmode",
+  "exitplanmode",
+  "todowrite",
+  "updateplan",
+  "writetodos",
+]);
+
+const readOnlySubagentTypes = new Set(["recall"]);
+
 function normalizedToolName(toolName) {
-  return toolName.replace(/[
+  return toolName.replace(/[^a-z0-9]/gi, "").toLowerCase();
 }
 
 function isReadOnlyToolName(toolName) {
-
-
+  return readOnlyToolNames.has(normalizedToolName(toolName));
+}
+
+function isPlanningToolName(toolName) {
+  return planningToolNames.has(normalizedToolName(toolName));
+}
+
+function isAllowedReadOnlySubagent(args) {
+  const subagentType = args?.subagent_type;
+  return typeof subagentType === "string" && readOnlySubagentTypes.has(normalizedToolName(subagentType));
 }
 
 function isPlanFileWrite(toolName, args, cwd) {
@@ -220,18 +247,28 @@ if (letta.capabilities.permissions) {
     check(event) {
       const session = getSession(event.conversationId);
       if (!session) return;
+      const toolName = String(event.toolName);
+      const args = event.args ?? {};
+
+      if (isReadOnlyToolName(toolName)) return { decision: "allow" };
+      if (isPlanningToolName(toolName)) return { decision: "allow", reason: "planning" };
+
+      const normalized = normalizedToolName(toolName);
+      if ((normalized === "agent" || normalized === "task") && isAllowedReadOnlySubagent(args)) {
+        return { decision: "allow", reason: "read-only subagent" };
+      }
 
-      if (
-      if (isPlanFileWrite(event.toolName, event.args, event.workingDirectory || event.cwd)) {
+      if (isPlanFileWrite(toolName, args, event.workingDirectory || event.cwd)) {
         return { decision: "allow", reason: "plan file" };
       }
 
       return {
         decision: "deny",
         reason:
-          `Plan mode is active.
+          `Plan mode is active. Use direct read-only tools (Read, Grep, Glob, List, Search, Skill, TaskOutput, safe read-only Bash), planning tools (AskUserQuestion, TodoWrite/UpdatePlan), or recall-style subagents only. ` +
+          `Do not use coding, general-purpose, or fork subagents in plan mode. ` +
           `Write your plan to: ${session.planFilePath}. ` +
-          `
+          `When ready, read the plan file and include the full current plan text in AskUserQuestion for approval, then call exit_plan_mode after approval.`,
       };
     },
   }));
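The decision order in `check` can be tried in isolation. What follows is a minimal sketch, not the extension's actual code: `classify` is a hypothetical helper, the sets are trimmed to a few entries, and the plan-file test is reduced to a boolean flag. It shows that the new `normalizedToolName` collapses `read_file`, `ReadFile`, and `readfile` to a single key, which appears to be why the underscore variants were dropped from the set.

```typescript
// Sketch only: trimmed sets and a hypothetical classify() helper.
const readOnlyToolNames = new Set(["glob", "grep", "read", "readfile", "viewimage"]);
const planningToolNames = new Set(["askuserquestion", "exitplanmode", "todowrite"]);
const readOnlySubagentTypes = new Set(["recall"]);

// Same normalization as the diff: drop non-alphanumerics, then lowercase.
function normalizedToolName(toolName: string): string {
  return toolName.replace(/[^a-z0-9]/gi, "").toLowerCase();
}

type Decision = "allow" | "deny";

// Hypothetical stand-in for the overlay's check(); the plan-file test is a flag here.
function classify(
  toolName: string,
  args: { subagent_type?: unknown } = {},
  writesPlanFile = false,
): Decision {
  const normalized = normalizedToolName(toolName);
  if (readOnlyToolNames.has(normalized)) return "allow";
  if (planningToolNames.has(normalized)) return "allow";
  if (normalized === "agent" || normalized === "task") {
    const subagentType = args.subagent_type;
    if (typeof subagentType === "string" && readOnlySubagentTypes.has(normalizedToolName(subagentType))) {
      return "allow";
    }
  }
  // Anything else is denied unless it is a write to the plan file.
  return writesPlanFile ? "allow" : "deny";
}

console.log(normalizedToolName("read_file")); // readfile
console.log(classify("Task", { subagent_type: "general-purpose" })); // deny
```

Keeping the read-only and planning sets separate costs one extra lookup but lets each allow decision carry its own reason.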
@@ -242,14 +279,14 @@ Shell allowlists are easy to get wrong. Start conservative: allow clearly read-o
 
 ## Exit tool
 
-In the extension version, `exit_plan_mode` is not the approval UI. The agent should present the plan with `AskUserQuestion
+In the extension version, `exit_plan_mode` is not the approval UI. The agent should read the plan file, present the full current plan text with `AskUserQuestion`, then call `exit_plan_mode` only after the user approves.
 
 ```ts
 if (letta.capabilities.tools) {
   disposers.push(letta.tools.register({
     name: "exit_plan_mode",
     description:
-      "Exit plan mode only after the plan file has been written, the full plan has been presented with AskUserQuestion, and the user has approved it.",
+      "Exit plan mode only after the plan file has been written, the full current plan text has been presented with AskUserQuestion, and the user has approved it.",
     parameters: { type: "object", properties: {}, additionalProperties: false },
     requiresApproval: false,
     parallelSafe: false,
@@ -282,4 +319,6 @@ if (letta.capabilities.tools) {
 ## Notes
 
 - Keep `exit_plan_mode` as the final state transition and execution handoff. The approved-plan text in its tool return is useful model context.
+- Plan approval must include the full current plan text in `AskUserQuestion.question`, not just a summary or "does this look right?". After revisions, re-read the file and present the full revised plan again.
+- Keep arbitrary coding subagents denied in plan mode unless the runtime has a true read-only child mode. With the current subagent set, allow only recall-style subagents.
 - If the user renames the plan file, exit logic can use the newest non-empty `~/.letta/plans/*.md` modified after plan mode started, or accept an optional plan path. Keep the user-facing flow normal: write plan file, ask approval, then exit.
--- package/skills/creating-skills/scripts/validate-skill.ts
+++ package/skills/creating-skills/scripts/validate-skill.ts
@@ -12,7 +12,6 @@
 import { existsSync, readFileSync } from "node:fs";
 import { basename, join, resolve } from "node:path";
 import { fileURLToPath } from "node:url";
-import { parse as parseYaml } from "yaml";
 
 interface ValidationResult {
   valid: boolean;
@@ -29,6 +28,131 @@ const ALLOWED_PROPERTIES = new Set([
   "allowed-tools",
 ]);
 
+export const MAX_SKILL_NAME_LENGTH = 64;
+
+type BunYamlRuntime = {
+  Bun?: {
+    YAML?: {
+      parse?: (source: string) => unknown;
+    };
+  };
+};
+
+function parseQuotedScalar(value: string): string {
+  if (value.startsWith('"')) {
+    if (!value.endsWith('"') || value.length === 1) {
+      throw new Error("Unterminated double-quoted scalar");
+    }
+    return JSON.parse(value) as string;
+  }
+
+  if (value.startsWith("'")) {
+    if (!value.endsWith("'") || value.length === 1) {
+      throw new Error("Unterminated single-quoted scalar");
+    }
+    return value.slice(1, -1).replace(/''/g, "'");
+  }
+
+  return value;
+}
+
+function parseScalar(value: string): unknown {
+  const trimmed = value.trim();
+  if (!trimmed) return "";
+
+  if (trimmed === "true") return true;
+  if (trimmed === "false") return false;
+  if (trimmed === "null" || trimmed === "~") return null;
+
+  if (trimmed.startsWith('"') || trimmed.startsWith("'")) {
+    return parseQuotedScalar(trimmed);
+  }
+
+  // The fallback parser intentionally accepts only the frontmatter subset this
+  // validator needs. Unquoted ": " inside a scalar is the most common YAML
+  // authoring mistake; reject it instead of silently producing a bad value.
+  if (trimmed.includes(": ")) {
+    throw new Error(`Unexpected ':' in unquoted scalar: ${trimmed}`);
+  }
+
+  if (/^-?\d+(?:\.\d+)?$/.test(trimmed)) {
+    return Number(trimmed);
+  }
+
+  return trimmed;
+}
+
+function parseFrontmatterFallback(source: string): Record<string, unknown> {
+  const result: Record<string, unknown> = {};
+  const lines = source.split(/\r?\n/);
+
+  for (let i = 0; i < lines.length; i++) {
+    const line = lines[i];
+    if (line === undefined) continue;
+    const trimmed = line.trim();
+
+    if (!trimmed || trimmed.startsWith("#")) {
+      continue;
+    }
+
+    if (/^\s/.test(line)) {
+      // Nested data belongs to the previous top-level key. The validator only
+      // checks top-level field names plus name/description scalar values.
+      continue;
+    }
+
+    const colonIndex = line.indexOf(":");
+    if (colonIndex <= 0) {
+      throw new Error(`Invalid frontmatter line: ${line}`);
+    }
+
+    const key = line.slice(0, colonIndex).trim();
+    const rawValue = line.slice(colonIndex + 1).trim();
+    if (!key) {
+      throw new Error(`Invalid frontmatter line: ${line}`);
+    }
+
+    if (!rawValue) {
+      result[key] = {};
+      continue;
+    }
+
+    if (rawValue === "|" || rawValue === ">") {
+      const blockLines: string[] = [];
+      for (let j = i + 1; j < lines.length; j++) {
+        const nextLine = lines[j];
+        if (nextLine === undefined) continue;
+        if (nextLine.trim() && !/^\s/.test(nextLine)) {
+          break;
+        }
+        blockLines.push(nextLine.replace(/^\s{2}/, ""));
+        i = j;
+      }
+      result[key] =
+        rawValue === ">" ? blockLines.join(" ").trim() : blockLines.join("\n");
+      continue;
+    }
+
+    result[key] = parseScalar(rawValue);
+  }
+
+  return result;
+}
+
+function parseFrontmatter(source: string): Record<string, unknown> {
+  const bunParse = (globalThis as typeof globalThis & BunYamlRuntime).Bun?.YAML
+    ?.parse;
+  if (bunParse) {
+    const parsed = bunParse(source);
+    if (typeof parsed !== "object" || parsed === null) {
+      throw new Error("Frontmatter must be a YAML dictionary");
+    }
+    return parsed as Record<string, unknown>;
+  }
+
+  return parseFrontmatterFallback(source);
+}
+
 export function validateSkill(skillPath: string): ValidationResult {
   // Check SKILL.md exists
   const skillMdPath = join(skillPath, "SKILL.md");
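To make the fallback's scalar rules concrete, here is a condensed sketch that restates `parseQuotedScalar` and `parseScalar` from the hunk above in one function. It follows the same rules but is shorter, it is not a general YAML parser, and `parseScalarSketch`, `tryParse`, and the sample inputs are made up for illustration.

```typescript
// Condensed restatement of the fallback scalar rules; names and samples are illustrative.
function parseScalarSketch(value: string): unknown {
  const trimmed = value.trim();
  if (!trimmed) return "";
  if (trimmed === "true") return true;
  if (trimmed === "false") return false;
  if (trimmed === "null" || trimmed === "~") return null;

  if (trimmed.startsWith('"')) {
    if (!trimmed.endsWith('"') || trimmed.length === 1) {
      throw new Error("Unterminated double-quoted scalar");
    }
    return JSON.parse(trimmed) as string; // JSON string escapes stand in for YAML escapes
  }
  if (trimmed.startsWith("'")) {
    if (!trimmed.endsWith("'") || trimmed.length === 1) {
      throw new Error("Unterminated single-quoted scalar");
    }
    return trimmed.slice(1, -1).replace(/''/g, "'"); // '' is an escaped single quote
  }

  // An unquoted ": " is rejected rather than silently mis-parsed.
  if (trimmed.includes(": ")) {
    throw new Error(`Unexpected ':' in unquoted scalar: ${trimmed}`);
  }
  if (/^-?\d+(?:\.\d+)?$/.test(trimmed)) return Number(trimmed);
  return trimmed;
}

// Helper for the demo loop: report errors as strings instead of throwing.
function tryParse(value: string): unknown {
  try {
    return parseScalarSketch(value);
  } catch (error) {
    return `error: ${(error as Error).message}`;
  }
}

for (const sample of ["true", "~", "42", "'it''s fine'", '"a: b"', "Use when: always"]) {
  console.log(JSON.stringify(sample), "=>", JSON.stringify(tryParse(sample)));
}
```

Under these rules a description that contains `: ` has to be quoted to pass the fallback parser; the unquoted form is rejected on purpose.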
@@ -55,7 +179,7 @@ export function validateSkill(skillPath: string): ValidationResult {
   // Parse YAML frontmatter
   let frontmatter: Record<string, unknown>;
   try {
-    frontmatter =
+    frontmatter = parseFrontmatter(frontmatterText);
     if (typeof frontmatter !== "object" || frontmatter === null) {
       return { valid: false, message: "Frontmatter must be a YAML dictionary" };
     }
@@ -112,11 +236,11 @@ export function validateSkill(skillPath: string): ValidationResult {
       message: `Name '${trimmedName}' cannot start/end with hyphen or contain consecutive hyphens`,
     };
   }
-  // Check name length
-  if (trimmedName.length >
+  // Check name length
+  if (trimmedName.length > MAX_SKILL_NAME_LENGTH) {
     return {
       valid: false,
-      message: `Name is too long (${trimmedName.length} characters). Maximum is
+      message: `Name is too long (${trimmedName.length} characters). Maximum is ${MAX_SKILL_NAME_LENGTH} characters.`,
     };
   }
 
--- /dev/null
+++ package/skills/image-generation/SKILL.md
@@ -0,0 +1,110 @@
+---
+name: image-generation
+description: Generate images from text prompts (and optionally edit/remix input images). Use when the user asks to create, generate, draw, render, or edit an image, illustration, logo, icon, diagram, or photo.
+---
+
+# Image Generation
+
+Generate images via Letta's hosted endpoint `POST /v1/images/generations`. The API
+usually returns base64 image bytes, so save the response to a local image file
+before replying.
+
+## Example
+
+Generate the image, save it locally, then show it inline:
+
+```bash
+curl -sS -X POST "https://api.letta.com/v1/images/generations" \
+  -H "Authorization: Bearer $LETTA_API_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{"provider":"gemini","prompt":"a friendly robot mascot waving, flat vector logo, mint green background","n":1}' \
+  > image-response.json
+
+python3 - <<'PY'
+import base64, json
+
+with open("image-response.json") as f:
+    response = json.load(f)
+
+with open("robot-mascot.png", "wb") as f:
+    f.write(base64.b64decode(response["images"][0]["b64_json"]))
+
+print("saved robot-mascot.png; credits:", response["billing"]["credits_charged"])
+PY
+```
+
+In Bash tools launched by Letta Code, the current Letta credential is available
+as `$LETTA_API_KEY`. This works for both Letta auth modes: it may be a normal
+Letta API key, or the OAuth access token from a Letta Cloud OAuth login. Reference
+it directly. If it is missing, the user needs to authenticate with Letta Cloud (or
+provide a Letta API key); do **not** ask for an OpenAI/Gemini provider key. This
+endpoint also does not use `/connect` BYOK providers — the only `provider` values
+supported here are `gemini` and `openai`.
+
+Then **show the image to the user** by embedding the saved file in your reply:
+
+```markdown
+Here's the mascot:
+
+
+```
+
+The Letta Code UI renders local file paths in markdown image tags, so the image
+appears inline. **Always display generated images this way** — don't just report
+the path, and never paste the raw base64 / a `data:` URI. The markdown path must
+match where you saved the file. For `n > 1`, save each image to its own file and
+embed each on its own line. Also tell the user the `credits_charged`.
+
+## Request body
+
+| Field | Type | Notes |
+|-------|------|-------|
+| `provider` | `"gemini"` \| `"openai"` | Required. |
+| `prompt` | string | Required, 1–32000 chars. |
+| `model` | string | Optional; defaults per provider (below). |
+| `n` | int 1–4 | Optional, default 1. Request variations in one call. |
+| `size` | string | Optional, e.g. `"1024x1024"` (OpenAI). |
+| `quality` | `low`\|`medium`\|`high`\|`auto` | Optional (OpenAI; higher = more credits). |
+| `output_format` | `png`\|`jpeg`\|`webp` | Optional (OpenAI). |
+| `input_images` | string[] (max 14) | Optional. Base64 **data URLs** for edit/remix. |
+| `seed` | int | Optional. |
+
+| Provider | Default model | Use for |
+|----------|---------------|---------|
+| `gemini` | `gemini-3-pro-image` | Default. Strong prompt adherence, image editing/remix. |
+| `openai` | `gpt-image-2` | Photoreal output, explicit `size`/`quality`/`output_format`. |
+
+Default to `gemini` unless the user wants photoreal or a specific size/quality.
+
+## Response
+
+```json
+{
+  "provider": "gemini",
+  "model": "gemini-3-pro-image",
+  "images": [{ "b64_json": "<base64>", "mime_type": "image/png" }],
+  "billing": { "credits_charged": 12, "...": "..." }
+}
+```
+
+Each `images[]` entry has either `b64_json` or `url`, plus `mime_type`. Gemini
+always returns `b64_json`. If OpenAI returns a `url`, download that URL to your
+local image file instead of base64-decoding.
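The `b64_json` versus `url` branching described above might look like this in TypeScript. This is a sketch under assumptions: the `ImageEntry` shape is inferred from the response example, `extensionFor`, `decodeBase64Image`, and `imageBytes` are hypothetical helpers, the extension mapping is a guess covering the three listed output formats, and `atob` and `fetch` are assumed to be available as globals (as in Node 18+).

```typescript
// Shape inferred from the response example; not an official type.
interface ImageEntry {
  b64_json?: string;
  url?: string;
  mime_type: string;
}

// Hypothetical helper: choose a file extension from the MIME type.
function extensionFor(mimeType: string): string {
  if (mimeType === "image/jpeg") return "jpg";
  if (mimeType === "image/webp") return "webp";
  return "png";
}

// Decode a base64 string into raw bytes using the global atob.
function decodeBase64Image(b64: string): Uint8Array {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Return the bytes for one images[] entry, decoding or downloading as needed.
async function imageBytes(entry: ImageEntry): Promise<Uint8Array> {
  if (entry.b64_json !== undefined) return decodeBase64Image(entry.b64_json);
  if (entry.url !== undefined) {
    const response = await fetch(entry.url);
    if (!response.ok) {
      throw new Error(`Download failed with status ${response.status}`);
    }
    return new Uint8Array(await response.arrayBuffer());
  }
  throw new Error("Image entry has neither b64_json nor url");
}
```

The returned bytes can then be written to a file whose name ends in `extensionFor(entry.mime_type)`, so that the markdown embed path matches the saved file.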
+
+## Editing / remixing images
+
+Pass source images in `input_images` as base64 **data URLs**
+(`data:<mime>;base64,<data>`) and describe the edit in `prompt`. Gemini handles
+multi-image edits well. To build a data URL from a local file:
+
+```bash
+DATA_URL="data:image/png;base64,$(base64 < input.png | tr -d '\n')"
+```
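The same data URL can be built in TypeScript when the image bytes are already in memory. This is a minimal sketch: `toDataUrl` and `buildEditRequest` are hypothetical helpers, `btoa` is assumed to be available as a global, and the request fields are taken from the request-body table above.

```typescript
// Hypothetical helper: build a base64 data URL from raw image bytes.
function toDataUrl(bytes: Uint8Array, mimeType: string): string {
  let binary = "";
  bytes.forEach((byte) => {
    binary += String.fromCharCode(byte);
  });
  return `data:${mimeType};base64,${btoa(binary)}`;
}

// Hypothetical helper: assemble an edit request from the documented fields.
function buildEditRequest(prompt: string, dataUrls: string[]) {
  if (dataUrls.length > 14) {
    throw new Error("input_images accepts at most 14 entries");
  }
  return { provider: "gemini" as const, prompt, input_images: dataUrls };
}

const dataUrl = toDataUrl(new Uint8Array([104, 105]), "image/png");
console.log(dataUrl); // data:image/png;base64,aGk=
console.log(JSON.stringify(buildEditRequest("make the background mint green", [dataUrl])));
```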
+
+## Notes
+
+- **Billing**: every success charges credits; don't loop needlessly, and report
+  `credits_charged`.
+- **Errors**: `402` = insufficient credits (`credits_required` in body); `400`/`500`
+  return `{ "message": "..." }` — surface it to the user.
+- Only `gemini` and `openai` are supported here.
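The error notes above can be turned into a small mapping from status code to user-facing text. This is a sketch only: `describeImageError` is a hypothetical helper, the wording of the messages is invented, and the body shape assumes nothing beyond the `message` and `credits_required` fields named in the notes.

```typescript
// Body fields named in the notes; anything else in the body is ignored.
interface ImageErrorBody {
  message?: string;
  credits_required?: number;
}

// Hypothetical helper: turn a failed response into text to show the user.
function describeImageError(status: number, body: ImageErrorBody): string {
  if (status === 402) {
    const needed = body.credits_required;
    return needed === undefined
      ? "Insufficient credits for image generation."
      : `Insufficient credits for image generation (${needed} required).`;
  }
  if (status === 400 || status === 500) {
    // Surface the server's message when one is present.
    return body.message ?? `Image generation failed with status ${status}.`;
  }
  return `Unexpected status ${status} from image generation.`;
}

console.log(describeImageError(402, { credits_required: 12 }));
console.log(describeImageError(400, { message: "prompt is required" }));
```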
--- package/skills/converting-mcps-to-skills/scripts/package.json
+++ /dev/null
@@ -1,13 +0,0 @@
-{
-  "name": "mcp-client-scripts",
-  "version": "1.0.0",
-  "type": "module",
-  "description": "MCP client scripts for converting-mcps-to-skills",
-  "scripts": {
-    "http": "npx tsx mcp-http.ts",
-    "stdio": "npx tsx mcp-stdio.ts"
-  },
-  "dependencies": {
-    "@modelcontextprotocol/sdk": "^1.25.0"
-  }
-}