npm - @letta-ai/letta-code - Versions diffs - 0.27.7 → 0.27.8 - Mend

@letta-ai/letta-code 0.27.7 → 0.27.8

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10)

package/letta.js +1095 -534
package/package.json +2 -1
package/scripts/check-bundled-skill-scripts.js +169 -0
package/scripts/check.js +1 -0
package/skills/converting-mcps-to-skills/SKILL.md +1 -12
package/skills/converting-mcps-to-skills/scripts/mcp-stdio.ts +192 -57
package/skills/creating-extensions/references/plan-mode.md +63 -24
package/skills/creating-skills/scripts/validate-skill.ts +129 -5
package/skills/image-generation/SKILL.md +110 -0
package/skills/converting-mcps-to-skills/scripts/package.json +0 -13

package/skills/creating-extensions/references/plan-mode.md CHANGED

@@ -24,13 +24,15 @@ This is a pattern reference, not a full product implementation. Keep local exten
 -> remind the agent that only read-only tools and plan-file writes are allowed
 -> permission overlay denies mutations outside ~/.letta/plans/*.md
 -> agent writes the plan with normal Write/Edit/ApplyPatch tools
--> agent reads the plan and calls AskUserQuestion with Approve / Revise
+-> agent reads the plan and calls AskUserQuestion with the full current plan text and Approve / Revise
 -> if approved, agent calls exit_plan_mode
 -> exit_plan_mode clears state and returns the approved-plan execution handoff
 ```
 Plan files are normal markdown files. Do not add a special `update_plan_file` tool unless the user explicitly wants that abstraction. Let the agent use normal write tools and constrain those tools with permissions.
+Plan approval must show the user the full current plan text. Do not ask "does this look right?" with only a summary. After every revision, read the plan file again and present the full revised plan in the `AskUserQuestion.question` body before exiting plan mode.
 ## Capabilities used
 Guard each registration with the matching capability:
@@ -38,7 +40,7 @@ Guard each registration with the matching capability:
 - `commands`: `/plan` for explicit human entry
 - `tools`: `enter_plan_mode` and `exit_plan_mode` for model-driven entry/exit
 - `events.turns`: append a focused plan-mode reminder while active
-- `permissions`: block mutating tools except plan-file writes
+- `permissions`: block mutating tools except planning coordination tools and plan-file writes
 Do not use panels for persistent mode state. Panels are transient UI and can be noisy/fragile for mode indicators. Do not add a custom statusline renderer just to show plan mode; `setStatuslineRenderer` is a single global renderer, not an additive slot. This example intentionally keeps visible mode state out of scope.
@@ -99,9 +101,10 @@ In plan mode, you should:
 1. Thoroughly explore the codebase to understand existing patterns
 2. Identify similar features and architectural approaches
 3. Consider multiple approaches and their trade-offs
-4. Use AskUserQuestion if you need to clarify the approach
-5. Design a concrete implementation strategy
-6. When ready, write the plan to the plan file, use AskUserQuestion to present the full plan for approval, and call exit_plan_mode after the user approves
+4. Use direct read-only tools for exploration. Do not launch coding, general-purpose, or fork subagents in plan mode; they may mutate files and should be denied. Only recall-style subagents are allowed if available.
+5. Use AskUserQuestion if you need to clarify the approach
+6. Design a concrete implementation strategy
+7. When ready, write the plan to the plan file, read the plan file, use AskUserQuestion to present the full current plan text for approval, and call exit_plan_mode after the user approves
 Remember: DO NOT write or edit any files except the plan file. This is a read-only exploration and planning phase.
@@ -159,8 +162,9 @@ Plan mode is active. The user indicated that they do not want you to execute yet
 1. Answer the user's query comprehensively, using the AskUserQuestion tool if you need to ask the user clarifying questions.
 2. Write your implementation plan to the plan file. Plan file path: ${session.planFilePath}
 3. If using apply_patch, use this exact relative path in patch headers: ${relativePatchPath}
-4. When the plan is complete, read the plan file and present the full plan to the user with AskUserQuestion. The question should offer at least "Approve" and "Revise" options.
-5. If the user approves, call exit_plan_mode immediately. If the user asks to revise, stay in plan mode and update the plan file.
+4. Use direct read-only tools for exploration. Do not launch coding, general-purpose, or fork subagents in plan mode; they may mutate files and should be denied. Only recall-style subagents are allowed if available.
+5. When the plan is complete, read the plan file and present the full current plan text to the user with AskUserQuestion. The question body must include the entire plan, not a summary. The question should offer at least "Approve" and "Revise" options.
+6. If the user approves, call exit_plan_mode immediately. If the user asks to revise, stay in plan mode, update the plan file, then read and present the full revised plan again.
 Do NOT make any file changes outside the plan file or run any tools that modify the system state until the user has approved the plan and you have called exit_plan_mode.
 </system-reminder>`;
 }
@@ -176,35 +180,58 @@ if (letta.capabilities.events.turns) {
 ## Permission overlay
-Use a permission overlay, not `tool_start`, for policy. Normalize tool names by family; UI display names and provider-specific tool names drift (`Read`, `read`, `read_file`, `ReadFile`, `SearchFileContent`, etc.).
+Use a permission overlay, not `tool_start`, for policy. Normalize tool names by family; UI display names and provider-specific tool names drift (`Read`, `read`, `read_file`, `ReadFile`, `SearchFileContent`, etc.). Keep pure read-only tools separate from planning coordination tools like `AskUserQuestion` and todo/plan updates so the policy stays honest.
 ```ts
 const readOnlyToolNames = new Set([
-  "askuserquestion",
-  "ask_user_question",
   "glob",
+  "globgemini",
   "grep",
+  "grepfiles",
+  "list",
   "listdir",
-  "list_directory",
+  "listdirectory",
   "ls",
+  "notebookread",
   "read",
-  "read_file",
   "readfile",
+  "readfilegemini",
+  "readlsp",
+  "readmanyfiles",
   "search",
-  "search_file_content",
+  "searchfilecontent",
+  "searchfiles",
   "skill",
   "taskoutput",
-  "update_plan",
-  "view_image",
+  "viewimage",
 ]);
+const planningToolNames = new Set([
+  "askuserquestion",
+  "enterplanmode",
+  "exitplanmode",
+  "todowrite",
+  "updateplan",
+  "writetodos",
+]);
+const readOnlySubagentTypes = new Set(["recall"]);
 function normalizedToolName(toolName) {
-  return toolName.replace(/[\s-]/g, "").toLowerCase();
+  return toolName.replace(/[^a-z0-9]/gi, "").toLowerCase();
 }
 function isReadOnlyToolName(toolName) {
-  const raw = toolName.toLowerCase();
-  return readOnlyToolNames.has(raw) || readOnlyToolNames.has(normalizedToolName(toolName));
+  return readOnlyToolNames.has(normalizedToolName(toolName));
+}
+function isPlanningToolName(toolName) {
+  return planningToolNames.has(normalizedToolName(toolName));
+}
+function isAllowedReadOnlySubagent(args) {
+  const subagentType = args?.subagent_type;
+  return typeof subagentType === "string" && readOnlySubagentTypes.has(normalizedToolName(subagentType));
 }
 function isPlanFileWrite(toolName, args, cwd) {
@@ -220,18 +247,28 @@ if (letta.capabilities.permissions) {
     check(event) {
       const session = getSession(event.conversationId);
       if (!session) return;
+      const toolName = String(event.toolName);
+      const args = event.args ?? {};
+      if (isReadOnlyToolName(toolName)) return { decision: "allow" };
+      if (isPlanningToolName(toolName)) return { decision: "allow", reason: "planning" };
+      const normalized = normalizedToolName(toolName);
+      if ((normalized === "agent" || normalized === "task") && isAllowedReadOnlySubagent(args)) {
+        return { decision: "allow", reason: "read-only subagent" };
+      }
-      if (isReadOnlyToolName(event.toolName)) return { decision: "allow" };
-      if (isPlanFileWrite(event.toolName, event.args, event.workingDirectory || event.cwd)) {
+      if (isPlanFileWrite(toolName, args, event.workingDirectory || event.cwd)) {
         return { decision: "allow", reason: "plan file" };
       }
       return {
         decision: "deny",
         reason:
-          `Plan mode is active. You can only use read-only tools (Read, Grep, Glob, etc.) and write to the plan file. ` +
+          `Plan mode is active. Use direct read-only tools (Read, Grep, Glob, List, Search, Skill, TaskOutput, safe read-only Bash), planning tools (AskUserQuestion, TodoWrite/UpdatePlan), or recall-style subagents only. ` +
+          `Do not use coding, general-purpose, or fork subagents in plan mode. ` +
           `Write your plan to: ${session.planFilePath}. ` +
-          `Use AskUserQuestion when your plan is ready for user approval, then call exit_plan_mode after approval.`,
+          `When ready, read the plan file and include the full current plan text in AskUserQuestion for approval, then call exit_plan_mode after approval.`,
       };
     },
   }));
@@ -242,14 +279,14 @@ Shell allowlists are easy to get wrong. Start conservative: allow clearly read-o
 ## Exit tool
-In the extension version, `exit_plan_mode` is not the approval UI. The agent should present the plan with `AskUserQuestion` first, then call `exit_plan_mode` only after the user approves.
+In the extension version, `exit_plan_mode` is not the approval UI. The agent should read the plan file, present the full current plan text with `AskUserQuestion`, then call `exit_plan_mode` only after the user approves.
 ```ts
 if (letta.capabilities.tools) {
   disposers.push(letta.tools.register({
     name: "exit_plan_mode",
     description:
-      "Exit plan mode only after the plan file has been written, the full plan has been presented with AskUserQuestion, and the user has approved it.",
+      "Exit plan mode only after the plan file has been written, the full current plan text has been presented with AskUserQuestion, and the user has approved it.",
     parameters: { type: "object", properties: {}, additionalProperties: false },
     requiresApproval: false,
     parallelSafe: false,
@@ -282,4 +319,6 @@ if (letta.capabilities.tools) {
 ## Notes
 - Keep `exit_plan_mode` as the final state transition and execution handoff. The approved-plan text in its tool return is useful model context.
+- Plan approval must include the full current plan text in `AskUserQuestion.question`, not just a summary or "does this look right?". After revisions, re-read the file and present the full revised plan again.
+- Keep arbitrary coding subagents denied in plan mode unless the runtime has a true read-only child mode. With the current subagent set, allow only recall-style subagents.
 - If the user renames the plan file, exit logic can use the newest non-empty `~/.letta/plans/*.md` modified after plan mode started, or accept an optional plan path. Keep the user-facing flow normal: write plan file, ask approval, then exit.
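
The normalization change in this hunk is what lets the underscore variants (`read_file`, `list_directory`, `view_image`) drop out of the sets: the new regex strips every non-alphanumeric character, so they collapse to the same key as the camel-case names. The sketch below is a minimal, self-contained illustration of the check order, not the extension's full code. The sets are abbreviated, the plan-file write check is left out, and `classify` with its `Verdict` labels is a hypothetical helper introduced only for this example.

```typescript
// Abbreviated copies of the sets from the diff; the real lists are longer.
const readOnlyToolNames = new Set(["glob", "grep", "read", "readfile", "listdirectory", "viewimage"]);
const planningToolNames = new Set(["askuserquestion", "exitplanmode", "todowrite", "updateplan"]);
const readOnlySubagentTypes = new Set(["recall"]);

// Same regex as the new version: drop everything except ASCII letters and digits.
function normalizedToolName(toolName: string): string {
  return toolName.replace(/[^a-z0-9]/gi, "").toLowerCase();
}

// Hypothetical labels for the demo; the overlay itself returns { decision, reason } objects.
type Verdict = "read-only" | "planning" | "read-only subagent" | "deny";

// Hypothetical helper: follows the order of checks in the overlay, minus the plan-file write check.
function classify(toolName: string, args: { subagent_type?: unknown } = {}): Verdict {
  const normalized = normalizedToolName(toolName);
  if (readOnlyToolNames.has(normalized)) return "read-only";
  if (planningToolNames.has(normalized)) return "planning";
  const subagentType = args.subagent_type;
  if (
    (normalized === "agent" || normalized === "task") &&
    typeof subagentType === "string" &&
    readOnlySubagentTypes.has(normalizedToolName(subagentType))
  ) {
    return "read-only subagent";
  }
  return "deny";
}

console.log(classify("read_file")); // read-only
console.log(classify("Ask User Question")); // planning
console.log(classify("Task", { subagent_type: "recall" })); // read-only subagent
console.log(classify("Task", { subagent_type: "general-purpose" })); // deny
console.log(classify("Write")); // deny
```

With the previous regex (`/[\s-]/g`) underscores survived normalization, which is why the old set needed both `read_file` and `readfile`.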

package/skills/creating-skills/scripts/validate-skill.ts CHANGED

@@ -12,7 +12,6 @@
 import { existsSync, readFileSync } from "node:fs";
 import { basename, join, resolve } from "node:path";
 import { fileURLToPath } from "node:url";
-import { parse as parseYaml } from "yaml";
 interface ValidationResult {
   valid: boolean;
@@ -29,6 +28,131 @@ const ALLOWED_PROPERTIES = new Set([
   "allowed-tools",
 ]);
+export const MAX_SKILL_NAME_LENGTH = 64;
+type BunYamlRuntime = {
+  Bun?: {
+    YAML?: {
+      parse?: (source: string) => unknown;
+    };
+  };
+};
+function parseQuotedScalar(value: string): string {
+  if (value.startsWith('"')) {
+    if (!value.endsWith('"') || value.length === 1) {
+      throw new Error("Unterminated double-quoted scalar");
+    }
+    return JSON.parse(value) as string;
+  }
+  if (value.startsWith("'")) {
+    if (!value.endsWith("'") || value.length === 1) {
+      throw new Error("Unterminated single-quoted scalar");
+    }
+    return value.slice(1, -1).replace(/''/g, "'");
+  }
+  return value;
+}
+function parseScalar(value: string): unknown {
+  const trimmed = value.trim();
+  if (!trimmed) return "";
+  if (trimmed === "true") return true;
+  if (trimmed === "false") return false;
+  if (trimmed === "null" || trimmed === "~") return null;
+  if (trimmed.startsWith('"') || trimmed.startsWith("'")) {
+    return parseQuotedScalar(trimmed);
+  }
+  // The fallback parser intentionally accepts only the frontmatter subset this
+  // validator needs. Unquoted ": " inside a scalar is the most common YAML
+  // authoring mistake; reject it instead of silently producing a bad value.
+  if (trimmed.includes(": ")) {
+    throw new Error(`Unexpected ':' in unquoted scalar: ${trimmed}`);
+  }
+  if (/^-?\d+(?:\.\d+)?$/.test(trimmed)) {
+    return Number(trimmed);
+  }
+  return trimmed;
+}
+function parseFrontmatterFallback(source: string): Record<string, unknown> {
+  const result: Record<string, unknown> = {};
+  const lines = source.split(/\r?\n/);
+  for (let i = 0; i < lines.length; i++) {
+    const line = lines[i];
+    if (line === undefined) continue;
+    const trimmed = line.trim();
+    if (!trimmed || trimmed.startsWith("#")) {
+      continue;
+    }
+    if (/^\s/.test(line)) {
+      // Nested data belongs to the previous top-level key. The validator only
+      // checks top-level field names plus name/description scalar values.
+      continue;
+    }
+    const colonIndex = line.indexOf(":");
+    if (colonIndex <= 0) {
+      throw new Error(`Invalid frontmatter line: ${line}`);
+    }
+    const key = line.slice(0, colonIndex).trim();
+    const rawValue = line.slice(colonIndex + 1).trim();
+    if (!key) {
+      throw new Error(`Invalid frontmatter line: ${line}`);
+    }
+    if (!rawValue) {
+      result[key] = {};
+      continue;
+    }
+    if (rawValue === "|" || rawValue === ">") {
+      const blockLines: string[] = [];
+      for (let j = i + 1; j < lines.length; j++) {
+        const nextLine = lines[j];
+        if (nextLine === undefined) continue;
+        if (nextLine.trim() && !/^\s/.test(nextLine)) {
+          break;
+        }
+        blockLines.push(nextLine.replace(/^\s{2}/, ""));
+        i = j;
+      }
+      result[key] =
+        rawValue === ">" ? blockLines.join(" ").trim() : blockLines.join("\n");
+      continue;
+    }
+    result[key] = parseScalar(rawValue);
+  }
+  return result;
+}
+function parseFrontmatter(source: string): Record<string, unknown> {
+  const bunParse = (globalThis as typeof globalThis & BunYamlRuntime).Bun?.YAML
+    ?.parse;
+  if (bunParse) {
+    const parsed = bunParse(source);
+    if (typeof parsed !== "object" || parsed === null) {
+      throw new Error("Frontmatter must be a YAML dictionary");
+    }
+    return parsed as Record<string, unknown>;
+  }
+  return parseFrontmatterFallback(source);
+}
 export function validateSkill(skillPath: string): ValidationResult {
   // Check SKILL.md exists
   const skillMdPath = join(skillPath, "SKILL.md");
@@ -55,7 +179,7 @@ export function validateSkill(skillPath: string): ValidationResult {
   // Parse YAML frontmatter
   let frontmatter: Record<string, unknown>;
   try {
-    frontmatter = parseYaml(frontmatterText);
+    frontmatter = parseFrontmatter(frontmatterText);
     if (typeof frontmatter !== "object" || frontmatter === null) {
       return { valid: false, message: "Frontmatter must be a YAML dictionary" };
     }
@@ -112,11 +236,11 @@ export function validateSkill(skillPath: string): ValidationResult {
         message: `Name '${trimmedName}' cannot start/end with hyphen or contain consecutive hyphens`,
       };
     }
-    // Check name length (max 64 characters)
-    if (trimmedName.length > 64) {
+    // Check name length
+    if (trimmedName.length > MAX_SKILL_NAME_LENGTH) {
       return {
         valid: false,
-        message: `Name is too long (${trimmedName.length} characters). Maximum is 64 characters.`,
+        message: `Name is too long (${trimmedName.length} characters). Maximum is ${MAX_SKILL_NAME_LENGTH} characters.`,
       };
     }
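
To see how the fallback parser treats typical frontmatter values when `Bun.YAML.parse` is unavailable, the scalar rules can be exercised in isolation. In the sketch below, `parseQuotedScalar` and `parseScalar` are copied from the diff above (comments trimmed) so the example runs on its own; the `show` helper and the sample inputs are hypothetical and exist only for the demonstration.

```typescript
// Copied from the diff above so the example is self-contained.
function parseQuotedScalar(value: string): string {
  if (value.startsWith('"')) {
    if (!value.endsWith('"') || value.length === 1) {
      throw new Error("Unterminated double-quoted scalar");
    }
    return JSON.parse(value) as string;
  }
  if (value.startsWith("'")) {
    if (!value.endsWith("'") || value.length === 1) {
      throw new Error("Unterminated single-quoted scalar");
    }
    return value.slice(1, -1).replace(/''/g, "'");
  }
  return value;
}

function parseScalar(value: string): unknown {
  const trimmed = value.trim();
  if (!trimmed) return "";
  if (trimmed === "true") return true;
  if (trimmed === "false") return false;
  if (trimmed === "null" || trimmed === "~") return null;
  if (trimmed.startsWith('"') || trimmed.startsWith("'")) {
    return parseQuotedScalar(trimmed);
  }
  // Unquoted ": " is rejected rather than silently mis-parsed.
  if (trimmed.includes(": ")) {
    throw new Error(`Unexpected ':' in unquoted scalar: ${trimmed}`);
  }
  if (/^-?\d+(?:\.\d+)?$/.test(trimmed)) {
    return Number(trimmed);
  }
  return trimmed;
}

// Hypothetical demo helper: render the parsed value as JSON, or the error message.
function show(input: string): string {
  try {
    return JSON.stringify(parseScalar(input));
  } catch (error) {
    return `error: ${(error as Error).message}`;
  }
}

console.log(show("image-generation")); // "image-generation"
console.log(show("'it''s quoted'")); // "it's quoted"
console.log(show('"Use when: asked"')); // "Use when: asked"
console.log(show("64")); // 64
console.log(show("Use when: asked")); // error: Unexpected ':' in unquoted scalar: Use when: asked
```

The last two calls show the practical consequence for skill authors: a `description` containing `: ` has to be quoted, or the fallback path rejects the frontmatter.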

package/skills/image-generation/SKILL.md ADDED

@@ -0,0 +1,110 @@
+---
+name: image-generation
+description: Generate images from text prompts (and optionally edit/remix input images). Use when the user asks to create, generate, draw, render, or edit an image, illustration, logo, icon, diagram, or photo.
+---
+# Image Generation
+Generate images via Letta's hosted endpoint `POST /v1/images/generations`. The API
+usually returns base64 image bytes, so save the response to a local image file
+before replying.
+## Example
+Generate the image, save it locally, then show it inline:
+```bash
+curl -sS -X POST "https://api.letta.com/v1/images/generations" \
+  -H "Authorization: Bearer $LETTA_API_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{"provider":"gemini","prompt":"a friendly robot mascot waving, flat vector logo, mint green background","n":1}' \
+  > image-response.json
+python3 - <<'PY'
+import base64, json
+with open("image-response.json") as f:
+    response = json.load(f)
+with open("robot-mascot.png", "wb") as f:
+    f.write(base64.b64decode(response["images"][0]["b64_json"]))
+print("saved robot-mascot.png; credits:", response["billing"]["credits_charged"])
+PY
+```
+In Bash tools launched by Letta Code, the current Letta credential is available
+as `$LETTA_API_KEY`. This works for both Letta auth modes: it may be a normal
+Letta API key, or the OAuth access token from a Letta Cloud OAuth login. Reference
+it directly. If it is missing, the user needs to authenticate with Letta Cloud (or
+provide a Letta API key); do **not** ask for an OpenAI/Gemini provider key. This
+endpoint also does not use `/connect` BYOK providers — the only `provider` values
+supported here are `gemini` and `openai`.
+Then **show the image to the user** by embedding the saved file in your reply:
+```markdown
+Here's the mascot:
+![a friendly robot mascot waving, flat vector logo](./robot-mascot.png)
+```
+The Letta Code UI renders local file paths in markdown image tags, so the image
+appears inline. **Always display generated images this way** — don't just report
+the path, and never paste the raw base64 / a `data:` URI. The markdown path must
+match where you saved the file. For `n > 1`, save each image to its own file and
+embed each on its own line. Also tell the user the `credits_charged`.
+## Request body
+| Field | Type | Notes |
+|-------|------|-------|
+| `provider` | `"gemini"` \| `"openai"` | Required. |
+| `prompt` | string | Required, 1–32000 chars. |
+| `model` | string | Optional; defaults per provider (below). |
+| `n` | int 1–4 | Optional, default 1. Request variations in one call. |
+| `size` | string | Optional, e.g. `"1024x1024"` (OpenAI). |
+| `quality` | `low`\|`medium`\|`high`\|`auto` | Optional (OpenAI; higher = more credits). |
+| `output_format` | `png`\|`jpeg`\|`webp` | Optional (OpenAI). |
+| `input_images` | string[] (max 14) | Optional. Base64 **data URLs** for edit/remix. |
+| `seed` | int | Optional. |
+| Provider | Default model | Use for |
+|----------|---------------|---------|
+| `gemini` | `gemini-3-pro-image` | Default. Strong prompt adherence, image editing/remix. |
+| `openai` | `gpt-image-2` | Photoreal output, explicit `size`/`quality`/`output_format`. |
+Default to `gemini` unless the user wants photoreal or a specific size/quality.
+## Response
+```json
+{
+  "provider": "gemini",
+  "model": "gemini-3-pro-image",
+  "images": [{ "b64_json": "<base64>", "mime_type": "image/png" }],
+  "billing": { "credits_charged": 12, "...": "..." }
+}
+```
+Each `images[]` entry has either `b64_json` or `url`, plus `mime_type`. Gemini
+always returns `b64_json`. If OpenAI returns a `url`, download that URL to your
+local image file instead of base64-decoding.
+## Editing / remixing images
+Pass source images in `input_images` as base64 **data URLs**
+(`data:<mime>;base64,<data>`) and describe the edit in `prompt`. Gemini handles
+multi-image edits well. To build a data URL from a local file:
+```bash
+DATA_URL="data:image/png;base64,$(base64 < input.png | tr -d '\n')"
+```
+## Notes
+- **Billing**: every success charges credits; don't loop needlessly, and report
+  `credits_charged`.
+- **Errors**: `402` = insufficient credits (`credits_required` in body); `400`/`500`
+  return `{ "message": "..." }` — surface it to the user.
+- Only `gemini` and `openai` are supported here.
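
The save-and-embed flow described in this skill can also be sketched in TypeScript instead of the `curl` plus Python pair. This is a rough illustration under stated assumptions, not part of the package: the `ImageResponse` type only models the fields shown in the response example, and `extensionFor`, `saveBase64Images`, and `toDataUrl` are hypothetical helpers. The sample payload is the base64 of the text "hello", standing in for real image bytes, and no network call is made.

```typescript
import { readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Only the fields shown in the SKILL.md response example are modeled here.
interface GeneratedImage {
  b64_json?: string;
  url?: string;
  mime_type: string;
}
interface ImageResponse {
  images: GeneratedImage[];
  billing: { credits_charged: number };
}

// Hypothetical helper: pick a file extension from the MIME type, defaulting to png.
function extensionFor(mimeType: string): string {
  const known: Record<string, string> = {
    "image/png": "png",
    "image/jpeg": "jpg",
    "image/webp": "webp",
  };
  return known[mimeType] ?? "png";
}

// Hypothetical helper: decode every b64_json entry to its own file and return the paths.
// Entries that only carry a url are not handled here; those need a download instead.
function saveBase64Images(response: ImageResponse, dir: string, stem: string): string[] {
  const paths: string[] = [];
  let index = 0;
  for (const image of response.images) {
    index += 1;
    if (image.b64_json === undefined) {
      throw new Error(`Image ${index} has no b64_json; download its url instead`);
    }
    const path = join(dir, `${stem}-${index}.${extensionFor(image.mime_type)}`);
    writeFileSync(path, Buffer.from(image.b64_json, "base64"));
    paths.push(path);
  }
  return paths;
}

// Hypothetical helper: build an input_images data URL, same shape as the shell one-liner.
function toDataUrl(bytes: Uint8Array, mimeType: string): string {
  return `data:${mimeType};base64,${Buffer.from(bytes).toString("base64")}`;
}

// Usage with a stand-in response; a real one would come from the endpoint.
const sample: ImageResponse = {
  images: [{ b64_json: "aGVsbG8=", mime_type: "image/png" }],
  billing: { credits_charged: 12 },
};
const saved = saveBase64Images(sample, tmpdir(), "robot-mascot");
for (const path of saved) {
  console.log(`![robot mascot](${path})`); // one markdown embed per saved file
  console.log(toDataUrl(readFileSync(path), "image/png"));
}
console.log("credits:", sample.billing.credits_charged);
```

Writing one file per entry and emitting one markdown line per path matches the skill's guidance for `n > 1`; the temp directory is used here only to keep the demo from touching the working tree.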

package/skills/converting-mcps-to-skills/scripts/package.json DELETED

@@ -1,13 +0,0 @@
-{
-  "name": "mcp-client-scripts",
-  "version": "1.0.0",
-  "type": "module",
-  "description": "MCP client scripts for converting-mcps-to-skills",
-  "scripts": {
-    "http": "npx tsx mcp-http.ts",
-    "stdio": "npx tsx mcp-stdio.ts"
-  },
-  "dependencies": {
-    "@modelcontextprotocol/sdk": "^1.25.0"
-  }
-}