@letta-ai/letta-code 0.18.4 → 0.19.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/dist/types/protocol.d.ts +30 -11
- package/dist/types/protocol.d.ts.map +1 -1
- package/letta.js +10989 -8432
- package/package.json +1 -1
- package/skills/dispatching-coding-agents/SKILL.md +156 -124
package/skills/dispatching-coding-agents/SKILL.md
CHANGED

---
name: dispatching-coding-agents
description: Dispatch stateless coding agents (Claude Code or Codex) via Bash. Use when you're stuck, need a second opinion, or need parallel research on a hard problem. They have no memory — you must provide all context.
---

# Dispatching Coding Agents

You can shell out to **Claude Code** (`claude`) and **Codex** (`codex`) as stateless sub-agents via Bash. They have filesystem and tool access (scope depends on sandbox/approval settings) but **zero memory** — every session starts from scratch.

**Default to `run_in_background: true`** on the Bash call so you can keep working while they run. Check results later with `TaskOutput`. Don't sit idle waiting for a subagent.

## The Core Mental Model

Claude Code and Codex are highly optimized coding agents, but they are re-born with each new session. Think of them like a brilliant intern who showed up today. Give them the right instructions and context to help them succeed, so they don't have to re-learn things you've already learned.

You are the experienced manager with persistent memory of the user's preferences, the codebase, past decisions, and hard-won lessons. **Give them context, not a plan.** They won't know anything you don't tell them:

- **Specific task**: Be precise about what you need — not "look into the auth system" but "trace the request flow from the messages endpoint through to the LLM call, cite files and line numbers."
- **File paths and architecture**: Tell them exactly where to look and how the pieces connect. They will wander aimlessly without this.
- **Preferences and constraints**: Code style, error handling patterns, things the user has corrected you on. Save them from making mistakes you already learned from.
- **What you've already tried**: If you're dispatching because you're stuck, this prevents them from rediscovering your dead ends.

If a subagent needs clarification or asks a question, respond in the same session (see Resuming sessions below) — don't start a new session or you'll lose the conversation context.

## When to Dispatch (and When Not To)

### Dispatch for:
- **Hard debugging** — you've been looping on a problem and need fresh eyes
- **Second opinions** — you want validation before a risky change
- **Parallel research** — investigate multiple hypotheses simultaneously
- **Large-scope investigation** — tracing a flow across many files in an unfamiliar area
- **Code review** — have another agent review your diff or plan

### Don't dispatch for:
- Simple file reads, greps, or small edits — faster to do yourself
- Anything that takes less than ~3 minutes of direct work
- Tasks where you already know exactly what to do
- When context transfer would take longer than just doing the task

## Choosing an Agent and Model

Different agents have different strengths. Track what works in your memory over time — your own observations are more valuable than these defaults.

### Categories

**Codex:**
- `gpt-5.3-codex` — Frontier reasoning. Best for the hardest debugging and most complex tasks.
  - Strengths: Best reasoning, excellent at debugging, the strongest option for the hardest tasks
  - Weaknesses: Slow on long trajectories; compactions can destroy trajectories
- `gpt-5.4` — Latest frontier model. Fast and general-purpose.
  - Strengths: Output is easier for humans to follow, general-purpose, faster
  - Weaknesses: More likely to make silly errors than `gpt-5.3-codex`

**Claude Code:**
- `opus` — Excellent writer. Best for docs, refactors, open-ended tasks, and vague instructions.
  - Strengths: Excellent writer, understands vague instructions, strong at coding but also general-purpose
  - Weaknesses: Tends to generate "slop", writing more code than necessary. Can hang on large repos.

### Cost and speed tradeoffs
- Frontier models (`gpt-5.3-codex`, Opus) are slower and more expensive — use them for tasks that justify it
- Fast models (`gpt-5.4`) are good for quick checks and simple tasks
- Use `--max-budget-usd N` (Claude Code) to cap spend on exploratory tasks

### Known quirks
- **Claude Code can hang on large repos** with unrestricted tools — consider `--allowedTools "Read Grep Glob"` (no Bash) and shorter timeouts for research tasks
- **Codex compactions can destroy long trajectories** — for very long tasks, prefer multiple shorter sessions over one marathon
- **Opus tends to over-generate** — produces more code than necessary. Good for exploration; verify before applying.

## Prompting Subagents

### Prompt template

```
TASK: [one-sentence summary]

CONTEXT:
- Repo: [path]
- Key files: [list specific files and what they contain]
- Architecture: [brief relevant context]

WHAT TO DO:
[what you need done — be precise, but let them figure out the approach]

CONSTRAINTS:
- [any preferences, patterns to follow, things to avoid]
- [what you've already tried, if dispatching because stuck]

OUTPUT:
[what you want back — a diff, a list of files, a root cause analysis, etc.]
```

### What makes a good prompt
- **Be specific about files** — "look at `src/agent/message.ts` lines 40-80" not "look at the message handling code"
- **State the output format** — "return a bullet list of findings" vs. leaving it open-ended
- **Include constraints** — if the user prefers certain patterns, say so explicitly
- **Provide what you've tried** — when dispatching because you're stuck, this prevents them from repeating your dead ends
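
As a concrete sketch, here is the template filled in and assembled in shell before dispatch. Every specific below (repo path, file names, the bug itself) is hypothetical, purely for illustration:

```bash
# Assemble a context-rich prompt; all details here are illustrative.
prompt=$(cat <<'EOF'
TASK: Find why retries double-fire in the sync worker.

CONTEXT:
- Repo: /home/me/repos/acme (hypothetical path)
- Key files: src/sync/worker.ts (retry loop), src/sync/queue.ts (job state)
- Architecture: the worker polls the queue every 5s and re-enqueues failures

WHAT TO DO:
Trace the retry path and identify where a job can be enqueued twice.

CONSTRAINTS:
- I already ruled out the poll timer; don't re-investigate it.

OUTPUT:
Root cause analysis with file paths and line numbers.
EOF
)
echo "$prompt"
# Dispatch would then be e.g.:
#   claude -p "$prompt" --model opus --dangerously-skip-permissions
```

Building the prompt in a variable first also makes it easy to reuse the same context for a second agent.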

## Dispatch Patterns

### Parallel research — multiple perspectives
Run Claude Code and Codex simultaneously on the same question via separate Bash calls in a single message (use `run_in_background: true`). Compare results for higher confidence.
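
A minimal sketch of the fan-out shape, using shell functions as stand-ins for the real CLI calls so the example is self-contained (swap in `claude -p` and `codex exec` in practice):

```bash
q="Why does the retry loop double-fire? Cite files and line numbers."

# Stand-ins so the sketch runs anywhere; replace the bodies with the real
# `claude -p "$1" ...` and `codex exec "$1" ...` invocations.
ask_claude() { echo "claude findings for: $1"; }
ask_codex()  { echo "codex findings for: $1"; }

ask_claude "$q" > /tmp/claude.out &   # both dispatches run concurrently
ask_codex  "$q" > /tmp/codex.out  &
wait                                  # collect both before comparing
cat /tmp/claude.out /tmp/codex.out
```

When both agents independently converge on the same answer, confidence goes up; when they diverge, you've found exactly what to investigate next.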

### Background dispatch — keep working while they run
Use `run_in_background: true` on the Bash call to dispatch async. Continue your own work, then check results with `TaskOutput` when ready.

### Deep investigation — frontier models
For hard problems, use the strongest available models:

```bash
codex exec "YOUR PROMPT" -m gpt-5.3-codex --full-auto -C /path/to/repo
```

### Code review — cross-agent validation
Have one agent write code or create a plan, then dispatch another to review:

```bash
# Codex has a native review command:
codex review --uncommitted   # Review all local changes
codex exec review "Focus on error handling and edge cases" -m gpt-5.4 --full-auto

# Claude Code — pass the diff inline:
claude -p "Review the following diff for correctness, edge cases, and missed error handling:\n\n$(git diff)" \
  --model opus --dangerously-skip-permissions
```

### Get outside feedback on your work
Write your plan or analysis to a file, then ask a subagent to critique it:

```bash
claude -p "Read /tmp/my-plan.md and critique it. What am I missing? What could go wrong?" \
  --model opus --dangerously-skip-permissions -C /path/to/repo
```

## Handling Failures

- **Timeout**: If an agent times out (especially Claude Code on large repos), try (1) a shorter, more focused prompt, (2) restricting tools with `--allowedTools`, or (3) switching to Codex, which handles large repos better.
- **Garbage output**: If results are incoherent, the prompt was probably too vague. Rewrite with more specific file paths and clearer instructions.
- **Session errors**: Claude Code can hit "stale approval from interrupted session" — `--dangerously-skip-permissions` prevents this. If Codex errors, start a fresh `exec` session.
- **Compaction mid-task**: If a Codex session runs long enough to compact, it may lose earlier context. Break long tasks into smaller sequential sessions.

## CLI Reference

### Claude Code

```bash
claude -p "YOUR PROMPT" --model MODEL --dangerously-skip-permissions
```

| Flag | Purpose |
|------|---------|
| `-p` / `--print` | Non-interactive mode: prints the response and exits |
| `--dangerously-skip-permissions` | Skip approval prompts (prevents stale approval errors on timeout) |
| `--model MODEL` | Alias (`sonnet`, `opus`) or full name (`claude-sonnet-4-6`) |
| `--effort LEVEL` | `low`, `medium`, `high` — controls reasoning depth |
| `--append-system-prompt "..."` | Inject additional system instructions |
| `--allowedTools "Bash Edit Read"` | Restrict available tools |
| `--max-budget-usd N` | Cap spend for the invocation |
| `-C DIR` | Set working directory |
| `--output-format json` | Structured output with `session_id`, `cost_usd`, `duration_ms` |
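
A sketch of pulling the `session_id` out of `--output-format json` output for later resumption. The sample string below stands in for real CLI output; only the field names come from the table above, so treat the exact JSON shape as an assumption:

```bash
# Stand-in for: out=$(claude -p "YOUR PROMPT" --output-format json ...)
out='{"session_id":"123e4567-e89b-12d3-a456-426614174000","cost_usd":0.42,"duration_ms":73000}'

# Extract the session ID without extra tooling (a JSON parser like jq is
# more robust if available).
session_id=$(printf '%s' "$out" | sed -n 's/.*"session_id":"\([^"]*\)".*/\1/p')
echo "$session_id"
```

The captured ID can then feed `claude -r SESSION_ID -p "..."` for follow-ups.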

### Codex

```bash
codex exec "YOUR PROMPT" -m gpt-5.3-codex --full-auto
```

| Flag | Purpose |
|------|---------|
| `exec` | Non-interactive mode |
| `-m MODEL` | `gpt-5.3-codex` (frontier), `gpt-5.4` (fast), `gpt-5.3-codex-spark` (ultra-fast), `gpt-5.2-codex`, `gpt-5.2` |
| `--full-auto` | Auto-approve all commands in sandbox |
| `-C DIR` | Set working directory |
| `--search` | Enable web search tool |
| `review` | Native code review — `codex review --uncommitted` or `codex exec review "prompt"` |

## Session Management

Both CLIs persist full session data (tool calls, reasoning, files read) to disk. The Bash output you see is just the final summary — the local session file is much richer.

### Session storage paths

**Claude Code:** `~/.claude/projects/<encoded-path>/<session-id>.jsonl`
- `<encoded-path>` = working directory with `/` replaced by `-` (e.g. `/Users/foo/repos/bar` becomes `-Users-foo-repos-bar`)
- Use `--output-format json` to get the `session_id` in the response
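
Since the encoding is mechanical, locating a session directory can be scripted; a small sketch (the working directory is illustrative):

```bash
# Compute the encoded project directory name from a working directory.
dir=/Users/foo/repos/bar                  # illustrative path
encoded=$(printf '%s' "$dir" | tr '/' '-')
echo "$encoded"                           # -Users-foo-repos-bar

# Session files for that project would then live under:
echo "$HOME/.claude/projects/$encoded/"
```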

**Codex:** `~/.codex/sessions/<year>/<month>/<day>/rollout-*-<session-id>.jsonl`
- Session ID is printed in the output header: `session id: <uuid>`
- Extract it with: `grep "^session id:" output | awk '{print $3}'`
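
The extraction pipeline above, demonstrated on a stand-in output file (the `session id: <uuid>` header line follows the format described above; the surrounding line and the uuid are made up):

```bash
# Fabricated capture of codex exec output for demonstration.
cat > /tmp/codex.out <<'EOF'
--------
session id: 0199aaaa-bbbb-cccc-dddd-eeeeffff0000
--------
EOF

# $3 is the third whitespace-separated field: "session", "id:", "<uuid>".
sid=$(grep "^session id:" /tmp/codex.out | awk '{print $3}')
echo "$sid"
```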

### Resuming sessions

Use session resumption to continue a line of investigation without re-providing all context:

**Claude Code:**
```bash
claude -r SESSION_ID -p "Follow up: now check if..."       # Resume by ID
claude -c -p "Also check..."                               # Continue most recent
claude -r SESSION_ID --fork-session -p "Try differently"   # Fork (new ID, keeps history)
```

**Codex:**
```bash
codex exec resume SESSION_ID "Follow up prompt"   # Resume by ID (non-interactive)
codex exec resume --last "Follow up prompt"       # Resume most recent (non-interactive)
codex resume SESSION_ID "Follow up prompt"        # Resume by ID (interactive)
codex resume --last "Follow up prompt"            # Resume most recent (interactive)
codex fork SESSION_ID "Try a different approach"  # Fork session (interactive)
```

Note: `codex exec resume` works non-interactively; `codex resume` and `codex fork` are interactive only.

### When to analyze past sessions

**Don't** run `history-analyzer` after every dispatch — your reflection agent already captures insights naturally, and single-session analysis produces overly detailed notes.

**Do** use `history-analyzer` for **bulk migration** when bootstrapping memory from months of accumulated history (e.g. during `/init`). See the `migrating-from-codex-and-claude-code` skill.

Direct uses for session files:
- **Resume** an investigation (see above)
- **Review** what an agent actually did (read the JSONL file directly)
- **Bulk migration** when setting up a new agent
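
Reviewing a session file can be as simple as grepping the JSONL. A sketch with a fabricated sample, since the real entries follow the CLI's internal schema and the field names below are only illustrative:

```bash
# Fabricated session file; real entries use the CLI's own internal schema.
cat > /tmp/sample-session.jsonl <<'EOF'
{"type":"user","text":"find the double-fire bug"}
{"type":"tool_use","name":"Read","path":"src/sync/worker.ts"}
{"type":"assistant","text":"the retry loop re-enqueues on line 62"}
EOF

# Skim how many tool calls the agent actually made:
grep -c '"type":"tool_use"' /tmp/sample-session.jsonl
```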

## Timeouts

Set Bash timeouts appropriate to the task:
- Quick checks / reviews: `timeout: 120000` (2 min)
- Research / analysis: `timeout: 300000` (5 min)
- Implementation: `timeout: 600000` (10 min)