agentic-pi 0.1.0

Files changed (53)
  1. package/README.md +418 -0
  2. package/dist/args.d.ts +31 -0
  3. package/dist/args.js +109 -0
  4. package/dist/args.js.map +1 -0
  5. package/dist/cli.d.ts +9 -0
  6. package/dist/cli.js +47 -0
  7. package/dist/cli.js.map +1 -0
  8. package/dist/emitter.d.ts +49 -0
  9. package/dist/emitter.js +67 -0
  10. package/dist/emitter.js.map +1 -0
  11. package/dist/extensions/github/auth.d.ts +54 -0
  12. package/dist/extensions/github/auth.js +116 -0
  13. package/dist/extensions/github/auth.js.map +1 -0
  14. package/dist/extensions/github/client.d.ts +6387 -0
  15. package/dist/extensions/github/client.js +358 -0
  16. package/dist/extensions/github/client.js.map +1 -0
  17. package/dist/extensions/github/credentials.d.ts +24 -0
  18. package/dist/extensions/github/credentials.js +44 -0
  19. package/dist/extensions/github/credentials.js.map +1 -0
  20. package/dist/extensions/github/index.d.ts +46 -0
  21. package/dist/extensions/github/index.js +67 -0
  22. package/dist/extensions/github/index.js.map +1 -0
  23. package/dist/extensions/github/profiles.d.ts +17 -0
  24. package/dist/extensions/github/profiles.js +71 -0
  25. package/dist/extensions/github/profiles.js.map +1 -0
  26. package/dist/extensions/github/tools.d.ts +18 -0
  27. package/dist/extensions/github/tools.js +289 -0
  28. package/dist/extensions/github/tools.js.map +1 -0
  29. package/dist/index.d.ts +38 -0
  30. package/dist/index.js +34 -0
  31. package/dist/index.js.map +1 -0
  32. package/dist/models.d.ts +13 -0
  33. package/dist/models.js +31 -0
  34. package/dist/models.js.map +1 -0
  35. package/dist/run.d.ts +139 -0
  36. package/dist/run.js +131 -0
  37. package/dist/run.js.map +1 -0
  38. package/dist/runner.d.ts +22 -0
  39. package/dist/runner.js +143 -0
  40. package/dist/runner.js.map +1 -0
  41. package/dist/sandbox/gondolin.d.ts +39 -0
  42. package/dist/sandbox/gondolin.js +210 -0
  43. package/dist/sandbox/gondolin.js.map +1 -0
  44. package/dist/sandbox/index.d.ts +37 -0
  45. package/dist/sandbox/index.js +55 -0
  46. package/dist/sandbox/index.js.map +1 -0
  47. package/dist/sandbox/preflight.d.ts +24 -0
  48. package/dist/sandbox/preflight.js +93 -0
  49. package/dist/sandbox/preflight.js.map +1 -0
  50. package/dist/stdin.d.ts +1 -0
  51. package/dist/stdin.js +11 -0
  52. package/dist/stdin.js.map +1 -0
  53. package/package.json +44 -0
package/README.md ADDED
@@ -0,0 +1,418 @@
1
+ # agentic-pi
2
+
3
+ A pre-configured, opinionated wrapper around [earendil-works/pi](https://github.com/earendil-works/pi)
4
+ that turns it into a **one-shot coding-agent worker** for workflow systems like
5
+ [lastlight](https://github.com/cliftonc/lastlight).
6
+
7
+ If you already have an orchestrator that wants to spawn an agent for one
8
+ phase (architect, build, review, triage, …), pipe a prompt in, and parse a
9
+ structured event stream back out, this is what slots in. It does the
10
+ boring wiring so you don't have to.
11
+
12
+ ## What this is opinionated about
13
+
14
+ Pi itself is a deliberately minimal harness — it gives you an SDK, a multi-provider
15
+ LLM API, an extension model, and four run modes. agentic-pi makes opinionated
16
+ choices on top of all of that for one specific use case:
17
+
18
+ ### 1. One-shot only, no interactive mode
19
+
20
+ The only command is `agentic-pi run`. It reads the prompt from **stdin**, runs
21
+ exactly one agent turn (which may contain many tool calls), emits JSONL to
22
+ **stdout**, and exits when Pi's `agent_end` fires. There is no REPL, no chat
23
+ loop, no `serve` mode. If a phase needs follow-ups, the orchestrator spawns a
24
+ new process.
25
+
26
+ ### 2. JSONL event stream tailored for downstream parsing
27
+
28
+ Pi natively emits a JSONL event stream in `--mode json`. agentic-pi uses Pi's SDK
29
+ in-process, subscribes to the same events, and adds three things on top:
30
+
31
+ - A leading `{"type":"session", "version":3, "id":<uuid>, "cwd":...}` header.
32
+ - `sessionId` and `timestamp` injected onto **every** subsequent event, so a
33
+ downstream consumer can correlate without parsing the header line separately.
34
+ - A terminal `{"type":"usage_snapshot", "stats":{...}}` event synthesized from
35
+ `session.getSessionStats()` — because Pi's per-event payloads do **not**
36
+ carry token counts or cost.
37
+
38
+ If your orchestrator needs cost/token accounting, the snapshot is the
39
+ single line you parse.
40
+
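As a rough illustration of the accounting step described above, an orchestrator could scan the captured stdout from the end and pick out the `usage_snapshot` record. This is a minimal sketch, not part of agentic-pi: `findUsageSnapshot` and the `UsageSnapshot` interface are hypothetical names, and only the `type`, `stats.tokens.total`, `stats.cost`, and `sessionId` fields are taken from the README.

```typescript
// Hypothetical helper for an orchestrator; agentic-pi does not export this.
interface UsageSnapshot {
  type: "usage_snapshot";
  stats: { tokens: { total: number }; cost: number };
  sessionId: string;
}

function findUsageSnapshot(jsonl: string): UsageSnapshot | undefined {
  const lines = jsonl.split("\n").filter((line) => line.trim() !== "");
  // Walk backwards: the README documents the snapshot as the last line of a successful run.
  for (const line of lines.reverse()) {
    let record: unknown;
    try {
      record = JSON.parse(line);
    } catch {
      continue; // tolerate a truncated or non-JSON line
    }
    if (
      typeof record === "object" &&
      record !== null &&
      (record as { type?: unknown }).type === "usage_snapshot"
    ) {
      return record as UsageSnapshot;
    }
  }
  return undefined; // failed runs may not emit a snapshot
}
```

Returning `undefined` instead of throwing lets the caller decide how to treat a run that ended before the snapshot was written.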
41
+ ### 3. GitHub repo operations as first-class native tools
42
+
43
+ Pi explicitly does not support MCP. agentic-pi ships a native Pi extension
44
+ exposing **31 GitHub tools** ported from lastlight's `mcp-github-app`:
45
+ clone/push, issues, PRs, reviews, labels, search. Tools are registered with
46
+ the `github_` prefix to match opencode's MCP-server-name convention.
47
+
48
+ Auth is opinionated: **GitHub App credentials are preferred**, with a static
49
+ `GITHUB_TOKEN` only as a low-trust fallback. JWT-minted installation tokens are
50
+ cached for ~50 minutes with a 5-minute refresh buffer, and the `git credential-store`
51
+ file is written with mode 600 after the token passes a regex check.
52
+
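The caching rule above can be sketched as a small closure. This is an assumption-heavy illustration, not the code in `auth.ts`: `createTokenCache` is a hypothetical name, minting is made synchronous for brevity (real installation-token minting is a network call), and the sketch reads "~50 minutes with a 5-minute buffer" as "cache for 50 minutes, re-mint once fewer than 5 minutes remain".

```typescript
// Hypothetical sketch; the constants mirror the README, the structure is assumed.
const TOKEN_LIFETIME_MS = 50 * 60 * 1000; // cache window (~50 minutes)
const REFRESH_BUFFER_MS = 5 * 60 * 1000; // re-mint this long before the window ends

function createTokenCache(
  mint: () => string,
  now: () => number = Date.now,
): () => string {
  let cached: { value: string; expiresAt: number } | undefined;
  return () => {
    // Mint on first use, or when the cached token is inside the refresh buffer.
    if (cached === undefined || now() >= cached.expiresAt - REFRESH_BUFFER_MS) {
      cached = { value: mint(), expiresAt: now() + TOKEN_LIFETIME_MS };
    }
    return cached.value;
  };
}
```

Injecting `now` keeps the expiry logic testable with a fake clock.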
53
+ ### 4. Permission profiles as a registration-time gate
54
+
55
+ `--profile <name>` picks one of four allowlists ported from lastlight:
56
+
57
+ | Profile | Tool count | What it can do |
58
+ | --- | --- | --- |
59
+ | `read` | 18 | Repo/issue/PR reads + search. No mutations. |
60
+ | `issues-write` | 24 | Read + issue/comment/label mutations. |
61
+ | `review-write` | 26 | Read + issues + PR review/comment + create PR. |
62
+ | `repo-write` | 31 | Everything: clone, push, branch, file edits, merge. |
63
+
64
+ Tools outside the active profile are **never registered** — the LLM cannot see
65
+ them in the system prompt and cannot call them. This is a stronger guarantee
66
+ than a runtime "ask each time" gate.
67
+
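The registration-time gate can be pictured as filtering the tool list before anything reaches the model. The sketch below is illustrative only: `registerForProfile`, `ToolDef`, and the tiny allowlists are hypothetical, and only the tool names `github_list_pull_requests` and `github_create_issue` appear elsewhere in this README. The real allowlists in `profiles.ts` are much larger.

```typescript
// Hypothetical, heavily reduced allowlists; the real ones live in profiles.ts.
interface ToolDef {
  name: string;
}

const PROFILE_ALLOWLISTS: Record<string, ReadonlySet<string>> = {
  read: new Set(["github_list_pull_requests"]),
  "issues-write": new Set(["github_list_pull_requests", "github_create_issue"]),
};

function registerForProfile(all: ToolDef[], profile: string): ToolDef[] {
  const allow = PROFILE_ALLOWLISTS[profile];
  if (allow === undefined) {
    throw new Error(`unknown profile: ${profile}`);
  }
  // Tools outside the allowlist are dropped here, so the model never sees them.
  return all.filter((tool) => allow.has(tool.name));
}
```

Because filtering happens before registration, there is no runtime check left to bypass.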
68
+ The extension is **safe by default** when credentials are missing or
69
+ mis-configured:
70
+
71
+ | Situation | Behaviour | Stderr warning? |
72
+ | --- | --- | --- |
73
+ | `--profile` not passed | Silent skip. No GitHub tools registered. | No |
74
+ | `--profile X`, no `GITHUB_*` env vars at all | Skip. Run continues without GitHub tools. | Yes |
75
+ | `--profile X`, partial App creds (e.g. APP_ID set but INSTALLATION_ID missing) | Skip with explicit error. | Yes |
76
+ | `--profile X`, App creds set but PEM file unreadable | Skip with explicit error. | Yes |
77
+ | `--profile X`, all App creds set and PEM readable | Tools registered. | No |
78
+ | `--profile X`, only `GITHUB_TOKEN` set | Tools registered (static-token mode, lower trust). | No |
79
+
80
+ The `extension_status` JSONL event always reports `status`, `reason`,
81
+ `message`, `profile`, and `toolCount` so the orchestrator can log the
82
+ outcome programmatically without parsing stderr.
83
+
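The behaviour table above reads naturally as a single decision function. The following is a sketch under stated assumptions rather than the extension's actual logic: `decideGitHubAuth`, the `reason` strings, and the `pemReadable` callback are hypothetical. The table does not say what happens when partial App credentials are combined with `GITHUB_TOKEN`; this sketch treats that as a misconfiguration, which is a guess.

```typescript
// Hypothetical decision function; env var names come from the README, the rest is assumed.
type GitHubAuthDecision =
  | { status: "configured"; mode: "app" | "token" }
  | { status: "skipped"; reason: string; warn: boolean };

function decideGitHubAuth(
  profile: string | undefined,
  env: Record<string, string | undefined>,
  pemReadable: (path: string) => boolean,
): GitHubAuthDecision {
  if (profile === undefined) {
    return { status: "skipped", reason: "no-profile", warn: false };
  }
  const pemPath = env["GITHUB_APP_PRIVATE_KEY_PATH"];
  const appVars = [env["GITHUB_APP_ID"], pemPath, env["GITHUB_APP_INSTALLATION_ID"]];
  const setCount = appVars.filter((v) => v !== undefined && v !== "").length;
  if (setCount === appVars.length && pemPath !== undefined) {
    return pemReadable(pemPath)
      ? { status: "configured", mode: "app" }
      : { status: "skipped", reason: "pem-unreadable", warn: true };
  }
  if (setCount > 0) {
    // Assumption: partial App credentials are an error even if GITHUB_TOKEN is set.
    return { status: "skipped", reason: "partial-app-credentials", warn: true };
  }
  if (env["GITHUB_TOKEN"]) {
    return { status: "configured", mode: "token" };
  }
  return { status: "skipped", reason: "no-credentials", warn: true };
}
```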
84
+ ### 5. Models named the way opencode names them
85
+
86
+ `--model provider/id` accepts the exact string format opencode used
87
+ (`openai/gpt-5.5`, `anthropic/claude-opus-4-5`, etc.). Credentials come from
88
+ environment variables (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`,
89
+ `OPENROUTER_API_KEY`) or Pi's `~/.pi/agent/auth.json` if you've logged in
90
+ interactively. Provider/id mapping is delegated to `@earendil-works/pi-ai`'s
91
+ `getModel()`.
92
+
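To make the `provider/id` format concrete, here is a minimal sketch of one way to split it. It assumes the split happens on the first slash so that multi-segment ids survive intact; `splitModel` is a hypothetical name, the OpenRouter id in the usage note is illustrative, and the actual parsing in `models.ts` may differ.

```typescript
// Hypothetical parser; assumes everything after the first "/" is the model id.
function splitModel(spec: string): { provider: string; id: string } {
  const slash = spec.indexOf("/");
  if (slash <= 0 || slash === spec.length - 1) {
    throw new Error(`--model must be 'provider/id', got '${spec}'`);
  }
  return { provider: spec.slice(0, slash), id: spec.slice(slash + 1) };
}
```

Under this assumption, `splitModel("openrouter/anthropic/claude-haiku-4-5")` keeps `anthropic/claude-haiku-4-5` together as the id instead of discarding the middle segment.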
93
+ `--thinking <level>` maps directly to Pi's `thinkingLevel`
94
+ (`off`/`minimal`/`low`/`medium`/`high`/`xhigh`). Per-provider effort is
95
+ handled by Pi.
96
+
97
+ ### 6. Things accepted but ignored for caller-side compatibility
98
+
99
+ - `--dangerously-skip-permissions` — Pi has no permission prompts to skip
100
+ ("run in a container" is Pi's design stance). The flag is accepted so a
101
+ caller that previously spawned opencode does not need to strip it.
102
+ - `--variant <level>` — alias for `--thinking`.
103
+
104
+ ### 7. Defaults that match a containerized sandbox
105
+
106
+ - **`--no-session`** is recommended in sandboxed runs (state lives outside
107
+ the container). It is not on by default, so pass it explicitly.
108
+ - **Built-in tools** (read, write, edit, bash, grep, find, ls) are enabled
109
+ by default. Add `--no-builtin-tools` if you want a GitHub-only agent.
110
+ - **`AGENTS.md`** in the working directory is auto-loaded as the agent's
111
+ system prompt — same convention Pi and opencode share. Drop your
112
+ workflow's `AGENTS.md` into the mounted workspace and the agent picks it
113
+ up.
114
+
115
+ ### 8. Optional micro-VM sandboxing via `--sandbox gondolin`
116
+
117
+ By default Pi's file and bash tools run on the host. Pass `--sandbox gondolin`
118
+ and they get routed through a per-run [Gondolin](https://github.com/earendil-works/gondolin)
119
+ QEMU micro-VM instead. The orchestrator doesn't need to manage anything —
120
+ agentic-pi boots the VM, mounts the working directory at `/workspace`
121
+ inside it, runs the agent's tools through it, and tears it down on
122
+ `agent_end`.
123
+
124
+ **What this protects against.** Arbitrary code the agent runs via `bash`
125
+ or `write` executes inside the VM, not on the host. A prompt-injection
126
+ that gets the agent to `rm -rf /` only rm's the guest, which is thrown
127
+ away seconds later. The host workspace is mounted in, so legitimate file
128
+ edits *do* persist — destructive `bash` against `/workspace` will still
129
+ modify host files (the same trade-off `chroot` and Docker bind mounts
130
+ have).
131
+
132
+ **What this does NOT protect against.** GitHub credentials and the LLM
133
+ API key live in the agentic-pi process *outside* the VM. The `github_*`
134
+ tools run there. A prompt-injection that subverts Pi into calling
135
+ `github_create_issue` does not need to escape the VM — the call happens
136
+ host-side. The VM protects against *code execution*, not *tool misuse*.
137
+ For protection against tool misuse, restrict the GitHub profile
138
+ (`--profile read`).
139
+
140
+ **Hard requirements.**
141
+
142
+ - QEMU on the host: `brew install qemu` (macOS) or
143
+ `apt install qemu-system-x86 qemu-system-arm qemu-utils` (Debian/Ubuntu).
144
+ - agentic-pi running **natively** on the host, not inside a Docker
145
+ container. See `SPIKE-gondolin.md`: managed-host containers don't
146
+ expose `/dev/kvm`, and macOS Docker uses Apple's
147
+ Virtualization.Framework (not KVM), which is unreachable from inside
148
+ a container.
149
+ - On Linux, the running user must have read/write access to `/dev/kvm`.
150
+
151
+ **Pre-flight is loud, not silent.** agentic-pi probes for QEMU and
152
+ `qemu-img` before starting the VM, and probes the booted VM with
153
+ `/bin/true` (5s timeout) before returning. If any check fails, the
154
+ process exits 2 with a clean error pointing at the spike doc. The
155
+ upstream `VM.create` failure mode of "returns ready but the guest is
156
+ dead" cannot leak through.
157
+
158
+ **Latency cost (measured on macOS Apple Silicon).**
159
+
160
+ | Op | Time |
161
+ | --- | --- |
162
+ | First `VM.create` post-boot | ~13 s (one-time cache warm-up) |
163
+ | Subsequent `VM.create` | < 100 ms |
164
+ | Per-tool overhead | ~200 ms each |
165
+ | Realistic shell op (`ls /etc && uname -a`) | ~2.8 s |
166
+ | `vm.close` | ~10 ms |
167
+
168
+ Linux + KVM should be in the same ballpark. Numbers are reproducible
169
+ from `test/fixtures/phase3-smoke-sandbox-gondolin.jsonl`.
170
+
171
+ **Event stream.** A `sandbox_status` JSONL line is emitted right after
172
+ the session header:
173
+
174
+ ```jsonl
175
+ {"type":"sandbox_status","backend":"gondolin","status":{"backend":"gondolin","cwd":"/path/to/workspace","guestPath":"/workspace","createMs":47},"sessionId":"…","timestamp":"…"}
176
+ ```
177
+
178
+ If `--sandbox none` (the default), the same line is still emitted with
179
+ `backend: "none"` so downstream consumers always know which mode the run
180
+ used.
181
+
182
+ ## When to use this
183
+
184
+ - You have an orchestrator that calls a coding agent once per workflow
185
+ phase, in a container, and parses a JSONL stream.
186
+ - You used to call `opencode run --format json` and want a less-opaque
187
+ replacement built on a more hackable substrate.
188
+ - You need GitHub repo operations available to the agent without standing
189
+ up an MCP server.
190
+
191
+ ## When **not** to use this
192
+
193
+ - You want a chat UI or a long-running agent. Use [`pi`](https://github.com/earendil-works/pi)
194
+ directly — its interactive and RPC modes are excellent.
195
+ - You want generic MCP support. Pi has none by design and agentic-pi inherits
196
+ that decision; only the GitHub tool surface is built-in.
197
+ - You want a different tool surface (Linear, GitLab, internal APIs). Fork the
198
+ `extensions/github/` directory as a template, not as a runtime plugin
199
+ system — agentic-pi does not (yet) load arbitrary external extensions.
200
+
201
+ ## Usage
202
+
203
+ ```bash
204
+ echo "list open PRs on owner/repo" | agentic-pi run \
205
+ --model anthropic/claude-haiku-4-5 \
206
+ --profile read \
207
+ --no-session
208
+ ```
209
+
210
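+ Environment (an LLM key matching `--model` is required unless Pi's `auth.json` already holds one; GitHub credentials matter only with `--profile`):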
211
+
212
+ ```bash
213
+ # Anthropic/OpenAI/OpenRouter — at least one matching your --model
214
+ ANTHROPIC_API_KEY=sk-ant-…
215
+ OPENAI_API_KEY=sk-…
216
+ OPENROUTER_API_KEY=sk-or-…
217
+
218
+ # GitHub — App credentials preferred over static token
219
+ GITHUB_APP_ID=…
220
+ GITHUB_APP_PRIVATE_KEY_PATH=/abs/path/app.pem
221
+ GITHUB_APP_INSTALLATION_ID=…
222
+ # or, for low-trust fallback:
223
+ GITHUB_TOKEN=ghp_…
224
+ ```
225
+
226
+ ## Flags
227
+
228
+ | Flag | Description |
229
+ | --- | --- |
230
+ | `--model <provider/id>` | Required. e.g. `anthropic/claude-opus-4-5`, `openai/gpt-4o`. |
231
+ | `--thinking <level>` | `off` \| `minimal` \| `low` \| `medium` \| `high` \| `xhigh`. |
232
+ | `--variant <level>` | Alias for `--thinking`. |
233
+ | `--profile <name>` | `read` \| `issues-write` \| `review-write` \| `repo-write`. Omit to disable GitHub tools entirely. |
234
+ | `--cwd <path>` | Working directory for the agent. Default: `$PWD`. |
235
+ | `--no-session` | Ephemeral run — do not persist session jsonl. Recommended in sandboxed containers. |
236
+ | `--session-dir <path>` | Override session storage location. |
237
+ | `--no-builtin-tools` | Disable Pi's `read,write,edit,bash,grep,find,ls`. |
238
+ | `--tools <a,b,c>` | Explicit tool allowlist (combined with profile if set). |
239
+ | `--sandbox <none\|gondolin>` | Route `read`/`write`/`edit`/`bash` through a sandbox backend. Default `none`. `gondolin` boots a QEMU micro-VM mounting cwd at `/workspace`. Requires QEMU on the host; native-only (not Docker-in-Docker). See section 8. |
240
+ | `--dangerously-skip-permissions` | Accepted for caller-side compatibility. No-op. |
241
+
242
+ Reads the prompt from stdin. Emits JSONL on stdout. Exits 0 on `agent_end`,
243
+ 1 on fatal error, and 2 on a usage error or a failed sandbox pre-flight.
244
+
245
+ ## Event stream
246
+
247
+ ```jsonl
248
+ {"type":"session","version":3,"id":"<uuid>","timestamp":"…","cwd":"…"}
249
+ {"type":"sandbox_status","backend":"none","status":{"backend":"none"},"sessionId":"<uuid>","timestamp":"…"}
250
+ {"type":"extension_status","extension":"github","status":"configured","profile":"read","toolCount":18,"sessionId":"<uuid>","timestamp":"…"}
251
+ {"type":"agent_start","sessionId":"<uuid>","timestamp":"…"}
252
+ {"type":"turn_start","sessionId":"<uuid>","timestamp":"…"}
253
+ {"type":"message_start","message":{…},"sessionId":"<uuid>","timestamp":"…"}
254
+ {"type":"message_update","assistantMessageEvent":{"type":"text_delta","delta":"…"},"sessionId":"<uuid>","timestamp":"…"}
255
+ {"type":"tool_execution_start","toolCallId":"…","toolName":"github_list_pull_requests","args":{…},"sessionId":"<uuid>","timestamp":"…"}
256
+ {"type":"tool_execution_end","toolCallId":"…","toolName":"github_list_pull_requests","result":{"content":[…]},"isError":false,"sessionId":"<uuid>","timestamp":"…"}
257
+ {"type":"message_end","message":{…},"sessionId":"<uuid>","timestamp":"…"}
258
+ {"type":"turn_end","message":{…},"toolResults":[…],"sessionId":"<uuid>","timestamp":"…"}
259
+ {"type":"agent_end","messages":[…],"willRetry":false,"sessionId":"<uuid>","timestamp":"…"}
260
+ {"type":"usage_snapshot","stats":{"userMessages":1,"assistantMessages":2,"toolCalls":1,"toolResults":1,"tokens":{"input":…,"output":…,"cacheRead":…,"cacheWrite":…,"total":…},"cost":0.000…},"sessionId":"<uuid>","timestamp":"…"}
261
+ ```
262
+
263
+ `extension_status` is emitted once at startup so downstream logs can confirm
264
+ the GitHub profile (and whether auth succeeded). `usage_snapshot` is always
265
+ the last line in a successful run.
266
+
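A downstream consumer that only wants the assistant's streamed text could fold the `text_delta` events from the sample stream above. This is a sketch with hypothetical names (`collectAssistantText`, `isTextDelta`); only the event shapes come from the sample. It concatenates every delta in the stream, which is not necessarily the same as the `finalText` that the library derives from the last assistant message.

```typescript
// Hypothetical consumer; only the record shape is taken from the sample stream.
interface TextDeltaRecord {
  type: "message_update";
  assistantMessageEvent: { type: "text_delta"; delta: string };
}

function isTextDelta(value: unknown): value is TextDeltaRecord {
  if (typeof value !== "object" || value === null) return false;
  const rec = value as { type?: unknown; assistantMessageEvent?: unknown };
  if (rec.type !== "message_update") return false;
  const ev = rec.assistantMessageEvent;
  if (typeof ev !== "object" || ev === null) return false;
  const inner = ev as { type?: unknown; delta?: unknown };
  return inner.type === "text_delta" && typeof inner.delta === "string";
}

function collectAssistantText(jsonl: string): string {
  let text = "";
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    let record: unknown;
    try {
      record = JSON.parse(line);
    } catch {
      continue; // skip anything that is not a complete JSON line
    }
    if (isTextDelta(record)) text += record.assistantMessageEvent.delta;
  }
  return text;
}
```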
267
+ ## Programmatic usage
268
+
269
+ If your orchestrator runs Node, you can skip the subprocess and import
270
+ agentic-pi directly. The `run()` API never touches `process.stdout` or
271
+ `process.stderr` — it returns a fully-derived `RunResult` and forwards
272
+ events through callbacks instead.
273
+
274
+ ```ts
275
+ import { run } from "agentic-pi";
276
+
277
+ const result = await run({
278
+ model: "anthropic/claude-haiku-4-5",
279
+ prompt: "list the open PRs on owner/repo and summarize them",
280
+ thinking: "medium",
281
+ profile: "read",
282
+ sandbox: "none",
283
+ noSession: true,
284
+ cwd: "/path/to/workspace",
285
+
286
+ // Optional observability hooks. Both are pure callbacks — no I/O happens
287
+ // unless you do something with the values.
288
+ onEvent: (record) => myShim.writeJsonl(record),
289
+ onWarn: (msg) => myLogger.warn(msg),
290
+ });
291
+
292
+ if (!result.ok) {
293
+ throw new Error(result.fatalError?.message ?? "agent failed");
294
+ }
295
+
296
+ console.log(result.finalText); // "There are 3 open PRs: …"
297
+ console.log(result.sessionId); // Pi session UUID
298
+ console.log(result.stats?.tokens.total); // total tokens
299
+ console.log(result.stats?.cost); // USD
300
+ console.log(result.sandbox?.backend); // "none" | "gondolin"
301
+ console.log(result.github?.status); // "configured" | "skipped"
302
+ console.log(result.records.length); // full event log
303
+ ```
304
+
305
+ ### `RunResult` shape
306
+
307
+ | Field | Type | Description |
308
+ | --- | --- | --- |
309
+ | `exitCode` | `0 \| 1 \| 2` | Same code the CLI would have returned. |
310
+ | `ok` | `boolean` | `exitCode === 0`. |
311
+ | `agentEnded` | `boolean` | Pi emitted `agent_end`. |
312
+ | `toolErrors` | `boolean` | At least one tool returned an error. |
313
+ | `fatalError` | `{name, message}` \| `undefined` | Set if a fatal error short-circuited the run. |
314
+ | `sessionId` | `string` \| `undefined` | Pi session UUID. |
315
+ | `cwd` | `string` \| `undefined` | Working directory the agent ran in. |
316
+ | `startedAt` | `string` \| `undefined` | ISO timestamp of session start. |
317
+ | `finalText` | `string` | Concatenated last-assistant text content. |
318
+ | `messages` | `unknown[]` | Full Pi message array from `agent_end`. |
319
+ | `stats` | `{userMessages, assistantMessages, toolCalls, toolResults, tokens: {input, output, cacheRead, cacheWrite, total}, cost}` \| `undefined` | Token + cost rollup. |
320
+ | `sandbox` | `{backend, status}` \| `undefined` | Mirror of the `sandbox_status` event. |
321
+ | `github` | `{status, reason, profile, toolCount}` \| `undefined` | Mirror of the `extension_status` event. |
322
+ | `records` | `EmitterRecord[]` | Every JSONL record in order. Same shape that the CLI writes. |
323
+ | `warnings` | `string[]` | Warnings that would have gone to stderr in CLI mode. |
324
+
325
+ ### When to use which API
326
+
327
+ | If you want… | Use |
328
+ | --- | --- |
329
+ | The same observable stream the CLI produces, captured to a file or proxied to a UI | `run({ ..., onEvent })` |
330
+ | A single object describing the outcome (lastlight's `ExecutionResult` mapping) | `run()` and read `result.finalText`/`result.stats` |
331
+ | Direct control over the sink (e.g. write straight to a writable stream you already have) | `run({ ..., extraSink })` or drop down to `runOnce(config, prompt, { sink, onWarn })` |
332
+ | Cancellation | Not supported yet — kill the host process. Open an issue if you need this. |
333
+
334
+ ### Notes for in-process callers
335
+
336
+ - agentic-pi reuses the host process's env vars (`OPENAI_API_KEY`,
337
+ `GITHUB_APP_ID`, …). If your orchestrator runs multiple
338
+ workflows with different credentials, `process.env` is the seam to vary.
339
+ - `cwd` is per-call; you can run multiple agents in parallel against
340
+ different working directories from the same orchestrator process.
341
+ - Sessions are created fresh each call. Pass `noSession: true` if you
342
+ don't want session JSONLs accumulating under `~/.pi/agent/sessions/`.
343
+ - The sandbox boots and tears down per call. If you're processing many
344
+ short tasks against the same workspace, the per-task VM cost adds up;
345
+ consider batching or just leaving `sandbox: "none"`.
346
+
347
+ ## Development
348
+
349
+ ```bash
350
+ npm install
351
+ npm run build
352
+ npm test # full suite — skips integration tests if env not set
353
+ npm run test:unit # unit only (fast, no API keys, no QEMU)
354
+ npm run test:integration # integration only (needs OPENAI_API_KEY; sandbox also needs QEMU)
355
+
356
+ echo "hello" | node dist/cli.js run --model anthropic/claude-haiku-4-5 --no-session
357
+ ```
358
+
359
+ ### Tests
360
+
361
+ The test suite uses Node's built-in test runner (`node:test`) and `tsx`
362
+ to load TypeScript. Files are discovered by `scripts/run-tests.mjs`,
363
+ which walks `test/` for `*.test.ts`.
364
+
365
+ | File | What it covers | Skip condition |
366
+ | --- | --- | --- |
367
+ | `test/args.test.ts` | CLI flag parsing happy path + every error case | — |
368
+ | `test/emitter.test.ts` | `Emitter`, `CollectorSink`, `TeeSink` contracts | — |
369
+ | `test/models.test.ts` | `provider/id` parsing including openrouter triple-slash | — |
370
+ | `test/extensions/github/profiles.test.ts` | Profile → tool allowlist (counts, superset structure, scope tiering) | — |
371
+ | `test/extensions/github/credentials.test.ts` | `assertSafeToken` and `credentialsFilePath` validation | — |
372
+ | `test/sandbox/preflight.test.ts` | Preflight returns a structured ok\|error result | — |
373
+ | `test/run.integration.test.ts` | Programmatic `run()`: RunResult populated, onEvent fires for every record, **child-process check confirms zero stdout/stderr leak from library** | `OPENAI_API_KEY` not set |
374
+ | `test/run-sandbox.integration.test.ts` | `run({ sandbox: "gondolin" })` boots a VM, agent's `write` tool produces a host file via the mount | `OPENAI_API_KEY` not set OR QEMU/preflight unavailable |
375
+
376
+ Unit tests run in ~170 ms. Integration tests cost about $0.001 per run on
377
+ `gpt-5.4-nano`.
378
+
379
+ Project layout:
380
+
381
+ ```
382
+ src/
383
+ cli.ts argv → run config; reads stdin; wraps run()
384
+ index.ts public library API: run, RunResult, sinks
385
+ run.ts programmatic entry: in-process run() + result accumulation
386
+ args.ts flag parser
387
+ stdin.ts stdin slurp
388
+ runner.ts createAgentSession → subscribe → prompt → emit
389
+ emitter.ts sink abstraction (Stdout / Collector / Tee) + Emitter
390
+ models.ts "provider/id" → getModel(...)
391
+ extensions/github/
392
+ index.ts loadGitHubExtension(profile) entry
393
+ auth.ts GitHubAppAuth (JWT → installation token) + static fallback
394
+ client.ts Octokit wrapper with retry/backoff
395
+ credentials.ts git credential-store file writer (mode 600)
396
+ profiles.ts 4 profiles → tool name allowlists
397
+ tools.ts 31 defineTool() registrations
398
+ sandbox/
399
+ index.ts buildSandbox(backend) dispatcher
400
+ preflight.ts QEMU + accelerator detection (refuses to start if hung)
401
+ gondolin.ts VM lifecycle + tool overrides for read/write/edit/bash
402
+ test/fixtures/ golden JSONL streams from real runs
403
+ SPIKE-gondolin.md spike notes on why sandbox is native-only
404
+ ```
405
+
406
+ ## Status & relationship to Pi
407
+
408
+ agentic-pi pins to `@earendil-works/pi-coding-agent ^0.75.4`. It uses Pi's SDK
409
+ in-process (`createAgentSession`, `session.subscribe`, `session.prompt`,
410
+ `session.getSessionStats`) — not the CLI subprocess. If Pi's SDK changes shape,
411
+ agentic-pi will need to track it; that's the trade-off taken for in-process
412
+ speed and direct access to session state.
413
+
414
+ It does **not** wrap, fork, or modify Pi. Pi's defaults that we don't
415
+ override remain in effect: AGENTS.md auto-discovery, ~/.pi/agent skills /
416
+ extensions / prompts / themes, the same model registry, the same auth
417
+ storage. If you `pi /login` and authenticate via subscription, agentic-pi
418
+ will pick that up too.
package/dist/args.d.ts ADDED
@@ -0,0 +1,31 @@
1
+ /**
2
+ * Argument parsing for `agentic-pi run`.
3
+ *
4
+ * Modeled loosely on opencode's CLI surface so the swap inside a Docker
5
+ * sandbox is one line. We intentionally do NOT mimic opencode's JSON event
6
+ * shape — see the plan doc for why.
7
+ */
8
+ export interface RunConfig {
9
+ /** "provider/model_id", e.g. "anthropic/claude-haiku-4-5" */
10
+ model: string;
11
+ /** Pi thinking level. */
12
+ thinking?: "off" | "minimal" | "low" | "medium" | "high" | "xhigh";
13
+ /** GitHub tool profile (read | issues-write | review-write | repo-write). Omit to disable GitHub tools. */
14
+ profile?: string;
15
+ /** Working directory for the agent. Default: process.cwd(). */
16
+ cwd: string;
17
+ /** Skip persisting the session to disk. Default: false, so sessions persist as in Pi. */
18
+ noSession: boolean;
19
+ /** Optional override for session storage directory. */
20
+ sessionDir?: string;
21
+ /** Disable built-in tools (read/write/edit/bash/grep/find/ls). */
22
+ noBuiltinTools: boolean;
23
+ /** Explicit tool allowlist (comma-separated). */
24
+ tools?: string[];
25
+ /** Ignored — accepted for opencode call-site compatibility. */
26
+ dangerouslySkipPermissions: boolean;
27
+ /** Sandbox backend for read/write/edit/bash. */
28
+ sandbox: "none" | "gondolin";
29
+ }
30
+ export declare function printHelp(): void;
31
+ export declare function parseArgs(argv: string[]): RunConfig;
package/dist/args.js ADDED
@@ -0,0 +1,109 @@
1
+ /**
2
+ * Argument parsing for `agentic-pi run`.
3
+ *
4
+ * Modeled loosely on opencode's CLI surface so the swap inside a Docker
5
+ * sandbox is one line. We intentionally do NOT mimic opencode's JSON event
6
+ * shape — see the plan doc for why.
7
+ */
8
+ export function printHelp() {
9
+ process.stdout.write(`agentic-pi — Pi-based coding-agent harness
10
+
11
+ Usage:
12
+ echo "<prompt>" | agentic-pi run --model <provider/id> [flags]
13
+
14
+ Flags:
15
+ --model <provider/id> e.g. anthropic/claude-opus-4-5, openai/gpt-4o
16
+ --thinking <level> off | minimal | low | medium | high | xhigh
17
+ --profile <name> GitHub tool profile (read|issues-write|review-write|repo-write)
18
+ --cwd <path> Working directory (default: $PWD)
19
+ --no-session Do not persist session jsonl
20
+ --session-dir <path> Where to persist sessions
21
+ --no-builtin-tools Disable Pi built-in tools (read,write,edit,bash,grep,find,ls)
22
+ --tools <a,b,c> Explicit tool allowlist
23
+ --sandbox <none|gondolin> Route Pi's read/write/edit/bash through a sandbox backend.
24
+ Default: none. 'gondolin' boots a per-run QEMU micro-VM
25
+ mounting the cwd at /workspace. Requires QEMU on host;
26
+ native only (Docker-in-Docker not viable; see SPIKE-gondolin.md).
27
+ --dangerously-skip-permissions Accepted for compat; Pi has no permission prompts anyway
28
+
29
+ Reads the prompt from stdin. Emits Pi-native JSONL events on stdout, terminating
30
+ with a "usage_snapshot" event that carries synthesized usage/cost data.
31
+ `);
32
+ }
33
+ export function parseArgs(argv) {
34
+ const config = {
35
+ model: "",
36
+ cwd: process.cwd(),
37
+ noSession: false,
38
+ noBuiltinTools: false,
39
+ dangerouslySkipPermissions: false,
40
+ sandbox: "none",
41
+ };
42
+ for (let i = 0; i < argv.length; i++) {
43
+ const arg = argv[i];
44
+ const next = () => {
45
+ const v = argv[++i];
46
+ if (v === undefined)
47
+ throw new Error(`flag ${arg} requires a value`);
48
+ return v;
49
+ };
50
+ switch (arg) {
51
+ case "--model":
52
+ case "-m":
53
+ config.model = next();
54
+ break;
55
+ case "--thinking":
56
+ case "--variant": {
57
+ const v = next();
58
+ if (!["off", "minimal", "low", "medium", "high", "xhigh"].includes(v)) {
59
+ throw new Error(`invalid --thinking level: ${v}`);
60
+ }
61
+ config.thinking = v;
62
+ break;
63
+ }
64
+ case "--profile":
65
+ config.profile = next();
66
+ break;
67
+ case "--cwd":
68
+ config.cwd = next();
69
+ break;
70
+ case "--no-session":
71
+ config.noSession = true;
72
+ break;
73
+ case "--session-dir":
74
+ config.sessionDir = next();
75
+ break;
76
+ case "--no-builtin-tools":
77
+ config.noBuiltinTools = true;
78
+ break;
79
+ case "--tools":
80
+ config.tools = next().split(",").map((s) => s.trim()).filter(Boolean);
81
+ break;
82
+ case "--dangerously-skip-permissions":
83
+ config.dangerouslySkipPermissions = true;
84
+ break;
85
+ case "--sandbox": {
86
+ const v = next();
87
+ if (v !== "none" && v !== "gondolin") {
88
+ throw new Error(`invalid --sandbox '${v}'. Expected: none | gondolin`);
89
+ }
90
+ config.sandbox = v;
91
+ break;
92
+ }
93
+ case "-h":
94
+ case "--help":
95
+ printHelp();
96
+ process.exit(0);
97
+ default:
98
+ throw new Error(`unknown flag: ${arg}`);
99
+ }
100
+ }
101
+ if (!config.model) {
102
+ throw new Error("--model is required (e.g. anthropic/claude-haiku-4-5)");
103
+ }
104
+ if (!config.model.includes("/")) {
105
+ throw new Error(`--model must be 'provider/id', got '${config.model}'`);
106
+ }
107
+ return config;
108
+ }
109
+ //# sourceMappingURL=args.js.map
package/dist/args.js.map ADDED
@@ -0,0 +1 @@
1
+ {"version":3,"file":"args.js","sourceRoot":"","sources":["../src/args.ts"],"names":[],"mappings":"AAAA;;;;;;GAMG;AAyBH,MAAM,UAAU,SAAS;IACvB,OAAO,CAAC,MAAM,CAAC,KAAK,CAAC;;;;;;;;;;;;;;;;;;;;;;CAsBtB,CAAC,CAAC;AACH,CAAC;AAED,MAAM,UAAU,SAAS,CAAC,IAAc;IACtC,MAAM,MAAM,GAAc;QACxB,KAAK,EAAE,EAAE;QACT,GAAG,EAAE,OAAO,CAAC,GAAG,EAAE;QAClB,SAAS,EAAE,KAAK;QAChB,cAAc,EAAE,KAAK;QACrB,0BAA0B,EAAE,KAAK;QACjC,OAAO,EAAE,MAAM;KAChB,CAAC;IAEF,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,IAAI,CAAC,MAAM,EAAE,CAAC,EAAE,EAAE,CAAC;QACrC,MAAM,GAAG,GAAG,IAAI,CAAC,CAAC,CAAC,CAAC;QACpB,MAAM,IAAI,GAAG,GAAW,EAAE;YACxB,MAAM,CAAC,GAAG,IAAI,CAAC,EAAE,CAAC,CAAC,CAAC;YACpB,IAAI,CAAC,KAAK,SAAS;gBAAE,MAAM,IAAI,KAAK,CAAC,QAAQ,GAAG,mBAAmB,CAAC,CAAC;YACrE,OAAO,CAAC,CAAC;QACX,CAAC,CAAC;QACF,QAAQ,GAAG,EAAE,CAAC;YACZ,KAAK,SAAS,CAAC;YACf,KAAK,IAAI;gBACP,MAAM,CAAC,KAAK,GAAG,IAAI,EAAE,CAAC;gBACtB,MAAM;YACR,KAAK,YAAY,CAAC;YAClB,KAAK,WAAW,CAAC,CAAC,CAAC;gBACjB,MAAM,CAAC,GAAG,IAAI,EAAE,CAAC;gBACjB,IAAI,CAAC,CAAC,KAAK,EAAE,SAAS,EAAE,KAAK,EAAE,QAAQ,EAAE,MAAM,EAAE,OAAO,CAAC,CAAC,QAAQ,CAAC,CAAC,CAAC,EAAE,CAAC;oBACtE,MAAM,IAAI,KAAK,CAAC,6BAA6B,CAAC,EAAE,CAAC,CAAC;gBACpD,CAAC;gBACD,MAAM,CAAC,QAAQ,GAAG,CAA0B,CAAC;gBAC7C,MAAM;YACR,CAAC;YACD,KAAK,WAAW;gBACd,MAAM,CAAC,OAAO,GAAG,IAAI,EAAE,CAAC;gBACxB,MAAM;YACR,KAAK,OAAO;gBACV,MAAM,CAAC,GAAG,GAAG,IAAI,EAAE,CAAC;gBACpB,MAAM;YACR,KAAK,cAAc;gBACjB,MAAM,CAAC,SAAS,GAAG,IAAI,CAAC;gBACxB,MAAM;YACR,KAAK,eAAe;gBAClB,MAAM,CAAC,UAAU,GAAG,IAAI,EAAE,CAAC;gBAC3B,MAAM;YACR,KAAK,oBAAoB;gBACvB,MAAM,CAAC,cAAc,GAAG,IAAI,CAAC;gBAC7B,MAAM;YACR,KAAK,SAAS;gBACZ,MAAM,CAAC,KAAK,GAAG,IAAI,EAAE,CAAC,KAAK,CAAC,GAAG,CAAC,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,IAAI,EAAE,CAAC,CAAC,MAAM,CAAC,OAAO,CAAC,CAAC;gBACtE,MAAM;YACR,KAAK,gCAAgC;gBACnC,MAAM,CAAC,0BAA0B,GAAG,IAAI,CAAC;gBACzC,MAAM;YACR,KAAK,WAAW,CAAC,CAAC,CAAC;gBACjB,MAAM,CAAC,GAAG,IAAI,EAAE,CAAC;gBACjB,IAAI,CAAC,KAAK,MAAM,IAAI,CAAC,KAAK,UAAU,EAAE,CAAC;oBACrC,MAAM,IAAI,KAAK,CAAC,sBAAsB,CAAC,8BAA8B,CAAC,CAAC;gBACzE,CAAC;gBACD,MAAM,CAAC,OAAO,GAAG,C
AAC,CAAC;gBACnB,MAAM;YACR,CAAC;YACD,KAAK,IAAI,CAAC;YACV,KAAK,QAAQ;gBACX,SAAS,EAAE,CAAC;gBACZ,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;YAClB;gBACE,MAAM,IAAI,KAAK,CAAC,iBAAiB,GAAG,EAAE,CAAC,CAAC;QAC5C,CAAC;IACH,CAAC;IAED,IAAI,CAAC,MAAM,CAAC,KAAK,EAAE,CAAC;QAClB,MAAM,IAAI,KAAK,CAAC,uDAAuD,CAAC,CAAC;IAC3E,CAAC;IACD,IAAI,CAAC,MAAM,CAAC,KAAK,CAAC,QAAQ,CAAC,GAAG,CAAC,EAAE,CAAC;QAChC,MAAM,IAAI,KAAK,CAAC,uCAAuC,MAAM,CAAC,KAAK,GAAG,CAAC,CAAC;IAC1E,CAAC;IAED,OAAO,MAAM,CAAC;AAChB,CAAC"}
package/dist/cli.d.ts ADDED
@@ -0,0 +1,9 @@
+ #!/usr/bin/env node
+ /**
+  * agentic-pi CLI entry point.
+  *
+  * Drives the Pi SDK in one-shot mode: reads a prompt from stdin, runs a single
+  * turn against the configured model, streams Pi-native JSONL events to stdout,
+  * and exits cleanly on `agent_end`.
+  */
+ export {};
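The doc comment above describes streaming JSONL events and stopping on `agent_end`. A minimal sketch of that pattern follows. It is not the package's implementation: the `AgentEvent` shape, the `LineSink` interface, the `MemorySink` class, and every event type other than `agent_end` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical event shape: the only field this sketch relies on is `type`.
interface AgentEvent {
  type: string;
  [key: string]: unknown;
}

// Hypothetical sink: anything that accepts one serialized line at a time.
interface LineSink {
  write(line: string): void;
}

// Serialize events as JSONL (one JSON object per line) and stop after the
// first `agent_end` event. Returns true if `agent_end` was seen.
function streamUntilAgentEnd(events: readonly AgentEvent[], sink: LineSink): boolean {
  for (const agentEvent of events) {
    sink.write(JSON.stringify(agentEvent) + "\n");
    if (agentEvent.type === "agent_end") {
      return true;
    }
  }
  return false;
}

// In-memory sink for demonstration; a CLI would write to stdout instead.
class MemorySink implements LineSink {
  readonly lines: string[] = [];
  write(line: string): void {
    this.lines.push(line);
  }
}

const demoSink = new MemorySink();
const sawAgentEnd = streamUntilAgentEnd(
  [
    { type: "agent_start" },               // hypothetical event type
    { type: "message", text: "hello" },    // hypothetical event type
    { type: "agent_end" },
    { type: "never_written" },             // hypothetical; comes after agent_end
  ],
  demoSink
);
```

Here `demoSink.lines` ends up holding three lines, the last being the serialized `agent_end` event. Writing each event as a single line keeps the output consumable by line-oriented tools, and returning a boolean lets the caller distinguish a clean finish from a stream that ended early.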
package/dist/cli.js ADDED
@@ -0,0 +1,47 @@
+ #!/usr/bin/env node
+ /**
+  * agentic-pi CLI entry point.
+  *
+  * Drives the Pi SDK in one-shot mode: reads a prompt from stdin, runs a single
+  * turn against the configured model, streams Pi-native JSONL events to stdout,
+  * and exits cleanly on `agent_end`.
+  */
+ import { readStdin } from "./stdin.js";
+ import { parseArgs, printHelp } from "./args.js";
+ import { runOnce } from "./runner.js";
+ import { StdoutSink } from "./emitter.js";
+ async function main() {
+     const argv = process.argv.slice(2);
+     if (argv.length === 0 || argv[0] === "--help" || argv[0] === "-h") {
+         printHelp();
+         return 0;
+     }
+     const command = argv[0];
+     if (command !== "run") {
+         process.stderr.write(`agentic-pi: unknown command '${command}'\n`);
+         printHelp();
+         return 2;
+     }
+     let config;
+     try {
+         config = parseArgs(argv.slice(1));
+     }
+     catch (err) {
+         process.stderr.write(`agentic-pi: ${err.message}\n`);
+         return 2;
+     }
+     const prompt = await readStdin();
+     if (!prompt.trim()) {
+         process.stderr.write("agentic-pi: empty prompt on stdin\n");
+         return 2;
+     }
+     return await runOnce(config, prompt, {
+         sink: new StdoutSink(),
+         onWarn: (msg) => process.stderr.write(`agentic-pi: ${msg}\n`),
+     });
+ }
+ main().then((code) => process.exit(code), (err) => {
+     process.stderr.write(`agentic-pi: fatal: ${err.stack ?? err}\n`);
+     process.exit(1);
+ });
+ //# sourceMappingURL=cli.js.map
package/dist/cli.js.map ADDED
@@ -0,0 +1 @@
+ {"version":3,"file":"cli.js","sourceRoot":"","sources":["../src/cli.ts"],"names":[],"mappings":";AACA;;;;;;GAMG;AAEH,OAAO,EAAE,SAAS,EAAE,MAAM,YAAY,CAAC;AACvC,OAAO,EAAE,SAAS,EAAE,SAAS,EAAkB,MAAM,WAAW,CAAC;AACjE,OAAO,EAAE,OAAO,EAAE,MAAM,aAAa,CAAC;AACtC,OAAO,EAAE,UAAU,EAAE,MAAM,cAAc,CAAC;AAE1C,KAAK,UAAU,IAAI;IACjB,MAAM,IAAI,GAAG,OAAO,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC;IAEnC,IAAI,IAAI,CAAC,MAAM,KAAK,CAAC,IAAI,IAAI,CAAC,CAAC,CAAC,KAAK,QAAQ,IAAI,IAAI,CAAC,CAAC,CAAC,KAAK,IAAI,EAAE,CAAC;QAClE,SAAS,EAAE,CAAC;QACZ,OAAO,CAAC,CAAC;IACX,CAAC;IAED,MAAM,OAAO,GAAG,IAAI,CAAC,CAAC,CAAC,CAAC;IACxB,IAAI,OAAO,KAAK,KAAK,EAAE,CAAC;QACtB,OAAO,CAAC,MAAM,CAAC,KAAK,CAAC,gCAAgC,OAAO,KAAK,CAAC,CAAC;QACnE,SAAS,EAAE,CAAC;QACZ,OAAO,CAAC,CAAC;IACX,CAAC;IAED,IAAI,MAAiB,CAAC;IACtB,IAAI,CAAC;QACH,MAAM,GAAG,SAAS,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,CAAC;IACpC,CAAC;IAAC,OAAO,GAAG,EAAE,CAAC;QACb,OAAO,CAAC,MAAM,CAAC,KAAK,CAAC,eAAgB,GAAa,CAAC,OAAO,IAAI,CAAC,CAAC;QAChE,OAAO,CAAC,CAAC;IACX,CAAC;IAED,MAAM,MAAM,GAAG,MAAM,SAAS,EAAE,CAAC;IACjC,IAAI,CAAC,MAAM,CAAC,IAAI,EAAE,EAAE,CAAC;QACnB,OAAO,CAAC,MAAM,CAAC,KAAK,CAAC,qCAAqC,CAAC,CAAC;QAC5D,OAAO,CAAC,CAAC;IACX,CAAC;IAED,OAAO,MAAM,OAAO,CAAC,MAAM,EAAE,MAAM,EAAE;QACnC,IAAI,EAAE,IAAI,UAAU,EAAE;QACtB,MAAM,EAAE,CAAC,GAAW,EAAE,EAAE,CAAC,OAAO,CAAC,MAAM,CAAC,KAAK,CAAC,eAAe,GAAG,IAAI,CAAC;KACtE,CAAC,CAAC;AACL,CAAC;AAED,IAAI,EAAE,CAAC,IAAI,CACT,CAAC,IAAI,EAAE,EAAE,CAAC,OAAO,CAAC,IAAI,CAAC,IAAI,CAAC,EAC5B,CAAC,GAAG,EAAE,EAAE;IACN,OAAO,CAAC,MAAM,CAAC,KAAK,CAAC,sBAAuB,GAAa,CAAC,KAAK,IAAI,GAAG,IAAI,CAAC,CAAC;IAC5E,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;AAClB,CAAC,CACF,CAAC"}
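The `main()` function in `cli.js` uses three exit codes: 0 for help, 2 for usage errors (unknown command, argument parse failure, empty prompt), and 1 for an unexpected rejection. The sketch below restates the checks that happen before `runOnce` as a pure function so the mapping is easier to see and test. It is an illustration, not code from the package: `preflight`, `PreflightResult`, `stubParse`, and the `--flag` argument are hypothetical, and the prompt is passed in as a parameter rather than read from stdin.

```typescript
// Result of the checks that run before the agent turn starts.
// exitCode === null means "no early exit; proceed to the run step".
interface PreflightResult {
  exitCode: number | null;
  stderr: string;
}

// Mirrors the order of checks in main(): help, command name, argument
// parsing, then the empty-prompt check. `parse` stands in for parseArgs.
function preflight(
  argv: string[],
  prompt: string,
  parse: (args: string[]) => unknown
): PreflightResult {
  if (argv.length === 0 || argv[0] === "--help" || argv[0] === "-h") {
    return { exitCode: 0, stderr: "" };
  }
  if (argv[0] !== "run") {
    return { exitCode: 2, stderr: `agentic-pi: unknown command '${argv[0]}'\n` };
  }
  try {
    parse(argv.slice(1));
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { exitCode: 2, stderr: `agentic-pi: ${message}\n` };
  }
  if (!prompt.trim()) {
    return { exitCode: 2, stderr: "agentic-pi: empty prompt on stdin\n" };
  }
  return { exitCode: null, stderr: "" };
}

// Hypothetical stand-in for parseArgs: rejects an empty argument list.
const stubParse = (args: string[]): unknown => {
  if (args.length === 0) throw new Error("missing arguments");
  return {};
};

const helpResult = preflight(["--help"], "", stubParse);               // exitCode 0
const unknownResult = preflight(["deploy"], "hi", stubParse);          // exitCode 2
const parseFailResult = preflight(["run"], "hi", stubParse);           // exitCode 2
const emptyPromptResult = preflight(["run", "--flag"], "  ", stubParse); // exitCode 2
const okResult = preflight(["run", "--flag"], "hi", stubParse);        // exitCode null
```

Keeping usage errors on a code distinct from fatal errors lets a calling script tell "the invocation was wrong" apart from "the run itself failed" without parsing stderr.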