agent-afk 2.3.1 → 2.4.0

package/README.md CHANGED
@@ -1,18 +1,23 @@
1
1
  # Agent AFK CLI
2
2
 
3
- > A TypeScript CLI, daemon, and Telegram bot for running Claude via `@anthropic-ai/claude-agent-sdk` — ships seven orchestration skills as subagents and mirrors the `agent-framework-private` plugin surface.
3
+ > A TypeScript CLI, daemon, and Telegram bot for running Claude (via `@anthropic-ai/sdk`) or OpenAI Codex — ships four orchestration skills as built-in subagent dispatchers with cross-session memory, DAG-composed waves, and background-task support.
4
4
 
5
5
  [![TypeScript](https://img.shields.io/badge/TypeScript-5.7-blue)](https://www.typescriptlang.org/)
6
6
 
7
7
  ## Features
8
8
 
9
- - 🚀 **Real Claude Agent SDK integration** — native subagents, hooks, elicitations, cost guardrails
10
- - 🧩 **Seven orchestration skills as CLI subagents** — `/mint`, `/diagnose`, `/shadow-verify`, `/forge`, `/parallelize`, `/forge-gate-check`, `/forge-l2-eval`
11
- - 🔌 **Plugin skill-router** — skills under `~/.afk/plugins/*/skills/` auto-exposed as slash commands
12
- - 🏠 **AFK-scoped config** — `~/.afk/` independent of `~/.claude/`, with `afk plugin install/update/list/remove`
9
+ - 🚀 **Provider abstraction** — Anthropic (direct) and OpenAI Codex; native subagents, hooks, elicitations, cost guardrails
10
+ - 🧩 **Four built-in orchestration skills** — `/mint`, `/diagnose`, `/forge`, `/audit-fit`, each dispatched as an isolated subagent
11
+ - 🧠 **Cross-session memory** — `memory_search`, `memory_update`, `procedure_write` tools backed by SQLite + `HOT.md`
12
+ - 🕸️ **DAG-composed parallel waves** — built-in `compose` tool runs subagent nodes with dependency edges and fail-fast
13
+ - ⏱️ **Background tasks** — Ctrl+B detaches the current turn; `/bg`, `/tasks`, `/attach` manage long-running work
14
+ - 📲 **`send_telegram` built-in tool** — agents can push terminal-state notifications to the operator
15
+ - 🔌 **Plugin & marketplace install** — `afk plugin install` / `afk marketplace add` keep everything under `~/.afk/`
16
+ - 🏠 **AFK-scoped config** — `~/.afk/` independent of `~/.claude/`, with sessions, plugins, agents, commands, skills
13
17
  - 💬 **Three surfaces** — interactive REPL, daemon, Telegram bot sharing one session manager
14
18
  - 📊 **Routing telemetry** — every subagent dispatch appended to `~/.afk/agent-framework/routing-decisions.jsonl`
15
- - 🤖 **Multiple Claude models** — Opus, Sonnet, Haiku
19
+ - 🤖 **Multiple models** — Opus, Sonnet, Haiku (Anthropic); GPT-5 family via Codex
20
+ - 🧠 **Extended thinking on by default** — controllable via `AFK_THINKING` / `--thinking`
16
21
  - 🔓 **Bypass permissions mode** — no prompts, fully automated tool execution
17
22
  - 🛡️ **Type-safe** — TypeScript strict mode
18
23
 
@@ -35,7 +40,7 @@
35
40
 
36
41
  ### Prerequisites
37
42
 
38
- - **Node.js ≥ 18.0.0**
43
+ - **Node.js ≥ 20.0.0** (enforced by `package.json#engines`)
39
44
  - **pnpm** — this project's lockfile is pnpm-specific. Running `npm install` will desync it.
40
45
  - Fastest path: `corepack enable` (bundled with Node ≥ 16.9), then use `pnpm` directly.
41
46
  - Or install globally: `npm install -g pnpm@latest`.
@@ -90,17 +95,21 @@ pnpm build
90
95
 
91
96
  ### CLI Commands
92
97
 
93
- The `afk` CLI exposes seven top-level commands registered in `src/cli/index.ts`:
98
+ The `afk` CLI exposes eleven top-level commands registered in `src/cli/index.ts`:
94
99
 
95
100
  | Command | Purpose |
96
101
  |---|---|
97
102
  | `chat` | Single-turn message |
98
- | `interactive` | REPL with full Agent SDK features |
103
+ | `interactive` | REPL with full Agent SDK features (default when invoked without a subcommand) |
99
104
  | `status` | Connection, API-key, model, bypass-mode status |
100
105
  | `config` | Dump resolved configuration |
101
106
  | `daemon` | Long-running headless agent (see `src/agent/daemon/`) |
102
107
  | `login` | OAuth flow for `console.anthropic.com` |
103
108
  | `plugin` | Manage `~/.afk/plugins/` (install / update / list / remove / enable / disable) |
109
+ | `marketplace` | Add / list / remove plugin marketplaces under `~/.afk/marketplaces/` |
110
+ | `doctor` | Environment self-check (Node version, API keys, paths, config) |
111
+ | `completion` | Print a shell completion script (`zsh`, `bash`, `fish`) |
112
+ | `telegram` | Manage the Telegram bot daemon (start, stop, status, logs, setup) |
104
113
 
105
114
  #### Chat (single message)
106
115
 
@@ -200,32 +209,32 @@ pnpm telegram:restart
200
209
 
201
210
  ## Orchestration Skills
202
211
 
203
- agent-afk ships seven built-in subagent orchestrators plus `devils-advocate` (in development). These are **skill-router-dispatched**: typing `/mint add dark mode` in the REPL parses the slash form, resolves it to a skill handler under `src/skills/<name>/`, and dispatches a fresh subagent via `SubagentManager.forkSubagent()`. Every dispatch is logged to `~/.afk/agent-framework/routing-decisions.jsonl`.
212
+ agent-afk ships four built-in subagent orchestrators. These are **built-in skills** exposed through the slash registry: typing `/mint add dark mode` in the REPL parses the slash form, resolves it to a TypeScript handler under `src/skills/<name>/index.ts`, and dispatches a fresh subagent via `SubagentManager.forkSubagent()`. Every dispatch is logged to `~/.afk/agent-framework/routing-decisions.jsonl`.
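The dispatch path described above (parse the slash form, resolve a handler, hand off to a subagent) could be sketched roughly as follows. This is an illustration only: `parseSlash`, `SkillHandler`, `registry`, and `dispatch` are hypothetical names, and the stand-in handlers return a string instead of calling `SubagentManager.forkSubagent()`, whose signature is not shown in this README.

```typescript
// Hypothetical sketch: names and shapes below are illustrative, not agent-afk's API.
type SkillHandler = (args: string) => string;

interface ParsedSlash {
  name: string; // skill name without the leading slash
  args: string; // remainder of the line, trimmed
}

// Split "/mint add dark mode" into { name: "mint", args: "add dark mode" }.
// Returns null for input that is not a slash command.
function parseSlash(line: string): ParsedSlash | null {
  const trimmed = line.trim();
  if (!trimmed.startsWith("/")) return null;
  const space = trimmed.search(/\s/);
  const name = space === -1 ? trimmed.slice(1) : trimmed.slice(1, space);
  if (name.length === 0) return null;
  const args = space === -1 ? "" : trimmed.slice(space).trim();
  return { name, args };
}

// Stand-in registry: each handler only describes what it would dispatch.
const registry = new Map<string, SkillHandler>([
  ["mint", (args) => `mint subagent <- ${args}`],
  ["diagnose", (args) => `diagnose subagent <- ${args}`],
]);

// Resolve the parsed name against the registry and invoke the handler.
function dispatch(line: string): string {
  const parsed = parseSlash(line);
  if (!parsed) return "not a slash command";
  const handler = registry.get(parsed.name);
  if (!handler) return `unknown skill: /${parsed.name}`;
  return handler(parsed.args);
}
```

Keeping parsing separate from lookup means the same parser can serve built-in and plugin-discovered skills; only the registry contents differ.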
213
+
214
+ The canonical list lives in `src/skills/all.ts`:
204
215
 
205
216
  | Skill | Purpose |
206
217
  |---|---|
207
218
  | `/mint` | End-to-end feature/refactor pipeline: spec → research → plan → parallelize → build → verify → heal → ship |
208
219
  | `/diagnose` | Parallel hypothesis generation + validation for bugs and failing tests |
209
- | `/shadow-verify` | Adversarial re-derivation of sub-agent claims before you act on them |
210
- | `/forge` | Generate new skills autonomously, gated by L1 capability evals |
211
- | `/parallelize` | Transform a linear plan into dependency-aware parallel waves |
212
- | `/forge-gate-check` | Report whether `/forge` is thawed; rerun the L1 eval harness |
213
- | `/forge-l2-eval` | Run L2 capability evals (live sub-agent verdict probes) |
220
+ | `/forge` | Generate new skills autonomously, gated by L1/L2 capability evals (gate-check is inlined; no separate `/forge-gate-check` skill) |
221
+ | `/audit-fit` | Audit `~/.afk` artifacts (skills, commands, agents, hooks) for correct type categorization |
214
222
 
215
- The same seven skills ship in two surfaces:
223
+ Skills surface in two shapes:
216
224
 
217
- - **CLI surface** (this repo) — TypeScript handlers under `src/skills/<name>/` invoked via `skill-router`.
218
- - **Plugin surface** — prompt-based `SKILL.md` files under `agent-framework-private/skills/<name>/`, invoked inside a Claude Code session.
225
+ - **Built-in (this repo)** — TypeScript handlers under `src/skills/<name>/`, registered via `src/skills/all.ts` and bridged into the slash registry by `src/cli/slash/builtin-skills.ts`.
226
+ - **Plugin / user** — `SKILL.md` files discovered under `~/.afk/plugins/<plugin>/skills/<skill>/` or `~/.afk/skills/<skill>/`, scanned at session start and auto-exposed as slash commands.
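To make the two on-disk layouts concrete, here is a small sketch that classifies a path (relative to `~/.afk`) as a plugin skill or a user skill. The function name `classifySkillPath` and the `SkillLocation` shape are assumptions for illustration; the actual scanner may work differently.

```typescript
// Hypothetical sketch: the return shape and function name are illustrative.
type SkillLocation =
  | { scope: "plugin"; plugin: string; skill: string }
  | { scope: "user"; skill: string };

// Classify a path relative to ~/.afk, e.g. "plugins/foo/skills/bar/SKILL.md".
// Returns null when the path matches neither documented layout.
function classifySkillPath(relativePath: string): SkillLocation | null {
  const parts = relativePath.split("/").filter((p) => p.length > 0);
  if (
    parts.length === 5 &&
    parts[0] === "plugins" &&
    parts[2] === "skills" &&
    parts[4] === "SKILL.md"
  ) {
    return { scope: "plugin", plugin: parts[1], skill: parts[3] };
  }
  if (parts.length === 3 && parts[0] === "skills" && parts[2] === "SKILL.md") {
    return { scope: "user", skill: parts[1] };
  }
  return null;
}
```

Under this sketch, both `plugins/foo/skills/bar/SKILL.md` and `skills/bar/SKILL.md` would surface as `/bar`; how the real registry resolves such a name collision is not covered here.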
219
227
 
220
- Vendored subagents (`qualify`, `research-agent`, `contract`) live under `src/skills/_agents/` and are kept byte-equal with the upstream copies — drift is caught by `src/skills/_agents/vendored.test.ts`.
228
+ Vendored subagents (`qualify`, `research-agent`, `contract`) live under `src/skills/_agents/` and are kept byte-equal with the upstream copies — drift is caught by tests in `src/skills/_agents/`.
221
229
 
222
- See the workspace-root [`SYSTEM.md`](../SYSTEM.md) for the topology, skill dependency graph, and multi-prompt loading convention.
230
+ See [`AGENTS.md`](AGENTS.md) and [`CONTRIBUTING.md`](CONTRIBUTING.md) for repo conventions; the workspace-root [`SYSTEM.md`](../SYSTEM.md) covers the broader topology when present.
223
231
 
224
232
  ## Scripts
225
233
 
226
234
  ```bash
227
235
  # Build / dev
228
236
  pnpm build # tsc && node scripts/copy-prompts.js (markdown prompts → dist/)
237
+ pnpm build:dist # esbuild bundle into dist/ for release artifacts
229
238
  pnpm dev # tsx watch src/cli/index.ts
230
239
  pnpm start # node dist/cli/index.js
231
240
  pnpm start:chat # shortcut for `chat`
@@ -235,26 +244,27 @@ pnpm clean # rm -rf dist
235
244
 
236
245
  # Testing
237
246
  pnpm test # vitest run (all)
238
- pnpm test:integration # integration tests only
239
- pnpm test:e2e # end-to-end tests only
247
+ pnpm test:integration # vitest run tests/integration (note: dir not yet populated)
248
+ pnpm test:e2e # vitest run tests/e2e (note: dir not yet populated)
240
249
  pnpm test:coverage # with coverage report
241
250
  pnpm test:watch # watch mode
242
251
  pnpm lint # tsc --noEmit (type-check only)
243
252
 
244
253
  # Telegram daemon
245
254
  pnpm telegram # run in foreground
255
+ pnpm telegram:setup # interactive setup wizard (bot token, allowed chat IDs)
246
256
  pnpm telegram:start # background service (launchd/systemd-style wrapper)
247
257
  pnpm telegram:stop
248
258
  pnpm telegram:status
249
259
  pnpm telegram:restart
260
+ pnpm telegram:logs # tail the background-service log
250
261
 
251
- # SDK dependency auditing
252
- pnpm audit:sdk # regenerate docs/sdk-dependency.md snapshot
253
- pnpm audit:sdk:check # CI gate: fails if new SDK symbols appear without a lock update
254
- pnpm audit:sdk:update-lock # update .sdk-dependency.lock.json allowlist
262
+ # Release
263
+ pnpm release # scripts/release.mjs — version bump + publish flow
264
+ pnpm release:dry # dry-run release flow (no git push / no npm publish)
255
265
  ```
256
266
 
257
- agent-afk is the only subrepo in its workspace that imports `@anthropic-ai/claude-agent-sdk` / `@anthropic-ai/sdk` directly. The `audit:sdk*` scripts track that surface mechanically and fail CI on unauthorized drift. See [`docs/sdk-dependency.md`](docs/sdk-dependency.md) and [`.sdk-dependency.lock.json`](.sdk-dependency.lock.json). When adding a new SDK import, run `pnpm audit:sdk:update-lock` and edit the lock entry's `reason` with a one-line justification before committing.
267
+ > Note: the `tests/integration/` and `tests/e2e/` directories are not yet populated; the active test suites live under `tests/agent/`, `tests/telegram/`, and colocated `*.test.ts` files in `src/`. The `test:integration` / `test:e2e` scripts are kept as placeholders for the planned split.
258
268
 
259
269
  ## Configuration
260
270
 
@@ -283,12 +293,17 @@ You can delete `~/.claude/` entirely and agent-afk still runs.
283
293
  Create a `.env` file in the project root:
284
294
 
285
295
  ```env
286
- # Required
296
+ # Required (Anthropic provider)
287
297
  ANTHROPIC_API_KEY=sk-ant-api03-...
288
298
 
289
- # Optional
290
- CLAUDE_MODEL=sonnet # opus | sonnet | haiku (default sonnet)
291
- TELEGRAM_BOT_TOKEN=1234567890:ABC...
299
+ # Provider / model selection
300
+ AFK_MODEL=sonnet # opus | sonnet | haiku (Anthropic) or codex (default sonnet)
301
+ AFK_DEFAULT_SUBAGENT_MODEL= # override default subagent model
302
+ AFK_THINKING=on # on | off | <budget-tokens> — extended thinking (on by default)
303
+ AFK_EFFORT= # low | medium | high — reasoning effort (Codex provider)
304
+ AFK_MAX_OUTPUT_TOKENS= # cap on output tokens per turn
305
+ AFK_TEMPERATURE= # numeric override; provider-default if unset
306
+ AFK_TIMEOUT_MS= # per-turn timeout
292
307
 
293
308
  # Cost guardrails (see SDK-native features below)
294
309
  AFK_MAX_BUDGET_USD=5.00
@@ -298,12 +313,32 @@ AFK_TASK_BUDGET=100000
298
313
  AFK_DISABLE_PROMPT_CACHE= # 1 | true | yes | on disables; unset = enabled
299
314
  AFK_PROMPT_CACHE_TTL=1h # 5m | 1h (default 1h)
300
315
 
301
- # Optional: Color output control
302
- # Set NO_COLOR=1 to disable colors (per https://no-color.org)
303
- # Unset or leave empty for auto-detection (disables in CI or piped output)
304
- NO_COLOR=
316
+ # Cross-session memory & system prompt
317
+ AFK_SYSTEM_PROMPT= # raw string; highest-priority override of AFK.md
318
+ AFK_HOME= # override ~/.afk
319
+ AFK_STATE_DIR= # override ~/.afk/state
320
+ AFK_FRAMEWORK_DIR= # override ~/.afk/agent-framework
321
+ AFK_AUTO_ROUTING= # auto-route bare slash inputs to skills
322
+
323
+ # Telegram bot (foreground + background daemon + send_telegram tool)
324
+ TELEGRAM_BOT_TOKEN=1234567890:ABC...
325
+ AFK_TELEGRAM_BOT_TOKEN= # alternative name accepted by setup wizard
326
+ AFK_TELEGRAM_ALLOWED_CHAT_IDS= # comma-separated chat IDs allowed to push to / receive from
327
+ TELEGRAM_DATA_DIR= # override Telegram state dir (defaults under ~/.afk/state)
328
+ TELEGRAM_VERBOSE= # 1 to log per-message details
329
+ AFK_TELEGRAM_TRACE= # 1 to dump raw bridge traffic
330
+
331
+ # Debug / dev
332
+ AFK_DEBUG= # 1 enables verbose logging
333
+ AFK_DEBUG_CLIPBOARD= # debug bracketed-paste / image-paste handling
334
+ AFK_DUMP_PROMPT= # write resolved system prompt to a file
335
+
336
+ # Color output (per https://no-color.org)
337
+ NO_COLOR= # set to disable colors; unset = auto-detect (CI / pipes)
305
338
  ```
306
339
 
340
+ The authoritative list of supported env vars lives in `src/` — search for `process.env.AFK_` or `process.env.TELEGRAM_` for the full surface. `.env.example` mirrors the most common ones.
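As an illustration of the value grammars documented in the block above, the following sketch parses `AFK_THINKING` (`on | off | <budget-tokens>`, on by default) and `AFK_DISABLE_PROMPT_CACHE` (`1 | true | yes | on` disables). The function names and the `ThinkingSetting` type are hypothetical, and rejecting unrecognized values with an error is a choice made for this sketch, not documented behavior.

```typescript
// Hypothetical sketch: the ThinkingSetting shape and function names are assumptions.
type ThinkingSetting =
  | { mode: "on" }
  | { mode: "off" }
  | { mode: "budget"; budgetTokens: number };

// AFK_THINKING accepts on | off | <budget-tokens>; unset or empty means "on".
function parseThinking(raw: string | undefined): ThinkingSetting {
  const value = (raw ?? "").trim().toLowerCase();
  if (value === "" || value === "on") return { mode: "on" };
  if (value === "off") return { mode: "off" };
  if (/^\d+$/.test(value)) {
    const budgetTokens = Number(value);
    if (budgetTokens > 0) return { mode: "budget", budgetTokens };
  }
  // Assumption of this sketch: anything else is treated as a configuration error.
  throw new Error(`Invalid AFK_THINKING value: ${raw}`);
}

// AFK_DISABLE_PROMPT_CACHE: 1 | true | yes | on disables; any other value leaves caching enabled.
function promptCacheEnabled(raw: string | undefined): boolean {
  const value = (raw ?? "").trim().toLowerCase();
  return !["1", "true", "yes", "on"].includes(value);
}
```

In a Node process these would typically be called as `parseThinking(process.env.AFK_THINKING)` and `promptCacheEnabled(process.env.AFK_DISABLE_PROMPT_CACHE)`.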
341
+
307
342
  ### System Prompt Auto-Discovery
308
343
 
309
344
  agent-afk resolves the session system prompt through a 4-tier precedence chain (highest tier wins):
@@ -317,6 +352,8 @@ agent-afk resolves the session system prompt through a 4-tier precedence chain (
317
352
 
318
353
  **AFK.md format:** Plain Markdown, no frontmatter. The entire file content (trimmed) becomes the system prompt. Empty or whitespace-only files are treated as absent (tier 4 applies instead).
319
354
 
355
+ **Bootstrapping AFK.md:** run `/init` in the REPL to scan the current project and generate a tailored `AFK.md` at the repo root. See `src/cli/slash/commands/init.ts`.
356
+
320
357
  **Provenance tracking:** When using `--dump-prompt`, the `systemPromptSource` field in the dump shows which tier won:
321
358
  - `"env:AFK_SYSTEM_PROMPT"` — tier 1
322
359
  - `"file:/abs/path/afk.config.json"` — tier 2
@@ -335,12 +372,19 @@ Defaults are tuned for `agent-afk`'s long-lived surfaces (daemon, Telegram bot)
335
372
 
336
373
  Markers never leak back into stored history — `cache-policy.ts` clones-and-stamps so the canonical `messages` array stays marker-free across iterations (accumulating markers would break prefix-hash matching). Implementation: [`src/agent/providers/anthropic-direct/cache-policy.ts`](src/agent/providers/anthropic-direct/cache-policy.ts).
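The clone-and-stamp idea can be sketched as below. This is not the code in `cache-policy.ts`: the message types are simplified, `stampForRequest` is a hypothetical name, and placing the marker on the final content block is an assumption about where a cache breakpoint would go. The `cache_control` object follows the shape of Anthropic's prompt-caching marker as commonly documented.

```typescript
// Hypothetical sketch: simplified message types; not the code in cache-policy.ts.
interface CacheControl {
  type: "ephemeral";
  ttl?: "5m" | "1h";
}

interface TextBlock {
  type: "text";
  text: string;
  cache_control?: CacheControl;
}

interface Message {
  role: "user" | "assistant";
  content: TextBlock[];
}

// Return a copy of `messages` whose final content block carries a cache marker.
// The input array and its blocks are never mutated, so stored history stays marker-free.
function stampForRequest(
  messages: readonly Message[],
  ttl: "5m" | "1h" = "1h",
): Message[] {
  const cloned: Message[] = messages.map((m) => ({
    role: m.role,
    content: m.content.map((b) => ({ ...b })),
  }));
  const last = cloned[cloned.length - 1];
  const lastBlock = last?.content[last.content.length - 1];
  if (lastBlock) lastBlock.cache_control = { type: "ephemeral", ttl };
  return cloned;
}
```

Because every request stamps a fresh clone, the canonical array is byte-identical from one turn to the next apart from appended messages, which is what keeps the cached prefix matching.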
337
374
 
338
- ### Supported Models
375
+ ### Supported Models & Providers
376
+
377
+ agent-afk speaks to two providers through a single abstraction (`src/agent/providers/`):
339
378
 
379
+ **Anthropic (direct)** — default. Selects from:
340
380
  - **opus** — most capable, for complex tasks
341
381
  - **sonnet** — balanced performance and speed (default)
342
382
  - **haiku** — fastest, best for simple tasks
343
383
 
384
+ **OpenAI Codex** (`@openai/codex-sdk`) — set `AFK_MODEL=codex` (or pass `--model codex`). Implementation lives in `src/agent/providers/openai-codex.ts`. Tune reasoning effort via `AFK_EFFORT=low|medium|high`.
385
+
386
+ Per-session overrides: `--model <name>`, `--thinking <on|off|N>`, `--max-output-tokens <n>`, `--temperature <n>`.
387
+
344
388
  ## Plugins & Slash Commands
345
389
 
346
390
  ### Installing plugins
@@ -375,26 +419,66 @@ Advanced: AFK still auto-discovers any plugin dropped into `~/.afk/plugins/<name
375
419
 
376
420
  Plugin state (telemetry, ledger, briefs) writes to `~/.afk/agent-framework/` in the AFK runtime.
377
421
 
378
- ### Plugin skills as slash commands
422
+ ### REPL slash commands
379
423
 
380
- Every skill loaded from `~/.afk/plugins/<plugin>/skills/*/SKILL.md` is exposed automatically in the interactive REPL as `/<skill-name>`. There is no per-skill handler code — agent-afk asks the SDK for its skill catalog at session start (`session.supportedCommands()`) and registers a passthrough handler per entry. Typing `/mint add dark mode` pipes the raw line into the SDK turn loop; the subprocess parses the slash form natively and dispatches to the plugin's skill, exactly the way Claude Code does it.
424
+ The interactive REPL registers slash commands directly in TypeScript (`src/cli/slash/`); they don't pass through to any external Claude Code subprocess. Categories:
381
425
 
426
+ **Core / session control**
382
427
  - `/help` — list all available slash commands (built-in + plugin-loaded)
383
- - `/skills` — discover skills loaded from plugins
384
- - `/reload-plugins` — reload after editing SKILL.md files on disk
385
-
386
- Implementation lives in `src/cli/slash/plugin-skills.ts`; see the module header for the flow.
387
-
388
- ### SDK-native features surfaced in the REPL
389
-
390
- agent-afk wires several Claude Agent SDK capabilities that Claude Code exposes natively, so they feel the same here:
391
-
392
- - **`/agents`** — lists Task-tool subagents loaded by the SDK (plugin + user + project scope). Agents are not user-invokable slashes; they're dispatch targets the model picks via the Task tool. The list shows name, description, and model override when present. Refresh with `/reload-plugins` after editing `~/.afk/agents/` or plugin agent definitions.
393
- - **`/tokens`** — renders the authoritative SDK breakdown of context usage: total vs model max, auto-compact threshold, top categories, system tools, MCP tools, agents, skills, slash commands, and the last-turn API usage. Falls back to local-stats aggregation when the SDK call can't be served (e.g., before the subprocess is warm).
394
- - **Status-line context %** — sampled every 3 turns from `session.getContextUsage()`, cached between samples, degrades gracefully on transient failures. See `src/cli/context-sampler.ts`.
395
- - **Progress banners** — when the SDK emits `task_progress` events (long subagent runs, multi-tool flows), they render inline as `◦ description (stats)` plus an indented summary when present. Telegram forwards the same lines with the existing edit-throttle, and prompt suggestions trail as `💡` lines below the response. Enabled by default via `agentProgressSummaries: true`.
396
- - **Cost guardrails** — pass `--max-budget-usd <n>` (or set `AFK_MAX_BUDGET_USD`) to abort the session cleanly on cost breach. `--task-budget <tokens>` (or `AFK_TASK_BUDGET`) is an advisory per-task token hint surfaced to the model so it can pace itself. Both work across `afk interactive`, `afk chat`, and the Telegram bot.
397
- - **MCP elicitations** — when an MCP server requests OAuth consent (e.g. Supabase re-auth), the REPL prints the server name, message, and URL, then asks `Continue? [y/N]`. Empty answer cancels; `n` declines; `y` accepts. Form-mode elicitations are auto-declined in v1 (tracked in `todo.md`). Handler is installed via `elicitationRouter.install(...)`; bridges can install their own.
428
+ - `/exit`, `/quit` — leave the REPL
429
+ - `/clear` — clear screen
430
+ - `/compact` — manually compact conversation history
431
+ - `/reset` — start a fresh session, discarding history
432
+
433
+ **Information**
434
+ - `/cost` — running cost for the session
435
+ - `/tokens` (alias `/ctx`) — SDK breakdown of context usage
436
+ - `/history` — print prior turns
437
+ - `/model` — show or switch active model
438
+ - `/tools` — list registered tools
439
+ - `/mcp` — show MCP server status
440
+ - `/limits` — show rate-limit / budget state
441
+ - `/debug` — toggle verbose debug output
442
+
443
+ **Planning & state**
444
+ - `/plan` — open the plan editor
445
+ - `/todo` — manage the persistent todo list
446
+ - `/save` — snapshot session state to disk
447
+ - `/resume` — resume a saved session
448
+ - `/init` — scan the current project and write `AFK.md`
449
+ - `/changelog` — render `CHANGELOG.md` paginated
450
+
451
+ **Background tasks** (Ctrl+B detaches the current turn)
452
+ - `/bg` — list backgrounded tasks
453
+ - `/tasks` — show running/queued tasks with status
454
+ - `/attach <id>` — re-attach to a backgrounded task
455
+
456
+ **Skills (built-in)** — see [Orchestration Skills](#orchestration-skills)
457
+ - `/mint`, `/diagnose`, `/forge`, `/audit-fit`
458
+
459
+ **Plugins / marketplaces**
460
+ - `/skills` (alias `/builtin-skills`) — discover skills loaded from plugins & user scope
461
+ - `/agents` — list Task-tool subagents loaded by the SDK
462
+ - `/reload-plugins` — re-scan plugin and user directories after edits
463
+
464
+ Implementation: `src/cli/slash/index.ts` (`registerAll()`), individual command modules under `src/cli/slash/commands/`. Plugin-discovered skills (`~/.afk/plugins/<plugin>/skills/<skill>/SKILL.md` and `~/.afk/skills/<skill>/SKILL.md`) are registered via `src/cli/slash/builtin-skills.ts` and `src/cli/slash/plugin-skills.ts`.
465
+
466
+ ### Runtime features surfaced in the REPL
467
+
468
+ agent-afk wires several capabilities on top of the provider abstraction:
469
+
470
+ - **Cross-session memory** — three built-in tools (`memory_search`, `memory_update`, `procedure_write`) backed by SQLite at `~/.afk/agent-framework/memory/`. `HOT.md` is injected into every future session's system prompt for durable essentials. See `src/agent/memory/` and `src/agent/tools/handlers/memory-*.ts`.
471
+ - **`compose` tool — DAG-based orchestration** — agents (and the main session) can dispatch up to 20 subagent nodes with explicit dependency edges. Independent nodes run in parallel; dependent nodes wait. Fail-fast cancels downstream nodes by default. See `src/agent/tools/compose-executor.ts` and `src/agent/dag.ts`.
472
+ - **Background tasks** — Ctrl+B in the REPL detaches the current turn into a tracked background task. `/bg` lists tasks, `/tasks` shows status, `/attach <id>` re-attaches. Status bar at the bottom of the REPL surfaces running task counts. Implementation: `src/cli/background-status-bar.ts`, `src/cli/commands/interactive/background.js`.
473
+ - **`send_telegram` built-in tool** — agents can push terminal-state notifications to the operator. Recipients are gated by `AFK_TELEGRAM_ALLOWED_CHAT_IDS`; safe to attempt unconditionally (returns an error if Telegram is unconfigured). Handler: `src/agent/tools/handlers/send-telegram.ts`.
474
+ - **Extended thinking on by default** — Anthropic's thinking budget is auto-enabled. Override per-session with `--thinking on|off|<budget-tokens>` or globally with `AFK_THINKING`.
475
+ - **`/tokens`** — authoritative breakdown of context usage: total vs model max, auto-compact threshold, top categories, system tools, MCP tools, agents, skills, slash commands, and the last-turn API usage.
476
+ - **Status-line context %** — sampled every few turns from `session.getContextUsage()`, cached between samples, degrades gracefully on transient failures. See `src/cli/context-sampler.ts`.
477
+ - **Progress banners** — when the provider emits `task_progress` events (long subagent runs, multi-tool flows), they render inline as `◦ description (stats)` with an indented summary when present. Telegram forwards the same lines with edit-throttling, and prompt suggestions trail as `💡` lines below the response.
478
+ - **Cost guardrails** — `--max-budget-usd <n>` / `AFK_MAX_BUDGET_USD` aborts on cost breach. `--task-budget <tokens>` / `AFK_TASK_BUDGET` is an advisory per-task hint surfaced to the model.
479
+ - **MCP elicitations** — when an MCP server requests OAuth consent (e.g. Supabase re-auth), the REPL prints the server name, message, and URL, then asks `Continue? [y/N]`. Empty cancels; `n` declines; `y` accepts. Form-mode elicitations are auto-declined in v1. Handler: `src/agent/elicitation-router.ts`.
480
+ - **Clipboard image paste** — paste images directly into the REPL (macOS pasteboard; bracketed-paste-aware). See `src/cli/input/clipboard-image.ts`.
481
+ - **Auto-update check** — startup checks for a newer published version and prints a notice. Suppress with `afk --no-update-check`. Policy field `updatePolicy` (`notify`|`auto`|`off`) lives in `afk.config.json`. Implementation: `src/cli/update-checker.ts`.
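The `compose` bullet above describes nodes with dependency edges, where independent nodes run in parallel and dependent nodes wait. One way to picture that scheduling is to group nodes into waves, as in the sketch below. The `DagNode` shape and `planWaves` are hypothetical and do not reflect the tool's actual schema or the code in `src/agent/dag.ts`; the sketch covers wave planning only, not execution or fail-fast cancellation.

```typescript
// Hypothetical sketch: node shape and function name are assumptions, not compose's schema.
interface DagNode {
  id: string;
  dependsOn: string[];
}

// Group nodes into waves: every node in a wave depends only on nodes from earlier
// waves, so the members of one wave can run in parallel.
// Throws on unknown dependencies or cycles.
function planWaves(nodes: DagNode[]): string[][] {
  const ids = new Set(nodes.map((n) => n.id));
  for (const node of nodes) {
    for (const dep of node.dependsOn) {
      if (!ids.has(dep)) throw new Error(`Unknown dependency: ${dep}`);
    }
  }

  const done = new Set<string>();
  const waves: string[][] = [];
  let remaining = nodes;
  while (remaining.length > 0) {
    // A node is ready once all of its dependencies have been scheduled.
    const ready = remaining.filter((n) => n.dependsOn.every((d) => done.has(d)));
    if (ready.length === 0) throw new Error("Cycle detected");
    waves.push(ready.map((n) => n.id));
    for (const n of ready) done.add(n.id);
    remaining = remaining.filter((n) => !done.has(n.id));
  }
  return waves;
}
```

An executor built on this plan could run each wave with `Promise.all` and stop scheduling later waves after the first rejection, which would approximate the fail-fast default mentioned above.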
398
482
 
399
483
  ## Bypass Permissions Mode
400
484
 
@@ -457,7 +541,17 @@ agent-afk/
457
541
  ├── src/
458
542
  │ ├── cli/
459
543
  │ │ ├── index.ts # CLI entry (commander)
460
- │ │ └── commands/ # chat, interactive, status, config, daemon, login, plugin, skill-router
544
+ │ │ ├── commands/ # chat, interactive, status, config, daemon,
545
+ │ │ │ # login, plugin, marketplace, doctor,
546
+ │ │ │ # completion, telegram, etc.
547
+ │ │ ├── slash/ # REPL slash registry + commands/
548
+ │ │ │ # (help, plan, todo, bg, tasks, attach,
549
+ │ │ │ # init, changelog, builtin-skills, …)
550
+ │ │ ├── input/ # raw-mode, bracketed paste, clipboard images
551
+ │ │ ├── background-status-bar.ts
552
+ │ │ ├── context-sampler.ts
553
+ │ │ ├── update-checker.ts
554
+ │ │ └── config.ts, shared-helpers.ts
461
555
  │ ├── agent/
462
556
  │ │ ├── session.ts # AgentSession barrel
463
557
  │ │ ├── session/ # agent-session, query-options, …
@@ -466,8 +560,13 @@ agent-afk/
466
560
  │ │ ├── subagent-hooks.ts
467
561
  │ │ ├── routing-telemetry.ts # appends routing-decisions.jsonl
468
562
  │ │ ├── daemon/ # long-running headless agent
469
- │ │ ├── plugins/ # afk plugin install / update / remove / index-store
470
- │ │ ├── providers/ # provider abstraction
563
+ │ │ ├── plugins/ # afk plugin install / update / remove
564
+ │ │ ├── marketplaces/ # marketplace install / resolve / manifest
565
+ │ │ ├── providers/ # anthropic-direct, openai-codex
566
+ │ │ ├── memory/ # cross-session memory + HOT.md loader
567
+ │ │ ├── tools/ # built-in tool dispatcher + handlers
568
+ │ │ │ # (compose, subagent, skills, memory_*,
569
+ │ │ │ # send_telegram, …)
471
570
  │ │ ├── elicitation-router.ts
472
571
  │ │ ├── hook-registry.ts, hooks.ts, default-hook-registry.ts
473
572
  │ │ ├── permissions.ts, abort-graph.ts, dag.ts, message-queue.ts
@@ -475,33 +574,42 @@ agent-afk/
475
574
  │ │ ├── shadow-verify-nudge.ts
476
575
  │ │ └── types.ts, types/
477
576
  │ ├── skills/
577
+ │ │ ├── all.ts # canonical skill registry
478
578
  │ │ ├── mint/ # /mint
479
579
  │ │ ├── diagnose/ # /diagnose
480
- │ │ ├── shadow-verify/ # /shadow-verify
481
- │ │ ├── forge/ # /forge
482
- │ │ ├── parallelize/ # /parallelize
483
- │ │ ├── forge-gate-check/ # /forge-gate-check
484
- │ │ ├── forge-l2-eval/ # /forge-l2-eval
485
- │ │ ├── devils-advocate/ # in development
486
- │ │ ├── _agents/ # vendored subagents (qualify, research-agent, contract)
580
+ │ │ ├── forge/ # /forge (gate-check inlined)
581
+ │ │ ├── audit-fit/ # /audit-fit
582
+ │ │ ├── _agents/ # vendored subagents (qualify,
583
+ │ │ │ # research-agent, contract)
487
584
  │ │ ├── _lib/ # prompt-loader, shared helpers
488
585
  │ │ ├── example-template/ # scaffold for new skills
489
- │ │ └── index.ts
490
- │ ├── telegram/ # telegram bridge
586
+ │ │ └── user-skills.ts # lazy scan of ~/.afk/skills + project skills
587
+ │ ├── telemetry/ # shared telemetry schemas
588
+ │ ├── telegram/ # telegram bridge (setup wizard, push, etc.)
491
589
  │ ├── telegram.ts # telegram bot entry
492
590
  │ ├── utils/
493
591
  │ ├── paths.ts
494
592
  │ └── index.ts
495
593
  ├── tests/
496
- │ ├── integration/
497
- │ └── e2e/
594
+ │ ├── agent/ # cross-cutting integration suites
595
+ │ └── telegram/ # telegram bridge tests
498
596
  ├── scripts/
499
597
  │ ├── copy-prompts.js # bundles src/**/*.md into dist/ after tsc
500
- │ ├── audit-sdk-dependency.ts
501
- │ └── telegram-manager.sh
502
- ├── docs/
503
- │ └── sdk-dependency.md # committed SDK symbol snapshot
504
- ├── .sdk-dependency.lock.json # SDK symbol allowlist (CI-gated)
598
+ │ ├── build-dist.mjs # esbuild release bundle
599
+ │ ├── release.mjs # version bump + publish flow
600
+ │ ├── generate-changelog.mjs
601
+ │ ├── audit-sdk-dependency.ts # (not yet wired into package.json)
602
+ │ ├── colocate-tests.mjs
603
+ │ ├── telegram-manager.sh
604
+ │ └── verify-install.sh
605
+ ├── docs/ # design notes, audits, failure geometry
606
+ ├── landing/ # marketing site assets
607
+ ├── AGENTS.md
608
+ ├── CHANGELOG.md
609
+ ├── CLAUDE.md
610
+ ├── CONTRIBUTING.md
611
+ ├── afk.config.json.example
612
+ ├── verify.sh
505
613
  ├── pnpm-lock.yaml
506
614
  ├── package.json
507
615
  ├── tsconfig.json
@@ -519,26 +627,18 @@ pnpm lint # type-check without emitting
519
627
 
520
628
  `pnpm build` runs `tsc` and then `scripts/copy-prompts.js`, which copies every `src/**/*.md` file into `dist/` at matching relative paths. Skills read their prompts via `readFileSync` at import time, so those markdown files must live next to the compiled `.js` output.
521
629
 
522
- ### SDK Dependency Tracking
523
-
524
- agent-afk is the only subrepo in its workspace that imports from `@anthropic-ai/claude-agent-sdk` or `@anthropic-ai/sdk`. That surface is tracked mechanically:
525
-
526
- - [`docs/sdk-dependency.md`](docs/sdk-dependency.md) — committed snapshot of every tracked symbol, its files, and runtime-vs-type classification.
527
- - [`.sdk-dependency.lock.json`](.sdk-dependency.lock.json) — allowlist with per-symbol `reason` fields. CI fails when a new symbol or kind-change appears without a lock update.
528
- - `~/.afk/agent-framework/sdk-dependency-telemetry.jsonl` — append-only log of symbol-count deltas and SHA over time.
529
-
530
- When adding a new SDK import: run `pnpm audit:sdk:update-lock`, then edit the generated lock entry's `reason` with a one-line justification before committing.
531
-
532
630
  ### Testing
533
631
 
534
632
  ```bash
535
- pnpm test # all
633
+ pnpm test # all (vitest run)
536
634
  pnpm test:coverage # with coverage
537
635
  pnpm test:watch # watch mode
538
- pnpm test:integration # integration only
539
- pnpm test:e2e # e2e only
636
+ pnpm test:integration # vitest run tests/integration (dir not yet populated)
637
+ pnpm test:e2e # vitest run tests/e2e (dir not yet populated)
540
638
  ```
541
639
 
640
+ Tests are colocated as `*.test.ts` next to the implementation under `src/`, plus cross-cutting suites under `tests/agent/` and `tests/telegram/`.
641
+
542
642
  ## Troubleshooting
543
643
 
544
644
  ### API Key Issues
@@ -667,16 +767,15 @@ type SessionState = 'idle' | 'processing' | 'streaming' | 'closed';
667
767
 
668
768
  ## Contributing
669
769
 
670
- Contributions welcome. Standard flow:
770
+ Contributions welcome. See [`CONTRIBUTING.md`](CONTRIBUTING.md) and [`AGENTS.md`](AGENTS.md) for repo conventions. Standard flow:
671
771
 
672
772
  1. Fork the repository
673
773
  2. Create a feature branch
674
- 3. Make your changes
675
- 4. Add tests
676
- 5. Run `pnpm test && pnpm lint`
677
- 6. Open a pull request
774
+ 3. Make your changes (add or update tests alongside)
775
+ 4. Run `pnpm test && pnpm lint`
776
+ 5. Open a pull request
678
777
 
679
- New orchestration skills, CI gate changes, and ceiling-ledger conventions are documented in the workspace-root [`SYSTEM.md`](../SYSTEM.md).
778
+ New orchestration skills, CI gate changes, and ceiling-ledger conventions are documented in [`AGENTS.md`](AGENTS.md). A change log is maintained in [`CHANGELOG.md`](CHANGELOG.md) (also viewable in-REPL via `/changelog`).
680
779
 
681
780
  ## License
682
781
 
@@ -684,8 +783,10 @@ MIT © Griffin Long
684
783
 
685
784
  ## Acknowledgments
686
785
 
687
- - Built with [@anthropic-ai/claude-agent-sdk](https://www.npmjs.com/package/@anthropic-ai/claude-agent-sdk)
786
+ - Anthropic API client: [@anthropic-ai/sdk](https://www.npmjs.com/package/@anthropic-ai/sdk)
787
+ - OpenAI Codex client: [@openai/codex-sdk](https://www.npmjs.com/package/@openai/codex-sdk)
688
788
  - CLI framework: [Commander.js](https://github.com/tj/commander.js)
789
+ - Telegram: [Telegraf](https://telegraf.js.org/)
689
790
  - Testing: [Vitest](https://vitest.dev/)
690
791
 
691
792
  ---