talon-agent 1.6.0 → 1.7.0

package/README.md CHANGED
@@ -1,22 +1,28 @@
  # Talon
 
  [![Node.js](https://img.shields.io/badge/node-%3E%3D22-339933?logo=nodedotjs&logoColor=white)](https://nodejs.org)
- [![TypeScript](https://img.shields.io/badge/TypeScript-5.9-3178C6?logo=typescript&logoColor=white)](https://www.typescriptlang.org/)
+ [![TypeScript](https://img.shields.io/badge/TypeScript-6.0-3178C6?logo=typescript&logoColor=white)](https://www.typescriptlang.org/)
  [![Claude](https://img.shields.io/badge/Claude_Agent_SDK-Anthropic-D97706)](https://github.com/anthropics/claude-agent-sdk-typescript)
  [![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
  [![CI](https://github.com/dylanneve1/talon/actions/workflows/ci.yml/badge.svg)](https://github.com/dylanneve1/talon/actions/workflows/ci.yml)
 
- Multi-platform agentic AI harness powered by Claude. Runs on Telegram, Teams, and Terminal with full tool access through MCP.
+ Multi-platform agentic AI harness powered by Claude. Runs on **Telegram**, **Teams**, and **Terminal** with full tool access through MCP.
+
+ ---
 
  ## Features
 
- - **Multi-frontend** — Telegram (Grammy), Teams (Bot Framework), Terminal (readline)
- - **Claude Agent SDK** — streaming responses, extended thinking, 1M context sessions
- - **31 MCP tools** messaging, media, history, search, web, cron jobs, file system
- - **Plugin system** extend with external tool packages (keeps core OSS-clean)
- - **Cron jobs** persistent recurring tasks with full tool access
- - **Pulse** periodic conversation-aware engagement in group chats
- - **Per-chat settings** model, effort level, pulse toggle per conversation
+ | | |
+ |---|---|
+ | **Multi-frontend** | Telegram (Grammy + GramJS userbot), Microsoft Teams (Bot Framework), Terminal with live tool visibility |
+ | **Claude Agent SDK** | Streaming responses, extended thinking, adaptive effort, 1M token context, dynamic model discovery |
+ | **MCP tools** | Messaging, media, history, search, web fetch, cron jobs, stickers, file system, admin controls |
+ | **Plugins** | Hot-reloadable plugin system. Built-in: GitHub, MemPalace, Playwright, Brave Search |
+ | **Background agents** | Heartbeat (periodic maintenance) and Dream (memory consolidation + diary) |
+ | **Per-chat settings** | Model, effort level, and pulse toggle per conversation via inline keyboard |
+ | **Model registry** | Models discovered from the SDK at startup --- new models appear in all pickers automatically |
+
+ ---
 
  ## Quick Start
 
@@ -24,39 +30,134 @@ Multi-platform agentic AI harness powered by Claude. Runs on Telegram, Teams, an
  git clone https://github.com/dylanneve1/talon.git && cd talon
  npm install
 
- # Interactive setup (select frontend, configure tokens)
+ # Interactive setup (select frontend, configure tokens, pick model)
  npx talon setup
 
  # Start
- npx talon start # configured frontend (Telegram/Terminal)
+ npx talon start # configured frontend (daemon mode)
  npx talon chat # terminal chat mode
  ```
 
- Requires [Node.js 22+](https://nodejs.org/) and [Claude Code](https://docs.anthropic.com/en/docs/claude-code) installed and authenticated.
+ **Prerequisites:**
+ - [Node.js 22+](https://nodejs.org/)
+ - [Claude Code](https://docs.anthropic.com/en/docs/claude-code) installed and authenticated (`claude` CLI on PATH)
+
+ ---
 
  ## Architecture
 
  ```
- index.ts (Composition Root)
- ├── core/ Platform-agnostic core
- │ ├── gateway.ts HTTP bridge for MCP tool calls
- ├── dispatcher.ts Query queue + lifecycle
- ├── plugin.ts Plugin loader + registry
- ├── pulse.ts Periodic engagement
- └── cron.ts Persistent scheduled jobs
- ├── backend/
- ├── claude-sdk/ Claude Agent SDK + MCP subprocess
- └── opencode/ OpenCode SDK alternative
- ├── frontend/
- ├── telegram/ Grammy + GramJS userbot
- │ ├── teams/ Bot Framework
- │ └── terminal/ Readline CLI with tool call visibility
- └── storage/ Sessions, history, settings, cron, media
+ index.ts Composition root
+ |
+ +-- core/ Platform-agnostic engine
+ | +-- models.ts Model registry (dynamic SDK discovery)
+ | +-- gateway.ts HTTP bridge for MCP tool calls
+ | +-- dispatcher.ts Per-chat serial, cross-chat parallel execution
+ | +-- plugin.ts Plugin loader, registry, hot-reload
+ | +-- heartbeat.ts Periodic background agent
+ | +-- dream.ts Memory consolidation agent
+ | +-- pulse.ts Conversation-aware group engagement
+ | +-- cron.ts Persistent scheduled jobs
+ | +-- tools/ MCP tool definitions (13 files)
+ |
+ +-- backend/
+ | +-- claude-sdk/ Claude Agent SDK (modular: handler, stream,
+ | | options, state, warm, models, constants)
+ | +-- opencode/ OpenCode SDK alternative backend
+ |
+ +-- frontend/
+ | +-- telegram/ Grammy bot + GramJS userbot (10 files)
+ | +-- teams/ Bot Framework + Graph API
+ | +-- terminal/ Readline CLI with tool call visibility
+ |
+ +-- storage/ Sessions, history, chat settings,
+ | cron jobs, media index, daily logs
+ +-- util/ Config, logging, workspace, paths, time
+ ```
+
+ **Dependency rule:** `core/` imports nothing from `frontend/` or `backend/`. Frontends and backends depend on core types, never on each other.
+
+ ---
+
+ ## Built-in Plugins
+
+ ### GitHub
+
+ GitHub API access via the official GitHub MCP server. Gives the agent access to repositories, issues, PRs, code search, and more.
+
+ **Requirements:** Docker installed and running.
+
+ ```json
+ {
+   "github": {
+     "enabled": true,
+     "token": "ghp_..."
+   }
+ }
+ ```
+
+ The token is optional --- defaults to the output of `gh auth token` if the GitHub CLI is authenticated.
+
+ ### MemPalace
+
+ Structured long-term memory with vector search. The agent can store, search, and retrieve memories semantically. Integrates with Dream mode for automatic memory consolidation and personal diary entries.
+
+ **Requirements:** Python 3.10+ with the `mempalace` package.
+
+ ```bash
+ # Set up a Python environment
+ python -m venv ~/.talon/mempalace-venv
+ ~/.talon/mempalace-venv/bin/pip install mempalace # Unix
+ # or: ~/.talon/mempalace-venv/Scripts/pip install mempalace # Windows
+ ```
+
+ ```json
+ {
+   "mempalace": {
+     "enabled": true,
+     "palacePath": "~/.talon/workspace/palace",
+     "pythonPath": "~/.talon/mempalace-venv/bin/python"
+   }
+ }
  ```
 
- ## Plugin System
+ Both paths are optional --- defaults to `~/.talon/workspace/palace/` and the venv Python respectively.
+
+ ### Playwright
 
- Plugins add MCP tools and gateway actions without modifying core code. SOLID interface only `name` is required, everything else is optional.
+ Headless browser automation via the Playwright MCP server. The agent can browse websites, take screenshots, generate PDFs, fill forms, and scrape content.
+
+ **Requirements:** None --- `@playwright/mcp` is bundled with Talon.
+
+ ```json
+ {
+   "playwright": {
+     "enabled": true,
+     "browser": "chromium",
+     "headless": true
+   }
+ }
+ ```
+
+ Supported browsers: `chromium` (default), `chrome`, `firefox`, `webkit`, `msedge`.
+
+ ### Brave Search
+
+ Web search via the Brave Search MCP server. Replaces the built-in WebSearch/WebFetch tools with higher-quality search results.
+
+ ```json
+ {
+   "braveApiKey": "BSA..."
+ }
+ ```
+
+ Get an API key at [brave.com/search/api](https://brave.com/search/api/).
+
+ ---
+
+ ## Custom Plugins
+
+ Plugins add MCP tools and gateway actions without modifying core code. SOLID interface --- only `name` is required.
 
  ```json
  {
@@ -80,59 +181,92 @@ export default {
  };
  ```
 
+ Plugins support hot-reload via the `reload_plugins` MCP tool --- no restart required.
+
+ ---
+
  ## CLI
 
  ```
- talon setup Interactive setup wizard (multi-select frontends)
- talon start Start the configured frontend
+ talon setup Interactive setup wizard
+ talon start Start as a background daemon
+ talon stop Stop the daemon
  talon chat Terminal chat mode (always available)
- talon status Health, sessions, and plugin status
- talon config View/edit configuration
+ talon status Health, sessions, plugins, disk usage
+ talon config View or edit configuration
  talon logs Tail structured log file
- talon doctor Validate environment
+ talon doctor Validate environment and dependencies
  ```
 
+ ---
+
  ## Configuration
 
- `workspace/talon.json`:
+ Config file: `~/.talon/config.json`
 
  | Field | Default | Description |
  |-------|---------|-------------|
- | `frontend` | `"telegram"` | `"telegram"`, `"terminal"`, or both |
- | `botToken` | | Telegram bot token (required for Telegram) |
- | `model` | `"claude-sonnet-4-6"` | Default model |
- | `concurrency` | `1` | Max concurrent AI queries |
+ | `frontend` | `"telegram"` | `"telegram"`, `"terminal"`, `"teams"`, or an array |
+ | `backend` | `"claude"` | `"claude"` or `"opencode"` |
+ | `botToken` | --- | Telegram bot token |
+ | `model` | `"default"` | Default Claude model. Legacy `claude-*` aliases are still accepted. |
+ | `concurrency` | `1` | Max concurrent AI queries (1--20) |
  | `pulse` | `true` | Periodic group engagement |
+ | `heartbeat` | `false` | Background maintenance agent |
+ | `heartbeatIntervalMinutes` | `60` | Heartbeat interval |
+ | `braveApiKey` | --- | Brave Search API key |
+ | `timezone` | --- | IANA timezone (e.g. `"Europe/London"`) |
  | `plugins` | `[]` | External plugin packages |
- | `adminUserId` | | Telegram user ID for /admin |
- | `apiId` / `apiHash` | | Telegram API for full history |
+ | `adminUserId` | --- | Telegram user ID for `/admin` commands |
+ | `allowedUsers` | --- | Whitelist of Telegram user IDs |
+ | `apiId` / `apiHash` | --- | Telegram API credentials for full message history |
+ | `github` | --- | GitHub plugin config (see above) |
+ | `mempalace` | --- | MemPalace plugin config (see above) |
+ | `playwright` | --- | Playwright plugin config (see above) |
+
+ ---
 
  ## Terminal Mode
 
  ```bash
- talon chat # interactive terminal chat
+ npx talon chat
  ```
 
- Tool calls shown in real-time with parameters. Streaming phase indicators (thinking/responding/using tools). Per-turn stats (duration, tokens, cache hit, tool count).
+ Tool calls shown in real-time with parameters. Streaming phase indicators (thinking / responding / using tools). Per-turn stats: duration, tokens, cache hit rate, tool count.
+
+ Commands: `/model`, `/effort`, `/reset`, `/status`, `/help`
+
+ ---
 
  ## Production
 
- - **Docker**: `docker compose up -d`
- - **Systemd**: `talon.service` included
- - **Health**: `GET http://localhost:19876/health` — JSON with uptime, memory, queue, sessions
- - **Logging**: Structured JSON via pino to `workspace/talon.log`
- - **Resilience**: Model fallback, session auto-retry, rate limiting, atomic writes, graceful shutdown
+ **Docker:**
+ ```bash
+ docker compose up -d
+ ```
+
+ **Systemd:** `talon.service` included in the repository.
+
+ **Health endpoint:** `GET http://localhost:19876/health` returns JSON with uptime, memory, queue depth, active sessions, and last activity timestamp.
+
+ **Logging:** Structured JSON via pino to `~/.talon/talon.log`. Rotated on startup when the file exceeds 10MB.
+
+ **Resilience:** Dynamic model fallback on overload, session auto-retry on expiry, rate limit handling with backoff, atomic file writes, graceful shutdown with 15-second drain timeout.
+
+ ---
 
  ## Development
 
  ```bash
  npm run dev # watch mode
- npm test # 322 tests
- npm run test:coverage # with coverage
+ npm test # 1300+ tests
+ npm run test:coverage # with coverage report
  npm run typecheck # tsc --noEmit
  npm run lint # oxlint
  ```
 
+ ---
+
  ## License
 
  MIT
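
Note on the configuration table in the README hunk above: its fields and defaults can be read as a typed shape. The sketch below is a minimal illustration, not Talon's actual loader. `TalonConfigSketch`, `CONFIG_DEFAULTS`, and `normalizeConfig` are hypothetical names, and rejecting an out-of-range `concurrency` with an error is an assumption; only the field names, the defaults, and the 1 to 20 range come from the README.

```typescript
// Hypothetical sketch of the ~/.talon/config.json shape, based only on the README table.
type Frontend = "telegram" | "terminal" | "teams";

interface TalonConfigSketch {
  frontend: Frontend | Frontend[];
  backend: "claude" | "opencode";
  model: string;
  concurrency: number;
  pulse: boolean;
  heartbeat: boolean;
  heartbeatIntervalMinutes: number;
  plugins: unknown[];
  // Fields the table lists without a default.
  botToken?: string;
  braveApiKey?: string;
  timezone?: string;
}

// Defaults copied from the "Default" column of the table.
const CONFIG_DEFAULTS: TalonConfigSketch = {
  frontend: "telegram",
  backend: "claude",
  model: "default",
  concurrency: 1,
  pulse: true,
  heartbeat: false,
  heartbeatIntervalMinutes: 60,
  plugins: [],
};

// Merge user values over the defaults and check the documented concurrency range.
// Throwing on a bad value is an assumption; the real loader may clamp or warn instead.
function normalizeConfig(raw: Partial<TalonConfigSketch>): TalonConfigSketch {
  const merged: TalonConfigSketch = { ...CONFIG_DEFAULTS, ...raw };
  const c = merged.concurrency;
  if (Math.floor(c) !== c || c < 1 || c > 20) {
    throw new RangeError("concurrency must be an integer from 1 to 20");
  }
  return merged;
}
```

For example, `normalizeConfig({ frontend: ["telegram", "teams"], concurrency: 4 })` keeps `model` at `"default"` and `heartbeat` at `false`.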
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "talon-agent",
-   "version": "1.6.0",
+   "version": "1.7.0",
    "description": "Multi-frontend AI agent with full tool access, streaming, cron jobs, and plugin system",
    "author": "Dylan Neve",
    "license": "MIT",
@@ -51,7 +51,7 @@
      "format:check": "prettier --check src/ prompts/"
    },
    "dependencies": {
-     "@anthropic-ai/claude-agent-sdk": "^0.2.104",
+     "@anthropic-ai/claude-agent-sdk": "^0.2.108",
      "@brave/brave-search-mcp-server": "^2.0.75",
      "@clack/prompts": "^1.2.0",
      "@grammyjs/auto-retry": "^2.0.2",
@@ -38,6 +38,12 @@ const { registerClaudeModelsStatic, CLAUDE_MODELS_STATIC } =
    await import("../backend/claude-sdk/models.js");
  registerClaudeModelsStatic(CLAUDE_MODELS_STATIC);
 
+ const SDK_MODEL_IDS = {
+   sonnet: "default",
+   opus: "opus",
+   haiku: "haiku",
+ } as const;
+
  describe("chat-settings", () => {
    describe("getChatSettings", () => {
      it("returns empty object for unknown chat", () => {
@@ -85,62 +91,67 @@ describe("chat-settings", () => {
    });
 
    describe("resolveModelName", () => {
-     it("resolves 'sonnet' to claude-sonnet-4-6", () => {
-       expect(resolveModelName("sonnet")).toBe("claude-sonnet-4-6");
+     it("resolves 'sonnet' to the SDK default model ID", () => {
+       expect(resolveModelName("sonnet")).toBe(SDK_MODEL_IDS.sonnet);
      });
 
-     it("resolves 'opus' to claude-opus-4-6", () => {
-       expect(resolveModelName("opus")).toBe("claude-opus-4-6");
+     it("resolves 'opus' to the SDK Opus model ID", () => {
+       expect(resolveModelName("opus")).toBe(SDK_MODEL_IDS.opus);
      });
 
-     it("resolves 'haiku' to claude-haiku-4-5", () => {
-       expect(resolveModelName("haiku")).toBe("claude-haiku-4-5");
+     it("resolves 'haiku' to the SDK Haiku model ID", () => {
+       expect(resolveModelName("haiku")).toBe(SDK_MODEL_IDS.haiku);
      });
 
      it("resolves versioned aliases", () => {
-       expect(resolveModelName("sonnet-4.6")).toBe("claude-sonnet-4-6");
-       expect(resolveModelName("opus-4.6")).toBe("claude-opus-4-6");
-       expect(resolveModelName("haiku-4.5")).toBe("claude-haiku-4-5");
+       expect(resolveModelName("sonnet-4.6")).toBe(SDK_MODEL_IDS.sonnet);
+       expect(resolveModelName("opus-4.6")).toBe(SDK_MODEL_IDS.opus);
+       expect(resolveModelName("haiku-4.5")).toBe(SDK_MODEL_IDS.haiku);
      });
 
      it("resolves dash-separated aliases", () => {
-       expect(resolveModelName("sonnet-4-6")).toBe("claude-sonnet-4-6");
-       expect(resolveModelName("opus-4-6")).toBe("claude-opus-4-6");
-       expect(resolveModelName("haiku-4-5")).toBe("claude-haiku-4-5");
+       expect(resolveModelName("sonnet-4-6")).toBe(SDK_MODEL_IDS.sonnet);
+       expect(resolveModelName("opus-4-6")).toBe(SDK_MODEL_IDS.opus);
+       expect(resolveModelName("haiku-4-5")).toBe(SDK_MODEL_IDS.haiku);
      });
 
      it("is case-insensitive", () => {
-       expect(resolveModelName("Sonnet")).toBe("claude-sonnet-4-6");
-       expect(resolveModelName("OPUS")).toBe("claude-opus-4-6");
+       expect(resolveModelName("Sonnet")).toBe(SDK_MODEL_IDS.sonnet);
+       expect(resolveModelName("OPUS")).toBe(SDK_MODEL_IDS.opus);
      });
 
      it("trims whitespace", () => {
-       expect(resolveModelName(" sonnet ")).toBe("claude-sonnet-4-6");
+       expect(resolveModelName(" sonnet ")).toBe(SDK_MODEL_IDS.sonnet);
      });
 
      it("passes through unknown model names unchanged", () => {
        expect(resolveModelName("gpt-4")).toBe("gpt-4");
-       expect(resolveModelName("claude-sonnet-4-6")).toBe("claude-sonnet-4-6");
+     });
+
+     it("resolves legacy claude-* aliases to the current SDK IDs", () => {
+       expect(resolveModelName("claude-sonnet-4-6")).toBe(SDK_MODEL_IDS.sonnet);
+       expect(resolveModelName("claude-opus-4-6")).toBe(SDK_MODEL_IDS.opus);
+       expect(resolveModelName("claude-haiku-4-5")).toBe(SDK_MODEL_IDS.haiku);
      });
    });
 
    describe("resolveModelName — exhaustive alias coverage", () => {
      it("resolves all base aliases correctly", () => {
-       expect(resolveModelName("sonnet")).toBe("claude-sonnet-4-6");
-       expect(resolveModelName("opus")).toBe("claude-opus-4-6");
-       expect(resolveModelName("haiku")).toBe("claude-haiku-4-5");
+       expect(resolveModelName("sonnet")).toBe(SDK_MODEL_IDS.sonnet);
+       expect(resolveModelName("opus")).toBe(SDK_MODEL_IDS.opus);
+       expect(resolveModelName("haiku")).toBe(SDK_MODEL_IDS.haiku);
      });
 
      it("resolves all dot-separated version aliases", () => {
-       expect(resolveModelName("sonnet-4.6")).toBe("claude-sonnet-4-6");
-       expect(resolveModelName("opus-4.6")).toBe("claude-opus-4-6");
-       expect(resolveModelName("haiku-4.5")).toBe("claude-haiku-4-5");
+       expect(resolveModelName("sonnet-4.6")).toBe(SDK_MODEL_IDS.sonnet);
+       expect(resolveModelName("opus-4.6")).toBe(SDK_MODEL_IDS.opus);
+       expect(resolveModelName("haiku-4.5")).toBe(SDK_MODEL_IDS.haiku);
      });
 
      it("resolves all dash-separated version aliases", () => {
-       expect(resolveModelName("sonnet-4-6")).toBe("claude-sonnet-4-6");
-       expect(resolveModelName("opus-4-6")).toBe("claude-opus-4-6");
-       expect(resolveModelName("haiku-4-5")).toBe("claude-haiku-4-5");
+       expect(resolveModelName("sonnet-4-6")).toBe(SDK_MODEL_IDS.sonnet);
+       expect(resolveModelName("opus-4-6")).toBe(SDK_MODEL_IDS.opus);
+       expect(resolveModelName("haiku-4-5")).toBe(SDK_MODEL_IDS.haiku);
      });
 
      it("passes through completely unknown model names unchanged", () => {
@@ -149,10 +160,10 @@ describe("chat-settings", () => {
        expect(resolveModelName("mistral-large")).toBe("mistral-large");
      });
 
-     it("passes through full claude model names unchanged (not aliases)", () => {
-       expect(resolveModelName("claude-sonnet-4-6")).toBe("claude-sonnet-4-6");
-       expect(resolveModelName("claude-opus-4-6")).toBe("claude-opus-4-6");
-       expect(resolveModelName("claude-haiku-4-5")).toBe("claude-haiku-4-5");
+     it("maps full claude compatibility aliases to the current SDK IDs", () => {
+       expect(resolveModelName("claude-sonnet-4-6")).toBe(SDK_MODEL_IDS.sonnet);
+       expect(resolveModelName("claude-opus-4-6")).toBe(SDK_MODEL_IDS.opus);
+       expect(resolveModelName("claude-haiku-4-5")).toBe(SDK_MODEL_IDS.haiku);
      });
 
      it("preserves original casing for unknown models", () => {
@@ -171,16 +182,16 @@ describe("chat-settings", () => {
    });
 
    describe("model alias resolution (via registry)", () => {
-     it("resolves short aliases to full model IDs", () => {
-       expect(resolveModelName("sonnet")).toBe("claude-sonnet-4-6");
-       expect(resolveModelName("opus")).toBe("claude-opus-4-6");
-       expect(resolveModelName("haiku")).toBe("claude-haiku-4-5");
+     it("resolves short aliases to SDK model IDs", () => {
+       expect(resolveModelName("sonnet")).toBe(SDK_MODEL_IDS.sonnet);
+       expect(resolveModelName("opus")).toBe(SDK_MODEL_IDS.opus);
+       expect(resolveModelName("haiku")).toBe(SDK_MODEL_IDS.haiku);
      });
 
      it("resolves versioned aliases", () => {
-       expect(resolveModelName("sonnet-4-6")).toBe("claude-sonnet-4-6");
-       expect(resolveModelName("opus-4.6")).toBe("claude-opus-4-6");
-       expect(resolveModelName("haiku-4.5")).toBe("claude-haiku-4-5");
+       expect(resolveModelName("sonnet-4-6")).toBe(SDK_MODEL_IDS.sonnet);
+       expect(resolveModelName("opus-4.6")).toBe(SDK_MODEL_IDS.opus);
+       expect(resolveModelName("haiku-4.5")).toBe(SDK_MODEL_IDS.haiku);
      });
 
      it("passes through unknown names unchanged", () => {
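
Note on the behavior these assertions pin down for `resolveModelName`: lookup is trimmed and case-insensitive; short names, dot and dash versioned forms, and legacy `claude-*` IDs all map to SDK IDs; anything unrecognized comes back unchanged. The sketch below is consistent with those assertions but is not the project's implementation, which reads from the model registry. `ALIAS_SOURCES`, `ALIASES`, and `resolveModelNameSketch` are hypothetical names, and the hardcoded table is an assumption made for illustration.

```typescript
// Hypothetical alias table: [family, legacy version, SDK model ID].
const ALIAS_SOURCES: Array<[string, string, string]> = [
  ["sonnet", "4-6", "default"],
  ["opus", "4-6", "opus"],
  ["haiku", "4-5", "haiku"],
];

// Prototype-free object so keys such as "constructor" never match by accident.
const ALIASES: Record<string, string> = Object.create(null);
for (const [family, version, sdkId] of ALIAS_SOURCES) {
  const dotted = version.replace("-", "."); // "4-6" -> "4.6"
  ALIASES[family] = sdkId; // "sonnet"
  ALIASES[family + "-" + version] = sdkId; // "sonnet-4-6"
  ALIASES[family + "-" + dotted] = sdkId; // "sonnet-4.6"
  ALIASES["claude-" + family + "-" + version] = sdkId; // "claude-sonnet-4-6"
}

// Normalize only for lookup; unknown names are returned exactly as given (an assumption).
function resolveModelNameSketch(input: string): string {
  const key = input.trim().toLowerCase();
  const hit = ALIASES[key];
  return hit !== undefined ? hit : input;
}
```

Normalizing only the lookup key is what lets the sketch be case-insensitive for known aliases while leaving an unknown name untouched.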
@@ -0,0 +1,157 @@
+ import { beforeEach, describe, expect, it, vi } from "vitest";
+
+ const mockSupportedModels = vi.fn();
+
+ vi.mock("@anthropic-ai/claude-agent-sdk", () => ({
+   query: vi.fn(() => ({
+     supportedModels: mockSupportedModels,
+     [Symbol.asyncIterator]() {
+       return {
+         next: async () => ({ done: true, value: undefined }),
+       };
+     },
+   })),
+ }));
+
+ const sdkModels = [
+   {
+     value: "default",
+     displayName: "Default (recommended)",
+     description: "Sonnet 4.6 · Best for everyday tasks",
+   },
+   {
+     value: "sonnet[1m]",
+     displayName: "Sonnet (1M context)",
+     description:
+       "Sonnet 4.6 with 1M context · Billed as extra usage · $3/$15 per Mtok",
+   },
+   {
+     value: "opus",
+     displayName: "Opus",
+     description: "Opus 4.6 · Most capable for complex work",
+   },
+   {
+     value: "opus[1m]",
+     displayName: "Opus (1M context)",
+     description:
+       "Opus 4.6 with 1M context · Billed as extra usage · $5/$25 per Mtok",
+   },
+   {
+     value: "haiku",
+     displayName: "Haiku",
+     description: "Haiku 4.5 · Fastest for quick answers",
+   },
+   {
+     value: "claude-sonnet-4-6",
+     displayName: "Sonnet 4.6",
+     description: "claude-sonnet-4-6",
+   },
+ ];
+
+ describe("registerClaudeModels", () => {
+   beforeEach(async () => {
+     vi.resetModules();
+     vi.clearAllMocks();
+     mockSupportedModels.mockResolvedValue(sdkModels);
+
+     const { clearModels } = await import("../core/models.js");
+     clearModels();
+   });
+
+   it("keeps SDK IDs/display names and maps 1M upgrades explicitly", async () => {
+     const { registerClaudeModels } =
+       await import("../backend/claude-sdk/models.js");
+     const {
+       get1mContextModelId,
+       getModels,
+       resolveModelId,
+       supports1mContext,
+     } = await import("../core/models.js");
+
+     await registerClaudeModels({ model: "default" });
+
+     const anthropicModels = getModels("anthropic");
+     expect(anthropicModels.map((model) => model.id)).toEqual([
+       "opus",
+       "opus[1m]",
+       "default",
+       "sonnet[1m]",
+       "haiku",
+     ]);
+
+     expect(
+       anthropicModels.find((model) => model.id === "default")?.displayName,
+     ).toBe("Default (recommended)");
+     expect(
+       anthropicModels.find((model) => model.id === "sonnet[1m]")?.displayName,
+     ).toBe("Sonnet (1M context)");
+     expect(
+       anthropicModels.some((model) => model.id === "claude-sonnet-4-6"),
+     ).toBe(false);
+
+     expect(resolveModelId("claude-sonnet-4-6")).toBe("default");
+     expect(resolveModelId("claude-sonnet-4-6[1m]")).toBe("sonnet[1m]");
+     expect(resolveModelId("claude-opus-4-6")).toBe("opus");
+
+     expect(get1mContextModelId("default")).toBe("sonnet[1m]");
+     expect(get1mContextModelId("claude-sonnet-4-6")).toBe("sonnet[1m]");
+     expect(get1mContextModelId("opus")).toBe("opus[1m]");
+     expect(get1mContextModelId("haiku")).toBeNull();
+
+     expect(supports1mContext("claude-sonnet-4-6")).toBe(true);
+     expect(supports1mContext("haiku")).toBe(false);
+   });
+
+   it("derives compatibility aliases from SDK metadata instead of hardcoded versions", async () => {
+     mockSupportedModels.mockResolvedValue([
+       {
+         value: "default",
+         displayName: "Default (recommended)",
+         description: "Sonnet 5.0 · Best for everyday tasks",
+       },
+       {
+         value: "sonnet[1m]",
+         displayName: "Sonnet (1M context)",
+         description:
+           "Sonnet 5.0 with 1M context · Billed as extra usage · $3/$15 per Mtok",
+       },
+       {
+         value: "opus",
+         displayName: "Opus",
+         description: "Opus 5.0 · Most capable for complex work",
+       },
+       {
+         value: "opus[1m]",
+         displayName: "Opus (1M context)",
+         description:
+           "Opus 5.0 with 1M context · Billed as extra usage · $5/$25 per Mtok",
+       },
+       {
+         value: "haiku",
+         displayName: "Haiku",
+         description: "Haiku 5.0 · Fastest for quick answers",
+       },
+       {
+         value: "claude-sonnet-5-0",
+         displayName: "Sonnet 5.0",
+         description: "claude-sonnet-5-0",
+       },
+     ]);
+
+     const { registerClaudeModels } =
+       await import("../backend/claude-sdk/models.js");
+     const { get1mContextModelId, resolveModelId } =
+       await import("../core/models.js");
+
+     await registerClaudeModels({ model: "default" });
+
+     expect(resolveModelId("claude-sonnet-5-0")).toBe("default");
+     expect(resolveModelId("claude-sonnet-4-6")).toBe("default");
+     expect(resolveModelId("claude-opus-5-0")).toBe("opus");
+     expect(resolveModelId("claude-opus-4-6")).toBe("opus");
+     expect(resolveModelId("claude-haiku-5-0")).toBe("haiku");
+     expect(resolveModelId("claude-haiku-4-5")).toBe("haiku");
+     expect(get1mContextModelId("claude-sonnet-4-6")).toBe("sonnet[1m]");
+     expect(get1mContextModelId("claude-sonnet-5-0")).toBe("sonnet[1m]");
+   });
+ });
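
Note on the second test in this new file: it expects `claude-<family>-<version>` IDs to resolve by family, whatever version the SDK currently reports, and expects a `[1m]` suffix to select the family's 1M-context variant. One way to satisfy those expectations is sketched below. This is an illustration under stated assumptions, not the code in `core/models.ts`: every name in the block is hypothetical, and reading the family from the first word of `description` is a guess at how the metadata could be used.

```typescript
// Hypothetical sketch: derive legacy alias targets from SDK model metadata.
interface SdkModelInfo {
  value: string;
  displayName: string;
  description: string;
}

type Family = "sonnet" | "opus" | "haiku";

interface FamilyIds {
  baseId: string | null; // e.g. "default" for the Sonnet family
  extendedId: string | null; // e.g. "sonnet[1m]"
}

type FamilyIndex = Record<Family, FamilyIds>;

const FAMILIES: Family[] = ["sonnet", "opus", "haiku"];

// Assumption: the family name is the first word of the description.
function familyOf(model: SdkModelInfo): Family | null {
  const match = /^(sonnet|opus|haiku)\b/i.exec(model.description);
  return match ? (String(match[1]).toLowerCase() as Family) : null;
}

function buildFamilyIndex(models: SdkModelInfo[]): FamilyIndex {
  const index: FamilyIndex = {
    sonnet: { baseId: null, extendedId: null },
    opus: { baseId: null, extendedId: null },
    haiku: { baseId: null, extendedId: null },
  };
  for (const model of models) {
    const family = familyOf(model);
    if (family === null) continue; // raw IDs such as "claude-sonnet-4-6" are skipped
    if (/\[1m\]$/.test(model.value)) index[family].extendedId = model.value;
    else index[family].baseId = model.value;
  }
  return index;
}

// Map "claude-<family>-<version>" (optionally suffixed "[1m]") to the SDK ID for that family.
function resolveModelIdSketch(index: FamilyIndex, input: string): string {
  const match = /^claude-(sonnet|opus|haiku)-[\d-]+(\[1m\])?$/i.exec(input.trim());
  if (!match) return input;
  const ids = index[String(match[1]).toLowerCase() as Family];
  const target = match[2] ? ids.extendedId : ids.baseId;
  return target !== null ? target : input;
}

// Return the 1M-context ID for the input's family, or null when the family has none.
function get1mContextModelIdSketch(index: FamilyIndex, input: string): string | null {
  const resolved = resolveModelIdSketch(index, input);
  for (const family of FAMILIES) {
    const ids = index[family];
    if (resolved === ids.baseId || resolved === ids.extendedId) return ids.extendedId;
  }
  return null;
}
```

Because the version digits are matched but never compared, an index built from "Sonnet 5.0" metadata resolves both `claude-sonnet-5-0` and `claude-sonnet-4-6` to `default`, which is the property the test name describes.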