@arikusi/deepseek-mcp-server 1.7.0 → 2.0.0

package/CHANGELOG.md CHANGED
@@ -7,6 +7,44 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
7
7
 
8
8
  ## [Unreleased]
9
9
 
10
+ ## [2.0.0] - 2026-06-22
11
+
12
+ ### Changed
13
+ - **DeepSeek V4 migration.** `deepseek-v4-flash` (new default) and `deepseek-v4-pro` are now the primary models, both with 1M context and up to 384K output tokens. `deepseek-chat` and `deepseek-reasoner` are kept as backward-compatible aliases that resolve to `deepseek-v4-flash` (chat maps to non-thinking, reasoner to thinking). The DeepSeek API retires those two names on 2026-07-24, so the server translates them before sending the request.
14
+ - **Default model is now `deepseek-v4-flash`** (was `deepseek-chat`). Direct v4 calls default to non-thinking for fast responses; enable reasoning with `thinking: {type: "enabled"}` or the `deepseek-reasoner` alias. The V4 API defaults thinking to enabled, so the server now always sends an explicit thinking flag.
15
+ - V4 pricing per 1M tokens: v4-flash `$0.0028` cache hit / `$0.14` cache miss / `$0.28` output; v4-pro `$0.003625` / `$0.435` / `$0.87`.
16
+ - `max_tokens` upper bound raised to 384000.
17
+ - Model fallback now pairs `deepseek-v4-flash` with `deepseek-v4-pro`.
18
+ - The hosted worker endpoint (`deepseek-mcp.tahirl.com`) was migrated to V4 as well.
19
+
20
+ ### Added
21
+ - `reasoning_effort` parameter (`high` / `max`) for thinking mode.
22
+
23
+ ### Fixed
24
+ - Declared `cost_usd` and `routed_from` in the `deepseek_chat` output schema. Strict MCP clients rejected every response under the SDK 1.29 structured-content validation because these fields were returned but not declared.
25
+
26
+ ### Removed
27
+ - Stopped sending `frequency_penalty` / `presence_penalty`, which the V4 API deprecated and ignores.
28
+
29
+ ## [1.8.0] - 2026-06-14
30
+
31
+ ### Security
32
+ - **Missing authentication on the self-hosted HTTP endpoint.** In HTTP transport mode the server holds your `DEEPSEEK_API_KEY` and uses it for every `deepseek_chat` call, yet `POST /mcp` had no authentication and the server bound to `0.0.0.0`, so any client that could reach the port could initialize a session, enumerate tools, and invoke them. The defaults now bind to loopback and an optional bearer token guards the endpoint. Reported independently; advisory and CVE coordination in progress.
33
+
34
+ ### Changed
35
+ - HTTP transport now binds to `127.0.0.1` by default (configurable via `HTTP_HOST`). The SDK's DNS rebinding protection is active on loopback. Binding to `0.0.0.0` without a token prints a startup security warning.
36
+ - `docker-compose.yml` publishes the port to `127.0.0.1` only, and the README's `docker run` example does the same.
37
+ - **Minimum Node.js is now 20.** Node 18 reached end of life in April 2025 and the test toolchain (vitest 4) no longer runs on it. The published package follows suit (`engines.node` is `>=20.0.0`); CI tests on Node 20, 22, and 24.
38
+
39
+ ### Added
40
+ - `HTTP_AUTH_TOKEN`: when set, `POST/GET/DELETE /mcp` require `Authorization: Bearer <token>` (constant-time comparison). `/health` stays open for probes.
41
+ - `HTTP_ALLOWED_HOSTS`: comma-separated allowed `Host` headers, keeping DNS rebinding protection when binding to `0.0.0.0`.
42
+ - `SECURITY.md` with the disclosure policy and self-hosted HTTP hardening guidance.
43
+ - Auth and host-binding tests (`src/transport-auth.test.ts`).
44
+
45
+ ### Fixed
46
+ - Bumped `@modelcontextprotocol/sdk` to 1.29.0 and `vitest`/`@vitest/coverage-v8` to 4.1.8, clearing all transitive `npm audit` advisories (13 to 0).
47
+
10
48
  ## [1.7.0] - 2026-04-22
11
49
 
12
50
  ### Security
package/README.md CHANGED
@@ -5,7 +5,7 @@
5
5
  <h1 align="center">DeepSeek MCP Server</h1>
6
6
 
7
7
  <p align="center">
8
- MCP server for DeepSeek AI with chat, reasoning, multi-turn sessions, function calling, thinking mode, and cost tracking.
8
+ MCP server for DeepSeek V4 (v4-flash and v4-pro, 1M context) with multi-turn sessions, function calling, thinking mode, and cost tracking.
9
9
  </p>
10
10
 
11
11
  <p align="center">
@@ -13,7 +13,8 @@
13
13
  <a href="https://www.npmjs.com/package/@arikusi/deepseek-mcp-server"><img src="https://img.shields.io/npm/dm/@arikusi/deepseek-mcp-server.svg" alt="npm downloads" /></a>
14
14
  <a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/License-MIT-yellow.svg" alt="License: MIT" /></a>
15
15
  <a href="https://nodejs.org/"><img src="https://img.shields.io/node/v/@arikusi/deepseek-mcp-server.svg" alt="Node.js Version" /></a>
16
- <a href="https://www.typescriptlang.org/"><img src="https://img.shields.io/badge/TypeScript-5.7-blue.svg" alt="TypeScript" /></a>
16
+ <a href="https://www.typescriptlang.org/"><img src="https://img.shields.io/badge/TypeScript-6.0-blue.svg" alt="TypeScript" /></a>
17
+ <a href="https://api-docs.deepseek.com"><img src="https://img.shields.io/badge/DeepSeek-V4-7c3aed.svg" alt="DeepSeek V4" /></a>
17
18
  <a href="https://github.com/arikusi/deepseek-mcp-server/actions"><img src="https://github.com/arikusi/deepseek-mcp-server/workflows/CI/badge.svg" alt="Build Status" /></a>
18
19
  </p>
19
20
 
@@ -34,6 +35,8 @@
34
35
  </a>
35
36
  </p>
36
37
 
38
+ > **v2.0.0 runs on DeepSeek V4.** Two models, `deepseek-v4-flash` (fast and economical) and `deepseek-v4-pro` (top capability), both with a 1M-token context window and optional chain-of-thought thinking. Existing `deepseek-chat` and `deepseek-reasoner` setups keep working as aliases, so upgrading is drop-in.
39
+
37
40
  ## Quick Start
38
41
 
39
42
  ### Remote (No Install)
@@ -84,7 +87,7 @@ gemini mcp add deepseek npx @arikusi/deepseek-mcp-server -e DEEPSEEK_API_KEY=you
84
87
 
85
88
  ## Features
86
89
 
87
- - **DeepSeek V3.2**: Both models now run DeepSeek-V3.2 (since Sept 2025)
90
+ - **DeepSeek V4**: `deepseek-v4-flash` and `deepseek-v4-pro`, both with 1M context and optional chain-of-thought thinking mode
88
91
  - **Multi-Turn Sessions**: Conversation context preserved across requests via `session_id` parameter
89
92
  - **Model Fallback & Circuit Breaker**: Automatic fallback between models with circuit breaker protection against cascading failures
90
93
  - **MCP Resources**: `deepseek://models`, `deepseek://config`, `deepseek://usage` — query model info, config, and usage stats
@@ -100,7 +103,7 @@ gemini mcp add deepseek npx @arikusi/deepseek-mcp-server -e DEEPSEEK_API_KEY=you
100
103
  - **Remote Endpoint**: Hosted at `deepseek-mcp.tahirl.com/mcp` — BYOK (Bring Your Own Key), no install needed
101
104
  - **HTTP Transport**: Self-hosted remote access via Streamable HTTP with `TRANSPORT=http`
102
105
  - **Docker Ready**: Multi-stage Dockerfile with health checks for containerized deployment
103
- - **Tested**: 265 tests, ~89% line coverage
106
+ - **Tested**: 280 tests, ~89% line coverage
104
107
  - **Type-Safe**: Full TypeScript implementation
105
108
  - **MCP Compatible**: Works with any MCP-compatible CLI (Claude Code, Gemini CLI, etc.)
106
109
 
@@ -108,7 +111,7 @@ gemini mcp add deepseek npx @arikusi/deepseek-mcp-server -e DEEPSEEK_API_KEY=you
108
111
 
109
112
  ### Prerequisites
110
113
 
111
- - Node.js 18+
114
+ - Node.js 20+
112
115
  - A DeepSeek API key (get one at [https://platform.deepseek.com](https://platform.deepseek.com))
113
116
 
114
117
  ### Manual Installation
@@ -186,13 +189,14 @@ Chat with DeepSeek AI models with automatic cost tracking and function calling s
186
189
  - `role`: "system" | "user" | "assistant" | "tool"
187
190
  - `content`: Message text
188
191
  - `tool_call_id` (optional): Required for tool role messages
189
- - `model` (optional): "deepseek-chat" (default) or "deepseek-reasoner"
192
+ - `model` (optional): "deepseek-v4-flash" (default) or "deepseek-v4-pro". "deepseek-chat" and "deepseek-reasoner" are accepted as aliases that resolve to v4-flash (non-thinking / thinking).
190
193
  - `temperature` (optional): 0-2, controls randomness (default: 1.0). Ignored when thinking mode is enabled.
191
- - `max_tokens` (optional): Maximum tokens to generate (deepseek-chat: max 8192, deepseek-reasoner: max 65536)
194
+ - `max_tokens` (optional): Maximum tokens to generate (V4 models support up to 384000)
192
195
  - `stream` (optional): Enable streaming mode (default: false)
193
196
  - `tools` (optional): Array of tool definitions for function calling (max 128)
194
197
  - `tool_choice` (optional): "auto" | "none" | "required" | `{type: "function", function: {name: "..."}}`
195
- - `thinking` (optional): Enable thinking mode `{type: "enabled"}`
198
+ - `thinking` (optional): Toggle thinking mode, `{type: "enabled"}` to reason or `{type: "disabled"}` for a fast answer (non-thinking is the default)
199
+ - `reasoning_effort` (optional): "high" (default) or "max", applies only while thinking mode is active
196
200
  - `json_mode` (optional): Enable JSON output mode (supported by both models)
197
201
  - `session_id` (optional): Session ID for multi-turn conversations. Previous context is automatically prepended.
198
202
 
@@ -212,13 +216,13 @@ Chat with DeepSeek AI models with automatic cost tracking and function calling s
212
216
  "content": "Explain the theory of relativity in simple terms"
213
217
  }
214
218
  ],
215
- "model": "deepseek-chat",
219
+ "model": "deepseek-v4-flash",
216
220
  "temperature": 0.7,
217
221
  "max_tokens": 1000
218
222
  }
219
223
  ```
220
224
 
221
- **DeepSeek Reasoner Example:**
225
+ **Reasoning Example (`deepseek-reasoner` alias, routes to v4-flash + thinking):**
222
226
 
223
227
  ```json
224
228
  {
@@ -232,7 +236,22 @@ Chat with DeepSeek AI models with automatic cost tracking and function calling s
232
236
  }
233
237
  ```
234
238
 
235
- The reasoner model will show its thinking process in `<thinking>` tags followed by the final answer.
239
+ Thinking mode returns the chain-of-thought in `<thinking>` tags followed by the final answer.
240
+
241
+ **DeepSeek V4 Pro Example (hardest tasks):**
242
+
243
+ ```json
244
+ {
245
+ "messages": [
246
+ {
247
+ "role": "user",
248
+ "content": "Prove that the square root of 2 is irrational."
249
+ }
250
+ ],
251
+ "model": "deepseek-v4-pro",
252
+ "thinking": { "type": "enabled" }
253
+ }
254
+ ```
236
255
 
237
256
  **Function Calling Example:**
238
257
 
@@ -279,12 +298,12 @@ When the model decides to call a function, the response includes `tool_calls` wi
279
298
  "content": "Analyze the time complexity of quicksort"
280
299
  }
281
300
  ],
282
- "model": "deepseek-chat",
301
+ "model": "deepseek-v4-flash",
283
302
  "thinking": { "type": "enabled" }
284
303
  }
285
304
  ```
286
305
 
287
- When thinking mode is enabled, `temperature`, `top_p`, `frequency_penalty`, and `presence_penalty` are automatically ignored.
306
+ When thinking mode is enabled, `temperature` and `top_p` are automatically ignored.
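The rule above, together with the changelog note that the server always sends an explicit thinking flag, can be sketched roughly as follows. `buildRequestBody` is a hypothetical helper written for illustration and is not taken from the package source; the exact set of fields the server forwards is an assumption.

```javascript
// Hypothetical helper, not the package's real implementation.
// It always emits an explicit thinking flag and leaves out the
// sampling parameters that are ignored while thinking is enabled.
function buildRequestBody(input) {
  const thinkingType = input.thinking?.type === "enabled" ? "enabled" : "disabled";
  const body = {
    model: input.model ?? "deepseek-v4-flash",
    messages: input.messages,
    thinking: { type: thinkingType },
  };
  if (thinkingType === "enabled") {
    // reasoning_effort only applies in thinking mode; "high" is the documented default.
    body.reasoning_effort = input.reasoning_effort ?? "high";
  } else {
    if (input.temperature !== undefined) body.temperature = input.temperature;
    if (input.top_p !== undefined) body.top_p = input.top_p;
  }
  if (input.max_tokens !== undefined) body.max_tokens = input.max_tokens;
  return body;
}
```

Dropping the ignored parameters on the client side is a design choice of this sketch; passing them through and letting the API ignore them would behave the same from the caller's point of view.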
288
307
 
289
308
  **JSON Output Mode Example:**
290
309
 
@@ -296,12 +315,12 @@ When thinking mode is enabled, `temperature`, `top_p`, `frequency_penalty`, and
296
315
  "content": "Return a json object with name, age, and city fields for a sample user"
297
316
  }
298
317
  ],
299
- "model": "deepseek-chat",
318
+ "model": "deepseek-v4-flash",
300
319
  "json_mode": true
301
320
  }
302
321
  ```
303
322
 
304
- JSON mode ensures the model outputs valid JSON. Include the word "json" in your prompt for best results. Supported by both `deepseek-chat` and `deepseek-reasoner`.
323
+ JSON mode ensures the model outputs valid JSON. Include the word "json" in your prompt for best results. Supported by all models.
305
324
 
306
325
  **Multi-Turn Session Example:**
307
326
 
@@ -383,28 +402,27 @@ Each prompt is optimized for the DeepSeek Reasoner model to provide detailed rea
383
402
 
384
403
  ## Models
385
404
 
386
- Both models run **DeepSeek-V3.2** with unified pricing.
405
+ Both V4 models have a 1M-token context window, up to 384K output tokens, and support function calling, JSON mode, and optional chain-of-thought thinking. They are non-thinking by default here for fast responses; enable reasoning with `thinking: {type: "enabled"}` (or the `deepseek-reasoner` alias).
406
+
407
+ ### deepseek-v4-flash (default)
408
+
409
+ - **Best for**: General conversations, coding, content generation, agent loops
410
+ - **Speed**: Fast and economical
411
+ - **Context**: 1M tokens
412
+ - **Max Output**: 384K tokens
413
+ - **Pricing**: $0.0028/1M cache hit, $0.14/1M cache miss, $0.28/1M output
387
414
 
388
- ### deepseek-chat
415
+ ### deepseek-v4-pro
389
416
 
390
- - **Best for**: General conversations, coding, content generation
391
- - **Speed**: Fast
392
- - **Context**: 128K tokens
393
- - **Max Output**: 8K tokens (default 4K)
394
- - **Mode**: Non-thinking (can enable thinking via parameter)
395
- - **Features**: Thinking mode, JSON mode, function calling, FIM completion
396
- - **Pricing**: $0.028/1M cache hit, $0.28/1M cache miss, $0.42/1M output
417
+ - **Best for**: Complex reasoning, math, hard multi-step tasks, top-quality output
418
+ - **Speed**: Slower than flash, highest capability
419
+ - **Context**: 1M tokens
420
+ - **Max Output**: 384K tokens
421
+ - **Pricing**: $0.003625/1M cache hit, $0.435/1M cache miss, $0.87/1M output
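To make the per-1M-token prices concrete, here is a small estimator built from the figures listed above. The function name `estimateCostUsd` and the shape of the `usage` argument are assumptions made for this sketch; the package's own `calculateCost` may be structured differently.

```javascript
// Prices per 1M tokens in USD, copied from the model descriptions above.
const MODEL_PRICING = {
  "deepseek-v4-flash": { cache_hit: 0.0028, cache_miss: 0.14, output: 0.28 },
  "deepseek-v4-pro": { cache_hit: 0.003625, cache_miss: 0.435, output: 0.87 },
};

// Illustrative estimator (hypothetical name and argument shape).
// Unknown models fall back to v4-flash pricing.
function estimateCostUsd(model, usage) {
  const price = MODEL_PRICING[model] ?? MODEL_PRICING["deepseek-v4-flash"];
  const hit = usage.cacheHitTokens ?? 0;
  // Without cache data, treat every remaining input token as a cache miss.
  const miss = usage.cacheMissTokens ?? usage.promptTokens - hit;
  const total =
    hit * price.cache_hit + miss * price.cache_miss + usage.completionTokens * price.output;
  return total / 1_000_000;
}
```

For example, 1M uncached input tokens plus 1M output tokens on `deepseek-v4-flash` comes to $0.14 + $0.28 = $0.42, while the same request on `deepseek-v4-pro` comes to $0.435 + $0.87 = $1.305.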
397
422
 
398
- ### deepseek-reasoner
423
+ ### Compatibility aliases
399
424
 
400
- - **Best for**: Complex reasoning, math, logic problems, multi-step tasks
401
- - **Speed**: Slower (shows thinking process)
402
- - **Context**: 128K tokens
403
- - **Max Output**: 64K tokens (default 32K)
404
- - **Mode**: Thinking (always active, chain-of-thought reasoning)
405
- - **Features**: JSON mode, function calling
406
- - **Output**: Both reasoning process and final answer
407
- - **Pricing**: $0.028/1M cache hit, $0.28/1M cache miss, $0.42/1M output
425
+ `deepseek-chat` and `deepseek-reasoner` are still accepted and resolve to `deepseek-v4-flash` (chat = non-thinking, reasoner = thinking), so existing configs keep working. The DeepSeek API retires those two names on **2026-07-24**; this server translates them to V4 for you.
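The alias translation described above might look something like the sketch below. `MODEL_ALIASES` and `resolveModel` are hypothetical names, and letting an explicit `thinking` argument override the alias default is an assumption of this sketch rather than documented behavior.

```javascript
// Hypothetical alias table mirroring the mapping described above.
const MODEL_ALIASES = new Map([
  ["deepseek-chat", { model: "deepseek-v4-flash", thinking: "disabled" }],
  ["deepseek-reasoner", { model: "deepseek-v4-flash", thinking: "enabled" }],
]);

// Translate a requested model name into the V4 name plus an explicit
// thinking flag. Assumption: an explicit `thinking` argument wins over
// the mode implied by the alias.
function resolveModel(requested, thinking) {
  const alias = MODEL_ALIASES.get(requested);
  const model = alias ? alias.model : requested;
  const implied = alias ? alias.thinking : "disabled";
  const type = thinking?.type ?? implied;
  return { model, thinking: { type } };
}
```

With this shape, `resolveModel("deepseek-reasoner")` yields `deepseek-v4-flash` with thinking enabled, and a direct `deepseek-v4-pro` request stays non-thinking unless the caller asks for reasoning.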
408
426
 
409
427
  ## Configuration
410
428
 
@@ -414,7 +432,7 @@ The server is configured via environment variables. All settings except `DEEPSEE
414
432
  |----------|---------|-------------|
415
433
  | `DEEPSEEK_API_KEY` | (required) | Your DeepSeek API key |
416
434
  | `DEEPSEEK_BASE_URL` | `https://api.deepseek.com` | Custom API endpoint |
417
- | `DEFAULT_MODEL` | `deepseek-chat` | Default model for requests |
435
+ | `DEFAULT_MODEL` | `deepseek-v4-flash` | Default model for requests |
418
436
  | `SHOW_COST_INFO` | `true` | Show cost info in responses |
419
437
  | `REQUEST_TIMEOUT` | `60000` | Request timeout in milliseconds |
420
438
  | `MAX_RETRIES` | `2` | Maximum retry count for failed requests |
@@ -429,6 +447,9 @@ The server is configured via environment variables. All settings except `DEEPSEE
429
447
  | `ENABLE_MULTIMODAL` | `false` | Enable multimodal (image) input support |
430
448
  | `TRANSPORT` | `stdio` | Transport mode: `stdio` or `http` |
431
449
  | `HTTP_PORT` | `3000` | HTTP server port (when TRANSPORT=http) |
450
+ | `HTTP_HOST` | `127.0.0.1` | Bind address for HTTP transport. Loopback by default so a fresh run is not exposed. Set to `0.0.0.0` to accept remote connections (do this only with auth or a proxy in front) |
451
+ | `HTTP_AUTH_TOKEN` | _(unset)_ | When set, `POST`, `GET`, and `DELETE` on `/mcp` require `Authorization: Bearer <token>`. `/health` stays open. Strongly recommended whenever the port is reachable beyond localhost |
452
+ | `HTTP_ALLOWED_HOSTS` | _(unset)_ | Comma-separated list of allowed `Host` headers for DNS rebinding protection when binding to `0.0.0.0` (e.g. `mcp.example.com,localhost`) |
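As a rough illustration of how a comma-separated `HTTP_ALLOWED_HOSTS` value could be applied, the sketch below parses it the same way this release's `loadConfig()` does (split, trim, drop empty entries) and then checks a request's `Host` header against the result. `isHostAllowed` is a hypothetical function; in the package the check is handled by the SDK's DNS rebinding protection, whose exact matching rules (for instance, how ports are treated) may differ.

```javascript
// Same parsing steps as loadConfig(): split on commas, trim, drop empties.
function parseAllowedHosts(raw) {
  return raw
    ? raw.split(",").map((h) => h.trim()).filter(Boolean)
    : undefined;
}

// Hypothetical allow-list check. Assumption: the port is ignored and
// hostnames are compared case-insensitively.
function isHostAllowed(hostHeader, allowedHosts) {
  if (!allowedHosts) return true; // no allow-list configured
  if (!hostHeader) return false;
  const hostname = hostHeader.toLowerCase().replace(/:\d+$/, "");
  return allowedHosts.some((h) => h.toLowerCase() === hostname);
}
```

A request carrying `Host: mcp.example.com:3000` would pass against `mcp.example.com,localhost`, whereas a rebinding attempt that presents an attacker-controlled hostname would not.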
432
453
 
433
454
  **Example with custom config:**
434
455
  ```bash
@@ -559,10 +580,39 @@ curl http://localhost:3000/health
559
580
 
560
581
  The MCP endpoint is available at `POST /mcp` (Streamable HTTP protocol).
561
582
 
583
+ **Securing the endpoint (read before exposing it).** In self-hosted HTTP mode the
584
+ server holds your `DEEPSEEK_API_KEY` and uses it for every `deepseek_chat` call.
585
+ Anyone who can reach `POST /mcp` can invoke tools and spend that key, so the
586
+ endpoint must not sit open on a public interface. The defaults are built around
587
+ this:
588
+
589
+ 1. `HTTP_HOST` defaults to `127.0.0.1`, so a plain run only listens on loopback and the SDK's DNS rebinding protection is active. Nothing off the machine can reach it.
590
+ 2. To accept remote connections, set `HTTP_HOST=0.0.0.0`, but then set `HTTP_AUTH_TOKEN` as well so `/mcp` requires `Authorization: Bearer <token>`. If you bind to `0.0.0.0` without a token, the server prints a loud warning on startup.
591
+ 3. For an internet-facing deployment, put an authenticating reverse proxy with TLS in front and set `HTTP_ALLOWED_HOSTS` to your real hostname(s).
592
+
593
+ ```bash
594
+ # Exposed deployment with a bearer token
595
+ TRANSPORT=http HTTP_HOST=0.0.0.0 HTTP_PORT=3000 \
596
+ HTTP_AUTH_TOKEN=$(openssl rand -hex 32) \
597
+ HTTP_ALLOWED_HOSTS=mcp.example.com \
598
+ DEEPSEEK_API_KEY=your-key node dist/index.js
599
+
600
+ # Calling it
601
+ curl -X POST http://mcp.example.com:3000/mcp \
602
+ -H "Authorization: Bearer YOUR_TOKEN" \
603
+ -H "Content-Type: application/json" \
604
+ -H "Accept: application/json, text/event-stream" \
605
+ -d '{"jsonrpc":"2.0","method":"initialize","params":{"capabilities":{}},"id":1}'
606
+ ```
607
+
608
+ `HTTP_AUTH_TOKEN` is a static gateway token for the self-hosted endpoint and is
609
+ unrelated to your DeepSeek key. It is separate from the hosted BYOK endpoint
610
+ above, where clients pass their own DeepSeek key as the bearer.
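For readers wondering what a constant-time bearer check involves, here is a minimal sketch using Node's built-in `crypto.timingSafeEqual`. The function `isAuthorized` is hypothetical and is not the package's actual middleware; hashing both values first is one common way to satisfy `timingSafeEqual`'s requirement that both buffers have the same length.

```javascript
const { createHash, timingSafeEqual } = require("node:crypto");

// Hypothetical guard for the self-hosted endpoint.
// Both tokens are hashed so the buffers passed to timingSafeEqual
// always have equal length, regardless of what the client sent.
function isAuthorized(authorizationHeader, expectedToken) {
  if (!expectedToken) return true; // HTTP_AUTH_TOKEN unset: no token check
  const match = /^Bearer (.+)$/.exec(authorizationHeader ?? "");
  if (!match) return false;
  const presented = createHash("sha256").update(match[1]).digest();
  const expected = createHash("sha256").update(expectedToken).digest();
  return timingSafeEqual(presented, expected);
}
```

A plain `===` comparison can return as soon as the first differing character is found, which can leak information about the token through response timing; the constant-time comparison avoids that.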
611
+
562
612
  **Session isolation (1.7.0+):** In HTTP transport each connected MCP session
563
613
  gets its own `McpServer` instance and its own `SessionStore`. Conversation
564
614
  history, session listings, and deletions are scoped to the MCP session that
565
- created them one client cannot read, enumerate, or wipe another client's
615
+ created them, so one client cannot read, enumerate, or wipe another client's
566
616
  sessions. STDIO transport is single-tenant by nature and unaffected.
567
617
 
568
618
  ### Docker
@@ -571,14 +621,21 @@ sessions. STDIO transport is single-tenant by nature and unaffected.
571
621
  # Build
572
622
  docker build -t deepseek-mcp-server .
573
623
 
574
- # Run
575
- docker run -d -p 3000:3000 -e DEEPSEEK_API_KEY=your-key deepseek-mcp-server
624
+ # Run, reachable only from the host's loopback, with a bearer token
625
+ docker run -d -p 127.0.0.1:3000:3000 \
626
+ -e DEEPSEEK_API_KEY=your-key \
627
+ -e HTTP_AUTH_TOKEN=your-token \
628
+ deepseek-mcp-server
576
629
 
577
630
  # Or use docker-compose
578
- DEEPSEEK_API_KEY=your-key docker compose up -d
631
+ DEEPSEEK_API_KEY=your-key HTTP_AUTH_TOKEN=your-token docker compose up -d
579
632
  ```
580
633
 
581
- The Docker image defaults to HTTP transport on port 3000 with a built-in health check.
634
+ The image runs HTTP transport on port 3000 with a health check. Inside the
635
+ container it binds `0.0.0.0` (required for the port mapping to work), so control
636
+ exposure at the publish layer: the example above and the bundled
637
+ `docker-compose.yml` publish to `127.0.0.1` only. If you publish the port on a
638
+ public interface, set `HTTP_AUTH_TOKEN`.
582
639
 
583
640
  ## Troubleshooting
584
641
 
package/dist/config.d.ts CHANGED
@@ -24,6 +24,9 @@ declare const ConfigSchema: z.ZodObject<{
24
24
  http: "http";
25
25
  }>>;
26
26
  httpPort: z.ZodDefault<z.ZodNumber>;
27
+ httpHost: z.ZodDefault<z.ZodString>;
28
+ httpAuthToken: z.ZodOptional<z.ZodString>;
29
+ httpAllowedHosts: z.ZodOptional<z.ZodArray<z.ZodString>>;
27
30
  }, z.core.$strip>;
28
31
  export type Config = z.infer<typeof ConfigSchema>;
29
32
  /**
@@ -1 +1 @@
1
- {"version":3,"file":"config.d.ts","sourceRoot":"","sources":["../src/config.ts"],"names":[],"mappings":"AAAA;;;GAGG;AAEH,OAAO,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AAGxB,QAAA,MAAM,YAAY;;;;;;;;;;;;;;;;;;;;;iBAkBhB,CAAC;AAEH,MAAM,MAAM,MAAM,GAAG,CAAC,CAAC,KAAK,CAAC,OAAO,YAAY,CAAC,CAAC;AAIlD;;;;GAIG;AACH,wBAAgB,UAAU,IAAI,MAAM,CAyDnC;AAED;;;GAGG;AACH,wBAAgB,SAAS,IAAI,MAAM,CAKlC;AAED;;GAEG;AACH,wBAAgB,WAAW,IAAI,IAAI,CAElC"}
1
+ {"version":3,"file":"config.d.ts","sourceRoot":"","sources":["../src/config.ts"],"names":[],"mappings":"AAAA;;;GAGG;AAEH,OAAO,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AAGxB,QAAA,MAAM,YAAY;;;;;;;;;;;;;;;;;;;;;;;;iBAqBhB,CAAC;AAEH,MAAM,MAAM,MAAM,GAAG,CAAC,CAAC,KAAK,CAAC,OAAO,YAAY,CAAC,CAAC;AAIlD;;;;GAIG;AACH,wBAAgB,UAAU,IAAI,MAAM,CA8DnC;AAED;;;GAGG;AACH,wBAAgB,SAAS,IAAI,MAAM,CAKlC;AAED;;GAEG;AACH,wBAAgB,WAAW,IAAI,IAAI,CAElC"}
package/dist/config.js CHANGED
@@ -15,13 +15,16 @@ const ConfigSchema = z.object({
15
15
  sessionTtlMinutes: z.number().positive().default(30),
16
16
  maxSessions: z.number().positive().default(100),
17
17
  fallbackEnabled: z.boolean().default(true),
18
- defaultModel: z.string().default('deepseek-chat'),
18
+ defaultModel: z.string().default('deepseek-v4-flash'),
19
19
  circuitBreakerThreshold: z.number().positive().default(5),
20
20
  circuitBreakerResetTimeout: z.number().positive().default(30000),
21
21
  maxSessionMessages: z.number().positive().default(200),
22
22
  enableMultimodal: z.boolean().default(false),
23
23
  transport: z.enum(['stdio', 'http']).default('stdio'),
24
24
  httpPort: z.number().positive().default(3000),
25
+ httpHost: z.string().min(1).default('127.0.0.1'),
26
+ httpAuthToken: z.string().min(1).optional(),
27
+ httpAllowedHosts: z.array(z.string().min(1)).optional(),
25
28
  });
26
29
  let cachedConfig = null;
27
30
  /**
@@ -51,7 +54,7 @@ export function loadConfig() {
51
54
  ? parseInt(process.env.MAX_SESSIONS, 10)
52
55
  : 100,
53
56
  fallbackEnabled: process.env.FALLBACK_ENABLED !== 'false',
54
- defaultModel: process.env.DEFAULT_MODEL || 'deepseek-chat',
57
+ defaultModel: process.env.DEFAULT_MODEL || 'deepseek-v4-flash',
55
58
  circuitBreakerThreshold: process.env.CIRCUIT_BREAKER_THRESHOLD
56
59
  ? parseInt(process.env.CIRCUIT_BREAKER_THRESHOLD, 10)
57
60
  : 5,
@@ -64,6 +67,11 @@ export function loadConfig() {
64
67
  enableMultimodal: process.env.ENABLE_MULTIMODAL === 'true',
65
68
  transport: (process.env.TRANSPORT || 'stdio'),
66
69
  httpPort: process.env.HTTP_PORT ? parseInt(process.env.HTTP_PORT, 10) : 3000,
70
+ httpHost: process.env.HTTP_HOST || '127.0.0.1',
71
+ httpAuthToken: process.env.HTTP_AUTH_TOKEN || undefined,
72
+ httpAllowedHosts: process.env.HTTP_ALLOWED_HOSTS
73
+ ? process.env.HTTP_ALLOWED_HOSTS.split(',').map((h) => h.trim()).filter(Boolean)
74
+ : undefined,
67
75
  };
68
76
  const result = ConfigSchema.safeParse(raw);
69
77
  if (!result.success) {
@@ -1 +1 @@
1
- {"version":3,"file":"config.js","sourceRoot":"","sources":["../src/config.ts"],"names":[],"mappings":"AAAA;;;GAGG;AAEH,OAAO,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AACxB,OAAO,EAAE,WAAW,EAAE,MAAM,aAAa,CAAC;AAE1C,MAAM,YAAY,GAAG,CAAC,CAAC,MAAM,CAAC;IAC5B,MAAM,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,CAAC,CAAC,EAAE,8BAA8B,CAAC;IACzD,OAAO,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,EAAE,CAAC,OAAO,CAAC,0BAA0B,CAAC;IAC7D,YAAY,EAAE,CAAC,CAAC,OAAO,EAAE,CAAC,OAAO,CAAC,IAAI,CAAC;IACvC,cAAc,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE,CAAC,OAAO,CAAC,KAAK,CAAC;IACpD,UAAU,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,GAAG,CAAC,EAAE,CAAC,CAAC,OAAO,CAAC,CAAC,CAAC;IAChD,kBAAkB,EAAE,CAAC,CAAC,OAAO,EAAE,CAAC,OAAO,CAAC,KAAK,CAAC;IAC9C,gBAAgB,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE,CAAC,OAAO,CAAC,OAAO,CAAC;IACxD,iBAAiB,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE,CAAC,OAAO,CAAC,EAAE,CAAC;IACpD,WAAW,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE,CAAC,OAAO,CAAC,GAAG,CAAC;IAC/C,eAAe,EAAE,CAAC,CAAC,OAAO,EAAE,CAAC,OAAO,CAAC,IAAI,CAAC;IAC1C,YAAY,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,OAAO,CAAC,eAAe,CAAC;IACjD,uBAAuB,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE,CAAC,OAAO,CAAC,CAAC,CAAC;IACzD,0BAA0B,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE,CAAC,OAAO,CAAC,KAAK,CAAC;IAChE,kBAAkB,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE,CAAC,OAAO,CAAC,GAAG,CAAC;IACtD,gBAAgB,EAAE,CAAC,CAAC,OAAO,EAAE,CAAC,OAAO,CAAC,KAAK,CAAC;IAC5C,SAAS,EAAE,CAAC,CAAC,IAAI,CAAC,CAAC,OAAO,EAAE,MAAM,CAAC,CAAC,CAAC,OAAO,CAAC,OAAO,CAAC;IACrD,QAAQ,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE,CAAC,OAAO,CAAC,IAAI,CAAC;CAC9C,CAAC,CAAC;AAIH,IAAI,YAAY,GAAkB,IAAI,CAAC;AAEvC;;;;GAIG;AACH,MAAM,UAAU,UAAU;IACxB,MAAM,GAAG,GAAG;QACV,MAAM,EAAE,OAAO,CAAC,GAAG,CAAC,gBAAgB,IAAI,EAAE;QAC1C,OAAO,EAAE,OAAO,CAAC,GAAG,CAAC,iBAAiB,IAAI,0BAA0B;QACpE,YAAY,EAAE,OAAO,CAAC,GAAG,CAAC,cAAc,KAAK,OAAO;QACpD,cAAc,EAAE,OAAO,CAAC,GAAG,CAAC,eAAe;YACzC,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,eAAe,EAAE,EAAE,CAAC;YAC3C,CAAC,CAAC,KAAK;QACT,UAAU,EAAE,OAAO,CAAC,GAAG,CAAC,WAAW;YACjC,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,WAAW,EAA
E,EAAE,CAAC;YACvC,CAAC,CAAC,CAAC;QACL,kBAAkB,EAAE,OAAO,CAAC,GAAG,CAAC,oBAAoB,KAAK,MAAM;QAC/D,gBAAgB,EAAE,OAAO,CAAC,GAAG,CAAC,kBAAkB;YAC9C,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,kBAAkB,EAAE,EAAE,CAAC;YAC9C,CAAC,CAAC,OAAO;QACX,iBAAiB,EAAE,OAAO,CAAC,GAAG,CAAC,mBAAmB;YAChD,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,mBAAmB,EAAE,EAAE,CAAC;YAC/C,CAAC,CAAC,EAAE;QACN,WAAW,EAAE,OAAO,CAAC,GAAG,CAAC,YAAY;YACnC,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,YAAY,EAAE,EAAE,CAAC;YACxC,CAAC,CAAC,GAAG;QACP,eAAe,EAAE,OAAO,CAAC,GAAG,CAAC,gBAAgB,KAAK,OAAO;QACzD,YAAY,EAAE,OAAO,CAAC,GAAG,CAAC,aAAa,IAAI,eAAe;QAC1D,uBAAuB,EAAE,OAAO,CAAC,GAAG,CAAC,yBAAyB;YAC5D,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,yBAAyB,EAAE,EAAE,CAAC;YACrD,CAAC,CAAC,CAAC;QACL,0BAA0B,EAAE,OAAO,CAAC,GAAG,CAAC,6BAA6B;YACnE,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,6BAA6B,EAAE,EAAE,CAAC;YACzD,CAAC,CAAC,KAAK;QACT,kBAAkB,EAAE,OAAO,CAAC,GAAG,CAAC,oBAAoB;YAClD,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,oBAAoB,EAAE,EAAE,CAAC;YAChD,CAAC,CAAC,GAAG;QACP,gBAAgB,EAAE,OAAO,CAAC,GAAG,CAAC,iBAAiB,KAAK,MAAM;QAC1D,SAAS,EAAE,CAAC,OAAO,CAAC,GAAG,CAAC,SAAS,IAAI,OAAO,CAAqB;QACjE,QAAQ,EAAE,OAAO,CAAC,GAAG,CAAC,SAAS,CAAC,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,SAAS,EAAE,EAAE,CAAC,CAAC,CAAC,CAAC,IAAI;KAC7E,CAAC;IAEF,MAAM,MAAM,GAAG,YAAY,CAAC,SAAS,CAAC,GAAG,CAAC,CAAC;IAE3C,IAAI,CAAC,MAAM,CAAC,OAAO,EAAE,CAAC;QACpB,MAAM,MAAM,GAAG,MAAM,CAAC,KAAK,CAAC,MAAM,CAAC,GAAG,CAAC,CAAC,KAAK,EAAE,EAAE,CAAC,CAAC;YACjD,IAAI,EAAE,KAAK,CAAC,IAAI,CAAC,IAAI,CAAC,GAAG,CAAC;YAC1B,OAAO,EAAE,KAAK,CAAC,OAAO;SACvB,CAAC,CAAC,CAAC;QAEJ,MAAM,IAAI,GAAG,CAAC,GAAG,CAAC,MAAM;YACtB,CAAC,CAAC,oFAAoF;YACtF,CAAC,CAAC,EAAE,CAAC;QAEP,MAAM,IAAI,WAAW,CACnB,kCAAkC,IAAI,EAAE,EACxC,MAAM,CACP,CAAC;IACJ,CAAC;IAED,YAAY,GAAG,MAAM,CAAC,IAAI,CAAC;IAC3B,OAAO,YAAY,CAAC;AACtB,CAAC;AAED;;;GAGG;AACH,MAAM,UAAU,SAAS;IACvB,IAAI,CAAC,YAAY,EAAE,CAAC;QAClB,MAAM,IAAI,KAAK,CAAC,6CAA6C,CAAC,CAAC;IACjE,CAAC;IACD,OAAO,YAAY,CAAC;AACtB,CAAC;AAED;;GAEG;AACH,MAAM,UAAU,WAAW;IACzB,YAAY,GAAG,IAAI,CAAC;AACtB,CAAC"}
1
+ {"version":3,"file":"config.js","sourceRoot":"","sources":["../src/config.ts"],"names":[],"mappings":"AAAA;;;GAGG;AAEH,OAAO,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AACxB,OAAO,EAAE,WAAW,EAAE,MAAM,aAAa,CAAC;AAE1C,MAAM,YAAY,GAAG,CAAC,CAAC,MAAM,CAAC;IAC5B,MAAM,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,CAAC,CAAC,EAAE,8BAA8B,CAAC;IACzD,OAAO,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,EAAE,CAAC,OAAO,CAAC,0BAA0B,CAAC;IAC7D,YAAY,EAAE,CAAC,CAAC,OAAO,EAAE,CAAC,OAAO,CAAC,IAAI,CAAC;IACvC,cAAc,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE,CAAC,OAAO,CAAC,KAAK,CAAC;IACpD,UAAU,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,GAAG,CAAC,EAAE,CAAC,CAAC,OAAO,CAAC,CAAC,CAAC;IAChD,kBAAkB,EAAE,CAAC,CAAC,OAAO,EAAE,CAAC,OAAO,CAAC,KAAK,CAAC;IAC9C,gBAAgB,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE,CAAC,OAAO,CAAC,OAAO,CAAC;IACxD,iBAAiB,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE,CAAC,OAAO,CAAC,EAAE,CAAC;IACpD,WAAW,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE,CAAC,OAAO,CAAC,GAAG,CAAC;IAC/C,eAAe,EAAE,CAAC,CAAC,OAAO,EAAE,CAAC,OAAO,CAAC,IAAI,CAAC;IAC1C,YAAY,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,OAAO,CAAC,mBAAmB,CAAC;IACrD,uBAAuB,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE,CAAC,OAAO,CAAC,CAAC,CAAC;IACzD,0BAA0B,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE,CAAC,OAAO,CAAC,KAAK,CAAC;IAChE,kBAAkB,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE,CAAC,OAAO,CAAC,GAAG,CAAC;IACtD,gBAAgB,EAAE,CAAC,CAAC,OAAO,EAAE,CAAC,OAAO,CAAC,KAAK,CAAC;IAC5C,SAAS,EAAE,CAAC,CAAC,IAAI,CAAC,CAAC,OAAO,EAAE,MAAM,CAAC,CAAC,CAAC,OAAO,CAAC,OAAO,CAAC;IACrD,QAAQ,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE,CAAC,OAAO,CAAC,IAAI,CAAC;IAC7C,QAAQ,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,OAAO,CAAC,WAAW,CAAC;IAChD,aAAa,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,QAAQ,EAAE;IAC3C,gBAAgB,EAAE,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,QAAQ,EAAE;CACxD,CAAC,CAAC;AAIH,IAAI,YAAY,GAAkB,IAAI,CAAC;AAEvC;;;;GAIG;AACH,MAAM,UAAU,UAAU;IACxB,MAAM,GAAG,GAAG;QACV,MAAM,EAAE,OAAO,CAAC,GAAG,CAAC,gBAAgB,IAAI,EAAE;QAC1C,OAAO,EAAE,OAAO,CAAC,GAAG,CAAC,iBAAiB,IAAI,0BAA0B;QACpE,
YAAY,EAAE,OAAO,CAAC,GAAG,CAAC,cAAc,KAAK,OAAO;QACpD,cAAc,EAAE,OAAO,CAAC,GAAG,CAAC,eAAe;YACzC,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,eAAe,EAAE,EAAE,CAAC;YAC3C,CAAC,CAAC,KAAK;QACT,UAAU,EAAE,OAAO,CAAC,GAAG,CAAC,WAAW;YACjC,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,WAAW,EAAE,EAAE,CAAC;YACvC,CAAC,CAAC,CAAC;QACL,kBAAkB,EAAE,OAAO,CAAC,GAAG,CAAC,oBAAoB,KAAK,MAAM;QAC/D,gBAAgB,EAAE,OAAO,CAAC,GAAG,CAAC,kBAAkB;YAC9C,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,kBAAkB,EAAE,EAAE,CAAC;YAC9C,CAAC,CAAC,OAAO;QACX,iBAAiB,EAAE,OAAO,CAAC,GAAG,CAAC,mBAAmB;YAChD,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,mBAAmB,EAAE,EAAE,CAAC;YAC/C,CAAC,CAAC,EAAE;QACN,WAAW,EAAE,OAAO,CAAC,GAAG,CAAC,YAAY;YACnC,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,YAAY,EAAE,EAAE,CAAC;YACxC,CAAC,CAAC,GAAG;QACP,eAAe,EAAE,OAAO,CAAC,GAAG,CAAC,gBAAgB,KAAK,OAAO;QACzD,YAAY,EAAE,OAAO,CAAC,GAAG,CAAC,aAAa,IAAI,mBAAmB;QAC9D,uBAAuB,EAAE,OAAO,CAAC,GAAG,CAAC,yBAAyB;YAC5D,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,yBAAyB,EAAE,EAAE,CAAC;YACrD,CAAC,CAAC,CAAC;QACL,0BAA0B,EAAE,OAAO,CAAC,GAAG,CAAC,6BAA6B;YACnE,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,6BAA6B,EAAE,EAAE,CAAC;YACzD,CAAC,CAAC,KAAK;QACT,kBAAkB,EAAE,OAAO,CAAC,GAAG,CAAC,oBAAoB;YAClD,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,oBAAoB,EAAE,EAAE,CAAC;YAChD,CAAC,CAAC,GAAG;QACP,gBAAgB,EAAE,OAAO,CAAC,GAAG,CAAC,iBAAiB,KAAK,MAAM;QAC1D,SAAS,EAAE,CAAC,OAAO,CAAC,GAAG,CAAC,SAAS,IAAI,OAAO,CAAqB;QACjE,QAAQ,EAAE,OAAO,CAAC,GAAG,CAAC,SAAS,CAAC,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,SAAS,EAAE,EAAE,CAAC,CAAC,CAAC,CAAC,IAAI;QAC5E,QAAQ,EAAE,OAAO,CAAC,GAAG,CAAC,SAAS,IAAI,WAAW;QAC9C,aAAa,EAAE,OAAO,CAAC,GAAG,CAAC,eAAe,IAAI,SAAS;QACvD,gBAAgB,EAAE,OAAO,CAAC,GAAG,CAAC,kBAAkB;YAC9C,CAAC,CAAC,OAAO,CAAC,GAAG,CAAC,kBAAkB,CAAC,KAAK,CAAC,GAAG,CAAC,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,IAAI,EAAE,CAAC,CAAC,MAAM,CAAC,OAAO,CAAC;YAChF,CAAC,CAAC,SAAS;KACd,CAAC;IAEF,MAAM,MAAM,GAAG,YAAY,CAAC,SAAS,CAAC,GAAG,CAAC,CAAC;IAE3C,IAAI,CAAC,MAAM,CAAC,OAAO,EAAE,CAAC;QACpB,MAAM,MAAM,GAAG,MAAM,CAAC,KAAK,CAAC,MAAM,CAAC,GAAG,CAAC,CAAC,KAAK,EA
AE,EAAE,CAAC,CAAC;YACjD,IAAI,EAAE,KAAK,CAAC,IAAI,CAAC,IAAI,CAAC,GAAG,CAAC;YAC1B,OAAO,EAAE,KAAK,CAAC,OAAO;SACvB,CAAC,CAAC,CAAC;QAEJ,MAAM,IAAI,GAAG,CAAC,GAAG,CAAC,MAAM;YACtB,CAAC,CAAC,oFAAoF;YACtF,CAAC,CAAC,EAAE,CAAC;QAEP,MAAM,IAAI,WAAW,CACnB,kCAAkC,IAAI,EAAE,EACxC,MAAM,CACP,CAAC;IACJ,CAAC;IAED,YAAY,GAAG,MAAM,CAAC,IAAI,CAAC;IAC3B,OAAO,YAAY,CAAC;AACtB,CAAC;AAED;;;GAGG;AACH,MAAM,UAAU,SAAS;IACvB,IAAI,CAAC,YAAY,EAAE,CAAC;QAClB,MAAM,IAAI,KAAK,CAAC,6CAA6C,CAAC,CAAC;IACjE,CAAC;IACD,OAAO,YAAY,CAAC;AACtB,CAAC;AAED;;GAEG;AACH,MAAM,UAAU,WAAW;IACzB,YAAY,GAAG,IAAI,CAAC;AACtB,CAAC"}
package/dist/cost.d.ts CHANGED
@@ -2,9 +2,11 @@
  * Cost Calculation Module
  * Handles pricing and cost formatting for DeepSeek API requests
  *
- * Model-aware pricing per 1M tokens (USD):
- * - deepseek-chat / deepseek-reasoner (V3.2): cache hit $0.028, cache miss $0.28, output $0.42
- * - New models can be added to MODEL_PRICING as they become available
+ * Model-aware pricing per 1M tokens (USD), from the DeepSeek V4 pricing page:
+ * - deepseek-v4-flash: cache hit $0.0028, cache miss $0.14, output $0.28
+ * - deepseek-v4-pro: cache hit $0.003625, cache miss $0.435, output $0.87
+ * The deepseek-chat / deepseek-reasoner aliases resolve to deepseek-v4-flash,
+ * so they carry v4-flash pricing. New models can be added to MODEL_PRICING.
  */
  /**
  * Pricing structure for a model
@@ -14,7 +16,7 @@ export interface ModelPricing {
  cache_miss: number;
  output: number;
  }
- /** Default pricing (V3.2 unified) used for unknown models */
+ /** v4-flash pricing also the default for unknown models (cheapest, most common) */
  export declare const DEFAULT_PRICING: ModelPricing;
  /** Backward-compatible alias */
  export declare const PRICING: ModelPricing;
@@ -36,7 +38,7 @@ export interface CostBreakdown {
  }
  /**
  * Calculate cost for a request based on token usage.
- * Supports V3.2 cache hit/miss pricing. If cache fields are absent,
+ * Supports cache hit/miss pricing. If cache fields are absent,
  * treats all input tokens as cache miss (backward compatible).
  */
  export declare function calculateCost(usage: {
package/dist/cost.d.ts.map CHANGED
@@ -1 +1 @@
- {"version":3,"file":"cost.d.ts","sourceRoot":"","sources":["../src/cost.ts"],"names":[],"mappings":"AAAA;;;;;;;GAOG;AAEH;;GAEG;AACH,MAAM,WAAW,YAAY;IAC3B,SAAS,EAAE,MAAM,CAAC;IAClB,UAAU,EAAE,MAAM,CAAC;IACnB,MAAM,EAAE,MAAM,CAAC;CAChB;AAED,+DAA+D;AAC/D,eAAO,MAAM,eAAe,EAAE,YAI7B,CAAC;AAEF,gCAAgC;AAChC,eAAO,MAAM,OAAO,cAAkB,CAAC;AAEvC,2EAA2E;AAC3E,eAAO,MAAM,aAAa,EAAE,MAAM,CAAC,MAAM,EAAE,YAAY,CAGtD,CAAC;AAEF;;GAEG;AACH,wBAAgB,UAAU,CAAC,KAAK,CAAC,EAAE,MAAM,GAAG,YAAY,CAKvD;AAED;;GAEG;AACH,MAAM,WAAW,aAAa;IAC5B,SAAS,EAAE,MAAM,CAAC;IAClB,UAAU,EAAE,MAAM,CAAC;IACnB,SAAS,EAAE,MAAM,CAAC;IAClB,aAAa,CAAC,EAAE,MAAM,CAAC;IACvB,YAAY,CAAC,EAAE,MAAM,CAAC;CACvB;AAED;;;;GAIG;AACH,wBAAgB,aAAa,CAAC,KAAK,EAAE;IACnC,aAAa,EAAE,MAAM,CAAC;IACtB,iBAAiB,EAAE,MAAM,CAAC;IAC1B,uBAAuB,CAAC,EAAE,MAAM,CAAC;IACjC,wBAAwB,CAAC,EAAE,MAAM,CAAC;CACnC,EAAE,KAAK,CAAC,EAAE,MAAM,GAAG,aAAa,CA8ChC;AAED;;GAEG;AACH,wBAAgB,UAAU,CAAC,SAAS,EAAE,aAAa,GAAG,MAAM,CAqB3D"}
+ {"version":3,"file":"cost.d.ts","sourceRoot":"","sources":["../src/cost.ts"],"names":[],"mappings":"AAAA;;;;;;;;;GASG;AAEH;;GAEG;AACH,MAAM,WAAW,YAAY;IAC3B,SAAS,EAAE,MAAM,CAAC;IAClB,UAAU,EAAE,MAAM,CAAC;IACnB,MAAM,EAAE,MAAM,CAAC;CAChB;AAED,qFAAqF;AACrF,eAAO,MAAM,eAAe,EAAE,YAI7B,CAAC;AAEF,gCAAgC;AAChC,eAAO,MAAM,OAAO,cAAkB,CAAC;AAEvC,2EAA2E;AAC3E,eAAO,MAAM,aAAa,EAAE,MAAM,CAAC,MAAM,EAAE,YAAY,CAMtD,CAAC;AAEF;;GAEG;AACH,wBAAgB,UAAU,CAAC,KAAK,CAAC,EAAE,MAAM,GAAG,YAAY,CAKvD;AAED;;GAEG;AACH,MAAM,WAAW,aAAa;IAC5B,SAAS,EAAE,MAAM,CAAC;IAClB,UAAU,EAAE,MAAM,CAAC;IACnB,SAAS,EAAE,MAAM,CAAC;IAClB,aAAa,CAAC,EAAE,MAAM,CAAC;IACvB,YAAY,CAAC,EAAE,MAAM,CAAC;CACvB;AAED;;;;GAIG;AACH,wBAAgB,aAAa,CAAC,KAAK,EAAE;IACnC,aAAa,EAAE,MAAM,CAAC;IACtB,iBAAiB,EAAE,MAAM,CAAC;IAC1B,uBAAuB,CAAC,EAAE,MAAM,CAAC;IACjC,wBAAwB,CAAC,EAAE,MAAM,CAAC;CACnC,EAAE,KAAK,CAAC,EAAE,MAAM,GAAG,aAAa,CA8ChC;AAED;;GAEG;AACH,wBAAgB,UAAU,CAAC,SAAS,EAAE,aAAa,GAAG,MAAM,CAqB3D"}
package/dist/cost.js CHANGED
@@ -2,22 +2,27 @@
  * Cost Calculation Module
  * Handles pricing and cost formatting for DeepSeek API requests
  *
- * Model-aware pricing per 1M tokens (USD):
- * - deepseek-chat / deepseek-reasoner (V3.2): cache hit $0.028, cache miss $0.28, output $0.42
- * - New models can be added to MODEL_PRICING as they become available
+ * Model-aware pricing per 1M tokens (USD), from the DeepSeek V4 pricing page:
+ * - deepseek-v4-flash: cache hit $0.0028, cache miss $0.14, output $0.28
+ * - deepseek-v4-pro: cache hit $0.003625, cache miss $0.435, output $0.87
+ * The deepseek-chat / deepseek-reasoner aliases resolve to deepseek-v4-flash,
+ * so they carry v4-flash pricing. New models can be added to MODEL_PRICING.
  */
- /** Default pricing (V3.2 unified) used for unknown models */
+ /** v4-flash pricing also the default for unknown models (cheapest, most common) */
  export const DEFAULT_PRICING = {
- cache_hit: 0.028,
- cache_miss: 0.28,
- output: 0.42,
+ cache_hit: 0.0028,
+ cache_miss: 0.14,
+ output: 0.28,
  };
  /** Backward-compatible alias */
  export const PRICING = DEFAULT_PRICING;
  /** Per-model pricing map. Add new models here as they become available. */
  export const MODEL_PRICING = {
- 'deepseek-chat': { cache_hit: 0.028, cache_miss: 0.28, output: 0.42 },
- 'deepseek-reasoner': { cache_hit: 0.028, cache_miss: 0.28, output: 0.42 },
+ 'deepseek-v4-flash': { cache_hit: 0.0028, cache_miss: 0.14, output: 0.28 },
+ 'deepseek-v4-pro': { cache_hit: 0.003625, cache_miss: 0.435, output: 0.87 },
+ // Compatibility aliases (resolve to v4-flash on the wire)
+ 'deepseek-chat': { cache_hit: 0.0028, cache_miss: 0.14, output: 0.28 },
+ 'deepseek-reasoner': { cache_hit: 0.0028, cache_miss: 0.14, output: 0.28 },
  };
  /**
  * Get pricing for a specific model. Falls back to DEFAULT_PRICING for unknown models.
@@ -30,7 +35,7 @@ export function getPricing(model) {
  }
  /**
  * Calculate cost for a request based on token usage.
- * Supports V3.2 cache hit/miss pricing. If cache fields are absent,
+ * Supports cache hit/miss pricing. If cache fields are absent,
  * treats all input tokens as cache miss (backward compatible).
  */
  export function calculateCost(usage, model) {
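The pricing change above can be checked with a small worked example. The sketch below is not the package's `calculateCost`; it is a minimal illustration that assumes the per-1M-token prices shown in `MODEL_PRICING` and uses hypothetical names (`SKETCH_PRICING`, `estimateCost`, and the camelCase usage fields). It follows the documented rule that, when cache fields are absent, all input tokens are billed as cache misses.

```javascript
// Illustrative sketch only; names and usage-field shapes are hypothetical.
// Prices are USD per 1M tokens, copied from the MODEL_PRICING entries in the diff.
const SKETCH_PRICING = {
  'deepseek-v4-flash': { cache_hit: 0.0028, cache_miss: 0.14, output: 0.28 },
  'deepseek-v4-pro': { cache_hit: 0.003625, cache_miss: 0.435, output: 0.87 },
};

const TOKENS_PER_UNIT = 1_000_000;

function estimateCost(usage, model) {
  // Unknown models fall back to v4-flash pricing, mirroring DEFAULT_PRICING.
  const price = SKETCH_PRICING[model] ?? SKETCH_PRICING['deepseek-v4-flash'];

  const hasCacheFields =
    usage.cacheHitTokens !== undefined && usage.cacheMissTokens !== undefined;

  // With cache fields: bill hits and misses separately.
  // Without them: treat every input token as a cache miss.
  const inputCost = hasCacheFields
    ? (usage.cacheHitTokens / TOKENS_PER_UNIT) * price.cache_hit +
      (usage.cacheMissTokens / TOKENS_PER_UNIT) * price.cache_miss
    : (usage.inputTokens / TOKENS_PER_UNIT) * price.cache_miss;

  const outputCost = (usage.outputTokens / TOKENS_PER_UNIT) * price.output;

  return { inputCost, outputCost, totalCost: inputCost + outputCost };
}
```

Under these assumptions, a request with 1M uncached input tokens and 1M output tokens on `deepseek-v4-flash` comes to roughly $0.42 (0.14 + 0.28), against $0.70 (0.28 + 0.42) at the removed V3.2 prices.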
package/dist/cost.js.map CHANGED
@@ -1 +1 @@
- {"version":3,"file":"cost.js","sourceRoot":"","sources":["../src/cost.ts"],"names":[],"mappings":"AAAA;;;;;;;GAOG;AAWH,+DAA+D;AAC/D,MAAM,CAAC,MAAM,eAAe,GAAiB;IAC3C,SAAS,EAAE,KAAK;IAChB,UAAU,EAAE,IAAI;IAChB,MAAM,EAAE,IAAI;CACb,CAAC;AAEF,gCAAgC;AAChC,MAAM,CAAC,MAAM,OAAO,GAAG,eAAe,CAAC;AAEvC,2EAA2E;AAC3E,MAAM,CAAC,MAAM,aAAa,GAAiC;IACzD,eAAe,EAAE,EAAE,SAAS,EAAE,KAAK,EAAE,UAAU,EAAE,IAAI,EAAE,MAAM,EAAE,IAAI,EAAE;IACrE,mBAAmB,EAAE,EAAE,SAAS,EAAE,KAAK,EAAE,UAAU,EAAE,IAAI,EAAE,MAAM,EAAE,IAAI,EAAE;CAC1E,CAAC;AAEF;;GAEG;AACH,MAAM,UAAU,UAAU,CAAC,KAAc;IACvC,IAAI,KAAK,IAAI,KAAK,IAAI,aAAa,EAAE,CAAC;QACpC,OAAO,aAAa,CAAC,KAAK,CAAC,CAAC;IAC9B,CAAC;IACD,OAAO,eAAe,CAAC;AACzB,CAAC;AAaD;;;;GAIG;AACH,MAAM,UAAU,aAAa,CAAC,KAK7B,EAAE,KAAc;IACf,MAAM,EACJ,aAAa,EACb,iBAAiB,EACjB,uBAAuB,EACvB,wBAAwB,GACzB,GAAG,KAAK,CAAC;IAEV,MAAM,OAAO,GAAG,UAAU,CAAC,KAAK,CAAC,CAAC;IAElC,IAAI,SAAiB,CAAC;IACtB,IAAI,aAAiC,CAAC;IACtC,IAAI,YAAgC,CAAC;IAErC,IACE,uBAAuB,KAAK,SAAS;QACrC,wBAAwB,KAAK,SAAS,EACtC,CAAC;QACD,sBAAsB;QACtB,MAAM,OAAO,GACX,CAAC,uBAAuB,GAAG,SAAS,CAAC,GAAG,OAAO,CAAC,SAAS,CAAC;QAC5D,MAAM,QAAQ,GACZ,CAAC,wBAAwB,GAAG,SAAS,CAAC,GAAG,OAAO,CAAC,UAAU,CAAC;QAC9D,SAAS,GAAG,OAAO,GAAG,QAAQ,CAAC;QAE/B,IAAI,aAAa,GAAG,CAAC,EAAE,CAAC;YACtB,aAAa,GAAG,uBAAuB,GAAG,aAAa,CAAC;QAC1D,CAAC;QAED,uDAAuD;QACvD,MAAM,WAAW,GAAG,CAAC,aAAa,GAAG,SAAS,CAAC,GAAG,OAAO,CAAC,UAAU,CAAC;QACrE,YAAY,GAAG,WAAW,GAAG,SAAS,CAAC;IACzC,CAAC;SAAM,CAAC;QACN,qDAAqD;QACrD,SAAS,GAAG,CAAC,aAAa,GAAG,SAAS,CAAC,GAAG,OAAO,CAAC,UAAU,CAAC;IAC/D,CAAC;IAED,MAAM,UAAU,GAAG,CAAC,iBAAiB,GAAG,SAAS,CAAC,GAAG,OAAO,CAAC,MAAM,CAAC;IAEpE,OAAO;QACL,SAAS;QACT,UAAU;QACV,SAAS,EAAE,SAAS,GAAG,UAAU;QACjC,aAAa;QACb,YAAY;KACb,CAAC;AACJ,CAAC;AAED;;GAEG;AACH,MAAM,UAAU,UAAU,CAAC,SAAwB;IACjD,MAAM,IAAI,GAAG,SAAS,CAAC,SAAS,CAAC;IACjC,IAAI,SAAiB,CAAC;IAEtB,IAAI,IAAI,GAAG,IAAI,EAAE,CAAC;QAChB,SAAS,GAAG,IAAI,IAAI,CAAC,OAAO,CAAC,CAAC,CAAC,EAAE,CAAC;IACpC,CAAC;SAAM,CAAC;QACN,SAAS,GAAG,IAAI,IAAI,CAAC,OAAO,CAAC,CAAC,CAAC,EAAE,CAAC;IACpC,CAAC;IAED,IACE,SAAS,CAAC,aAAa,KAAK,SAAS;QACrC,SAAS,C
AAC,aAAa,GAAG,CAAC;QAC3B,SAAS,CAAC,YAAY,KAAK,SAAS;QACpC,SAAS,CAAC,YAAY,GAAG,CAAC,EAC1B,CAAC;QACD,MAAM,GAAG,GAAG,IAAI,CAAC,KAAK,CAAC,SAAS,CAAC,aAAa,GAAG,GAAG,CAAC,CAAC;QACtD,SAAS,IAAI,gBAAgB,GAAG,cAAc,SAAS,CAAC,YAAY,CAAC,OAAO,CAAC,CAAC,CAAC,GAAG,CAAC;IACrF,CAAC;IAED,OAAO,SAAS,CAAC;AACnB,CAAC"}
+ {"version":3,"file":"cost.js","sourceRoot":"","sources":["../src/cost.ts"],"names":[],"mappings":"AAAA;;;;;;;;;GASG;AAWH,qFAAqF;AACrF,MAAM,CAAC,MAAM,eAAe,GAAiB;IAC3C,SAAS,EAAE,MAAM;IACjB,UAAU,EAAE,IAAI;IAChB,MAAM,EAAE,IAAI;CACb,CAAC;AAEF,gCAAgC;AAChC,MAAM,CAAC,MAAM,OAAO,GAAG,eAAe,CAAC;AAEvC,2EAA2E;AAC3E,MAAM,CAAC,MAAM,aAAa,GAAiC;IACzD,mBAAmB,EAAE,EAAE,SAAS,EAAE,MAAM,EAAE,UAAU,EAAE,IAAI,EAAE,MAAM,EAAE,IAAI,EAAE;IAC1E,iBAAiB,EAAE,EAAE,SAAS,EAAE,QAAQ,EAAE,UAAU,EAAE,KAAK,EAAE,MAAM,EAAE,IAAI,EAAE;IAC3E,0DAA0D;IAC1D,eAAe,EAAE,EAAE,SAAS,EAAE,MAAM,EAAE,UAAU,EAAE,IAAI,EAAE,MAAM,EAAE,IAAI,EAAE;IACtE,mBAAmB,EAAE,EAAE,SAAS,EAAE,MAAM,EAAE,UAAU,EAAE,IAAI,EAAE,MAAM,EAAE,IAAI,EAAE;CAC3E,CAAC;AAEF;;GAEG;AACH,MAAM,UAAU,UAAU,CAAC,KAAc;IACvC,IAAI,KAAK,IAAI,KAAK,IAAI,aAAa,EAAE,CAAC;QACpC,OAAO,aAAa,CAAC,KAAK,CAAC,CAAC;IAC9B,CAAC;IACD,OAAO,eAAe,CAAC;AACzB,CAAC;AAaD;;;;GAIG;AACH,MAAM,UAAU,aAAa,CAAC,KAK7B,EAAE,KAAc;IACf,MAAM,EACJ,aAAa,EACb,iBAAiB,EACjB,uBAAuB,EACvB,wBAAwB,GACzB,GAAG,KAAK,CAAC;IAEV,MAAM,OAAO,GAAG,UAAU,CAAC,KAAK,CAAC,CAAC;IAElC,IAAI,SAAiB,CAAC;IACtB,IAAI,aAAiC,CAAC;IACtC,IAAI,YAAgC,CAAC;IAErC,IACE,uBAAuB,KAAK,SAAS;QACrC,wBAAwB,KAAK,SAAS,EACtC,CAAC;QACD,sBAAsB;QACtB,MAAM,OAAO,GACX,CAAC,uBAAuB,GAAG,SAAS,CAAC,GAAG,OAAO,CAAC,SAAS,CAAC;QAC5D,MAAM,QAAQ,GACZ,CAAC,wBAAwB,GAAG,SAAS,CAAC,GAAG,OAAO,CAAC,UAAU,CAAC;QAC9D,SAAS,GAAG,OAAO,GAAG,QAAQ,CAAC;QAE/B,IAAI,aAAa,GAAG,CAAC,EAAE,CAAC;YACtB,aAAa,GAAG,uBAAuB,GAAG,aAAa,CAAC;QAC1D,CAAC;QAED,uDAAuD;QACvD,MAAM,WAAW,GAAG,CAAC,aAAa,GAAG,SAAS,CAAC,GAAG,OAAO,CAAC,UAAU,CAAC;QACrE,YAAY,GAAG,WAAW,GAAG,SAAS,CAAC;IACzC,CAAC;SAAM,CAAC;QACN,qDAAqD;QACrD,SAAS,GAAG,CAAC,aAAa,GAAG,SAAS,CAAC,GAAG,OAAO,CAAC,UAAU,CAAC;IAC/D,CAAC;IAED,MAAM,UAAU,GAAG,CAAC,iBAAiB,GAAG,SAAS,CAAC,GAAG,OAAO,CAAC,MAAM,CAAC;IAEpE,OAAO;QACL,SAAS;QACT,UAAU;QACV,SAAS,EAAE,SAAS,GAAG,UAAU;QACjC,aAAa;QACb,YAAY;KACb,CAAC;AACJ,CAAC;AAED;;GAEG;AACH,MAAM,UAAU,UAAU,CAAC,SAAwB;IACjD,MAAM,IAAI,GAAG,SAAS,CAAC,SAAS,CAAC;IACjC,IAAI,SAAiB,CAAC;IAEtB,IAAI,IAAI,GAAG,IAAI,EAAE,CAAC;QAChB,SAAS,GAAG,IAA
I,IAAI,CAAC,OAAO,CAAC,CAAC,CAAC,EAAE,CAAC;IACpC,CAAC;SAAM,CAAC;QACN,SAAS,GAAG,IAAI,IAAI,CAAC,OAAO,CAAC,CAAC,CAAC,EAAE,CAAC;IACpC,CAAC;IAED,IACE,SAAS,CAAC,aAAa,KAAK,SAAS;QACrC,SAAS,CAAC,aAAa,GAAG,CAAC;QAC3B,SAAS,CAAC,YAAY,KAAK,SAAS;QACpC,SAAS,CAAC,YAAY,GAAG,CAAC,EAC1B,CAAC;QACD,MAAM,GAAG,GAAG,IAAI,CAAC,KAAK,CAAC,SAAS,CAAC,aAAa,GAAG,GAAG,CAAC,CAAC;QACtD,SAAS,IAAI,gBAAgB,GAAG,cAAc,SAAS,CAAC,YAAY,CAAC,OAAO,CAAC,CAAC,CAAC,GAAG,CAAC;IACrF,CAAC;IAED,OAAO,SAAS,CAAC;AACnB,CAAC"}
package/dist/deepseek-client.d.ts.map CHANGED
@@ -1 +1 @@
- {"version":3,"file":"deepseek-client.d.ts","sourceRoot":"","sources":["../src/deepseek-client.ts"],"names":[],"mappings":"AAAA;;;;GAIG;AAUH,OAAO,KAAK,EACV,oBAAoB,EACpB,sBAAsB,EACtB,oBAAoB,EAGpB,YAAY,EAEb,MAAM,YAAY,CAAC;AAqBpB,oDAAoD;AACpD,MAAM,WAAW,kCAAmC,SAAQ,sBAAsB;IAChF,QAAQ,CAAC,EAAE,YAAY,CAAC;CACzB;AAwBD,qBAAa,cAAc;IACzB,OAAO,CAAC,MAAM,CAAS;IACvB,OAAO,CAAC,eAAe,CAAqC;IAC5D,OAAO,CAAC,WAAW,CAAS;IAC5B,OAAO,CAAC,cAAc,CAAS;;IAgB/B;;OAEG;IACH,OAAO,CAAC,iBAAiB;IASzB;;OAEG;IACH,uBAAuB;IAQvB;;OAEG;IACH,OAAO,CAAC,kBAAkB;IAgE1B;;OAEG;IACH,OAAO,CAAC,SAAS;IASjB;;OAEG;IACH,OAAO,CAAC,aAAa;IAoCrB;;OAEG;IACG,oBAAoB,CACxB,MAAM,EAAE,oBAAoB,GAC3B,OAAO,CAAC,kCAAkC,CAAC;IAqD9C;;;OAGG;IACG,6BAA6B,CACjC,MAAM,EAAE,oBAAoB,GAC3B,OAAO,CAAC,kCAAkC,CAAC;IAyC9C;;OAEG;YACW,cAAc;IA4F5B;;OAEG;IACG,cAAc,IAAI,OAAO,CAAC,OAAO,CAAC;CAazC"}
+ {"version":3,"file":"deepseek-client.d.ts","sourceRoot":"","sources":["../src/deepseek-client.ts"],"names":[],"mappings":"AAAA;;;;GAIG;AAUH,OAAO,KAAK,EACV,oBAAoB,EACpB,sBAAsB,EACtB,oBAAoB,EAGpB,YAAY,EAEb,MAAM,YAAY,CAAC;AAmBpB,oDAAoD;AACpD,MAAM,WAAW,kCAAmC,SAAQ,sBAAsB;IAChF,QAAQ,CAAC,EAAE,YAAY,CAAC;CACzB;AAwBD,qBAAa,cAAc;IACzB,OAAO,CAAC,MAAM,CAAS;IACvB,OAAO,CAAC,eAAe,CAAqC;IAC5D,OAAO,CAAC,WAAW,CAAS;IAC5B,OAAO,CAAC,cAAc,CAAS;;IAgB/B;;OAEG;IACH,OAAO,CAAC,iBAAiB;IASzB;;OAEG;IACH,uBAAuB;IAQvB;;OAEG;IACH,OAAO,CAAC,kBAAkB;IAiF1B;;OAEG;IACH,OAAO,CAAC,SAAS;IASjB;;OAEG;IACH,OAAO,CAAC,aAAa;IAoCrB;;OAEG;IACG,oBAAoB,CACxB,MAAM,EAAE,oBAAoB,GAC3B,OAAO,CAAC,kCAAkC,CAAC;IAqD9C;;;OAGG;IACG,6BAA6B,CACjC,MAAM,EAAE,oBAAoB,GAC3B,OAAO,CAAC,kCAAkC,CAAC;IAyC9C;;OAEG;YACW,cAAc;IA4F5B;;OAEG;IACG,cAAc,IAAI,OAAO,CAAC,OAAO,CAAC;CAazC"}
package/dist/deepseek-client.js CHANGED
@@ -8,21 +8,19 @@ import { getConfig } from './config.js';
  import { ApiError, CircuitBreakerOpenError, FallbackExhaustedError, } from './errors.js';
  import { CircuitBreaker } from './circuit-breaker.js';
  import { hasReasoningContent, getErrorMessage } from './types.js';
- /** Parameters that are incompatible with thinking mode */
- const THINKING_INCOMPATIBLE_PARAMS = [
- 'temperature',
- 'top_p',
- 'frequency_penalty',
- 'presence_penalty',
- ];
+ /** Parameters that the API ignores while thinking mode is active */
+ const THINKING_INCOMPATIBLE_PARAMS = ['temperature', 'top_p'];
  /**
  * Fallback order per model. Each model lists fallback candidates in priority order.
  * When a model fails with a retryable error, the first available fallback is tried.
- * Add new models here as they become available in the API.
+ * v4-flash and v4-pro are genuinely different models, so they back each other up.
+ * The chat/reasoner aliases (which resolve to v4-flash) fall back to v4-pro.
  */
  const FALLBACK_ORDER = {
- 'deepseek-chat': ['deepseek-reasoner'],
- 'deepseek-reasoner': ['deepseek-chat'],
+ 'deepseek-v4-flash': ['deepseek-v4-pro'],
+ 'deepseek-v4-pro': ['deepseek-v4-flash'],
+ 'deepseek-chat': ['deepseek-v4-pro'],
+ 'deepseek-reasoner': ['deepseek-v4-pro'],
  };
  /**
  * Check if an error is retryable / should trigger fallback
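To make the new fallback table concrete, here is a simplified sketch of how such a map might be consumed. It is not the client's actual fallback logic (which also involves a circuit breaker and its own retryability check); `SKETCH_FALLBACK_ORDER`, `fallbackCandidates`, `callWithFallback`, and the `callModel` / `isRetryable` callbacks are hypothetical names introduced for illustration.

```javascript
// Illustrative sketch only; the table mirrors FALLBACK_ORDER from the diff,
// everything else is a hypothetical stand-in for the real client logic.
const SKETCH_FALLBACK_ORDER = {
  'deepseek-v4-flash': ['deepseek-v4-pro'],
  'deepseek-v4-pro': ['deepseek-v4-flash'],
  'deepseek-chat': ['deepseek-v4-pro'],
  'deepseek-reasoner': ['deepseek-v4-pro'],
};

// The requested model first, then its fallbacks in priority order.
function fallbackCandidates(model) {
  return [model, ...(SKETCH_FALLBACK_ORDER[model] ?? [])];
}

// Try each candidate in turn; move on only when the error is retryable.
async function callWithFallback(model, callModel, isRetryable) {
  let lastError;
  for (const candidate of fallbackCandidates(model)) {
    try {
      return { model: candidate, result: await callModel(candidate) };
    } catch (error) {
      lastError = error;
      if (!isRetryable(error)) throw error;
    }
  }
  // Every candidate failed with a retryable error.
  throw lastError;
}
```

Note that both aliases fall back to `deepseek-v4-pro` rather than to each other, since they resolve to the same underlying model.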
@@ -93,25 +91,36 @@ export class DeepSeekClient {
  * Build request params shared between streaming and non-streaming
  */
  buildRequestParams(params, stream) {
- // Transparently convert reasoner to chat + thinking
- // deepseek-reasoner = deepseek-chat with thinking always enabled
- // chat + thinking supports function calling, which reasoner alone does not
- let effectiveModel = params.model;
- let effectiveThinking = params.thinking;
- if (params.model === 'deepseek-reasoner') {
- effectiveModel = 'deepseek-chat';
- effectiveThinking = { type: 'enabled' };
- console.error('[DeepSeek MCP] Routing: reasoner -> chat + thinking');
+ // Resolve the user-facing model + thinking flag to what the API expects.
+ // v4-flash / v4-pro are the live API models. deepseek-chat and
+ // deepseek-reasoner are compatibility aliases (the API retires those names
+ // on 2026-07-24), so they are translated to v4-flash here. The API defaults
+ // thinking to ENABLED, so we always send an explicit flag — including
+ // disabled to keep the historical fast (non-thinking) default.
+ let effectiveModel;
+ let effectiveThinking;
+ switch (params.model) {
+ case 'deepseek-reasoner':
+ // "thinking" alias: always reason
+ effectiveModel = 'deepseek-v4-flash';
+ effectiveThinking = { type: 'enabled' };
+ console.error('[DeepSeek MCP] Routing: deepseek-reasoner -> deepseek-v4-flash + thinking');
+ break;
+ case 'deepseek-chat':
+ // "non-thinking" alias: fast by default, but honour an explicit thinking:enabled
+ effectiveModel = 'deepseek-v4-flash';
+ effectiveThinking = params.thinking ?? { type: 'disabled' };
+ console.error('[DeepSeek MCP] Routing: deepseek-chat -> deepseek-v4-flash');
+ break;
+ default:
+ // deepseek-v4-flash / deepseek-v4-pro: pass through, default to non-thinking
+ effectiveModel = params.model;
+ effectiveThinking = params.thinking ?? { type: 'disabled' };
  }
- const isThinkingEnabled = effectiveThinking?.type === 'enabled';
- // Filter incompatible params when thinking mode is active
+ const isThinkingEnabled = effectiveThinking.type === 'enabled';
+ // Warn when caller-supplied sampling params will be ignored under thinking mode
  if (isThinkingEnabled) {
- const filtered = [];
- for (const key of THINKING_INCOMPATIBLE_PARAMS) {
- if (params[key] !== undefined) {
- filtered.push(key);
- }
- }
+ const filtered = THINKING_INCOMPATIBLE_PARAMS.filter((key) => params[key] !== undefined);
  if (filtered.length > 0) {
  console.error(`[DeepSeek MCP] Warning: Thinking mode active, ignoring incompatible params: ${filtered.join(', ')}`);
  }
@@ -119,17 +128,21 @@ export class DeepSeekClient {
  const requestParams = {
  model: effectiveModel,
  messages: params.messages,
- temperature: isThinkingEnabled ? undefined : (params.temperature ?? 1.0),
  max_tokens: params.max_tokens,
- top_p: isThinkingEnabled ? undefined : params.top_p,
- frequency_penalty: isThinkingEnabled ? undefined : params.frequency_penalty,
- presence_penalty: isThinkingEnabled ? undefined : params.presence_penalty,
  stop: params.stop,
  stream,
+ // Always explicit: the API's thinking default is enabled
+ thinking: effectiveThinking,
  };
- // Pass thinking as top-level param (OpenAI SDK v6 passes unknown props to the API body)
- if (effectiveThinking?.type === 'enabled') {
- requestParams.thinking = effectiveThinking;
+ // Sampling params only take effect in non-thinking mode
+ if (!isThinkingEnabled) {
+ requestParams.temperature = params.temperature ?? 1.0;
+ if (params.top_p !== undefined)
+ requestParams.top_p = params.top_p;
+ }
+ // reasoning_effort only applies while thinking
+ if (isThinkingEnabled && params.reasoning_effort) {
+ requestParams.reasoning_effort = params.reasoning_effort;
  }
  // Pass response_format for JSON mode
  if (params.response_format) {
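The two hunks above can be read together as one resolution step followed by conditional parameter assembly. The standalone sketch below restates that behaviour outside the class so it can be exercised directly. It follows the logic shown in the diff, but `resolveModelAndThinking` and `buildSketchRequest` are hypothetical helper names, and fields such as `max_tokens`, `stop`, `stream`, and `response_format` are omitted for brevity.

```javascript
// Illustrative sketch only; mirrors the alias and thinking rules shown in the diff.
function resolveModelAndThinking(model, thinking) {
  switch (model) {
    case 'deepseek-reasoner':
      // Thinking alias: always reasons, regardless of the caller's flag.
      return { model: 'deepseek-v4-flash', thinking: { type: 'enabled' } };
    case 'deepseek-chat':
      // Non-thinking alias: fast by default, explicit flag still honoured.
      return { model: 'deepseek-v4-flash', thinking: thinking ?? { type: 'disabled' } };
    default:
      // v4 models pass through; thinking defaults to disabled.
      return { model, thinking: thinking ?? { type: 'disabled' } };
  }
}

function buildSketchRequest(params) {
  const { model, thinking } = resolveModelAndThinking(params.model, params.thinking);
  // The thinking flag is always sent explicitly.
  const request = { model, messages: params.messages, thinking };

  if (thinking.type === 'enabled') {
    // reasoning_effort only applies while thinking.
    if (params.reasoning_effort) request.reasoning_effort = params.reasoning_effort;
  } else {
    // Sampling params only take effect in non-thinking mode.
    request.temperature = params.temperature ?? 1.0;
    if (params.top_p !== undefined) request.top_p = params.top_p;
  }
  return request;
}
```

For example, `buildSketchRequest({ model: 'deepseek-reasoner', messages: [], temperature: 0.2 })` yields a `deepseek-v4-flash` request with thinking enabled and no `temperature` field, which matches the warning the client logs about ignored sampling params.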
@@ -360,7 +373,7 @@ export class DeepSeekClient {
  async testConnection() {
  try {
  const response = await this.createChatCompletion({
- model: 'deepseek-chat',
+ model: 'deepseek-v4-flash',
  messages: [{ role: 'user', content: 'Hi' }],
  max_tokens: 10,
  });