@jsonstudio/rcc 0.89.1348 → 0.89.1488
- package/README.md +51 -1427
- package/configsamples/config.json +12 -4
- package/dist/build-info.js +2 -2
- package/dist/cli/commands/config.js +3 -0
- package/dist/cli/commands/config.js.map +1 -1
- package/dist/cli/commands/init.js +3 -0
- package/dist/cli/commands/init.js.map +1 -1
- package/dist/cli/config/bundled-docs.js +2 -2
- package/dist/cli/config/bundled-docs.js.map +1 -1
- package/dist/cli/config/init-config.d.ts +2 -1
- package/dist/cli/config/init-config.js +33 -1
- package/dist/cli/config/init-config.js.map +1 -1
- package/dist/client/gemini/gemini-protocol-client.js +2 -1
- package/dist/client/gemini/gemini-protocol-client.js.map +1 -1
- package/dist/client/gemini-cli/gemini-cli-protocol-client.js +67 -16
- package/dist/client/gemini-cli/gemini-cli-protocol-client.js.map +1 -1
- package/dist/client/openai/chat-protocol-client.js +2 -1
- package/dist/client/openai/chat-protocol-client.js.map +1 -1
- package/dist/client/responses/responses-protocol-client.js +2 -1
- package/dist/client/responses/responses-protocol-client.js.map +1 -1
- package/dist/error-handling/quiet-error-handling-center.js +46 -8
- package/dist/error-handling/quiet-error-handling-center.js.map +1 -1
- package/dist/manager/modules/quota/antigravity-quota-manager.d.ts +4 -0
- package/dist/manager/modules/quota/antigravity-quota-manager.js +130 -2
- package/dist/manager/modules/quota/antigravity-quota-manager.js.map +1 -1
- package/dist/manager/modules/quota/provider-quota-daemon.events.js +67 -4
- package/dist/manager/modules/quota/provider-quota-daemon.events.js.map +1 -1
- package/dist/manager/modules/quota/provider-quota-daemon.model-backoff.js +9 -6
- package/dist/manager/modules/quota/provider-quota-daemon.model-backoff.js.map +1 -1
- package/dist/modules/llmswitch/bridge.js +17 -4
- package/dist/modules/llmswitch/bridge.js.map +1 -1
- package/dist/modules/llmswitch/core-loader.d.ts +1 -1
- package/dist/modules/llmswitch/core-loader.js +15 -3
- package/dist/modules/llmswitch/core-loader.js.map +1 -1
- package/dist/providers/auth/antigravity-userinfo-helper.d.ts +5 -2
- package/dist/providers/auth/antigravity-userinfo-helper.js +63 -8
- package/dist/providers/auth/antigravity-userinfo-helper.js.map +1 -1
- package/dist/providers/auth/gemini-cli-userinfo-helper.js +66 -4
- package/dist/providers/auth/gemini-cli-userinfo-helper.js.map +1 -1
- package/dist/providers/auth/oauth-lifecycle.js +112 -1
- package/dist/providers/auth/oauth-lifecycle.js.map +1 -1
- package/dist/providers/auth/tokenfile-auth.d.ts +14 -0
- package/dist/providers/auth/tokenfile-auth.js +125 -2
- package/dist/providers/auth/tokenfile-auth.js.map +1 -1
- package/dist/providers/core/config/camoufox-launcher.d.ts +5 -0
- package/dist/providers/core/config/camoufox-launcher.js +5 -0
- package/dist/providers/core/config/camoufox-launcher.js.map +1 -1
- package/dist/providers/core/config/service-profiles.js +7 -18
- package/dist/providers/core/config/service-profiles.js.map +1 -1
- package/dist/providers/core/runtime/base-provider.d.ts +0 -5
- package/dist/providers/core/runtime/base-provider.js +26 -112
- package/dist/providers/core/runtime/base-provider.js.map +1 -1
- package/dist/providers/core/runtime/gemini-cli-http-provider.d.ts +6 -0
- package/dist/providers/core/runtime/gemini-cli-http-provider.js +409 -100
- package/dist/providers/core/runtime/gemini-cli-http-provider.js.map +1 -1
- package/dist/providers/core/runtime/http-request-executor.d.ts +3 -0
- package/dist/providers/core/runtime/http-request-executor.js +110 -38
- package/dist/providers/core/runtime/http-request-executor.js.map +1 -1
- package/dist/providers/core/runtime/http-transport-provider.d.ts +3 -0
- package/dist/providers/core/runtime/http-transport-provider.js +89 -39
- package/dist/providers/core/runtime/http-transport-provider.js.map +1 -1
- package/dist/providers/core/runtime/rate-limit-manager.d.ts +1 -12
- package/dist/providers/core/runtime/rate-limit-manager.js +4 -77
- package/dist/providers/core/runtime/rate-limit-manager.js.map +1 -1
- package/dist/providers/core/utils/http-client.js +20 -43
- package/dist/providers/core/utils/http-client.js.map +1 -1
- package/dist/runtime/wasm-runtime/wasm-config.d.ts +73 -0
- package/dist/runtime/wasm-runtime/wasm-config.js +124 -0
- package/dist/runtime/wasm-runtime/wasm-config.js.map +1 -0
- package/dist/runtime/wasm-runtime/wasm-loader.d.ts +40 -0
- package/dist/runtime/wasm-runtime/wasm-loader.js +62 -0
- package/dist/runtime/wasm-runtime/wasm-loader.js.map +1 -0
- package/dist/server/handlers/handler-utils.js +5 -1
- package/dist/server/handlers/handler-utils.js.map +1 -1
- package/dist/server/handlers/responses-handler.js +1 -1
- package/dist/server/handlers/responses-handler.js.map +1 -1
- package/dist/server/runtime/http-server/index.js +121 -30
- package/dist/server/runtime/http-server/index.js.map +1 -1
- package/dist/server/runtime/http-server/request-executor.js +50 -6
- package/dist/server/runtime/http-server/request-executor.js.map +1 -1
- package/dist/server/runtime/http-server/routes.js +4 -1
- package/dist/server/runtime/http-server/routes.js.map +1 -1
- package/dist/utils/strip-internal-keys.d.ts +12 -0
- package/dist/utils/strip-internal-keys.js +28 -0
- package/dist/utils/strip-internal-keys.js.map +1 -0
- package/docs/CHAT_PROCESS_PROTOCOL_AND_PIPELINE.md +221 -0
- package/docs/antigravity-gemini-format-cleanup.md +143 -0
- package/docs/antigravity-routing-contract.md +31 -0
- package/docs/chat-semantic-expansion-plan.md +8 -6
- package/docs/glm-chat-completions.md +1 -1
- package/docs/llms-wasm-migration.md +331 -0
- package/docs/llms-wasm-module-boundaries.md +588 -0
- package/docs/llms-wasm-replay-baseline.md +171 -0
- package/docs/plans/llms-wasm-migration-plan.md +401 -0
- package/docs/servertool-framework.md +65 -0
- package/docs/v2-architecture/README.md +6 -8
- package/docs/verified-configs/README.md +60 -0
- package/docs/verified-configs/v0.45.0/README.md +244 -0
- package/docs/verified-configs/v0.45.0/lmstudio-5521-gpt-oss-20b-mlx.json +135 -0
- package/docs/verified-configs/v0.45.0/merged-config.5521.json +1205 -0
- package/docs/verified-configs/v0.45.0/merged-config.qwen-5522.json +1559 -0
- package/docs/verified-configs/v0.45.0/qwen-5522-qwen3-coder-plus-final.json +221 -0
- package/docs/verified-configs/v0.45.0/qwen-5522-qwen3-coder-plus-fixed.json +242 -0
- package/docs/verified-configs/v0.45.0/qwen-5522-qwen3-coder-plus.json +242 -0
- package/package.json +17 -11
- package/scripts/antigravity-token-bridge.mjs +283 -0
- package/scripts/build-core.mjs +3 -1
- package/scripts/ci/repo-sanity.mjs +138 -0
- package/scripts/mock-provider/run-regressions.mjs +157 -1
- package/scripts/run-bg.sh +0 -14
- package/scripts/tests/ci-jest.mjs +119 -0
- package/scripts/tools-dev/responses-debug-client/README.md +23 -0
- package/scripts/tools-dev/responses-debug-client/payloads/poem.json +13 -0
- package/scripts/tools-dev/responses-debug-client/payloads/sample-no-tools.json +98 -0
- package/scripts/tools-dev/responses-debug-client/payloads/text.json +13 -0
- package/scripts/tools-dev/responses-debug-client/payloads/tool.json +27 -0
- package/scripts/tools-dev/responses-debug-client/run.mjs +65 -0
- package/scripts/tools-dev/responses-debug-client/src/index.ts +281 -0
- package/scripts/tools-dev/run-llmswitch-chat.mjs +53 -0
- package/scripts/tools-dev/server-tools-dev/run-web-fetch.mjs +65 -0
- package/scripts/vendor-core.mjs +13 -3
- package/scripts/test-fc-responses.mjs +0 -66
- package/scripts/test-guidance.mjs +0 -100
- package/scripts/test-iflow-web-search.mjs +0 -141
- package/scripts/test-iflow.mjs +0 -379
- package/scripts/test-tool-exec.mjs +0 -26
@@ -0,0 +1,171 @@
+---
+title: "llms-wasm Replay Baseline"
+tags:
+- wasm
+- migration
+- baseline
+status: planning
+created: 2026-01-26
+---
+
+# llms-wasm Replay Baseline
+
+> [!important] Goal
+> Build a repeatable, traceable, and sanitizable replay baseline for TS/WASM dual-run comparison and a closed diff-repair loop.
+
+---
+
+## 1. Sampling Strategy
+
+> [!tip] Layered coverage
+> Sampling should cover typical **model / tool / routing / SSE** scenarios and retain long-tail samples to surface edge-case differences.
+
+### 1.1 Sample Stratification
+
+- **Model dimension**: high-traffic models + long-tail models (cover at least each provider's primary model)
+- **Tool dimension**: tool_calls + function_call + no-tool scenarios
+- **Routing dimension**: primary path / failover / alias switching / cooldown recovery
+- **SSE dimension**: streaming chunks (including chunk-splitting differences) + non-streaming responses
+
+### 1.2 Sampling Ratios (Suggested)
+
+- Full sampling: 0.1% - 1% (stratified by tenant/route)
+- Long-tail sampling: dynamically weighted by error rate / diff rate
+- Replay baseline: a fixed set of 1000 - 5000 stable samples
+
+### 1.3 Sampling Triggers
+
+- When the diff rate rises: automatically raise the sampling ratio for the affected module
+- On a new release: take extra samples and tag them with `build_version`
+- During key module transition stages (shadow → canary → default): sampling is mandatory
+
+---
+
+## 2. Storage Format
+
+> [!warning] Must be sanitizable
+> Every field must pass PII/secret sanitization; the stored structure must remain replayable.
+
+### 2.1 Baseline Record Structure (JSON)
+
+```json
+{
+  "request_id": "req_123",
+  "tenant": "tenant_a",
+  "route": "route_main",
+  "timestamp": "2026-01-26T10:00:00Z",
+  "request": {
+    "endpoint": "/v1/chat/completions",
+    "payload": {"...": "..."}
+  },
+  "response": {
+    "status": 200,
+    "payload": {"...": "..."}
+  },
+  "metadata": {
+    "provider": "anthropic",
+    "model": "claude-3-5",
+    "compatibility_profile": "chat-gemini",
+    "runtime_main": "ts",
+    "runtime_shadow": "wasm"
+  }
+}
+```
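The record structure above can be mirrored by a type so that a replay runner rejects incomplete records early. The following is a minimal sketch, not the project's actual definition: the field names come from the JSON example, while `BaselineRecord`, `RuntimeName`, and `missingIdentifierFields` are hypothetical names introduced for illustration.

```typescript
// Hypothetical type names; only the field names are taken from the JSON example.
type RuntimeName = "ts" | "wasm";

interface BaselineRecord {
  request_id: string;
  tenant: string;
  route: string;
  timestamp: string; // ISO 8601, e.g. "2026-01-26T10:00:00Z"
  request: { endpoint: string; payload: unknown };
  response: { status: number; payload: unknown };
  metadata: {
    provider: string;
    model: string;
    compatibility_profile: string;
    runtime_main: RuntimeName;
    runtime_shadow: RuntimeName;
  };
}

// Returns the names of identifying string fields that are empty or whitespace.
function missingIdentifierFields(record: BaselineRecord): string[] {
  const values: { [key: string]: string } = {
    request_id: record.request_id,
    tenant: record.tenant,
    route: record.route,
    provider: record.metadata.provider,
    model: record.metadata.model,
    compatibility_profile: record.metadata.compatibility_profile,
    timestamp: record.timestamp,
  };
  return Object.keys(values).filter((key) => values[key].trim() === "");
}

const sampleRecord: BaselineRecord = {
  request_id: "req_123",
  tenant: "tenant_a",
  route: "route_main",
  timestamp: "2026-01-26T10:00:00Z",
  request: { endpoint: "/v1/chat/completions", payload: {} },
  response: { status: 200, payload: {} },
  metadata: {
    provider: "anthropic",
    model: "claude-3-5",
    compatibility_profile: "chat-gemini",
    runtime_main: "ts",
    runtime_shadow: "wasm",
  },
};
```

Keeping `payload` as `unknown` is deliberate in this sketch: the baseline stores whatever the endpoint exchanged, and only the identifying fields need a fixed shape.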
+
+### 2.2 Index Fields
+
+- `request_id`
+- `tenant`
+- `route`
+- `provider`
+- `model`
+- `compatibility_profile`
+- `timestamp`
+
+---
+
+## 3. Data Sanitization
+
+> [!important] Irreversible sanitization
+> PII and secrets must be processed irreversibly, but the structure must stay consistent so that records can be replayed.
+
+### 3.1 PII Handling
+
+- Field value → hash + mask
+- Keep structure and field names
+
+Example:
+
+```json
+{
+  "user_email": "sha256:***",
+  "user_name": "sha256:***"
+}
+```
+
+### 3.2 Secrets Handling
+
+- API keys / tokens: remove entirely
+- If the field must be present: replace with the fixed placeholder `"__redacted__"`
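The PII and secrets rules above can be sketched as one recursive pass over a JSON value. This is an illustration under stated assumptions: the key lists `SECRET_KEYS` and `PII_KEYS`, the `keepSecretKeys` option, and the function name `sanitize` are hypothetical, and the digest computation is left out, so PII values are written as the fully masked `sha256:***` form shown in the example.

```typescript
// Hypothetical key lists; a real deployment would load these from configuration.
const SECRET_KEYS = ["api_key", "authorization", "token"];
const PII_KEYS = ["user_email", "user_name"];

type JsonValue =
  | string
  | number
  | boolean
  | null
  | JsonValue[]
  | { [key: string]: JsonValue };

interface SanitizeOptions {
  // Secret keys listed here are kept as "__redacted__" instead of being removed.
  keepSecretKeys: string[];
}

function sanitize(value: JsonValue, options: SanitizeOptions): JsonValue {
  if (Array.isArray(value)) {
    return value.map((item) => sanitize(item, options));
  }
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: JsonValue } = {};
    for (const key of Object.keys(value)) {
      const lower = key.toLowerCase();
      if (SECRET_KEYS.indexOf(lower) !== -1) {
        // Secrets are dropped unless the field must stay present for replay.
        if (options.keepSecretKeys.indexOf(lower) !== -1) {
          out[key] = "__redacted__";
        }
        continue;
      }
      if (PII_KEYS.indexOf(lower) !== -1) {
        // Field name and position are preserved; only the value is masked.
        out[key] = "sha256:***";
        continue;
      }
      out[key] = sanitize(value[key], options);
    }
    return out;
  }
  return value; // primitives pass through unchanged
}
```

Because the function only rewrites values and never renames or reorders surviving keys, the sanitized record keeps the shape a replay runner expects.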
+
+---
+
+## 4. Baseline Version Snapshot Fields
+
+> [!note] Version snapshots must be pinned
+> A replay must state the TS/WASM/ruleset versions explicitly so that comparisons are reproducible.
+
+### 4.1 Required Fields
+
+- `ts_version` (@jsonstudio/llms version)
+- `wasm_version` (llms-wasm version)
+- `ruleset_version` (diff ruleset version)
+- `compat_profile_version` (compat profile version)
+- `sse_protocol_version` (SSE protocol version)
+
+Example:
+
+```json
+{
+  "ts_version": "0.6.1172",
+  "wasm_version": "0.1.0",
+  "ruleset_version": "diff_ruleset_v1",
+  "compat_profile_version": "compat_profiles_v1",
+  "sse_protocol_version": "sse_protocol_v1"
+}
+```
+
+---
+
+## 5. Replay Runner
+
+> [!tip] Replay mode
+> Prefer offline replay; if online verification is needed, it must be isolated from the production primary path.
+
+### 5.1 Offline Replay
+
+- Input: the fixed baseline set
+- Output: diff results
+- Automatic attribution to a module category (compat / logic / nondeterministic)
+
+### 5.2 Online Replay (Optional)
+
+- Runs in shadow mode
+- Records only diff summaries
+- Does not affect the primary path
+
+---
+
+## 6. Retention Policy
+
+- **Full replay set**: 7-14 days
+- **Sampled summaries**: 30-90 days
+- **Historical archive**: archived per version
+
+---
+
+## Related Documents
+
+- [[docs/llms-wasm-migration.md]] - migration plan overview
+- [[docs/plans/llms-wasm-migration-plan.md]] - actionable checklist
+- [[docs/llms-wasm-module-boundaries.md]] - module boundary list
@@ -0,0 +1,401 @@
+---
+title: "llms-wasm Incremental Replacement Migration Plan"
+tags:
+- migration
+- wasm
+- architecture
+status: planning
+priority: high
+created: 2026-01-26
+owners:
+- name: RouteCodex Team
+  role: Architecture
+---
+
+# llms-wasm Incremental Replacement Migration Plan
+
+> [!important] Core goal
+> Load llms-wasm and the llms TS implementation side by side so the same request can run on both; default traffic stays on TS while WASM runs as a shadow and produces diffs; diffs are traceable and fixable; modules are replaced one at a time, each with clear acceptance criteria and a rollback switch.
+
+## Success Metrics (Suggested Starting Thresholds)
+
+- diff rate ≤ 0.5% (may be stricter per module)
+- shadow error rate ≤ 0.1% (shadow side)
+- P95 latency delta ≤ +10ms (primary path vs. baseline)
+
+> [!note] Measurement dimensions
+> Metrics must be broken down by `tenant`, `route`, `module`, and `runtime(ts|wasm)`, and must be traceable by ruleset version.
+
+---
+
+## Phase 0: Boundaries and Baseline
+
+### 0.1 Module Boundary List (Contract + Ownership)
+
+> [!tip] Deliverable
+> Produce a "module boundary list" that drives replacement order, attribution, acceptance, and rollback.
+
+| Module | Input Contract | Output Contract | Owner / Fix Path | Repository |
+|---|---|---|---|---|
+| tokenizer/encoding | `text` | `tokens` | wasm core | llmswitch-core |
+| tool canonicalization | `ToolCallRaw[]` | `ToolCallCanonical[]` | wasm core | llmswitch-core |
+| compat | `UpstreamResponse` | `CanonicalResponse` | compat adapter | llmswitch-core |
+| streaming (SSE) | `SSEChunk[]` | `CanonicalSSEEvents[]` | wasm core | llmswitch-core |
+| routing | `RequestContext` | `ProviderTarget` | wasm core | llmswitch-core |
+
+> [!note] Dependency order
+> `tokenizer → tools → routing → compat → streaming` (may be adjusted to match the actual code, but must be recorded in the list).
+
+**Task list**:
+
+- [ ] Produce the module boundary list document (`docs/llms-wasm-module-boundaries.md`)
+- [ ] Define each module's input/output Contract (TypeScript interface)
+- [ ] Settle the dependency order and replacement priority
+- [ ] Confirm the Owner / fix path (wasm core vs compat adapter)
+
+> [!important] Additional requirements for this round (must land in Phase 0)
+> - **Unified tokenizer**: consolidate into a single implementation with a single entry point (single source of truth).
+> - **Unified SSE event protocol**: define the one canonical SSE event schema + diff protocol (event + token level).
+> - **Unified compat profile**: define and version it in llmswitch-core, and make it the sole trigger source.
+
+### 0.2 Replay Baseline
+
+- Fixed request set: covers typical model/tool/routing/SSE scenarios
+- Fixed version snapshot: records the TS and WASM version numbers + the core ruleset version
+- Replay modes: offline replay + online sampled replay (optional)
+
+> [!important] The replay baseline must be repeatable
+> - Inputs must be sanitized yet reproducible (structure unchanged)
+> - The shadow side must be able to replay the same request independently (same version, same ruleset)
+
+**Task list**:
+
+- [ ] Design the replay set sampling strategy (cover typical scenarios)
+- [ ] Implement the replay set storage format (JSON + sanitization)
+- [ ] Implement the replay runner (offline replay support)
+- [ ] Set up baseline version management (TS/WASM/ruleset version numbers)
+
+---
+
+## Phase 1: Dual Loading + Switch Matrix
+
+### 1.1 Dual-Load Initialization
+
+- TS runtime: the current `@jsonstudio/llms` (TS)
+- WASM runtime: `llms-wasm`
+- Goal: coexist in one process without interfering with each other; an initialization failure may degrade (no silent fallback; the error must be reported)
+
+> [!warning] Fail fast + observable
+> A WASM initialization failure must be reported through `providerErrorCenter` + `errorHandlingCenter`; the primary path still runs TS according to the mode.
+
+**Task list**:
+
+- [ ] Host implements dual-load initialization (`src/server/runtime/http-server/dual-runtime.ts`)
+- [ ] Report WASM initialization failures (via `providerErrorCenter`)
+- [ ] Verify that the two runtimes do not interfere (isolation tests)
+
+### 1.2 Runtime Modes
+
+- `ts_primary`: TS on the primary path (default)
+- `shadow`: TS on the primary path, WASM as shadow
+- `wasm_primary`: WASM on the primary path, TS as shadow
+- `split`: traffic split by ratio (for gradual cutover)
+
+**Task list**:
+
+- [ ] Define the RuntimeMode type and enum
+- [ ] Implement mode-switching logic (Host side)
+
+### 1.3 Switch Priority and Scope
+
+> [!important] Priority (mandatory)
+> Global > tenant > route > request
+
+| Scope | Example | Notes |
+|---|---|---|
+| Global (process level) | env `ROUTECODEX_WASM_MODE=shadow` | Default behavior |
+| Tenant level | config `tenants[*].wasmMode` | Overrides global |
+| Route level | virtual router `routes[*].wasmMode` | Overrides tenant |
+| Request level | header `X-WASM-Mode: wasm_primary` | Per-request override |
+
+> [!note] Where switches are read
+> The Host only "reads and dispatches decisions"; the actual logic (including canonicalization and the diff protocol) runs in llmswitch-core.
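The scope table above can be read as a resolution chain over four optional sources. The sketch below is one possible reading, not the Host's implementation: it assumes that narrower scopes override broader ones, as the table's override column states, and the names `ModeSources`, `parseRuntimeMode`, and `resolveRuntimeMode` are hypothetical. Only the four mode strings and the switch locations come from the plan.

```typescript
type RuntimeMode = "ts_primary" | "shadow" | "wasm_primary" | "split";

const RUNTIME_MODES: RuntimeMode[] = ["ts_primary", "shadow", "wasm_primary", "split"];

// Accepts a raw switch value and returns a mode only if it is one of the four known strings.
function parseRuntimeMode(raw: string | undefined): RuntimeMode | undefined {
  if (raw === undefined) return undefined;
  const index = RUNTIME_MODES.indexOf(raw.trim() as RuntimeMode);
  return index === -1 ? undefined : RUNTIME_MODES[index];
}

interface ModeSources {
  globalEnv?: string;     // value of ROUTECODEX_WASM_MODE
  tenantMode?: string;    // tenants[*].wasmMode
  routeMode?: string;     // routes[*].wasmMode
  requestHeader?: string; // X-WASM-Mode header
}

// Assumption: the narrowest scope that carries a valid value wins.
// Reversing the candidate order would implement the opposite reading.
function resolveRuntimeMode(sources: ModeSources): RuntimeMode {
  const candidates = [
    sources.requestHeader,
    sources.routeMode,
    sources.tenantMode,
    sources.globalEnv,
  ];
  for (const raw of candidates) {
    const mode = parseRuntimeMode(raw);
    if (mode !== undefined) return mode;
  }
  return "ts_primary"; // the documented default mode
}
```

Unknown values are skipped rather than treated as errors in this sketch; a stricter Host could report them instead, in line with the plan's fail-fast requirement.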
+
+**Task list**:
+
+- [ ] Implement switch reading logic (Host side)
+- [ ] Implement priority resolution (global > tenant > route > request)
+- [ ] Update the config schema (support the wasmMode field)
+- [ ] Document the switch priority matrix
+
+---
+
+## Phase 2: Shadow + Diff Mechanism (Including the SSE Protocol)
+
+### 2.1 Shadow Request Dispatch
+
+- The primary request executes normally
+- The shadow request executes asynchronously (does not block the primary response)
+- Shadow failures are not retried (default); if retries are needed, they must be configurable and bounded by an error budget
+
+**Task list**:
+
+- [ ] Host implements shadow request dispatch (asynchronous, non-blocking)
+- [ ] Record shadow failures (error summary, no impact on the primary path)
+- [ ] Implement shadow retry configuration (optional, with an error budget)
+
+### 2.2 Diff Protocol and Canonicalization (in llmswitch-core)
+
+> [!warning] Canonicalization ownership
+> All canonicalization, diff rulesets, and comparison logic must live in llmswitch-core; the Host does no field patching.
+
+#### General Diff Rules (Noise Reduction)
+
+- Exclude/normalize: `timestamp`, random ids, trace ids, etc.
+- Tolerance policy: floating-point tolerance, unordered sets, JSON formatting
+- Nondeterminism: whitelist + versioned ruleset (see Appendix A)
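The noise-reduction rules above amount to a tolerant deep comparison. The following sketch shows one way to express them; `DiffNoiseRules`, `canonicalEqual`, and the helper names are hypothetical, and comparing unordered arrays by sorting on their JSON text is a simplification that ignores float tolerance inside array elements when ordering them.

```typescript
interface DiffNoiseRules {
  ignoredFields: string[];        // dropped before comparison (timestamps, ids, ...)
  floatTolerance: number;         // absolute tolerance applied to every number pair
  unorderedArrayFields: string[]; // arrays under these keys are compared as sets
}

function isPlainObject(value: unknown): value is { [key: string]: unknown } {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Sorts a copy of the array by each element's JSON text to get a stable order.
function sortByJson(values: unknown[]): unknown[] {
  return values.slice().sort((x, y) => {
    const sx = JSON.stringify(x);
    const sy = JSON.stringify(y);
    return sx < sy ? -1 : sx > sy ? 1 : 0;
  });
}

function canonicalEqual(
  a: unknown,
  b: unknown,
  rules: DiffNoiseRules,
  unordered: boolean = false
): boolean {
  if (typeof a === "number" && typeof b === "number") {
    return Math.abs(a - b) <= rules.floatTolerance;
  }
  if (Array.isArray(a) && Array.isArray(b)) {
    if (a.length !== b.length) return false;
    const left = unordered ? sortByJson(a) : a;
    const right = unordered ? sortByJson(b) : b;
    return left.every((item, i) => canonicalEqual(item, right[i], rules));
  }
  if (isPlainObject(a) && isPlainObject(b)) {
    const keep = (key: string) => rules.ignoredFields.indexOf(key) === -1;
    const keysA = Object.keys(a).filter(keep).sort();
    const keysB = Object.keys(b).filter(keep).sort();
    if (keysA.join("\u0000") !== keysB.join("\u0000")) return false;
    return keysA.every((key) =>
      canonicalEqual(a[key], b[key], rules, rules.unorderedArrayFields.indexOf(key) !== -1)
    );
  }
  return a === b; // strings, booleans, null, and mismatched types
}
```

Comparing parsed values rather than raw text is what absorbs JSON formatting differences such as key order and whitespace.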
+
+#### SSE Comparison Protocol (event + token)
+
+> [!tip] Core principle
+> Differences in chunk splitting are allowed, but the final token sequence must match; comparison is based on the event-level schema.
+
+Canonical event schema:
+
+- `event_type`
+- `ordinal_index`
+- `normalized_payload`
+- `token_digest` (optional, for token-level verification)
+
+Decision logic:
+
+- The event sequence may tolerate "segmentation differences caused by chunk splitting"
+- Pass if the final token_digest sequences match
+- On failure, output a minimal diff summary (added/removed/mismatch)
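The decision logic above can be illustrated by merging adjacent events of the same type before comparing, so that chunk boundaries stop mattering. This is a simplified sketch: it assumes `normalized_payload` is the text carried by an event and compares concatenated text instead of a real `token_digest`; `CanonicalSseEvent`, `coalesce`, and `diffSse` are hypothetical names. Here "added" counts segments present only on the shadow side and "removed" counts segments present only on the primary side.

```typescript
interface CanonicalSseEvent {
  event_type: string;
  ordinal_index: number;
  normalized_payload: string; // assumed to be the event's text content
}

interface SseDiffSummary {
  pass: boolean;
  added: number;
  removed: number;
  mismatch: number;
}

// Merges adjacent events of the same type into one segment.
function coalesce(events: CanonicalSseEvent[]): { type: string; text: string }[] {
  const ordered = events.slice().sort((x, y) => x.ordinal_index - y.ordinal_index);
  const merged: { type: string; text: string }[] = [];
  for (const event of ordered) {
    const last = merged[merged.length - 1];
    if (last !== undefined && last.type === event.event_type) {
      last.text += event.normalized_payload;
    } else {
      merged.push({ type: event.event_type, text: event.normalized_payload });
    }
  }
  return merged;
}

function diffSse(main: CanonicalSseEvent[], shadow: CanonicalSseEvent[]): SseDiffSummary {
  const a = coalesce(main);
  const b = coalesce(shadow);
  const shared = Math.min(a.length, b.length);
  let mismatch = 0;
  for (let i = 0; i < shared; i++) {
    if (a[i].type !== b[i].type || a[i].text !== b[i].text) mismatch++;
  }
  const added = Math.max(0, b.length - a.length);
  const removed = Math.max(0, a.length - b.length);
  return { pass: mismatch + added + removed === 0, added, removed, mismatch };
}
```

Two streams that split "Hello" as "Hel" + "lo" and as "H" + "ello" coalesce to the same segment and therefore pass.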
+
+**Task list**:
+
+- [ ] Implement the general diff rules (noise reduction, tolerance policy)
+- [ ] Implement the SSE canonicalization schema
+- [ ] Implement the SSE diff decision logic (event + token level)
+- [ ] Implement diff ruleset versioning (`diff_ruleset_v1`)
+
+### 2.3 Diff Records (Bound to a Ruleset)
+
+Every diff must include:
+
+- `requestId` / `tenant` / `route`
+- `runtime_main` / `runtime_shadow`
+- `ruleset_version`
+- `diff_summary` (indexable)
+- `payload_refs` (optional: used for replay; must be sanitized when stored)
+
+**Task list**:
+
+- [ ] Design the diff record schema
+- [ ] Implement diff storage (indexes + retention policy)
+- [ ] Implement sanitization rules (PII hash + mask, secrets removed entirely)
+- [ ] Implement the diff query interface (by requestId/ruleset)
+
+---
+
+## Phase 3: Ownership and Fix Paths
+
+### 3.1 Ownership Table (Owner + Fix Location)
+
+> [!important] Fix entry point
+> Every diff must be attributable to a "fix path": `compat` vs `wasm core` (the Host is never a fix entry point).
+
+| Module | Owner | Fix Path | Entry Function / File (Example) |
+|---|---|---|---|
+| tokenizer | @team/core | wasm core | `llmswitch-core/src/tokenizer/*` |
+| tools | @team/tools | wasm core | `llmswitch-core/src/tools/*` |
+| compat | @team/compat | compat adapter | `llmswitch-core/src/conversion/compat/*` |
+| streaming | @team/streaming | wasm core | `llmswitch-core/src/streaming/*` |
+| routing | @team/routing | wasm core | `llmswitch-core/src/routing/*` |
+
+**Task list**:
+
+- [ ] Produce the ownership table (`docs/llms-wasm-ownership-table.md`)
+- [ ] Implement automatic attribution (diff → module + owner + fix location)
+
+### 3.2 Fix Loop
+
+1. Collect shadow diffs → 2. Classify (compat/logic/nondeterministic) → 3. Fix → 4. Verify with baseline replay → 5. Diff rate drops → 6. Advance the module's replacement stage
+
+**Task list**:
+
+- [ ] Implement the diff classifier (compat_issue / logic_bug / nondeterministic / data_quality)
+- [ ] Implement fix suggestion generation (infer the fix location automatically)
+- [ ] Integrate baseline replay verification (automatic regression after a fix)
+- [ ] Implement diff downtrend monitoring
+
+---
+
+## Phase 4: Module-Level Replacement (shadow → canary → default)
+
+### 4.1 Suggested Replacement Order (Low Risk → High Risk)
+
+1. tokenizer/encoding
+2. tool canonicalization
+3. compat layer
+4. streaming response formatting (SSE)
+5. routing decision logic
+
+**Task list**:
+
+- [ ] Confirm the replacement order (risk rating)
+- [ ] Define each module's acceptance thresholds (diff rate / error rate / latency delta)
+- [ ] Define observation period lengths (shadow/canary/default)
+
+### 4.2 Replacement Gate for Each Module
+
+> [!note] State machine
+> `shadow → canary → default → deprecated → removed`
+
+- shadow: TS on the primary path, WASM as shadow; run the full observation period (1-2 weeks suggested)
+- canary: WASM on the primary path for a small share of traffic (5-10%), TS as shadow; the stage most sensitive to stability
+- default: WASM on the primary path for most traffic (50-100%); the TS shadow can be phased out
+- deprecated: TS is kept for 2-3 releases as a fallback
+
+#### Acceptance Thresholds (Example, Adjust per Module)
+
+- diff rate: set a threshold per module (tokenizer can go down to 0.01%, streaming up to 0.5%)
+- error rate: ≤ 0.1% (stricter in the canary stage)
+- latency delta: P95 ≤ +10ms (or adjusted per module)
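The state machine and thresholds above combine into a small gate check. The sketch below is illustrative only: rates are expressed as fractions (0.5% is `0.005`), the names `GateThresholds`, `GateMetrics`, `meetsGate`, and `nextStage` are hypothetical, and the choice to never advance automatically past `deprecated` is an assumption motivated by the plan's rule that deletions need approval.

```typescript
type ModuleStage = "shadow" | "canary" | "default" | "deprecated" | "removed";

const STAGE_ORDER: ModuleStage[] = ["shadow", "canary", "default", "deprecated", "removed"];

interface GateThresholds {
  maxDiffRate: number;          // fraction, e.g. 0.005 for 0.5%
  maxErrorRate: number;         // fraction, e.g. 0.001 for 0.1%
  maxP95LatencyDeltaMs: number; // e.g. 10
}

interface GateMetrics {
  diffRate: number;
  errorRate: number;
  p95LatencyDeltaMs: number;
}

function meetsGate(metrics: GateMetrics, thresholds: GateThresholds): boolean {
  return (
    metrics.diffRate <= thresholds.maxDiffRate &&
    metrics.errorRate <= thresholds.maxErrorRate &&
    metrics.p95LatencyDeltaMs <= thresholds.maxP95LatencyDeltaMs
  );
}

// Advances one stage when the gate is met; stays put otherwise.
// Assumption: "deprecated" never advances automatically, since removal needs approval.
function nextStage(
  current: ModuleStage,
  metrics: GateMetrics,
  thresholds: GateThresholds
): ModuleStage {
  if (current === "deprecated" || current === "removed") return current;
  if (!meetsGate(metrics, thresholds)) return current;
  return STAGE_ORDER[STAGE_ORDER.indexOf(current) + 1];
}
```

A rollback trigger would be the mirror image of `nextStage`, moving a module back one stage when a metric exceeds its threshold; it is omitted here to keep the sketch short.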
+
+**Task list**:
+
+- [ ] Implement the module state machine (shadow/canary/default/deprecated)
+- [ ] Implement automatic gate transitions (advance automatically once thresholds are met)
+- [ ] Implement rollback triggers (diff spike / error rate above threshold / performance degradation)
+- [ ] Document each module's acceptance thresholds
+
+---
+
+## Phase 5: Safety and Observability
+
+### 5.1 Required Metrics
+
+- `diff_rate` (by module/tenant/route/ruleset)
+- `shadow_error_rate` (by runtime)
+- `latency_delta_p50/p95/p99`
+- `wasm_init_time` / `wasm_memory`
+
+**Task list**:
+
+- [ ] Implement metric reporting (via `providerErrorCenter` + `errorHandlingCenter`)
+- [ ] Implement per-dimension metric aggregation (tenant/route/module/runtime/ruleset)
+- [ ] Implement the metric query interface (Prometheus/custom)
+
+### 5.2 Alerts and Error Budget
+
+- Diff spike (more than 2x the threshold)
+- Shadow error rate above threshold
+- Significant rise in primary-path latency
+- WASM initialization failure/crash
+
+> [!warning] Automatic rollback suggestion
+> If automatic rollback is available: use "error rate + latency + diff spike" as triggers, rolling back to `ts_primary` or rolling back a single module.
+
+**Task list**:
+
+- [ ] Implement alert rule configuration (diff/error/latency/threshold)
+- [ ] Implement automatic rollback triggers (single module / global)
+- [ ] Implement rollback switches (environment variable + config)
+- [ ] Document the alert and rollback process
+
+---
+
+## Phase 6: Final Cutover and Cleanup
+
+### 6.1 Version Strategy
+
+- Keep the TS logic for 2-3 releases (suggested)
+- Mark migration status and switch instructions in the release notes
+- Before removing the dual-path diff logic: keep the ruleset versions and the historical diff archive
+
+**Task list**:
+
+- [ ] Define the TS retention policy (number of versions / duration)
+- [ ] Update the release note template (migration status)
+- [ ] Implement ruleset version archiving
+- [ ] Implement historical diff archiving
+
+### 6.2 Cleanup Principles
+
+> [!important] No deletions without approval
+> Deleting existing files or removing TS paths at scale requires a proposal and confirmation first.
+
+Cleanup candidates (listed only, not executed directly):
+
+- Remove TS-only implementations (after confirming there is no traffic)
+- Remove the dual-path diff code (if auditing is no longer needed)
+- Update docs/ and the CLI runbook
+
+**Task list**:
+
+- [ ] Draft the cleanup proposal (requires approval)
+- [ ] Implement a cleanup script (optional, automation)
+- [ ] Update documentation (README / docs/)
+- [ ] Update the CLI runbook
+
+---
+
+## Appendix A: Ruleset Versioning Specification
+
+### A.1 Naming and Location
+
+- Naming: `diff_ruleset_vN`
+- Location: `llmswitch-core/src/diff/rulesets/`
+
+### A.2 Required Fields
+
+- `version` (v1/v2/...)
+- `created_at` (date)
+- `field_whitelist` (ignored fields)
+- `tolerance_policy` (tolerance policy)
+- `nondeterministic_rules` (nondeterminism rules)
+- `sse_protocol` (SSE protocol version)
+
+### A.3 Upgrade Process
+
+1. Add a new v(N+1) ruleset (leave old versions unchanged)
+2. Switch the `current_ruleset` pointer
+3. Every diff record writes `ruleset_version`
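The upgrade process above implies an append-only registry with a movable pointer. The sketch below shows that idea under stated assumptions: the field names follow section A.2, while `DiffRulesetSpec`, `RulesetRegistry`, `rulesetName`, and the inner shape of `tolerance_policy` are hypothetical.

```typescript
interface DiffRulesetSpec {
  version: string;                            // "v1", "v2", ...
  created_at: string;                         // date string
  field_whitelist: string[];                  // fields ignored during comparison
  tolerance_policy: { float_epsilon: number }; // hypothetical shape
  nondeterministic_rules: string[];
  sse_protocol: string;
}

// Builds the documented name, e.g. "diff_ruleset_v1".
function rulesetName(spec: DiffRulesetSpec): string {
  return `diff_ruleset_${spec.version}`;
}

class RulesetRegistry {
  private rulesets: { [version: string]: DiffRulesetSpec } = {};
  private current: string | undefined = undefined;

  // Step 1: add a new version; existing versions can never be overwritten.
  add(spec: DiffRulesetSpec): void {
    if (Object.prototype.hasOwnProperty.call(this.rulesets, spec.version)) {
      throw new Error(`ruleset ${spec.version} already exists and is immutable`);
    }
    this.rulesets[spec.version] = Object.freeze(spec);
  }

  // Step 2: move the current pointer to a registered version.
  setCurrent(version: string): void {
    if (!Object.prototype.hasOwnProperty.call(this.rulesets, version)) {
      throw new Error(`unknown ruleset ${version}`);
    }
    this.current = version;
  }

  // Step 3: diff writers read the current ruleset to stamp ruleset_version.
  getCurrent(): DiffRulesetSpec {
    if (this.current === undefined) {
      throw new Error("no current ruleset selected");
    }
    return this.rulesets[this.current];
  }
}
```

Keeping old versions immutable is what lets a historical diff be re-evaluated later against the exact rules that produced it.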
+
+**Task list**:
+
+- [ ] Create the ruleset directory structure
+- [ ] Implement `diff_ruleset_v1` (initial version)
+- [ ] Implement ruleset version management (current pointer)
+- [ ] Document the ruleset upgrade process
+
+---
+
+## Appendix B: Diff Storage and Sanitization
+
+### B.1 Index Fields
+
+- `requestId`
+- `tenant`
+- `route`
+- `ruleset_version`
+- `timestamp`
+
+### B.2 Retention Policy
+
+- Full records: 7-14 days
+- Sampled summaries: 30-90 days
+
+### B.3 Sanitization Rules
+
+- PII: hash + mask (structure preserved)
+- secrets: removed entirely (never stored)
+- stack: stack hash (for clustering); keep a truncated stack when necessary (access-controlled)
+
+**Task list**:
+
+- [ ] Design diff storage indexes (support efficient queries)
+- [ ] Implement the retention policy (full / sampled / TTL)
+- [ ] Implement sanitization rules (PII / secrets / stack)
@@ -0,0 +1,65 @@
+# ServerTool Framework Design Draft
+
+A unified Server-Side Tool (ServerTool) framework whose goal is to handle every "tool executed by the server" through one standard flow: web_search, vision followup (image model → text model), and future custom tools.
+
+## Core Flow (Unified 5 Steps)
+
+1. **Registration & initialization**
+   - Provide a `ServerToolRegistry`; tool handlers are registered when llmswitch-core initializes.
+   - Configuration (`virtualrouter.webSearch`, etc.) decides which tools are enabled and which backend engines they use (providerKey / model / parameters).
+
+2. **Match conditions & tool injection**
+   - Request stage: the Hub's tool governance layer, based on classification results and configuration, performs:
+     - tool **namespace/name normalization** (e.g. unifying the `web_search` function name);
+     - tool **forced injection** (force) or **conditional injection** (selective) into the tools list;
+   - Response stage: uniformly extract `tool_call` entries from the model's reply (OpenAI style / Responses style) and match each by name against the ServerTool handlers in the registry.
+
+3. **Tool call interception & execution**
+   - In `runServerSideToolEngine()`:
+     - Collect all matched `tool_call` entries (multiple tools and multiple calls are supported).
+     - For each `tool_call`, invoke the matching ServerTool handler, which is responsible for:
+       - parsing arguments;
+       - selecting the backend engine according to configuration (e.g. Gemini CLI, GLM);
+       - calling the backend provider through `providerInvoker` (HTTP + compat are already handled by llmswitch-core/Host).
+
+4. **Result normalization & virtual tool response**
+   - Each handler returns a unified structure:
+     - a structured result payload (e.g. `summary + hits[] + engine` for web_search).
+     - a "virtual tool message" bound to the original `tool_call_id`:
+       ```jsonc
+       {
+         "role": "tool",
+         "tool_call_id": "<original tool_call.id>",
+         "name": "<tool_name>",
+         "content": "<JSON string or structured tokens>"
+       }
+       ```
+   - The ServerTool framework inserts these virtual tool messages into ChatEnvelope.messages, forming the standard "tool call + tool result" shape for subsequent inference.
+
+5. **Second request (transparent to the client / main model)**
+   - The framework hands the updated ChatEnvelope back to the Hub Pipeline:
+     - For the Responses protocol: as a "responses response carrying tool results", which then passes through tool governance / finalize → the client sees an answer that already incorporates the tool results.
+     - For the Chat protocol: the main model continues with another turn per the OpenAI tool spec (treated as a second hop), invoking the conversation/routing logic, but this layer is transparent to the client and the provider runtime.
+   - Throughout, the client still sees only one `/v1/responses` / `/v1/chat/completions` request; all ServerTool calls and replays are encapsulated inside llmswitch-core.
+
+## Module Layout
+
+- `llmswitch-core/src/servertool/registry` (planned)
+  - Registers and looks up ServerTool handlers.
+  - Provides an interface to look up a handler by tool name (e.g. `web_search`).
+
+- `llmswitch-core/src/servertool/engine` (planned)
+  - Replaces/wraps the existing `runServerSideToolEngine()` implementation:
+    - uniformly extract tool calls;
+    - invoke handlers;
+    - inject virtual tool messages;
+    - trigger the second hop or return the updated ChatEnvelope.
+
+- `llmswitch-core/src/servertool/handlers/web_search` (planned)
+  - The concrete web_search handler:
+    - parse `query/engine/recency/count`;
+    - select the backend engine based on `virtualrouter.webSearch.engines` (Gemini CLI / GLM, etc.);
+    - call the backend provider, parse the result, and build a unified search result structure;
+    - return a virtual tool message bound to the original tool_call_id.
+
+Vision followup and other tools will later migrate to standalone handlers attached to the same framework, aiming for a unified "ServerTool framework + multiple handlers" pattern.
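Steps 1, 3, and 4 of the core flow can be sketched as a registry plus a loop that turns matched tool calls into virtual tool messages. This is a minimal illustration, not the planned implementation: `ServerToolRegistry` is named in the draft, but its methods here, the `ToolCallSketch` shape, and `runServerToolsSketch` are assumptions, and handlers are synchronous only to keep the example short (a real handler would call a backend provider asynchronously).

```typescript
interface ToolCallSketch {
  id: string;
  name: string;
  arguments: string; // JSON-encoded arguments, as in OpenAI-style tool calls
}

interface VirtualToolMessage {
  role: "tool";
  tool_call_id: string;
  name: string;
  content: string;
}

// Hypothetical handler signature: parsed arguments in, structured result out.
type ServerToolHandler = (args: { [key: string]: unknown }) => unknown;

class ServerToolRegistry {
  private handlers: { [name: string]: ServerToolHandler } = {};

  register(name: string, handler: ServerToolHandler): void {
    this.handlers[name] = handler;
  }

  lookup(name: string): ServerToolHandler | undefined {
    return Object.prototype.hasOwnProperty.call(this.handlers, name)
      ? this.handlers[name]
      : undefined;
  }
}

// Runs every tool call that has a registered handler and binds each result
// to the original tool_call id; unmatched calls are left untouched.
function runServerToolsSketch(
  calls: ToolCallSketch[],
  registry: ServerToolRegistry
): VirtualToolMessage[] {
  const messages: VirtualToolMessage[] = [];
  for (const call of calls) {
    const handler = registry.lookup(call.name);
    if (handler === undefined) continue; // not a server-side tool
    const result = handler(JSON.parse(call.arguments));
    messages.push({
      role: "tool",
      tool_call_id: call.id,
      name: call.name,
      content: JSON.stringify(result),
    });
  }
  return messages;
}
```

The returned messages are what a framework of this shape would append to the conversation before the second hop.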
@@ -455,17 +455,15 @@ provider → compatibility → llmswitch (final) → response
 #### llmswitch-core Exclusive Responsibilities
 Only `llmswitch-core` modules may perform:
 
-1. **Tool
+1. **Tool Calls Canonicalization**: Normalize tool_calls structure
    - Implementation: `sharedmodule/llmswitch-core/src/conversion/shared/tool-canonicalizer.ts`
-2. **
+2. **Argument Stringification**: Convert tool arguments to proper string format
    - Implementation: `sharedmodule/llmswitch-core/src/conversion/shared/tool-canonicalizer.ts`
-3. **
-   - Implementation: `sharedmodule/llmswitch-core/src/conversion/shared/tool-canonicalizer.ts`
-4. **Result Envelope Stripping**: Remove tool result wrapper envelopes
+3. **Result Envelope Stripping**: Remove tool result wrapper envelopes
    - Implementation: `sharedmodule/llmswitch-core/src/conversion/responses/responses-openai-bridge.ts`
-
-   - Implementation: `sharedmodule/llmswitch-core/src/conversion/shared/
-
+4. **Schema Augmentation**: Normalize/augment tool schemas (and inject tool guidance when enabled)
+   - Implementation: `sharedmodule/llmswitch-core/src/conversion/shared/tool-governor.ts` + `sharedmodule/llmswitch-core/src/guidance/index.ts`
+5. **finish_reason=tool_calls Patching**: Set correct finish reason for tool calls
    - Implementation: `sharedmodule/llmswitch-core/src/conversion/responses/responses-openai-bridge.ts`
 
 #### V2 Guardrails