@jsonstudio/rcc 0.89.1205 → 0.89.1457

Files changed (391)
  1. package/README.md +53 -1412
  2. package/configsamples/config.json +426 -0
  3. package/configsamples/config.reference.json +58 -0
  4. package/configsamples/provider/crs/config.v1.json +46 -0
  5. package/configsamples/provider/glm/config.v1.json +81 -0
  6. package/configsamples/provider/glm-anthropic/config.v1.json +45 -0
  7. package/configsamples/provider/iflow/config.v1.json +74 -0
  8. package/configsamples/provider/kimi/config.v1.json +41 -0
  9. package/configsamples/provider/lmstudio/config.v1.json +101 -0
  10. package/configsamples/provider/mimo/config.v1.json +35 -0
  11. package/configsamples/provider/modelscope/config.v1.json +96 -0
  12. package/configsamples/provider/qwen/config.v1.json +38 -0
  13. package/configsamples/provider/tab/config.v1.json +50 -0
  14. package/configsamples/provider/tabglm/config.v1.json +49 -0
  15. package/dist/build-info.js +2 -2
  16. package/dist/cli/commands/code.js +12 -6
  17. package/dist/cli/commands/code.js.map +1 -1
  18. package/dist/cli/commands/config.d.ts +2 -1
  19. package/dist/cli/commands/config.js +77 -103
  20. package/dist/cli/commands/config.js.map +1 -1
  21. package/dist/cli/commands/examples.js +6 -6
  22. package/dist/cli/commands/examples.js.map +1 -1
  23. package/dist/cli/commands/init.d.ts +28 -0
  24. package/dist/cli/commands/init.js +94 -0
  25. package/dist/cli/commands/init.js.map +1 -0
  26. package/dist/cli/commands/port.js +10 -2
  27. package/dist/cli/commands/port.js.map +1 -1
  28. package/dist/cli/commands/restart.js +5 -2
  29. package/dist/cli/commands/restart.js.map +1 -1
  30. package/dist/cli/commands/start.js +25 -22
  31. package/dist/cli/commands/start.js.map +1 -1
  32. package/dist/cli/commands/status.js +1 -0
  33. package/dist/cli/commands/status.js.map +1 -1
  34. package/dist/cli/commands/stop.js +1 -0
  35. package/dist/cli/commands/stop.js.map +1 -1
  36. package/dist/cli/config/bundled-docs.d.ts +20 -0
  37. package/dist/cli/config/bundled-docs.js +91 -0
  38. package/dist/cli/config/bundled-docs.js.map +1 -0
  39. package/dist/cli/config/init-config.d.ts +37 -0
  40. package/dist/cli/config/init-config.js +212 -0
  41. package/dist/cli/config/init-config.js.map +1 -0
  42. package/dist/cli/config/init-provider-catalog.d.ts +8 -0
  43. package/dist/cli/config/init-provider-catalog.js +187 -0
  44. package/dist/cli/config/init-provider-catalog.js.map +1 -0
  45. package/dist/cli/register/init-command.d.ts +3 -0
  46. package/dist/cli/register/init-command.js +5 -0
  47. package/dist/cli/register/init-command.js.map +1 -0
  48. package/dist/cli.js +28 -3
  49. package/dist/cli.js.map +1 -1
  50. package/dist/client/gemini/gemini-protocol-client.js +2 -1
  51. package/dist/client/gemini/gemini-protocol-client.js.map +1 -1
  52. package/dist/client/gemini-cli/gemini-cli-protocol-client.js +40 -16
  53. package/dist/client/gemini-cli/gemini-cli-protocol-client.js.map +1 -1
  54. package/dist/client/openai/chat-protocol-client.js +2 -1
  55. package/dist/client/openai/chat-protocol-client.js.map +1 -1
  56. package/dist/client/responses/responses-protocol-client.js +2 -1
  57. package/dist/client/responses/responses-protocol-client.js.map +1 -1
  58. package/dist/config/risk-control-config.d.ts +94 -0
  59. package/dist/config/risk-control-config.js +196 -0
  60. package/dist/config/risk-control-config.js.map +1 -0
  61. package/dist/constants/index.d.ts +6 -0
  62. package/dist/constants/index.js +13 -0
  63. package/dist/constants/index.js.map +1 -1
  64. package/dist/docs/daemon-admin-ui.html +2113 -190
  65. package/dist/error-handling/quiet-error-handling-center.js +46 -8
  66. package/dist/error-handling/quiet-error-handling-center.js.map +1 -1
  67. package/dist/index.js +0 -1
  68. package/dist/index.js.map +1 -1
  69. package/dist/manager/modules/health/index.d.ts +1 -1
  70. package/dist/manager/modules/quota/antigravity-quota-manager.d.ts +70 -0
  71. package/dist/manager/modules/quota/antigravity-quota-manager.js +442 -0
  72. package/dist/manager/modules/quota/antigravity-quota-manager.js.map +1 -0
  73. package/dist/manager/modules/quota/index.d.ts +3 -127
  74. package/dist/manager/modules/quota/index.js +2 -1093
  75. package/dist/manager/modules/quota/index.js.map +1 -1
  76. package/dist/manager/modules/quota/provider-key-normalization.d.ts +3 -0
  77. package/dist/manager/modules/quota/provider-key-normalization.js +155 -0
  78. package/dist/manager/modules/quota/provider-key-normalization.js.map +1 -0
  79. package/dist/manager/modules/quota/provider-quota-daemon.cooldown.d.ts +9 -0
  80. package/dist/manager/modules/quota/provider-quota-daemon.cooldown.js +115 -0
  81. package/dist/manager/modules/quota/provider-quota-daemon.cooldown.js.map +1 -0
  82. package/dist/manager/modules/quota/provider-quota-daemon.d.ts +77 -0
  83. package/dist/manager/modules/quota/provider-quota-daemon.events.d.ts +12 -0
  84. package/dist/manager/modules/quota/provider-quota-daemon.events.js +239 -0
  85. package/dist/manager/modules/quota/provider-quota-daemon.events.js.map +1 -0
  86. package/dist/manager/modules/quota/provider-quota-daemon.js +404 -0
  87. package/dist/manager/modules/quota/provider-quota-daemon.js.map +1 -0
  88. package/dist/manager/modules/quota/provider-quota-daemon.model-backoff.d.ts +11 -0
  89. package/dist/manager/modules/quota/provider-quota-daemon.model-backoff.js +192 -0
  90. package/dist/manager/modules/quota/provider-quota-daemon.model-backoff.js.map +1 -0
  91. package/dist/manager/modules/quota/provider-quota-daemon.snapshot.d.ts +8 -0
  92. package/dist/manager/modules/quota/provider-quota-daemon.snapshot.js +96 -0
  93. package/dist/manager/modules/quota/provider-quota-daemon.snapshot.js.map +1 -0
  94. package/dist/manager/modules/quota/provider-quota-daemon.view.d.ts +19 -0
  95. package/dist/manager/modules/quota/provider-quota-daemon.view.js +37 -0
  96. package/dist/manager/modules/quota/provider-quota-daemon.view.js.map +1 -0
  97. package/dist/manager/modules/routing/index.d.ts +1 -0
  98. package/dist/manager/modules/routing/index.js +11 -25
  99. package/dist/manager/modules/routing/index.js.map +1 -1
  100. package/dist/manager/quota/provider-quota-center.d.ts +2 -0
  101. package/dist/manager/quota/provider-quota-center.js +80 -82
  102. package/dist/manager/quota/provider-quota-center.js.map +1 -1
  103. package/dist/modules/llmswitch/bridge.d.ts +16 -18
  104. package/dist/modules/llmswitch/bridge.js +293 -94
  105. package/dist/modules/llmswitch/bridge.js.map +1 -1
  106. package/dist/modules/llmswitch/core-loader.d.ts +4 -2
  107. package/dist/modules/llmswitch/core-loader.js +32 -20
  108. package/dist/modules/llmswitch/core-loader.js.map +1 -1
  109. package/dist/modules/pipeline/utils/colored-logger.js +3 -2
  110. package/dist/modules/pipeline/utils/colored-logger.js.map +1 -1
  111. package/dist/modules/pipeline/utils/debug-logger.js +1 -1
  112. package/dist/modules/pipeline/utils/debug-logger.js.map +1 -1
  113. package/dist/providers/auth/antigravity-userinfo-helper.d.ts +2 -1
  114. package/dist/providers/auth/antigravity-userinfo-helper.js +25 -4
  115. package/dist/providers/auth/antigravity-userinfo-helper.js.map +1 -1
  116. package/dist/providers/auth/iflow-cookie-auth.js +0 -2
  117. package/dist/providers/auth/iflow-cookie-auth.js.map +1 -1
  118. package/dist/providers/auth/oauth-lifecycle.js +2 -23
  119. package/dist/providers/auth/oauth-lifecycle.js.map +1 -1
  120. package/dist/providers/auth/tokenfile-auth.d.ts +2 -0
  121. package/dist/providers/auth/tokenfile-auth.js +33 -1
  122. package/dist/providers/auth/tokenfile-auth.js.map +1 -1
  123. package/dist/providers/core/config/camoufox-launcher.d.ts +5 -0
  124. package/dist/providers/core/config/camoufox-launcher.js +40 -4
  125. package/dist/providers/core/config/camoufox-launcher.js.map +1 -1
  126. package/dist/providers/core/config/service-profiles.js +7 -18
  127. package/dist/providers/core/config/service-profiles.js.map +1 -1
  128. package/dist/providers/core/runtime/antigravity-quota-client.js +6 -3
  129. package/dist/providers/core/runtime/antigravity-quota-client.js.map +1 -1
  130. package/dist/providers/core/runtime/base-provider.d.ts +2 -7
  131. package/dist/providers/core/runtime/base-provider.js +84 -165
  132. package/dist/providers/core/runtime/base-provider.js.map +1 -1
  133. package/dist/providers/core/runtime/gemini-cli-http-provider.d.ts +7 -0
  134. package/dist/providers/core/runtime/gemini-cli-http-provider.js +368 -97
  135. package/dist/providers/core/runtime/gemini-cli-http-provider.js.map +1 -1
  136. package/dist/providers/core/runtime/http-request-executor.d.ts +3 -0
  137. package/dist/providers/core/runtime/http-request-executor.js +110 -38
  138. package/dist/providers/core/runtime/http-request-executor.js.map +1 -1
  139. package/dist/providers/core/runtime/http-transport-provider.d.ts +17 -0
  140. package/dist/providers/core/runtime/http-transport-provider.js +165 -16
  141. package/dist/providers/core/runtime/http-transport-provider.js.map +1 -1
  142. package/dist/providers/core/runtime/provider-error-classifier.js +10 -0
  143. package/dist/providers/core/runtime/provider-error-classifier.js.map +1 -1
  144. package/dist/providers/core/runtime/provider-factory.js +7 -5
  145. package/dist/providers/core/runtime/provider-factory.js.map +1 -1
  146. package/dist/providers/core/runtime/provider-runtime-metadata.d.ts +6 -0
  147. package/dist/providers/core/runtime/provider-runtime-metadata.js.map +1 -1
  148. package/dist/providers/core/runtime/rate-limit-manager.d.ts +1 -12
  149. package/dist/providers/core/runtime/rate-limit-manager.js +4 -77
  150. package/dist/providers/core/runtime/rate-limit-manager.js.map +1 -1
  151. package/dist/providers/core/runtime/responses-provider.d.ts +1 -7
  152. package/dist/providers/core/runtime/responses-provider.js +12 -93
  153. package/dist/providers/core/runtime/responses-provider.js.map +1 -1
  154. package/dist/providers/core/strategies/oauth-auth-code-flow.js +12 -8
  155. package/dist/providers/core/strategies/oauth-auth-code-flow.js.map +1 -1
  156. package/dist/providers/core/utils/http-client.js +36 -46
  157. package/dist/providers/core/utils/http-client.js.map +1 -1
  158. package/dist/providers/core/utils/provider-error-logger.d.ts +1 -1
  159. package/dist/providers/core/utils/provider-error-reporter.d.ts +3 -1
  160. package/dist/providers/core/utils/provider-error-reporter.js +3 -0
  161. package/dist/providers/core/utils/provider-error-reporter.js.map +1 -1
  162. package/dist/providers/core/utils/snapshot-writer.js +1 -4
  163. package/dist/providers/core/utils/snapshot-writer.js.map +1 -1
  164. package/dist/providers/mock/mock-provider-runtime.js +57 -27
  165. package/dist/providers/mock/mock-provider-runtime.js.map +1 -1
  166. package/dist/scripts/camoufox/launch-auth.mjs +193 -58
  167. package/dist/server/handlers/handler-utils.js +8 -3
  168. package/dist/server/handlers/handler-utils.js.map +1 -1
  169. package/dist/server/handlers/responses-handler.js +1 -1
  170. package/dist/server/handlers/responses-handler.js.map +1 -1
  171. package/dist/server/runtime/http-server/daemon-admin/auth-handler.d.ts +2 -0
  172. package/dist/server/runtime/http-server/daemon-admin/auth-handler.js +103 -0
  173. package/dist/server/runtime/http-server/daemon-admin/auth-handler.js.map +1 -0
  174. package/dist/server/runtime/http-server/daemon-admin/auth-session.d.ts +5 -0
  175. package/dist/server/runtime/http-server/daemon-admin/auth-session.js +77 -0
  176. package/dist/server/runtime/http-server/daemon-admin/auth-session.js.map +1 -0
  177. package/dist/server/runtime/http-server/daemon-admin/auth-store.d.ts +18 -0
  178. package/dist/server/runtime/http-server/daemon-admin/auth-store.js +89 -0
  179. package/dist/server/runtime/http-server/daemon-admin/auth-store.js.map +1 -0
  180. package/dist/server/runtime/http-server/daemon-admin/credentials-handler.js +1 -2
  181. package/dist/server/runtime/http-server/daemon-admin/credentials-handler.js.map +1 -1
  182. package/dist/server/runtime/http-server/daemon-admin/providers-handler.js +226 -24
  183. package/dist/server/runtime/http-server/daemon-admin/providers-handler.js.map +1 -1
  184. package/dist/server/runtime/http-server/daemon-admin/quota-handler.js +47 -8
  185. package/dist/server/runtime/http-server/daemon-admin/quota-handler.js.map +1 -1
  186. package/dist/server/runtime/http-server/daemon-admin/restart-handler.js +1 -1
  187. package/dist/server/runtime/http-server/daemon-admin/restart-handler.js.map +1 -1
  188. package/dist/server/runtime/http-server/daemon-admin/stats-handler.js +1 -1
  189. package/dist/server/runtime/http-server/daemon-admin/stats-handler.js.map +1 -1
  190. package/dist/server/runtime/http-server/daemon-admin/status-handler.js +68 -4
  191. package/dist/server/runtime/http-server/daemon-admin/status-handler.js.map +1 -1
  192. package/dist/server/runtime/http-server/daemon-admin-routes.d.ts +3 -4
  193. package/dist/server/runtime/http-server/daemon-admin-routes.js +9 -14
  194. package/dist/server/runtime/http-server/daemon-admin-routes.js.map +1 -1
  195. package/dist/server/runtime/http-server/executor-metadata.js +1 -1
  196. package/dist/server/runtime/http-server/executor-metadata.js.map +1 -1
  197. package/dist/server/runtime/http-server/executor-response.js +0 -16
  198. package/dist/server/runtime/http-server/executor-response.js.map +1 -1
  199. package/dist/server/runtime/http-server/hub-shadow-compare.js +110 -34
  200. package/dist/server/runtime/http-server/hub-shadow-compare.js.map +1 -1
  201. package/dist/server/runtime/http-server/index.d.ts +5 -3
  202. package/dist/server/runtime/http-server/index.js +281 -136
  203. package/dist/server/runtime/http-server/index.js.map +1 -1
  204. package/dist/server/runtime/http-server/middleware.js +19 -1
  205. package/dist/server/runtime/http-server/middleware.js.map +1 -1
  206. package/dist/server/runtime/http-server/request-executor.js +59 -24
  207. package/dist/server/runtime/http-server/request-executor.js.map +1 -1
  208. package/dist/server/runtime/http-server/routes.js +12 -3
  209. package/dist/server/runtime/http-server/routes.js.map +1 -1
  210. package/dist/server/runtime/http-server/session-dir.d.ts +2 -0
  211. package/dist/server/runtime/http-server/session-dir.js +59 -0
  212. package/dist/server/runtime/http-server/session-dir.js.map +1 -0
  213. package/dist/server/runtime/http-server/types.d.ts +0 -4
  214. package/dist/server/utils/utf8-chunk-buffer.js +6 -3
  215. package/dist/server/utils/utf8-chunk-buffer.js.map +1 -1
  216. package/dist/server/utils/warmup-storm-tracker.js +1 -1
  217. package/dist/server/utils/warmup-storm-tracker.js.map +1 -1
  218. package/dist/server-factory.d.ts +6 -28
  219. package/dist/server-factory.js +8 -93
  220. package/dist/server-factory.js.map +1 -1
  221. package/dist/token-daemon/index.js +2 -2
  222. package/dist/token-daemon/index.js.map +1 -1
  223. package/dist/token-daemon/provider-registry.js +0 -1
  224. package/dist/token-daemon/provider-registry.js.map +1 -1
  225. package/dist/token-daemon/server-utils.js +8 -9
  226. package/dist/token-daemon/server-utils.js.map +1 -1
  227. package/dist/token-daemon/token-utils.js +1 -1
  228. package/dist/token-daemon/token-utils.js.map +1 -1
  229. package/dist/tools/semantic-replay.js +2 -2
  230. package/dist/tools/semantic-replay.js.map +1 -1
  231. package/dist/tools/stats-request-events.d.ts +1 -1
  232. package/dist/tools/stats-usage.js +6 -3
  233. package/dist/tools/stats-usage.js.map +1 -1
  234. package/dist/utils/llms-engine-shadow.d.ts +19 -0
  235. package/dist/utils/llms-engine-shadow.js +209 -0
  236. package/dist/utils/llms-engine-shadow.js.map +1 -0
  237. package/dist/utils/runtime-versions.js +2 -1
  238. package/dist/utils/runtime-versions.js.map +1 -1
  239. package/dist/utils/strip-internal-keys.d.ts +12 -0
  240. package/dist/utils/strip-internal-keys.js +28 -0
  241. package/dist/utils/strip-internal-keys.js.map +1 -0
  242. package/docs/ARCHITECTURE.md +402 -0
  243. package/docs/CHAT_PROCESS_PROTOCOL_AND_PIPELINE.md +221 -0
  244. package/docs/CODEX_AND_CLAUDE_CODE.md +69 -0
  245. package/docs/CONFIG_ARCHITECTURE.md +517 -0
  246. package/docs/ERROR_HANDLING_AUDIT.md +0 -0
  247. package/docs/GCLI2API_PARITY_GAPS.md +98 -0
  248. package/docs/INSTALLATION_AND_QUICKSTART.md +74 -0
  249. package/docs/INSTRUCTION_MARKUP.md +89 -0
  250. package/docs/MODULE_ENHANCEMENT_SYSTEM.md +666 -0
  251. package/docs/PORTS.md +36 -0
  252. package/docs/PROVIDERS_BUILTIN.md +111 -0
  253. package/docs/PROVIDER_TYPES.md +55 -0
  254. package/docs/SERVERTOOL_CLOCK_DESIGN.md +233 -0
  255. package/docs/USAGE_HANDLING_ANALYSIS.md +335 -0
  256. package/docs/USER_CONFIG_PARSER_CHANGES.md +175 -0
  257. package/docs/V3_INBOUND_OUTBOUND_DESIGN.md +86 -0
  258. package/docs/VIRTUAL_ROUTER_PRIORITY_AND_HEALTH.md +125 -0
  259. package/docs/anthropic-request-golden-samples.md +50 -0
  260. package/docs/antigravity-gemini-format-cleanup.md +102 -0
  261. package/docs/antigravity-routing-contract.md +31 -0
  262. package/docs/ccr-alignment-enhancetool.md +105 -0
  263. package/docs/chat-glm-500-analysis.md +79 -0
  264. package/docs/chat-request-golden-samples.md +42 -0
  265. package/docs/chat-semantic-expansion-plan.md +84 -0
  266. package/docs/cli-command-inventory.md +76 -0
  267. package/docs/codex-samples-replay.md +50 -0
  268. package/docs/daemon-admin-api-design.md +350 -0
  269. package/docs/daemon-admin-module-structure.md +169 -0
  270. package/docs/daemon-admin-ui.html +3394 -0
  271. package/docs/debug-system-design.md +734 -0
  272. package/docs/debugging/gemini-sse-root-cause.md +52 -0
  273. package/docs/debugging/sse_encoding_failure_analysis.md +53 -0
  274. package/docs/dry-run/README.md +721 -0
  275. package/docs/error-handling-v2.md +92 -0
  276. package/docs/exec-command-guard-policy.example.v1.json +42 -0
  277. package/docs/fixes/gemini-protocol-mapping.md +57 -0
  278. package/docs/fixes/oauth-portal-timing-fix.md +202 -0
  279. package/docs/fixes/web-search-hop3-fix.md +265 -0
  280. package/docs/glm-api-reference.md +390 -0
  281. package/docs/glm-chat-completions.md +1779 -0
  282. package/docs/glm-history-inline-images.md +44 -0
  283. package/docs/golden-ci-library.md +66 -0
  284. package/docs/lmstudio-dry-run-summary.md +203 -0
  285. package/docs/lmstudio-tool-calling.md +214 -0
  286. package/docs/mapping-tables/anthropic-to-openai.json +290 -0
  287. package/docs/mapping-tables/iflow-to-openai.json +215 -0
  288. package/docs/mapping-tables/openai-passthrough.json +190 -0
  289. package/docs/mapping-tables/openai-to-iflow.json +227 -0
  290. package/docs/monitoring/Design.md +61 -0
  291. package/docs/multi-token-auth-guide.md +66 -0
  292. package/docs/oauth-authentication-guide.md +168 -0
  293. package/docs/oauth-iflow-implementation.md +153 -0
  294. package/docs/pipeline-routing-report.md +209 -0
  295. package/docs/plans/manager-daemon/PLAN.md +86 -0
  296. package/docs/plans/provider-config-v2-plan.md +176 -0
  297. package/docs/plans/provider-runtime-manager-plan.md +209 -0
  298. package/docs/plans/transparent-429-failover.md +89 -0
  299. package/docs/plans/unified-hub-framework-v1.md +245 -0
  300. package/docs/provider-config-v2-ui-design.md +181 -0
  301. package/docs/provider-quota-design.md +129 -0
  302. package/docs/providers/gemini-provider.md +62 -0
  303. package/docs/providers/lmstudio-v2-migration-report.md +102 -0
  304. package/docs/providers/provider-composite-design.md +142 -0
  305. package/docs/providers/provider-composite-testing.md +98 -0
  306. package/docs/providers/provider-type-only-migration.md +111 -0
  307. package/docs/rccx-wasm-migration.md +74 -0
  308. package/docs/refactoring/architecture-comparison-diagram.md +140 -0
  309. package/docs/refactoring/compatibility-v2-architecture-design.md +738 -0
  310. package/docs/refactoring/workflow-compatibility-refactoring-design.md +361 -0
  311. package/docs/reports/routing-classification-report.json +24 -0
  312. package/docs/reports/routing-classification-report.md +18 -0
  313. package/docs/reports/thinking-keywords-report.json +19 -0
  314. package/docs/responses/README.md +156 -0
  315. package/docs/responses-generic-provider.md +86 -0
  316. package/docs/responses-passthrough-provider-design.md +202 -0
  317. package/docs/routing-awrr-health-weighted-round-robin.md +179 -0
  318. package/docs/routing-instructions.md +393 -0
  319. package/docs/servertool-framework.md +65 -0
  320. package/docs/stop-message-auto.md +225 -0
  321. package/docs/streaming-flow.html +30 -0
  322. package/docs/streaming-flow.md +182 -0
  323. package/docs/token-daemon-preview.html +490 -0
  324. package/docs/token-refresh-daemon-plan.md +269 -0
  325. package/docs/transformation-tables/Gemini-FinishReason完整转换表.json +233 -0
  326. package/docs/transformation-tables/README.md +225 -0
  327. package/docs/transformation-tables/claude-code-router-anthropic-to-gemini.json +283 -0
  328. package/docs/transformation-tables/claude-code-router-anthropic-to-openai.json +208 -0
  329. package/docs/transformation-tables/claude-code-router-openai-to-anthropic.json +261 -0
  330. package/docs/transformation-tables/claude-code-router-openai-to-gemini.json +208 -0
  331. package/docs/transformation-tables/claude-code-router-openai-to-lmstudio.json +182 -0
  332. package/docs/transformation-tables/claude-code-router-openai-to-ollama.json +250 -0
  333. package/docs/transformation-tables/claude-code-router-openai-to-textgenwebui.json +295 -0
  334. package/docs/transformation-tables/claude-code-router-provider-conversions.json +193 -0
  335. package/docs/transformation-tables/完整的工具执行流程转换表.json +299 -0
  336. package/docs/transformation-tables/对话历史维护分析.md +134 -0
  337. package/docs/transformation-tables/工具调用模式分析.md +158 -0
  338. package/docs/transformation-tables/状态管理需求分析.md +175 -0
  339. package/docs/transformation-tables/静态表vs动态分析.md +189 -0
  340. package/docs/transformation-tables/静态表准确性评估.md +179 -0
  341. package/docs/transformation-tables/非流式场景分析.md +189 -0
  342. package/docs/v2-architecture/IMPLEMENTATION-ROADMAP.md +367 -0
  343. package/docs/v2-architecture/OPTIMIZED-DESIGN.md +827 -0
  344. package/docs/v2-architecture/PRERUN-CONNECTION-DESIGN.md +716 -0
  345. package/docs/v2-architecture/README.md +549 -0
  346. package/docs/verification/modelscope-verify.md +59 -0
  347. package/docs/verified-configs/README.md +60 -0
  348. package/docs/verified-configs/v0.45.0/README.md +244 -0
  349. package/docs/verified-configs/v0.45.0/lmstudio-5521-gpt-oss-20b-mlx.json +135 -0
  350. package/docs/verified-configs/v0.45.0/merged-config.5521.json +1205 -0
  351. package/docs/verified-configs/v0.45.0/merged-config.qwen-5522.json +1559 -0
  352. package/docs/verified-configs/v0.45.0/qwen-5522-qwen3-coder-plus-final.json +221 -0
  353. package/docs/verified-configs/v0.45.0/qwen-5522-qwen3-coder-plus-fixed.json +242 -0
  354. package/docs/verified-configs/v0.45.0/qwen-5522-qwen3-coder-plus.json +242 -0
  355. package/docs/web-search-service-design.md +322 -0
  356. package/package.json +26 -15
  357. package/scripts/build-core.mjs +3 -1
  358. package/scripts/camoufox/launch-auth.mjs +193 -58
  359. package/scripts/ci/repo-sanity.mjs +138 -0
  360. package/scripts/mock-provider/run-regressions.mjs +157 -1
  361. package/scripts/monitor-diff.mjs +126 -0
  362. package/scripts/pack-mode.mjs +19 -1
  363. package/scripts/pack-rcc.mjs +63 -0
  364. package/scripts/run-bg.sh +0 -14
  365. package/scripts/tests/ci-jest.mjs +119 -0
  366. package/scripts/tools-dev/responses-debug-client/README.md +23 -0
  367. package/scripts/tools-dev/responses-debug-client/payloads/poem.json +13 -0
  368. package/scripts/tools-dev/responses-debug-client/payloads/sample-no-tools.json +98 -0
  369. package/scripts/tools-dev/responses-debug-client/payloads/text.json +13 -0
  370. package/scripts/tools-dev/responses-debug-client/payloads/tool.json +27 -0
  371. package/scripts/tools-dev/responses-debug-client/run.mjs +65 -0
  372. package/scripts/tools-dev/responses-debug-client/src/index.ts +281 -0
  373. package/scripts/tools-dev/run-llmswitch-chat.mjs +53 -0
  374. package/scripts/tools-dev/server-tools-dev/run-web-fetch.mjs +65 -0
  375. package/scripts/unified-hub-shadow-compare.mjs +33 -13
  376. package/scripts/vendor-core.mjs +13 -3
  377. package/scripts/verify-e2e-toolcall.mjs +115 -26
  378. package/dist/modules/llmswitch/pipeline-registry.d.ts +0 -57
  379. package/dist/modules/llmswitch/pipeline-registry.js +0 -229
  380. package/dist/modules/llmswitch/pipeline-registry.js.map +0 -1
  381. package/dist/server/RouteCodexServer.d.ts +0 -13
  382. package/dist/server/RouteCodexServer.js +0 -25
  383. package/dist/server/RouteCodexServer.js.map +0 -1
  384. package/dist/v2/conversion/hub/snapshot-recorder.d.ts +0 -12
  385. package/dist/v2/conversion/hub/snapshot-recorder.js +0 -22
  386. package/dist/v2/conversion/hub/snapshot-recorder.js.map +0 -1
  387. package/scripts/test-fc-responses.mjs +0 -66
  388. package/scripts/test-guidance.mjs +0 -100
  389. package/scripts/test-iflow-web-search.mjs +0 -141
  390. package/scripts/test-iflow.mjs +0 -379
  391. package/scripts/test-tool-exec.mjs +0 -26
@@ -0,0 +1,202 @@
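+ # Responses Passthrough Provider and LLM Switch Passthrough (Design and Execution Document)
+
+ This document describes the design and implementation plan for "true SSE passthrough of OpenAI Responses": without any client changes, the server connects directly to the upstream Responses endpoint and relays the SSE stream unmodified, with full observability and a minimal change footprint, in line with the RouteCodex V2 architecture principles.
+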
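+ ## 1. Goals and Scope
+
+ - Goals
+   - Provide "true SSE passthrough" under `/v1/responses`: a Responses request goes in, and a Responses event stream conforming to the same specification comes out.
+   - Configuration-driven: add a provider with `type: "responses-standard"` to `~/.routecodex/config.json`, set its model to `gpt-5.1`, and make it the default route (`routing.default`).
+   - No coupling to black-box clients: clients receive events and bytes reliably without any code changes.
+   - Strong observability: keep request/response snapshots and add a server-side tee log of the raw SSE bytes.
+
+ - Non-goals (not in this phase)
+   - The passthrough flow does not modify upstream event content and performs no protocol "conversion" or "trimming".
+   - The passthrough flow does not rewrite tool arguments, synthesize filler content, or apply similar logic.
+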
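+ ## 2. Architecture and Responsibility Boundaries
+
+ - Provider layer (V2): unified HTTP communication with the upstream Responses endpoint; takes no part in tool handling; only handles timeouts/retries and header construction.
+ - LLM Switch layer: provides the "Responses passthrough" module, which performs no request/response conversion in this mode (preserving single-path and minimal coupling).
+ - Compatibility layer: minimized (kept as a no-op); no tool or text processing.
+ - HTTP Server: SSE headers, immediate flush, tee logging, pipe to the client; follow-up turns via `/v1/responses/:id/submit_tool_outputs` are passed through.
+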
+ ## 3. Module Design
+
+ ### 3.1 New Provider: `responses-standard` (true SSE passthrough)
+
+ - File: `src/providers/core/runtime/responses-standard-provider.ts`
+ - Type: `responses-standard`
+ - Key points:
+   - Read `baseUrl`, `auth.apiKey`, and `overrides.timeout` from the provider config and build the request.
+   - Request endpoint: `POST <baseUrl>/responses` (consistent with the OpenAI Responses documentation).
+   - When the request has `metadata.entryEndpoint === '/v1/responses'` and this provider is in effect:
+     - Set the header `Accept: text/event-stream` and open a real SSE stream;
+     - The provider consumes the upstream SSE, converts it into standard Responses JSON (tagged `x-upstream-mode: sse`), and returns it to the Pipeline; `__sse_stream` is no longer exposed.
+   - Side-channel snapshots:
+     - `provider-request.json`: the request body;
+     - `provider-response.json`: status and headers (for SSE, metadata is recorded);
+     - `provider-error.json`: upstream errors (including HTTP status and message).
+   - Follow-up tool turns:
+     - `POST <baseUrl>/responses/:id/submit_tool_outputs`, also sent with `Accept: text/event-stream`; returns a Node Readable.
+
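+ ### 3.2 LLM Switch: Responses passthrough module (default logic based on input/output shape)
+
+ - File: `src/modules/pipeline/modules/llmswitch/llmswitch-responses-passthrough.ts`
+ - Type: `llmswitch-responses-passthrough`
+ - Behavior (design):
+   - The same llmswitch-core module supports both **bridging** and **passthrough**; the choice is made from "input/output shape + provider type":
+     - If the entry point is `/v1/responses`, the provider type is `responses-standard`, and the request payload is already in the standard Responses shape (`model + instructions + input[] + tools[] + stream`, etc.), it is treated as **Responses canonical**: only schema validation, tool filtering, and snapshots are applied, with no Chat/Anthropic conversion (passthrough mode).
+     - If the entry point is `/v1/responses` but the payload is in Chat/Anthropic shape (for example `messages[]`), the bridge logic can still be enabled via configuration (Chat→Responses→upstream).
+   - `processIncoming`: decides whether bridging is needed based on the payload shape and the endpoint/type;
+   - `processOutgoing`: if the JSON returned by the provider is already standard Responses output (`object: "response"`, etc.), it is passed through directly; otherwise it goes through the Responses bridge.
+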
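The shape-based decision described in 3.2 can be sketched as a pair of pure functions. This is only an illustration under stated assumptions, not the llmswitch-core implementation: `RouteContext`, `isResponsesShape`, `decideMode`, and `isResponsesOutput` are hypothetical names, and the shape test (a string `model`, an `input` field, no `messages[]`) is a simplification of whatever schema validation the real module performs.

```typescript
// Hypothetical sketch; none of these names come from llmswitch-core.
type Mode = "passthrough" | "bridge";

interface RouteContext {
  entryEndpoint: string;
  providerType: string;
  payload: Record<string, unknown>;
}

// Assumed shape test: a payload counts as "Responses canonical" if it has a
// string model and an `input` field, and carries no Chat-style `messages[]`.
function isResponsesShape(payload: Record<string, unknown>): boolean {
  return (
    typeof payload["model"] === "string" &&
    "input" in payload &&
    !Array.isArray(payload["messages"])
  );
}

// Incoming side: passthrough only when endpoint, provider type, and payload
// shape all agree; everything else falls back to the bridge.
function decideMode(ctx: RouteContext): Mode {
  const direct =
    ctx.entryEndpoint === "/v1/responses" &&
    ctx.providerType === "responses-standard" &&
    isResponsesShape(ctx.payload);
  return direct ? "passthrough" : "bridge";
}

// Outgoing side: output that is already `object: "response"` is relayed as-is.
function isResponsesOutput(body: unknown): boolean {
  return (
    typeof body === "object" &&
    body !== null &&
    (body as { object?: unknown }).object === "response"
  );
}
```

Keeping the decision in one function mirrors the stated intent that the check lives in a single place rather than being repeated in the router and the HTTP server.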
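+ ### 3.3 Router (selects the route pool only; does not decide "whether to pass through")
+
+ - File: `src/modules/pipeline/modules/llmswitch-v2-adapters.ts`
+ - Rules (design):
+   - The virtual router decides only `routeName` (that is, which route pool to enter), not whether to pass through;
+   - PipelineManager round-robins within the selected route pool (on the same footing as the other pipelines, with no special branch);
+   - Whether to take the Responses passthrough path is decided in one place, inside llmswitch-core, from "entry endpoint + provider type + request/response shape", which avoids duplicated checks in multiple places.
+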
+ ### 3.4 HTTP Server (SSE passthrough and logging)
+
+ - File: `src/server/http-server.ts`
+ - `/v1/responses`:
+   - Write the response headers early and call `flushHeaders()`:
+     - `Content-Type: text/event-stream; charset=utf-8`
+     - `Cache-Control: no-cache, no-transform`
+     - `Connection: keep-alive`
+     - `X-Accel-Buffering: no`
+   - If the upstream returns SSE: the provider first converts it to JSON, then llmswitch-core decides whether to rebuild `__sse_responses` for the HTTP Server.
+   - No local synthesis (passthrough mode).
+ - `/v1/responses/:id/submit_tool_outputs`:
+   - Same as above: relay the SSE returned by the upstream, tee + pipe.
+
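A minimal sketch of the header and tee steps from 3.4 follows. It is not the code in `src/server/http-server.ts`: `SseResponseLike`, `ByteSink`, `beginSse`, and `teeChunk` are hypothetical names, and the structural interfaces are there only so the example does not depend on Node typings (Node's `http.ServerResponse` does provide `setHeader` and `flushHeaders`).

```typescript
// Hypothetical structural types; http.ServerResponse satisfies SseResponseLike.
interface SseResponseLike {
  setHeader(name: string, value: string): unknown;
  flushHeaders(): void;
}

interface ByteSink {
  write(chunk: Uint8Array): unknown;
}

// The four headers listed in section 3.4.
const SSE_HEADERS: Record<string, string> = {
  "Content-Type": "text/event-stream; charset=utf-8",
  "Cache-Control": "no-cache, no-transform",
  "Connection": "keep-alive",
  "X-Accel-Buffering": "no",
};

// Write the SSE headers and flush them before any event bytes are available,
// so the client and any proxy see a streaming response immediately.
function beginSse(res: SseResponseLike): void {
  for (const name of Object.keys(SSE_HEADERS)) {
    res.setHeader(name, SSE_HEADERS[name] as string);
  }
  res.flushHeaders();
}

// Tee one upstream chunk: append the raw bytes to the log sink, then forward
// the same bytes to the client without modifying them.
function teeChunk(chunk: Uint8Array, client: ByteSink, log: ByteSink): void {
  log.write(chunk);
  client.write(chunk);
}
```

In a real server the tee would normally be wired with stream piping rather than per-chunk calls; the per-chunk form is used here only to keep the example small.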
+ ## 4. Configuration and Selection
+
+ - Provider config (example, already generated at `~/.routecodex/config.responses.json` as you requested):
+   - `type`: `"responses-standard"`
+   - `baseUrl`: `"https://www.fakercode.top/v1"`
+   - `auth.type`: `"apikey"`
+   - `auth.apiKey`: read from `FC_API_KEY` in `~/.zshrc`
+   - `overrides.timeout`: `60000`
+ - Passthrough switch: in this design the `responses` type uses real upstream SSE by default (no longer dependent on env).
+ - Routing:
+   - `routing.default = ["fc.gpt-5"]`
+   - `routing["/v1/responses"] = ["fc.gpt-5"]`
+
+ ## 5. Task Breakdown and File List
+
+ 1) Provider: responses
+   - Add: `src/providers/core/runtime/responses-provider.ts`
+   - Register: `src/modules/pipeline/core/pipeline-manager.ts` → `this.registry.registerModule('responses', this.createResponsesProviderModule)`
+   - ServiceProfile (optional): `src/providers/core/config/service-profiles.ts` → `responses` defaults to `defaultEndpoint: '/responses'`
+
+ 2) LLM Switch passthrough module
+   - Add: `src/modules/pipeline/modules/llmswitch/llmswitch-responses-passthrough.ts`
+   - Router: select passthrough in `src/modules/pipeline/modules/llmswitch-v2-adapters.ts`.
+
+ 3) HTTP layer
+   - `src/server/http-server.ts`:
+     - When `/v1/responses` receives `__sse_responses`, tee + pipe (the current branch already has the tee basics; add flushHeaders/X-Accel-Buffering).
+     - Passthrough handling for `/v1/responses/:id/submit_tool_outputs` (reuse the existing structure and wire in the provider method).
+     - The submit path adds a call to `resumeResponsesConversation()` in `llmswitch-core`. The server side only reads `response_id` / `tool_outputs` and hands them to the core cache to build the full payload. On success the request proceeds into the Hub Pipeline; on failure it returns 400 (meaning the response has expired or been lost).
+
+ 4) Config example (generated in the user directory, as you requested):
+   - `~/.routecodex/config.responses.json` (the default config in the repository is not modified).
+
+ ## 6. Testing and Acceptance
+
+ - Black-box client test:
+   - Send `POST /v1/responses` directly and check that the following sequence is received reliably:
+     - `response.created` → `response.in_progress` → `response.output_text.delta`* → `response.output_text.done` → `response.completed` → `response.done` → `[DONE]`.
+   - Tool loop: after receiving `required_action`, call `submit_tool_outputs` as a black box, then verify the next turn through to `done`.
+
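The expected event order in the black-box test can be checked mechanically. The sketch below is an assumption-laden illustration, not a script shipped with the package: `parseEventNames` and `followsExpectedOrder` are hypothetical helpers, and the wire format is assumed to be `event:` / `data:` lines with blank-line frame separators and a final bare `data: [DONE]` frame.

```typescript
// Extract event names from raw SSE text. Frames are separated by a blank line;
// a frame with no `event:` line whose data is `[DONE]` is reported as "[DONE]".
function parseEventNames(raw: string): string[] {
  const names: string[] = [];
  for (const frame of raw.split(/\r?\n\r?\n/)) {
    let name = "";
    let data = "";
    for (const line of frame.split(/\r?\n/)) {
      if (line.indexOf("event:") === 0) name = line.slice(6).trim();
      else if (line.indexOf("data:") === 0) data = line.slice(5).trim();
    }
    if (name) names.push(name);
    else if (data === "[DONE]") names.push("[DONE]");
  }
  return names;
}

// The sequence from section 6, with zero or more output_text.delta events.
const EXPECTED_ORDER =
  /^response\.created response\.in_progress (response\.output_text\.delta )*response\.output_text\.done response\.completed response\.done \[DONE\]$/;

// True when the observed event names follow the expected order exactly.
function followsExpectedOrder(names: string[]): boolean {
  return EXPECTED_ORDER.test(names.join(" "));
}
```

Feeding the function the contents of a captured `<reqId>_server.sse.log` would be one way to use it, assuming the log holds the raw bytes exactly as sent.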
116
+ - 服务端可观测:
117
+ - 原始 SSE 字节:`~/.routecodex/logs/sse/<reqId>_server.sse.log`
118
+ - 快照:`~/.routecodex/codex-samples/openai-responses/` 下的 `provider-request.json` / `provider-response.json` / `provider-error.json` 与 `*_sse_pre/post.json`
119
+
120
+ - Proxy troubleshooting tips:
121
+ - Nginx: `proxy_buffering off;` `proxy_http_version 1.1;` `proxy_set_header Connection '';` `chunked_transfer_encoding on;`
122
+ - CloudFront / reverse proxy: disable content transformation (`no-transform`).
123
+
124
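+ ## 7. Failure and Rollback
125
+
126
+ - If the upstream is unavailable or returns frequent 429s, passthrough still forwards the response unchanged (which makes the real problem easier to diagnose).
127
+ - To roll back after release: point `routing.default` back to the original pipeline, or switch provider.type back to `openai` and override the endpoint with `/chat/completions`.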
128
+
129
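+ ## 8. Milestones and Schedule
130
+
131
+ 1) Day 0: land the Provider and passthrough module skeletons, registration, and HTTP details (flushHeaders/X-Accel-Buffering); local integration testing.
132
+ 2) Day 1: black-box connectivity tests, tool-loop verification, alignment of snapshots with tee logs.
133
+ 3) Day 2: documentation/README updates and a review of the observability checklist.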
134
+
135
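+ ## 9. Compliance and Constraints (aligned with AGENTS.md)
136
+
137
+ - Unified tool handling: passthrough mode does not rewrite tool events; the tool-governance entry point remains in llmswitch-core (this mode only bypasses it).
138
+ - Minimal compatibility: the Compatibility layer performs no conversion and no fallback.
139
+ - Fail Fast: upstream errors are passed through (not hidden); when necessary, only the HTTP layer synthesizes an SSE error frame as a last resort (switchable).
140
+ - Modularity: every new file is under 500 lines and has a single responsibility.
141
+ - Config-driven: all switches take effect through provider config and route selection; nothing is hard-coded.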
142
+
143
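+ ## 10. SSE Loopback Verification
144
+
145
+ - Command: `npm run verify:sse-loop`
146
+ - Runs the official SDK → RouteCodex comparison for Responses, Chat (LMStudio), and Anthropic (GLM-Anthropic) in one pass.
147
+ - The corresponding upstreams must be configured beforehand under `~/.routecodex/provider/<providerId>/`, and a local RouteCodex instance must already be running (default `http://127.0.0.1:5555/v1`).
148
+ - Environment variables: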
149
+ - `RCC_LOOP_RESP_PROVIDER` / `RCC_LOOP_RESP_MODEL`
150
+ - `RCC_LOOP_CHAT_MODEL`
151
+ - `RCC_LOOP_ANTHROPIC_PROVIDER` / `RCC_LOOP_ANTHROPIC_MODEL`
152
+ - `RCC_LOOP_ROUTECODEX_BASE` / `RCC_LOOP_ROUTECODEX_KEY`
153
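+ - The run flags `--skip-responses|--skip-chat|--skip-anthropic` skip individual checks.
154
+ - Verification method: the script uses the official SDKs to open SSE streams against LMStudio/GLM-Anthropic, then sends the exact same payload to RouteCodex and compares the two streams event by event (ignoring volatile fields such as `id/created_at/timestamp`). On any mismatch it prints the first differing event, so the "converted = passthrough" red line stays continuously monitored.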
155
+
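The event-by-event comparison described above could be sketched roughly as follows. This is an illustration only, not the actual `verify:sse-loop` script: `stripVolatile`, `firstDifference`, and the `SseEvent` type are hypothetical names, and stripping the volatile fields recursively at every nesting level is an assumption.

```typescript
// Hypothetical sketch: compare two already-parsed SSE event lists,
// ignoring fields that legitimately differ between runs.
type SseEvent = Record<string, unknown>;

// Volatile fields named in the text; treating them as volatile at any depth is an assumption.
const VOLATILE_FIELDS = new Set(["id", "created_at", "timestamp"]);

// Return a copy of the value with volatile fields removed and object keys sorted,
// so that JSON.stringify yields a stable, comparable string.
function stripVolatile(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(stripVolatile);
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source).sort()) {
      if (!VOLATILE_FIELDS.has(key)) out[key] = stripVolatile(source[key]);
    }
    return out;
  }
  return value;
}

// Index of the first differing event, or -1 when the streams match.
// A missing event (streams of different length) counts as a difference.
function firstDifference(direct: SseEvent[], routed: SseEvent[]): number {
  const n = Math.max(direct.length, routed.length);
  for (let i = 0; i < n; i++) {
    const a = JSON.stringify(stripVolatile(direct[i]));
    const b = JSON.stringify(stripVolatile(routed[i]));
    if (a !== b) return i;
  }
  return -1;
}
```

A caller would collect the events from the direct upstream stream and from RouteCodex, then print `direct[i]` and `routed[i]` for the index returned by `firstDifference`.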
156
+ ---
157
+
158
+ Appendix: config sample (already generated in the user directory)
159
+
160
+ - `~/.routecodex/config.responses.json` (excerpt)
161
+
162
+ ```
163
+ {
164
+ "providers": {
165
+ "fc": {
166
+ "id": "fc",
167
+ "enabled": true,
168
+ "type": "responses-standard",
169
+ "baseUrl": "https://www.fakercode.top/v1",
170
+ "auth": { "type": "apikey", "apiKey": "<read FC_API_KEY from ~/.zshrc>" },
171
+ "overrides": { "timeout": 60000 }
172
+ }
173
+ },
174
+ "routing": {
175
+ "default": [ "fc.gpt-5.1" ],
176
+ "/v1/responses": [ "fc.gpt-5.1" ]
177
+ },
178
+ "pipelines": [
179
+ {
180
+ "id": "fc.gpt-5.1",
181
+ "provider": { "type": "responses-standard" },
182
+ "modules": {
183
+ "provider": {
184
+ "type": "responses-standard",
185
+ "config": {
186
+ "baseUrl": "https://www.fakercode.top/v1",
187
+ "timeout": 60000,
188
+ "auth": { "type": "apikey", "apiKey": "<FC_API_KEY>" },
189
+ "model": "gpt-5.1"
190
+ }
191
+ },
192
+ "llmSwitch": { "type": "llmswitch-conversion-router", "config": {} },
193
+ "compatibility": { "type": "compatibility", "config": {} },
194
+ "workflow": { "type": "streaming-control", "config": {} }
195
+ },
196
+ "settings": { "debugEnabled": true }
197
+ }
198
+ ]
199
+ }
200
+ ```
201
+
202
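+ The above is the implementation blueprint. Once it is approved, I will execute against it and, during implementation, land and verify each item strictly against this document.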
@@ -0,0 +1,179 @@
1
+ # AWRR: Health-Weighted Round Robin (Design)
2
+
3
+ Status: **Design-only** (implementation pending approval)
4
+ Last updated: **2026-01-22**
5
+
6
+ ## Background / Problem
7
+
8
+ In a multi-key pool, pure priority or naive round-robin can lead to:
9
+
10
+ - **Over-hitting a single “best” key** (hot-spot), while other healthy keys are underused.
11
+ - **Starvation** of a degraded key (never selected), making recovery detection slow and creating “dead” keys.
12
+ - **Premature failure**: after upstream errors (notably transient 429 capacity errors), requests may **bubble to the HTTP client too early** instead of:
13
+ - quickly trying a healthier candidate, or
14
+ - continuing to route to available tiers/routes (default/backup) when possible.
15
+
16
+ We want a selection strategy that:
17
+
18
+ 1) starts fair (equal share),
19
+ 2) reduces selection probability for recently failing keys (but never to zero),
20
+ 3) gradually restores probability as time passes without errors,
21
+ 4) remains deterministic and testable (no randomness),
22
+ 5) does **not** cross-contaminate between aliases (no “model-level” shared cooldown across keys/aliases).
23
+
24
+ ## Goals
25
+
26
+ - **Fair use of healthy keys**: when multiple keys are healthy, they should all be selected over time.
27
+ - **Penalty, not ban**: an unhealthy key stays selectable, but less frequently.
28
+ - **Floor guarantee**: selection probability for a key must not drop below **50% of its initial share** within the same pool/tier.
29
+ - **Time-based recovery**: without new errors, a key’s share should slowly return to baseline.
30
+ - **Retry recovery preference**: for a retry attempt (e.g. request metadata carries `excludedProviderKeys`), routing should “snap back”
31
+ to the **currently healthiest** candidate first.
32
+ - **Alias isolation**: health/penalty is per `providerKey` (includes alias), never shared between aliases.
33
+
34
+ ## Non-goals
35
+
36
+ - This design does not change provider transport behavior (retries/backoff are still provider-layer concerns).
37
+ - This design does not “repair” tool calls or rewrite payload semantics.
38
+ - This design does not introduce cross-model or cross-alias global capacity tracking.
39
+
40
+ ## Architecture Placement (Rule: llmswitch-core owns routing)
41
+
42
+ This design splits responsibilities as:
43
+
44
+ - Host (RouteCodex) provides a `quotaView(providerKey) -> ProviderQuotaViewEntry`.
45
+ - It is the **source of truth** for “recent errors” metadata per providerKey.
46
+ - `sharedmodule/llmswitch-core` computes weights and selects a providerKey.
47
+ - Routing + selection policy lives here.
48
+
49
+ ## Proposed API/Data Model
50
+
51
+ ### 1) Extend `ProviderQuotaViewEntry`
52
+
53
+ Add optional fields to support time-decayed penalty:
54
+
55
+ - `selectionPenalty?: number` (existing; derived from recent error activity)
56
+ - `lastErrorAtMs?: number | null` (new; per-providerKey)
57
+ - `consecutiveErrorCount?: number` (new; per-providerKey, resets to 0 on success)
58
+
59
+ Hard exclusion continues to use:
60
+
61
+ - `inPool`, `cooldownUntil`, `blacklistUntil` (if blocked, do not select)
62
+
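As a rough illustration of the data model above, the entry and the hard-exclusion check might look like the sketch below. The name `QuotaViewEntrySketch`, the `providerKey` field, and the assumption that `cooldownUntil` / `blacklistUntil` are epoch-millisecond timestamps are all hypothetical; the real `ProviderQuotaViewEntry` may differ.

```typescript
// Hypothetical shape; field types are assumptions, not the real ProviderQuotaViewEntry.
interface QuotaViewEntrySketch {
  providerKey: string;               // includes the alias, so health is never shared across aliases
  inPool: boolean;
  cooldownUntil?: number | null;     // assumed: epoch ms
  blacklistUntil?: number | null;    // assumed: epoch ms
  selectionPenalty?: number;         // existing field
  lastErrorAtMs?: number | null;     // new, per-providerKey
  consecutiveErrorCount?: number;    // new, resets to 0 on success
}

// Hard exclusion: a blocked key is never selected, regardless of its weight.
function isHardBlocked(entry: QuotaViewEntrySketch, nowMs: number): boolean {
  if (!entry.inPool) return true;
  if (entry.cooldownUntil != null && entry.cooldownUntil > nowMs) return true;
  if (entry.blacklistUntil != null && entry.blacklistUntil > nowMs) return true;
  return false;
}
```

Keeping hard exclusion separate from the soft penalty means the weight computation only ever sees keys that are eligible for selection.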
63
+ ### 2) New derived values (llmswitch-core)
64
+
65
+ For each candidate providerKey, llmswitch-core computes:
66
+
67
+ - `multiplier m ∈ [minMultiplier, 1]`
68
+ - `weight = baseWeight * m`
69
+
70
+ Where:
71
+
72
+ - `minMultiplier = 0.5` (the “50% of initial share” floor)
73
+
74
+ ## Weight Formula
75
+
76
+ We use time-decayed effective error intensity to allow gradual recovery.
77
+
78
+ ### Parameters
79
+
80
+ - `baseWeight = 100` (resolution; does not change ratios)
81
+ - `halfLifeMs = 10 * 60 * 1000` (10 minutes)
82
+ - `beta = 0.1` (penalty slope; tuned so repeated errors quickly reduce share but respect floor)
83
+ - `minMultiplier = 0.5`
84
+
85
+ ### Computation
86
+
87
+ Given `nowMs`, `lastErrorAtMs`, `consecutiveErrorCount`:
88
+
89
+ 1) If `lastErrorAtMs` is missing, treat as no recent error.
90
+
91
+ 2) Time decay:
92
+
93
+ ```
94
+ decay = exp(-ln(2) * (nowMs - lastErrorAtMs) / halfLifeMs)
95
+ effectiveErrors = consecutiveErrorCount * decay
96
+ ```
97
+
98
+ 3) Multiplier:
99
+
100
+ ```
101
+ m = max(minMultiplier, min(1.0, 1.0 - beta * effectiveErrors))
102
+ ```
103
+
104
+ This ensures:
105
+
106
+ - No error → `effectiveErrors=0` → `m=1` (equal baseline share).
107
+ - Recent repeated errors → `m` drops quickly but never below `minMultiplier`.
108
+ - As time passes without errors → `decay→0` → `m` recovers toward 1.0.
109
+
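The three computation steps above can be sketched in TypeScript as follows. The function and type names (`computeMultiplier`, `computeWeight`, `HealthWeightParams`) are hypothetical, and two details are assumptions rather than part of the design: a missing or zero `consecutiveErrorCount` is treated as "no penalty", and a negative elapsed time (clock skew) is clamped to zero.

```typescript
// Hypothetical parameter bag mirroring the defaults listed under "Parameters".
interface HealthWeightParams {
  baseWeight: number;
  halfLifeMs: number;
  beta: number;
  minMultiplier: number;
}

const DEFAULT_PARAMS: HealthWeightParams = {
  baseWeight: 100,
  halfLifeMs: 10 * 60 * 1000,
  beta: 0.1,
  minMultiplier: 0.5,
};

function computeMultiplier(
  nowMs: number,
  lastErrorAtMs: number | null | undefined,
  consecutiveErrorCount: number | undefined,
  params: HealthWeightParams = DEFAULT_PARAMS,
): number {
  // Step 1: no recorded error means no penalty (missing count handled the same way: assumption).
  if (lastErrorAtMs == null || !consecutiveErrorCount) return 1;
  // Step 2: exponential time decay with the configured half-life.
  const elapsedMs = Math.max(0, nowMs - lastErrorAtMs); // clamp for clock skew (assumption)
  const decay = Math.exp((-Math.LN2 * elapsedMs) / params.halfLifeMs);
  const effectiveErrors = consecutiveErrorCount * decay;
  // Step 3: linear penalty, bounded to [minMultiplier, 1].
  const raw = 1 - params.beta * effectiveErrors;
  return Math.min(1, Math.max(params.minMultiplier, raw));
}

function computeWeight(multiplier: number, params: HealthWeightParams = DEFAULT_PARAMS): number {
  return params.baseWeight * multiplier;
}
```

With the defaults, three consecutive errors observed exactly one half-life ago give `decay = 0.5`, `effectiveErrors = 1.5`, and `m = 0.85`; ten consecutive errors observed just now hit the floor at `m = 0.5`.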
110
+ ## Selection Algorithm
111
+
112
+ ### Baseline: Smooth Weighted Round Robin (SWRR)
113
+
114
+ Use a deterministic SWRR implementation in the load balancer:
115
+
116
+ - No randomness; stable and testable.
117
+ - Any candidate with `weight >= 1` will be selected eventually.
118
+
119
+ We compute `weights` per request from quotaView, then select via SWRR within:
120
+
121
+ - the current route tier bucket (priorityTier grouping), and
122
+ - the current pool’s candidate ordering (after filtering).
123
+
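A minimal sketch of a deterministic smooth weighted round robin, in the style commonly associated with nginx's upstream balancer, is shown below. It is an illustration under assumptions, not the llmswitch-core load balancer: the class and type names are hypothetical, weights are passed in per call, and state for keys that leave the pool is simply left in the map.

```typescript
// Hypothetical candidate shape: key plus the per-request weight (baseWeight * m).
interface WeightedCandidate {
  key: string;
  weight: number;
}

class SmoothWeightedRoundRobin {
  // Running "current weight" per key; persists across selections within one pool/tier bucket.
  private current = new Map<string, number>();

  select(candidates: WeightedCandidate[]): string | null {
    let total = 0;
    let best: WeightedCandidate | null = null;
    let bestScore = -Infinity;
    for (const c of candidates) {
      if (c.weight <= 0) continue; // non-positive weights are never selected
      const score = (this.current.get(c.key) ?? 0) + c.weight;
      this.current.set(c.key, score);
      total += c.weight;
      // Strict ">" keeps the earliest candidate on ties, so ordering is deterministic.
      if (score > bestScore) {
        best = c;
        bestScore = score;
      }
    }
    if (best === null) return null;
    // The winner pays back the total weight, which spreads its picks out smoothly.
    this.current.set(best.key, bestScore - total);
    return best.key;
  }
}
```

For candidates `a`, `b`, `c` with weights 100, 100, 50 (that is, `c` at the 0.5 floor), five consecutive calls yield `a, b, c, a, b`: the degraded key still receives one pick in five rather than being starved.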
124
+ ### Retry path: “recover-to-best”
125
+
126
+ If a request is a retry attempt (detected via routing metadata, e.g. `excludedProviderKeys` is non-empty):
127
+
128
+ - Bypass SWRR; pick the candidate with the highest `m` (healthiest).
129
+ - Tie-break deterministically (stable order or round-robin pointer).
130
+
131
+ Note: this “retry” is **router-level re-routing** after a providerKey failed (to avoid picking the same key again in the same
132
+ request chain). It is not the same thing as provider HTTP retries to the same upstream endpoint.
133
+
134
+ Rationale: after an error, we want the next attempt to “snap back” to the best-known key, reducing the chance of repeated
135
+ failures; once stable again, SWRR resumes fair rotation.
136
+
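The retry path could be sketched as below. The names `isRetryAttempt`, `pickHealthiest`, and `ScoredCandidate` are hypothetical, and using stable candidate order as the tie-break is just one of the two options the design mentions. Whether `excludedProviderKeys` is the canonical retry signal is itself an open question in this document.

```typescript
// Hypothetical shapes for illustration only.
interface ScoredCandidate {
  key: string;
  multiplier: number; // the health multiplier m in [minMultiplier, 1]
}

interface RoutingMetadataSketch {
  excludedProviderKeys?: string[];
}

// A request counts as a retry when at least one providerKey has already been excluded.
function isRetryAttempt(metadata: RoutingMetadataSketch): boolean {
  return (metadata.excludedProviderKeys?.length ?? 0) > 0;
}

// Pick the candidate with the highest m, skipping keys that already failed in this
// request chain. Ties resolve to the earliest candidate (stable order).
function pickHealthiest(candidates: ScoredCandidate[], excluded: string[] = []): string | null {
  let best: ScoredCandidate | null = null;
  for (const c of candidates) {
    if (excluded.includes(c.key)) continue;
    if (best === null || c.multiplier > best.multiplier) best = c;
  }
  return best === null ? null : best.key;
}
```

A router would call `pickHealthiest` only when `isRetryAttempt` returns true and fall back to SWRR otherwise, so fair rotation resumes as soon as the request chain is no longer retrying.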
137
+ ## Behavioral Guarantees
138
+
139
+ - **No starvation**: as long as a key is not hard-blocked and `weight >= 1`, it will be selected eventually.
140
+ - **Floor**: `m >= 0.5` ensures a key’s chance cannot be crushed to near-zero by penalty alone.
141
+ - **Recovery**: time decay ensures that without new errors, `m` increases toward 1.0.
142
+ - **Isolation**: weights are computed strictly per `providerKey` and never shared across alias/model.
143
+
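A short check that the `m >= 0.5` floor is consistent with the "50% of initial share" goal, assuming a bucket of $n$ candidates with equal `baseWeight` $W$ and assuming SWRR's long-run pick frequency is proportional to weight. The worst case for a key $k$ is that it sits at the floor $\mu$ (`minMultiplier`) while every other key is fully healthy:

```latex
\text{share}_k
  = \frac{\mu W}{\mu W + (n-1)\,W}
  = \frac{\mu}{\mu + n - 1}
  \;\ge\; \frac{\mu}{n}
  \qquad \text{since } \mu \le 1 .
```

With $\mu = 0.5$ this gives $\text{share}_k = \frac{1}{2n-1} \ge \frac{1}{2n}$, which is half of the initial share $\frac{1}{n}$. The bound covers the SWRR path only; the retry path deliberately bypasses it.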
144
+ ## Configuration / Tuning
145
+
146
+ We can start with fixed defaults (above), then optionally expose:
147
+
148
+ - `virtualrouter.loadBalancing.healthWeighted`:
149
+ - `halfLifeMs`
150
+ - `beta`
151
+ - `minMultiplier` (default 0.5, must be in `(0,1]`)
152
+ - `baseWeight` (default 100)
153
+
154
+ If config is not provided, use defaults.
155
+
156
+ ## Tests (Coverage Requirements)
157
+
158
+ Add deterministic tests that cover:
159
+
160
+ 1) **Fair baseline**: equal weights should rotate through all candidates (no single key always hit).
161
+ 2) **Penalty reduces share**: a higher `consecutiveErrorCount` produces fewer hits over a fixed window.
162
+ 3) **Floor enforced**: under extreme penalty, the degraded key still gets hits (non-zero) and does not starve.
163
+ 4) **Time recovery**: with a mocked clock, as `nowMs` advances without errors, the key’s computed `m` increases.
164
+ 5) **Retry recover-to-best**: when retry metadata is present, selection should be the healthiest candidate.
165
+ 6) **Alias isolation**: two providerKeys with the same underlying “model name” must not share penalty/cooldown.
166
+
167
+ ## Rollout Plan
168
+
169
+ 1) Implement fields in host `quotaView` (per-providerKey only).
170
+ 2) Implement SWRR + dynamic weights in llmswitch-core load balancer.
171
+ 3) Switch selection to pass computed per-request weights into load balancer.
172
+ 4) Add tests and run llmswitch-core matrix build.
173
+ 5) Host build: `npm run build:dev` + `npm run install:global`.
174
+
175
+ ## Open Questions (Need Approval)
176
+
177
+ 1) Do we expose `halfLifeMs/beta/minMultiplier` as user-configurable now, or hardcode first?
178
+ 2) “Initial share” definition: this design interprets it as equal share within the same pool bucket at `m=1.0`.
179
+ 3) Retry detection: is `metadata.excludedProviderKeys` the canonical signal, or do we also mark retries explicitly?