@tencent-ai/agent-sdk 0.3.169-dev.e3066bb.202606081524 → 0.3.171
- package/cli/CHANGELOG.md +50 -18
- package/cli/builtin/workflows/deep-research.workflow.js +429 -0
- package/cli/dist/codebuddy-headless.js +207 -134
- package/cli/dist/web-ui/assets/{index-CiYuiLTV.js → index-i12Tc2lJ.js} +20 -20
- package/cli/dist/web-ui/docs/cn/cli/permission-modes.md +260 -0
- package/cli/dist/web-ui/docs/cn/cli/permissions.md +380 -0
- package/cli/dist/web-ui/docs/cn/cli/release-notes/README.md +3 -0
- package/cli/dist/web-ui/docs/cn/cli/release-notes/v2.103.2.md +21 -0
- package/cli/dist/web-ui/docs/cn/cli/release-notes/v2.103.3.md +15 -0
- package/cli/dist/web-ui/docs/cn/cli/release-notes/v2.103.4.md +13 -0
- package/cli/dist/web-ui/docs/cn/cli/workflows.md +281 -0
- package/cli/dist/web-ui/docs/en/cli/permission-modes.md +260 -0
- package/cli/dist/web-ui/docs/en/cli/permissions.md +380 -0
- package/cli/dist/web-ui/docs/en/cli/release-notes/README.md +3 -0
- package/cli/dist/web-ui/docs/en/cli/release-notes/v2.103.2.md +21 -0
- package/cli/dist/web-ui/docs/en/cli/release-notes/v2.103.3.md +15 -0
- package/cli/dist/web-ui/docs/en/cli/release-notes/v2.103.4.md +13 -0
- package/cli/dist/web-ui/docs/en/cli/workflows.md +281 -0
- package/cli/dist/web-ui/docs/search-index-en.json +1 -1
- package/cli/dist/web-ui/docs/search-index-zh.json +1 -1
- package/cli/dist/web-ui/docs/sidebar-en.json +1 -1
- package/cli/dist/web-ui/docs/sidebar-zh.json +1 -1
- package/cli/dist/web-ui/index.html +1 -1
- package/cli/dist/web-ui/sw.js +1 -1
- package/cli/package.json +3 -3
- package/cli/product.cloudhosted.json +5 -3
- package/cli/product.internal.json +27 -3
- package/cli/product.ioa.json +31 -3
- package/cli/product.json +46 -3
- package/cli/product.selfhosted.json +5 -3
- package/lib/index.d.ts +1 -1
- package/lib/index.d.ts.map +1 -1
- package/lib/types.d.ts +9 -1
- package/lib/types.d.ts.map +1 -1
- package/lib/utils/type-guards.d.ts.map +1 -1
- package/lib/utils/type-guards.js +1 -0
- package/lib/utils/type-guards.js.map +1 -1
- package/package.json +2 -3
package/cli/CHANGELOG.md
CHANGED
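@@ -7,6 +7,38 @@ All notable changes to CodeBuddy Code are documented here.
 
 ## [Unreleased]
 
+## [2.105.0] - 2026-06-09
+
+### 🎉 New Features
+
+- **Dynamic Workflow**: Adds the `Workflow` tool, which can schedule dozens of subagents within a single session to run large multi-step tasks together (deep research, cross-file refactoring, batch analysis). Scripts run in a deterministic sandbox with non-deterministic APIs such as `Date.now` / `Math.random` / `new Date()` disabled. The tool is deferred by default (`deferLoading`) and activates only on an explicit user trigger or semantic retrieval, so it does not occupy the default context.
+- **Deep Research**: Built-in deep-research workflow plus the `/deep-research` slash command; users can start multi-angle parallel research with a single sentence, and progress flows back through background task notifications.
+- **UltraCode long-horizon thinking**: When the keyword `ultracode` appears in user input or session metadata, a reminder context is injected automatically to steer the model toward deeper thinking; the effort level can be switched with the `/effort` command.
+- **`/workflows` workbench**: The TUI adds a Workflows panel for viewing active and past runs, saving custom scripts, and one-click resume.
+- **`/save-workflow` command**: Saves the current inline script as a named workflow (user-level or project-level) that can later be referenced by `name`.
+- **Unified background task channel**: Workflow, Bash `run_in_background`, and PowerShell background tasks share the same `<task-notification>` channel; the model pulls full results with the `TaskOutput` tool.
+- **Workflow pause / resume**: Workflow runs support pause and restart, so long tasks can be safely interrupted and continued.
+- **Workflow detail panel enhancements**: The Workflows panel shows the live status of running workflows, the list of agent subtasks, and pause / resume actions.
+- **Session title event**: stream-json output adds an AI-generated title event so upstream clients can keep the session title in sync.
+- **Disable built-in models switch**: Adds the `CODEBUDDY_DISABLE_BUILTIN_MODELS` environment variable; when set to `1` / `true`, the bundled static model list is stripped and the `/model` list keeps only models from the cloud API, custom models from `models.json`, and models injected via `ACC_PRODUCT_CONFIG_V3`.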
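The Dynamic Workflow entry above says scripts run in a deterministic sandbox with `Date.now`, `Math.random`, and `new Date()` disabled. The package's actual sandbox is not part of this diff, so the following is only a minimal sketch of the idea, assuming a hypothetical helper `runDeterministic` that swaps the two static APIs for throwing stubs while a synchronous callback runs. It does not intercept `new Date()`, which a real sandbox would also need to handle.

```javascript
// Hypothetical helper (not from the package): run `fn` with the
// non-deterministic globals replaced by stubs that throw, then restore them.
function runDeterministic(fn) {
  const originalNow = Date.now
  const originalRandom = Math.random
  const blocked = name => () => {
    throw new Error(name + " is disabled in deterministic mode")
  }
  Date.now = blocked("Date.now")
  Math.random = blocked("Math.random")
  try {
    return fn()
  } finally {
    // Always restore the originals, even when the callback throws.
    Date.now = originalNow
    Math.random = originalRandom
  }
}

// A pure computation passes through unchanged.
const sorted = runDeterministic(() => [3, 1, 2].sort((a, b) => a - b).join(","))

// A callback that touches Math.random is rejected.
let rejected = false
try {
  runDeterministic(() => Math.random())
} catch (e) {
  rejected = true
}
console.log(sorted, rejected)
```

Restoring in `finally` keeps the host process usable after a script fails, which matters when many runs share one process.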
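+
+### 🔧 Improvements
+
+- **Tool Search enhancements**: Extends tool discovery so deferred tools (including Workflow) are activated on demand.
+- **TaskOutput tool enhancements**: Supports output file management and streaming retrieval for background tasks.
+- **WebFetch tool tuning**: Adjusts redirect and caching semantics.
+- **Tasks panel**: Adds a task detail panel and a list panel for clearer organization.
+- **Workflow master switch**: Either the environment variable `CODEBUDDY_DISABLE_WORKFLOWS=1` or `disableWorkflows: true` in settings.json turns off the entire Workflow feature set (tool, keyword interception, effort switching).
+- **Built-in workflow assets externalized**: Built-in workflow scripts move to the `builtin/workflows/` directory; a new built-in workflow is discovered automatically once its file is dropped in, with no code changes.
+- **Workflow persistence extensions**: journal / durability gain fields to support restoring paused state and agent spawn information.
+- **Permission save-location wording**: Unifies the text describing where permission rules are saved to `Saved in <path>`, fixing the earlier inconsistency between `Saved in at` and `Checked in at`.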
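The Workflow master switch entry above describes an either-or rule: the environment variable or the settings flag alone is enough to disable the feature set. A minimal sketch of that rule follows; the function name `workflowsDisabled` and the shape of its two arguments are assumptions for illustration, and whether the real check accepts values other than `1` for the environment variable is not stated in the changelog.

```javascript
// Hypothetical helper: returns true when either source disables Workflow.
// `env` is a plain object of environment variables, `settings` the parsed
// settings.json content (may be null when no settings file exists).
function workflowsDisabled(env, settings) {
  const envOff = env.CODEBUDDY_DISABLE_WORKFLOWS === "1"
  const settingsOff = settings != null && settings.disableWorkflows === true
  return envOff || settingsOff
}

console.log(workflowsDisabled({ CODEBUDDY_DISABLE_WORKFLOWS: "1" }, {}))
console.log(workflowsDisabled({}, { disableWorkflows: true }))
console.log(workflowsDisabled({}, null))
```

In a real process the first argument would typically be `process.env`.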
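+
+### 🐛 Fixes
+
+- **/add-dir directory suggestions**: Fixes `/add-dir ` followed by a space (before any path is typed) wrongly opening the directory picker, with Enter treated as selecting the first directory and sending immediately; directories are no longer listed when the argument is empty.
+- **Enterprise edition model list empty or wrong**: Fixes custom-token login where a JWT `iss` in Keycloak realm form (last segment without the `sso-` prefix) caused `enterpriseId` parsing to fail and the `/model` list to degrade to the built-in default models. `enterpriseId` is now read first from the explicitly configured `authentication.attributes.enterpriseId`, falling back to JWT parsing.
+- **CLI loading indicator disappearing**: Fixes the main session's loading state being dragged to idle by a subagent during subagent research conversations, which made the loading indicator vanish while output continued; the main session's run state is now advanced only by its own state machine, and subagents drive the loading display only through the phase channel.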
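The enterprise model list fix above describes a lookup order: the explicitly configured `authentication.attributes.enterpriseId` wins, and JWT parsing is only a fallback. The sketch below illustrates that order under loud assumptions: `resolveEnterpriseId` and `enterpriseIdFromIssuer` are hypothetical names, and the issuer rule (last path segment must start with `sso-`, and the remainder is the id) is a guess at the legacy behaviour, not the package's actual parser.

```javascript
// Hypothetical legacy parser: accept only an `iss` whose last path segment
// starts with "sso-". The real parsing rules are not shown in the diff.
function enterpriseIdFromIssuer(iss) {
  if (typeof iss !== "string") return null
  const last = iss.replace(/\/+$/, "").split("/").pop()
  return last.startsWith("sso-") ? last.slice("sso-".length) : null
}

// Explicit configuration first, JWT issuer parsing second.
function resolveEnterpriseId(config, jwtPayload) {
  const explicit = config?.authentication?.attributes?.enterpriseId
  if (typeof explicit === "string" && explicit !== "") return explicit
  return enterpriseIdFromIssuer(jwtPayload?.iss)
}

// A Keycloak-style realm issuer has no "sso-" segment, so only the
// explicit attribute can supply the id in this sketch.
const keycloakPayload = { iss: "https://idp.example.com/realms/acme" }
const config = { authentication: { attributes: { enterpriseId: "ent-42" } } }
console.log(resolveEnterpriseId(config, keycloakPayload))
console.log(resolveEnterpriseId({}, keycloakPayload))
```

Putting the explicit value first means a deployment can correct a wrong or missing id without changing its identity provider.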
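+
 ## [2.103.4] - 2026-06-08
 
 ### 🔧 Improvements
@@ -255,7 +287,7 @@ All notable changes to CodeBuddy Code are documented here.
 
 ### 🎉 New Features
 
-- **OpenTelemetry custom export (traces)**: Internal traces can be exported to your own Collector via standard OTel environment variables. Enable with `CODEBUDDY_CODE_ENABLE_TELEMETRY=1
+- **OpenTelemetry custom export (traces)**: Internal traces can be exported to your own Collector via standard OTel environment variables. Enable with `CODEBUDDY_CODE_ENABLE_TELEMETRY=1`; standard variables such as `OTEL_EXPORTER_OTLP_ENDPOINT` / `OTEL_EXPORTER_OTLP_HEADERS` / `OTEL_SERVICE_NAME` are supported. See `docs/monitoring.md`. #41061
 - **Sandbox allowlist for ecosystem asset directories**: The sandbox security policy now allows writes to WorkBuddy ecosystem asset directories and cross-platform skill directories by default, reducing access restrictions for plugins, skills, and connectors. #42876
 
 ### 🔧 Feature Improvements
@@ -345,23 +377,23 @@ All notable changes to CodeBuddy Code are documented here.
 
 ### 🎉 New Features
 
--
-- **Skill / Subagent Frontmatter Hooks
-- **Hook
+- **Keybinding system**: Implements a full keybinding system modeled on the architecture of comparable tools, with context conditions, chord sequences (e.g. Ctrl+X Ctrl+K), and user-defined overrides. The config file lives at `~/.codebuddy/keybindings.json` and supports 16 contexts and 60+ actions. Adds the `/keybindings` command, a REST API (`/api/v1/keybindings`), and visual configuration in the Web UI (grouping by context, search, recording, conflict detection, Chinese/English i18n).
+- **Skill / Subagent Frontmatter Hooks**: In line with comparable tools, a `hooks:` field can be configured in the YAML frontmatter of `.codebuddy/agents/*.md` and `.codebuddy/skills/*/SKILL.md`; hooks are registered and cleaned up automatically within the subagent / fork-skill lifecycle. Supports four hook types (`command` / `prompt` / `agent` / `http`), automatic removal with `once: true`, and event matcher semantics. **Safety gate by default**: frontmatter hooks from sources other than the product's built-ins are disabled by default (including local sources and the plugin marketplace) and take effect only after enabling `allowUntrustedFrontmatterHooks: true` in `settings.json`.
+- **Hook event and parameter alignment (batches A/B)**: Adds 6 hook events (PostToolUseFailure, SubagentStart, StopFailure, PostCompact, ConfigChange, InstructionsLoaded); tool event payloads gain fields such as `tool_use_id` / `permission_mode`; Stop / SubagentStop events add `last_assistant_message`; SubagentStop adds `agent_id` / `agent_type` / `agent_transcript_path`; Notification adds `title`; SessionEnd reasons are expanded. Adds an async hook protocol (`{"async": true, "asyncTimeout": N}` lets execution proceed immediately); PostToolUse supports overriding MCP tool output via `updatedMCPToolOutput`; SessionStart supports injecting first-turn context via `initialUserMessage`.
 - **Hook event expansion (batch C1)**: Adds 8 hook events: PermissionRequest / PermissionDenied (interceptable before and after permission approval), TaskCreated / TaskCompleted (carrying task_id / teammate_name, etc.), TeammateIdle (fires when a team member goes from busy to idle), Setup (fires at process startup / maintenance; stdout can inject first-turn context), and Elicitation / ElicitationResult (reserved for before and after MCP elicitation).
-- **Hook configuration expansion (batch C2)**: Adds two hook types, `http` / `agent`; adds `if` permission rule syntax (e.g. `Bash(git *)` / `Write(*.ts)`) matching on tool name and arguments; adds the `shell` / `once` / `asyncRewake` / `allowedEnvVars` / `statusMessage` fields; plugin hooks are injected with the new environment variable `
-- **Hook file watching (batch C3)**: Adds `FileChanged` (SessionStart / Setup hooks declare watched paths via `hookSpecificOutput.watchPaths`) and `CwdChanged` (fires automatically when cwd changes after a Bash / PowerShell tool run). Under Windows Git Bash, hook
-- **Skills / slash commands / Subagent variable placeholders**: In `.md`
+- **Hook configuration expansion (batch C2)**: Adds two hook types, `http` / `agent`; adds `if` permission rule syntax (e.g. `Bash(git *)` / `Write(*.ts)`) matching on tool name and arguments; adds the `shell` / `once` / `asyncRewake` / `allowedEnvVars` / `statusMessage` fields; plugin hooks are injected with the new environment variables `CODEBUDDY_PLUGIN_DATA` and the expanded `*_PLUGIN_OPTION_*`.
+- **Hook file watching (batch C3)**: Adds `FileChanged` (SessionStart / Setup hooks declare watched paths via `hookSpecificOutput.watchPaths`) and `CwdChanged` (fires automatically when cwd changes after a Bash / PowerShell tool run). Under Windows Git Bash, hook subprocesses automatically normalize the project-path environment variables.
+- **Skills / slash commands / Subagent variable placeholders**: Variable placeholder substitution is supported in `.md` files. Supported: `${CODEBUDDY_PLUGIN_ROOT}` / `${CODEBUDDY_SKILL_DIR}` / `${CODEBUDDY_SESSION_ID}` and any uppercase environment variable (including the `${MY_ENV_VAR:-default}` default-value syntax); unset placeholders are left as-is.
 - **Third-party plugin marketplace auto-update**: Adds the `autoUpdateThirdPartyMarketplaces` product config and the `CODEBUDDY_AUTO_UPDATE_THIRD_PARTY_MARKETPLACES` environment variable to enable auto-update of third-party plugin marketplaces globally (priority: settings env > process.env > product config > off by default). Marketplace config entries gain an optional `autoUpdate` field.
 - **Remote Gateway `/api/v1/runs` configurable run timeout**: The default timeout rises from 10 minutes to 30 minutes; adds the `gateway.runTimeoutMs` setting; adds the HTTP header `X-Codebuddy-Run-Timeout` for a per-request override; 0 or a negative value disables timeout protection (override priority: header > settings > default).
 - **Web UI PWA auto-update**: Every 15 minutes the frontend asks the backend whether a new Service Worker version exists; it reloads automatically when the conversation is idle, shows a toast with a "Refresh now" button when one is in progress, and subscribes to run state so the refresh happens once the agent is idle, without disturbing an ongoing conversation.
 - **UE projects auto-exclude noisy large-repo directories**: When a `*.uproject` is detected in cwd, Grep / Glob ripgrep calls automatically add the exclusions `!Intermediate/` `!DerivedDataCache/` `!Saved/` `!Binaries/` `!Build/` `!.vs/`, keeping Unreal Engine build artifacts out of search results; disable with `settings.disableUEAutoExclude: true`. Grep content search adds `--max-columns 500` by default so a single MB-sized line cannot flood stdout.
 - **ACP file system methods**: Implements the `fs/*` method family (list / read / write / exists / makeDir / remove / rename / getInfo / watchDir / unwatch) in the ACP StreamManager, letting Desktop access the working directory's file system over the ACP JSON-RPC protocol so the "All files" tab works in Desktop.
-- **Subagent frontmatter `skills:`
+- **Subagent frontmatter `skills:` preloading**: When a subagent starts, the full content of each skill listed in frontmatter `skills:` is pre-injected into the subagent's history as an `isMeta:true` user message, with a 3-tier fallback for skill names.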
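The variable placeholder entry in the list above gives three rules: uppercase names are substituted, `${NAME:-default}` supplies a default, and unset placeholders are left as written. A minimal sketch of those rules follows. The function name `expandPlaceholders` and the regular expression are assumptions, and the sketch applies the default only when the variable is unset; whether the real implementation also treats an empty value as unset (as POSIX shells do for `:-`) is not stated.

```javascript
// Hypothetical helper: substitute ${NAME} and ${NAME:-default} in `text`
// using the values in `vars`. Only uppercase names are considered.
function expandPlaceholders(text, vars) {
  const pattern = /\$\{([A-Z][A-Z0-9_]*)(?::-([^}]*))?\}/g
  return text.replace(pattern, (match, name, fallback) => {
    if (vars[name] !== undefined) return vars[name]
    if (fallback !== undefined) return fallback
    return match // unset and no default: keep the placeholder as written
  })
}

const template = "root=${CODEBUDDY_PLUGIN_ROOT} mode=${MY_MODE:-fast} x=${UNSET_VAR}"
console.log(expandPlaceholders(template, { CODEBUDDY_PLUGIN_ROOT: "/opt/plugin" }))
```

Leaving unknown placeholders untouched keeps shell snippets inside a skill file intact when they happen to use the same `${...}` syntax.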
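 
 ### 🔧 Feature Improvements
 
-- **PowerShell multi-layer security defenses**: CLM
+- **PowerShell multi-layer security defenses**: CLM type allowlist (about 90 types), Git safety protections (bare-repository attacks / NTFS 8.3 short names / archive extractor detection), dangerous cmdlet classification (7 sets + 120+ aliases), destructive command UI warnings (16 patterns), and Unicode dash support (en/em/horizontal-bar).
 - **ACP connection stability**: GET SSE connections add a 30-second heartbeat to avoid being reclaimed by reverse proxies (cloudflared / nginx); on close, the connection waits for pending history writes to finish so the JSONL log is not truncated. The `idle_timeout` scenario gets richer diagnostic logs to help locate the stall when a session goes a long time without updates.
 - **Dynamic Bash tool timeout hint**: The "maximum timeout" value in the tool description was hard-coded to 600000ms, so the model saw the old value even after the user raised `BASH_MAX_TIMEOUT_MS`. The description now injects `bashMaxTimeoutMs` / `bashDefaultTimeoutMs` dynamically, matching the actual runtime clamp, and guides the model to omit the `timeout` parameter for long tasks.
 - **TUI input box Ctrl+F / Ctrl+B cursor navigation**: Previously Ctrl+F was inserted as a normal character and Ctrl+B was silently swallowed; both now move the cursor one character left or right in bash readline style, consistent with the existing Ctrl+A / Ctrl+E / Ctrl+K / Ctrl+U. Multi-byte characters and emoji are handled correctly.
@@ -405,7 +437,7 @@ All notable changes to CodeBuddy Code are documented here.
 - **Auto-compact rendering XML tags**: Fixes summaries being rendered as bare `<conversation_history_summary>` / `<summary>` tags in the chat UI when auto-compaction triggers; they now use the same `compact_group` rendering as a user-initiated `/compact`. The chat UI uniformly shows `/compact`, the internal command `/_compact` is normalized in user message bubbles, and internal commands with the `/_` prefix are hidden.
 - **Compact failure keeps the user message**: When pre-send auto-compaction fails in a long session, the user message is saved, the request ends, and a clear failure notice is shown, avoiding a silent hang.
 - **Compact history `Unknown content` error**: Fixes a possible `Unknown content` error that interrupted continuing a conversation from a historical session.
-- **PTL context-overflow recovery**: Adds per-API
+- **PTL context-overflow recovery**: Adds a fallback that truncates history precisely by API round, covering error messages from several mainstream model families. The safety check for composite commands (`&&` / `;` / `|`) now evaluates each subcommand independently and takes the highest level.
 - **Empty-stream auto-retry**: Detects the "empty stream" case where an upstream OpenAI-compatible gateway sends only placeholder / heartbeat frames and then disconnects, and recovers automatically by reusing the stream-timeout retry pipeline (max=1); the retry budget resets per turn.
 - **Request body too large recovery**: Improves automatic recovery when a large context or large images push the request body over the limit; `ModelProvider` axios explicitly sets `maxBodyLength: Infinity, maxContentLength: Infinity` so the bundled `follow-redirects` default 10MB cap does not block requests locally.
 - **Model error fallback status code**: When `ModelErrorAnalyzer.analyze` and the outer catch in `ModelProvider` cannot attribute an error, the fallback response code changes from `500 Internal Client Error` to `400 Client Error`.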
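The PTL entry above mentions that composite commands joined by `&&`, `;`, or `|` are now checked per subcommand, with the highest level winning. A minimal sketch of that "evaluate each part, keep the worst" rule follows. Everything named here is hypothetical: the three risk levels, the toy rules in `classifySingle`, and the naive split, which ignores quoting and escaping that a real shell parser would have to respect.

```javascript
// Hypothetical risk levels, ordered from least to most severe.
const RISK_ORDER = ["safe", "review", "dangerous"]

// Toy single-command classifier, for illustration only.
function classifySingle(cmd) {
  if (/^rm\b/.test(cmd)) return "dangerous"
  if (/^git\s+push\b/.test(cmd)) return "review"
  return "safe"
}

// Split on && ; | (which also covers ||), classify each part
// independently, and return the most severe level found.
function classifyComposite(command) {
  const parts = command.split(/&&|;|\|/).map(p => p.trim()).filter(Boolean)
  let worst = 0
  for (const part of parts) {
    worst = Math.max(worst, RISK_ORDER.indexOf(classifySingle(part)))
  }
  return RISK_ORDER[worst]
}

console.log(classifyComposite("ls -la && rm -rf build | cat"))
console.log(classifyComposite("git status && git push"))
console.log(classifyComposite("ls; pwd"))
```

Taking the maximum means a harmless prefix such as `ls &&` cannot lower the rating of a risky command that follows it.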
@@ -455,7 +487,7 @@ All notable changes to CodeBuddy Code are documented here.
 - **Worktree relative cwd**: Fixes not entering the correct relative directory when a worktree is reused.
 - **Web UI editor save emptying files**: The frontend `writeFile` did not set Content-Type explicitly, so the body was consumed by the body-parser.text() middleware and the backend wrote an empty Buffer to disk; the frontend now sets `application/octet-stream` explicitly, the backend returns 409 as a safeguard for "empty request body + existing non-empty file + no `allowEmpty=1`", and CodeEditor switches to `defaultValue` plus a useEffect guard.
 - **Gateway CORS static assets**: Fixes same-origin IP access to static assets (home page / `/assets/*` / Service Worker) being wrongly blocked when the CORS allowlist is enabled; browser module script / modulepreload static assets no longer go through the API CORS allowlist check.
-- **anydev remote IDE `/ide` command**: Adds `CODEBUDDY_IDE_PORT` / `CODEBUDDY_IDE_HOST` / `CODEBUDDY_IDE_SKIP_VALID_CHECK`
+- **anydev remote IDE `/ide` command**: Adds the `CODEBUDDY_IDE_PORT` / `CODEBUDDY_IDE_HOST` / `CODEBUDDY_IDE_SKIP_VALID_CHECK` environment variables for cases where the container cannot access the host-side lock file, the IDE runs on a remote host, or paths differ inside and outside the container.
 - **Read tool image return format**: Fixes the Read tool returning an incorrect data structure when reading images; it now returns the image content block format.
 - **`@openai/agents-openai` patch**: Fixes tool_calls merging in assistant messages.
 - **Logout history flicker**: Cached messages are cleared on account logout so an old pending question does not flash briefly after logging back in.
@@ -577,7 +609,7 @@ All notable changes to CodeBuddy Code are documented here.
 
 ### 🐛 Bug Fixes
 
-- **Custom model compatibility**: Fixes custom models (e.g. `custom-local:xxx
+- **Custom model compatibility**: Fixes custom models (e.g. `custom-local:xxx`) leaving incompatible fields (`verbosity`, `reasoning_summary`, `reasoning_effort`) in requests, which made third-party OpenAI-compatible endpoints return 400
 
 ## [2.94.3] - 2026-04-29
 
@@ -1085,7 +1117,7 @@ All notable changes to CodeBuddy Code are documented here.
 - **Windows PowerShell tool**: Adds a native PowerShell command execution tool supporting PowerShell 7+ and Windows PowerShell 5.1, with automatic version detection and matching syntax guidance
 - **Fallback without Git Bash**: On Windows without Git Bash installed, the Bash tool is disabled automatically and the PowerShell tool becomes the only shell tool; the CLI no longer crashes and exits
 - **GLM-5.1 models**: Adds GLM-5.1 and GLM-5.1-ioa model configurations
-- **Memory
+- **Memory system enhancements**: Typed Memory is enabled by default, managing memories in a structured way with 4 memory types (user/feedback/project/reference)
 - **Memory relevance selection**: Automatically selects up to 5 relevant memories to inject into context based on the user's query
 - **Background memory extraction**: Memories can be extracted automatically from the conversation after it ends (must be enabled manually)
 - **Memory freshness management**: Memories older than 1 day carry a staleness warning as a reminder to verify before use
@@ -2159,7 +2191,7 @@ All notable changes to CodeBuddy Code are documented here.
 
 - **Built-in Bash subagent**: Adds a Bash-specific subagent focused on command execution tasks (git operations, build tools, test runs, etc.), using a lightweight model for fast responses
 - **Command injection detection**: Adds Bash command injection detection that blocks potentially dangerous commands such as backtick substitution, `$()` command substitution, and newline injection
--
+- **Command prefix extraction**: Adds multi-word command prefix extraction with support for environment-variable prefixes and subcommand recognition, aligned with the common policy_spec specification
 
 ### 🔧 Feature Improvements
 
@@ -2955,7 +2987,7 @@ All notable changes to CodeBuddy Code are documented here.
 
 ### 🐛 Bug Fixes
 
--
+- **Context window overflow**: Fixes requests failing when the model exceeds the token limit under heavy load
 - **Windows config path**: Fixes user configuration failing to load on Windows because the global config path was wrong (the `%appdata%` environment variable was not expanded correctly)
 
 ## [2.36.2] - 2026-01-17
@@ -3016,7 +3048,7 @@ All notable changes to CodeBuddy Code are documented here.
 
 ### ✨ New Features
 
--
+- **Third-party API key support**: Adds a third-party model API key parameter and matching environment variable so third-party model APIs can be used directly (requires a custom base URL)
 
 ### 🔧 Feature Improvements
 
@@ -3052,7 +3084,7 @@ All notable changes to CodeBuddy Code are documented here.
 
 ### 🔧 Feature Improvements
 
--
+- **Model display**: Improves how models are shown in the model selector
 - **Session restore**: Improves the display logic when restoring historical sessions
 - **Input experience**: Fixes interaction issues with Chinese input methods in some scenarios
 
@@ -3566,5 +3598,5 @@ All notable changes to CodeBuddy Code are documented here.
 ### ✨ Initial Release
 
 - Basic conversation features
--
+- Model support
 - Command-line argument parsing
package/cli/builtin/workflows/deep-research.workflow.js
@@ -0,0 +1,429 @@
+export const meta = {
+  name: 'deep-research',
+  description: 'Deep research harness — fan-out web searches, fetch sources, adversarially verify claims, synthesize a cited report.',
+  whenToUse: 'When the user wants a deep, multi-source, fact-checked research report on any topic. BEFORE invoking, check if the question is specific enough to research directly — if underspecified (e.g., "what car to buy" without budget/use-case/region), ask 2-3 clarifying questions to narrow scope. Then pass the refined question as args, weaving the answers in.',
+  phases: [{"title":"Scope","detail":"Decompose question (from args) into 5 search angles"},{"title":"Search","detail":"5 parallel WebSearch agents, one per angle"},{"title":"Fetch","detail":"URL-dedup, fetch top 15 sources, extract falsifiable claims"},{"title":"Verify","detail":"3-vote adversarial verification per claim (need 2/3 refutes to kill)"},{"title":"Synthesize","detail":"Merge semantic dupes, rank by confidence, cite sources"}],
+}
+
+// deep-research: Scope → pipeline(Search → URL-dedup → Fetch+Extract) → 3-vote Verify → Synthesize
+// Model compatibility: no `schema` param in agent() — deepseek-chat doesn't support
+// reasoning_effort + structured output simultaneously. Instead, models output JSON text
+// and we parse/validate manually.
+
+// ─── JSON extraction + validation helpers (replaces agent schema option) ───
+function extractJSON(text) {
+  if (typeof text !== 'string') return null
+  // Try ```json ... ``` block first, then ``` ... ```, then raw { }
+  const patterns = [
+    /```json\s*([\s\S]*?)```/,
+    /```\s*([\s\S]*?)```/,
+    /(\{[\s\S]*\})/,
+  ]
+  for (const pat of patterns) {
+    const m = text.match(pat)
+    if (m) {
+      try { return JSON.parse(m[1].trim()) } catch {}
+    }
+  }
+  return null
+}
+
+function validateJSON(data, schema) {
+  if (!data || typeof data !== 'object') return false
+  for (const key of (schema.required || [])) {
+    if (!(key in data)) return false
+    const val = data[key]
+    const prop = schema.properties?.[key]
+    if (prop) {
+      if (prop.type === 'array') {
+        if (!Array.isArray(val)) return false
+        if (prop.minItems && val.length < prop.minItems) return false
+        if (prop.maxItems && val.length > prop.maxItems) return false
+        if (prop.items) {
+          for (const item of val) {
+            if (prop.items.required) {
+              for (const r of prop.items.required) {
+                if (!(r in item)) return false
+              }
+            }
+            if (prop.items.properties && prop.items.properties.relevance) {
+              if (!['high','medium','low'].includes(item.relevance)) return false
+            }
+            if (prop.items.properties && prop.items.properties.importance) {
+              if (!['central','supporting','tangential'].includes(item.importance)) return false
+            }
+          }
+        }
+      } else if (prop.type === 'object') {
+        if (!val || typeof val !== 'object') return false
+      } else if (prop.type === 'string') {
+        if (typeof val !== 'string') return false
+      } else if (prop.type === 'boolean') {
+        if (typeof val !== 'boolean') return false
+      }
+      if (prop.enum && !prop.enum.includes(val)) return false
+    }
+  }
+  return true
+}
+
+function safeParse(text, schema, fallback = null) {
+  const parsed = extractJSON(text)
+  if (parsed && validateJSON(parsed, schema)) return parsed
+  return fallback
+}
+
+const VOTES_PER_CLAIM = 3
+const REFUTATIONS_REQUIRED = 2
+const MAX_FETCH = 15
+const MAX_VERIFY_CLAIMS = 25
+
+// ─── Schemas (for validation, not agent() calls) ───
+const SCOPE_SCHEMA = {
+  type: "object", required: ["question", "angles", "summary"],
+  properties: {
+    question: { type: "string" },
+    summary: { type: "string" },
+    angles: { type: "array", minItems: 3, maxItems: 6, items: {
+      type: "object", required: ["label", "query"],
+      properties: { label: { type: "string" }, query: { type: "string" }, rationale: { type: "string" } },
+    }},
+  },
+}
+const SEARCH_SCHEMA = {
+  type: "object", required: ["results"],
+  properties: {
+    results: { type: "array", maxItems: 6, items: {
+      type: "object", required: ["url", "title", "relevance"],
+      properties: {
+        url: { type: "string" }, title: { type: "string" }, snippet: { type: "string" },
+        relevance: { enum: ["high", "medium", "low"] },
+      },
+    }},
+  },
+}
+const EXTRACT_SCHEMA = {
+  type: "object", required: ["claims", "sourceQuality"],
+  properties: {
+    sourceQuality: { enum: ["primary", "secondary", "blog", "forum", "unreliable"] },
+    publishDate: { type: "string" },
+    claims: { type: "array", maxItems: 5, items: {
+      type: "object", required: ["claim", "quote", "importance"],
+      properties: { claim: { type: "string" }, quote: { type: "string" }, importance: { enum: ["central", "supporting", "tangential"] } },
+    }},
+  },
+}
+const VERDICT_SCHEMA = {
+  type: "object", required: ["refuted", "evidence", "confidence"],
+  properties: {
+    refuted: { type: "boolean" }, evidence: { type: "string" },
+    confidence: { enum: ["high", "medium", "low"] }, counterSource: { type: "string" },
+  },
+}
+const REPORT_SCHEMA = {
+  type: "object", required: ["summary", "findings", "caveats"],
+  properties: {
+    summary: { type: "string" },
+    findings: { type: "array", items: {
+      type: "object", required: ["claim", "confidence", "sources", "evidence"],
+      properties: {
+        claim: { type: "string" }, confidence: { enum: ["high", "medium", "low"] },
+        sources: { type: "array", items: { type: "string" } }, evidence: { type: "string" }, vote: { type: "string" },
+      },
+    }},
+    caveats: { type: "string" },
+    openQuestions: { type: "array", items: { type: "string" } },
+  },
+}
+
+// ─── Phase 0: Scope — decompose question into search angles ───
+phase("Scope")
+const QUESTION = (typeof args === "string" && args.trim()) || ""
+if (!QUESTION) {
+  return { error: "No research question provided. Pass it as args: Workflow({name: 'deep-research', args: '<question>'})." }
+}
+const scope = await agent(
+  "Decompose this research question into complementary search angles.\n\n" +
+  "## Question\n" + QUESTION + "\n\n" +
+  "## Task\n" +
+  "Generate 5 distinct web search queries that together cover the question from different angles. Pick angles that suit the question's domain. Examples:\n" +
+  "- broad/primary · academic/technical · recent news · contrarian/skeptical · practitioner/implementation\n" +
+  "- For medical: anatomy · common causes · serious differentials · authoritative refs · red flags\n" +
+  "- For tech: state-of-art · benchmarks · limitations · industry adoption · cost/tradeoffs\n\n" +
+  "Make queries specific enough to surface high-signal results. Avoid redundancy.\n" +
+  "Return ONLY valid JSON (no markdown, no explanation) matching this schema:\n" +
+  '{"question": "string", "summary": "string", "angles": [{"label": "string", "query": "string", "rationale": "string"}]}\n' +
+  "with 3-6 angles. The JSON object only — nothing before or after.",
+  { label: "scope", model: "lite" }
+)
+const scopeParsed = safeParse(scope, SCOPE_SCHEMA)
+if (!scopeParsed) {
+  return { error: "Scope agent returned no result — cannot decompose the research question." }
+}
+log("Q: " + QUESTION.slice(0, 80) + (QUESTION.length > 80 ? "…" : ""))
+log("Decomposed into " + scopeParsed.angles.length + " angles: " + scopeParsed.angles.map(a => a.label).join(", "))
+
+// ─── Dedup state — accumulates across searchers as they complete ───
+const normURL = u => {
+  try {
+    const p = new URL(u)
+    return (p.hostname.replace(/^www\./, "") + p.pathname.replace(/\/$/, "")).toLowerCase()
+  } catch { return u.toLowerCase() }
+}
+const seen = new Map()
+const dupes = []
+const budgetDropped = []
+const relRank = { high: 0, medium: 1, low: 2 }
+let fetchSlots = MAX_FETCH
+
+// ─── Prompts ───
+const SEARCH_PROMPT = (angle) =>
+  "## Web Searcher: " + angle.label + "\n\n" +
+  "Research question: \"" + QUESTION + "\"\n\n" +
+  "Your angle: **" + angle.label + "** — " + (angle.rationale || "") + "\n" +
+  "Search query: `" + angle.query + "`\n\n" +
+  "## Task\nUse WebSearch with the query above (or a refined version). Return the top 4-6 most relevant results.\n" +
+  "Rank by relevance to the ORIGINAL question, not just the search query. Skip obvious SEO spam/content farms.\n" +
+  "Include a short snippet capturing why each result is relevant.\n\n" +
+  "Return ONLY valid JSON (no markdown, no explanation) matching:\n" +
+  '{"results": [{"url": "string", "title": "string", "snippet": "string", "relevance": "high|medium|low"}]}\n' +
+  "with 4-6 results. The JSON object only — nothing before or after."
+
+const FETCH_PROMPT = (source, angle) =>
+  "## Source Extractor\n\n" +
+  "Research question: \"" + QUESTION + "\"\n\n" +
+  "Fetch and extract key claims from this source:\n" +
+  "**URL:** " + source.url + "\n**Title:** " + source.title + "\n**Found via:** " + angle + " search\n\n" +
+  "## Task\n1. Use WebFetch to retrieve the page content.\n" +
+  "2. Assess source quality: primary research/institution? secondary reporting? blog/opinion? forum? unreliable?\n" +
+  "3. Extract 2-5 FALSIFIABLE claims that bear on the research question. Each claim must:\n" +
+  "   - be a concrete, checkable statement (not vague generalities)\n" +
+  "   - include a direct quote from the source as support\n" +
+  "   - be rated central/supporting/tangential to the research question\n" +
+  "4. Note publish date if available.\n\n" +
+  "If the fetch fails or the page is irrelevant/paywalled, return claims: [] and sourceQuality: \"unreliable\".\n\n" +
+  "Return ONLY valid JSON (no markdown, no explanation) matching:\n" +
+  '{"sourceQuality": "primary|secondary|blog|forum|unreliable", "publishDate": "string", "claims": [{"claim": "string", "quote": "string", "importance": "central|supporting|tangential"}]}\n' +
+  "with 2-5 claims per source. The JSON object only — nothing before or after."
+
+const VERIFY_PROMPT = (claim, v) =>
+  "## Adversarial Claim Verifier (voter " + (v + 1) + "/" + VOTES_PER_CLAIM + ")\n\n" +
+  "Be SKEPTICAL. Try to REFUTE this claim. ≥" + REFUTATIONS_REQUIRED + "/" + VOTES_PER_CLAIM + " refutations kill it.\n\n" +
+  "## Research question\n" + QUESTION + "\n\n" +
+  "## Claim under review\n\"" + claim.claim + "\"\n\n" +
+  "**Source:** " + claim.sourceUrl + " (" + claim.sourceQuality + ")\n" +
+  "**Supporting quote:** \"" + claim.quote + "\"\n\n" +
+  "## Checklist\n" +
+  "1. Is the claim actually supported by the quote, or is it an overreach/misread?\n" +
+  "2. WebSearch for contradicting evidence — does any credible source dispute or heavily qualify this?\n" +
+  "3. Is the source quality sufficient for the claim's strength? (extraordinary claims need primary sources)\n" +
+  "4. Is the claim outdated? (check dates — old claims about fast-moving fields are suspect)\n" +
+  "5. Is this a marketing claim / press release / cherry-picked benchmark / forum speculation?\n\n" +
+  "**refuted=true** if: unsupported by quote / contradicted / low-quality source for strong claim / outdated / marketing fluff.\n" +
+  "**refuted=false** ONLY if: claim is well-supported, current, and source quality matches claim strength.\n" +
+  "Default to refuted=true if uncertain.\n\n" +
+  "Return ONLY valid JSON (no markdown, no explanation) matching:\n" +
+  '{"refuted": true|false, "evidence": "string", "confidence": "high|medium|low", "counterSource": "string"}\n' +
+  "The JSON object only — nothing before or after. Evidence MUST be specific."
+
+// ─── Pipeline: search → dedup → fetch+extract (no barrier) ───
+const searchResults = await pipeline(
+  scopeParsed.angles,
+
+  angle => agent(SEARCH_PROMPT(angle), {
+    label: "search:" + angle.label, phase: "Search", model: "lite"
+  }).then(r => {
+    const parsed = safeParse(r, SEARCH_SCHEMA)
+    if (!parsed) return null
+    log(angle.label + ": " + parsed.results.length + " results")
+    return { angle: angle.label, results: parsed.results }
+  }),
+
+  searchResult => {
+    const sorted = [...searchResult.results].sort((a, b) => relRank[a.relevance] - relRank[b.relevance])
+    const novel = sorted.filter(r => {
+      const key = normURL(r.url)
+      if (seen.has(key)) {
+        dupes.push({ ...r, angle: searchResult.angle, dupOf: seen.get(key) })
+        return false
+      }
+      if (fetchSlots <= 0 && relRank[r.relevance] >= 1) {
+        budgetDropped.push({ ...r, angle: searchResult.angle })
+        return false
+      }
+      seen.set(key, { angle: searchResult.angle, title: r.title })
+      fetchSlots--
+      return true
+    })
+    if (novel.length < searchResult.results.length) {
+      log(searchResult.angle + ": " + novel.length + " novel (" + (searchResult.results.length - novel.length) + " filtered)")
+    }
+    return parallel(
+      novel.map(source => () => {
+        // host is only used as the agent label; a parse failure must not block the flow.
+        // URLs returned by the search agent are often malformed: bare domains missing a scheme, markdown link
+        // fragments, or a whole title stuffed into the url field. Three fallbacks; only if all fail do we fall back to "unknown".
+        let host = "unknown"
+        const raw = String(source.url || "").trim()
+        try {
+          host = new URL(raw).hostname.replace(/^www\./, "")
+        } catch {
+          // 1) Bare domain missing a scheme → prepend https:// and retry
+          try {
+            host = new URL("https://" + raw.replace(/^\/+/, "")).hostname.replace(/^www\./, "")
+          } catch {
+            // 2) Markdown link fragment / messy string → regex-grab the first host-like token.
+            //    The `\b` boundary lets tokens followed by punctuation, such as ".com)" / ".com,", match too.
+            const m = raw.match(/([a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,})\b/i)
+            if (m) host = m[1].replace(/^www\./, "")
+          }
+        }
+        return agent(FETCH_PROMPT(source, searchResult.angle), {
+          label: "fetch:" + host,
+          phase: "Fetch", model: "lite",
+        }).then(ext => {
+          const parsed = safeParse(ext, EXTRACT_SCHEMA)
+          if (!parsed) return null
+          return {
+            url: source.url, title: source.title, angle: searchResult.angle,
+            sourceQuality: parsed.sourceQuality, publishDate: parsed.publishDate,
+            claims: parsed.claims.map(c => ({ ...c, sourceUrl: source.url, sourceQuality: parsed.sourceQuality })),
+          }
+        }).catch(e => {
+          log("fetch failed: " + source.url + " — " + (e.message || e))
+          return { url: source.url, title: source.title, angle: searchResult.angle, sourceQuality: "unreliable", claims: [] }
+        })
+      })
+    )
+  }
+)
+
+const allSources = searchResults.flat().filter(Boolean)
+const allClaims = allSources.flatMap(s => s.claims)
+const impRank = { central: 0, supporting: 1, tangential: 2 }
+const qualRank = { primary: 0, secondary: 1, blog: 2, forum: 3, unreliable: 4 }
+
+const rankedClaims = [...allClaims]
+  .sort((a, b) => (impRank[a.importance] - impRank[b.importance]) || (qualRank[a.sourceQuality] - qualRank[b.sourceQuality]))
+  .slice(0, MAX_VERIFY_CLAIMS)
+
+log("Fetched " + allSources.length + " sources → " + allClaims.length + " claims → verifying top " + rankedClaims.length)
+
+if (rankedClaims.length === 0) {
+  return {
+    question: QUESTION,
+    summary: "No claims extracted. " + allSources.length + " sources fetched, all empty/failed. " + dupes.length + " URL dupes, " + budgetDropped.length + " budget-dropped.",
+    findings: [], refuted: [], sources: allSources.map(s => ({ url: s.url, quality: s.sourceQuality })),
+    stats: { angles: scopeParsed.angles.length, sources: allSources.length, claims: 0, dupes: dupes.length },
+  }
+}
+
+// ─── Verify: 3-vote adversarial ───
+// Barrier here is intentional — claim pool must be fully assembled before ranking/verification.
+phase("Verify")
+const voted = (await parallel(
+  rankedClaims.map(claim => () =>
+    parallel(
+      Array.from({ length: VOTES_PER_CLAIM }, (_, v) => () =>
+        agent(VERIFY_PROMPT(claim, v), {
+          label: "v" + v + ":" + claim.claim.slice(0, 40),
+          phase: "Verify", model: "lite",
+        }).then(text => safeParse(text, VERDICT_SCHEMA))
+      )
+    ).then(verdicts => {
+      // A vote can be null (user-skip or agent error) — treat as abstain.
+      const valid = verdicts.filter(Boolean)
|
|
336
|
+
const refuted = valid.filter(v => v.refuted).length
|
|
337
|
+
// Survive only if the claim was actually adjudicated: a quorum of
|
|
338
|
+
// valid votes AND fewer than REFUTATIONS_REQUIRED refuting. Too many
|
|
339
|
+
// abstentions = unverified, which must NOT pass into the report
|
|
340
|
+
// (otherwise all-abstain → refuted=0 → false survive).
|
|
341
|
+
const abstained = VOTES_PER_CLAIM - valid.length
|
|
342
|
+
const survives = valid.length >= REFUTATIONS_REQUIRED && refuted < REFUTATIONS_REQUIRED
|
|
343
|
+
log("\"" + claim.claim.slice(0, 50) + "…\": " + (valid.length - refuted) + "-" + refuted + (abstained > 0 ? " (" + abstained + " abstain)" : "") + " " + (survives ? "✓" : "✗"))
|
|
344
|
+
return { ...claim, verdicts: valid, refutedVotes: refuted, survives }
|
|
345
|
+
})
|
|
346
|
+
)
|
|
347
|
+
)).filter(Boolean)
|
|
348
|
+
|
|
349
|
+
const confirmed = voted.filter(c => c.survives)
|
|
350
|
+
const killed = voted.filter(c => !c.survives)
|
|
351
|
+
log("Verify done: " + voted.length + " claims → " + confirmed.length + " confirmed, " + killed.length + " killed")
|
|
352
|
+
|
|
353
|
+
if (confirmed.length === 0) {
|
|
354
|
+
return {
|
|
355
|
+
question: QUESTION,
|
|
356
|
+
summary: "All " + voted.length + " claims refuted by adversarial verification. Research inconclusive — sources may be low-quality or claims overstated.",
|
|
357
|
+
findings: [],
|
|
358
|
+
refuted: killed.map(c => ({ claim: c.claim, vote: (c.verdicts.length - c.refutedVotes) + "-" + c.refutedVotes, source: c.sourceUrl })),
|
|
359
|
+
sources: allSources.map(s => ({ url: s.url, quality: s.sourceQuality, claimCount: s.claims.length })),
|
|
360
|
+
stats: { angles: scopeParsed.angles.length, sources: allSources.length, claims: allClaims.length, verified: voted.length, confirmed: 0, killed: killed.length },
|
|
361
|
+
}
|
|
362
|
+
}
|
|
363
|
+
|
|
364
|
+
// ─── Synthesize ───
|
|
365
|
+
phase("Synthesize")
|
|
366
|
+
const confRank = { high: 0, medium: 1, low: 2 }
|
|
367
|
+
const block = confirmed.map((c, i) => {
|
|
368
|
+
const best = c.verdicts.filter(v => !v.refuted).sort((a, b) => confRank[a.confidence] - confRank[b.confidence])[0]
|
|
369
|
+
return "### [" + i + "] " + c.claim + "\n" +
|
|
370
|
+
"Vote: " + (c.verdicts.length - c.refutedVotes) + "-" + c.refutedVotes + " · Source: " + c.sourceUrl + " (" + c.sourceQuality + ")\n" +
|
|
371
|
+
"Quote: \"" + c.quote + "\"\nVerifier evidence (" + best.confidence + "): " + best.evidence + "\n"
|
|
372
|
+
}).join("\n")
|
|
373
|
+
|
|
374
|
+
const killedBlock = killed.length > 0
|
|
375
|
+
? "\n## Refuted claims (for transparency)\n" +
|
|
376
|
+
killed.map(c => "- \"" + c.claim + "\" (" + c.sourceUrl + ", vote " + (c.verdicts.length - c.refutedVotes) + "-" + c.refutedVotes + ")").join("\n")
|
|
377
|
+
: ""
|
|
378
|
+
|
|
379
|
+
const reportText = await agent(
|
|
380
|
+
"## Synthesis: research report\n\n" +
|
|
381
|
+
"**Question:** " + QUESTION + "\n\n" +
|
|
382
|
+
confirmed.length + " claims survived " + VOTES_PER_CLAIM + "-vote adversarial verification. Merge semantic duplicates and synthesize.\n\n" +
|
|
383
|
+
"## Confirmed claims\n" + block + "\n" + killedBlock + "\n\n" +
|
|
384
|
+
"## Instructions\n" +
|
|
385
|
+
"1. Identify claims that say the same thing — merge them, combine their sources.\n" +
|
|
386
|
+
"2. Group related claims into coherent findings. Each finding should directly address the research question.\n" +
|
|
387
|
+
"3. Assign confidence per finding: high (multiple primary sources, unanimous votes), medium (secondary sources or split votes), low (single source or blog-quality).\n" +
|
|
388
|
+
"4. Write a 3-5 sentence executive summary answering the research question.\n" +
|
|
389
|
+
"5. Note caveats: what's uncertain, what sources were weak, what time-sensitivity applies.\n" +
|
|
390
|
+
"6. List 2-4 open questions that emerged but weren't answered.\n\n" +
|
|
391
|
+
"Return ONLY valid JSON (no markdown, no explanation) matching:\n" +
|
|
392
|
+
'{"summary": "string", "findings": [{"claim": "string", "confidence": "high|medium|low", "sources": ["string"], "evidence": "string", "vote": "string"}], "caveats": "string", "openQuestions": ["string"]}\n' +
|
|
393
|
+
"The JSON object only — nothing before or after.",
|
|
394
|
+
{ label: "synthesize", model: "lite" }
|
|
395
|
+
)
|
|
396
|
+
const report = safeParse(reportText, REPORT_SCHEMA)
|
|
397
|
+
|
|
398
|
+
if (!report) {
|
|
399
|
+
// Synthesis skipped/errored — salvage the verified claims raw rather
|
|
400
|
+
// than throwing on report.findings and discarding the whole run.
|
|
401
|
+
return {
|
|
402
|
+
question: QUESTION,
|
|
403
|
+
summary: "Synthesis step was skipped or failed — returning " + confirmed.length + " verified claims unmerged.",
|
|
404
|
+
findings: [],
|
|
405
|
+
confirmed: confirmed.map(c => ({ claim: c.claim, source: c.sourceUrl, quote: c.quote, vote: (c.verdicts.length - c.refutedVotes) + "-" + c.refutedVotes })),
|
|
406
|
+
refuted: killed.map(c => ({ claim: c.claim, vote: (c.verdicts.length - c.refutedVotes) + "-" + c.refutedVotes, source: c.sourceUrl })),
|
|
407
|
+
sources: allSources.map(s => ({ url: s.url, quality: s.sourceQuality, claimCount: s.claims.length })),
|
|
408
|
+
stats: { angles: scopeParsed.angles.length, sources: allSources.length, claims: allClaims.length, verified: voted.length, confirmed: confirmed.length, killed: killed.length, afterSynthesis: 0 },
|
|
409
|
+
}
|
|
410
|
+
}
|
|
411
|
+
|
|
412
|
+
return {
|
|
413
|
+
question: QUESTION,
|
|
414
|
+
...report,
|
|
415
|
+
refuted: killed.map(c => ({ claim: c.claim, vote: (c.verdicts.length - c.refutedVotes) + "-" + c.refutedVotes, source: c.sourceUrl })),
|
|
416
|
+
sources: allSources.map(s => ({ url: s.url, quality: s.sourceQuality, angle: s.angle, claimCount: s.claims.length })),
|
|
417
|
+
stats: {
|
|
418
|
+
angles: scopeParsed.angles.length,
|
|
419
|
+
sourcesFetched: allSources.length,
|
|
420
|
+
claimsExtracted: allClaims.length,
|
|
421
|
+
claimsVerified: voted.length,
|
|
422
|
+
confirmed: confirmed.length,
|
|
423
|
+
killed: killed.length,
|
|
424
|
+
afterSynthesis: report.findings.length,
|
|
425
|
+
urlDupes: dupes.length,
|
|
426
|
+
budgetDropped: budgetDropped.length,
|
|
427
|
+
agentCalls: 1 + scopeParsed.angles.length + allSources.length + (voted.length * VOTES_PER_CLAIM) + 1,
|
|
428
|
+
},
|
|
429
|
+
}
|