@dogfood-lab/study-swarm 0.6.0 → 1.1.0

package/README.zh.md CHANGED
@@ -13,79 +13,120 @@
13
13
  <img src="https://img.shields.io/badge/cited%20research-verified-1f6feb" alt="Cited research, verified">
14
14
  </p>
15
15
 
16
- **Ground design decisions in cited research, then verify those citations with a *different* model family before anything becomes a standard.**
16
+ First, ground design decisions in cited research; then, before relying on that research, verify the citations with a *different* model family.
17
17
 
18
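- `study-swarm` is a protocol, not a tool. When you use a large language model (LLM) to make a substantial design call (creating a new product layer, choosing an architecture, or deciding "should we trust the model here"), designing from experience alone yields a stale design, and citing papers from memory yields a design that rests on sources that do not exist or do not say what you think they say. `study-swarm` replaces both: it dispatches several research agents in parallel, requires specific cited findings, and verifies every citation through an **external verifier from a different model family** before that citation informs the design.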
18
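+ `study-swarm` is a protocol, not a tool. When you use a large language model (LLM) to make a major design decision (creating a new product layer, choosing an architecture, or deciding "should we trust the model"), improvising from experience produces a design that ends up stale, and citing papers from memory produces a design that depends on sources that do not exist or do not say what you think they say. `study-swarm` replaces both habits: it launches several research agents in parallel, requires specific cited findings, and passes every citation through an **external verifier from a different model family** before it is allowed to inform the design.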
19
19
 
20
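- It applies to itself, too. The protocol prescribes a verifier-guarded mechanism for the systems it helps design, so it applies the same mechanism to itself. **No model grades its own homework, including the model running the protocol.**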
20
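+ It is self-applying. The protocol prescribes a verifier-guarded envelope for the systems it helps design, so it applies that envelope to itself. **No model grades its own homework, including the model running the protocol.**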
21
21
 
22
- ## The protocol has five steps
22
+ ## The five-step process
23
23
 
24
- 1. **Identify** 3-5 load-bearing design questions where empirical evidence, if it exists, could change the answer.
25
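- 2. **Dispatch** a research agent per question, in parallel. Each agent must return paper title + authors + year + URL + a one-sentence finding, favoring specificity over breadth ("6-8 well-sourced findings beat 20 vague gestures").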
26
- 3. **Synthesize the findings** into a "research grounding" section: `N. **<finding>.** <Authors> <year> (<arXiv/DOI>). <design implication>.`
27
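- 4. **Verify externally**: a *different model family*, with reasoning stripped, checks every citation in two stages. A **retrieval oracle** confirms the paper exists (never the model's memory), then a **groundedness** filter confirms the finding matches the source. **Halt** on a fabricated or misattributed citation; **halt and escalate** if the verifier or the retrieval oracle is unavailable (never treat "could not find it" as "the citation is fine").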
28
- 5. **Connect every architectural choice to a finding**, by number. A citation with no design implication is noise.
24
+ 1. **Identify** 3-5 load-bearing structural design questions whose answers empirical evidence could change.
25
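+ 2. **Assign** one researcher to each question and have them work in parallel. Each researcher must return paper title, authors, publication year, URL, and a short finding (specificity over breadth: "6-8 well-grounded findings beat 20 vague gestures").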
26
+ 3. **Synthesize** the findings into a *Research grounding* section: `N. **<finding>.** <Authors> <year> (<arXiv/DOI>). <design implication>.`
27
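+ 4. **Verify externally**: a *different model family*, with reasoning stripped, checks every citation in two stages. First a **retrieval oracle** confirms the paper exists (never rely on the model's memory), then a **groundedness check** confirms the finding agrees with the source. **Halt** immediately on a fabricated or misattributed citation; **halt and escalate** if the verifier or the retrieval oracle is unavailable (never read "could not find it" as "the citation is fine").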
28
+ 5. **Connect** every structural design choice to its finding, by number. A citation with no clear design implication is noise.
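
Step 4 can be read as a two-stage gate with explicit halt rules. The sketch below is one illustrative reading, not the package's implementation: `verifyCitation`, `lookup`, and `judge` are hypothetical names, the oracle and the lens are passed in as plain functions so the control flow runs without any network or model call, and treating an unsupported finding as a halt is an assumption of this sketch.

```javascript
// Hypothetical sketch of the Step 4 gate. `lookup` stands in for a retrieval
// oracle and `judge` for a reasoning-stripped lens from a different model family.
// lookup(citation) -> { available, found, authorsMatch }
// judge(citation)  -> { available, supported }
function verifyCitation(citation, lookup, judge) {
  const existence = lookup(citation);
  // An unavailable oracle is never treated as "the citation is fine".
  if (!existence.available) return { status: 'HALT_AND_ESCALATE', reason: 'retrieval oracle unavailable' };
  if (!existence.found) return { status: 'HALT', reason: 'fabricated: no such paper' };
  if (!existence.authorsMatch) return { status: 'HALT', reason: 'misattributed: wrong authors' };

  const grounded = judge(citation);
  if (!grounded.available) return { status: 'HALT_AND_ESCALATE', reason: 'verifier unavailable' };
  // Assumption of this sketch: an unsupported finding also halts.
  if (!grounded.supported) return { status: 'HALT', reason: 'finding does not match the source' };
  return { status: 'PASS', reason: 'exists and supports the finding' };
}

// Toy stand-ins so the sketch runs offline.
const known = new Map([['arXiv:2310.01798', ['Huang']]]);
const lookup = (c) => ({
  available: true,
  found: known.has(c.id),
  authorsMatch: (known.get(c.id) ?? []).includes(c.firstAuthor),
});
const judge = () => ({ available: true, supported: true });

console.log(verifyCitation({ id: 'arXiv:2310.01798', firstAuthor: 'Huang' }, lookup, judge).status);
console.log(verifyCitation({ id: 'arXiv:0000.00000', firstAuthor: 'Nobody' }, lookup, judge).status);
```

Existence is checked before groundedness so that a lens is never asked to judge a paper that retrieval could not find.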
29
29
 
30
- The full enforceable detail (halt table, sourcing standard, integration rules) lives in **[PROTOCOL.md](PROTOCOL.md)**.
30
+ The full enforceable detail, including the halt table, sourcing standard, and integration rules, can be found in **[PROTOCOL.md]**.
31
31
 
32
- ## Why a *different* model family, with reasoning stripped?
32
+ ## Why a different family? And with reasoning stripped
33
33
 
34
- Because the failure modes are known, not hypothetical:
34
+ Because the failure modes recorded here actually occurred; they are not hypothetical:
35
35
 
36
- - **LLMs cannot reliably verify their own output.** Huang et al., 2023 ([arXiv:2310.01798](https://arxiv.org/abs/2310.01798)); Kambhampati et al., 2024 ([arXiv:2402.01817](https://arxiv.org/abs/2402.01817), LLM-Modulo); Stechly et al., 2024 ([arXiv:2402.08115](https://arxiv.org/abs/2402.08115)): external verifiers deliver the gains; the content of self-critique has no effect.
37
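- - **Same-family judges favor themselves.** Panickssery, Bowman, and Feng, 2024 ([arXiv:2404.13076](https://arxiv.org/abs/2404.13076)): self-recognition correlates *linearly* with self-preference, so partial hiding does not help. Verga et al., 2024 ([arXiv:2404.18796](https://arxiv.org/abs/2404.18796), PoLL): a panel of judges from different families is less biased, at roughly 1/7 the cost.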
38
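- - **LLMs fabricate most readily in citations.** Walters and Wilder, 2023 ([doi:10.1038/s41598-023-41032-5](https://doi.org/10.1038/s41598-023-41032-5)): 55% of GPT-3.5 citations and 18% of GPT-4 citations were fabricated. Onweller et al., 2026 ([arXiv:2605.06635](https://arxiv.org/abs/2605.06635)): links resolve more than 94% of the time, but only 39-77% of cited content actually supports the claim. Existence must therefore be checked by **retrieval**, not **recall**.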
39
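- - **Hide the generator's reasoning.** Khalifa et al., 2026 ([arXiv:2601.14691](https://arxiv.org/abs/2601.14691), "deceiving the evaluator"): manipulating the chain of thought alone can raise a judge's false-positive rate by up to 90%, with the actions held constant. Turpin et al., 2023 ([arXiv:2305.04388](https://arxiv.org/abs/2305.04388)): chain of thought is post-hoc rationalization. The verifier sees the bare cited claim, not "why I included this".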
40
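- - **Diversity beats count.** Rajan, 2025 ([arXiv:2511.16708](https://arxiv.org/abs/2511.16708)): four verifiers with pairwise correlation ρ ∈ [0.05, 0.25] beat any single smart judge through submodular coverage. Kim et al., 2025 ([arXiv:2506.07962](https://arxiv.org/abs/2506.07962)): LLM errors are *correlated*, so the variable that matters is diversity of perspective, not raw count.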
36
+ - **LLMs cannot reliably verify their own output.** Huang et al., 2023 ([arXiv:2310.01798](https://arxiv.org/abs/2310.01798)); Kambhampati et al., 2024 ([arXiv:2402.01817](https://arxiv.org/abs/2402.01817), LLM-Modulo); Stechly et al., 2024 ([arXiv:2402.08115](https://arxiv.org/abs/2402.08115)): the external verifier does most of the improving; the content of the self-critique is inert.
37
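+ - **Same-family judges prefer their own output.** Panickssery, Bowman, and Feng, 2024 ([arXiv:2404.13076](https://arxiv.org/abs/2404.13076)): self-recognition correlates *linearly* with self-preference, so partially hiding information does not help. Verga et al., 2024 ([arXiv:2404.18796](https://arxiv.org/abs/2404.18796), PoLL): a panel made up of judges from different families is less biased and costs roughly 1/7 as much.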
38
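+ - **LLMs fabricate most readily when citing.** Walters and Wilder, 2023 ([doi:10.1038/s41598-023-41032-5](https://doi.org/10.1038/s41598-023-41032-5)): 55% of GPT-3.5 citations and 18% of GPT-4 citations were fabricated. Onweller et al., 2026 ([arXiv:2605.06635](https://arxiv.org/abs/2605.06635)): links resolve in more than 94% of cases, but only 39-77% of cited content actually supports the claim. Existence must therefore be verified by **retrieval, not recall**.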
39
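+ - **Hide the generator's reasoning.** Khalifa et al., 2026 ([arXiv:2601.14691](https://arxiv.org/abs/2601.14691), "deceiving the evaluator"): manipulating the chain of thought alone can raise a judge's false-positive rate by up to 90%, with the operating conditions held constant. Turpin et al., 2023 ([arXiv:2305.04388](https://arxiv.org/abs/2305.04388)): chain of thought is post-hoc rationalization. The verifier sees only the raw cited claim and never learns "why I included this".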
40
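+ - **Diversity beats count.** Rajan, 2025 ([arXiv:2511.16708](https://arxiv.org/abs/2511.16708)): four judges with pairwise correlation ρ ∈ [0.05, 0.25] outperform any single judge through submodular coverage. Kim et al., 2025 ([arXiv:2506.07962](https://arxiv.org/abs/2506.07962)): LLM errors are *correlated*, so the variable that matters is diversity of perspective, not sheer count.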
41
41
 
42
- ## Does it actually work? (proof)
42
+ ## Does it actually work? (show the evidence)
43
43
 
44
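- As a test, the protocol was applied to its own citations. Two unrelated non-Claude families, **Mistral** (`mistral-small:24b`) and **IBM Granite** (`granite4.1:30b`), checked a set of citations with reasoning stripped and with two blind traps planted: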
44
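+ As a test, we applied the protocol to its own citations. We chose two model families unrelated to Claude, **Mistral** (`mistral-small:24b`) and **IBM Granite** (`granite4.1:30b`), and had them check a set of citations with reasoning stripped and with two hidden traps planted among them.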
45
45
 
46
- | Planted trap | Mistral | IBM Granite | Ground truth |
46
+ | Trap planted | Mistral | IBM Granite | Ground truth |
47
47
  |---|---|---|---|
48
- | Chain-of-thought prompting attributed to "Nakamura & Olsen" | missed | **caught** (misattributed → actually Wei et al., 2022) | misattributed |
49
- | A fabricated "98% of errors eliminated, no oracle needed" paper | **caught** (fabricated) | **caught** (fabricated) | fabricated |
48
+ | Chain-of-thought prompting attributed to "Nakamura & Olsen" | missed | **caught** (misattributed → actually Wei et al., 2022, arXiv:2201.11903) | misattributed |
49
+ | A fabricated paper claiming "98% of errors eliminated, no oracle needed" | **caught** (fabricated) | **caught** (fabricated) | fabricated |
50
50
 
51
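- Neither model alone caught both traps, but their **union caught 2/2**. A single judge would have missed the misattribution. The retrieval oracle also found two *real* misattributions in our own design docs (the wrong authors were cited) that no parametric LLM could flag, and it correctly confirmed real 2026 papers that both LLMs had wrongly flagged as fabricated simply because the papers postdate their training. That last point is why the Step 4 existence check **must** be a retrieval oracle, not an LLM.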
51
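+ Neither family alone caught both traps, but together they caught both. A single judge could have produced a miss. The retrieval oracle also found two genuine misattributions in our design docs (citations that credited a paper to the wrong authors) that no parametric LLM could detect, and it correctly confirmed real papers published in 2026 that both LLMs had wrongly flagged as fabricated, simply because the papers were published after the models were trained. That last point is the fundamental reason the Step 4 existence check must use a retrieval oracle rather than an LLM.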
52
52
 
53
- That one test is the thesis in miniature: **unrelated lenses + a retrieval oracle for existence beat any single smart judge.**
53
+ That one run can be read as the thesis in miniature: **unrelated lenses, combined with a retrieval mechanism for verifying existence, outperform any single smart judge.**
54
54
 
55
- ## How it works
55
+ ### ...and again, to design v1.1
56
56
 
57
- You can run the protocol by hand: any different model family, plus resolving arXiv/DOI yourself, satisfies Step 4. Two companion tools reduce it to one command:
57
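+ The v1.1 improvements were made the same way: by running study-swarm on study-swarm. Four questions raised by the first version (how to *automate* the groundedness check, whether to verify groundedness at generation time, how to *combine* different lenses, and whether to abstain on calibrated uncertainty) were assigned to parallel research agents, and all **27 resulting citations** were gated through Step 4 before they informed the design. The retrieval oracle confirmed that **27/27 citations exist**, including six 2025-2026 papers that a parametric model would have wrongly flagged as fabricated, and corrected five attribution errors that a model could not have caught, one of them a genuine authorship error found by the research agent itself. Run with reasoning stripped, the groundedness lenses even reproduced their own documented failure mode: one confidently flagged a real paper as fake, and their *disagreement* triggered escalation, exactly as the cascade intends. The verified dispatch ships as [`examples/study-swarm-v1_1.dispatch.md`](examples/study-swarm-v1_1.dispatch.md); the improvements it settled on (decomposed/ternary groundedness, generation-time groundedness, an oracle-gated cascade, and calibrated abstention) are all in [PROTOCOL.md](PROTOCOL.md).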
58
58
 
59
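- - **[prism-verify](https://github.com/mcp-tool-shop-org/prism-verify)** is the runtime verifier: different-family routing, reasoning stripping, multi-lens arbitration, deterministic retrieval-based existence verification (arXiv → Crossref), and signed receipts.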
60
- - **[role-os](https://github.com/mcp-tool-shop-org/role-os)** provides the `roleos verify-citations <dispatch>` command, which extracts a dispatch's citations and verifies them through prism.
59
+ ## How it works
61
60
 
62
- ## Command-line interface
61
+ You can run the protocol by hand: any different model family, plus resolving arXiv/DOI yourself, satisfies Step 4. Two companion tools make it a single command:
62
+
63
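+ - **[prism-verify](https://github.com/mcp-tool-shop-org/prism-verify)** is the runtime verifier: different-family routing, reasoning stripped, multi-lens arbitration, a deterministic retrieval existence gate (arXiv → Crossref), and signed receipts.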
64
+ - **[role-os](https://github.com/mcp-tool-shop-org/role-os)** provides `roleos verify-citations <dispatch>`, which extracts a dispatch's citations and gates them through prism.
65
+
66
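+ The handoff is the dispatch format itself: write findings as `N. **finding.** Authors year (arXiv|DOI). implication.`, so that **every finding carries a resolvable identifier**, which is exactly what `roleos verify-citations` extracts and gates. A lint-clean dispatch hands off cleanly; a malformed citation is flagged as unresolved by the runner. `study-swarm lint` checks this locally, so Step 3 and Step 4 agree on what counts as a citation.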
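
The handoff depends on each finding carrying an identifier that a tool can pull out mechanically. As a rough sketch (the helper name `extractIdentifier` is made up here; the pattern mirrors the `ID` expression in the package's `lint` source, and how `roleos verify-citations` itself extracts citations is not shown in this diff), the extraction could look like this:

```javascript
// Same shape as the ID pattern in the lint source: arXiv id, DOI, or URL.
const ID = /(arxiv:\s*\d{4}\.\d{4,5}|10\.\d{4,9}\/\S+|https?:\/\/\S+)/i;

// Hypothetical helper: return the first identifier in a finding line, or null.
function extractIdentifier(finding) {
  const m = finding.match(ID);
  if (!m) return null;
  // Drop inner whitespace and trailing punctuation such as ")." after a DOI.
  return m[0].replace(/\s+/g, '').replace(/[).,;]+$/, '');
}

// Illustrative finding lines in the dispatch format.
const sample = [
  '1. **LLMs struggle to self-correct without external feedback.** Huang et al. 2023 (arXiv:2310.01798). Implication: use an external check.',
  '2. **Many generated citations are fabricated.** Walters & Wilder 2023 (doi:10.1038/s41598-023-41032-5). Implication: check existence by retrieval.',
  '3. **Some result.** Someone 2024. Implication: none yet.',
];
for (const line of sample) console.log(extractIdentifier(line));
```

A finding with no identifier yields `null`, which corresponds to the case the paragraph describes as flagged rather than silently passed.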
67
+
68
+ ## Command-line interface (CLI)
63
69
 
64
70
  ```bash
65
71
  npm i -g @dogfood-lab/study-swarm # or run ad-hoc: npx @dogfood-lab/study-swarm <command>
66
72
  ```
67
73
 
68
- | Command | Function |
74
+ | Command | What it does |
69
75
  |---|---|
70
- | `study-swarm protocol` | Prints the full protocol: the five steps, the halt table, the sourcing standard. |
71
- | `study-swarm new <slug>` | Generates a `<slug>.dispatch.md` file containing the five-step skeleton to fill in. |
72
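- | `study-swarm lint <file>` | Checks a dispatch's *Research grounding* against the sourcing standard: every finding needs authors, a year, and a resolvable identifier (arXiv / DOI / URL); a vague "studies show…" gesture is rejected. Exits `1` on violations, so it can gate CI. |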
76
+ | `study-swarm protocol` | Prints the full protocol: the five steps, the halt table, and the sourcing standard. |
77
+ | `study-swarm new <slug>` | Creates a `<slug>.dispatch.md` file containing the five-step skeleton to fill in. |
78
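+ | `study-swarm lint [--json] <path…>` | Checks a dispatch's *Research grounding* against the sourcing standard: every finding needs authors, a year, and a resolvable identifier (arXiv / DOI / URL); a vague "studies show…" gesture is rejected. Exits with code `1` on violations, so it can gate CI. `<path>` may be a file, a directory (all `.dispatch.md` files are linted recursively), or `-` for stdin; `--json` emits a machine-readable report. |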
79
+
80
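+ `lint` is deterministic (it calls no model), so it is safe to use in CI. It enforces the **Step 3 sourcing standard** locally; model-based **Step 4** verification still goes through [`roleos verify-citations`](https://github.com/mcp-tool-shop-org/role-os) → prism.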
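
To make "deterministic" concrete, here is a minimal sketch of three of the form checks, using patterns copied from the `lint` source in this diff. The wrapper `formProblems` is a hypothetical name, and the real command also checks identifiers and placeholders and works on a whole Research grounding section rather than one line.

```javascript
// Patterns copied from the lint source shown in this diff.
const YEAR = /\b(19|20)\d{2}\b/;
const BANNED = /\b(studies show|research suggests|it'?s well[- ]established|well[- ]established that)\b/i;
const AUTHOR = /\p{Lu}[\p{L}.'’-]+(?:\s*,?\s*(?:&|and|et al\.?|\p{Lu}[\p{L}.'’-]+))*\s+\(?(?:19|20)\d{2}/u;

// Hypothetical helper: list the form rules a single finding line breaks.
function formProblems(text) {
  const problems = [];
  // Remove arXiv ids and DOIs first so "2402" in arXiv:2402.01817 is not read as a year.
  const noIds = text.replace(/arxiv:\s*\d{4}\.\d{4,5}/gi, '').replace(/10\.\d{4,9}\/\S+/g, '');
  if (!YEAR.test(noIds)) problems.push('missing-year');
  if (!AUTHOR.test(text)) problems.push('missing-author');
  if (BANNED.test(text)) problems.push('banned-gesture');
  return problems;
}

console.log(formProblems('1. **Self-correction needs external feedback.** Huang et al. 2023 (arXiv:2310.01798). Implication: external verifier.'));
console.log(formProblems('2. **Verifiers help.** (arXiv:2402.01817). Implication: add one.'));
console.log(formProblems('3. Studies show that verifiers help.'));
```

Because every check is a regular expression over the text, the same input always gives the same result, which is what makes the command suitable as a CI gate.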
81
+
82
+ A typical loop:
73
83
 
74
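- The `lint` command is deterministic (it calls no model), so it is safe to use in CI. It enforces the **Step 3 sourcing standard** locally; model-based **Step 4** verification still goes through [`roleos verify-citations`](https://github.com/mcp-tool-shop-org/role-os) → prism.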
84
+ ```bash
85
+ study-swarm new my-decision # creates my-decision.dispatch.md
86
+ # …fill in the questions, run the research dispatch, write the findings…
87
+ study-swarm lint my-decision.dispatch.md # enforce the sourcing standard (Step 3)
88
+ roleos verify-citations my-decision.dispatch.md # model-based Step 4 (different family, via prism)
89
+ ```
90
+
91
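+ Two complete, lint-clean dispatch examples are provided for reference: [`examples/study-swarm-self.dispatch.md`](examples/study-swarm-self.dispatch.md) (the protocol's central decision, concise) and [`examples/study-swarm-v1_1.dispatch.md`](examples/study-swarm-v1_1.dispatch.md) (the full v1.1 design dispatch: 27 citations, each externally verified).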
92
+
93
+ ### Gating in CI
94
+
95
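+ `lint` accepts a file, a directory (all `.dispatch.md` files are linted recursively), or `-` for stdin, and `--json` emits a machine-readable report. Add it to your repository to gate the sourcing of every dispatch in every PR (a copy-paste sample also lives at [`examples/study-swarm-ci.yml`](examples/study-swarm-ci.yml)):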
96
+
97
+ ```yaml
98
+ # .github/workflows/dispatches.yml
99
+ name: study-swarm lint
100
+ on:
101
+   pull_request:
102
+     paths: ['**/*.dispatch.md', '.github/workflows/dispatches.yml']
103
+   workflow_dispatch:
104
+ concurrency:
105
+   group: ${{ github.workflow }}-${{ github.ref }}
106
+   cancel-in-progress: true
107
+ jobs:
108
+   lint:
109
+     runs-on: ubuntu-latest
110
+     steps:
111
+       - uses: actions/checkout@v4
112
+       - uses: actions/setup-node@v4
113
+         with: { node-version: '20' }
114
+       - run: npx @dogfood-lab/study-swarm@latest lint dispatches/
115
+ ```
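
If a CI step needs more than the exit code, the `--json` report can be post-processed. The sketch below assumes the payload shapes visible in the `cmdLint` source in this diff (one result object for a single target, `{ ok, files: [...] }` for several); `flattenReport` is a hypothetical helper and the sample values are illustrative, not captured output.

```javascript
// Hypothetical helper: flatten a `lint --json` report into one list of problems.
// A single target yields one result object; several targets yield { ok, files: [...] }.
function flattenReport(payload) {
  const results = Array.isArray(payload.files) ? payload.files : [payload];
  return results.flatMap((r) =>
    r.problems.map((p) => ({ file: r.file, line: p.line, rule: p.rule, message: p.message })),
  );
}

// Example payload in the multi-file shape (values are illustrative, not real output).
const report = {
  ok: false,
  files: [
    { file: 'dispatches/a.dispatch.md', ok: true, findingCount: 3, problems: [], findings: [] },
    {
      file: 'dispatches/b.dispatch.md',
      ok: false,
      findingCount: 1,
      problems: [{ finding: 1, line: 12, rule: 'missing-year', message: 'finding 1: missing a year.' }],
      findings: [{ finding: 1, year: null, identifier: 'arXiv:2310.01798' }],
    },
  ],
};

for (const p of flattenReport(report)) {
  // Section-level problems carry no line number, hence the fallback.
  console.log(`${p.file}:${p.line ?? '?'} [${p.rule}] ${p.message}`);
}
```

Normalizing both shapes in one place keeps the consuming script the same whether the job lints one file or a directory.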
75
116
 
76
- ## Why it works, in one sentence
117
+ ## How it works, summarized in one sentence
77
118
 
78
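- **Timeliness**: the field moves fast; requiring specific, dated research keeps the design from running 18 months behind. **Practicality**: evidence shows what *fails*, not just what works (explanations can increase over-reliance on *wrong* AI; Bansal et al., 2021). **Safety**: the verifier-guarded envelope is the architecture the evidence supports, and the protocol enforces it on its own output. Sourcing is not academic theater; it is the chain of evidence.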
119
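+ **Timeliness**: the field moves fast; requiring specific, dated research keeps the design from running 18 months behind. **Functionality**: evidence shows which *approaches fail*, not just which work (explanations can increase over-reliance on *wrong* AI; Bansal et al., 2021, [arXiv:2006.14779](https://arxiv.org/abs/2006.14779)). **Safety**: the verifier-guarded envelope is the architecture the evidence supports, and the protocol enforces it on its own output. Sourcing is not academic formalism; it is the chain of evidence.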
79
120
 
80
121
  ## Security
81
122
 
82
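- `study-swarm` is a documentation repository containing Markdown files and a logo. It contains no executable code and installs nothing from this repository. It touches no data, requires no permissions, and collects no telemetry; there are no secrets or credentials in the source. The methodology *describes* a workflow that uses web retrieval and model-based verification, but this repository does not implement or run that workflow. See [SECURITY.md](SECURITY.md).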
123
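+ `study-swarm` ships a **thin, zero-dependency CLI** (`study-swarm`) along with the methodology. It makes **no network or model calls and collects no telemetry**; there are no secrets or credentials in the source. At runtime it only reads the files you pass to `lint` and writes one `<slug>.dispatch.md` file in the current directory (it refuses to overwrite and never writes outside the working directory). The model-based verification the methodology describes (Step 4) is run by companion tools, not by this package. See [SECURITY.md](SECURITY.md).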
83
124
 
84
125
  ## Status
85
126
 
86
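- A workable protocol, externally verified by its own mechanism: different model families checked its citations (see the proof above). This repository is the public reference; [PROTOCOL.md](PROTOCOL.md) is the enforceable form. It is part of the [dogfood-lab](https://github.com/dogfood-lab) family: methods and examples for building in the AI era.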
127
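+ A working protocol, externally verified by its own mechanism: different model families checked its citations (see the proof above). **v1.1** strengthens the verifier and closes gaps left by the first version: decomposed/ternary groundedness, generation-time groundedness, an oracle-gated cascade, and calibrated abstention, each grounded in the verified v1.1 dispatch. This repository is the public reference; [PROTOCOL.md](PROTOCOL.md) is the enforceable form. It is part of the [dogfood-lab](https://github.com/dogfood-lab) family: methods and examples for building in the AI era.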
87
128
 
88
- MIT licensed.
129
+ Licensed under the MIT License.
89
130
 
90
131
  ---
91
132
 
package/SECURITY.md CHANGED
@@ -1,15 +1,15 @@
1
1
  # Security Policy
2
2
 
3
- `study-swarm` is a **documentation repository**: it contains the study-swarm methodology (Markdown) and a logo asset. It ships no executable code, no compiled artifacts, and installs nothing from this repository. (The npm name `@dogfood-lab/study-swarm` is a reserved placeholder; this repo is the methodology source, not the package.)
3
+ `study-swarm` is the study-swarm methodology (Markdown) plus a **thin, zero-dependency command-line tool**, published as the npm package `@dogfood-lab/study-swarm`. The CLI ships in the package (`bin/study-swarm.mjs`), so installing it exposes a `study-swarm` executable. It has **no runtime dependencies** and makes **no network or model calls**; the model-based verification the methodology describes (Step 4) is run by separate tools, not by this package.
4
4
 
5
5
  ## Threat model
6
6
 
7
- - **What it touches:** nothing at runtime. There is no program to run; reading the docs executes no code.
8
- - **What it does NOT touch:** your filesystem, network, credentials, or environment.
9
- - **Telemetry:** none. **Secrets/credentials:** none in source.
10
- - **Permissions required:** none.
7
+ - **What it runs:** a small Node CLI (Node >= 18). `protocol`, `version`, and `help` only print text. `lint <file>` **reads** the file you name. `new <slug>` **writes** exactly one file — `<slug>.dispatch.md` — in the current working directory, and refuses to overwrite an existing file. The slug is sanitized to a single filename (path separators are replaced with `-`, pure-dots slugs rejected), so `new` cannot write outside the current directory.
8
+ - **What it does NOT do:** no network access, no model calls, no telemetry, no filesystem access beyond the two cases above, no use of credentials or environment beyond what Node needs to run.
9
+ - **Secrets/credentials:** none in source or output.
10
+ - **Permissions required:** filesystem read for `lint`; one-file write (in the working directory) for `new`. Nothing else.
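
The slug rule in the threat model can be sketched as a small pure function. This follows the logic of `cmdNew` in the CLI source in this diff, with two labeled differences: the name `sanitizeSlug` is made up for illustration, and it returns `null` where the CLI exits with an error.

```javascript
// Sketch of the slug rule, following the logic in cmdNew.
// Returns the safe filename stem, or null when the slug must be rejected
// (the real CLI calls fail() and exits with code 2 instead).
function sanitizeSlug(slug) {
  // Strip a trailing ".dispatch.md", even if repeated.
  const stem = String(slug).replace(/(\.dispatch\.md)+$/i, '');
  // Anything that is not a word character, dot, or hyphen (including "/" and "\") becomes "-".
  const safe = stem.replace(/[^\w.\-]/g, '-');
  // Empty or pure-dots results ("", ".", "..") are rejected.
  if (!safe || /^\.+$/.test(safe)) return null;
  return safe;
}

console.log(sanitizeSlug('my decision.dispatch.md'));
console.log(sanitizeSlug('../../etc/passwd'));
console.log(sanitizeSlug('..'));
```

Because path separators never survive sanitization, the resulting `<slug>.dispatch.md` is always a single filename in the current directory.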
11
11
 
12
- The methodology *describes* a workflow that uses web retrieval and model-based verification, but this repository does not implement or execute that workflow.
12
+ The methodology *describes* a workflow that uses web retrieval and model-based verification; those are performed by the sibling tools ([prism-verify](https://github.com/mcp-tool-shop-org/prism-verify), [role-os](https://github.com/mcp-tool-shop-org/role-os)), not by this package.
13
13
 
14
14
  ## Supported versions
15
15
 
@@ -1,13 +1,15 @@
1
1
  #!/usr/bin/env node
2
2
  // study-swarm — thin CLI for the research-grounded design protocol.
3
3
  // Zero runtime dependencies. Commands: protocol | new | lint | help | version.
4
- import { readFileSync, writeFileSync, existsSync } from 'node:fs';
4
+ import { readFileSync, writeFileSync, existsSync, statSync, readdirSync } from 'node:fs';
5
5
  import { fileURLToPath } from 'node:url';
6
- import { dirname, resolve } from 'node:path';
6
+ import { dirname, resolve, join } from 'node:path';
7
+ import { createHash } from 'node:crypto';
7
8
 
8
9
  const __dirname = dirname(fileURLToPath(import.meta.url));
9
10
  const PKG = JSON.parse(readFileSync(resolve(__dirname, '../package.json'), 'utf8'));
10
11
  const VERSION = PKG.version;
12
+ const PROTOCOL_PATH = resolve(__dirname, '../PROTOCOL.md');
11
13
 
12
14
  const HELP = `study-swarm v${VERSION} — ground design decisions in cited research, then verify.
13
15
 
@@ -15,17 +17,24 @@ USAGE
15
17
  study-swarm <command> [args]
16
18
 
17
19
  COMMANDS
18
- protocol Print the locked protocol (the five steps + halt rules).
19
- new <slug> Scaffold a dispatch file <slug>.dispatch.md to fill in.
20
- lint <file> Check a dispatch's citations against the sourcing standard.
21
- help Show this help.
22
- version Print the version.
20
+ protocol Print the locked protocol (the five steps + halt rules).
21
+ new <slug> Scaffold a dispatch file <slug>.dispatch.md to fill in.
22
+ lint [--json] <path...> Check dispatches' citations against the sourcing standard.
23
+ A <path> may be a file, a directory (linted recursively for
24
+ *.dispatch.md), or "-" to read one dispatch from stdin.
25
+ help Show this help.
26
+ version Print the version.
23
27
 
24
28
  EXIT CODES
25
29
  0 ok / lint clean
26
30
  1 lint found sourcing violations
27
31
  2 usage or runtime error
28
32
 
33
+ NOTE
34
+ lint checks citation FORM (Step 3: author + year + a resolvable arXiv/DOI/URL,
35
+ no "studies show…" gestures) — it does not judge whether a source is legitimate
36
+ or actually supports the claim. That is Step 4, below.
37
+
29
38
  Run a dispatch's model-based verification with: roleos verify-citations <file>
30
39
  Docs: https://dogfood-lab.github.io/study-swarm/
31
40
  `;
@@ -35,19 +44,29 @@ function fail(code, msg) {
35
44
  process.exit(code);
36
45
  }
37
46
 
47
+ // Short hash of the vendored PROTOCOL.md, so a scaffolded dispatch records the exact
48
+ // methodology version it was authored against (the package vendors PROTOCOL.md for this).
49
+ function protocolHash() {
50
+ try { return createHash('sha256').update(readFileSync(PROTOCOL_PATH)).digest('hex').slice(0, 16); }
51
+ catch { return 'unknown'; }
52
+ }
53
+
38
54
  function cmdProtocol() {
39
- const p = resolve(__dirname, '../PROTOCOL.md');
40
- if (!existsSync(p)) fail(2, 'PROTOCOL.md not found in package');
41
- process.stdout.write(readFileSync(p, 'utf8'));
55
+ if (!existsSync(PROTOCOL_PATH)) fail(2, 'PROTOCOL.md not found in package');
56
+ try { process.stdout.write(readFileSync(PROTOCOL_PATH, 'utf8')); }
57
+ catch (err) { fail(2, `cannot read PROTOCOL.md in package: ${err && err.code ? err.code : err.message}`); }
42
58
  }
43
59
 
44
- const template = (slug) => `# Study-swarm dispatch: ${slug}
60
+ const template = (slug, stamp) => `<!-- ${stamp} -->
61
+ # Study-swarm dispatch: ${slug}
45
62
 
46
63
  > Fill in each section. Verify citations (Step 4) BEFORE connecting findings to the design (Step 5).
47
64
  > Lint the sourcing with: study-swarm lint ${slug}.dispatch.md
48
65
 
49
66
  ## Step 1 — Load-bearing questions
50
- <!-- 3-5 questions where empirical evidence would change the answer. Fewer is fine if the decision is substantial. -->
67
+ <!-- 3-5 questions where empirical evidence would change the answer. Fewer is fine if the decision is substantial.
68
+ A question is load-bearing if you can picture two designs hinging on the answer and the honest current
69
+ answer is "I think…", not "evidence says…". Don't manufacture questions to hit a count. -->
51
70
  1.
52
71
  2.
53
72
  3.
@@ -57,7 +76,8 @@ const template = (slug) => `# Study-swarm dispatch: ${slug}
57
76
 
58
77
  ## Step 3 — Research grounding
59
78
  <!-- One entry per finding (this is what 'lint' checks):
60
- N. **<finding>.** <Authors> <year> (<arXiv:NNNN.NNNNN | DOI>). <design implication>. -->
79
+ N. **<finding>.** <Authors> <year> (<arXiv:NNNN.NNNNN | DOI>). <design implication>.
80
+ e.g.: 1. **Contrastive explanations with a predicted human foil improve independent decisions.** Buçinca et al. 2024 (arXiv:2410.04253). Implication: every recommendation carries a "you might think X; I chose Y because…" frame. -->
61
81
  1. **<finding>.** <Authors> <year> (arXiv:____.____). <implication>.
62
82
 
63
83
  ## Step 4 — External verification
@@ -73,60 +93,168 @@ const template = (slug) => `# Study-swarm dispatch: ${slug}
73
93
 
74
94
  function cmdNew(slug) {
75
95
  if (!slug) fail(2, 'usage: study-swarm new <slug>');
76
- const safe = String(slug).replace(/\.dispatch\.md$/i, '').replace(/[^\w.\-/]/g, '-');
96
+ // Reduce the slug to a single safe filename: strip any trailing .dispatch.md (even if
97
+ // repeated), then collapse anything that isn't a word char, dot, or hyphen to '-'. Path
98
+ // separators ('/' and '\') are NOT permitted — `new` writes ONE file in the current
99
+ // directory and must never traverse out of it. A pure-dots slug ('.', '..') is rejected.
100
+ const stem = String(slug).replace(/(\.dispatch\.md)+$/i, '');
101
+ const safe = stem.replace(/[^\w.\-]/g, '-');
102
+ if (!safe || /^\.+$/.test(safe)) {
103
+ fail(2, `invalid slug "${slug}" — use letters, digits, '.', or '-' (the file stays in the current directory)`);
104
+ }
77
105
  const out = `${safe}.dispatch.md`;
78
106
  if (existsSync(out)) fail(2, `refusing to overwrite existing ${out}`);
79
- writeFileSync(out, template(safe), 'utf8');
80
- process.stdout.write(`Created ${out}\nFill it in, then: study-swarm lint ${out}\n`);
107
+ // Provenance stamp: pins the methodology version a dispatch was authored against.
108
+ const stamp = `study-swarm v${VERSION} · protocol-sha256:${protocolHash()} · created:${new Date().toISOString().slice(0, 10)}`;
109
+ writeFileSync(out, template(safe, stamp), 'utf8');
110
+ const note = safe === stem ? '' : ` (slug sanitized to "${safe}")`;
111
+ process.stdout.write(`Created ${out}${note}\nFill it in, then: study-swarm lint ${out}\n`);
81
112
  }
82
113
 
83
- function cmdLint(file) {
84
- if (!file) fail(2, 'usage: study-swarm lint <file>');
85
- if (!existsSync(file)) fail(2, `file not found: ${file}`);
86
- const lines = readFileSync(file, 'utf8').split(/\r?\n/);
114
+ // --- lint core ------------------------------------------------------------
87
115
 
88
- const start = lines.findIndex((l) => /^#{1,6}\s.*research grounding/i.test(l));
89
- if (start === -1) fail(1, 'no "Research grounding" section found — every dispatch needs one (Step 3).');
116
+ const YEAR = /\b(19|20)\d{2}\b/;
117
+ const ID = /(arxiv:\s*\d{4}\.\d{4,5}|10\.\d{4,9}\/\S+|https?:\/\/\S+)/i;
118
+ const PLACEHOLDER = /arXiv:_{2,}|<finding>|<authors>|<year>|<implication>/i;
119
+ const BANNED = /\b(studies show|research suggests|it'?s well[- ]established|well[- ]established that)\b/i;
120
+ // An author cite: a capitalized name (Unicode-aware, so "Buçinca" counts), optionally
121
+ // followed by "et al.", "&", "and", or further surnames, immediately before the year.
122
+ // Accepts "Huang et al. 2023", "Walters & Wilder 2023", "Panickssery, Bowman & Feng 2024";
123
+ // flags an author-less finding like "**Foo.** 2024 (arXiv:…)".
124
+ const AUTHOR = /\p{Lu}[\p{L}.'’-]+(?:\s*,?\s*(?:&|and|et al\.?|\p{Lu}[\p{L}.'’-]+))*\s+\(?(?:19|20)\d{2}/u;
125
+
126
+ // Check one dispatch's text. Returns a structured result; never exits.
127
+ function lintText(label, raw) {
128
+ const lines = raw.split(/\r?\n/);
129
+ const problems = []; // { finding, line, rule, message }
130
+ const add = (rule, message, line = null, finding = null) => problems.push({ finding, line, rule, message });
131
+
132
+ // Find the "Research grounding" heading whose TEXT ends with that phrase (last wins), so a
133
+ // title that merely mentions "research grounding" above the real section can't shadow it.
134
+ let start = -1;
135
+ for (let i = 0; i < lines.length; i++) {
136
+ const h = lines[i].match(/^#{1,6}\s+(.*?)\s*$/);
137
+ if (h && /research grounding$/i.test(h[1])) start = i;
138
+ }
139
+ if (start === -1) {
140
+ add('no-section', 'no "Research grounding" section found — every dispatch needs one (Step 3).');
141
+ return { file: label, ok: false, findingCount: 0, problems, findings: [] };
142
+ }
90
143
  let end = lines.length;
91
144
  for (let i = start + 1; i < lines.length; i++) {
92
145
  if (/^#{1,6}\s/.test(lines[i])) { end = i; break; }
93
146
  }
94
147
  const section = lines.slice(start + 1, end);
95
148
 
96
- const YEAR = /\b(19|20)\d{2}\b/;
97
- const ID = /(arxiv:\s*\d{4}\.\d{4,5}|10\.\d{4,9}\/\S+|https?:\/\/\S+)/i;
98
- const PLACEHOLDER = /arXiv:_{2,}|<finding>|<authors>|<year>|<implication>/i;
99
- const BANNED = /\b(studies show|research suggests|it'?s well[- ]established|well[- ]established that)\b/i;
100
-
101
- // Each numbered list item (with its continuation lines) is one finding.
102
- const findings = [];
149
+ // Split into findings (numbered items + continuation lines), ignoring fenced code blocks
150
+ // so a "1." inside a ``` example isn't mistaken for a finding. Track each finding's line.
151
+ const findings = []; // { text, line }
103
152
  let cur = null;
104
- for (const l of section) {
105
- if (/^\s*\d+\.\s/.test(l)) { if (cur !== null) findings.push(cur); cur = l; }
106
- else if (cur !== null && l.trim()) cur += ' ' + l.trim();
107
- }
108
- if (cur !== null) findings.push(cur);
153
+ let inFence = false;
154
+ section.forEach((l, idx) => {
155
+ if (/^\s*(```|~~~)/.test(l)) { inFence = !inFence; return; }
156
+ if (inFence) return;
157
+ if (/^\s*\d+\.\s/.test(l)) { if (cur) findings.push(cur); cur = { text: l, line: start + 2 + idx }; }
158
+ else if (cur && l.trim()) cur.text += ' ' + l.trim();
159
+ });
160
+ if (cur) findings.push(cur);
161
+
162
+ if (findings.length === 0) add('no-findings', 'Research grounding has no numbered findings.');
109
163
 
110
- const problems = [];
111
- if (findings.length === 0) problems.push('Research grounding has no numbered findings.');
164
+ const parsed = [];
112
165
  findings.forEach((f, i) => {
113
166
  const n = i + 1;
114
- if (PLACEHOLDER.test(f)) problems.push(`finding ${n}: still has template placeholders — fill it in.`);
115
- if (!YEAR.test(f)) problems.push(`finding ${n}: missing a year.`);
116
- if (!ID.test(f)) problems.push(`finding ${n}: missing an identifier (arXiv:NNNN.NNNNN, DOI, or URL).`);
167
+ if (PLACEHOLDER.test(f.text)) add('placeholder', `finding ${n}: still has template placeholders — fill it in.`, f.line, n);
168
+ // Strip identifiers before the year check so an arXiv id's YYMM prefix
169
+ // (e.g. 2402 in arXiv:2402.01817) can't masquerade as a publication year.
170
+ const fNoIds = f.text.replace(/arxiv:\s*\d{4}\.\d{4,5}/gi, '').replace(/10\.\d{4,9}\/\S+/g, '');
171
+ if (!YEAR.test(fNoIds)) add('missing-year', `finding ${n}: missing a year (spell it out, e.g. "2024" — an arXiv id alone is not a year).`, f.line, n);
172
+ if (!AUTHOR.test(f.text)) add('missing-author', `finding ${n}: missing an author before the year (e.g. "Huang et al. 2023").`, f.line, n);
173
+ const idm = f.text.match(ID);
174
+ if (!idm) add('missing-id', `finding ${n}: missing an identifier (arXiv:NNNN.NNNNN, DOI, or URL).`, f.line, n);
175
+ const ym = fNoIds.match(YEAR);
176
+ const ident = idm ? idm[0].replace(/\s+/g, '').replace(/[).,;]+$/, '') : null;
177
+ parsed.push({ finding: n, year: ym ? ym[0] : null, identifier: ident });
117
178
  });
118
- lines.forEach((l, i) => {
119
- if (BANNED.test(l) && !ID.test(l)) {
120
- problems.push(`line ${i + 1}: name the study (author + year + identifier), don't gesture: "${l.trim().slice(0, 56)}"`);
179
+
180
+ // Banned gesture anywhere in the section (outside fences): a finding STATES its result,
181
+ // it never "studies show…" a co-located citation doesn't redeem it.
182
+ let fence = false;
183
+ section.forEach((l, idx) => {
184
+ if (/^\s*(```|~~~)/.test(l)) { fence = !fence; return; }
185
+ if (!fence && BANNED.test(l)) {
186
+ add('banned-gesture', `line ${start + 2 + idx}: name the study (author + year + identifier), don't gesture: "${l.trim().slice(0, 56)}"`, start + 2 + idx);
121
187
  }
122
188
  });
123
189
 
124
- if (problems.length) {
125
- process.stderr.write(`x ${file}: ${problems.length} sourcing issue(s)\n`);
126
- for (const p of problems) process.stderr.write(` - ${p}\n`);
127
- process.exit(1);
190
+ return { file: label, ok: problems.length === 0, findingCount: findings.length, problems, findings: parsed };
191
+ }
192
+
193
+ // Recursively collect *.dispatch.md files under a directory (skips node_modules/.git).
194
+ function walkDispatches(dir) {
195
+ const out = [];
196
+ for (const entry of readdirSync(dir, { withFileTypes: true })) {
197
+ if (entry.name === 'node_modules' || entry.name === '.git') continue;
198
+ const full = join(dir, entry.name);
199
+ if (entry.isDirectory()) out.push(...walkDispatches(full));
200
+ else if (/\.dispatch\.md$/i.test(entry.name)) out.push(full);
201
+ }
202
+ return out.sort();
203
+ }
204
+
205
+ function readTarget(p) {
206
+ try { return { label: p, raw: readFileSync(p, 'utf8') }; }
207
+ catch (err) { fail(2, `cannot read ${p}: ${err && err.code ? err.code : err.message}`); }
208
+ }
209
+
210
+ function cmdLint(args) {
211
+ const json = args.includes('--json');
212
+ const paths = args.filter((a) => a !== '--json');
213
+ if (paths.length === 0) fail(2, 'usage: study-swarm lint [--json] <file|dir|-> [more...]');
214
+
215
+ const targets = [];
216
+ for (const p of paths) {
217
+ if (p === '-') {
218
+ let raw;
219
+ try { raw = readFileSync(0, 'utf8'); }
220
+ catch (err) { fail(2, `cannot read stdin: ${err && err.code ? err.code : err.message}`); }
221
+ targets.push({ label: '<stdin>', raw });
222
+ continue;
223
+ }
224
+ if (!existsSync(p)) fail(2, `path not found: ${p}`);
225
+ if (statSync(p).isDirectory()) {
226
+ const files = walkDispatches(p);
227
+ if (files.length === 0) fail(2, `no .dispatch.md files found under ${p}`);
228
+ for (const f of files) targets.push(readTarget(f));
229
+ } else {
230
+ targets.push(readTarget(p));
231
+ }
232
+ }
233
+
234
+ const results = targets.map((t) => lintText(t.label, t.raw));
235
+ const anyFail = results.some((r) => !r.ok);
236
+
237
+ if (json) {
238
+ const payload = results.length === 1 ? results[0] : { ok: !anyFail, files: results };
239
+ process.stdout.write(JSON.stringify(payload) + '\n');
240
+ process.exit(anyFail ? 1 : 0);
241
+ }
242
+
243
+ for (const r of results) {
244
+ if (r.ok) {
245
+ process.stdout.write(`ok ${r.file}: ${r.findingCount} finding(s), all sourced.\n`);
246
+ } else {
247
+ process.stderr.write(`x ${r.file}: ${r.problems.length} sourcing issue(s)\n`);
248
+ for (const pr of r.problems) process.stderr.write(` - ${pr.message}\n`);
249
+ }
250
+ }
251
+ if (!anyFail) {
252
+ process.stdout.write(
253
+ `\nStep 3 (sourcing FORM) is satisfied — this does NOT confirm the citations exist or support the claim.\n` +
254
+ `Run Step 4 (existence + groundedness, a different model family): roleos verify-citations <file>\n`,
255
+ );
128
256
  }
129
- process.stdout.write(`ok ${file}: ${findings.length} finding(s), all sourced.\n`);
257
+ process.exit(anyFail ? 1 : 0);
130
258
  }
131
259
 
132
260
  function main(argv) {
@@ -134,7 +262,7 @@ function main(argv) {
134
262
  switch (cmd) {
135
263
  case 'protocol': return cmdProtocol();
136
264
  case 'new': return cmdNew(rest[0]);
137
- case 'lint': return cmdLint(rest[0]);
265
+ case 'lint': return cmdLint(rest);
138
266
  case 'version': case '--version': case '-v':
139
267
  return void process.stdout.write(VERSION + '\n');
140
268
  case 'help': case '--help': case '-h': case undefined:
@@ -0,0 +1,28 @@
+ # Copy this into YOUR repo at .github/workflows/dispatches.yml to gate the sourcing
+ # of every study-swarm dispatch on each pull request. It is a SAMPLE — it is not an
+ # active workflow in the study-swarm repo itself.
+ name: study-swarm lint
+
+ on:
+   pull_request:
+     paths:
+       - '**/*.dispatch.md'
+       - '.github/workflows/dispatches.yml'
+   workflow_dispatch:
+
+ concurrency:
+   group: ${{ github.workflow }}-${{ github.ref }}
+   cancel-in-progress: true
+
+ jobs:
+   lint:
+     runs-on: ubuntu-latest
+     timeout-minutes: 5
+     steps:
+       - uses: actions/checkout@v4
+       - uses: actions/setup-node@v4
+         with:
+           node-version: '20'
+       # Lint every dispatch under dispatches/ (a file, a dir, or '-' for stdin all work).
+       # Exit 1 on any sourcing violation fails the check. Add --json for machine-readable output.
+       - run: npx @dogfood-lab/study-swarm@latest lint dispatches/
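The workflow comment mentions `--json` for machine-readable output. According to the lint hunk in this diff, the JSON payload is the single result object when one file was linted and `{ ok, files }` otherwise. A downstream script could normalize those two shapes roughly as sketched below. This is an illustration, not part of the package: `summarizeLintPayload` and `parseLintOutput` are hypothetical names, and the sketch assumes that each JSON result carries the same `ok`, `file`, `findingCount`, and `problems[].message` fields used by the text output.

```javascript
// Hypothetical consumer of the lint command's --json output.
// Assumed shapes (inferred from the diff, not from package documentation):
//   one file:   { ok, file, findingCount, problems: [{ message }] }
//   many files: { ok, files: [ <result>, ... ] }
function summarizeLintPayload(payload) {
  const results = Array.isArray(payload.files) ? payload.files : [payload];
  const failing = results.filter((r) => !r.ok);

  return {
    ok: failing.length === 0,
    fileCount: results.length,
    findingCount: results.reduce((sum, r) => sum + (r.findingCount || 0), 0),
    // One flat, human-readable line per sourcing problem.
    messages: failing.flatMap((r) =>
      (r.problems || []).map((p) => `${r.file}: ${p.message}`)
    ),
  };
}

// The JSON mode writes one JSON document followed by a newline.
function parseLintOutput(stdoutText) {
  return summarizeLintPayload(JSON.parse(stdoutText));
}
```

Normalizing to a list first keeps the rest of the script indifferent to how many files were linted, which matters because the payload shape changes between one file and several.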
@@ -0,0 +1,46 @@
+ <!-- study-swarm vX.Y.Z · protocol-sha256:<vendored> · a worked, lint-clean reference dispatch -->
+ # Study-swarm dispatch: study-swarm-self
+
+ > A complete, **lint-clean** example dispatch — study-swarm applied to its own
+ > central design decision. Run `study-swarm lint examples/study-swarm-self.dispatch.md`
+ > (it passes), then read it as a model for what a filled-in dispatch looks like end to end.
+
+ ## Step 1 — Load-bearing questions
+
+ <!-- Each is load-bearing: two real designs hinge on the answer, and the honest prior is "I think", not "evidence says". -->
+
+ 1. When an LLM makes a substantial design call, can the *same* model reliably verify its own citations, or does the verifier have to be a separate model?
+ 2. Is confirming a cited paper *exists* enough, or must "the source supports this claim" be checked as a separate axis?
+ 3. Does adding *more* verifiers improve coverage, or does the diversity of the verifiers matter more than their count?
+
+ ## Step 2 — Research dispatch
+
+ <!-- One research agent per question, in parallel; each returned paper titles + authors + years + URLs + a one-sentence finding, web-retrieval required (no recall-only citations). -->
+
+ Three parallel agents, scoped to empirical evidence (not opinion), word-capped, "specificity over breadth — 6–8 well-sourced findings beat 20 vague gestures." Their citations (below) were then resolved against arXiv/Crossref before any informed the design.
+
+ ## Step 3 — Research grounding
+
+ 1. **LLMs struggle to self-correct without external feedback, and can degrade after self-correction.** Huang et al. 2023 (arXiv:2310.01798). Implication: the verifier cannot be the generator itself — an external check is required (answers Q1).
+ 2. **Autoregressive LLMs cannot self-verify; pair the generator with an external model-based verifier.** Kambhampati et al. 2024 (arXiv:2402.01817). Implication: the architecture is generator + separate verifier, not self-critique (answers Q1).
+ 3. **An LLM judge's self-recognition correlates *linearly* with its self-preference bias.** Panickssery, Bowman & Feng 2024 (arXiv:2404.13076). Implication: the verifier must be a *different model family*, since partial blinding of a same-family judge does not remove the bias (answers Q1).
+ 4. **18–55% of LLM-generated citations are fabricated, and many real ones carry bibliographic errors.** Walters & Wilder 2023 (doi:10.1038/s41598-023-41032-5). Implication: existence must be established by *retrieval* (resolve the arXiv/DOI), never by the model's recall (answers Q2).
+ 5. **Cited links resolve >94% of the time, yet only 39–77% of the content actually supports the claim.** Onweller et al. 2026 (arXiv:2605.06635). Implication: groundedness is a distinct axis from existence — "the link resolves" is not "the paper says this" (answers Q2).
+ 6. **Decorrelated verifiers (pairwise ρ ∈ [0.05, 0.25]) beat any single one via submodular coverage.** Rajan 2025 (arXiv:2511.16708). Implication: spend the budget on *lens diversity* (a retrieval oracle + ≥2 different families), not on more copies of one judge (answers Q3).
+
+ ## Step 4 — External verification
+
+ <!-- This dispatch's own citations were gated this way before Step 5 was written. -->
+
+ - [x] every citation resolved by retrieval (arXiv/DOI), not model memory — arXiv API + OpenAlex + Crossref
+ - [x] every finding matches what its source actually claims (groundedness) — checked against each abstract
+ - [x] >= 3 decorrelated lenses (retrieval oracle + >= 2 different model families) — oracle + Mistral + IBM Granite, reasoning-stripped
+
+ Result: all six citations VERIFIED (existence + attribution + groundedness). Two blind traps seeded into a sibling set — a misattribution and a fabricated paper — were caught by the *union* of the two families, not either alone.
+
+ ## Step 5 — Architecture
+
+ - The verifier is a **different model family** from the synthesizer, run reasoning-stripped. (findings 1, 2, 3)
+ - Verification is **two-stage per citation**: a retrieval oracle confirms existence, then a groundedness lens confirms the source supports the claim. (findings 4, 5)
+ - The verifier is an **ensemble of decorrelated lenses** (retrieval oracle + ≥2 different families), because diversity — not count — drives coverage. (finding 6)
+ - On a non-clean verdict the finding **halts** (fabricated → dropped; misattributed → corrected once; unavailable → escalate), never silently proceeds. (findings 1, 4)
+ - On a non-clean verdict the finding **halts** (fabricated → dropped; misattributed → corrected once; unavailable → escalate), never silently proceeds. (findings 1, 4)