@dogfood-lab/study-swarm 1.0.0 → 1.2.0

package/README.zh.md CHANGED
@@ -13,69 +13,75 @@
13
13
  <img src="https://img.shields.io/badge/cited%20research-verified-1f6feb" alt="Cited research, verified">
14
14
  </p>
15
15
 
16
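- **Ground design decisions in cited research, then have a *different* model family verify the citations before they become canon.**
16
+ First ground design decisions in cited research; then, before relying on that research, use a *different* model family to verify that the citations are accurate.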
17
17
 
18
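- `study-swarm` is a protocol, not a tool. When you use an LLM to make a load-bearing design decision (for example, a new product layer, an architecture choice, or "should we trust this model"), improvising from first principles produces stale designs, and citing papers from memory produces designs that rest on sources that do not exist or do not say what you think they say. `study-swarm` replaces both approaches: dispatch parallel research agents, require concrete cited findings, and let each citation inform the design only after it passes an **external verifier from a different model family**.
18
+ `study-swarm` is a protocol, not a tool. When you use a large language model (LLM) to make a major design decision (for example, creating a new product layer, choosing an architecture, or deciding "should we trust this model"), improvising from experience alone yields a stale design, and citing papers from memory alone yields a design that depends on sources that do not exist or do not say what you think they say. `study-swarm` replaces both habits: it launches several research agents in parallel, requires concrete cited findings, and verifies every citation with an **external verifier from a different model family** before that citation informs the design.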
19
19
 
20
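- It also applies to itself. The protocol prescribes a verifier-guarded mechanism for the systems it helps design, so it runs that mechanism on itself too. **No model grades its own homework, including the model running the protocol.**
20
+ It is self-applying. The protocol prescribes a verifier-guarded envelope for the systems it helps design, so it applies that envelope to itself as well. **No model grades its own homework, including the model running the protocol.**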
21
21
 
22
- ## The protocol has five steps:
22
+ ## The five steps
23
23
 
24
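- 1. **Identify** 3-5 load-bearing design questions whose answers could change if empirical evidence exists.
25
- 2. **Dispatch** a research agent per question, in parallel. Each agent must return paper title + authors + year + URL + a one-sentence finding, favoring specificity over breadth ("6-8 well-grounded findings beat 20 vague bullets").
26
- 3. **Synthesize** the findings into a *Research grounding* section: `N. **<finding>.** <authors> <year> (<arXiv/DOI>). <design implication>.`
27
- 4. **Verify externally**: a *different model family*, reasoning-stripped, checks every citation in two stages. A **retrieval oracle** confirms the paper exists (never the model's memory), then a **groundedness** filter confirms the finding matches the source. On fabrication or misattribution, **halt**; if the verifier or the retrieval oracle is unavailable, **halt and escalate** (never treat unreachable as "the citation is fine").
28
- 5. **Tie** every architectural choice to a numbered finding. A citation with no design implication is noise.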
24
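+ 1. **Identify** 3-5 load-bearing design questions whose answers empirical evidence could change.
25
+ 2. **Assign** one research agent per question and have them work in parallel. Each agent must supply the paper title, authors, publication year, URL, and a one-sentence finding (specificity over breadth: "6-8 well-grounded findings beat 20 vague bullets").
26
+ 3. **Synthesize** the findings into a *Research grounding* section: `N. **<finding>.** <authors> <year> (<arXiv/DOI>). <design implication>.`
27
+ 4. **Verify externally**: use a *different model family*, with reasoning stripped, to check every citation in two stages. First a **retrieval oracle** confirms the paper exists (never rely on the model's memory); then a **groundedness lens** confirms the finding agrees with the source. If a fabricated or misattributed citation is found, **halt** immediately; if the verifier or the retrieval oracle is unavailable, **halt and escalate** (never read "could not be reached" as "the citation is fine").
28
+ 5. **Tie** every architectural choice to its finding by number. A citation with no clear design implication is noise.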
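The Step 4 halt rules can be sketched as a small decision function. This is a minimal illustration, not the implementation used by this package or by prism: the verdict fields (`oracle`, `groundedness`), their string values, and the choice to let escalation take priority over a plain halt are all assumptions made for the example.

```javascript
// Hypothetical verdict shape (an assumption for illustration):
//   oracle:       'exists' | 'not_found' | 'unavailable'
//   groundedness: 'supported' | 'unsupported' | 'misattributed' | 'unavailable'
function gateCitation(verdict) {
  // An unreachable verifier or oracle is never read as "the citation is fine".
  if (verdict.oracle === 'unavailable' || verdict.groundedness === 'unavailable') {
    return 'halt_and_escalate';
  }
  // Existence is decided by retrieval first; a paper that cannot be found halts the run.
  if (verdict.oracle === 'not_found') return 'halt';
  // Only then does the groundedness lens compare the finding with the source.
  if (verdict.groundedness !== 'supported') return 'halt';
  return 'pass';
}

// A dispatch may inform the design only if every citation passes.
// Assumption: escalation outranks a plain halt when both occur.
function gateDispatch(verdicts) {
  const outcomes = verdicts.map(gateCitation);
  if (outcomes.includes('halt_and_escalate')) return 'halt_and_escalate';
  if (outcomes.includes('halt')) return 'halt';
  return 'pass';
}

console.log(gateDispatch([
  { oracle: 'exists', groundedness: 'supported' },
  { oracle: 'unavailable', groundedness: 'supported' },
])); // halt_and_escalate
```

The point of the ordering is that "unavailable" is checked before anything else, so an outage can never fall through to `pass`.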
29
29
 
30
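- The full executable detail (the halt table, the sourcing standard, the integration rules) lives in **[PROTOCOL.md](PROTOCOL.md)**.
30
+ The full executable detail, including the halt table, the sourcing standard, and the integration rules, is in **[PROTOCOL.md](PROTOCOL.md)**.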
31
31
 
32
- ## Why a *different* model family, and reasoning-stripped?
32
+ ## Why a different family, and why strip the reasoning?
33
33
 
34
- Because the failure modes are known, not hypothetical:
34
+ Because the failure modes are documented, not hypothetical:
35
35
 
36
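- - **LLMs cannot reliably verify their own output.** Huang et al. 2023 ([arXiv:2310.01798](https://arxiv.org/abs/2310.01798)); Kambhampati et al. 2024 ([arXiv:2402.01817](https://arxiv.org/abs/2402.01817), LLM-Modulo); Stechly et al. 2024 ([arXiv:2402.08115](https://arxiv.org/abs/2402.08115)): external verifiers have the edge; the content of self-critique is ineffective.
37
- - **Same-family judges favor their own.** Panickssery, Bowman, and Feng 2024 ([arXiv:2404.13076](https://arxiv.org/abs/2404.13076)): self-recognition correlates linearly with self-preference, so partial masking does not help. Verga et al. 2024 ([arXiv:2404.18796](https://arxiv.org/abs/2404.18796), PoLL): a panel drawn from different families is less biased, at roughly 1/7 of the cost.
38
- - **Citations are where LLMs lie.** Walters and Wilder 2023 ([doi:10.1038/s41598-023-41032-5](https://doi.org/10.1038/s41598-023-41032-5)): 55% of GPT-3.5 citations and 18% of GPT-4 citations were fabricated. Onweller et al. 2026 ([arXiv:2605.06635](https://arxiv.org/abs/2605.06635)): links resolve more than 94% of the time, but only 39-77% of the cited content actually supports the claim. Existence must therefore be checked by **retrieval, not recall**.
39
- - **Hide the generator's reasoning.** Khalifa et al. 2026 ([arXiv:2601.14691](https://arxiv.org/abs/2601.14691), "deceiving the judge"): manipulating only the chain of thought can raise a judge's false-positive rate by up to 90% while the actions stay unchanged. Turpin et al. 2023 ([arXiv:2305.04388](https://arxiv.org/abs/2305.04388)): chain of thought is post-hoc rationalization. The verifier sees only the bare citation claim, never "why I included this".
40
- - **Diversity beats quantity.** Rajan 2025 ([arXiv:2511.16708](https://arxiv.org/abs/2511.16708)): four verifiers with pairwise correlation ρ ∈ [0.05, 0.25] beat any single verifier through submodular coverage. Kim et al. 2025 ([arXiv:2506.07962](https://arxiv.org/abs/2506.07962)): LLM errors are *correlated*, so the key variable is diversity of perspective, not raw count.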
36
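+ - **Large language models cannot reliably verify their own output.** Huang et al. 2023 ([arXiv:2310.01798](https://arxiv.org/abs/2310.01798)); Kambhampati et al. 2024 ([arXiv:2402.01817](https://arxiv.org/abs/2402.01817), LLM-Modulo); Stechly et al. 2024 ([arXiv:2402.08115](https://arxiv.org/abs/2402.08115)): the external verifier carries most of the improvement; the content of the self-critique is inert.
37
+ - **Same-family judges prefer their own results.** Panickssery, Bowman, and Feng 2024 ([arXiv:2404.13076](https://arxiv.org/abs/2404.13076)): self-recognition correlates *linearly* with self-preference, so partial masking does not help. Verga et al. 2024 ([arXiv:2404.18796](https://arxiv.org/abs/2404.18796), PoLL): a panel of judges from different families is less biased, at roughly 1/7 of the cost.
38
+ - **Citations are where large language models fabricate most readily.** Walters and Wilder 2023 ([doi:10.1038/s41598-023-41032-5](https://doi.org/10.1038/s41598-023-41032-5)): 55% of GPT-3.5 citations and 18% of GPT-4 citations were fabricated. Onweller et al. 2026 ([arXiv:2605.06635](https://arxiv.org/abs/2605.06635)): links resolve in more than 94% of cases, but only 39-77% of the cited content actually supports the claim. Existence must therefore be verified by **retrieval, not recall**.
39
+ - **Hide the generator's reasoning.** Khalifa et al. 2026 ([arXiv:2601.14691](https://arxiv.org/abs/2601.14691), "deceiving the judge"): manipulating only the chain of thought can raise a judge's false-positive rate by up to 90% while the actions stay unchanged. Turpin et al. 2023 ([arXiv:2305.04388](https://arxiv.org/abs/2305.04388)): chain of thought is post-hoc rationalization. The verifier sees only the bare citation claim and never learns "why I included this".
40
+ - **Diversity beats quantity.** Rajan 2025 ([arXiv:2511.16708](https://arxiv.org/abs/2511.16708)): four judges with pairwise correlation ρ ∈ [0.05, 0.25] outperform any single judge through submodular coverage. Kim et al. 2025 ([arXiv:2506.07962](https://arxiv.org/abs/2506.07962)): large language model errors are *correlated*, so the key variable is diversity of perspective, not sheer count.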
41
41
 
42
- ## Does it actually work? (proof)
42
+ ## Does it actually work? (the evidence)
43
43
 
44
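- As a test, the protocol was run against its own citations. Two unrelated non-Claude families, **Mistral** (`mistral-small:24b`) and **IBM Granite** (`granite4.1:30b`), checked a set of citations reasoning-stripped, with two blind traps planted:
44
+ To test it, we ran the protocol against its own citations. Two model families unrelated to Claude, **Mistral** (`mistral-small:24b`) and **IBM Granite** (`granite4.1:30b`), checked a set of citations with the reasoning stripped and two hidden traps planted among them.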
45
45
 
46
- | Planted trap | Mistral | IBM Granite | Ground truth |
46
+ | Planted trap | Mistral | IBM Granite | Ground truth |
47
47
  |---|---|---|---|
48
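- | Chain-of-thought prompting attributed to "Nakamura & Olsen" | missed | **caught** (misattributed → actually Wei et al. 2022, arXiv:2201.11903) | misattributed |
49
- | A fabricated "98% of errors eliminated, no oracle needed" paper | **caught** (fabricated) | **caught** (fabricated) | fabricated |
48
+ | Chain-of-thought prompting attributed to "Nakamura & Olsen" | missed | **caught** (misattributed; actually Wei et al. 2022, arXiv:2201.11903) | misattributed |
49
+ | A fabricated paper claiming "98% of errors eliminated, no oracle needed" | **caught** (fabricated) | **caught** (fabricated) | fabricated |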
50
50
 
51
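- Neither family caught both traps alone, but their **union caught 2/2.** A single judge would have missed the misattribution. The retrieval oracle also found two *real* misattributions in our own design docs (wrong authors cited) that no parametric LLM could flag, and it correctly confirmed genuine 2026 papers that both LLMs had wrongly flagged as fabricated because the papers postdate their training data. That last point is why the Step 4 existence check **must** be a retrieval oracle, not an LLM.
51
+ Neither family caught both traps on its own, but together they caught both. A single judge could have missed one. The retrieval oracle also found two real misattributions in our own design docs (papers credited to the wrong authors) that no parametric large language model could detect, and it correctly confirmed genuine papers published in 2026 that both LLMs had wrongly flagged as fabricated simply because the papers appeared after their training cutoff. This last point is the fundamental reason the Step 4 existence check must use a retrieval oracle rather than a large language model.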
52
52
 
53
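- That run is the thesis in miniature: **uncorrelated lenses plus a retrieval oracle for existence beat any single clever judge.**
53
+ That single run is the thesis in miniature: **uncorrelated lenses, combined with a retrieval oracle for existence, beat any single clever judge.**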
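The union effect can be reproduced in a few lines. This is only a sketch: the verdicts are copied from the trap table above, while the data layout and the `unionCaught` and `caughtBy` helpers are hypothetical names introduced for illustration.

```javascript
// Verdicts copied from the trap table; the object layout is illustrative.
const verdicts = {
  'misattributed chain-of-thought': { mistral: 'missed', granite: 'caught' },
  'fabricated 98% paper': { mistral: 'caught', granite: 'caught' },
};

// A trap counts as caught if at least one lens caught it (the union of catches).
function unionCaught(table) {
  return Object.values(table)
    .filter((byLens) => Object.values(byLens).includes('caught')).length;
}

// Number of traps caught by one lens on its own.
function caughtBy(table, lens) {
  return Object.values(table).filter((byLens) => byLens[lens] === 'caught').length;
}

console.log(`mistral alone: ${caughtBy(verdicts, 'mistral')}/2`); // mistral alone: 1/2
console.log(`union: ${unionCaught(verdicts)}/2`);                 // union: 2/2
```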
54
54
 
55
- ## How it works
55
+ ### ...and again, for the v1.1 design
56
56
 
57
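- You can run the protocol by hand: any different model, plus resolving the arXiv/DOI yourself, satisfies Step 4. Two companion tools make it one command:
57
+ The v1.1 refinements were made the same way: by running study-swarm on study-swarm. The four questions the first version raised (how to *automate* the groundedness check, whether to ground at generation time, how to *combine* lenses, and whether to abstain on calibrated uncertainty) went to parallel research agents, and all **27 resulting citations** were gated through Step 4 before they informed the design. The retrieval oracle confirmed that **27/27 citations exist**, including six 2025-2026 papers that a parametric model would have wrongly flagged as fabricated, and it corrected five attribution errors that a model could not have caught, one of them a real author-byline error that a research agent itself turned up. Run with reasoning stripped, the groundedness lenses even reproduced their own documented failure mode: they confidently flagged a real paper as fake, and their *disagreement* triggered an escalation, exactly as the cascade is meant to work. The verified dispatch ships as [`examples/study-swarm-v1_1.dispatch.md`](examples/study-swarm-v1_1.dispatch.md); the refinements it settled on (decomposed/ternary groundedness, generation-time grounding, an oracle-gated cascade, and calibrated abstention) are in [PROTOCOL.md](PROTOCOL.md).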
58
58
 
59
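- - **[prism-verify](https://github.com/mcp-tool-shop-org/prism-verify)**: the runtime verifier, with different-family routing, reasoning-stripped multi-lens arbitration, a deterministic retrieval existence floor (arXiv → Crossref), and signed receipts.
60
- - **[role-os](https://github.com/mcp-tool-shop-org/role-os)**: provides `roleos verify-citations <dispatch>`, which extracts the citations in a dispatch and passes them to prism for verification.
59
+ ## How it works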
61
60
 
62
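- The handoff is the dispatch format itself: findings written as `N. **finding.** Authors year (arXiv|DOI). implication.`, each **carrying a resolvable identifier**, are exactly what `roleos verify-citations` processes and verifies. A dispatch that passes `lint` flows through cleanly; a malformed citation is flagged as unresolved by the tool. `study-swarm lint` checks this convention locally, so Step 3 and Step 4 agree on what a citation is.
61
+ You can run the protocol by hand: any model from a different family, plus resolving the arXiv/DOI yourself, satisfies Step 4. Two companion tools reduce it to a single command: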
63
62
 
64
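- ## Command-line interface (CLI)
63
+ - **[prism-verify](https://github.com/mcp-tool-shop-org/prism-verify)**: the runtime verifier, with different-family routing, reasoning stripping, multi-lens arbitration, a deterministic retrieval existence floor (arXiv → Crossref), and signed receipts.
64
+ - **[role-os](https://github.com/mcp-tool-shop-org/role-os)**: provides `roleos verify-citations <dispatch>`, which extracts the citations in a dispatch and gates them through prism.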
65
+
66
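+ The handoff is the dispatch format: findings written as `N. **finding.** Authors year (arXiv|DOI). implication.`, **each with a resolvable identifier**, are exactly what `roleos verify-citations` extracts and gates. A dispatch that passes `lint` flows through cleanly; a malformed citation is flagged as unresolved by the runner. `study-swarm lint` checks this locally, so Step 3 and Step 4 share one definition of a citation.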
67
+
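To make the shared citation shape concrete, here is a rough parser for a single finding line. It is a sketch under assumptions: the regular expression and the `parseFinding` name are illustrative and are not taken from `study-swarm lint` or `roleos verify-citations`, which may accept more forms than this pattern does.

```javascript
// Illustrative pattern for: N. **finding.** Authors year (arXiv|DOI|URL). implication.
const FINDING = /^(\d+)\.\s+\*\*(.+?)\*\*\s+(.+?)\s+(\d{4})\s+\(((?:arXiv:|doi:|DOI:|https?:\/\/)[^)\s]+)\)\.\s+(.+)$/;

function parseFinding(line) {
  const m = FINDING.exec(line.trim());
  if (!m) return null; // authors, year, or a resolvable identifier is missing
  const [, n, finding, authors, year, id, implication] = m;
  return { n: Number(n), finding, authors, year: Number(year), id, implication };
}

const ok = parseFinding(
  '1. **LLMs cannot reliably self-verify.** Huang et al. 2023 (arXiv:2310.01798). Use an external verifier.'
);
const bad = parseFinding('2. **Research shows verification helps.** Use a verifier.');

console.log(ok.id, ok.year); // arXiv:2310.01798 2023
console.log(bad);            // null
```

A hand-wavy line with no authors, year, or identifier simply fails to match, which mirrors the kind of rejection the sourcing standard describes.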
68
+ ## Command-line interface (CLI)
65
69
 
66
70
  ```bash
67
71
  npm i -g @dogfood-lab/study-swarm # or run ad-hoc: npx @dogfood-lab/study-swarm <command>
68
72
  ```
69
73
 
70
- | Command | What it does |
74
+ | Command | What it does |
71
75
  |---|---|
72
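- | `study-swarm protocol` | Prints the full protocol: the five steps, the halt table, and the sourcing standard. |
73
- | `study-swarm new <slug>` | Creates a `<slug>.dispatch.md` file with a five-step skeleton to fill in. |
74
- | `study-swarm lint [--json] <path…>` | Checks a dispatch's *Research grounding* against the sourcing standard: every finding needs authors, a year, and a resolvable identifier (arXiv / DOI / URL); a hand-wavy "research shows..." is rejected. Returns `1` on violations, which blocks CI. `<path>` may be a file, a directory (all `*.dispatch.md` files are linted recursively), or `-` for stdin; `--json` emits a machine-readable report. |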
76
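+ | `study-swarm protocol` | Prints the full protocol: the five steps, the halt table, and the sourcing standard. |
77
+ | `study-swarm new <slug>` | Creates a `<slug>.dispatch.md` file with a five-step skeleton to fill in. |
78
+ | `study-swarm lint [--json] <path…>` | Checks a dispatch's *Research grounding* against the sourcing standard: every finding needs authors, a year, and a resolvable identifier (arXiv / DOI / URL); a vague "research shows..." is rejected. Exits with code `1` on violations, so it can gate CI. `<path>` may be a file, a directory (all `*.dispatch.md` files are linted recursively), or `-` for stdin; `--json` emits a machine-readable report. |
79
+ | `study-swarm lock <dispatch> --from <orchestration.json>` | Pins a dispatch for replay: writes `<dispatch>.lock.json`, which content-hashes, per Step 2 agent, the **resolved model id** + the **SHA-256 of the byte-exact prompt** + the **SHA-256 of the tool schema**, plus the Step 4 **verifier receipt**, all rolled into one `lock_sha256`. |
80
+ | `study-swarm lock --verify <dispatch> [--from …]` | Recomputes those hashes and confirms they match the lock; any drift exits with 1, so it gates CI the way a package lockfile does. Without `--from`, it checks the lock's own integrity. |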
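
For orientation, this is roughly what a harness-emitted orchestration record for `--from` might contain. The field names follow what the `lock` code in this release reads (`steps[].question_id`, `resolved_model`, `prompt`, `tool_schema`, optional `output`, and an optional top-level `verification`); every value below, including the shape of the receipt, is a made-up placeholder.

```javascript
// All values are placeholders; only the field names come from the lock code in this release.
const orchestration = {
  steps: [
    {
      question_id: 'Q1',
      resolved_model: 'example-model-2026-01-01', // a resolved id, not a floating alias
      prompt: 'Research question Q1 (placeholder prompt text)', // the byte-exact prompt
      tool_schema: { tools: [{ name: 'fetch', parameters: { type: 'object' } }] },
      output: 'Finding text returned by the agent', // optional, hashed for drift detection
    },
  ],
  // Optional external-verifier receipt; this shape is hypothetical.
  verification: { verifier: 'example-verifier', verdict: 'pass' },
};

// Serialize the record so a harness could save it as orchestration.json.
const json = JSON.stringify(orchestration, null, 2);
console.log(json);
```

Saved to a file, a record like this would be passed as `study-swarm lock <dispatch> --from orchestration.json`.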
75
81
 
76
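- `lint` is deterministic (it calls no model), so it is safe to use in CI. It enforces the **Step 3 sourcing standard** locally; the model-based **Step 4** verification still goes through [`roleos verify-citations`](https://github.com/mcp-tool-shop-org/role-os) → prism.
82
+ `lint` is deterministic (it calls no model), so it is safe to use in CI. It enforces the **Step 3 sourcing standard** locally; the model-based **Step 4** verification still goes through [`roleos verify-citations`](https://github.com/mcp-tool-shop-org/role-os) → prism.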
77
83
 
78
- A typical flow:
84
+ A typical loop:
79
85
 
80
86
  ```bash
81
87
  study-swarm new my-decision # creates my-decision.dispatch.md
@@ -84,11 +90,11 @@ study-swarm lint my-decision.dispatch.md # enforce the sourcing standard
84
90
  roleos verify-citations my-decision.dispatch.md # model-based Step 4 (different family, via prism)
85
91
  ```
86
92
 
87
- A complete, `lint`-clean dispatch (study-swarm applied to its own design) ships as a reference example at [`examples/study-swarm-self.dispatch.md`](examples/study-swarm-self.dispatch.md).
93
+ Three complete, lint-clean dispatches ship as references: [`examples/study-swarm-self.dispatch.md`](examples/study-swarm-self.dispatch.md) (the protocol's core decisions, compact), [`examples/study-swarm-v1_1.dispatch.md`](examples/study-swarm-v1_1.dispatch.md) (the full v1.1 design: 27 citations, each externally verified), and [`examples/study-swarm-lock.dispatch.md`](examples/study-swarm-lock.dispatch.md) (the v1.2 lock design: 39 citations, gated through the runner, and the first dispatch to ship its own lock).
88
94
 
89
- ### Verifying in CI
95
+ ### Gating in CI
90
96
 
91
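- `lint` accepts a file, a directory (all `*.dispatch.md` files are linted recursively), or `-` for stdin, and `--json` emits a machine-readable report. Add it to your repository to verify the sourcing of every dispatch on every PR (a copy-paste example is also at [`examples/study-swarm-ci.yml`](examples/study-swarm-ci.yml)):
97
+ `lint` accepts a file, a directory (all `*.dispatch.md` files are linted recursively), or `-` for stdin, and `--json` emits a machine-readable report. Add it to your repository to gate the sourcing of every dispatch in every PR (a copy-paste example is also at [`examples/study-swarm-ci.yml`](examples/study-swarm-ci.yml)):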
92
98
 
93
99
  ```yaml
94
100
  # .github/workflows/dispatches.yml
@@ -110,19 +116,25 @@ jobs:
110
116
  - run: npx @dogfood-lab/study-swarm@latest lint dispatches/
111
117
  ```
112
118
 
113
- ## Why it works, in one sentence:
119
+ ### Pinning a dispatch for replay (`dispatch.lock.json`)
120
+
121
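+ A verified dispatch can be audited only if you can state *what produced it*. `study-swarm lock` writes a companion lockfile that content-hashes, per research agent, the **resolved model id (never a floating alias)**, the **SHA-256 of the byte-exact prompt**, and the **SHA-256 of the tool schema**, plus the external **verifier receipt**, all rolled into one `lock_sha256`. `study-swarm lock --verify` recomputes those hashes and fails closed on any drift, so a changed prompt, model, or tool is detected; it is the executable form of the [PIN_PER_STEP](https://github.com/dogfood-lab/study-swarm) reproducibility standard. The harness emits the record; the CLI stays zero-dependency and network-free, and only canonicalizes (RFC 8785), hashes, and verifies.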
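The canonicalize-then-hash idea can be illustrated as follows. This is a simplified sketch rather than the CLI's own `jcs` function: it sorts keys and removes whitespace but skips the string normalization the CLI applies, it does not handle `undefined` values, and `canonical` and `digest` are illustrative names.

```javascript
const { createHash } = require('node:crypto');

// Simplified canonical JSON: sorted object keys, no inter-token whitespace.
function canonical(v) {
  if (v === null || typeof v !== 'object') return JSON.stringify(v);
  if (Array.isArray(v)) return '[' + v.map(canonical).join(',') + ']';
  return '{' + Object.keys(v).sort()
    .map((k) => JSON.stringify(k) + ':' + canonical(v[k])).join(',') + '}';
}

// Algorithm-prefixed digest in the "sha256-<base64>" form.
function digest(v) {
  return 'sha256-' + createHash('sha256').update(canonical(v), 'utf8').digest('base64');
}

const a = { name: 'fetch', parameters: { type: 'object', required: ['url'] } };
const b = { parameters: { required: ['url'], type: 'object' }, name: 'fetch' };

console.log(digest(a) === digest(b));                        // true: key order does not matter
console.log(digest(a) === digest({ ...a, name: 'search' })); // false: a content change does
```

Hashing the canonical form rather than the raw serialization is what keeps a harness's key order or spacing from registering as false drift.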
122
+
123
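+ **It pins inputs, not outputs.** Pinning model + prompt + temperature does not make LLM outputs identical: lack of batch invariance, floating-point non-associativity, mixture-of-experts routing, and silent provider drift are all outside an offline tool's control. The lock therefore gives you **replayable inputs and drift-detectable outputs**, not "deterministic replay". The design is grounded in every citation of [`examples/study-swarm-lock.dispatch.md`](examples/study-swarm-lock.dispatch.md), which is the first dispatch to ship its own lock ([`examples/study-swarm-lock.lock.json`](examples/study-swarm-lock.lock.json)).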
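The difference between pinned inputs and drift-detected outputs can be shown with two hypothetical runs. This is a sketch under assumptions: `pinStep`, `compare`, and the record layout are illustrative, the model id and texts are placeholders, and hex digests are used for brevity, so it does not reproduce the CLI's lock format.

```javascript
const { createHash } = require('node:crypto');
const sha = (s) => createHash('sha256').update(s, 'utf8').digest('hex');

// Pin what can be declared up front (model id, prompt); hash the output separately.
function pinStep({ model, prompt, output }) {
  return { model, prompt_sha256: sha(prompt), output_sha256: sha(output) };
}

// Inputs must match exactly; an output mismatch is reported as drift.
function compare(locked, rerun) {
  const inputsMatch =
    locked.model === rerun.model && locked.prompt_sha256 === rerun.prompt_sha256;
  return { inputsMatch, outputDrift: locked.output_sha256 !== rerun.output_sha256 };
}

const locked = pinStep({ model: 'example-model-v1', prompt: 'Q4 prompt text', output: 'Answer A' });
const rerun = pinStep({ model: 'example-model-v1', prompt: 'Q4 prompt text', output: 'Answer B' });

console.log(compare(locked, rerun)); // { inputsMatch: true, outputDrift: true }
```

Identical pinned inputs with a different output is exactly the case the lock is meant to surface rather than prevent.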
124
+
125
+ ## Why it works, in one sentence
114
126
 
115
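- **Current**: the field moves fast; requiring concrete, dated research keeps a design from falling 18 months behind. **Functional**: the evidence shows what *fails*, not only what works (explanations can increase over-reliance on *wrong* AI; Bansal et al. 2021, [arXiv:2006.14779](https://arxiv.org/abs/2006.14779)). **Safe**: the verifier-guarded envelope is the architecture the evidence supports, and the protocol enforces it on its own output. Sourcing is not an academic game; it is the chain of evidence.
127
+ **Current**: the field moves fast; requiring concrete, dated research keeps a design from falling 18 months behind. **Functional**: the evidence shows which *approaches fail*, not only which work (explanations can increase over-reliance on *wrong* AI; Bansal et al. 2021, [arXiv:2006.14779](https://arxiv.org/abs/2006.14779)). **Safe**: the verifier-guarded envelope is the architecture the evidence supports, and the protocol enforces it on its own output. Sourcing is not academic formalism; it is the chain of evidence.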
116
128
 
117
129
  ## 安全性
118
130
 
119
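- `study-swarm` ships a **lightweight, zero-dependency CLI** (`study-swarm`) along with the methodology. It makes **no network or model calls and collects no telemetry**; there are no secrets or credentials in the source. At run time it reads only the files you pass to `lint` and writes one `<slug>.dispatch.md` file in the current directory (it refuses to overwrite and never leaves the working directory). The model-based verification the methodology describes (Step 4) is performed by the companion tools, not by this package. See [SECURITY.md](SECURITY.md).
131
+ `study-swarm` ships a **lightweight, zero-dependency CLI** (`study-swarm`) along with the methodology. It makes **no network or model calls and collects no telemetry**; there are no secrets or credentials in the source. At run time it reads only the files you pass to `lint` and writes one `<slug>.dispatch.md` file in the current directory (it refuses to overwrite and never leaves the working directory). The model-based verification the methodology describes (Step 4) is performed by the companion tools, not by this package. See [SECURITY.md](SECURITY.md).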
120
132
 
121
133
  ## 状态
122
134
 
123
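- A working protocol, externally verified by its own mechanism: a different model checks its citations (see the proof above). This repository is the public reference; [PROTOCOL.md](PROTOCOL.md) is the executable form. It is part of the [dogfood-lab](https://github.com/dogfood-lab) family of methods and examples for building in the AI era.
135
+ A working protocol, externally verified by its own mechanism: a different model family checks its citations (see the proof above). **v1.1** refines the verifier where the first version was silent: decomposed/ternary groundedness, generation-time grounding, an oracle-gated cascade for combining lenses, and calibrated abstention, all grounded in the verified v1.1 dispatch. **v1.2** makes a dispatch byte-replayable: `study-swarm lock` pins the resolved model, prompt, and tool schema per step and adds the verifier receipt, while `lock --verify` fails closed on drift. This repository is the public reference; [PROTOCOL.md](PROTOCOL.md) is the executable form. It is part of the [dogfood-lab](https://github.com/dogfood-lab) family of methods and examples for building in the AI era.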
124
136
 
125
- MIT licensed.
137
+ Licensed under MIT.
126
138
 
127
139
  ---
128
140
 
@@ -22,6 +22,13 @@ COMMANDS
22
22
  lint [--json] <path...> Check dispatches' citations against the sourcing standard.
23
23
  A <path> may be a file, a directory (linted recursively for
24
24
  *.dispatch.md), or "-" to read one dispatch from stdin.
25
+ lock <dispatch> --from <orchestration.json>
26
+ Emit <dispatch>.lock.json — pin (per Step-2 agent) the resolved
27
+ model + SHA-256 of the byte-exact prompt + SHA-256 of the tool
28
+ schema, plus the verifier receipt, rolled into one lock_sha256.
29
+ lock --verify <dispatch> [--from <orchestration.json>]
30
+ Re-derive the deterministic hashes and assert they match the lock;
31
+ drift exits 1 (gates CI). Without --from, checks lock self-integrity.
25
32
  help Show this help.
26
33
  version Print the version.
27
34
 
@@ -47,7 +54,7 @@ function fail(code, msg) {
47
54
  // Short hash of the vendored PROTOCOL.md, so a scaffolded dispatch records the exact
48
55
  // methodology version it was authored against (the package vendors PROTOCOL.md for this).
49
56
  function protocolHash() {
50
- try { return createHash('sha256').update(readFileSync(PROTOCOL_PATH)).digest('hex').slice(0, 16); }
57
+ try { return createHash('sha256').update(normText(readFileSync(PROTOCOL_PATH, 'utf8'))).digest('hex').slice(0, 16); }
51
58
  catch { return 'unknown'; }
52
59
  }
53
60
 
@@ -257,12 +264,187 @@ function cmdLint(args) {
257
264
  process.exit(anyFail ? 1 : 0);
258
265
  }
259
266
 
267
+ // --- lock core (dispatch.lock.json — the PIN_PER_STEP feature) ------------------
268
+ // Design + research grounding: examples/study-swarm-lock.dispatch.md (choices L1-L11).
269
+ // The CLI is a PURE FUNCTION of provided bytes: the orchestration harness emits the record
270
+ // (resolved models + byte-exact prompts + tool schemas + verifier receipt); the CLI only
271
+ // canonicalizes + hashes + validates it. No network, no model calls (L2).
272
+
273
+ const LOCK_SCHEMA = 'dispatch.lock/v1';
274
+
275
+ // Self-describing digest "sha256-<base64>" — the W3C Subresource Integrity form: algorithm-
276
+ // prefixed (so it's algorithm-agile) and used fail-closed on mismatch (L9; lock dispatch finding 38).
277
+ function sriBytes(buf) { return 'sha256-' + createHash('sha256').update(buf).digest('base64'); }
278
+ // Normalize TEXT before hashing so the same content hashes identically across platforms — strip a
279
+ // BOM, fold CRLF/CR -> LF, NFC-normalize. Without this, a CRLF working tree (Windows) and an LF
280
+ // checkout (git/CI) produce different hashes — the exact cross-platform drift our Q2 findings warn
281
+ // about (RFC 8259 BOM, UAX #15 NFC, and CRLF/LF). Applied to every text input that gets hashed.
282
+ function normText(s) { s = String(s); if (s.charCodeAt(0) === 0xFEFF) s = s.slice(1); return s.replace(/\r\n?/g, '\n').normalize('NFC'); }
283
+ function sriText(str) { return sriBytes(Buffer.from(normText(str), 'utf8')); }
284
+
285
+ // RFC 8785 (JCS) canonical JSON, for the structured JSON the CLI assembles ITSELF (the tool
286
+ // surface and the lock body): normalize strings with normText (an extra step; RFC 8785 itself leaves string data as-is), sort object keys by UTF-16 code unit (JS
287
+ // default string sort), no inter-token whitespace, ECMAScript-shortest numbers, UTF-8 (L4).
288
+ // The PROMPT is NOT JCS-restructured — it is the literal string the model conditioned on, so it is
289
+ // hashed as text (L3; JWS/DSSE hash-known-bytes rule, findings 12/23) under the SAME newline/BOM/NFC
290
+ // normalization (normText) so the same prompt hashes identically across platforms (findings 10/11).
291
+ function jcs(value) {
292
+ const ser = (v) => {
293
+ if (v === null) return 'null';
294
+ const t = typeof v;
295
+ if (t === 'boolean') return v ? 'true' : 'false';
296
+ if (t === 'number') {
297
+ if (!Number.isFinite(v)) throw new Error('JCS: non-finite number not allowed');
298
+ return JSON.stringify(v); // ECMAScript Number->String shortest round-trip
299
+ }
300
+ if (t === 'string') return JSON.stringify(normText(v));
301
+ if (Array.isArray(v)) return '[' + v.map(ser).join(',') + ']';
302
+ if (t === 'object') {
303
+ return '{' + Object.keys(v).sort()
304
+ .map((k) => JSON.stringify(normText(k)) + ':' + ser(v[k])).join(',') + '}';
305
+ }
306
+ throw new Error(`JCS: unsupported type ${t}`);
307
+ };
308
+ return ser(value);
309
+ }
310
+ function jcsDigest(value) { return sriBytes(Buffer.from(jcs(value), 'utf8')); }
311
+
312
+ // The lock sits beside its dispatch: <dir>/<stem>.lock.json (stem strips a trailing .dispatch.md).
313
+ function lockPathFor(dispatch) {
314
+ const base = dispatch.split(/[\\/]/).pop().replace(/(\.dispatch)?\.md$/i, '');
315
+ return join(dirname(dispatch), `${base}.lock.json`);
316
+ }
317
+
318
+ // Build the lock object from the dispatch bytes + the harness-emitted orchestration record.
319
+ function buildLockObject(dispatchPath, orchestration) {
320
+ const dispatchText = readFileSync(dispatchPath, 'utf8');
321
+ const protocolText = readFileSync(PROTOCOL_PATH, 'utf8');
322
+ if (!orchestration || !Array.isArray(orchestration.steps) || orchestration.steps.length === 0) {
323
+ fail(2, 'orchestration record has no non-empty "steps" array');
324
+ }
325
+ const steps = orchestration.steps.map((s, i) => {
326
+ const need = (k) => {
327
+ if (s == null || s[k] === undefined || s[k] === null) fail(2, `orchestration step ${i + 1} is missing "${k}"`);
328
+ return s[k];
329
+ };
330
+ const rec = {
331
+ question_id: String(need('question_id')),
332
+ resolved_model: String(need('resolved_model')), // L6 — the resolved id, never an alias
333
+ prompt_sha256: sriText(String(need('prompt'))), // L3 — text-normalized (LF/NFC/BOM), not JCS-restructured
334
+ tool_schema_sha256: jcsDigest(need('tool_schema')), // L5 — canonicalized tool surface
335
+ };
336
+ if (s.schema_dialect) rec.schema_dialect = String(s.schema_dialect); // L5 — dialect is contract
337
+ if (s.params && typeof s.params === 'object') rec.params = s.params;
338
+ // L7 — output hash for DRIFT DETECTION only (not determinism). The harness may ship the raw
339
+ // output (the CLI hashes it) OR a pre-computed output_sha256 (large outputs needn't be shipped).
340
+ if (typeof s.output_sha256 === 'string') rec.output_sha256 = s.output_sha256;
341
+ else if (s.output !== undefined) rec.output_sha256 = typeof s.output === 'string' ? sriText(s.output) : jcsDigest(s.output);
342
+ return rec;
343
+ });
344
+ const lock = {
345
+ schema: LOCK_SCHEMA,
346
+ study_swarm_version: VERSION,
347
+ protocol_sha256: sriText(protocolText), // pins the methodology version (text-normalized)
348
+ dispatch_sha256: sriText(dispatchText), // pins the dispatch text (text-normalized)
349
+ steps,
350
+ };
351
+ if (orchestration.verification && typeof orchestration.verification === 'object') {
352
+ lock.verification = orchestration.verification; // L10 — the external-verifier receipt
353
+ }
354
+ // L1/L9 — rollup over the whole body (this object, before lock_sha256 is added) as ONE flat
355
+ // canonical object: distinct keys give domain separation, the steps array's explicit length
356
+ // commits to exactly N steps (no odd-leaf duplication).
357
+ lock.lock_sha256 = jcsDigest(lock);
358
+ return lock;
359
+ }
360
+
361
+ // Verify a lock: self-integrity always; source-drift too when an orchestration record is supplied.
362
+ // Strict-match, fail-closed (L8): returns a list of problems (empty = clean).
363
+ function verifyLockObject(dispatchPath, lockPath, orchestration) {
364
+ let stored;
365
+ try { stored = JSON.parse(readFileSync(lockPath, 'utf8')); }
366
+ catch (err) { fail(2, `cannot read lock ${lockPath}: ${err && err.code ? err.code : err.message}`); }
367
+ const problems = [];
368
+ // 1) Self-integrity — recompute lock_sha256 over the stored body (detects a hand-edited lock).
369
+ if (!stored || typeof stored !== 'object' || typeof stored.lock_sha256 !== 'string') {
370
+ problems.push('lock has no lock_sha256 string');
371
+ } else {
372
+ const { lock_sha256, ...body } = stored;
373
+ const recomputed = jcsDigest(body);
374
+ if (lock_sha256 !== recomputed) {
375
+ problems.push(`lock_sha256 mismatch (the lock body was edited): stored ${lock_sha256} != recomputed ${recomputed}`);
376
+ }
377
+ }
378
+ // 2) Source drift — re-derive the deterministic hashes from the live inputs and strict-compare.
379
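+ if (orchestration && stored && typeof stored === 'object') { // skip when the lock did not parse to an object (avoids a TypeError on null)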
380
+ const fresh = buildLockObject(dispatchPath, orchestration);
381
+ const cmp = (label, a, b) => { if (a !== b) problems.push(`${label} drift: lock ${b} != re-derived ${a}`); };
382
+ cmp('lock_sha256', fresh.lock_sha256, stored.lock_sha256); // the authoritative rollup guard
383
+ cmp('protocol_sha256', fresh.protocol_sha256, stored.protocol_sha256);
384
+ cmp('dispatch_sha256', fresh.dispatch_sha256, stored.dispatch_sha256);
385
+ const a = fresh.steps || [], b = Array.isArray(stored.steps) ? stored.steps : [];
386
+ if (a.length !== b.length) problems.push(`step count drift: re-derived ${a.length} != lock ${b.length}`);
387
+ for (let i = 0; i < Math.min(a.length, b.length); i++) {
388
+ for (const k of ['question_id', 'resolved_model', 'prompt_sha256', 'tool_schema_sha256']) {
389
+ cmp(`steps[${i}].${k}`, a[i][k], b[i][k]);
390
+ }
391
+ if (a[i].output_sha256 || b[i].output_sha256) cmp(`steps[${i}].output_sha256`, a[i].output_sha256, b[i].output_sha256);
392
+ }
393
+ }
394
+ return problems;
395
+ }
396
+
397
+ function cmdLock(args) {
398
+ const verify = args.includes('--verify');
399
+ const rest = args.filter((a) => a !== '--verify');
400
+ let orchPath = null;
401
+ const fromIdx = rest.indexOf('--from');
402
+ if (fromIdx !== -1) {
403
+ orchPath = rest[fromIdx + 1];
404
+ if (!orchPath) fail(2, 'usage: --from <orchestration.json>');
405
+ rest.splice(fromIdx, 2);
406
+ }
407
+ const dispatch = rest[0];
408
+ if (!dispatch) {
409
+ fail(2, 'usage: study-swarm lock <dispatch> --from <orchestration.json> | study-swarm lock --verify <dispatch> [--from <orchestration.json>]');
410
+ }
411
+ if (!existsSync(dispatch)) fail(2, `dispatch not found: ${dispatch}`);
412
+ let orchestration = null;
413
+ if (orchPath) {
414
+ if (!existsSync(orchPath)) fail(2, `orchestration record not found: ${orchPath}`);
415
+ try { orchestration = JSON.parse(readFileSync(orchPath, 'utf8')); }
416
+ catch (err) { fail(2, `orchestration record is not valid JSON: ${err.message}`); }
417
+ }
418
+ const lockPath = lockPathFor(dispatch);
419
+
420
+ if (verify) {
421
+ if (!existsSync(lockPath)) fail(2, `no lock at ${lockPath} — create it with: study-swarm lock ${dispatch} --from <orchestration.json>`);
422
+ const problems = verifyLockObject(dispatch, lockPath, orchestration);
423
+ if (problems.length === 0) {
424
+ const scope = orchestration ? 'lock integrity verified + no source drift' : 'lock self-integrity verified (pass --from to also check source drift)';
425
+ process.stdout.write(`ok ${lockPath}: ${scope}.\n`);
426
+ process.exit(0);
427
+ }
428
+ process.stderr.write(`x ${lockPath}: ${problems.length} drift/integrity issue(s)\n`);
429
+ for (const p of problems) process.stderr.write(` - ${p}\n`);
430
+ process.exit(1);
431
+ }
432
+
433
+ if (!orchestration) {
434
+ fail(2, 'study-swarm lock <dispatch> requires --from <orchestration.json> — the harness-emitted record of resolved models + byte-exact prompts + tool schemas + the verifier receipt');
435
+ }
436
+ const lock = buildLockObject(dispatch, orchestration);
437
+ writeFileSync(lockPath, JSON.stringify(lock, null, 2) + '\n', 'utf8');
438
+ process.stdout.write(`Created ${lockPath}\nlock_sha256: ${lock.lock_sha256}\nVerify with: study-swarm lock --verify ${dispatch} --from ${orchPath}\n`);
439
+ }
440
+
260
441
  function main(argv) {
261
442
  const [cmd, ...rest] = argv;
262
443
  switch (cmd) {
263
444
  case 'protocol': return cmdProtocol();
264
445
  case 'new': return cmdNew(rest[0]);
265
446
  case 'lint': return cmdLint(rest);
447
+ case 'lock': return cmdLock(rest);
266
448
  case 'version': case '--version': case '-v':
267
449
  return void process.stdout.write(VERSION + '\n');
268
450
  case 'help': case '--help': case '-h': case undefined:
@@ -0,0 +1,137 @@
1
+ <!-- study-swarm v1.2.0 · protocol-sha256:4cfba7e8f7ef0915 · created:2026-06-29 -->
2
+ # Study-swarm dispatch: dispatch.lock.json (the PIN_PER_STEP feature)
3
+
4
+ > **Design dispatch.** This grounds the design of `dispatch.lock.json` — a per-dispatch lockfile that
5
+ > makes a study-swarm research dispatch **byte-replayable** by pinning, per step, the resolved model id,
6
+ > the SHA-256 of the byte-exact agent prompt, the SHA-256 of the tool JSONSchemas the agent had, and the
7
+ > external-verifier receipt — plus a top-level `lock_sha256` rollup. It implements the **PIN_PER_STEP**
8
+ > workflow standard (heritage: Snakemake 2012, Pegasus 2001). Five load-bearing questions were sent to
9
+ > parallel retrieval-grounded research agents; every finding below was fetched this session, and the whole
10
+ > set is gated through Step 4 (`roleos verify-citations` → prism, a different model family) **before** it
11
+ > informs the architecture. The synthesizer is Claude/Opus; the groundedness lens is Mistral; the existence
12
+ > oracle is deterministic retrieval — none of them Claude. Run `study-swarm lint study-swarm-lock.dispatch.md`
13
+ > (it passes).
14
+
15
+ ## Step 1 — Load-bearing questions
16
+
17
+ Each passes the load-bearing test (two real designs hinge on the answer; an adjacent field has measured it; the current spec is silent or hand-wavy):
18
+
19
+ - **Q1 — Replay-manifest structure.** How do reproducible-workflow and package/build systems structure a replay manifest, and how do they detect & surface DRIFT between the lock and a re-run? (What is pinned, how it is content-addressed, how a mismatch fails.)
20
+ - **Q2 — Canonicalization & rollup hashing.** How do you canonicalize structured data so a hash is STABLE across platforms (Windows/macOS/Linux) and re-serializations, and how should per-step hashes roll up to one dispatch hash without introducing a collision?
21
+ - **Q3 — Step-level provenance/attestation.** How do supply-chain frameworks capture STEP-LEVEL provenance, and which parts map to pinning "model + prompt + tool-schema + verifier receipt"? What is the honest verifiable-not-unforgeable claim for an offline CLI with an ephemeral key?
22
+ - **Q4 — LLM replay determinism reality.** Can pinning model + prompt + temperature (+ seed) yield reproducible OUTPUTS, or only reproducible INPUTS? (This decides whether the lock may claim "deterministic replay" or must claim "replayable inputs + drift-detectable outputs.")
23
+ - **Q5 — Tool-schema drift.** How do agent frameworks pin/version the tool/function schemas an agent had, so a replay with the same prompt but a CHANGED tool surface is detected? (The half PIN explicitly flags as missing.)
24
+
25
+ ## Step 2 — Research dispatch
26
+
27
+ Five parallel research agents (one per question), each retrieval-required — a source an agent could not fetch did not enter the dispatch — followed by a per-question **coverage-recovery sweep** that did a second retrieval pass AND a retrieval-based **existence audit** of the first sweep's citations. The recovery pass earned its keep: it caught a real **misattribution** (PEP 658's author is Tzu-ping Chung, not the byline the first sweep guessed), a **mis-sourced** claim (MoE-routing nondeterminism was attributed to a blog that does not state it; the correct source is the Hugging Face MoE post), and two **over-attributed URLs** (an SLSA reproducibility caveat that is a paraphrase, and a Sigstore ephemeral-key claim documented on the Fulcio pages, not the Rekor logging page). One first-sweep finding was `retrieved:false` (a Pact claim) and was **dropped**; an `oasdiff` finding whose maintainer attribution could not be verified was demoted to a design note rather than a numbered citation. All corrections are folded into Step 3 below; the audit is recorded in Step 4.
28
+
29
+ ## Step 3 — Research grounding
30
+
31
+ <!-- Every finding: author + year + a resolvable identifier (arXiv / DOI / RFC-or-spec URL), one-sentence finding, design implication. All gated by Step 4 before Step 5. Claims are phrased to what the retrieved source supports; precise figures that live only in a paper's body were softened to the abstract-grounded claim. -->
32
+
33
+ 1. **(Q1) Snakemake re-runs a rule only on output-absence or an input modification-time newer than the output, and deletes incomplete outputs on resume.** Köster & Rahmann 2012 (DOI:10.1093/bioinformatics/bts480). Implication: mtime-based re-run detection is the weak baseline — `dispatch.lock.json` pins CONTENT (SHA-256 of prompt/schema), never timestamps, so drift survives file-touch noise.
34
+ 2. **(Q1) Nextflow decides a task is cached by a single hash folding its inputs, command script, container image, environment, and parameters, re-executing on any change.** Di Tommaso et al. 2017 (DOI:10.1038/nbt.3820). Implication: each lock step's record folds resolved-model-id + `prompt_sha256` + `tool_schema_sha256` together, and `lock --verify` treats any field's drift as a replay miss.
35
+ 3. **(Q1) Nextflow's default file hashing keys on path + last-modified + size rather than content, producing spurious cache misses it must paper over with a "lenient" mode.** Seqera 2025 (https://docs.seqera.io/nextflow/cache-and-resume). Implication: a cautionary tale — hash byte-exact CONTENT via `node:crypto`, never path+mtime+size, so the lock does not inherit metadata-hash false positives.
36
+ 4. **(Q1) npm's package-lock.json pins each package's exact version, resolved URL, and an integrity hash, describing the tree so installs reproduce it.** npm, Inc. 2024 (https://docs.npmjs.com/cli/v10/configuring-npm/package-lock-json). Implication: models the per-step fields one-to-one — resolved id (≈version), source, content hash (≈integrity) — per research-agent step.
37
+ 5. **(Q1) npm ci treats the lockfile as the source of truth and errors out rather than rewriting it on any package.json/lock mismatch.** npm, Inc. 2024 (https://docs.npmjs.com/cli/v10/commands/npm-ci). Implication: the exact pattern for `lock --verify` — re-derive the hashes and FAIL CLOSED (non-zero exit) on mismatch; never silently auto-heal.
38
+ 6. **(Q1) pip's --require-hashes mode demands a SHA-256-or-stronger hash for ALL requirements and rejects md5/sha1/sha224 as too weak.** Python Packaging Authority 2024 (https://pip.pypa.io/en/stable/topics/secure-installs/). Implication: SHA-256 is the floor and completeness is all-or-nothing — every step is hashed, and one missing/mismatched hash fails the whole verify.
39
+ 7. **(Q1) PEP 658 lets a repository advertise a distribution's metadata hash as hashname=hashvalue so clients verify fetched metadata before trusting it.** Chung 2021 (https://peps.python.org/pep-0658/). Implication: precedent for hashing the capability METADATA, not just a payload — `tool_schema_sha256` hashes the tool schemas (what the agent COULD call), independent of any output.
40
+ 8. **(Q1) A build is reproducible only if, given the same source, environment, and instructions, any party can recreate bit-identical artifacts.** Reproducible Builds 2024 (https://reproducible-builds.org/docs/definition/). Implication: frames the honest ceiling — an LLM step has an uncontrolled input, so the lock pins the declarable inputs bit-exact and only drift-detects outputs.
41
+ 9. **(Q2) RFC 8785 (JCS) fixes a canonical JSON form — recursive UTF-16 code-unit key sorting, no inter-token whitespace, ECMAScript number serialization, UTF-8 output — so equal objects serialize to equal bytes.** Rundgren et al. 2020 (https://www.rfc-editor.org/rfc/rfc8785). Implication: run every CLI-assembled JSON object (the tool surface, the lock body) through JCS before SHA-256 so the harness's key order or spacing never causes false drift.
42
+ 10. **(Q2) UAX #15 guarantees canonically-equivalent Unicode strings (e.g. the NFC vs NFD encodings of the same text) share one identical normalized form.** Whistler 2025 (https://unicode.org/reports/tr15/). Implication: because JCS deliberately does NOT normalize Unicode, the CLI must NFC-normalize strings before JCS+SHA-256, or text authored on macOS (often NFD) silently drifts from the same text on Windows/Linux.
43
+ 11. **(Q2) RFC 8259 requires interchange JSON be UTF-8 and forbids emitting a byte-order mark, while declaring member order and whitespace insignificant to JSON semantics.** Bray 2017 (https://datatracker.ietf.org/doc/html/rfc8259). Implication: the Windows-authored CLI strips any BOM, never writes one, and relies on JCS rather than trusting the harness's key order/whitespace.
44
+ 12. **(Q2) RFC 7515 (JWS) signs over the exact base64url octets rather than re-canonicalized JSON, so whitespace, member order, and re-serialization can never alter the signature.** Jones et al. 2015 (https://www.rfc-editor.org/rfc/rfc7515). Implication: for inputs whose exact bytes are already known — the byte-exact prompt and the verifier receipt blob — hash the captured bytes directly; reserve JCS for JSON the CLI assembles itself.
45
+ 13. **(Q2) Merkle's hash tree authenticates many leaves under a single root hash by hashing pairwise up a tree.** Merkle 1987 (DOI:10.1007/3-540-48184-2_32). Implication: `lock_sha256` is a rollup committing to every per-step record, so any single-step drift changes the dispatch's content-address.
46
+ 14. **(Q2) RFC 6962 hashes Merkle leaves as SHA-256(0x00 then data) and interior nodes as SHA-256(0x01 then children), stating the domain separation is required for second-preimage resistance.** Laurie et al. 2013 (https://www.rfc-editor.org/rfc/rfc6962.html). Implication: distinct per-field keys inside the canonical lock object provide the same domain separation, so a single step's hash cannot be passed off as the whole-lock hash.
47
+ 15. **(Q2) Bitcoin's CVE-2012-2459 showed that hashing leaves and interior nodes identically, plus duplicating an odd last leaf, lets two distinct trees collide to one root.** Bitcoin Optech 2012 (https://bitcoinops.org/en/topics/merkle-tree-vulnerabilities/). Implication: do NOT hand-roll an odd-leaf-duplicating Merkle tree — hash the ordered steps as one canonical JSON array whose explicit length commits to exactly N steps.
48
+ 16. **(Q2) RFC 8032 specifies Ed25519 signatures as deterministic — the per-signature nonce is hash-derived, not random — so signing the same bytes under the same key yields byte-identical signatures.** Josefsson & Liusvaara 2017 (https://www.rfc-editor.org/rfc/rfc8032.html). Implication: the prism Ed25519 receipt is a stable, re-derivable pin — the lock stores the receipt id plus a hash of the signed blob, and `--verify` flags drift if those bytes change, without the CLI verifying the signature itself.
49
+ 17. **(Q3) in-toto link metadata records, per step, materials (input hashes), products (output hashes), the command, byproducts, and environment, with MATCH rules chaining one step's products to the next step's materials.** Torres-Arias et al. 2019 (https://www.usenix.org/conference/usenixsecurity19/presentation/torres-arias). Implication: a lock step mirrors the link shape — materials = prompt + tool-schema hashes, product = output hash, command = resolved model + params, actor = verifier receipt.
50
+ 18. **(Q3) The in-toto attestation Statement binds a predicate to artifacts purely by digest, matching subjects regardless of name or content type.** in-toto Attestation Framework 2023 (https://github.com/in-toto/attestation/blob/main/spec/v1/statement.md). Implication: pin every input by its digest, never by filename or alias — the digest is the anchor, the name is cosmetic.
51
+ 19. **(Q3) SLSA provenance splits buildDefinition (external/internal parameters, resolvedDependencies with digests) from runDetails (builder.id, metadata, byproducts) and documents inputs for rebuild rather than guaranteeing reproducibility.** OpenSSF SLSA 2023 (https://slsa.dev/spec/v1.0/provenance). Implication: licenses the lock's honest ceiling and tells it to record the RESOLVED model id as the builder.id-equivalent — aliases float and break replay.
52
+ 20. **(Q3) SLSA verification is a strict comparison against pre-declared expectations that rejects unrecognized external parameters rather than ignoring them.** OpenSSF SLSA 2023 (https://slsa.dev/spec/v1.0/verifying-artifacts). Implication: `lock --verify` is strict-match — every pinned field must equal the re-derived value AND unknown/extra fields fail, not pass silently.
53
+ 21. **(Q3) SCITT issues a Receipt — an offline-verifiable inclusion proof — for a registered Signed Statement, so a third party can verify registration without re-contacting the service.** Birkholz et al. 2025 (https://datatracker.ietf.org/doc/html/draft-ietf-scitt-architecture-22). Implication: store the verifier run_id + receipt chain hash so `lock --verify` confirms the verification happened, offline, without re-running the verifier.
54
+ 22. **(Q3) Sigstore's Rekor is an append-only transparency log whose entries are immutable, making tampering detectable rather than impossible, and it pairs ephemeral keys with the log so a discarded key still yields a verifiable record.** Newman et al. 2022 (DOI:10.1145/3548606.3560596). Implication: the lock makes drift/tampering DETECTABLE, not outputs unforgeable — the correct, honest claim for an offline CLI whose signing key is ephemeral.
55
+ 23. **(Q3) DSSE signs over a length-prefixed pre-authentication encoding of (type, body) and forbids verifiers from re-parsing the envelope to extract the payload.** Secure Systems Lab 2021 (https://github.com/secure-systems-lab/dsse/blob/master/protocol.md). Implication: hash the byte-exact prompt/tool-schema with their length and type bound in, and have `--verify` hash the STORED bytes rather than re-canonicalizing parsed JSON — re-parsing reintroduces the canonicalization attack surface DSSE was built to remove.
56
+ 24. **(Q3) Git is content-addressable: an object's key is the hash of a typed header (the type plus byte-length plus a NUL) concatenated with its exact content.** Chacon & Straub 2014 (https://git-scm.com/book/en/v2/Git-Internals-Git-Objects). Implication: compute `lock_sha256` over a typed, length-framed canonical encoding so a one-byte change flips the address — the proven content-address recipe.
57
+ 25. **(Q3) W3C PROV models a step as an Activity that used its inputs and wasAssociatedWith an agent, with each output Entity wasGeneratedBy that Activity and steps chained by wasInformedBy / wasDerivedFrom.** Lebo et al. 2013 (https://www.w3.org/TR/prov-o/). Implication: confirms that the minimal per-step quadruple (input hashes, output hashes, an actor given as model + verifier, and a rollup link) is complete lineage; nothing more is required.
58
+ 26. **(Q4) The dominant cause of LLM inference nondeterminism is lack of batch invariance: kernel reduction order shifts with batch size (server load), so temperature-0 prompts still vary; standard vLLM gave 80 unique completions across 1000 samples of one identical prompt.** He & Thinking Machines Lab 2025 (https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference/). Implication: batch size is outside the lock's control, so it records output hashes for drift and never promises identical outputs.
59
+ 27. **(Q4) Under greedy decoding, BF16 LLM outputs diverge across hardware from floating-point non-associativity, with up to ~9% accuracy variation and large response-length differences.** Yuan et al. 2025 (arXiv:2506.09501). Implication: fixed decoding params do not yield reproducible outputs across infrastructure, so an output-hash mismatch is expected drift to flag, not a failure of the lock.
60
+ 28. **(Q4) Five LLMs configured at temperature 0 rarely produced identical raw output strings across reruns, with accuracy varying up to ~15%.** Atil et al. 2024 (arXiv:2408.04667). Implication: the strongest direct evidence that "deterministic settings" do not give byte-identical outputs — pin INPUTS byte-exact, only drift-detect OUTPUTS.
61
+ 29. **(Q4) The same commercial model silently changed behavior over months — GPT-4's prime-vs-composite accuracy fell from 84% to 51% between March and June 2023.** Chen, Zaharia & Zou 2023 (arXiv:2307.09009). Implication: an alias (or even a named model) can be re-tuned server-side, so the lock pins the RESOLVED model id and drift-detects to surface provider drift at re-run.
62
+ 30. **(Q4) OpenAI's API frames reproducibility as inputs-pinned plus a drift sentinel: identical seed and params are only "mostly" deterministic, and clients must watch the returned system_fingerprint for backend changes.** OpenAI 2025 (https://developers.openai.com/api/docs/guides/advanced-usage). Implication: a frontier provider already ships the lock's exact pattern — pin inputs, treat a fingerprint/hash mismatch on re-run as drift to flag, not a lock error.
63
+ 31. **(Q4) MoE serving adds routing-level nondeterminism: expert capacity and token-dropping mean a sequence's processing depends on which other sequences share its batch.** Sanseviero et al. 2023 (https://huggingface.co/blog/moe). Implication: for MoE-backed frontier models the lock can never promise output replay, reinforcing "replayable inputs + drift-detectable outputs."
64
+ 32. **(Q4) Achieving truly deterministic LLM output needs a runtime verify-and-rollback loop that replays nondeterministic tokens under consistent reduction schedules.** Gond et al. 2026 (arXiv:2601.17768). Implication: output determinism requires machinery far outside a zero-dependency offline CLI, so the lock defers any determinism claim to the harness and stays at the input-pin + output-drift layer.
65
+ 33. **(Q5) An MCP tool is defined by name + description + inputSchema (+ optional outputSchema/annotations) returned by tools/list, but the protocol ships no per-tool version or content hash.** MCP Spec 2025 (https://modelcontextprotocol.io/specification/2025-06-18/server/tools). Implication: `tool_schema_sha256` supplies the missing content-level detector — hash the canonicalized tools/list array the agent actually saw.
66
+ 34. **(Q5) MCP signals tool changes only via a runtime listChanged flag and versions just the protocol (a session date-string), not the individual tool schemas.** MCP Spec 2025 (https://modelcontextprotocol.io/specification/2025-06-18/basic/lifecycle). Implication: record the protocolVersion but never rely on it for drift — two runs can share it with a totally different tool set, so the content hash is the real guard.
67
+ 35. **(Q5) Anthropic injects tool definitions as JSON Schema verbatim into the constructed system prompt the model conditions on.** Anthropic 2025 (https://platform.claude.com/docs/en/agents-and-tools/tool-use/implement-tool-use). Implication: because the schema bytes condition the model, `tool_schema_sha256` must hash the exact schema — a changed description or an added enum value is real conditioning drift, not cosmetic.
68
+ 36. **(Q5) OpenAI strict-mode function calling requires additionalProperties false and every field marked required, rejecting a tool definition that violates the constraints.** OpenAI 2025 (https://developers.openai.com/api/docs/guides/function-calling). Implication: the CLI validates each captured tool schema is well-formed before hashing, so the lock pins a schema the run could actually have used.
69
+ 37. **(Q5) The same JSON Schema body can be evaluated differently under a different dialect, and MCP's unspecified dialect caused real cross-SDK validation failures resolved by defaulting to draft 2020-12.** JSON Schema Org 2025 (https://json-schema.org/understanding-json-schema/reference/schema). Implication: capture the effective `$schema` dialect inside the hashed bytes so a draft-07 to 2020-12 change flips the hash even when the properties are unchanged.
70
+ 38. **(Q5) W3C Subresource Integrity pins an algorithm-prefixed digest (e.g. the algorithm name then a base64 hash) and requires the agent to refuse content whose computed hash does not exactly match, erroring rather than warning.** Braun 2026 (https://www.w3.org/TR/sri-2/). Implication: adopt SRI's self-describing, algorithm-agile digest form and its fail-closed semantics for `lock --verify`.
71
+ 39. **(Q5) OpenAPI separates the spec-dialect version (openapi) from the document's own version (info.version) as distinct fields.** OpenAPI Initiative 2025 (https://spec.openapis.org/oas/v3.2.0.html). Implication: separate the schema-dialect version from any tool-surface version string the harness supplies, but treat the content hash — not either label — as the authoritative drift guard.
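Findings 9 to 12 combine into one hashing recipe for JSON that the CLI assembles itself: NFC-normalize the strings, serialize canonically, then SHA-256 the bytes. The sketch below is illustrative only. It is written in Python rather than against `node:crypto`, the names `prepare`, `canonical_bytes`, and `canonical_sha256` are hypothetical, and the serializer only approximates RFC 8785: it orders keys by UTF-16 code units but refuses floats instead of reproducing ECMAScript number formatting.

```python
import hashlib
import json
import unicodedata


def prepare(value):
    """NFC-normalize strings and order object keys by UTF-16 code units."""
    if isinstance(value, str):
        return unicodedata.normalize("NFC", value)
    if value is None or isinstance(value, (bool, int)):
        return value
    if isinstance(value, float):
        # Assumption of this sketch: ECMAScript number serialization is not
        # implemented, so floats are rejected rather than hashed inconsistently.
        raise TypeError("floats are out of scope for this sketch")
    if isinstance(value, list):
        return [prepare(item) for item in value]
    if isinstance(value, dict):
        items = [(unicodedata.normalize("NFC", k), prepare(v)) for k, v in value.items()]
        if len({k for k, _ in items}) != len(items):
            raise ValueError("two keys collapse to the same NFC form")
        # Big-endian UTF-16 bytes compare in the same order as UTF-16 code units.
        items.sort(key=lambda pair: pair[0].encode("utf-16-be"))
        return dict(items)
    raise TypeError(f"unsupported JSON type: {type(value).__name__}")


def canonical_bytes(value) -> bytes:
    """Serialize with no inter-token whitespace and encode as UTF-8."""
    text = json.dumps(prepare(value), ensure_ascii=False, separators=(",", ":"))
    return text.encode("utf-8")


def canonical_sha256(value) -> str:
    return hashlib.sha256(canonical_bytes(value)).hexdigest()


# Key order and spacing in the source object do not affect the hash.
a = {"name": "search", "inputSchema": {"type": "object"}}
b = {"inputSchema": {"type": "object"}, "name": "search"}
assert canonical_sha256(a) == canonical_sha256(b)
```

Inputs whose exact bytes are already captured (finding 12) would bypass this path and be hashed as stored.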
72
+
73
+ ## Step 4 — External verification
74
+
75
+ **Run against this dispatch's 39 citations through the LIVE runner before Step 5 was locked.** Synthesizer = Claude/Opus; verifier = the deterministic arXiv→Crossref retrieval oracle + a groundedness lens on **`mistral-small:24b`** (ModelFamily `local`, reasoning-stripped — `--caller-family anthropic` excludes the synthesizer's family by construction). Command: `roleos verify-citations examples/study-swarm-lock.dispatch.md --provider ollama` → `prism verify --type citations` (prism v1.6.0). **No verifier was Claude — the protocol did not grade its own homework.**
76
+
77
+ - [x] existence established by **retrieval, not memory** — the structured oracle for the academic subset, plus an independent in-session retrieval existence-audit for the rest. **0 fabricated.**
78
+ - [x] groundedness checked by a **different family** (`mistral-small:24b`, reasoning-stripped), where the oracle returned an abstract — abstract-less and oracle-timeout citations **escalated "retrieve manually," never auto-passed**.
79
+ - [x] **≥ 3 decorrelated lenses** (mechanism diversity, per the ensemble rule): the deterministic arXiv/Crossref oracle + the Mistral groundedness lens + the independent coverage-recovery retrieval existence-audit.
80
+
81
+ **Verdict: `escalate` (advisory, non-blocking; prism exit 30). 0 fabricated, 0 refused.** The gate *discriminated* by source type rather than rubber-stamping:
82
+
83
+ - **8 citations parsed** by the runner's arXiv/DOI extractor (the academic subset). **4 Crossref DOIs existence-RESOLVED:** Snakemake (`10.1093/bioinformatics/bts480`), Nextflow (`10.1038/nbt.3820`), Merkle (`10.1007/3-540-48184-2_32`), Sigstore (`10.1145/3548606.3560596`). Groundedness rendered a **real verdict on Snakemake** — `not_addressed` ("the claim is not in the title+abstract"), correct: the mtime-re-run behavior lives in the paper body, exactly the partial-support escalation the protocol exists to surface. The other three resolved but Crossref served no abstract, so groundedness could not run → escalate.
84
+ - **4 arXiv papers** (Yuan `2506.09501`, Atil `2408.04667`, Chen `2307.09009`, Gond `2601.17768`) → **arXiv oracle ReadTimeout**: this rig's IP was rate-limited by arXiv after the dispatch's heavy same-session traffic (confirmed: a direct combined `export.arxiv.org` request also returned empty during the cooldown). The runner escalated with **"RETRIEVE MANUALLY"** and never reported fabrication, which is the oracle-unavailable halt rule firing live. All four had been existence-confirmed earlier in the session by the coverage-recovery existence-audit.
85
+ - **31 spec / RFC / standard / vendor-doc citations** carry direct URLs — outside the arXiv/Crossref structured oracle's resolver — so the extractor reported them **`unparsed` (not arXiv/DOI), which is NOT fabrication**. Every one was existence-verified by **direct retrieval this session**: the research + coverage-recovery agents' existence-audit resolved all of them (and caught the corrections folded into Step 3), and a live spot-check returned **HTTP 200** for RFC 8785, the in-toto USENIX page, W3C SRI, and the MCP spec.
86
+ - **A first run flagged the in-toto citation `fabricated`** — a **Crossref-coverage gap**, not a real fabrication: the ACM/USENIX legacy `10.5555` DOI namespace is not indexed by Crossref. The paper is real (USENIX Security 2019). Corrected to its primary USENIX URL (finding 17); the false flag cleared. This is the "the oracle can't resolve *this identifier type* ≠ fabricated" lesson, observed live.
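The split reported above (8 parsed, 31 `unparsed`) follows from which identifier type each citation carries. A minimal sketch of that routing, assuming simple regular expressions and hypothetical names (`classify_citation`, `route`, and the status strings); the real extractor in the runner is not shown in this dispatch and may differ:

```python
import re

# Patterns are assumptions for illustration, matched against one finding line.
ARXIV_RE = re.compile(r"arXiv:(\d{4}\.\d{4,5})", re.IGNORECASE)
DOI_RE = re.compile(r"DOI:\s*(10\.\d{4,9}/[^\s)]+)", re.IGNORECASE)
URL_RE = re.compile(r"https?://[^\s)]+")


def classify_citation(text):
    """Return (kind, identifier) for one finding line."""
    match = ARXIV_RE.search(text)
    if match:
        return ("arxiv", match.group(1))
    match = DOI_RE.search(text)
    if match:
        return ("doi", match.group(1))
    match = URL_RE.search(text)
    if match:
        return ("unparsed", match.group(0))
    return ("missing", None)


def route(kind, oracle_status=None):
    """Map an identifier kind plus a hypothetical oracle status to a next action."""
    if kind in ("unparsed", "missing"):
        return "escalate"  # outside the structured resolver: retrieve manually
    if oracle_status == "resolved":
        return "check_groundedness"
    if oracle_status == "not_found":
        return "halt"
    return "escalate"  # timeout or unavailable is never treated as a pass


kind, ident = classify_citation("Yuan et al. 2025 (arXiv:2506.09501).")
print(kind, ident, route(kind, "timeout"))
```

The point of the design is the last branch of `route`: an unreachable oracle and an unrecognized identifier type both end in escalation, so neither can be mistaken for a verified citation or for a fabrication.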
87
+
88
+ **Receipt (captured + cryptographically verified).** prism Ed25519 receipt **`prism-01kwbajx31dj9gcf5xn3cn5ydg`**, chained into the `roleos-citation-receipt/v1` (`chain_sha256 499b63905064a5e25fd1801c5530504c94742f2183c4d3c8eb545a20cfbb112e`; each retrieval pin carries a `source_sha256` → drift-detectable). `prism replay <id>` → `prism verify-receipt --public-key <pub.pem>` → **`signature_valid: true`, exit 0** — verified with the **public key alone, no shared secret** (the cross-tool path a consumer uses to confirm a verdict it did not produce). Honest ceiling: the signing key is ephemeral and scratchpad-local, so the receipt buys third-party *verifiability*, not anti-forgery (prism's own disclosed limit). **This dispatch is the first to pin its own verifier receipt into its `dispatch.lock.json` (L10) — the feature closing the loop on the exact run that gated it.**
89
+
90
+ ## Step 5 — Architecture (`dispatch.lock.json`)
91
+
92
+ Each choice traces to findings by number. The shape:
93
+
94
+ ```json
+ {
+   "schema": "dispatch.lock/v1",
+   "study_swarm_version": "1.2.0",
+   "protocol_sha256": "<full sha256 of the vendored PROTOCOL.md>",
+   "dispatch_sha256": "<sha256 of the dispatch .md bytes>",
+   "steps": [
+     {
+       "question_id": "Q1-replay-manifest",
+       "resolved_model": "claude-opus-4-8",
+       "prompt_sha256": "sha256-<base64>",
+       "tool_schema_sha256": "sha256-<base64>",
+       "schema_dialect": "https://json-schema.org/draft/2020-12/schema",
+       "params": { "effort": "high" },
+       "output_sha256": "sha256-<base64>"
+     }
+   ],
+   "verification": {
+     "runner": "roleos verify-citations",
+     "tool": "prism verify --type citations",
+     "verifier_model": "mistral-small:24b",
+     "verifier_family": "local",
+     "receipt_id": "prism-...",
+     "receipt_chain_sha256": "sha256-<base64>"
+   },
+   "lock_sha256": "sha256-<base64>"
+ }
+ ```
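The `sha256-<base64>` placeholders in this shape use the algorithm-prefixed digest form from finding 38. As a rough illustration of how one such value could be produced for a text input, the sketch below applies the text normalization this dispatch relies on (BOM strip, CRLF to LF, NFC) and then formats the digest. It is Python for readability only, and `normalize_text` and `sri_digest` are hypothetical names, not the CLI's API.

```python
import base64
import hashlib
import unicodedata


def normalize_text(raw: bytes) -> bytes:
    """Strip a leading BOM, convert CRLF to LF, apply NFC; return UTF-8 bytes."""
    text = raw.decode("utf-8")
    if text.startswith("\ufeff"):
        text = text[1:]
    text = text.replace("\r\n", "\n")
    return unicodedata.normalize("NFC", text).encode("utf-8")


def sri_digest(data: bytes) -> str:
    """Format a SHA-256 digest as 'sha256-' followed by standard base64."""
    encoded = base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")
    return f"sha256-{encoded}"


# The same prompt saved on Windows (BOM + CRLF) and on Linux (LF) pins identically.
windows_copy = b"\xef\xbb\xbfAnswer Q1.\r\nCite sources.\r\n"
unix_copy = b"Answer Q1.\nCite sources.\n"
assert sri_digest(normalize_text(windows_copy)) == sri_digest(normalize_text(unix_copy))
```

Keeping the algorithm name inside the value means a later move to a stronger hash changes the prefix instead of silently changing what an unlabeled string means.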
122
+
123
+ - **L1 — One lock per dispatch; the lock IS the dispatch's content-address.** `lock_sha256` binds every per-step record and the verifier block together, so a replay cannot stitch a step from one dispatch onto another. (findings 4, 13, 17, 18)
124
+ - **L2 — The harness emits the record; the CLI canonicalizes + hashes + validates it.** This producer/verifier split is universal across the provenance literature, and it is what keeps the CLI zero-dependency, network-free, and deterministic — it never calls a model. (findings 17, 19, 21, 22)
125
+ - **L3 — Hash the prompt as normalized text, not JCS-restructured JSON.** The prompt is the literal string the model conditioned on, so it is hashed directly rather than canonicalized as JSON (the JWS/DSSE hash-known-bytes rule) — under one necessary text normalization (BOM strip + CRLF→LF + NFC), without which the same prompt hashes differently across platforms. (findings 12, 23; 10, 11)
126
+ - **L4 — Normalize every text input before hashing (BOM strip + CRLF→LF + NFC), and JCS-canonicalize the structured JSON (tool surface, lock body).** This is the only way the same dispatch hashes identically on Windows, macOS, and Linux — the prompt, the dispatch text, `PROTOCOL.md`, and every JSON string value all pass through it. (findings 9, 10, 11) *(CI caught a real CRLF drift here when an early build hashed raw `PROTOCOL.md` bytes — the fix is exactly this normalization, and a line-ending-invariance test now guards it.)*
127
+ - **L5 — Capture the tool surface as the canonicalized array `{name, description, inputSchema, outputSchema}` plus the effective JSON Schema dialect.** Neither MCP nor the provider APIs ship a per-tool version or hash, so the lock's content hash is the missing drift detector, and the dialect is part of the contract. (findings 7, 33, 34, 35, 37)
128
+ - **L6 — Pin the RESOLVED model id, never an alias.** A named model can be re-tuned server-side; the concrete platform identity is what makes a step replayable. (findings 19, 29)
129
+ - **L7 — `output_sha256` records outputs for DRIFT DETECTION, not determinism.** Pinning model + prompt + temperature does not yield bit-identical outputs (batch-invariance, FP non-associativity, MoE routing, provider drift), so the honest claim is **"replayable inputs + drift-detectable outputs,"** never "deterministic replay." (findings 8, 19, 26, 27, 28, 30, 31, 32)
130
+ - **L8 — `lock --verify` is fail-closed strict-match.** Re-derive every deterministic hash and assert equality; any mismatch — and any unrecognized field — exits non-zero, never auto-heals. (findings 5, 6, 20, 38)
131
+ - **L9 — Self-describing digests + domain separation in the rollup, hashed as one flat canonical array.** Digests carry their algorithm prefix; the canonical JSON object's distinct keys and the array's explicit length supply domain separation and sidestep the odd-leaf-duplication collision class. (findings 14, 15, 24, 38)
132
+ - **L10 — Pin the verifier receipt offline-verifiably.** Store `receipt_id` + `receipt_chain_sha256`; `lock --verify` confirms the verification happened without re-contacting the verifier. The Ed25519 receipt is a stable pin; the ephemeral key buys verifiability, not anti-forgery — stated, not oversold. (findings 16, 21, 22)
133
+ - **L11 — Pin content, not mtimes.** The lock is robust to file-touch noise by construction. (findings 1, 3)
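Choices L1, L8, and L9 can be read together as a rollup plus a strict check. The following is a sketch under stated assumptions, not the CLI's implementation: it is Python, it uses `json.dumps(sort_keys=True)` as a stand-in for JCS, it assumes the rollup covers every field except `lock_sha256` itself, and `digest`, `rollup`, and `verify` are hypothetical names. Output hashes are deliberately not compared, since L7 treats output mismatch as drift to flag and not as a verify failure.

```python
import base64
import hashlib
import json

TOP_FIELDS = {"schema", "study_swarm_version", "protocol_sha256", "dispatch_sha256",
              "steps", "verification", "lock_sha256"}
STEP_FIELDS = {"question_id", "resolved_model", "prompt_sha256", "tool_schema_sha256",
               "schema_dialect", "params", "output_sha256"}


def digest(data: bytes) -> str:
    return "sha256-" + base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")


def rollup(lock: dict) -> str:
    """Hash the whole lock body, steps included as one ordered array."""
    body = {key: value for key, value in lock.items() if key != "lock_sha256"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return digest(canonical.encode("utf-8"))


def verify(lock: dict, prompt_bytes: dict) -> list:
    """Return a list of errors; an empty list means the lock verifies."""
    errors = []
    for name in sorted(set(lock) - TOP_FIELDS):
        errors.append(f"unknown top-level field: {name}")
    for index, step in enumerate(lock.get("steps", [])):
        for name in sorted(set(step) - STEP_FIELDS):
            errors.append(f"steps[{index}]: unknown field: {name}")
        qid = step.get("question_id")
        if qid not in prompt_bytes:
            errors.append(f"steps[{index}]: no prompt supplied for {qid!r}")
        elif step.get("prompt_sha256") != digest(prompt_bytes[qid]):
            errors.append(f"steps[{index}]: prompt_sha256 mismatch")
    if lock.get("lock_sha256") != rollup(lock):
        errors.append("lock_sha256 mismatch")
    return errors


prompts = {"Q1-replay-manifest": b"What should a replay manifest pin?\n"}
lock = {"schema": "dispatch.lock/v1",
        "steps": [{"question_id": "Q1-replay-manifest",
                   "prompt_sha256": digest(prompts["Q1-replay-manifest"])}]}
lock["lock_sha256"] = rollup(lock)
assert verify(lock, prompts) == []
```

A command-line wrapper would exit non-zero whenever `verify` returns any error and would never rewrite the lock, which is the fail-closed behavior L8 describes.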
134
+
135
+ **Optional actionable drift output (design note, not a numbered citation):** beyond a flipped SHA, `lock --verify` may classify *what* changed in the tool surface (added tool = additive, removed/renamed parameter = breaking), mirroring the breaking/non-breaking split that the [oasdiff](https://github.com/oasdiff/oasdiff) OpenAPI differ surfaces in CI. This is a usability layer over the authoritative hash check, not a substitute for it.
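One way this classification could look, sketched in Python with hypothetical names (`classify_drift`, `param_names`). The two rules taken from the note are "added tool = additive" and "removed/renamed parameter = breaking"; treating an added required parameter as breaking is an extra assumption of this sketch, as is reading parameters from `inputSchema.properties`.

```python
def param_names(tool):
    """Parameter names declared under inputSchema.properties (empty if absent)."""
    return set(tool.get("inputSchema", {}).get("properties", {}))


def classify_drift(old_tools, new_tools):
    """Compare two tool surfaces and label each difference."""
    old = {tool["name"]: tool for tool in old_tools}
    new = {tool["name"]: tool for tool in new_tools}
    changes = []
    for name in sorted(new.keys() - old.keys()):
        changes.append(("additive", f"tool added: {name}"))
    for name in sorted(old.keys() - new.keys()):
        changes.append(("breaking", f"tool removed: {name}"))
    for name in sorted(old.keys() & new.keys()):
        before, after = param_names(old[name]), param_names(new[name])
        for param in sorted(before - after):
            changes.append(("breaking", f"{name}: parameter removed or renamed: {param}"))
        required = set(new[name].get("inputSchema", {}).get("required", []))
        for param in sorted(after - before):
            # Assumption: a new required parameter breaks existing callers.
            label = "breaking" if param in required else "additive"
            changes.append((label, f"{name}: parameter added: {param}"))
    return changes


old_surface = [{"name": "search", "inputSchema": {"properties": {"query": {}, "limit": {}}}}]
new_surface = [{"name": "search", "inputSchema": {"properties": {"q": {}, "limit": {}}}},
               {"name": "fetch", "inputSchema": {"properties": {}}}]
for label, message in classify_drift(old_surface, new_surface):
    print(label, message)
```

A rename shows up as one removal plus one addition, because names alone cannot prove two parameters are the same; the hash comparison stays the authority and this report only explains it.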
136
+
137
+ **Net:** the lock turns a study-swarm dispatch into a content-addressed manifest with byte-replayable inputs: resolved model, prompt, and tool schema are pinned per step, the verifier receipt is pinned once per dispatch, and everything is rolled into one `lock_sha256` and drift-checked fail-closed. It also tells the truth about its ceiling: it makes inputs replayable and outputs drift-detectable, not LLM outputs deterministic.