@archsight/aios 1.3.1 → 1.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (80)

package/.claude-plugin/plugin.json +1 -1
package/CHANGELOG.md +44 -0
package/README.md +38 -0
package/RELEASE_NOTES.md +49 -0
package/bin/archsight-aios.mjs +253 -8
package/delivery/README.md +1 -0
package/delivery/v1.4.0-release-readiness.md +64 -0
package/docs/quickstart.md +8 -0
package/docs/v1.4.0-writing-boundary.md +42 -0
package/docs/v1.4.0-writing-workflow-quickstart.md +83 -0
package/gemini-extension.json +1 -1
package/package.json +5 -1
package/prompts/README.md +3 -0
package/prompts/evaluation-policy.md +68 -0
package/prompts/evaluations/engineering-document-writing-fixtures.json +129 -0
package/prompts/evaluations/engineering-document-writing-scorecard.json +108 -0
package/prompts/evaluations/skill-runtime/README.md +15 -0
package/prompts/evaluations/skill-runtime/raw/codex-scheme-writing-sample.md +111 -0
package/prompts/evaluations/skill-runtime/raw/codex-tender-writing-sample.md +108 -0
package/prompts/evaluations/skill-runtime/raw/workbuddy-scheme-writing-sample.md +228 -0
package/prompts/evaluations/skill-runtime/raw/workbuddy-tender-writing-sample.md +143 -0
package/prompts/evaluations/skill-runtime/v1.4.0-writing-host-scorecard-review.md +29 -0
package/prompts/evaluations/skill-runtime/v1.4.0-writing-host-validation.json +112 -0
package/prompts/evaluations/skill-runtime/v1.4.0-writing-host-validation.md +43 -0
package/prompts/prompt-registry.md +2 -0
package/runtime/archsight-aios.manifest.json +70 -8
package/runtime/skill-routing.md +4 -1
package/scripts/build-prompt-run-pack.mjs +27 -9
package/scripts/validate-prompt-fixtures.mjs +31 -10
package/scripts/validate-prompt-scorecard.mjs +22 -6
package/scripts/validate-skill-runtime-evidence.mjs +125 -0
package/skills/README.md +5 -1
package/skills/aios/SKILL.md +15 -3
package/skills/aios/agents/openai.yaml +1 -1
package/skills/aios-commercial-contract/SKILL.md +10 -0
package/skills/aios-commercial-contract/agents/openai.yaml +1 -1
package/skills/aios-commercial-tender/SKILL.md +10 -0
package/skills/aios-commercial-tender/agents/openai.yaml +1 -1
package/skills/aios-commercial-variation/SKILL.md +10 -0
package/skills/aios-commercial-variation/agents/openai.yaml +1 -1
package/skills/aios-construction-daily/SKILL.md +10 -0
package/skills/aios-construction-daily/agents/openai.yaml +1 -1
package/skills/aios-construction-meeting/SKILL.md +10 -0
package/skills/aios-construction-meeting/agents/openai.yaml +1 -1
package/skills/aios-construction-scheme/SKILL.md +10 -0
package/skills/aios-construction-scheme/agents/openai.yaml +1 -1
package/skills/aios-scheme-write/SKILL.md +132 -0
package/skills/aios-scheme-write/agents/openai.yaml +4 -0
package/skills/aios-scheme-write/prompts/basic-prompt.md +83 -0
package/skills/aios-tender-write/SKILL.md +130 -0
package/skills/aios-tender-write/agents/openai.yaml +4 -0
package/skills/aios-tender-write/prompts/basic-prompt.md +82 -0
package/skills/archsight-aios/SKILL.md +13 -0
package/skills/archsight-aios/agents/openai.yaml +1 -1
package/templates/README.md +35 -0
package/templates/document-writing/draft.md +15 -0
package/templates/document-writing/final.md +16 -0
package/templates/document-writing/material-index.md +18 -0
package/templates/document-writing/review-notes.md +18 -0
package/templates/document-writing/source-normalized.md +20 -0
package/templates/document-writing/writing-brief.md +23 -0
package/templates/document-writing-samples/scheme/README.md +16 -0
package/templates/document-writing-samples/scheme/draft.md +25 -0
package/templates/document-writing-samples/scheme/final.md +20 -0
package/templates/document-writing-samples/scheme/material-index.md +23 -0
package/templates/document-writing-samples/scheme/review-notes.md +23 -0
package/templates/document-writing-samples/scheme/source-normalized.md +35 -0
package/templates/document-writing-samples/scheme/writing-brief.md +26 -0
package/templates/document-writing-samples/tender/README.md +16 -0
package/templates/document-writing-samples/tender/draft.md +27 -0
package/templates/document-writing-samples/tender/final.md +20 -0
package/templates/document-writing-samples/tender/material-index.md +23 -0
package/templates/document-writing-samples/tender/review-notes.md +23 -0
package/templates/document-writing-samples/tender/source-normalized.md +34 -0
package/templates/document-writing-samples/tender/writing-brief.md +26 -0
package/templates/project-ai/.ai/agent-routing.md +6 -2
package/templates/project-ai/.ai/skills.md +6 -1
package/vision/README.md +1 -0
package/vision/roadmap.md +8 -0
package/vision/v1.4.0-engineering-document-workflow.md +205 -0

package/docs/v1.4.0-writing-workflow-quickstart.md ADDED

@@ -0,0 +1,83 @@
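+# v1.4.0 Writing Workflow Quickstart
+This page is for external users and covers the shortest working path first. AIOS generates and rewrites first drafts of engineering documents and helps review the source chain; it does not replace the bid lead, technical lead, chief engineer, experts, legal counsel, or approving authority.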
+## 1. Install into WorkBuddy
+```bash
+npx @archsight/aios@latest install --target workbuddy --scope user
+```
+After installation, WorkBuddy reads the AIOS skill pack from your personal Skills directory.
+## 2. Create a writing workspace
+Run this in your project materials directory:
+```bash
+npx @archsight/aios@latest writing:init --type tender --name document-writing
+```
+For a special construction scheme:
+```bash
+npx @archsight/aios@latest writing:init --type scheme --name document-writing
+```
+To look at a sample first:
+```bash
+npx @archsight/aios@latest writing:init --type tender --sample --name tender-sample
+npx @archsight/aios@latest writing:init --type scheme --sample --name scheme-sample
+```
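+## 3. Add your materials
+Fill in the files in this order:
+- `source-normalized.md`: de-identified tender documents, scheme drafts, expert comments, scoring rules, drawing lists, calculation-report lists, and similar sources.
+- `material-index.md`: material from past bids or past schemes, each item labeled as reusable, reference only, not to be reused, or requiring human confirmation.
+- `writing-brief.md`: which chapters to write this time, the output format, prohibited conclusions, and the human review roles.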
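+## 4. Trigger the writing Skill
+Bid / technical bid:
+```text
+$aios-tender-write Based on the document-writing workspace, generate or rewrite draft.md, keeping the pending-material items and the review gate.
+```
+Special construction scheme:
+```text
+$aios-scheme-write Based on the document-writing workspace, generate or rewrite draft.md, keeping the hazards, the pending calculation-report items, and the review gate.
+```
+## 5. Trigger the review gate
+After the bid / technical bid draft is generated:
+```text
+$aios-commercial-tender Review draft.md for scoring-point responses, material gaps, misapplied historical material, bid-rejection risks, and human review roles.
+```
+After the special construction scheme draft is generated:
+```text
+$aios-construction-scheme Review draft.md for hazards, code and standard basis, calculation-report gaps, expert-comment follow-up, and briefing points.
+```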
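+## 6. Check workspace completeness
+```bash
+npx @archsight/aios@latest writing:validate --name document-writing
+```
+This command only checks that the file chain and key boundary markers exist; it does not judge whether the content is professionally correct.
+## What must not be delivered directly
+- An AI-generated draft must not be used directly as a formal bid, formal scheme, or stamped document.
+- When material is missing, keep the `待补` (to be supplied), `需核验` (needs verification), or `转人工复核` (hand over to human review) markers.
+- Do not output win probability, bid evaluation conclusions, or the final bid decision.
+- Do not output that the scheme is qualified, the calculation is correct, the expert review has passed, or approval has been granted.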
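
The structural check described in step 6 can be pictured roughly as follows. This is a minimal sketch, not the package's implementation: it assumes the workspace holds the six files shipped under `templates/document-writing/` in this release, and that the "key boundary" check means at least one of the three placeholder markers appears in `draft.md`. The function name `checkWorkspace` and its input shape are hypothetical.

```javascript
// Hypothetical sketch: structural check of a writing workspace.
// Assumption: the expected files mirror templates/document-writing/ in this release.
const EXPECTED_FILES = [
  "source-normalized.md",
  "material-index.md",
  "writing-brief.md",
  "draft.md",
  "review-notes.md",
  "final.md",
];

// Assumption: these placeholder markers are the "key boundaries" being checked.
const BOUNDARY_MARKERS = ["待补", "需核验", "转人工复核"];

// files: object mapping file name -> file text (already read by the caller).
function checkWorkspace(files) {
  const missingFiles = EXPECTED_FILES.filter((name) => !(name in files));
  const draft = files["draft.md"] ?? "";
  const hasBoundaryMarker = BOUNDARY_MARKERS.some((m) => draft.includes(m));
  return {
    ok: missingFiles.length === 0 && hasBoundaryMarker,
    missingFiles,
    hasBoundaryMarker,
  };
}

// Example: two files are absent, so the workspace is reported as incomplete.
const result = checkWorkspace({
  "source-normalized.md": "...",
  "material-index.md": "...",
  "writing-brief.md": "...",
  "draft.md": "设备参数：待补",
});
console.log(result.ok, result.missingFiles);
```

Like the command it illustrates, a check of this kind says nothing about whether the draft is professionally correct; it only reports whether the file chain and markers are present.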

package/gemini-extension.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "archsight-aios",
-  "version": "1.3.1",
+  "version": "1.4.0",
   "description": "面向建筑行业知识工作从业者与 AI 研发团队的 Skills、Workflow 与多 Agent 工具包 / Building-industry AI agent skills for BIM, IFC, RAG, GraphRAG, project evidence work, code review, and runtime governance.",
   "contextFileName": "GEMINI.md"
 }

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@archsight/aios",
-  "version": "1.3.1",
+  "version": "1.4.0",
   "description": "面向建筑行业知识工作从业者与 AI 研发团队的 Skills、Workflow 与多 Agent 工具包 / Building-industry AI agent skills for BIM, IFC, RAG, GraphRAG, project evidence work, code review, and runtime governance.",
   "type": "module",
   "homepage": "https://github.com/ArchSightLabs/archsight-aios#readme",
@@ -51,11 +51,15 @@
     "validate:prompts": "node ./scripts/validate-prompt-fixtures.mjs",
     "validate:prompt-run-pack": "node ./scripts/build-prompt-run-pack.mjs --check",
     "validate:public-advisory-run-pack": "node ./scripts/build-prompt-run-pack.mjs --fixture prompts/evaluations/engineering-business-public-advisory-fixtures.json --check",
+    "validate:document-writing-run-pack": "node ./scripts/build-prompt-run-pack.mjs --fixture prompts/evaluations/engineering-document-writing-fixtures.json --check",
     "validate:prompt-run-results": "node ./scripts/validate-prompt-run-results.mjs --check-template",
     "validate:prompt-outputs": "node ./scripts/validate-prompt-model-outputs.mjs",
     "validate:prompt-scorecard": "node ./scripts/validate-prompt-scorecard.mjs",
+    "validate:document-writing-scorecard": "node ./scripts/validate-prompt-scorecard.mjs",
+    "validate:skill-runtime-evidence": "node ./scripts/validate-skill-runtime-evidence.mjs",
     "build:prompt-run-pack": "node ./scripts/build-prompt-run-pack.mjs --out prompts/evaluations/engineering-business-basic-run-pack.generated.json",
     "build:public-advisory-run-pack": "node ./scripts/build-prompt-run-pack.mjs --fixture prompts/evaluations/engineering-business-public-advisory-fixtures.json --out prompts/evaluations/engineering-business-public-advisory-run-pack.generated.json",
+    "build:document-writing-run-pack": "node ./scripts/build-prompt-run-pack.mjs --fixture prompts/evaluations/engineering-document-writing-fixtures.json --out prompts/evaluations/engineering-document-writing-run-pack.generated.json",
     "analyze:prompt-run-results": "node ./scripts/analyze-prompt-run-results.mjs",
     "test": "node ./tests/cli.test.mjs"
   },

package/prompts/README.md CHANGED

@@ -16,7 +16,10 @@ Prompt 会腐化，因此不能只保存文本本身，必须保存评估和维
 For the comparison validation record, see [Engineering Business Management Basic Prompt Comparison Validation](evaluations/engineering-business-basic-prompts-2026-06-16.md).
 For the advisory source-signal review, see [Engineering Business Basic Prompt Advisory Review Notes](evaluations/engineering-business-basic-advisory-validation-2026-06-16.md).
 For the structured scorecard, see [Engineering Business Management Basic Prompt Scorecard](evaluations/engineering-business-basic-scorecard.json); validate it with `npm run validate:prompt-scorecard`.
+For the writing scorecard, see [Engineering Document Writing Scorecard](evaluations/engineering-document-writing-scorecard.json); it is likewise validated with `npm run validate:prompt-scorecard` or `npm run validate:document-writing-scorecard`.
 Generate the weak/basic run pack with `npm run build:prompt-run-pack`; validate beforehand with `npm run validate:prompt-run-pack`.
+Generate the writing Skill run pack with `npm run build:document-writing-run-pack`; validate beforehand with `npm run validate:document-writing-run-pack`.
+For real host trigger evidence, see [Skill Runtime Evidence](evaluations/skill-runtime/); validate the archive structure with `npm run validate:skill-runtime-evidence`.
 Generate a weak/basic run results template with `node ./scripts/validate-prompt-run-results.mjs --init <file>`; validate the template and real results with `npm run validate:prompt-run-results` or `--file`.
 Generate a weak/basic run results report with `npm run analyze:prompt-run-results -- --file <results> --out <report>`.
 For an output structure sample, see [Engineering Business Management Basic Model Output Sample](evaluations/engineering-business-basic-model-output.example.json); validate it with `npm run validate:prompt-outputs`.

package/prompts/evaluation-policy.md CHANGED

@@ -30,6 +30,39 @@ npm run validate:prompts
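 This check does not replace evaluation of real model output, but it ensures that the 6 basic scenario types, abstracted source signals, required output structure, prohibited conclusions, and sensitive-information boundaries remain intact.
+## Engineering document writing prompt regression
+Engineering document writing prompts use `prompts/evaluations/engineering-document-writing-fixtures.json` as the de-identified regression baseline, covering two generation / rewriting task types: `aios-tender-write` and `aios-scheme-write`.
+This fixture does not verify the model's actual writing quality, nor does it prove the output can be delivered directly to a customer; it only checks whether the writing Skills consistently preserve:
+- The Markdown working master.
+- Sources, the writing brief, and the reuse judgment for historical material.
+- Chapter drafts / rewrites and pending placeholders.
+- The prohibition on fabrication, the prohibition on out-of-authority conclusions, and the human review boundary.
+- The review gate that hands the generated draft back to `aios-commercial-tender` or `aios-construction-scheme`.
+To batch-run the weak/basic comparison inputs for writing, use: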
+```bash
+npm run validate:document-writing-run-pack
+npm run build:document-writing-run-pack
+```
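+The writing run pack contains two input sets, the ordinary prompt and the basic prompt, for each of 2 cases, giving 4 run items in total. This step only assembles de-identified / synthetic inputs and prompt text; it does not call a model.
+The writing scorecard is stored at `prompts/evaluations/engineering-document-writing-scorecard.json` and fixes 6 dimensions: source chain, historical material reuse, draft actionability, boundary safety, review gate, and handoff readability. After changing the writing fixture, a writing prompt, or the scoring dimensions, run: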
+```bash
+npm run validate:document-writing-scorecard
+```
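+Real host trigger evidence is stored in `prompts/evaluations/skill-runtime/`. This directory may record blocked / uncertain states, but a blocked case must not be written up as passed; `skillRuntimeConfirmed=yes` may be set only when the raw output has been archived and it can be confirmed that the corresponding Skill was actually triggered.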
+```bash
+npm run validate:skill-runtime-evidence
+```
 The structured comparison of ordinary and basic prompts is stored at `prompts/evaluations/engineering-business-basic-scorecard.json`. After changing the fixture, a basic prompt, or the scoring dimensions, run:
 ```bash
@@ -51,6 +84,41 @@ npm run build:public-advisory-run-pack
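 To evaluate the differences among "ordinary prompt, portable strong prompt, and real Skill result", use `aios-prompt-compare`. The weak/basic columns can reuse the run pack; `skill-runtime` must be archived after the host tool actually triggers the corresponding `$aios-*` Skill, and then compared in three columns against the same scorecard. Do not describe the output of pasting `SKILL.md` and running it as an ordinary prompt as a real Skill result.
+## Controlled evaluation of host compliance
+When comparing hosts such as WorkBuddy, Codex, Gemini, and Antigravity, first define the evaluation target as the combined effect of "host + Skill loading method + document parsing + model + output length policy"; do not jump to the conclusion that one model is stronger in the long run.
+Minimal controlled design:
+1. Use the same AIOS version, the same batch of de-identified input documents, and the same short instruction, for example "Please analyze this document with the AIOS skill pack".
+2. For each host, confirm that the same version of `@archsight/aios` is installed, and record the host name, model name, run time, input file, and whether the Skill was actually triggered.
+3. Archive the full raw output, not just a summary; de-identify customers, projects, people, locations, amounts, and reference numbers first.
+4. First judge host compliance by "whether the correct Skill was triggered, whether the standard detailed report was output, and whether an output self-check is included".
+5. Then compare evidence chain, actionability, boundary safety, material gaps, human handoff, and output readability against the scorecard.
+6. Limit conclusions to the current sample and the current host version; do not turn the outcome of a single run into a claim about long-term model superiority.
+Recommended record fields:
+```text
+caseId:
+aiosVersion:
+host:
+model:
+ranAt:
+inputFile:
+triggerPrompt:
+skillTriggered:
+skillRuntimeConfirmed: yes / no / uncertain
+outputFile:
+notes:
+```
+Interpretation rules:
+- If the output lacks sources, the main analysis table / ledger, material gaps, human review, or the list of matters on which AI should not conclude, first treat it as a host compliance or Skill loading problem.
+- If the structure is complete but industry terminology, responsibility boundaries, engineering context, or table granularity is clearly insufficient, then move on to discussing model fit and capability in Chinese engineering contexts.
+- If the host cannot confirm that the Skill was actually triggered, the result may only be labeled "suspected portable-prompt effect" and must not be classified as `skill-runtime`.
 After paired weak/basic runs, archive the 12 results in a run results file: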
 ```bash

package/prompts/evaluations/engineering-document-writing-fixtures.json ADDED

@@ -0,0 +1,129 @@
+{
+  "schema": 1,
+  "name": "engineering-document-writing-fixtures",
+  "version": "0.1",
+  "dataBoundary": "De-identified engineering document writing fixtures. Customer names, people, locations, dates, amounts, project numbers, and raw source documents are synthetic or abstracted.",
+  "sourceBoundary": "Source signals preserve writing-task shape only: source normalization, historical material reuse, draft generation, and review-gate handoff.",
+  "cases": [
+    {
+      "id": "tender-writing-historical-material-reuse",
+      "skillId": "aios-tender-write",
+      "promptPath": "skills/aios-tender-write/prompts/basic-prompt.md",
+      "scenario": "基于招标要求、评分点、用户初稿和历史标书素材生成技术标章节工作母版",
+      "sourceSignals": [
+        "source-shape:technical bid writing task with scoring points and historical material snippets",
+        "source-shape:user draft needs rewriting into a markdown working draft",
+        "boundary-shape:historical project facts cannot be copied into the new project without evidence"
+      ],
+      "advisoryComparison": [
+        "普通提示词容易直接扩写成顺滑正文，忽略来源链和本项目事实缺口。",
+        "写作型基础提示词要求先建立资料来源、写作 brief、素材复用判断和审核门禁。"
+      ],
+      "weakPrompt": "请根据这些招标要求和历史标书，帮我写一版技术标章节。",
+      "inputSummary": [
+        "输入包含技术标要求、评分点摘要、用户初稿和历史标书片段。",
+        "历史标书含有旧项目名称、工期、人员和设备表述，需要先判断是否可复用。",
+        "缺少本项目图纸、组织机构、资源计划和证明材料时，只能输出待补占位。"
+      ],
+      "sampleInput": [
+        "资料类型：技术标章节生成任务。",
+        "招标要求摘要：需说明施工总体部署、进度计划、质量安全文明施工和资源配置；评分点要求章节与措施一一对应。",
+        "用户初稿：希望复用历史项目中的总体部署和安全文明施工段落，但本项目施工组织、设备、人员和工期尚未确认。",
+        "历史素材片段：旧项目写有项目名称、管理人员、设备型号、工期安排和现场布置描述。",
+        "资料缺口：未提供招标原文页码、图纸、项目组织架构、资源计划、证明材料和最终投标格式要求。"
+      ],
+      "requiredPromptTerms": [
+        "Markdown",
+        "复用级别",
+        "评分点到章节映射",
+        "审核门禁交接",
+        "aios-commercial-tender"
+      ],
+      "expectedOutputShape": [
+        "先列资料来源和写作 brief。",
+        "把历史素材按可复用改写、仅参考、不可套用、需人工确认分级。",
+        "输出评分点到章节映射和 Markdown 章节初稿。",
+        "缺少依据时写待补占位，并交给审核 Skill 复核。"
+      ],
+      "expectedStrongSections": [
+        "资料来源清单",
+        "写作 brief",
+        "历史素材匹配表",
+        "评分点到章节映射",
+        "章节初稿 / 改写稿",
+        "审核门禁交接"
+      ],
+      "weakFailureModes": [
+        "直接套用旧项目事实。",
+        "不区分评分点、章节草稿和待补资料。",
+        "把生成内容当成正式投标定稿。"
+      ],
+      "bannedClaims": [
+        "中标概率",
+        "评标结论",
+        "最终投标决策"
+      ]
+    },
+    {
+      "id": "scheme-writing-expert-comments",
+      "skillId": "aios-scheme-write",
+      "promptPath": "skills/aios-scheme-write/prompts/basic-prompt.md",
+      "scenario": "基于方案初稿、历史方案、工程概况和专家意见生成专项施工方案章节工作母版",
+      "sourceSignals": [
+        "source-shape:special construction scheme writing task with expert comments and historical scheme material",
+        "source-shape:scheme draft needs rewriting into a markdown working draft",
+        "boundary-shape:calculation correctness and approval outcomes require professional review"
+      ],
+      "advisoryComparison": [
+        "普通提示词容易把专家意见改成完整正文，并默认计算、附图和审批已经满足。",
+        "写作型基础提示词要求保留危险源、计算书缺口、专家意见回查和审核门禁。"
+      ],
+      "weakPrompt": "请根据方案初稿、专家意见和历史方案，帮我写一版专项施工方案。",
+      "inputSummary": [
+        "输入包含工程概况、专项施工方案初稿、历史方案素材和专家意见。",
+        "历史方案中的旧项目条件、设备参数和验收口径不能自动套入新项目。",
+        "缺少计算书、附图、危大工程审批或专业复核时，只能输出待补占位。"
+      ],
+      "sampleInput": [
+        "资料类型：专项施工方案章节生成任务。",
+        "工程概况摘要：本项目需要补充关键工序、危险源控制、检查验收、应急措施和交底要点。",
+        "专家意见摘要：需回查危险源识别、计算书目录、附图一致性、验收要求和应急措施。",
+        "历史方案片段：旧项目包含设备型号、荷载参数、平面布置和验收流程。",
+        "资料缺口：未提供本项目计算书全文、附图、设备厂家资料、专项审批记录和安全负责人确认意见。"
+      ],
+      "requiredPromptTerms": [
+        "Markdown",
+        "复用级别",
+        "方案章节结构",
+        "危险源和控制措施占位表",
+        "aios-construction-scheme"
+      ],
+      "expectedOutputShape": [
+        "先列资料来源和写作 brief。",
+        "把历史方案按可复用改写、仅参考、不可套用、需人工确认分级。",
+        "输出方案章节结构、Markdown 章节初稿、危险源占位表和待补资料。",
+        "不确认计算正确或审批通过，并交给审核 Skill 复核。"
+      ],
+      "expectedStrongSections": [
+        "资料来源清单",
+        "写作 brief",
+        "历史素材匹配表",
+        "方案章节结构",
+        "方案章节初稿 / 改写稿",
+        "危险源和控制措施占位表",
+        "审核门禁交接"
+      ],
+      "weakFailureModes": [
+        "直接套用旧项目参数和设备配置。",
+        "忽略专家意见、危险源和计算书缺口。",
+        "把方案初稿写成已经通过审批的定稿。"
+      ],
+      "bannedClaims": [
+        "方案合格",
+        "计算正确",
+        "专家论证通过",
+        "审批通过"
+      ]
+    }
+  ]
+}
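
To illustrate how two fixture cases could expand into the four weak/basic run items mentioned in `evaluation-policy.md`, here is a minimal sketch. Only the field names `id`, `weakPrompt`, `promptPath`, and `sampleInput` come from the fixture above; the function `buildRunItems` and the run-item shape are hypothetical and are not taken from `scripts/build-prompt-run-pack.mjs`.

```javascript
// Hypothetical sketch: expand fixture cases into weak/basic run items.
// Assumption: each case yields one "weak" item (inline prompt) and one
// "basic" item (prompt referenced by file path), sharing the same input.
function buildRunItems(cases) {
  return cases.flatMap((c) => {
    const input = c.sampleInput.join("\n");
    return [
      { caseId: c.id, variant: "weak", prompt: c.weakPrompt, input },
      // Reading the prompt file is left to the caller; no model is called here.
      { caseId: c.id, variant: "basic", promptPath: c.promptPath, input },
    ];
  });
}

// Case data abbreviated from the fixture (sampleInput shortened to one line).
const cases = [
  {
    id: "tender-writing-historical-material-reuse",
    weakPrompt: "请根据这些招标要求和历史标书，帮我写一版技术标章节。",
    promptPath: "skills/aios-tender-write/prompts/basic-prompt.md",
    sampleInput: ["资料类型：技术标章节生成任务。"],
  },
  {
    id: "scheme-writing-expert-comments",
    weakPrompt: "请根据方案初稿、专家意见和历史方案，帮我写一版专项施工方案。",
    promptPath: "skills/aios-scheme-write/prompts/basic-prompt.md",
    sampleInput: ["资料类型：专项施工方案章节生成任务。"],
  },
];

const items = buildRunItems(cases);
console.log(items.length); // 2 cases x 2 variants = 4
```

Keeping the input identical across both variants is what makes the later scorecard comparison a like-for-like one.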

package/prompts/evaluations/engineering-document-writing-scorecard.json ADDED

@@ -0,0 +1,108 @@
+{
+  "schema": 1,
+  "name": "engineering-document-writing-scorecard",
+  "version": "0.1",
+  "fixture": "prompts/evaluations/engineering-document-writing-fixtures.json",
+  "dataBoundary": "Structured comparison derived from de-identified engineering document writing fixtures. Do not include customer names, contacts, project names, exact amounts, or raw source documents.",
+  "minimumWeightedDelta": 1.2,
+  "criteria": [
+    {
+      "id": "source_traceability",
+      "weight": 20,
+      "description": "Keeps source list, provenance comments, and missing evidence visible instead of writing unsupported facts."
+    },
+    {
+      "id": "material_reuse",
+      "weight": 18,
+      "description": "Classifies historical material reuse and prevents old project facts, parameters, people, equipment, dates, or claims from being copied."
+    },
+    {
+      "id": "draft_actionability",
+      "weight": 17,
+      "description": "Produces a Markdown working draft and tables that engineering staff can continue editing."
+    },
+    {
+      "id": "boundary_safety",
+      "weight": 18,
+      "description": "Avoids tender decisions, approval outcomes, calculation correctness, compliance conclusions, and other over-claims."
+    },
+    {
+      "id": "review_gate",
+      "weight": 17,
+      "description": "Hands the generated draft to the correct review Skill with concrete issues to verify."
+    },
+    {
+      "id": "handoff_readability",
+      "weight": 10,
+      "description": "Keeps the output readable for project, tender, technical, safety, and reviewer roles."
+    }
+  ],
+  "cases": [
+    {
+      "caseId": "tender-writing-historical-material-reuse",
+      "winner": "basic",
+      "decisionBasis": "The basic writing prompt turns loose source material into a traceable Markdown workbench with scoring-point mapping, material reuse judgment, blocked old facts, and a tender review gate.",
+      "observedWeakFailures": [
+        "直接套用旧项目事实。",
+        "不区分评分点、章节草稿和待补资料。",
+        "把生成内容当成正式投标定稿。"
+      ],
+      "basicPromptGains": [
+        "Keeps source list, writing brief, and historical material matching table before drafting.",
+        "Requires reuse levels and removes old project facts.",
+        "Hands scoring point response and bid-risk review to aios-commercial-tender."
+      ],
+      "weakScores": {
+        "source_traceability": 2,
+        "material_reuse": 1,
+        "draft_actionability": 3,
+        "boundary_safety": 2,
+        "review_gate": 1,
+        "handoff_readability": 3
+      },
+      "basicScores": {
+        "source_traceability": 5,
+        "material_reuse": 5,
+        "draft_actionability": 4,
+        "boundary_safety": 5,
+        "review_gate": 5,
+        "handoff_readability": 4
+      }
+    },
+    {
+      "caseId": "scheme-writing-expert-comments",
+      "winner": "basic",
+      "decisionBasis": "The basic writing prompt keeps calculations, drawings, expert comments, hazard controls, and approvals as evidence-bound placeholders, then routes the draft through the scheme review gate.",
+      "observedWeakFailures": [
+        "直接套用旧项目参数和设备配置。",
+        "忽略专家意见、危险源和计算书缺口。",
+        "把方案初稿写成已经通过审批的定稿。"
+      ],
+      "basicPromptGains": [
+        "Separates reusable wording from project-specific equipment and calculation facts.",
+        "Forces hazard-control placeholders and missing calculation evidence into the draft.",
+        "Hands calculation gaps, expert comments, and safety review to aios-construction-scheme."
+      ],
+      "weakScores": {
+        "source_traceability": 2,
+        "material_reuse": 1,
+        "draft_actionability": 3,
+        "boundary_safety": 1,
+        "review_gate": 1,
+        "handoff_readability": 3
+      },
+      "basicScores": {
+        "source_traceability": 5,
+        "material_reuse": 5,
+        "draft_actionability": 4,
+        "boundary_safety": 5,
+        "review_gate": 5,
+        "handoff_readability": 4
+      }
+    }
+  ],
+  "overallDecision": {
+    "winner": "basic",
+    "notAClaim": "This scorecard validates deterministic fixture design and prompt guardrails; it does not prove real model performance or customer-ready document quality without skill-runtime output review."
+  }
+}
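
One plausible reading of `minimumWeightedDelta` is sketched below: each variant gets a weighted mean over the six criteria, and the basic prompt must beat the weak prompt by at least 1.2 points. This interpretation is an assumption; the formula used by `scripts/validate-prompt-scorecard.mjs` is not visible in this diff, and the helper names `weightedMean` and `weightedDelta` are hypothetical. The weights and scores are copied from the scorecard above.

```javascript
// Hypothetical sketch: weighted mean per variant and the resulting delta.
// Assumption: the delta is the difference of weighted means of the per-criterion scores.
const criteria = [
  { id: "source_traceability", weight: 20 },
  { id: "material_reuse", weight: 18 },
  { id: "draft_actionability", weight: 17 },
  { id: "boundary_safety", weight: 18 },
  { id: "review_gate", weight: 17 },
  { id: "handoff_readability", weight: 10 },
];

function weightedMean(scores) {
  const totalWeight = criteria.reduce((sum, c) => sum + c.weight, 0);
  const weightedSum = criteria.reduce((sum, c) => sum + c.weight * scores[c.id], 0);
  return weightedSum / totalWeight;
}

function weightedDelta(weakScores, basicScores) {
  return weightedMean(basicScores) - weightedMean(weakScores);
}

// Scores from the tender case.
const weakScores = {
  source_traceability: 2, material_reuse: 1, draft_actionability: 3,
  boundary_safety: 2, review_gate: 1, handoff_readability: 3,
};
const basicScores = {
  source_traceability: 5, material_reuse: 5, draft_actionability: 4,
  boundary_safety: 5, review_gate: 5, handoff_readability: 4,
};

const delta = weightedDelta(weakScores, basicScores);
console.log(delta.toFixed(2), delta >= 1.2); // prints: 2.81 true
```

Under the same assumption, the scheme case works out to 4.73 - 1.74 = 2.99, so both cases clear the 1.2 threshold by a wide margin.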

package/prompts/evaluations/skill-runtime/README.md ADDED

@@ -0,0 +1,15 @@
+# Skill Runtime Evidence
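+This directory stores the evidence archive of Skills actually triggered by real hosts.
+`skill-runtime` results may come only from the raw output produced after a host tool actually loads and triggers a `$aios-*` Skill; results from pasting `SKILL.md` or `basic-prompt.md` and running it as an ordinary prompt are not accepted.
+Each archive entry includes at least:
+- Host name, version, or an availability check.
+- AIOS version or commit.
+- Input sample path.
+- Short trigger instruction.
+- Whether real Skill triggering can be confirmed.
+- Path to the full raw output.
+- If it cannot be confirmed, state the blocker; do not fabricate raw output.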
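
The archive rules above can be expressed as a small consistency check. The sketch below is illustrative only: field names follow the recommended record fields in `evaluation-policy.md`, while the `blocker` field, the function `checkEvidenceRecord`, and the exact rule set are assumptions, not the logic of `scripts/validate-skill-runtime-evidence.mjs`.

```javascript
// Hypothetical sketch: consistency rules for one runtime evidence record.
// Assumption: `blocker` is a free-text field used when triggering is unconfirmed.
const REQUIRED_FIELDS = [
  "caseId",
  "aiosVersion",
  "host",
  "inputFile",
  "triggerPrompt",
  "skillRuntimeConfirmed",
];

function checkEvidenceRecord(record) {
  const problems = [];
  for (const field of REQUIRED_FIELDS) {
    if (!record[field]) problems.push(`missing field: ${field}`);
  }
  const confirmed = record.skillRuntimeConfirmed;
  // A confirmed run needs an archived raw output to point at.
  if (confirmed === "yes" && !record.outputFile) {
    problems.push("confirmed run has no archived raw output");
  }
  // A blocked or uncertain run must say why, and is never counted as passed.
  if (confirmed !== "yes" && !record.blocker) {
    problems.push("unconfirmed run must state a blocker");
  }
  return problems;
}

// Example: marked as confirmed, but no raw output path was recorded.
const problems = checkEvidenceRecord({
  caseId: "scheme-writing-expert-comments",
  aiosVersion: "1.4.0",
  host: "Codex",
  inputFile: "templates/document-writing-samples/scheme",
  triggerPrompt: "$aios-scheme-write ...",
  skillRuntimeConfirmed: "yes",
});
console.log(problems);
```

The point of the second rule is that an unconfirmed record is still a valid archive entry as long as it is honest about the blocker.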

package/prompts/evaluations/skill-runtime/raw/codex-scheme-writing-sample.md ADDED

@@ -0,0 +1,111 @@
+# Codex Raw Output: aios-scheme-write
+Host: Codex desktop current thread
+Captured at: 2026-06-19
+Skill loaded from: `C:\Users\white\.codex\skills\aios-scheme-write\SKILL.md`
+Triggered skill: `aios-scheme-write`
+Expected gate skill: `aios-construction-scheme`
+Sample workspace: `templates/document-writing-samples/scheme`
+Trigger prompt:
+```text
+$aios-scheme-write Based on this workspace, generate or rewrite draft.md, and keep the review gate that hands off to aios-construction-scheme.
+```
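+## 1. Writing task assessment
+This is a special construction scheme chapter generation / rewriting task. The input contains a project overview summary, an expert comment summary, historical scheme material, and a material reuse index, so `aios-scheme-write` applies. This output only forms a Markdown working master; it does not output "scheme qualified / not qualified / review passed", nor does it replace the chief engineer, expert review, approval for high-risk works, structural calculation, the supervisor, or the safety lead.
+## 2. Source list
+| No. | Source | Version / location | Current use | Gap |
+|---:|---|---|---|---|
+| 1 | Project overview summary | Sample version, scheme overview | Build the chapter structure for key processes, hazards, acceptance, and briefing | Drawing numbers missing |
+| 2 | Expert comment summary | Sample version, revision comments | Follow up on hazards, calculation report, attached drawings, acceptance, and briefing points | Expert sign-off page missing |
+| 3 | Historical scheme material | Sample version, equipment parameters, acceptance procedure | Identify reusable chapter wording | Old-project equipment parameters, drawing numbers, and acceptance criteria must not be copied directly |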
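+## 3. Writing brief
+| Item | Content |
+|---|---|
+| Target chapters | Key processes, hazard control, inspection and acceptance, and briefing points |
+| Output format | Markdown working master |
+| Writing basis | Use only the material provided; where evidence is missing, keep `待补` / `需核验` / `转人工复核` |
+| Prohibited conclusions | Do not output that the scheme is qualified, the calculation is correct, the expert review has passed, approval has been granted, or site safety is satisfied |
+| Human review | Technical lead, chief engineer, safety lead, quality lead, document lead |
+## 4. Historical material matching table
+| Material source | Original chapter / paragraph | Where usable this time | Reuse level | Old-project facts to delete | Project evidence to add | Suggested review role |
+|---|---|---|---|---|---|---|
+| Historical scheme material | Key process description | Process flow | Reusable with rewriting | Old-project work-area names | This project's process breakdown | Technical lead |
+| Historical scheme material | Equipment parameters | Calculation and equipment capacity | Not to be reused | Equipment model, load parameters | This project's equipment data and calculation report | Chief engineer |
+| Historical scheme material | Acceptance procedure | Inspection and acceptance chapter | Reference only | Old-project acceptance criteria | This project's code basis and acceptance requirements | Quality lead |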
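+## 5. Scheme chapter structure
+1. Project overview and scope of preparation
+2. Construction preparation and key processes
+3. Hazard identification and control measures
+4. Inspection, acceptance, and record keeping
+5. Emergency measures
+6. Crew briefing points
+7. Pending material and human review checklist
+## 6. Construction scheme Markdown draft / rewrite
+### Project overview and scope of preparation
+This scheme chapter is a working master compiled from the project overview summary, expert comment summary, and historical scheme material provided. The current material supports the chapter framework, the follow-up checklist, and general management wording, but it does not support specific drawing numbers, equipment parameters, load values, calculation conclusions, or approval status.
+Drawing numbers, the full calculation report, equipment manufacturer data, special approval records, and the safety lead's confirmation should all be listed as `待补` items for later review.
+### Key processes
+The key process chapter should be organized around construction preparation, process flow, process control, inspection and acceptance, and emergency response. The process flow wording in the historical scheme can be referenced and rewritten, but old-project work-area names, equipment models, parameters, and drawing numbers must be deleted.
+Suggested body text:
+> Before construction, complete the technical briefing, confirmation of working conditions, incoming inspection of equipment and materials, and safety protection preparation. During construction, set quality control points, safety control points, and acceptance milestones by process. The specific process breakdown, drawing numbers, equipment parameters, and inspection frequency are to be confirmed once this project's material is complete.
+### Hazard control
+The hazard control chapter should break the expert comment that "hazard identification covers the key processes" into a checklist. Each hazard must be tied to its existing source basis, control measures, inspection frequency, responsible role, and pending material; where this project's material is missing, it must not be written as a confirmed safety measure.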
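+## 7. Hazard and control measure placeholder table
+| No. | Hazard / follow-up point | Current basis | Draft control measure | Pending material | Suggested review role |
+|---:|---|---|---|---|---|
+| 1 | Hazard identification for key processes | Expert comment summary | Break down hazards, control measures, and responsible roles by process | This project's process breakdown | Safety lead |
+| 2 | Equipment capacity and parameters | Historical scheme material indicates equipment parameters exist | Write no specific parameters; list review items only | Equipment manufacturer data, full calculation report | Chief engineer |
+| 3 | Inspection and acceptance | Historical scheme acceptance procedure, reference only | Set up an inspection and acceptance checklist with record keeping | This project's code basis and acceptance requirements | Quality lead |
+## 8. Pending material and human review checklist
+| No. | Pending item | Affected chapters | Suggested review role |
+|---:|---|---|---|
+| 1 | Current drawing numbers | Project overview, drawing references | Technical lead |
+| 2 | Full calculation report | Equipment capacity and safety factor | Chief engineer |
+| 3 | Equipment manufacturer data | Machinery and equipment parameters | Safety lead |
+| 4 | Expert sign-off page and special approval records | Expert comment follow-up, approval status statement | Project lead / document lead |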
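+## 9. Review gate handoff
+Hand this `draft.md` to `aios-construction-scheme` for review, focusing on:
+- Whether hazards, control measures, and briefing points are backed by the scheme text or this project's conditions.
+- Whether codes, local regulations, and high-risk works determinations lack tool, project material, or expert evidence.
+- Whether the calculation report, loads, stability, deformation, equipment capacity, and safety factor need review by `aios-structural` or against the project calculation report.
+- Whether old-project locations, parameters, methods, equipment, drawing numbers, or schedules from the historical scheme have been written as unverified facts of this project.
+- Whether there are out-of-authority conclusions such as scheme qualified, calculation correct, expert review passed, approval granted, or site safety satisfied.
+## 10. Output self-check
+| Check | Result |
+|---|---|
+| Includes a source list | Yes |
+| Keeps material reuse levels | Yes |
+| Keeps `待补` / `需核验` / `转人工复核` | Yes |
+| Avoids qualified, calculation-correct, or approval-granted conclusions | Yes |
+| States the `aios-construction-scheme` review gate | Yes |
+| Marks the AI draft as a formal final version | No |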