npm - @fitlab-ai/agent-infra - Versions diffs - 0.7.4 → 0.7.6 - Mend

@fitlab-ai/agent-infra 0.7.4 → 0.7.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (119)

package/templates/.agents/rules/review-handshake.zh-CN.md ADDED

@@ -0,0 +1,97 @@
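+# Two-Way Review Handshake Protocol
+> Shared by the executor and the reviewer across all three stages (analysis / plan / code) when running the `review-*` and `*-task` skills.
+> This is the protocol's **single source of truth**; each SKILL only `Read`s this file and does not copy the vocabulary.
+## Core principles
+- **A review finding is input to verify, not a command to execute.** The executor must check each finding before acting on it, neither accepting by default nor rebutting blindly.
+- **Symmetric burden of evidence**: whether accepting or refuting, every disposition must carry **proportionate evidence**. "Accept" is not a zero-cost default path.
+- **Agree before advancing**: while there is an unresolved disagreement, an alternative fix, a finding that cannot be judged, or a commit added after review, do not silently move to the next stage, archive, or merge.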
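+## Executor four-state disposition (`*-task` skills, when responding to the previous review in Round ≥ 2)
+For each finding from the previous `review-*` round, first Read/Grep to verify the `file:line` / command it cites, then assign one status:
+| Status | Meaning | Required evidence |
+|------|------|----------|
+| `accepted` | Valid; will be fixed as suggested | `file:line` of the fix, or a description of the change to be applied this round |
+| `adjusted` | Valid, but an alternative fix is used | Description of the alternative fix and why it is better; pending reviewer confirmation |
+| `refuted` | Verified as invalid / hallucinated / based on a wrong `file:line` | Counter-evidence (`file:line` or the verbatim command); pending reviewer confirmation |
+| `cannot-judge` | Insufficient evidence to decide | Verification paths already tried; handed to the reviewer / a human |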
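+## Reviewer hand-back obligations (`review-*` skills, when re-checking the executor's response)
+After the executor returns `adjusted` / `refuted` / `cannot-judge`, the reviewer must respond to each item; repeating the original comment or ignoring the response is not allowed:
+- **Withdraw the finding** → set the ledger row to `confirmed` (the rebuttal is accepted).
+- **Accept the alternative fix** → set the ledger row to `confirmed`.
+- **Insist, with new evidence** → set the ledger row back to `open` (with the new evidence, returning it to the executor).
+- **Escalate to a human ruling** → set the ledger row to `needs-human-decision`.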
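+## Convergence and termination semantics (preventing infinite loops)
+- The handshake round cap for a single finding is `MAX_HANDSHAKE_ROUNDS`, default **3**, overridable via `review.maxHandshakeRounds` in `.agents/.airc.json`.
+- If a finding's `round` reaches the cap without reaching a terminal state, it must be forced to `needs-human-decision`; the gate blocks rows that hit the cap without being escalated.
+- `needs-human-decision` keeps blocking completion until a human records the ruling in the task.md `## 人工裁决` section and flips the row to `human-decided`.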
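+## Mitigating same-model convergence bias (documentation-level discipline)
+The executor and the reviewer are often run by similar models, so they naturally tend to agree with each other. When reviewing:
+1. **Evidence before conclusions**: read `git diff` / the artifact itself and form findings independently, and only **then** read the executor's conclusions and responses, to avoid anchoring on them.
+2. **Skeptical by default**: treat "looks fine" as unverified; every approval must be backed by reproducible evidence (see the hard gate on the `证据原文` section in each `review-*`).
+> The only mechanical lever is the **symmetric evidence gate** (every non-`open` ledger row must carry evidence); model similarity itself cannot be checked mechanically, so this section is a discipline rather than a gate.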
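+## Mechanical ledger (task.md `## 审查分歧账本`)
+The **single source of truth** for disagreement state is the fixed task.md section `## 审查分歧账本`, a single parseable table. Stage advancement and the `complete-task` gate read this section.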
+```markdown
+## 审查分歧账本
+<!-- 每条 review finding 一行；状态机/证据规则见 .agents/rules/review-handshake.md。阶段推进与 complete-task gate 读取本段。 -->
+| id | stage | round | severity | status | evidence |
+|----|-------|-------|----------|--------|----------|
+| CD-1 | code | 1 | blocker | open | review-code.md#1 |
+```
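+- `id`: stage prefix + sequence number: analysis→`AN-`, plan→`PL-`, code→`CD-`; human-decision rows raised by the executor use `HD-`.
+- `stage` ∈ `{analysis, plan, code}` (plus the reserved value `post-review-commit`, used only for post-review exemption rows).
+- `status` valid values: `open` / `accepted` / `adjusted` / `refuted` / `cannot-judge` / `confirmed` / `needs-human-decision` / `closed` / `human-decided`.
+- **Terminal set (gate passes)**: `{confirmed, closed, human-decided}`; every other status blocks.
+- **Write responsibilities**: `review-*` raises a finding → upsert an `open` row; `*-task` responds → set one of the four states, fill `evidence`, and increment `round` by 1; the next `review-*` round → `confirmed` / back to `open` / `needs-human-decision`; an executor fix verified by the next review round → `closed`; a human ruling → `human-decided`.
+- **Backward compatibility**: when task.md has no such section, the gate treats it as having no pending disagreements and passes.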
+### Executor-raised human-decision rows
+When the executor marks `[needs-human-decision]` in an artifact's `## 未决问题` section, it must upsert the matching `HD-` row in the task.md `## 审查分歧账本` section:
+```markdown
+| HD-1 | plan | - | decision | needs-human-decision | plan.md#HD-1 |
+```
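+- `stage` is the stage in which the decision arose: `analysis` / `plan` / `code`.
+- `round` is `-`, because this is not a handshake round of a review finding.
+- `severity` is always `decision`.
+- `status` starts as `needs-human-decision`, so the existing gate blocks on it.
+- After a human records the ruling in the task.md `## 人工裁决` section, flip the matching `HD-` row to `human-decided` and point `evidence` at that ruling record.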
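+## post-review commit gate (code stage only)
+- `review-code` records `审查基线提交` (R, `git rev-parse HEAD`) and `审查差异指纹` (F, the fingerprint of the full working-tree diff) in its highest-round report.
+- `commit` reads only the highest-round `review-code` artifact; when that artifact is Approved, the pre-commit HEAD equals R, and the staged diff fingerprint equals F, it writes `last_reviewed_commit` (B, the new commit SHA) to task.md.
+- The `post-review-commit` gate of `complete-task` prefers B; when B is missing or invalid, it falls back to R from the highest-round `review-code`.
+- If new commits touch code / rule paths after B / R, the gate blocks and requires a fresh `review-code`.
+- **Exemption**: append a ledger row `| PRC-1 | post-review-commit | - | - | human-decided | <ruling note> |` to record that a human explicitly allowed that batch of commits to skip re-review.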
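+## Gate behavior quick reference
+| Caller | `review-ledger` scope | `post-review-commit` |
+|--------|------------------------|----------------------|
+| `plan-task` | Only `analysis` stage rows must be terminal | Not attached |
+| `code-task` | `analysis` + `plan` stage rows must be terminal | Not attached |
+| `complete-task` | Rows of all stages must be terminal | Attached (see above) |
+| `analyze-task` | Not attached (first stage) | Not attached |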
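
The ledger rules in this file (terminal set, per-caller scope, round cap) can be sketched as a small checker. This is only an illustration under stated assumptions: the package's real gate is not part of this diff, the names `parseLedger`, `blockingRows`, and `rowsPastRoundLimit` are hypothetical, and the parser assumes no cell contains a literal `|`.

```javascript
// Hypothetical sketch of the review-ledger gate; not the package's implementation.
const TERMINAL = new Set(["confirmed", "closed", "human-decided"]);

// Per-caller scope, taken from the gate behavior table above.
const SCOPE = {
  "analyze-task": [],
  "plan-task": ["analysis"],
  "code-task": ["analysis", "plan"],
  "complete-task": "all",
};

// Parse the single ledger table under the fixed section heading into row objects.
function parseLedger(taskMd) {
  const lines = taskMd.split(/\r?\n/);
  const start = lines.findIndex((line) => line.trim() === "## 审查分歧账本");
  if (start === -1) return []; // backward compatible: no section means nothing pending
  const rows = [];
  for (let i = start + 1; i < lines.length; i += 1) {
    const line = lines[i].trim();
    if (/^##\s+/.test(line)) break; // the next section ends the ledger
    if (!line.startsWith("|")) continue; // skip comments and blank lines
    const cells = line.split("|").slice(1, -1).map((cell) => cell.trim());
    // Skip malformed rows, the header row, and the |----| separator row.
    if (cells.length !== 6 || cells[0] === "id" || /^-+$/.test(cells[0])) continue;
    const [id, stage, round, severity, status, evidence] = cells;
    rows.push({ id, stage, round, severity, status, evidence });
  }
  return rows;
}

// Rows that would block the given caller: in scope and not in a terminal state.
function blockingRows(taskMd, caller) {
  const scope = SCOPE[caller];
  if (scope === undefined) throw new Error(`unknown caller: ${caller}`);
  return parseLedger(taskMd).filter(
    (row) => (scope === "all" || scope.includes(row.stage)) && !TERMINAL.has(row.status)
  );
}

// Rows that reached the round cap without being escalated or closed.
// A round of "-" (HD- and PRC- rows) converts to NaN and is never flagged.
function rowsPastRoundLimit(taskMd, maxRounds = 3) {
  return parseLedger(taskMd).filter(
    (row) =>
      Number(row.round) >= maxRounds &&
      !TERMINAL.has(row.status) &&
      row.status !== "needs-human-decision"
  );
}
```

With the sample row `CD-1` from the template above, `blockingRows(taskMd, "complete-task")` would return that row, while `blockingRows(taskMd, "plan-task")` would return nothing because only `analysis` rows are in scope there.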

package/templates/.agents/rules/task-management.en.md CHANGED

@@ -37,3 +37,28 @@ Map user intent to the corresponding workflow command:
 - `complete-task`: update `status`, `current_step`, `completed_at`, `updated_at`, `agent_infra_version`
 - `block-task`: update `status`, `blocked_at`, `blocked_reason`, `updated_at`, `agent_infra_version`
 - `cancel-task`: update `status`, `cancelled_at`, `cancel_reason`, `updated_at`, `agent_infra_version`
+## Activity Log started / done dual-marker convention (single source of truth)
+> This section is the sole authoritative definition of the started/done dual marker. The skills, the renderer (`lib/task/commands/log.ts`), and the validator (`.agents/scripts/validate-artifact.js`) all defer to it; keep this section in sync when changing any of them.
+**Line grammar is unchanged**: both started and done use the existing entry grammar `- {YYYY-MM-DD HH:mm:ss±HH:MM} — **{action}** by {agent} — {note}`, so the parsing regexes (`log.ts:ENTRY_RE` and `validate-artifact.js:ACTIVITY_LOG_PATTERN`) need no change.
+- **started line** (written when the step begins): the action suffixes the existing base with ` [started]`, note is `started`:
+  `- {time} — **{base} [started]** by {agent} — started`
+- **done line** (written when the step completes, unchanged from today): the action is the base itself:
+  `- {time} — **{base}** by {agent} — {completion summary}`
+- `{base}` is that skill's existing done action text, including `(Round {N})` (e.g. `Plan Task (Round 1)`). started and done must share the same `{base}` to pair.
+**Pairing and rendering** (`ai task log`): a started entry pairs with the next same-`{base}` done entry onto one row (repeated executions of the same base pair FIFO by ascending time). The STARTED column shows the start time, DONE the completion time; started with no done = in progress (DONE shows `(in progress)`); done with no started (legacy logs) = a standalone completed row. All three shapes are valid and never error.
+**Gate** (`checkActivityLog`): when computing the "latest action / freshness" it skips `[started]` lines (ascending-order and format checks still cover every line), so a started marker never satisfies a skill's `expected_action_pattern`.
+**Skills that write started**: every workflow skill that **appends entries to a task's `## Activity Log`** writes started, so the STARTED column stays uniformly complete across the whole `ai task log` table. Two forms, depending on whether task.md already exists:
+- **Standard form (task.md already exists)** — append the started line when that round's real work begins (after prerequisites, before the first artifact action) and the done line on completion:
+  `analyze-task`, `plan-task`, `code-task`, `review-analysis`, `review-plan`, `review-code`, `commit`, `complete-task`, `create-pr`, `watch-pr`, `block-task`, `cancel-task`, `restore-task`, `close-codescan`, `close-dependabot`.
+- **Deferred form (the skill creates task.md, so there is no file to write to at the start)** — capture `started_at` in memory before running, then when writing the Activity Log at the end, **append both lines at once** (started line uses `started_at`, done line uses the completion time):
+  `create-task`, `import-issue`, `import-codescan`, `import-dependabot`.
+**Exceptions**: read-only inspection skills that do not represent real progress (e.g. `check-task`) do not write started. A bare operation with no task.md context (e.g. a `commit` not tied to a task) likewise skips it.
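
The pairing rule described above (a started entry pairs with the next done entry of the same base, FIFO) can be sketched as follows. This is a hedged illustration, not the renderer in `lib/task/commands/log.ts`, which this diff does not show: `ENTRY` is an assumed regex modeled on the documented line grammar, `pairEntries` is a hypothetical name, and using `null` for the STARTED value of a legacy row is an assumption.

```javascript
// Hypothetical sketch of started/done pairing; the real ENTRY_RE is not shown in this diff.
// \u2014 is the dash separator used by the documented entry grammar.
const ENTRY = /^- (\S+ \S+) \u2014 \*\*(.+?)\*\* by (.+?) \u2014 (.*)$/;
const STARTED_SUFFIX = " [started]";

// Expects log lines in ascending time order, which the gate already enforces.
function pairEntries(logLines) {
  const rows = [];
  const pending = new Map(); // base -> queue of rows still waiting for their done entry
  for (const line of logLines) {
    const match = line.match(ENTRY);
    if (!match) continue; // not an Activity Log entry
    const [, time, action] = match;
    if (action.endsWith(STARTED_SUFFIX)) {
      const base = action.slice(0, -STARTED_SUFFIX.length);
      const row = { base, started: time, done: "(in progress)" };
      rows.push(row);
      if (!pending.has(base)) pending.set(base, []);
      pending.get(base).push(row);
    } else {
      const queue = pending.get(action);
      if (queue && queue.length > 0) {
        queue.shift().done = time; // FIFO: the oldest unmatched started gets this done
      } else {
        rows.push({ base: action, started: null, done: time }); // legacy standalone row
      }
    }
  }
  return rows;
}
```

All three shapes from the text fall out of the same loop: a paired row, an in-progress row whose DONE stays `(in progress)`, and a standalone completed row for logs written before the started marker existed.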

package/templates/.agents/rules/task-management.zh-CN.md CHANGED

@@ -37,3 +37,32 @@
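 - `complete-task`: update `status`, `current_step`, `completed_at`, `updated_at`, `agent_infra_version`
 - `block-task`: update `status`, `blocked_at`, `blocked_reason`, `updated_at`, `agent_infra_version`
 - `cancel-task`: update `status`, `cancelled_at`, `cancel_reason`, `updated_at`, `agent_infra_version`
+## Activity Log started / done dual-marker convention (single source of truth)
+> This section is the sole authoritative definition of the started/done dual marker. The SKILLs, the renderer (`lib/task/commands/log.ts`), and
+> the validator (`.agents/scripts/validate-artifact.js`) all defer to this section; keep it in sync when changing any of them.
+**Line grammar is unchanged**: both started and done use the existing entry grammar
+`- {YYYY-MM-DD HH:mm:ss±HH:MM} — **{action}** by {agent} — {note}`, so the parsing regexes
+(`log.ts:ENTRY_RE` and `validate-artifact.js:ACTIVITY_LOG_PATTERN`) need no change.
+- **started line** (written when the step begins): the action is the existing base name with the suffix ` [started]` appended, and the note is `started`:
+  `- {time} — **{base} [started]** by {agent} — started`
+- **done line** (written when the step completes, same as today): the action is the base name itself:
+  `- {time} — **{base}** by {agent} — {completion summary}`
+- `{base}` is the action text of that SKILL's existing done entry, including `(Round {N})` (e.g. `Plan Task (Round 1)`).
+  started and done must share the same `{base}` to pair.
+**Pairing and rendering** (`ai task log`): by `{base}`, a started entry is paired with the nearest following done entry of the same name onto one row (repeated executions of the same base pair FIFO in ascending time order). The STARTED column shows the started time and the DONE column shows the done time; started with no done = in progress (DONE shows `(in progress)`); done with no started (legacy logs) = a single-state completed row. All three shapes are valid and never raise an error.
+**Gate** (`checkActivityLog`): when computing the "latest action / freshness" it skips `[started]` lines (ascending-order and format checks still cover every line), so a started marker never pollutes any SKILL's `expected_action_pattern`.
+**SKILLs that write started**: every workflow SKILL that **appends entries to a task's `## 活动日志`** writes started, so the STARTED column stays uniformly complete across the whole `ai task log` table. There are two forms, depending on whether task.md already exists for the skill:
+- **Standard form (task.md already exists)**: append the started line when that round's real work begins (after prerequisites are confirmed, before the first artifact-producing action), and write the done line on completion:
+  `analyze-task`, `plan-task`, `code-task`, `review-analysis`, `review-plan`, `review-code`, `commit`, `complete-task`, `create-pr`, `watch-pr`, `block-task`, `cancel-task`, `restore-task`, `close-codescan`, `close-dependabot`.
+- **Deferred form (the skill creates task.md, so there is no file to write to at the start)**: record `started_at` in memory before running, then when writing the Activity Log at the end, **append both lines at once** (the started line uses `started_at`, the done line uses the completion time):
+  `create-task`, `import-issue`, `import-codescan`, `import-dependabot`.
+**Exceptions**: read-only inspection skills that do not represent real progress, such as `check-task`, do not write started. A bare operation with no task.md context (e.g. a `commit` not tied to a task) likewise skips it.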

package/templates/.agents/scripts/lib/post-review-commit.js ADDED

@@ -0,0 +1,56 @@
+import fs from "node:fs";
+import path from "node:path";
+import { artifactName, maxRound } from "./review-artifacts.js";
+export const DEFAULT_POST_REVIEW_GLOBS = [
+  ".agents/skills",
+  ".agents/scripts",
+  ".agents/rules",
+  ".agents/workflows",
+  "bin",
+  "lib",
+  "src",
+  "templates"
+];
+export function resolvePostReviewGlobs(config = {}, reviewConfig = {}) {
+  if (Array.isArray(config.post_review_globs)) {
+    return config.post_review_globs;
+  }
+  if (Array.isArray(reviewConfig.post_review_globs)) {
+    return reviewConfig.post_review_globs;
+  }
+  return DEFAULT_POST_REVIEW_GLOBS;
+}
+export function findAuthoritativeReviewCodeArtifact(taskDir) {
+  const entries = fs.existsSync(taskDir) ? fs.readdirSync(taskDir) : [];
+  const round = maxRound(entries, "review-code");
+  if (round === 0) {
+    return { ok: false, round: 0, fileName: null, path: null };
+  }
+  const fileName = artifactName("review-code", round);
+  return {
+    ok: true,
+    round,
+    fileName,
+    path: path.join(taskDir, fileName)
+  };
+}
+export function extractReviewBaseline(content) {
+  const match = String(content).match(/^[-*]?\s*\*\*(?:审查基线提交|Review Baseline Commit)\*\*[:：]\s*(.*?)\s*$/m);
+  return match ? match[1].trim().replace(/`/g, "") : "";
+}
+export function extractReviewDiffFingerprint(content) {
+  const match = String(content).match(/^[-*]?\s*\*\*(?:审查差异指纹|Reviewed Diff Fingerprint)\*\*[:：]\s*(.*?)\s*$/m);
+  return match ? match[1].trim().replace(/`/g, "") : "";
+}
+export function parseReviewVerdict(content) {
+  const match = String(content).match(/^[-*]?\s*\*\*(?:总体结论|Overall Verdict)\*\*[:：]\s*(.*?)\s*$/m);
+  return match ? match[1].trim() : "";
+}
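
To make the field extractors above concrete, the snippet below repeats the regex from `extractReviewBaseline` so it can run on its own and feeds it a made-up artifact excerpt. The excerpt and the SHA `0123abc` are placeholders, not real data from the package.

```javascript
// Regex copied from extractReviewBaseline above so this snippet is self-contained.
const BASELINE_RE = /^[-*]?\s*\*\*(?:审查基线提交|Review Baseline Commit)\*\*[:：]\s*(.*?)\s*$/m;

function extractReviewBaseline(content) {
  const match = String(content).match(BASELINE_RE);
  // Backticks around the value are stripped, so `0123abc` and 0123abc are equivalent.
  return match ? match[1].trim().replace(/`/g, "") : "";
}

// Hypothetical review-code artifact excerpt; the SHA is a placeholder.
const artifactExcerpt = [
  "## Review Summary",
  "- **Overall Verdict**: Approved",
  "- **Review Baseline Commit**: `0123abc`",
].join("\n");

console.log(extractReviewBaseline(artifactExcerpt)); // prints "0123abc"
console.log(extractReviewBaseline("**审查基线提交**：0123abc")); // prints "0123abc" (full-width colon, no bullet)
console.log(extractReviewBaseline("no baseline line") === ""); // prints true
```

Note that the extractor returns an empty string rather than throwing when the field is absent, so a caller has to treat `""` as "no baseline recorded".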

package/templates/.agents/scripts/lib/review-artifacts.js ADDED

@@ -0,0 +1,117 @@
+// Shared helpers for review-artifact parsing.
+// Imported by both .agents/skills/code-task/scripts/detect-mode.js and
+// .agents/scripts/validate-artifact.js so the round/verdict vocabulary stays
+// in a single source of truth (prevents the cross-file drift this lifecycle
+// is designed to eliminate).
+import fs from "node:fs";
+import path from "node:path";
+export function escapeRegExp(value) {
+  return value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
+}
+export function maxRound(entries, stem) {
+  let max = 0;
+  for (const entry of entries) {
+    if (entry === `${stem}.md`) {
+      max = Math.max(max, 1);
+      continue;
+    }
+    const match = entry.match(new RegExp(`^${escapeRegExp(stem)}-r(\\d+)\\.md$`));
+    if (match) {
+      max = Math.max(max, Number(match[1]));
+    }
+  }
+  return max;
+}
+export function artifactName(stem, round) {
+  return round === 1 ? `${stem}.md` : `${stem}-r${round}.md`;
+}
+export function normalizeVerdict(raw) {
+  const value = String(raw).trim().toLowerCase();
+  if (value === "通过" || value === "approved") {
+    return "Approved";
+  }
+  if (value === "需要修改" || value === "changes requested") {
+    return "Changes Requested";
+  }
+  if (value === "拒绝" || value === "rejected") {
+    return "Rejected";
+  }
+  return "";
+}
+export function extractSection(content, names) {
+  const lines = content.split(/\r?\n/);
+  const nameSet = new Set(names);
+  const start = lines.findIndex((line) => {
+    const match = line.trim().match(/^##\s+(.+?)\s*$/);
+    return match ? nameSet.has(match[1]) : false;
+  });
+  if (start === -1) {
+    return "";
+  }
+  const sectionLines = [];
+  for (let index = start + 1; index < lines.length; index += 1) {
+    if (/^##\s+/.test(lines[index])) {
+      break;
+    }
+    sectionLines.push(lines[index]);
+  }
+  return sectionLines.join("\n");
+}
+// Parse the canonical verdict out of a review-* artifact.
+// Returns { ok, verdict, message }. Verdict collapses Approved into
+// "Approved-with-issues" when the findings counts are non-zero.
+export function parseVerdict(reviewPath) {
+  if (!fs.existsSync(reviewPath)) {
+    return { ok: false, verdict: null, message: `Review artifact not found: ${path.basename(reviewPath)}` };
+  }
+  const content = fs.readFileSync(reviewPath, "utf8");
+  const summary = extractSection(content, ["审查摘要", "Review Summary"]);
+  const fileName = path.basename(reviewPath);
+  if (!summary) {
+    return { ok: false, verdict: null, message: `cannot locate review summary section in ${fileName}` };
+  }
+  const verdictMatch = summary.match(/^[-*]?\s*\*\*(?:总体结论|Overall Verdict)\*\*[:：]\s*(.+?)\s*$/im);
+  if (!verdictMatch) {
+    return { ok: false, verdict: null, message: `cannot parse verdict in ${fileName}` };
+  }
+  const verdict = normalizeVerdict(verdictMatch[1]);
+  if (!verdict) {
+    return {
+      ok: false,
+      verdict: null,
+      message: `unrecognized verdict '${verdictMatch[1].trim()}' in ${fileName}`
+    };
+  }
+  if (verdict !== "Approved") {
+    return { ok: true, verdict };
+  }
+  const findingsMatch = summary.match(/^[-*]?\s*\*\*(?:发现（AI 可处理）|Findings \(AI-actionable\))\*\*[:：]\s*(.+?)\s*$/im);
+  if (!findingsMatch) {
+    return { ok: false, verdict, message: `cannot parse findings count in ${fileName}` };
+  }
+  const counts = findingsMatch[1].match(/(\d+)\s*(?:阻塞项|blockers?).*?(\d+)\s*(?:主要|majors?).*?(\d+)\s*(?:次要|minors?)/i);
+  if (!counts) {
+    return { ok: false, verdict, message: `cannot parse findings count in ${fileName}` };
+  }
+  const [, blockers, majors, minors] = counts.map(Number);
+  return {
+    ok: true,
+    verdict: blockers === 0 && majors === 0 && minors === 0 ? "Approved" : "Approved-with-issues"
+  };
+}
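
The last step of `parseVerdict`, collapsing `Approved` into `Approved-with-issues` when any count is non-zero, can be exercised in isolation. The sketch below reuses the counts regex from the function above; `collapseApproved` is a hypothetical name, and the sample findings strings are invented for illustration.

```javascript
// Counts regex copied from parseVerdict above; it expects blockers, majors, minors in that order.
const COUNTS_RE = /(\d+)\s*(?:阻塞项|blockers?).*?(\d+)\s*(?:主要|majors?).*?(\d+)\s*(?:次要|minors?)/i;

// Hypothetical helper: returns the collapsed verdict for an Approved review,
// or null when the findings text cannot be parsed.
function collapseApproved(findingsText) {
  const counts = String(findingsText).match(COUNTS_RE);
  if (!counts) return null;
  // Index 0 is the full match (NaN after Number) and is skipped by the leading comma.
  const [, blockers, majors, minors] = counts.map(Number);
  return blockers === 0 && majors === 0 && minors === 0 ? "Approved" : "Approved-with-issues";
}

console.log(collapseApproved("0 blockers, 0 majors, 0 minors")); // prints "Approved"
console.log(collapseApproved("0 blockers, 1 major, 2 minors")); // prints "Approved-with-issues"
console.log(collapseApproved("0 阻塞项 / 0 主要 / 1 次要")); // prints "Approved-with-issues"
console.log(collapseApproved("1 minor, 0 blockers")); // prints null (order matters)
```

Because the regex is order-dependent, a findings line that lists the counts in a different order is reported as unparseable rather than silently approved.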

package/templates/.agents/scripts/review-diff-fingerprint.js ADDED

@@ -0,0 +1,99 @@
+import crypto from "node:crypto";
+import fs from "node:fs";
+import os from "node:os";
+import path from "node:path";
+import process from "node:process";
+import { execFileSync } from "node:child_process";
+import { resolvePostReviewGlobs } from "./lib/post-review-commit.js";
+const repoRoot = execFileSync("git", ["rev-parse", "--show-toplevel"], {
+  cwd: process.cwd(),
+  encoding: "utf8"
+}).trim();
+function usage() {
+  console.error("Usage: node .agents/scripts/review-diff-fingerprint.js <worktree|staged> <baseline>");
+  process.exit(2);
+}
+function git(args, options = {}) {
+  return execFileSync("git", args, {
+    cwd: repoRoot,
+    encoding: options.encoding || "utf8",
+    env: options.env || process.env
+  });
+}
+function loadReviewConfig() {
+  const configPath = path.join(repoRoot, ".agents", ".airc.json");
+  if (!fs.existsSync(configPath)) {
+    return {};
+  }
+  try {
+    return JSON.parse(fs.readFileSync(configPath, "utf8")).review || {};
+  } catch {
+    return {};
+  }
+}
+function splitNull(output) {
+  return output.split("\0").filter((value) => value !== "");
+}
+function hashDiff(args, env = process.env) {
+  const diff = execFileSync("git", args, {
+    cwd: repoRoot,
+    encoding: "buffer",
+    env
+  });
+  return `sha256:${crypto.createHash("sha256").update(diff).digest("hex")}`;
+}
+function worktreeFingerprint(baseline, globs) {
+  const tempDir = fs.mkdtempSync(path.join(os.tmpdir(), "agent-infra-review-index-"));
+  const tempIndex = path.join(tempDir, "index");
+  const env = { ...process.env, GIT_INDEX_FILE: tempIndex };
+  try {
+    execFileSync("git", ["read-tree", baseline], { cwd: repoRoot, env });
+    const tracked = splitNull(git(["diff", "--name-only", "-z", baseline, "--", ...globs]));
+    const untracked = splitNull(git(["ls-files", "-o", "--exclude-standard", "-z", "--", ...globs]));
+    const paths = [...new Set([...tracked, ...untracked])];
+    for (const filePath of paths) {
+      const absolutePath = path.join(repoRoot, filePath);
+      if (fs.existsSync(absolutePath)) {
+        execFileSync("git", ["update-index", "--add", "--", filePath], { cwd: repoRoot, env });
+      } else {
+        execFileSync("git", ["update-index", "--remove", "--", filePath], { cwd: repoRoot, env });
+      }
+    }
+    return hashDiff(["diff", "--cached", "--binary", baseline, "--", ...globs], env);
+  } finally {
+    fs.rmSync(tempDir, { recursive: true, force: true });
+  }
+}
+function stagedFingerprint(baseline, globs) {
+  return hashDiff(["diff", "--cached", "--binary", baseline, "--", ...globs]);
+}
+function main(argv) {
+  const [mode, baseline] = argv;
+  if (!["worktree", "staged"].includes(mode) || !baseline) {
+    usage();
+  }
+  git(["rev-parse", "--verify", `${baseline}^{commit}`]);
+  const globs = resolvePostReviewGlobs({}, loadReviewConfig());
+  const fingerprint = mode === "worktree"
+    ? worktreeFingerprint(baseline, globs)
+    : stagedFingerprint(baseline, globs);
+  process.stdout.write(`${fingerprint}\n`);
+}
+main(process.argv.slice(2));
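
The fingerprint printed by this script feeds the post-review commit rules in `review-handshake`: `commit` records B only when the review is Approved, the pre-commit HEAD equals R, and the staged fingerprint equals F, and `complete-task` prefers B over R. A minimal sketch of those two decisions follows. The function names are hypothetical, treating "invalid" as "not a hex commit id" is an assumption, and whether a verdict other than exactly `Approved` should qualify is not stated in the source.

```javascript
// Hypothetical helpers; only the three conditions and the B-over-R fallback come from the rules text.
const SHA_RE = /^[0-9a-f]{7,40}$/i; // assumption: a valid B looks like a hex commit id

// True when `commit` may write last_reviewed_commit (B) to task.md.
function canRecordReviewedCommit({
  verdict,
  headBeforeCommit,
  reviewBaseline, // R, from the highest-round review-code artifact
  stagedFingerprint, // output of this script in "staged" mode
  reviewedFingerprint, // F, recorded by review-code
}) {
  return (
    verdict === "Approved" &&
    headBeforeCommit === reviewBaseline &&
    stagedFingerprint === reviewedFingerprint
  );
}

// Baseline used by the post-review-commit gate: B when usable, otherwise R.
function resolveGateBaseline(lastReviewedCommit, reviewBaseline) {
  return SHA_RE.test(lastReviewedCommit || "") ? lastReviewedCommit : reviewBaseline;
}
```

Comparing fingerprints instead of file lists means that any change to the staged content under the configured paths after review, even a one-byte edit, makes the check fail and leaves B unset.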