@groundnuty/macf 0.2.34 → 0.2.36
- package/dist/.build-info.json +2 -2
- package/dist/cli/claude-sh.d.ts.map +1 -1
- package/dist/cli/claude-sh.js +13 -2
- package/dist/cli/claude-sh.js.map +1 -1
- package/dist/cli/commands/certs.js +3 -3
- package/dist/cli/commands/certs.js.map +1 -1
- package/dist/cli/commands/init.d.ts.map +1 -1
- package/dist/cli/commands/init.js +6 -2
- package/dist/cli/commands/init.js.map +1 -1
- package/dist/cli/commands/update.d.ts.map +1 -1
- package/dist/cli/commands/update.js +15 -2
- package/dist/cli/commands/update.js.map +1 -1
- package/dist/cli/plugin-fetcher.d.ts +24 -0
- package/dist/cli/plugin-fetcher.d.ts.map +1 -1
- package/dist/cli/plugin-fetcher.js +61 -1
- package/dist/cli/plugin-fetcher.js.map +1 -1
- package/dist/cli/settings-writer.d.ts +34 -5
- package/dist/cli/settings-writer.d.ts.map +1 -1
- package/dist/cli/settings-writer.js +54 -5
- package/dist/cli/settings-writer.js.map +1 -1
- package/dist/cli/version-resolver.d.ts +2 -5
- package/dist/cli/version-resolver.d.ts.map +1 -1
- package/dist/cli/version-resolver.js +5 -19
- package/dist/cli/version-resolver.js.map +1 -1
- package/dist/reconciler/parse-delivered.d.ts +32 -0
- package/dist/reconciler/parse-delivered.d.ts.map +1 -0
- package/dist/reconciler/parse-delivered.js +18 -0
- package/dist/reconciler/parse-delivered.js.map +1 -0
- package/dist/reconciler/parse-processed.d.ts +57 -0
- package/dist/reconciler/parse-processed.d.ts.map +1 -0
- package/dist/reconciler/parse-processed.js +41 -0
- package/dist/reconciler/parse-processed.js.map +1 -0
- package/dist/reconciler/reconcile.d.ts +99 -0
- package/dist/reconciler/reconcile.d.ts.map +1 -0
- package/dist/reconciler/reconcile.js +75 -0
- package/dist/reconciler/reconcile.js.map +1 -0
- package/dist/reconciler/run.d.ts +3 -0
- package/dist/reconciler/run.d.ts.map +1 -0
- package/dist/reconciler/run.js +184 -0
- package/dist/reconciler/run.js.map +1 -0
- package/package.json +2 -2
- package/plugin/rules/coordination.md +23 -14
- package/plugin/rules/mention-routing-hygiene.md +2 -0
- package/plugin/rules/silent-fallback-hazards.md +49 -10
- package/scripts/check-close-keyword.sh +218 -0
- package/scripts/emit-turn-receipt.sh +81 -0
@@ -0,0 +1,184 @@
+#!/usr/bin/env node
+/**
+ * Route-receipt reconciler entrypoint (groundnuty/macf#444 Option D, piece 4).
+ *
+ * Invoked by the scheduled `route-reconciler.yml` workflow. Thin I/O glue over
+ * the unit-tested pure pieces (reconcile / parse-delivered / parse-processed):
+ *
+ *   1. DELIVERED — `gh run list` the `agent-router` runs (success, recent) +
+ *      `gh run view --log` each → parse the `Routed … to <AGENT>` lines.
+ *   2. PROCESSED — Tempo `/api/search` for `turn_processed` spans → parse the
+ *      `(routed_run_id, agent)` attrs.
+ *   3. reconcile() → drops (delivered, no receipt, older than the open-threshold).
+ *   4. Emit `tempo_ok` + `drops_count` / `drops_json` to `$GITHUB_OUTPUT` for
+ *      the workflow's open-on-absence / self-close-on-appearance incident steps.
+ *
+ * Pattern-A safety (false-drop avoidance): any Tempo problem — unreachable,
+ * HTTP error, OR a result hitting the `limit` (truncation) — makes the
+ * PROCESSED set UNKNOWABLE, which is NOT the same as "zero drops". Treating it
+ * as empty would false-OPEN incidents (a truncated set reads as missing
+ * receipts) and treating it as `drops_count=0` would false-CLOSE a real
+ * incident on a transient outage. So all three paths emit `tempo_ok=false`;
+ * the workflow gates BOTH the open AND the self-close steps on
+ * `tempo_ok == 'true'`, making a Tempo problem a TRUE no-op (incidents left
+ * exactly as they are). Only a verified-complete query opens (real drops) or
+ * closes (verified zero).
+ *
+ * Config (env, set by the workflow):
+ *   RECONCILER_REPO        owner/repo (default $GITHUB_REPOSITORY)
+ *   ROUTER_WORKFLOW        router workflow file (default agent-router.yml)
+ *   TEMPO_QUERY_ENDPOINT   Tempo query base, e.g. http://<tailnet-host>:13200
+ *   OPEN_THRESHOLD_MIN     drop threshold, must exceed busy-turn latency (default 15)
+ *   LOOKBACK_MIN           how far back to scan runs + Tempo (default 120; keep
+ *                          below the window where the router exceeds the run-list
+ *                          cap, else the DELIVERED set truncates → delivered_ok=false)
+ *   TEMPO_LIMIT            Tempo search result cap (default 1000)
+ */
+import { execFileSync } from 'node:child_process';
+import { appendFileSync } from 'node:fs';
+import { reconcile } from './reconcile.js';
+import { parseDeliveredFromLog } from './parse-delivered.js';
+import { parseProcessedFromTempo, receiptsIfComplete } from './parse-processed.js';
+const MIN = 60_000;
+/** `gh run list` page cap. If the router bursts past this in the lookback
+ *  window the DELIVERED set truncates silently (Pattern A on the delivered
+ *  side: a missed delivery → its receipt is never looked for → a real drop is
+ *  silently NOT flagged). We WARN when the cap is hit so the symmetry is
+ *  observable, but don't abort: delivered-side truncation under-reports drops
+ *  (false negative), which is far less harmful than the processed-side
+ *  truncation that would over-report them (false incident). */
+const RUN_LIST_LIMIT = 100;
+const envStr = (name, def) => {
+    const v = process.env[name];
+    return v !== undefined && v.length > 0 ? v : def;
+};
+const REPO = envStr('RECONCILER_REPO', process.env['GITHUB_REPOSITORY'] ?? '');
+const ROUTER_WORKFLOW = envStr('ROUTER_WORKFLOW', 'agent-router.yml');
+const TEMPO = envStr('TEMPO_QUERY_ENDPOINT', 'http://127.0.0.1:13200').replace(/\/+$/, '');
+const OPEN_THRESHOLD_MS = Number(envStr('OPEN_THRESHOLD_MIN', '15')) * MIN;
+const LOOKBACK_MS = Number(envStr('LOOKBACK_MIN', '120')) * MIN;
+const TEMPO_LIMIT = Number(envStr('TEMPO_LIMIT', '1000'));
+/** Deployment-boundary cutoff (groundnuty/macf#444): ignore routes delivered
+ *  before the receipt mechanism went live. Set `RECONCILER_SINCE` to the
+ *  hook-go-live time (ISO-8601 e.g. `2026-06-06T08:00:00Z`, or epoch ms) at the
+ *  first `RECONCILER_ENABLED` flip — pre-deployment routes had no marker + hit
+ *  hookless sessions, so their missing receipt is EXPECTED, not a drop. Without
+ *  it, run 1 would false-alarm on every pre-deployment route in the lookback.
+ *  Self-obsoletes once LOOKBACK_MIN slides fully past the cutoff; unset ⇒ no cutoff. */
+const SINCE_MS = (() => {
+    const raw = process.env['RECONCILER_SINCE'];
+    if (!raw)
+        return undefined;
+    const ms = /^\d+$/.test(raw) ? Number(raw) : Date.parse(raw);
+    if (!Number.isFinite(ms)) {
+        console.error(`WARN: RECONCILER_SINCE="${raw}" not parseable (want ISO-8601 or epoch ms) — ignoring cutoff.`);
+        return undefined;
+    }
+    return ms;
+})();
+/** DELIVERED set: parse the router runs' `Routed … to <AGENT>` success lines.
+ *  `truncated` is true when `gh run list` hit its page cap — the set is then
+ *  UNKNOWABLE (older routes fell off the page), so the caller must NOT flag
+ *  drops this run. Symmetric to the Tempo `tempo_ok` guard, on the delivered
+ *  side (Pattern A): silently bounding coverage would let a real drop hide. */
+function fetchDelivered(nowMs) {
+    const listJson = execFileSync('gh', ['run', 'list', '--repo', REPO, '--workflow', ROUTER_WORKFLOW,
+        '--status', 'success', '--limit', String(RUN_LIST_LIMIT), '--json', 'databaseId,createdAt'], { encoding: 'utf-8' });
+    const runs = JSON.parse(listJson);
+    const truncated = runs.length >= RUN_LIST_LIMIT;
+    if (truncated) {
+        console.error(`WARN: gh run list hit the ${RUN_LIST_LIMIT}-run cap — DELIVERED set UNKNOWABLE (truncated; older routes in the ${LOOKBACK_MS / MIN}-min window fell off the page). Emitting delivered_ok=false (no drops this run); narrow LOOKBACK_MIN or raise the cap.`);
+    }
+    const out = [];
+    for (const run of runs) {
+        const createdMs = Date.parse(run.createdAt);
+        if (!Number.isFinite(createdMs) || nowMs - createdMs > LOOKBACK_MS)
+            continue;
+        let log;
+        try {
+            log = execFileSync('gh', ['run', 'view', String(run.databaseId), '--repo', REPO, '--log'], { encoding: 'utf-8', maxBuffer: 64 * 1024 * 1024 });
+        }
+        catch {
+            continue; // log unavailable (expired/in-progress) — skip, not a drop signal
+        }
+        out.push(...parseDeliveredFromLog(log, String(run.databaseId), createdMs));
+    }
+    return { routes: out, truncated };
+}
+/** PROCESSED set: Tempo `turn_processed` spans → receipts. Aborts (no drops)
+ *  on a Tempo failure to avoid false alarms (Pattern A). */
+async function fetchProcessed(nowMs) {
+    const startSec = Math.floor((nowMs - LOOKBACK_MS) / 1000);
+    const endSec = Math.floor(nowMs / 1000);
+    const q = '{name="turn_processed" && resource.service.namespace="macf"} | select(span.routed_run_id, span.agent)';
+    const url = `${TEMPO}/api/search?q=${encodeURIComponent(q)}&start=${startSec}&end=${endSec}&limit=${TEMPO_LIMIT}`;
+    let res;
+    try {
+        res = await fetch(url, { signal: AbortSignal.timeout(15_000) });
+    }
+    catch (err) {
+        // `fetch failed` is opaque — the real cause is in err.cause (undici wraps
+        // the network error). Surface its code + message so an unreachable Tempo
+        // tells us WHICH failure: ENOTFOUND (MagicDNS not resolving the host),
+        // ECONNREFUSED (nothing listening / not `tailscale serve`d), ETIMEDOUT
+        // (ACL/firewall dropping the connection), or AbortError (15s timeout).
+        const e = err;
+        const detail = e.cause?.code ?? e.cause?.message ?? e.name;
+        console.error(`WARN: Tempo query unreachable [${url}] — ${e.message}: ${detail} — PROCESSED set unknowable; NOT flagging drops this run (avoids false alarms).`);
+        return null;
+    }
+    if (!res.ok) {
+        console.error(`WARN: Tempo query HTTP ${res.status} — PROCESSED set unknowable; treating as Tempo-unknown (true no-op this run).`);
+        return null;
+    }
+    const parsed = parseProcessedFromTempo(await res.json());
+    // Truncation is a Tempo problem too: an incomplete PROCESSED set reads as
+    // missing receipts → false drops. receiptsIfComplete() returns null when the
+    // result hit the limit, so it's handled identically to unreachable/HTTP-error.
+    const receipts = receiptsIfComplete(parsed, TEMPO_LIMIT);
+    if (receipts === null) {
+        console.error(`WARN: Tempo returned ${parsed.traceCount} >= limit ${TEMPO_LIMIT} — PROCESSED set may be truncated (Pattern A: a truncated set reads as missing receipts → false drops). Treating as Tempo-unknown (true no-op this run); narrow LOOKBACK_MIN or raise TEMPO_LIMIT.`);
+    }
+    return receipts;
+}
+async function main() {
+    const nowMs = Date.now();
+    const processed = await fetchProcessed(nowMs);
+    if (processed === null) {
+        // Tempo problem (unreachable / HTTP error / truncated) — the PROCESSED set
+        // is UNKNOWABLE, NOT "zero drops". Emit tempo_ok=false; the workflow gates
+        // BOTH its open AND its self-close steps on tempo_ok==true, so this is a
+        // TRUE no-op: incidents are left exactly as they are (never false-OPEN on a
+        // truncated set, never false-CLOSE a real incident on a transient outage).
+        console.error('WARN: PROCESSED set unknowable this run — emitting tempo_ok=false (no open, no close).');
+        emit({ tempoOk: false, deliveredOk: true, dropsCount: 0, inFlightCount: 0, dropsJson: '[]' });
+        return;
+    }
+    const { routes: delivered, truncated } = fetchDelivered(nowMs);
+    if (truncated) {
+        // DELIVERED set unknowable (truncated) — like a Tempo problem, NOT "zero
+        // drops". Emit delivered_ok=false so the workflow's open AND self-close
+        // steps both no-op this run (each gated on tempo_ok && delivered_ok).
+        emit({ tempoOk: true, deliveredOk: false, dropsCount: 0, inFlightCount: 0, dropsJson: '[]' });
+        return;
+    }
+    const result = reconcile(delivered, processed, { nowMs, openThresholdMs: OPEN_THRESHOLD_MS, sinceMs: SINCE_MS });
+    const dropsJson = JSON.stringify(result.drops);
+    console.error(`reconcile: delivered=${result.deliveredCount} processed=${result.processedCount} ` +
+        `drops=${result.drops.length} in_flight=${result.inFlight.length}` +
+        (SINCE_MS ? ` (since=${new Date(SINCE_MS).toISOString()} — pre-deployment routes excluded)` : ''));
+    if (result.drops.length > 0)
+        console.error(`DROPS: ${dropsJson}`);
+    emit({ tempoOk: true, deliveredOk: true, dropsCount: result.drops.length, inFlightCount: result.inFlight.length, dropsJson });
+}
+function emit(o) {
+    const out = process.env['GITHUB_OUTPUT'];
+    if (!out)
+        return;
+    appendFileSync(out, `tempo_ok=${o.tempoOk}\ndelivered_ok=${o.deliveredOk}\ndrops_count=${o.dropsCount}\nin_flight_count=${o.inFlightCount}\ndrops_json=${o.dropsJson}\n`);
+}
+main().catch((err) => {
+    console.error(`Fatal: ${err instanceof Error ? err.message : String(err)}`);
+    process.exit(1);
+});
+//# sourceMappingURL=run.js.map
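The reconcile step and the Pattern-A gating described in the header comment of the file above can be sketched in plain JavaScript. This is an illustration under stated assumptions, not the package's `reconcile.js` or its workflow (neither is shown in this hunk): the route and receipt field names (`runId`, `agent`, `deliveredMs`, `routedRunId`) and the helper names `reconcileSketch` and `decideIncidentAction` are hypothetical.

```javascript
// Hypothetical sketch: NOT the published reconcile.js. Field names are assumed.
// A route is { runId, agent, deliveredMs }; a receipt is { routedRunId, agent }.
function reconcileSketch(delivered, receipts, { nowMs, openThresholdMs, sinceMs }) {
    const seen = new Set(receipts.map((r) => `${r.routedRunId}|${r.agent}`));
    const drops = [];
    const inFlight = [];
    for (const route of delivered) {
        // Routes delivered before the deployment cutoff never had a receipt hook.
        if (sinceMs !== undefined && route.deliveredMs < sinceMs) continue;
        // A matching receipt means the route was processed.
        if (seen.has(`${route.runId}|${route.agent}`)) continue;
        const ageMs = nowMs - route.deliveredMs;
        // Younger than the threshold: the turn may simply still be running.
        (ageMs > openThresholdMs ? drops : inFlight).push(route);
    }
    return { drops, inFlight, deliveredCount: delivered.length, processedCount: receipts.length };
}

// Workflow-side gating as the comments describe it: act only when both sets are
// known to be complete; otherwise leave incidents exactly as they are.
function decideIncidentAction({ tempoOk, deliveredOk, dropsCount }) {
    if (!tempoOk || !deliveredOk) return 'noop';
    return dropsCount > 0 ? 'open' : 'close';
}
```

With a 15-minute threshold, a route delivered 20 minutes ago without a receipt lands in `drops`, one delivered 5 minutes ago lands in `inFlight`, and a failed or truncated query yields `'noop'` whatever the drop count is.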
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"run.js","sourceRoot":"","sources":["../../src/reconciler/run.ts"],"names":[],"mappings":";AACA;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;GAkCG;AACH,OAAO,EAAE,YAAY,EAAE,MAAM,oBAAoB,CAAC;AAClD,OAAO,EAAE,cAAc,EAAE,MAAM,SAAS,CAAC;AACzC,OAAO,EAAE,SAAS,EAA8C,MAAM,gBAAgB,CAAC;AACvF,OAAO,EAAE,qBAAqB,EAAE,MAAM,sBAAsB,CAAC;AAC7D,OAAO,EAAE,uBAAuB,EAAE,kBAAkB,EAAE,MAAM,sBAAsB,CAAC;AAEnF,MAAM,GAAG,GAAG,MAAM,CAAC;AACnB;;;;;;+DAM+D;AAC/D,MAAM,cAAc,GAAG,GAAG,CAAC;AAC3B,MAAM,MAAM,GAAG,CAAC,IAAY,EAAE,GAAW,EAAU,EAAE;IACnD,MAAM,CAAC,GAAG,OAAO,CAAC,GAAG,CAAC,IAAI,CAAC,CAAC;IAC5B,OAAO,CAAC,KAAK,SAAS,IAAI,CAAC,CAAC,MAAM,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,GAAG,CAAC;AACnD,CAAC,CAAC;AAEF,MAAM,IAAI,GAAG,MAAM,CAAC,iBAAiB,EAAE,OAAO,CAAC,GAAG,CAAC,mBAAmB,CAAC,IAAI,EAAE,CAAC,CAAC;AAC/E,MAAM,eAAe,GAAG,MAAM,CAAC,iBAAiB,EAAE,kBAAkB,CAAC,CAAC;AACtE,MAAM,KAAK,GAAG,MAAM,CAAC,sBAAsB,EAAE,wBAAwB,CAAC,CAAC,OAAO,CAAC,MAAM,EAAE,EAAE,CAAC,CAAC;AAC3F,MAAM,iBAAiB,GAAG,MAAM,CAAC,MAAM,CAAC,oBAAoB,EAAE,IAAI,CAAC,CAAC,GAAG,GAAG,CAAC;AAC3E,MAAM,WAAW,GAAG,MAAM,CAAC,MAAM,CAAC,cAAc,EAAE,KAAK,CAAC,CAAC,GAAG,GAAG,CAAC;AAChE,MAAM,WAAW,GAAG,MAAM,CAAC,MAAM,CAAC,aAAa,EAAE,MAAM,CAAC,CAAC,CAAC;AAC1D;;;;;;wFAMwF;AACxF,MAAM,QAAQ,GAAG,CAAC,GAAuB,EAAE;IACzC,MAAM,GAAG,GAAG,OAAO,CAAC,GAAG,CAAC,kBAAkB,CAAC,CAAC;IAC5C,IAAI,CAAC,GAAG;QAAE,OAAO,SAAS,CAAC;IAC3B,MAAM,EAAE,GAAG,OAAO,CAAC,IAAI,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,MAAM,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,IAAI,CAAC,KAAK,CAAC,GAAG,CAAC,CAAC;IAC7D,IAAI,CAAC,MAAM,CAAC,QAAQ,CAAC,EAAE,CAAC,EAAE,CAAC;QACzB,OAAO,CAAC,KAAK,CAAC,2BAA2B,GAAG,gEAAgE,CAAC,CAAC;QAC9G,OAAO,SAAS,CAAC;IACnB,CAAC;IACD,OAAO,EAAE,CAAC;AACZ,CAAC,CAAC,EAAE,CAAC;AAEL;;;;+EAI+E;AAC/E,SAAS,cAAc,CAAC,KAAa;IACnC,MAAM,QAAQ,GAAG,YAAY,CAC3B,IAAI,EACJ,CAAC,KAAK,EAAE,MAAM,EAAE,QAAQ,EAAE,IAAI,EAAE,YAAY,EAAE,eAAe;QAC5D,UAAU,EAAE,SAAS,EAAE,SAAS,EAAE,MAAM,CAAC,cAAc,CAAC,EAAE,QAAQ,EAAE,sBAAsB,CAAC,EAC5F,EAAE,QAAQ,EAAE,OAAO,EAAE,CACtB,CAAC;IACF,MAAM,IAAI,GAAG,IAAI,CAAC,KAAK,CAAC,QAAQ,CAA6D,CAAC;IAC9F,MAAM,SAAS,GAAG,IAA
I,CAAC,MAAM,IAAI,cAAc,CAAC;IAChD,IAAI,SAAS,EAAE,CAAC;QACd,OAAO,CAAC,KAAK,CAAC,6BAA6B,cAAc,uEAAuE,WAAW,GAAG,GAAG,wHAAwH,CAAC,CAAC;IAC7Q,CAAC;IACD,MAAM,GAAG,GAAqB,EAAE,CAAC;IACjC,KAAK,MAAM,GAAG,IAAI,IAAI,EAAE,CAAC;QACvB,MAAM,SAAS,GAAG,IAAI,CAAC,KAAK,CAAC,GAAG,CAAC,SAAS,CAAC,CAAC;QAC5C,IAAI,CAAC,MAAM,CAAC,QAAQ,CAAC,SAAS,CAAC,IAAI,KAAK,GAAG,SAAS,GAAG,WAAW;YAAE,SAAS;QAC7E,IAAI,GAAW,CAAC;QAChB,IAAI,CAAC;YACH,GAAG,GAAG,YAAY,CAAC,IAAI,EAAE,CAAC,KAAK,EAAE,MAAM,EAAE,MAAM,CAAC,GAAG,CAAC,UAAU,CAAC,EAAE,QAAQ,EAAE,IAAI,EAAE,OAAO,CAAC,EACvF,EAAE,QAAQ,EAAE,OAAO,EAAE,SAAS,EAAE,EAAE,GAAG,IAAI,GAAG,IAAI,EAAE,CAAC,CAAC;QACxD,CAAC;QAAC,MAAM,CAAC;YACP,SAAS,CAAC,kEAAkE;QAC9E,CAAC;QACD,GAAG,CAAC,IAAI,CAAC,GAAG,qBAAqB,CAAC,GAAG,EAAE,MAAM,CAAC,GAAG,CAAC,UAAU,CAAC,EAAE,SAAS,CAAC,CAAC,CAAC;IAC7E,CAAC;IACD,OAAO,EAAE,MAAM,EAAE,GAAG,EAAE,SAAS,EAAE,CAAC;AACpC,CAAC;AAED;4DAC4D;AAC5D,KAAK,UAAU,cAAc,CAAC,KAAa;IACzC,MAAM,QAAQ,GAAG,IAAI,CAAC,KAAK,CAAC,CAAC,KAAK,GAAG,WAAW,CAAC,GAAG,IAAI,CAAC,CAAC;IAC1D,MAAM,MAAM,GAAG,IAAI,CAAC,KAAK,CAAC,KAAK,GAAG,IAAI,CAAC,CAAC;IACxC,MAAM,CAAC,GAAG,uGAAuG,CAAC;IAClH,MAAM,GAAG,GAAG,GAAG,KAAK,iBAAiB,kBAAkB,CAAC,CAAC,CAAC,UAAU,QAAQ,QAAQ,MAAM,UAAU,WAAW,EAAE,CAAC;IAClH,IAAI,GAAa,CAAC;IAClB,IAAI,CAAC;QACH,GAAG,GAAG,MAAM,KAAK,CAAC,GAAG,EAAE,EAAE,MAAM,EAAE,WAAW,CAAC,OAAO,CAAC,MAAM,CAAC,EAAE,CAAC,CAAC;IAClE,CAAC;IAAC,OAAO,GAAG,EAAE,CAAC;QACb,0EAA0E;QAC1E,yEAAyE;QACzE,uEAAuE;QACvE,uEAAuE;QACvE,uEAAuE;QACvE,MAAM,CAAC,GAAG,GAA8D,CAAC;QACzE,MAAM,MAAM,GAAG,CAAC,CAAC,KAAK,EAAE,IAAI,IAAI,CAAC,CAAC,KAAK,EAAE,OAAO,IAAI,CAAC,CAAC,IAAI,CAAC;QAC3D,OAAO,CAAC,KAAK,CAAC,kCAAkC,GAAG,OAAO,CAAC,CAAC,OAAO,KAAK,MAAM,iFAAiF,CAAC,CAAC;QACjK,OAAO,IAAI,CAAC;IACd,CAAC;IACD,IAAI,CAAC,GAAG,CAAC,EAAE,EAAE,CAAC;QACZ,OAAO,CAAC,KAAK,CAAC,0BAA0B,GAAG,CAAC,MAAM,+EAA+E,CAAC,CAAC;QACnI,OAAO,IAAI,CAAC;IACd,CAAC;IACD,MAAM,MAAM,GAAG,uBAAuB,CAAC,MAAM,GAAG,CAAC,IAAI,EAAE,CAAC,CAAC;IACzD,0EAA0E;IAC1E,6EAA6E;IAC7E,+EAA+E;IAC/E,MAAM,QAAQ,GAAG,kBAAkB,CAAC,MAAM,EAAE,WAAW,CAAC,CAAC;IACzD,IAAI,QAAQ,KAAK,IAAI,EAAE,CAAC;QACtB,
OAAO,CAAC,KAAK,CAAC,wBAAwB,MAAM,CAAC,UAAU,aAAa,WAAW,oMAAoM,CAAC,CAAC;IACvR,CAAC;IACD,OAAO,QAAQ,CAAC;AAClB,CAAC;AAED,KAAK,UAAU,IAAI;IACjB,MAAM,KAAK,GAAG,IAAI,CAAC,GAAG,EAAE,CAAC;IACzB,MAAM,SAAS,GAAG,MAAM,cAAc,CAAC,KAAK,CAAC,CAAC;IAC9C,IAAI,SAAS,KAAK,IAAI,EAAE,CAAC;QACvB,2EAA2E;QAC3E,2EAA2E;QAC3E,yEAAyE;QACzE,4EAA4E;QAC5E,2EAA2E;QAC3E,OAAO,CAAC,KAAK,CAAC,wFAAwF,CAAC,CAAC;QACxG,IAAI,CAAC,EAAE,OAAO,EAAE,KAAK,EAAE,WAAW,EAAE,IAAI,EAAE,UAAU,EAAE,CAAC,EAAE,aAAa,EAAE,CAAC,EAAE,SAAS,EAAE,IAAI,EAAE,CAAC,CAAC;QAC9F,OAAO;IACT,CAAC;IACD,MAAM,EAAE,MAAM,EAAE,SAAS,EAAE,SAAS,EAAE,GAAG,cAAc,CAAC,KAAK,CAAC,CAAC;IAC/D,IAAI,SAAS,EAAE,CAAC;QACd,yEAAyE;QACzE,wEAAwE;QACxE,sEAAsE;QACtE,IAAI,CAAC,EAAE,OAAO,EAAE,IAAI,EAAE,WAAW,EAAE,KAAK,EAAE,UAAU,EAAE,CAAC,EAAE,aAAa,EAAE,CAAC,EAAE,SAAS,EAAE,IAAI,EAAE,CAAC,CAAC;QAC9F,OAAO;IACT,CAAC;IACD,MAAM,MAAM,GAAG,SAAS,CAAC,SAAS,EAAE,SAAS,EAAE,EAAE,KAAK,EAAE,eAAe,EAAE,iBAAiB,EAAE,OAAO,EAAE,QAAQ,EAAE,CAAC,CAAC;IACjH,MAAM,SAAS,GAAG,IAAI,CAAC,SAAS,CAAC,MAAM,CAAC,KAAK,CAAC,CAAC;IAC/C,OAAO,CAAC,KAAK,CACX,wBAAwB,MAAM,CAAC,cAAc,cAAc,MAAM,CAAC,cAAc,GAAG;QACnF,SAAS,MAAM,CAAC,KAAK,CAAC,MAAM,cAAc,MAAM,CAAC,QAAQ,CAAC,MAAM,EAAE;QAClE,CAAC,QAAQ,CAAC,CAAC,CAAC,WAAW,IAAI,IAAI,CAAC,QAAQ,CAAC,CAAC,WAAW,EAAE,oCAAoC,CAAC,CAAC,CAAC,EAAE,CAAC,CAClG,CAAC;IACF,IAAI,MAAM,CAAC,KAAK,CAAC,MAAM,GAAG,CAAC;QAAE,OAAO,CAAC,KAAK,CAAC,UAAU,SAAS,EAAE,CAAC,CAAC;IAClE,IAAI,CAAC,EAAE,OAAO,EAAE,IAAI,EAAE,WAAW,EAAE,IAAI,EAAE,UAAU,EAAE,MAAM,CAAC,KAAK,CAAC,MAAM,EAAE,aAAa,EAAE,MAAM,CAAC,QAAQ,CAAC,MAAM,EAAE,SAAS,EAAE,CAAC,CAAC;AAChI,CAAC;AAED,SAAS,IAAI,CAAC,CAA2G;IACvH,MAAM,GAAG,GAAG,OAAO,CAAC,GAAG,CAAC,eAAe,CAAC,CAAC;IACzC,IAAI,CAAC,GAAG;QAAE,OAAO;IACjB,cAAc,CAAC,GAAG,EAAE,YAAY,CAAC,CAAC,OAAO,kBAAkB,CAAC,CAAC,WAAW,iBAAiB,CAAC,CAAC,UAAU,qBAAqB,CAAC,CAAC,aAAa,gBAAgB,CAAC,CAAC,SAAS,IAAI,CAAC,CAAC;AAC5K,CAAC;AAED,IAAI,EAAE,CAAC,KAAK,CAAC,CAAC,GAAG,EAAE,EAAE;IACnB,OAAO,CAAC,KAAK,CAAC,UAAU,GAAG,YAAY,KAAK,CAAC,CAAC,CAAC,GAAG,CAAC,OAAO,CAAC,CAAC,CAAC,MAAM,CAAC,GAAG,CAAC,EAAE,CAAC,CAAC;IAC5E,OAAO,CAAC,IAAI,CA
AC,CAAC,CAAC,CAAC;AAClB,CAAC,CAAC,CAAC"}
|
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@groundnuty/macf",
-  "version": "0.2.
+  "version": "0.2.36",
   "description": "Multi-Agent Coordination Framework CLI — coordinate Claude Code agents via GitHub. Installs as `macf` binary; use `macf init` to set up an agent workspace, `macf update` to refresh rules + version pins.",
   "type": "module",
   "main": "dist/index.js",
@@ -35,7 +35,7 @@
     "test:watch": "vitest"
   },
   "dependencies": {
-    "@groundnuty/macf-core": "0.2.
+    "@groundnuty/macf-core": "0.2.36",
     "commander": "^14.0.3",
     "reflect-metadata": "^0.2.2",
     "zod": "^4.0.0"
@@ -13,7 +13,7 @@ The rules here are topology-agnostic: they work whether the project uses a scien
 1. **The reporter owns the issue closure.** The agent who opened an issue is the only one who closes it. This rule has two failure modes — both costly, both silent. Check for both before posting a merge-handoff comment.

 **Failure mode A — closing an issue you didn't open.** Two ways this happens:
-- *Auto-close via PR keywords.* GitHub's auto-close keywords in a PR body or commit message close the referenced issue on merge, bypassing the reporter. **Never use any of these 9 variants when the issue was filed by someone else:** `Closes #N`, `Fixes #N`, `Resolves #N`, `Close #N`, `Fix #N`, `Resolve #N`, `Closed #N`, `Fixed #N`, `Resolved #N`. Use **`Refs #N`** instead.
+- *Auto-close via PR keywords.* GitHub's auto-close keywords in a PR body or commit message close the referenced issue on merge, bypassing the reporter. **Never use any of these 9 variants when the issue was filed by someone else:** `Closes #N`, `Fixes #N`, `Resolves #N`, `Close #N`, `Fix #N`, `Resolve #N`, `Closed #N`, `Fixed #N`, `Resolved #N`. Use **`Refs #N`** instead. The parser is negation- and context-blind (tables, checklists, quotes, and code fences do *not* shield the keyword), so this is now backstopped structurally: the `check-close-keyword.sh` PreToolUse hook (groundnuty/macf#431) intercepts `gh pr create` / `gh pr edit` whose body or title carries a close-keyword adjacent to a `#N` (or `owner/repo#N`) filed by another agent, and blocks with `exit 2`. Override with `MACF_SKIP_CLOSE_CHECK=1` for a deliberate cross-fix or your own issue.
 - *Manual close via `gh issue close`.* Don't close someone else's issue even after merging the implementation. Post the handoff comment and stop.

 **Failure mode B — waiting for yourself to close.** When the issue's reporter is YOU (you filed the issue during an audit, a follow-up split-off, or self-observed bug), there is no one else to close it. Don't post `@<other-agent> ready for you to close when verified` — no one is waiting to do that for you. After your PR merges, close the issue yourself with a verification comment. Silent stall otherwise: the queue fills with in-review issues that never clear.
@@ -115,6 +115,27 @@ The rules here are topology-agnostic: they work whether the project uses a scien

 4. **Concise comments** — 1-3 sentences unless detail is needed.

+5. **Inbound review requests are action asks that block the requester — not status FYIs.** Issue Lifecycle rule 2 covers the implementer's OUTBOUND queue ("work through your assigned-label queue without prompting"). This rule covers the symmetric REVIEWER-side INBOUND obligation, which is just as load-bearing and far easier to strand.
+
+**(a) Treat "PR ready: #N" / "ready for review" as a request that blocks the peer's next move.** When a peer posts a review-request comment on a thread, that is not an informational status update you can note-and-defer — it is an action ask directed at you. The LGTM/merge-gate makes this structural: per `pr-discipline.md`, a peer's PR cannot merge without a reviewer's **formal approval** (`gh pr review --approve`), so until you review, the peer is blocked. Not slowed — blocked, and silently, because nothing on their side surfaces "my reviewer never saw this." A stranded review request stalls the peer indefinitely. Pick it up, review honestly, and submit a formal approve / request-changes (not a plain comment — see `pr-discipline.md §"How to submit LGTM"`; only the formal state-change fires the reviewer-notification routing).
+
+**(b) Run an INBOUND review-sweep before declaring idle — especially after surfacing from a long single-threaded task.** Before you say "caught up" / "nothing pending" / "your call" / go idle, sweep your repos for review requests addressed to you. This matters most right after you surface from a deep, single-threaded arc (a long workflow, a multi-step investigation): while you were heads-down on one thread, a review request can have arrived on another and been easy to miss.
+
+# Peer-authored PRs awaiting YOUR review (review not yet given):
+for r in <your-repos>; do
+  gh pr list --repo "$r" --state open \
+    --json number,author,reviewDecision,title \
+    --jq '.[] | select(.reviewDecision == "REVIEW_REQUIRED" or .reviewDecision == null)
+    | select(.author.login != "<your-login>")'
+done
+# Plus: open threads where you were @mentioned and haven't replied.
+
+(The query intentionally surfaces *any* peer-authored PR lacking a review, not only those that formally requested you by name — on a small fleet that breadth is a feature: it catches a peer blocked on review even when the request didn't route to you. Narrow it with a requested-reviewer filter on a larger fleet.)
+
+**Why the sweep is necessary — and why it's mechanism-agnostic.** Routing delivers each notification as a **discrete event at the moment it occurs** — a push to your channel-server, an A2A message — not as a standing inbox you can re-open and re-read later. Once that event has been delivered (or has arrived while you were mid-turn on a different thread), it is not re-presented to you on its own. So a review-request ping that landed while you were deep in another task is gone from the live stream and only recoverable by querying GitHub state directly. This is the **general silent-fallback shape** (see `silent-fallback-hazards.md`): the routing layer reports the ping delivered, but "delivered" does not guarantee "processed by the recipient," and nothing surfaces the gap until the blocked peer escalates. The sweep is the result-invariant check at the reviewer boundary — assert against GitHub state ("are there peer PRs awaiting my review?") rather than trusting that you'd remember every ping that flowed past.
+
+**Verified motivation:** three operator-surfaced stalls where a peer's review request sat idle (42 min in one case; ~2.5 h in another) because the reviewer went idle without sweeping — the ping had arrived during a long single-threaded task and was never picked back up. In each case the peer's PR was blocked the entire time on a formal approval that never came.
+
 ---

 ## When You're Stuck — Escalation
@@ -160,19 +181,7 @@ The goal is correctness through dialogue, not compliance.

 ---

-##
-
-When a hook or script needs to programmatically submit a prompt to a Claude Code TUI running in tmux, **always use the canonical helper**:
-
-.claude/scripts/tmux-send-to-claude.sh <session-or-empty> "<prompt text>"
-
-Pass `""` for the session to target the current pane.
-
-**Never** call `tmux send-keys "<prompt>" Enter` inline. Claude Code's TUI is in multi-line input mode by default, so a single Enter inserts a newline instead of submitting — the prompt sits in the buffer unsubmitted. The helper handles the submit-quirk correctly: clear existing input with `C-u`, send the text with a first Enter, sleep 1 second (load-bearing — without it tmux batches both Enters and Claude processes them atomically as "newline + newline"), then send a second Enter that actually submits.
-
-The helper is distributed to every agent workspace by `macf init` and refreshed by `macf update` (same mechanism as this rules file). If you're writing a new hook or automation that needs to prompt Claude, use the helper — do not re-implement the pattern.
-
-### Canonical tmux launch pattern
+## Canonical tmux launch pattern

 **One session per agent, named `<project>@<agent>`.** Post-v0.2.10, `claude.sh` self-wraps in tmux with this naming structurally — bare `./claude.sh` produces the canonical session. Pre-v0.2.10 consumers (and operators wanting manual launch) use the explicit form:

@@ -117,6 +117,8 @@ Per `groundnuty/macf#244` + `#272`, this rule is enforced by a Claude Code PreTo

 The hook is the same shape as `check-gh-token.sh` (#140 attribution-trap defense) — bash command-type hook distributed via `macf init` / `macf update` / `macf rules refresh` to every workspace's `.claude/scripts/check-mention-routing.sh` with the entry registered in `.claude/settings.json` `hooks.PreToolUse`. Substrate workspaces, tester agents, CV consumers, and future MACF-consumer projects all get the protection uniformly.

+Because the hook is registered as a path-invocation (`"command": "$CLAUDE_PROJECT_DIR/.claude/scripts/check-mention-routing.sh"`), Claude Code execs the script fresh on every event, so a change to the **script body** goes live on the very next event as soon as the file is synced on disk — a `macf update` that updates the script is immediately in force for consumers with no session relaunch and no relaunch-coordination needed. Only a change to the hook **registration** in `.claude/settings.json` (or to the launch environment) requires a relaunch to take effect.
+
 **Heuristic** (subject to refinement; documented for transparency):

 - Already wrapped in backticks (`` `@<bot>[bot]` ``) → routing-suppressed; allowed (canonical describing form §5); does NOT count toward Check A
@@ -4,7 +4,7 @@

 > **Workspaces without full `macf init`** (e.g. `groundnuty/macf` itself, or any Claude Code workspace operated by a bot that isn't a MACF-registered agent) can still get this canonical rule via `macf rules refresh --dir <workspace>`. Same copy, no App credentials or registry required.

-This rule names the CLASS so agents recognize the shape on first encounter rather than re-discovering each instance from scratch.
+This rule names the CLASS so agents recognize the shape on first encounter rather than re-discovering each instance from scratch. Ten active instances are documented below as worked examples spanning different architectural layers (identity, parsing, TUI binding, observability routing, config substitution, multi-agent coordination protocol, metric-instrumentation lifecycle, observability-endpoint routing, release-pipeline-partial-publish, third-party-action retry-exhaustion). (Instance 10 — a legacy substrate-routing receipt-gap — was retired 2026-06-07; its number is kept, not reused.) Nine of ten active instances have structural defenses applied or in flight — the pattern of defense generalizes alongside the pattern of hazard.

 Instance 9 is annotated as **sister-shape** (failure correctly surfaced + partial side-effect breaks retry idempotency) — listed here for cross-reference convenience but warrants a sibling canonical rule (`partial-side-effect-hazards.md`) if more instances surface. The two classes share "multi-step pipeline where consumer assumes atomicity" but the failure surface differs: silent-fallback hides at the API boundary; partial-side-effect surfaces loudly but persists semi-state.

@@ -23,7 +23,7 @@ The trap is that defensive programming targets exit codes, but exit-code success

 ---

-##
+## Known instances

 ### Instance 1 — gh-token attribution traps

@@ -43,7 +43,7 @@ This is a distinct sub-case from the missing-helper / mis-pipefail / wrong-prefi
 **Surface:** PR / issue body markdown parsing
 **Failure shape:** `Closes #N` / `Fixes #N` / `Resolves #N` (and lowercase / past-tense variants — 9 forms total) trigger GitHub's auto-close on merge **regardless of surrounding context** — including inside negations ("will NOT close #N"), quotes, hypothetical examples, or AC checklists. Markdown structure is not a shield — the parser scans raw body text, so a **markdown table cell or coordination checklist that DOCUMENTS the closure protocol itself** (e.g. a who-closes-when row reading `| Close #N when ACs satisfied | reporter | … |`) is a high-frequency variant — the very act of writing down reporter-owns-closure fires the premature close it describes. Revert commits inherit the keyword via the default `Revert "..."` wrapping and fire auto-close a second time on the revert merge.
 **Recurrence:** Multiple confirmed incidents; revert-message-keyword-inheritance sub-mode confirmed 2026-04-29; closure-protocol-table-cell sub-mode confirmed 2026-05-20 on `macf#406` (Phase-5 tracking issue auto-closed 2 s after PR #410 merged, attributed to the merger with `commit_id: null`, **premature against an 8-item operator checklist**, and **~2 weeks undetected** — a peer's "stays OPEN" comment 10 min later did not catch it).
|
|
46
|
-
**Canonical defense:** `pr-discipline.md` + `coordination.md §Issue Lifecycle 1` — use `Refs #N` exclusively when issue was filed by someone else; never use any of the 9 auto-close keywords with `#N` regardless of intended context. When reverting, override the default revert message to strip the inherited keyword. **When a PR body must DESCRIBE closure mechanics (tables, checklists, protocol notes), lead with the issue number and a non-keyword verb (`#N is closed by the reporter once ACs met` / `reporter handles closure of #N`) — never place a `close`/`fix`/`resolve` verb immediately before `#N`.** Grep the drafted body for the keyword-then-`#N` pattern before posting.
|
|
46
|
+
**Canonical defense:** `pr-discipline.md` + `coordination.md §Issue Lifecycle 1` — use `Refs #N` exclusively when issue was filed by someone else; never use any of the 9 auto-close keywords with `#N` regardless of intended context. When reverting, override the default revert message to strip the inherited keyword. **When a PR body must DESCRIBE closure mechanics (tables, checklists, protocol notes), lead with the issue number and a non-keyword verb (`#N is closed by the reporter once ACs met` / `reporter handles closure of #N`) — never place a `close`/`fix`/`resolve` verb immediately before `#N`.** Grep the drafted body for the keyword-then-`#N` pattern before posting. **Structural backstop (SHIPPED, groundnuty/macf#431):** the `check-close-keyword.sh` PreToolUse hook now intercepts `gh pr create` / `gh pr edit` whose body or title carries a close-keyword adjacent to a `#N` (or cross-repo `owner/repo#N`) filed by another agent — resolves the referenced issue's author via `gh api` and blocks with `exit 2` unless it is self-filed (override: `MACF_SKIP_CLOSE_CHECK=1`). This moves the defense from grep-discipline to the harness, after 4 self-inflicted recurrences (macf#316/#410/#430) proved cognitive discipline insufficient — same Path-2 promotion as the #140 / #244+#272 / #270 hooks.
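The pre-post grep for the keyword-then-`#N` pattern can be sketched as a small shell function. This is an illustration only, not the shipped `check-close-keyword.sh` hook: the function name `find_close_refs` and the regex are assumptions, and the pattern is a conservative approximation of GitHub's keyword matching (it also flags the `Closes: #N` colon form and cross-repo `owner/repo#N` references). It does not look up who filed the issue, so it flags self-filed issues too.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of macf): reads a drafted PR body on stdin and
# prints each line, with its line number, where one of the 9 auto-close
# keywords (close/closes/closed, fix/fixes/fixed, resolve/resolves/resolved)
# is followed by #N or owner/repo#N.
# Exit status follows grep: 0 if any match was found, 1 if the body is clean.
find_close_refs() {
  grep -Ein '(^|[^[:alnum:]])(close[sd]?|fix(e[sd])?|resolve[sd]?):?[[:space:]]+([[:alnum:]_.-]+/[[:alnum:]_.-]+)?#[0-9]+'
}
```

A possible use before posting, with `pr-body.md` as a placeholder file name: `find_close_refs < pr-body.md && echo "reword before posting"`. Negations, quotes and table cells are flagged on purpose, since the rule above is that surrounding context does not shield the keyword.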
|
|
47
47
|
|
|
48
48
|
### Instance 3 — Remote Control IPC blocking tmux send-keys
|
|
49
49
|
|
|
@@ -51,10 +51,10 @@ This is a distinct sub-case from the missing-helper / mis-pipefail / wrong-prefi
|
|
|
51
51
|
**Failure shape:** `tmux send-keys` exits 0 + keystrokes are written to pane stdin, but Claude Code's input handler is bound to a different IPC channel (RC's SDK socket); routing-via-tmux silently bypasses the actual input path → recipient never sees the routed prompt.
|
|
52
52
|
**Recurrence:** Cross-agent triangulated; 2+ confirmed firings on real routes hours apart, same shape.
|
|
53
53
|
**Defense status:** Two-tier per fleet class:
|
|
54
|
-
- **Consumer fleet** (CV agents, tester agents, future macf-init'd consumers):
|
|
54
|
+
- **Consumer fleet** (CV agents, tester agents, future macf-init'd consumers): largely mitigated, not fully eliminated. The routed **message** arrives via the channel-server's HTTP/MCP path (not as a keystroke), so the prompt *content* reaches the agent regardless of RC state — that's the structural win over send-keys routing (DR-020 / macf-actions v3+). The **residual**: the channel-server's `wakeViaTmux` nudge still uses send-keys, so under RC-bound input the auto-wake keystroke may not land — the message sits in the MCP channel until the agent next reads it, rather than being lost. So the hazard is reduced to a wake-latency issue, not a content-drop.
|
|
55
55
|
- **Substrate fleet** (workspaces operated as the design surface, not registered MACF consumers): permanent operational reality — substrate workspaces don't run `macf init`. Defensive posture: rule-discipline + Pattern C fragility detector (`tmux display -p '#{session_activity}'` doesn't advance under RC-bound input).
|
|
56
56
|
|
|
57
|
-
The
|
|
57
|
+
The content-drop is retired for the consumer fleet (message arrives via HTTP/MCP); only the wake-nudge residual remains there. The substrate fleet expects full Instance 3 firings to recur on routes indefinitely; rule-discipline catches the failure at observation time, not pre-emptively.
|
|
58
58
|
|
|
59
59
|
### Instance 4 — Loki / ClickHouse-logs pipeline divergence (label-vs-structured-metadata)
|
|
60
60
|
|
|
@@ -163,6 +163,44 @@ Both classes share "multi-step pipeline where consumer assumes atomicity" but th
|
|
|
163
163
|
|
|
164
164
|
**Codification rationale:** 3 instances across 2 trigger mechanisms + defense pattern stable + operator-witnessed across 2 calendar days (2026-05-18 + 2026-05-19) + cross-agent (instance 1 via science-agent's authoring; instances 2/3 via code-agent's release-cut workflow) — meets all four "When to add a new instance" criteria. The sister-class question (separate `partial-side-effect-hazards.md` canonical rule) is acknowledged inline + deferred to the 5-instance threshold per rule-promotion convention.
|
|
165
165
|
|
|
166
|
+
### Instance 10 — retired (legacy substrate-routing receipt-gap)
|
|
167
|
+
|
|
168
|
+
This instance documented a send-logged ≠ received gap on the legacy Stage-2 substrate routing last-mile, a path the macf **product** does not use: consumers route via the channel-server / A2A path, whose `notify_received` / `mcp_pushed` + OTel receipt spans (`macf.mcp.push`, `macf.tmux_wake.deliver`) already capture receipt. It was removed from canonical on 2026-06-07: legacy-routing operational detail belongs in substrate workbench + git history, not the product rules. The number is retained (not reused) to keep Instances 1–9 + 11 stable as identifiers.
|
|
169
|
+
|
|
170
|
+
---
|
|
171
|
+
|
|
172
|
+
### Instance 11 — Third-party retry-wrapping action exits 0 on internal retry-exhaustion (connect-failure masked as step-success)
|
|
173
|
+
|
|
174
|
+
**Surface:** any consumer-CI step that delegates a connect/auth handshake to a third-party GitHub Action (or wrapper tool) that owns its own internal retry loop — tailnet join (`tailscale/github-action`), OTLP collector connect, cloud-provider auth (`aws-actions/configure-aws-credentials`, `google-github-actions/auth`), registry login, etc. The action retries internally and reports a single aggregate exit code for the step.
|
|
175
|
+
|
|
176
|
+
**Failure shape:** the wrapped tool fails on EVERY internal retry (a hard error, not a transient one — e.g. an invalid-auth `400` that no amount of retrying can fix), the action **exhausts its retry budget and still exits `0`**, and the workflow continues into a broken state. The connect never happened, but the step is green. The real failure then surfaces far downstream as a *different* problem — every later fetch/query against the resource the step was supposed to make reachable fails, and the symptom (timeout, connection-refused, "endpoint unreachable") points the investigator at the downstream consumer, not at the masked upstream connect. The exit-0 step is the last place anyone looks because it's the one place that reported success.
|
|
177
|
+
|
|
178
|
+
**Recurrence:** First confirmed instance — groundnuty/macf#461 (2026-06-07). The e2e workflow's `tailscale/github-action` step was configured with `tags: tag:ci`, but the OAuth client + ACL only permit `tag:ci-runner`. `tailscale up` returned `Status 400 "requested tags [tag:ci] are invalid or not permitted"` on all 5 internal retries; the action exited `0`; the runner never joined the tailnet; every subsequent tailnet fetch failed, presenting as a generic **"Tempo query unreachable."** Because the connect step was *green*, it was the last place examined — the diagnosis cycled through transient-retry, Tempo-serve-config, and DNS-vs-IP hypotheses until reading the connect step's **actual output** (not its exit code) surfaced the `400`. (Two *genuinely separate* upstream bugs — a devbox/Nix-install step-ordering fault [#460] and the Tempo query port not being tailnet-exposed [devops-toolkit #88] — were correctly diagnosed and landed first; they were real fixes, not projections of this masked failure. The exit-0 masking is what hid *this* failure for an extra diagnostic cycle even after those two were resolved.)
|
|
179
|
+
|
|
180
|
+
**Distinct from the sister GHA-surface instances** — do NOT fold this into Instance 5 or Instance 8:
|
|
181
|
+
|
|
182
|
+
| Aspect | Instance 5 (secrets-misnamed) | Instance 8 (OTLP endpoint silent-drop) | Instance 11 (this) |
|---|---|---|---|
| Where the lie originates | empty-string substitution at `${{ }}` expansion, BEFORE the tool runs | exporter silently retries-then-drops, no exit code surfaced at all | third-party action runs, fails every retry, then **exits `0` reporting success** |
| What's wrong | a required input is missing/renamed | a long-lived process points at a dead endpoint | a connect handshake hard-fails but its wrapper claims success |
| Downstream disguise | misleading auth error at the *consuming* step | empty observability surface, no failure signal anywhere | masquerades as a *different* downstream problem (here: "Tempo unreachable") |
|
|
187
|
+
|
|
188
|
+
Instance 5 is "the input was never there"; Instance 8 is "the data went into a void with no signal"; Instance 11 is "the connection step actively *reported success* while having hard-failed." All three live on the GitHub-Actions / CI-plumbing surface but the trust boundary that breaks differs — Instance 11's is specifically *a third party's exit code about its own retry exhaustion*. The tailscale case above is just the worked example: any consumer CI that wraps a connect/auth in a retrying action (tailnet, OTLP, cloud-auth, registry-login) is exposed to the same shape.
|
|
189
|
+
|
|
190
|
+
**Defense status:** SHIPPED (Pattern A result-invariant assert, with a Pattern D precheck flavor) — a **"Verify <resource> is up" step placed immediately after the connect step**, asserting the connection's result-invariant and failing LOUD (red job) when it doesn't hold. Never trust a third-party action's exit code as evidence that its own internal retries succeeded — assert the post-connect state directly.
|
|
191
|
+
|
|
192
|
+
```bash
# After the tailscale connect step (NOT trusting its exit 0):
tailscale status --json | jq -e '.BackendState == "Running"' >/dev/null \
  || { echo "::error::tailnet did not come up — tailscale BackendState != Running."; \
       echo "::error::The connect action may have exhausted retries and exited 0 anyway"; \
       echo "::error::(e.g. tag/ACL mismatch returns Status 400 on every retry)."; \
       tailscale status || true; exit 1; }
echo "✓ tailnet up (BackendState=Running)"
```
|
|
201
|
+
|
|
202
|
+
Generalizes to any retry-wrapping action: assert the result-invariant the connect was *supposed to establish* — `BackendState == "Running"` for a tailnet, a successful authenticated probe for cloud-auth, a `200` from the collector's health endpoint for OTLP — in a dedicated step right after the connect, before any downstream consumer runs. This converts a far-downstream misdiagnosis (the 3-hop red-herring chase in #461) into a fail-at-the-boundary red job pointing directly at the broken connect.
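The generalization above can be sketched as one reusable shell function that wraps any probe command. This is a minimal sketch under stated assumptions, not code from macf: the name `verify_up`, its messages, and the `COLLECTOR_HEALTH_URL` variable are hypothetical, and the probe you pass in must be one that succeeds only when the connect genuinely worked.

```bash
#!/usr/bin/env bash
# Hypothetical helper (name and messages are illustrative, not from macf).
# Runs a probe command that should succeed only if the connect really worked,
# and fails loud with a GitHub Actions error annotation if it does not.
verify_up() {
  local name="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "✓ ${name} up"
  else
    echo "::error::${name} is not up; the connect step may have exhausted its retries and exited 0 anyway."
    return 1
  fi
}

# Example probes, one per step placed right after the matching connect step.
# COLLECTOR_HEALTH_URL is an assumed variable; point it at your collector's
# health endpoint. curl -f treats any HTTP status >= 400 as a failure.
#   verify_up "OTLP collector" curl -fsS --max-time 5 "$COLLECTOR_HEALTH_URL"
#   verify_up "AWS credentials" aws sts get-caller-identity
```

Because the function returns non-zero on a failed probe, calling it as the last command of a dedicated step (or as `verify_up ... || exit 1`) turns the job red at the connect boundary instead of at a downstream consumer.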
|
|
203
|
+
|
|
166
204
|
---
|
|
167
205
|
|
|
168
206
|
## How to recognize the class on first encounter
|
|
@@ -306,12 +344,12 @@ Silent-fallback hazards are **architectural**, not implementation bugs. They eme
|
|
|
306
344
|
|
|
307
345
|
For coordination-system safety analysis: this is a class of hazards multi-agent systems must explicitly defend against. Each new instance teaches the same lesson; the class-name is what makes the lesson transferable across agents.
|
|
308
346
|
|
|
309
|
-
### Defense-pattern emergence (
|
|
347
|
+
### Defense-pattern emergence (9-of-10 active instances have structural defense applied or shipped)
|
|
310
348
|
|
|
311
349
|
| Instance | Surface | Structural defense | Pattern |
|
|
312
350
|
|---|---|---|---|
|
|
313
351
|
| 1 — gh-token attribution traps | `gh` ops + bot tokens | PreToolUse hook + helper-with-fail-loud-prefix-check; expiry sub-case (macf#317) adds in-runner token refresh in macf-channel-server (`token-refresh.ts` + `refresh-aware-client.ts`) — caches token ~50min, force-refreshes on 401 | Pattern B (acquisition) + Pattern A (expiry retry) |
|
|
314
|
-
| 2 — GitHub auto-close negation-blindness | PR/issue body markdown |
|
|
352
|
+
| 2 — GitHub auto-close negation-blindness | PR/issue body markdown | Structural defense SHIPPED — `check-close-keyword.sh` PreToolUse hook (groundnuty/macf#431) blocks `gh pr create`/`edit` carrying a close-keyword adjacent to another agent's `#N` | Pattern B (mitigated) |
|
|
315
353
|
| 3 — Remote Control IPC blocking tmux send-keys | Claude Code TUI input | Two-tier: consumer fleet content-drop structurally retired via channel-server primitive (DR-020 mTLS HTTPS POST), with only the `wakeViaTmux` wake-nudge residual remaining; substrate fleet is a permanent operational reality (defense = rule-discipline + Pattern C fragility detector) | Pattern C deployable as fragility detector |
|
|
316
354
|
| 4 — Loki/CH-logs pipeline divergence | OTLP logs routing | manifest warnings + shape-aware diagnostic | Pattern A |
|
|
317
355
|
| 5 — Workflow secrets-misnamed | GitHub Actions workflow inputs | Workflow precheck step | Pattern D |
|
|
@@ -319,10 +357,11 @@ For coordination-system safety analysis: this is a class of hazards multi-agent
|
|
|
319
357
|
| 7 — OTel-counter cumulative-state vs short-lived-process lifecycle | Metric-instrumentation lifecycle | Two-phase: doc workaround `sum(increase(...))` + OTel SDK delta temporality | Pattern A |
|
|
320
358
|
| 8 — OTLP endpoint silent-drop | Observability-endpoint routing | Five-surface defense: CLI release-discipline + substrate testers env-override + canonical template `:14318` default + cluster-side compat port-map + agent-process `doctor-otel.sh` Pattern A | Pattern A (composite — first multi-architectural-layer case in this rule; instances 1-7 have single-pattern defenses) |
|
|
321
359
|
| 9 — Sigstore TLOG orphans on failed npm publish (sister-class) | npm publish + sigstore attestation pipeline | Three-defense composite: bump-version recovery (DR-022 Amendment L) + pre-flight registry-collision check (Pattern D analog, macf#380) + TLOG-state observability (devops-toolkit#74+#77 Grafana dashboard live) | Pattern D analog (pre-flight precheck) + recovery-procedure-codification |
|
|
360
|
+
| 11 — Third-party retry-wrapping action exits 0 on retry-exhaustion | Consumer-CI connect/auth via third-party action (tailnet, OTLP, cloud-auth, registry-login) | SHIPPED — "Verify <resource> is up" step immediately after the connect asserts the connection's result-invariant (e.g. `tailscale status` `BackendState == "Running"`) + fails LOUD; never trusts the action's exit code about its own retry exhaustion (macf#461) | Pattern A (post-connect result-invariant assert) + Pattern D flavor (precheck-before-downstream) |
|
|
322
361
|
|
|
323
|
-
|
|
362
|
+
Nine of ten active instances have structural defense applied or shipped. Defense patterns (A, B, C, D, E) generalize across instances — they're reusable defense templates, not case-specific fixes. **Pattern A (result-invariant assertion at the boundary) bears the most weight** — it's the structural defense for instances 4, 7, 8, AND 11 (4 of 10), each at a different architectural boundary (logs pipeline, metric counter, observability endpoint, third-party-action connect-verify). Instance 8's five-surface defense topology (consumer canonical + cluster-side compat port-map + concrete Pattern A impl) demonstrates that structural defense at the observability-pipeline-class can compose across architectural layers — the canonical-distribution layer + the cluster-infrastructure layer + the assertion-script layer all reinforce each other rather than substituting for each other. Instance 9 demonstrates that the Pattern D template generalizes from workflow-secrets-prechecks to release-pipeline-prechecks AND that recovery-procedure-codification (DR-022 Amendment L's bump-version-not-tag-retry) is its own defense category — distinct from detection-pre-merge defenses (Patterns A/B/D) and discrimination-at-receiver defenses (Pattern E).
|
|
324
363
|
|
|
325
|
-
The breadth of layers spanned by 5 different defense patterns (identity, parsing, TUI binding, observability routing, config substitution, multi-agent coordination protocol, metric-instrumentation lifecycle, observability-endpoint routing, release-pipeline-partial-publish) is independent evidence that the hazard CLASS is real. If silent-fallback was a single-instance accident, no defense pattern would emerge. **Pattern A's recurrence across 3 different observability boundaries (logs / metrics / endpoint) is the strongest signal that result-invariant assertion is the load-bearing structural-defense template for the entire observability-pipeline-class** of silent fallback.
|
|
364
|
+
The breadth of layers covered (identity, parsing, TUI binding, observability routing, config substitution, multi-agent coordination protocol, metric-instrumentation lifecycle, observability-endpoint routing, release-pipeline-partial-publish, third-party-action retry-exhaustion), spanned by 5 different defense patterns, is independent evidence that the hazard CLASS is real. If silent-fallback were a single-instance accident, no defense pattern would emerge. **Pattern A's recurrence across 3 different observability boundaries (logs / metrics / endpoint) is the strongest signal that result-invariant assertion is the load-bearing structural-defense template for the entire observability-pipeline-class** of silent fallback.
|
|
326
365
|
|
|
327
366
|
---
|
|
328
367
|
|
|
@@ -336,7 +375,7 @@ Add when ALL of the following hold:
|
|
|
336
375
|
|
|
337
376
|
The class-name is what makes the lesson transferable, not multi-agent witness. A single-agent-confirmed instance with a concrete trace + identified defense pattern is sufficient for canonicalization (instances 4, 5, 7, 8 are all single-agent-confirmed). Cross-agent triangulation strengthens the framing but isn't a precondition.
|
|
338
377
|
|
|
339
|
-
Add as a new numbered section
|
|
378
|
+
Add as a new numbered section (the next number is **12** — numbering is append-only; retired instances keep their slot, see Instance 10) with the same fields: Surface / Failure shape / Recurrence / Defense status. Increment the intro paragraph's active-instance count + the Defense-pattern emergence header's `N-of-M active instances` count too.
|
|
340
379
|
|
|
341
380
|
---
|
|
342
381
|
|