@fro.bot/systematic 2.6.0 → 2.6.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,368 @@
1
+ # HITL Review Mode
2
+
3
+ Human-in-the-loop iteration loop for a markdown document shared via Proof. Invoked either by an upstream skill (`ce-brainstorm`, `ce-ideate`, `ce-plan`) handing off a draft it produced, or directly by the user asking to iterate on an existing markdown file they already have on disk ("share this to proof and iterate", "HITL this doc with me"). Mechanics are identical in both cases: upload the local doc, let the user annotate in Proof's web UI, ingest feedback as in-thread replies and tracked edits, and sync the final doc back to disk.
4
+
5
+ This mode assumes a local markdown file exists. There is no "from scratch" entry — if the user wants a fresh doc, create one with the normal proof create workflow first, then invoke HITL.
6
+
7
+ Load this file when HITL review mode is requested — whether by an upstream caller or directly by the user.
8
+
9
+ ---
10
+
11
+ ## Invocation Contract
12
+
13
+ Inputs:
14
+
15
+ - **Source file path** (required): absolute or repo-relative path to the local markdown file. When an upstream caller invokes this mode, it passes the path explicitly. When the user invokes directly ("share that doc to proof and let's iterate"), derive the path from conversation context — the file the user just referenced, created, or edited. If ambiguous, ask the user which file.
16
+ - **Doc title** (required): display title for the Proof doc. Upstream callers pass this explicitly; on direct-user invocation, default to the file's H1 heading, falling back to the filename (minus extension) if no H1 exists.
17
+ - **Recommended next step** (optional, caller-specific): short string the caller wants echoed in the final terminal output (e.g., "Recommended next: `/ce-plan`"). Not used on direct-user invocation — the terminal report simply summarizes the iteration and asks what's next.
18
+
19
+ Agent identity is fixed, not a parameter: every API call uses agent ID `ai:systematic` and display name `Systematic`. Callers do not override this.
20
+
21
+ Return shape (used by upstream callers to resume their handoff; also shown to the user in the terminal when invoked directly):
22
+
23
+ - `status`: `proceeded` | `done_for_now` | `aborted`
24
+ - `localPath`: the source file path (same as input)
25
+ - `localSynced`: `true` if Phase 5 wrote the reviewed doc back to `localPath`; `false` if the user declined the sync and local is stale. Only present on `proceeded`.
26
+ - `docUrl`: the tokenUrl for the Proof doc
27
+ - `openThreadCount`: number of unresolved threads still in the doc
28
+ - `revision`: final doc revision after end-sync (only on `proceeded`)
29
+
30
+ ---
31
+
32
+ ## Phase 1: Upload and Wait
33
+
34
+ 1. Read the local markdown file into memory. Remember this content as `uploadedMarkdown` — Phase 5 compares against it to detect whether anything changed during the session.
35
+ 2. `POST https://www.proofeditor.ai/share/markdown` with `{title, markdown}` → capture `slug`, `accessToken`, `tokenUrl`
36
+ 3. `POST /api/agent/{slug}/presence` with `X-Agent-Id: ai:systematic`, `x-share-token: <token>`, body `{"name":"Systematic","status":"reading","summary":"Uploaded doc for review"}`
37
+ 4. Display prominently in the terminal:
38
+
39
+ ```
40
+ Doc ready for review: <tokenUrl>
41
+ ```
42
+
43
+ 5. Ask the user with the platform's blocking question tool: `question` in OpenCode (call `ToolSearch` with `select:question` first if its schema isn't loaded), `request_user_input` in Codex, `ask_user` in Gemini, `ask_user` in Pi (requires the `pi-ask-user` extension). Fall back to presenting options in chat only when no blocking tool exists in the harness or the call errors (e.g., Codex edit modes) — not because a schema load is required. Never silently skip the question.
44
+
45
+ **Question:** "Highlight text in Proof to leave a comment. The agent will read each one, reply in-thread or apply the fix, then sync changes back to your local file. What's next?"
46
+
47
+ **Options:**
48
+ - **I'm done with feedback — read it and apply**
49
+ - **I have no feedback — proceed**
50
+
51
+ If the user is still reviewing, they leave the prompt open — the blocking question waits naturally. A third "still working" option would be a no-op wrapper for that.
52
+
53
+ On **I have no feedback — proceed**: skip to Phase 5 (end-sync); return to caller with `status: proceeded`.
54
+
55
+ On **I'm done with feedback**: continue to Phase 2.
56
+
57
+ ---
58
+
59
+ ## Phase 2: Ingest Pass
60
+
61
+ A single pass over the current doc state. Deterministic, idempotent, derivable from marks — no session cache, no sidecar state.
62
+
63
+ At the start of the pass, update presence to `status: "acting"` with a short summary like `"Reading your feedback"` so anyone watching the Proof tab sees the agent is live on their comments. Update to `status: "waiting"` before the Phase 3 terminal report so the tab signals "ball is in your court" while the terminal asks for the next signal. Same `POST /presence` call as Phase 1 — just different `status`/`summary`.
64
+
65
+ ### 2.1 Read fresh state
66
+
67
+ ```
68
+ GET /api/agent/{slug}/state
69
+ Headers: x-share-token: <token>
70
+ ```
71
+
72
+ Capture:
73
+ - `markdown` (current body — includes any user direct edits and accepted suggestions)
74
+ - `revision`
75
+ - `marks` (object keyed by markId)
76
+ - `mutationBase.token` — the baseToken required for this round's mutations
77
+
78
+ ### 2.2 Identify marks that need attention
79
+
80
+ Filter `marks` to items where **all** of the following hold:
81
+
82
+ - `by` starts with `human:` (authored by a human, not the agent)
83
+ - `resolved` is `false`
84
+ - Either `thread` has no entry authored by any `ai:*` identity, **OR** the latest entry in `thread` is authored by `human:*` with an `at` timestamp newer than the latest `ai:*` entry (user responded to a prior agent reply)
85
+
86
+ Skip everything else. Agent-authored marks, resolved threads, and threads already replied to with no new human response are done.
87
+
88
+ ### 2.3 Read each mark and decide how to respond
89
+
90
+ The point of HITL is to give the user a natural way to steer the doc without dragging every decision into the terminal. Most feedback can be auto-applied. Only escalate when the agent genuinely can't make a confident call alone.
91
+
92
+ Real feedback blends types — "this is wrong, rename to Y" is both objection and directive; "why X? I'd prefer Z" is both question and suggestion. Don't force a clean classification. Read the comment text, the anchored `quote`, and any prior thread replies, and decide:
93
+
94
+ **Can the agent apply a fix directly with confidence?** Imperatives ("rename X to Y", "remove this", "add a section about Z") usually qualify. Apply the edit, reply with a one-line summary of what changed, resolve.
95
+
96
+ **Is this a question with a clear answer?** Answer in-thread. Resolve if the answer stands on its own. If answering surfaces a new decision the user should weigh in on, leave open and surface it in the terminal report.
97
+
98
+ **Is this a disagreement?** ("this is wrong", "contradicts §2", "this won't work"). Evaluate the claim against current content. If the agent agrees, fix and reply "Agreed — updated to X". If the agent disagrees, reply with the reasoning and leave open. Don't silently apply an objection without evaluating it — the whole point is that the user flagged it *because* they think the plan is wrong.
99
+
100
+ **Is the intent genuinely unclear?** First try: attempt the most reasonable interpretation, apply it, and reply "I read this as X — let me know if I should revert." That's cheaper than a round-trip when stakes are low. Ask for clarification only when the interpretations lead to meaningfully different outcomes. When asking, use the platform's blocking question tool for a quick multiple-choice when the options are discrete, or leave it as an open thread comment when free-form response is more natural. Either way the thread stays open so the next pass picks up the user's reply.
101
+
102
+ **Invariant:** every attention-needing mark ends the pass with an agent reply in its thread. Unreplied = "still to do" — the next pass re-classifies it. This is what makes the loop idempotent without a sidecar: mark state *is* the state. Even when the agent disagrees or can't decide, reply (with reasoning or a question) rather than silently skip.
103
+
104
+ **Parallelize independent thread ops.** `comment.reply` and `comment.resolve` across different marks don't conflict — they touch different thread state and a stale `baseToken` on one doesn't poison another (retry-on-`STALE_BASE` is cheap, per-mark, and local). When there are more than ~3 attention-needing marks that classify as plain replies or resolves, dispatch them in parallel — either via multiple tool calls in one turn or via sub-agents (`Agent`/`Task` in OpenCode, `spawn_agent` in Codex, `subagent` in Pi). Keep block-mutating edits (`suggestion.add`, `/edit/v2`) sequential or batch them through one `/edit/v2` call — concurrent block edits can stale one another's `baseToken` and force retries, and they interact in ways that are easier to reason about as an ordered sequence.
105
+
106
+ ### 2.4 Apply edits
107
+
108
+ The user is collaborating in the doc, not waiting on approval. Every mutation works with live clients — only whole-doc `rewrite.apply` is gated. Pick the tool that matches intent:
109
+
110
+ **Default: `suggestion.add` with `status: "accepted"`** for content changes anchored on a quote (reword, rename, clarify, correct, add a sentence inline). One call creates a tracked suggestion mark *and* commits the change. The user sees committed text (no pending approval needed), and the mark persists as audit trail with per-edit attribution and a one-click reject-to-revert. This is the right primitive for HITL auto-applied edits — it gives the user a reversible trail without asking them to re-review anything.
111
+
112
+ ```json
113
+ {"type":"suggestion.add","kind":"replace","quote":"<anchor>","content":"<new>","by":"ai:systematic","status":"accepted","baseToken":"<token>"}
114
+ ```
115
+
116
+ Use `kind: "insert" | "delete" | "replace"` as appropriate; all three support `status: "accepted"`.
117
+
118
+ **Use `/edit/v2` silently** only when the trail is actively wrong or technically blocked:
119
+
120
+ - **Atomicity is required** — multiple coordinated edits must commit together or not at all (e.g., insert new section + update a reference in another block + delete the obsolete paragraph). `/edit/v2` takes an `operations` array that commits atomically; separate `suggestion.add` calls can partially succeed.
121
+ - **Pre-user self-correction** — the agent is fixing its own output *before* the user has looked at the doc (e.g., spotted a mistake mid-ingest-pass). A tracked mark would imply "there was an old version," which is misleading from the user's perspective.
122
+ - **Pure structural insertion with no quote anchor** — adding an entirely new block/section where no existing text serves as an anchor. `suggestion.add` requires a `quote`; `/edit/v2` has `insert_before` / `insert_after` keyed on block `ref`.
123
+ - **Structural list-item or block removal** — `suggestion.add` with `kind: "delete"` only deletes the text inside a list item; the bullet marker (`*`, `-`, or numeric `1.`) stays behind as an orphan line. Use `/edit/v2 delete_block` to remove an entire block, or `find_replace_in_block` to splice out the item plus its surrounding whitespace cleanly.
124
+
125
+ ```bash
126
+ # Get snapshot for block refs + baseToken
127
+ curl -s "https://www.proofeditor.ai/api/agent/{slug}/snapshot" -H "x-share-token: <token>"
128
+ # Apply
129
+ curl -X POST "https://www.proofeditor.ai/api/agent/{slug}/edit/v2" \
130
+ -H "Content-Type: application/json" -H "x-share-token: <token>" \
131
+ -H "X-Agent-Id: ai:systematic" -H "Idempotency-Key: <uuid>" \
132
+ -d '{"by":"ai:systematic","baseToken":"<token>","operations":[...]}'
133
+ ```
134
+
135
+ Per-op body shape (singular `block` for `replace_block`; plural `blocks:[{markdown},...]` for anything that can add content; the server returns 422 on the wrong shape):
136
+
137
+ ```json
138
+ {"op":"replace_block","ref":"b8","block":{"markdown":"new content"}}
139
+ {"op":"insert_after","ref":"b3","blocks":[{"markdown":"new block"}]}
140
+ {"op":"insert_before","ref":"b3","blocks":[{"markdown":"new block"}]}
141
+ {"op":"delete_block","ref":"b6"}
142
+ {"op":"find_replace_in_block","ref":"b4","find":"old","replace":"new","occurrence":"first"}
143
+ {"op":"replace_range","fromRef":"b2","toRef":"b5","blocks":[{"markdown":"..."}]}
144
+ ```
145
+
146
+ Block `ref` values drift across revisions — re-fetch `/snapshot` for fresh refs before each `/edit/v2` call if any writes have landed since the last snapshot.
147
+
148
+ **Bulk mechanical sweep — prefer one `/edit/v2` call over N `suggestion.add` calls.** When a uniform change hits more than ~5 blocks (emdash sweep, terminology rename across a doc, heading-style normalization), batch it as a single `/edit/v2` with many `operations`. One round-trip, one atomic commit, one audit entry — versus N separate ops-endpoint calls, N baseToken reads (without the caching below), and N tracked marks for what is one logical change. Use `suggestion.add` + `accepted` when edits are distinct and anchored (each deserves its own reject-to-revert trail); use `/edit/v2` batch when they're variations of the same mechanical rule.
149
+
150
+ **Use pending `suggestion.add` (no status)** when the change is judgment-sensitive enough that the agent wants explicit user approval before commit — rare in HITL, since the point of auto-applied edits is to reduce round-trips. Most judgment-sensitive cases are better handled by leaving the thread open with a clarifying question.
151
+
152
+ **`rewrite.apply` is not needed during a live review.** It's blocked by `LIVE_CLIENTS_PRESENT` anyway.
153
+
154
+ **Mutation requirements (every write, including replies and resolves):**
155
+
156
+ - Top-level field is `type` on `/ops`; `operations[].op` on `/edit/v2`. Do not mix.
157
+ - Include `baseToken` from `/state.mutationBase.token` (or `/snapshot.mutationBase.token` for `/edit/v2`).
158
+ - Set `by: "ai:systematic"` and header `X-Agent-Id: ai:systematic`.
159
+ - Include an `Idempotency-Key` header (fresh UUID per logical write). Reuse the same key on a proven retry of the same payload; use a new key for a new logical write.
160
+ - Reply: `{"type":"comment.reply","markId":"<id>","by":"ai:systematic","text":"..."}`. Resolve: `{"type":"comment.resolve","markId":"<id>","by":"ai:systematic"}`. Reopen if needed: `{"type":"comment.unresolve", ...}`.
161
+
162
+ **Retry after any error is verify-first, not retry-first.** The Proof API can commit canonically and still return a non-2xx or a 202 with `collab.status: "pending"`; network timeouts can hit after the server has already written. Retrying without verifying is the most common cause of duplicate marks (same comment twice, same suggestion twice) that then need a manual cleanup pass.
163
+
164
+ - On `STALE_BASE` / `BASE_TOKEN_REQUIRED` / `MISSING_BASE` / `INVALID_BASE_TOKEN`: pre-commit, token-related. Re-read `/state`, send the same payload with a fresh `baseToken`, retry once. The `mutate()` helper below auto-retries these.
165
+ - On `ANCHOR_NOT_FOUND` / `ANCHOR_AMBIGUOUS`: pre-commit, but the `quote` no longer matches uniquely. Re-read is not enough; the caller must tighten or regenerate the anchor before retrying. The helper surfaces the error instead of auto-retrying.
166
+ - On `INVALID_OPERATIONS` / `INVALID_REQUEST` / `INVALID_REF` / `INVALID_BLOCK_MARKDOWN` / `INVALID_RANGE` / `INVALID_MARKDOWN` / 422: the payload is wrong. Do not retry — fix the payload and send a new write.
167
+ - On `COLLAB_SYNC_FAILED` / `REWRITE_BARRIER_FAILED` / `PROJECTION_STALE` / `INTERNAL_ERROR` / 5xx / network error / timeout / **202 with `collab.status: "pending"`**: the write may have landed. Re-read `/state`, diff against the intended change (mark exists? suggestion applied? quote replaced?), and only retry if the server did not actually commit it. If the diff shows the write did land, treat the call as successful even though the response said otherwise.
168
+
169
+ **When the loop breaks.** If a mutation keeps failing after a fresh read and a verified-needed retry, or two reads disagree about state, call `POST https://www.proofeditor.ai/api/bridge/report_bug` with the request ID, slug, and raw response body before falling back. Don't silently skip — that loses the audit trail the user is relying on.
170
+
171
+ ---
172
+
173
+ ## Phase 3: Terminal Report
174
+
175
+ Exception-based. Don't replay what the user can already see in the Proof doc — the full reasoning for each thread lives there. The terminal is for the decisions the user needs to make next.
176
+
177
+ Every report covers three things, phrased naturally for the current state:
178
+
179
+ - **What got handled** (e.g., how many comments resolved, any edits auto-applied)
180
+ - **What's still open** — if any escalations remain, each one gets one line of anchored quote plus one line of the agent's reply or question. Fuller context stays in the Proof thread
181
+ - **The doc URL** — always include it; the user may have closed the tab
182
+
183
+ Keep the whole report scannable at a glance. Three common shapes fall out of this naturally:
184
+
185
+ - A clean pass with everything handled collapses to a single line plus the doc URL
186
+ - An escalation pass lists the open threads compactly after a one-line summary of what was handled
187
+ - A pass with no new feedback just notes that and points to the doc
188
+
189
+ Phrase them in whatever voice matches the situation rather than matching a template — "handled 4, 1 still needs you" and "all 5 addressed, doc's ready" are both fine.
190
+
191
+ ---
192
+
193
+ ## Phase 4: Next-Signal Prompt
194
+
195
+ Ask the user with the platform's blocking question tool: `question` in OpenCode (call `ToolSearch` with `select:question` first if its schema isn't loaded), `request_user_input` in Codex, `ask_user` in Gemini, `ask_user` in Pi (requires the `pi-ask-user` extension). Fall back to presenting options in chat only when no blocking tool exists in the harness or the call errors (e.g., Codex edit modes) — not because a schema load is required. Never silently skip the question.
196
+
197
+ **Question:** "Proof review pass done. What's next?"
198
+
199
+ Offer options that cover these intents — use concrete user-facing labels, not agent-internal jargon (no "end-sync", "ingest pass", etc.). Only include the options that fit the current state. Keep labels imperative and third-person (no "I'll" / "I'm" — it is ambiguous in a tool-mediated menu whether the speaker is the user or the agent) and keep the `[short label] — [description]` shape consistent across every option. A "still working, come back later" option is not offered: the blocking question already waits, so that option would be a no-op wrapper.
200
+
201
+ - **Discuss** → `Discuss — walk through the open threads in terminal`
202
+ Talk through open threads in the terminal; the agent echoes decisions back to Proof threads. Only useful when escalations are open.
203
+ - **Proceed** → `Save — save the reviewed doc back to the local file`
204
+ Go to Phase 5 end-sync. If escalations are still open, name that in the label (e.g., `Save with 3 threads still open`) so the user is accepting the tradeoff explicitly instead of via a nested confirm.
205
+ - **Another pass** → `Re-check — look for new comments in Proof`
206
+ Re-read state and re-ingest. Worth offering even after a clean pass, since the user may have added comments while the report rendered.
207
+ - **Done for now** → `Pause — stop without saving`
208
+ Stop without syncing; return to caller with `status: done_for_now`, no end-sync.
209
+
210
+ The sync confirmation happens in Phase 5 regardless of whether threads are open — this step only asks what the user wants next, not whether to overwrite the local file.
211
+
212
+ ---
213
+
214
+ ## Phase 5: End-Sync
215
+
216
+ Runs when the user selects **Proceed**. Before prompting anything, check whether the Proof content actually diverged from what was uploaded — if not, there's nothing to sync and no reason to ask.
217
+
218
+ 1. Fetch current state: `GET /api/agent/{slug}/state` with `x-share-token: <token>`. Save the full response body to a temp file (`$STATE_TMP`) so the markdown bytes can later be streamed to disk without passing through `$(...)` (which would strip trailing newlines). Extract `state.revision` from that file into `$REVISION`. Read `state.markdown` from that file for the comparison in step 2.
219
+
220
+ 2. Compare `state.markdown` to `uploadedMarkdown` (captured in Phase 1).
221
+
222
+ **If identical** — no content changes happened during the session. Skip the sync prompt entirely. Display:
223
+
224
+ ```
225
+ No changes to sync. Local file is unchanged.
226
+ Doc: <tokenUrl>
227
+ ```
228
+
229
+ Set presence `status: completed`, summary `"Review complete, no changes"`. Return to the caller with `status: proceeded`, `localSynced: true` (local matches Proof — no write needed, local is not stale), `revision: <state.revision>`, and the rest of the standard fields.
230
+
231
+ **If different** — continue to step 3.
232
+
233
+ 3. Ask with the platform's blocking question tool: `question` in OpenCode (call `ToolSearch` with `select:question` first if its schema isn't loaded), `request_user_input` in Codex, `ask_user` in Gemini, `ask_user` in Pi (requires the `pi-ask-user` extension). Fall back to presenting options in chat only when no blocking tool exists in the harness or the call errors (e.g., Codex edit modes) — not because a schema load is required. Never silently skip the question.
234
+
235
+ **Question:** "Sync the reviewed doc back to `<localPath>`? Proof has your review changes; local still has the pre-review copy."
236
+
237
+ **Options:**
238
+ - **Yes, sync now** (default, recommended)
239
+ - **Not yet, I'll pull it later** (returns to caller with `localSynced: false`)
240
+
241
+ Why the extra prompt: the user may have started review hours ago and lost track of the local file at stake. A brief confirm makes the file write visible rather than a silent side-effect of clicking Proceed earlier. The caller signals via `localSynced` so downstream workflows can warn that local is stale.
242
+
243
+ 4. On **Yes, sync now**, write the fetched markdown to local — see `Workflow: Pull a Proof Doc to Local` in `SKILL.md`:
244
+
245
+ ```bash
246
+ # $STATE_TMP is the temp file holding the /state response from step 1.
247
+ TMP="${SOURCE}.proof-sync.$$"
248
+ jq -jr '.markdown' "$STATE_TMP" > "$TMP" && mv "$TMP" "$SOURCE"
249
+ rm "$STATE_TMP"
250
+ ```
251
+
252
+ Stream `.markdown` bytes directly from the saved state file with `jq -jr` — do not capture the markdown into a shell variable, since `$(...)` would strip trailing newlines and corrupt the write. `$REVISION` (extracted separately in step 1) is safe to keep as a variable; it's an opaque scalar.
253
+
254
+ On **Not yet**, skip the write (still clean up `$STATE_TMP`).
255
+
256
+ 5. Set presence `status: completed`, summary `"Review synced to <localPath>"` (or `"Review complete, local not updated"` if sync was declined) so the Proof UI shows the loop has finished.
257
+
258
+ 6. Display one of:
259
+
260
+ Synced:
261
+ ```
262
+ Doc synced to <localPath> (revision <N>).
263
+ Doc: <tokenUrl>
264
+ ```
265
+
266
+ Declined:
267
+ ```
268
+ Review complete. Local file kept as-is — pull from Proof when ready.
269
+ Doc: <tokenUrl>
270
+ ```
271
+
272
+ 7. Return to the caller with:
273
+ ```
274
+ status: proceeded
275
+ localPath: <source>
276
+ localSynced: true | false
277
+ docUrl: <tokenUrl>
278
+ openThreadCount: <K>
279
+ revision: <N>
280
+ ```
281
+
282
+ Do **not** delete the Proof doc. It remains the durable review record; the caller's workflow may want to link back to it.
283
+
284
+ ---
285
+
286
+ ## Recipes
287
+
288
+ ### BaseToken-aware mutation
289
+
290
+ Reuse `baseToken` from the most recent `/state` read. Only re-read on `STALE_BASE` / `BASE_TOKEN_REQUIRED`. For an ingest pass this means one `/state` read at Phase 2.1 feeds every subsequent mutation, not N reads for N mutations.
291
+
292
+ Two retry classes, and they behave differently. The helper below only covers the safe class; the ambiguous class needs a caller-supplied verifier because "did this write land?" depends on what the payload was (look for a markId, a quote replacement, a thread reply, etc.).
293
+
294
+ ```bash
295
+ SLUG=<slug>
296
+ TOKEN=<accessToken>
297
+ AGENT_ID=ai:systematic
298
+ BASE=<cached from most recent /state or /snapshot read>
299
+
300
+ mutate() {
301
+ local PAYLOAD="$1" # jq template without baseToken
302
+ local IDEM_KEY BODY RESP CODE
303
+ # Fresh key per logical write (per mutate() call). The same key is then
304
+ # reused below only within the retry of this single logical write — NOT
305
+ # across different payloads, which would make the server collapse
306
+ # distinct writes as duplicates and silently drop later edits.
307
+ IDEM_KEY=$(uuidgen)
308
+ BODY=$(jq -n --arg base "$BASE" --argjson payload "$PAYLOAD" '$payload + {baseToken: $base}')
309
+ RESP=$(curl -s -X POST "https://www.proofeditor.ai/api/agent/$SLUG/ops" \
310
+ -H "Content-Type: application/json" \
311
+ -H "x-share-token: $TOKEN" \
312
+ -H "X-Agent-Id: $AGENT_ID" \
313
+ -H "Idempotency-Key: $IDEM_KEY" \
314
+ -d "$BODY")
315
+ CODE=$(printf '%s' "$RESP" | jq -r '.code // .error // empty')
316
+ # Pre-commit token-related errors — safe to auto-retry with the same
317
+ # payload and a fresh baseToken. Anchor errors (ANCHOR_NOT_FOUND,
318
+ # ANCHOR_AMBIGUOUS) are also pre-commit but require a tighter quote,
319
+ # so they are surfaced instead of auto-retried.
320
+ if [ "$CODE" = "STALE_BASE" ] \
321
+ || [ "$CODE" = "BASE_TOKEN_REQUIRED" ] \
322
+ || [ "$CODE" = "MISSING_BASE" ] \
323
+ || [ "$CODE" = "INVALID_BASE_TOKEN" ]; then
324
+ BASE=$(curl -s "https://www.proofeditor.ai/api/agent/$SLUG/state" \
325
+ -H "x-share-token: $TOKEN" | jq -r '.mutationBase.token')
326
+ BODY=$(jq -n --arg base "$BASE" --argjson payload "$PAYLOAD" '$payload + {baseToken: $base}')
327
+ RESP=$(curl -s -X POST "https://www.proofeditor.ai/api/agent/$SLUG/ops" \
328
+ -H "Content-Type: application/json" \
329
+ -H "x-share-token: $TOKEN" \
330
+ -H "X-Agent-Id: $AGENT_ID" \
331
+ -H "Idempotency-Key: $IDEM_KEY" \
332
+ -d "$BODY")
333
+ fi
334
+ printf '%s' "$RESP"
335
+ }
336
+ ```
337
+
338
+ The `Idempotency-Key` is minted fresh at the top of each call (one key per logical write). The retry inside the same call reuses that key so the server can collapse a duplicate TCP-level send, but a later `mutate()` for a different payload gets its own key. Minting outside the function would make every call share one key, and the server would treat the second and subsequent writes as duplicates of the first and silently drop them.
339
+
340
+ **Ambiguous failures (anything outside the pre-commit set above — `COLLAB_SYNC_FAILED`, `INTERNAL_ERROR`, 5xx, network timeout, 202 with `collab.status: "pending"`):** do not retry from this helper. Re-read `/state` in the caller, diff the marks/content against the intended change, and only re-issue the write if the diff proves nothing landed. Pattern:
341
+
342
+ ```bash
343
+ # After an ambiguous failure on comment.add with quote "X" and text "Y":
344
+ STATE=$(curl -s "https://www.proofeditor.ai/api/agent/$SLUG/state" \
345
+ -H "x-share-token: $TOKEN")
346
+ LANDED=$(printf '%s' "$STATE" | jq --arg q "X" --arg t "Y" \
347
+ '[.marks[]? | select(.by == "ai:systematic" and .quote == $q and (.thread[0].text // .text) == $t)] | length')
348
+ if [ "$LANDED" -gt 0 ]; then
349
+ echo "Already applied — skipping retry."
350
+ else
351
+ # Safe to retry with a fresh BASE.
352
+ ...
353
+ fi
354
+ ```
355
+
356
+ Match on a payload-identifying field (quote + text for `comment.add`; quote + content for `suggestion.add`; markId + text for `comment.reply`; block ordinal + markdown for `/edit/v2`). When no such invariant is available, prefer leaving the write undone and surfacing it in the terminal report over risking a duplicate.
357
+
358
+ ### jq gotcha when inspecting responses
359
+
360
+ When extracting fields from API responses with jq's `//` alternative operator, parenthesize inside object constructors — jq parses `{markId: .markId // .result.markId}` as a syntax error. Use `{markId: (.markId // .result.markId)}`, or pull the value outside the object: `jq -r '.markId // .result.markId'`.
361
+
362
+ ### Identity
363
+
364
+ All ops must include:
365
+ - `by: "ai:systematic"` in the request body
366
+ - `X-Agent-Id: ai:systematic` in headers (required for presence; recommended for ops for consistent attribution)
367
+
368
+ Display name `Systematic` is bound via `POST /presence` with `{"name":"Systematic", ...}`. Set this once after upload; it carries across subsequent ops.