npm - @fro.bot/systematic - Versions diffs - 2.6.0 → 2.7.0 - Mend

@fro.bot/systematic 2.6.0 → 2.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (35) hide show

package/agents/review/api-contract-reviewer.md +1 -1
package/agents/review/correctness-reviewer.md +1 -1
package/agents/review/data-migrations-reviewer.md +1 -1
package/agents/review/dhh-rails-reviewer.md +1 -1
package/agents/review/julik-frontend-races-reviewer.md +1 -1
package/agents/review/kieran-python-reviewer.md +1 -1
package/agents/review/kieran-rails-reviewer.md +1 -1
package/agents/review/kieran-typescript-reviewer.md +1 -1
package/agents/review/maintainability-reviewer.md +1 -1
package/agents/review/performance-reviewer.md +1 -1
package/agents/review/reliability-reviewer.md +1 -1
package/agents/review/security-reviewer.md +1 -1
package/agents/workflow/bug-reproduction-validator.md +1 -1
package/dist/cli.js +1 -1
package/dist/{index-3h7kpmfa.js → index-k9tdxh0p.js} +1 -1
package/dist/index.d.ts +1 -1
package/dist/index.js +2 -3
package/dist/lib/skills.d.ts +1 -0
package/package.json +1 -1
package/skills/ce-brainstorm/references/handoff.md +127 -0
package/skills/ce-brainstorm/references/requirements-capture.md +243 -0
package/skills/ce-brainstorm/references/universal-brainstorming.md +63 -0
package/skills/ce-ideate/references/post-ideation-workflow.md +240 -0
package/skills/ce-plan/references/deepening-workflow.md +249 -0
package/skills/ce-plan/references/plan-handoff.md +96 -0
package/skills/ce-plan/references/universal-planning.md +114 -0
package/skills/ce-plan/references/visual-communication.md +31 -0
package/skills/ce-work/references/shipping-workflow.md +129 -0
package/skills/ce-work-beta/references/codex-delegation-workflow.md +327 -0
package/skills/ce-work-beta/references/shipping-workflow.md +129 -0
package/skills/compound-docs/SKILL.md +2 -3
package/skills/document-review/references/synthesis-and-presentation.md +406 -0
package/skills/proof/references/hitl-review.md +368 -0
package/skills/writing-systematic-skills/SKILL.md +115 -0
package/skills/writing-systematic-skills/references/foundation-conventions.md +143 -0

package/skills/proof/references/hitl-review.md ADDED Viewed

@@ -0,0 +1,368 @@
+# HITL Review Mode
+Human-in-the-loop iteration loop for a markdown document shared via Proof. Invoked either by an upstream skill (`ce-brainstorm`, `ce-ideate`, `ce-plan`) handing off a draft it produced, or directly by the user asking to iterate on an existing markdown file they already have on disk ("share this to proof and iterate", "HITL this doc with me"). Mechanics are identical in both cases: upload the local doc, let the user annotate in Proof's web UI, ingest feedback as in-thread replies and tracked edits, and sync the final doc back to disk.
+This mode assumes a local markdown file exists. There is no "from scratch" entry — if the user wants a fresh doc, create one with the normal proof create workflow first, then invoke HITL.
+Load this file when HITL review mode is requested — whether by an upstream caller or directly by the user.
+---
+## Invocation Contract
+Inputs:
+- **Source file path** (required): absolute or repo-relative path to the local markdown file. When an upstream caller invokes this mode, it passes the path explicitly. When the user invokes directly ("share that doc to proof and let's iterate"), derive the path from conversation context — the file the user just referenced, created, or edited. If ambiguous, ask the user which file.
+- **Doc title** (required): display title for the Proof doc. Upstream callers pass this explicitly; on direct-user invocation, default to the file's H1 heading, falling back to the filename (minus extension) if no H1 exists.
+- **Recommended next step** (optional, caller-specific): short string the caller wants echoed in the final terminal output (e.g., "Recommended next: `/ce-plan`"). Not used on direct-user invocation — the terminal report simply summarizes the iteration and asks what's next.
+Agent identity is fixed, not a parameter: every API call uses agent ID `ai:systematic` and display name `Systematic`. Callers do not override this.
+Return shape (used by upstream callers to resume their handoff; also shown to the user in the terminal when invoked directly):
+- `status`: `proceeded` | `done_for_now` | `aborted`
+- `localPath`: the source file path (same as input)
+- `localSynced`: `true` if Phase 5 wrote the reviewed doc back to `localPath`; `false` if the user declined the sync and local is stale. Only present on `proceeded`.
+- `docUrl`: the tokenUrl for the Proof doc
+- `openThreadCount`: number of unresolved threads still in the doc
+- `revision`: final doc revision after end-sync (only on `proceeded`)
+---
+## Phase 1: Upload and Wait
+1. Read the local markdown file into memory. Remember this content as `uploadedMarkdown` — Phase 5 compares against it to detect whether anything changed during the session.
+2. `POST https://www.proofeditor.ai/share/markdown` with `{title, markdown}` → capture `slug`, `accessToken`, `tokenUrl`
+3. `POST /api/agent/{slug}/presence` with `X-Agent-Id: ai:systematic`, `x-share-token: <token>`, body `{"name":"Systematic","status":"reading","summary":"Uploaded doc for review"}`
+4. Display prominently in the terminal:
+   ```
+   Doc ready for review: <tokenUrl>
+   ```
+5. Ask the user with the platform's blocking question tool: `question` in OpenCode (call `ToolSearch` with `select:question` first if its schema isn't loaded), `request_user_input` in Codex, `ask_user` in Gemini, `ask_user` in Pi (requires the `pi-ask-user` extension). Fall back to presenting options in chat only when no blocking tool exists in the harness or the call errors (e.g., Codex edit modes) — not because a schema load is required. Never silently skip the question.
+   **Question:** "Highlight text in Proof to leave a comment. The agent will read each one, reply in-thread or apply the fix, then sync changes back to your local file. What's next?"
+   **Options:**
+   - **I'm done with feedback — read it and apply**
+   - **I have no feedback — proceed**
+   If the user is still reviewing, they leave the prompt open — the blocking question waits naturally. A third "still working" option would be a no-op wrapper for that.
+   On **I have no feedback — proceed**: skip to Phase 5 (end-sync); return to caller with `status: proceeded`.
+   On **I'm done with feedback**: continue to Phase 2.
+---
+## Phase 2: Ingest Pass
+A single pass over the current doc state. Deterministic, idempotent, derivable from marks — no session cache, no sidecar state.
+At the start of the pass, update presence to `status: "acting"` with a short summary like `"Reading your feedback"` so anyone watching the Proof tab sees the agent is live on their comments. Update to `status: "waiting"` before the Phase 3 terminal report so the tab signals "ball is in your court" while the terminal asks for the next signal. Same `POST /presence` call as Phase 1 — just different `status`/`summary`.
+### 2.1 Read fresh state
+```
+GET /api/agent/{slug}/state
+Headers: x-share-token: <token>
+```
+Capture:
+- `markdown` (current body — includes any user direct edits and accepted suggestions)
+- `revision`
+- `marks` (object keyed by markId)
+- `mutationBase.token` — the baseToken required for this round's mutations
+### 2.2 Identify marks that need attention
+Filter `marks` to items where **all** of the following hold:
+- `by` starts with `human:` (authored by a human, not the agent)
+- `resolved` is `false`
+- Either `thread` has no entry authored by any `ai:*` identity, **OR** the latest entry in `thread` is authored by `human:*` with an `at` timestamp newer than the latest `ai:*` entry (user responded to a prior agent reply)
+Skip everything else. Agent-authored marks, resolved threads, and threads already replied to with no new human response are done.
+### 2.3 Read each mark and decide how to respond
+The point of HITL is to give the user a natural way to steer the doc without dragging every decision into the terminal. Most feedback can be auto-applied. Only escalate when the agent genuinely can't make a confident call alone.
+Real feedback blends types — "this is wrong, rename to Y" is both objection and directive; "why X? I'd prefer Z" is both question and suggestion. Don't force a clean classification. Read the comment text, the anchored `quote`, and any prior thread replies, and decide:
+**Can the agent apply a fix directly with confidence?** Imperatives ("rename X to Y", "remove this", "add a section about Z") usually qualify. Apply the edit, reply with a one-line summary of what changed, resolve.
+**Is this a question with a clear answer?** Answer in-thread. Resolve if the answer stands on its own. If answering surfaces a new decision the user should weigh in on, leave open and surface it in the terminal report.
+**Is this a disagreement?** ("this is wrong", "contradicts §2", "this won't work"). Evaluate the claim against current content. If the agent agrees, fix and reply "Agreed — updated to X". If the agent disagrees, reply with the reasoning and leave open. Don't silently apply an objection without evaluating it — the whole point is that the user flagged it *because* they think the plan is wrong.
+**Is the intent genuinely unclear?** First try: attempt the most reasonable interpretation, apply it, and reply "I read this as X — let me know if I should revert." That's cheaper than a round-trip when stakes are low. Ask for clarification only when the interpretations lead to meaningfully different outcomes. When asking, use the platform's blocking question tool for a quick multiple-choice when the options are discrete, or leave it as an open thread comment when free-form response is more natural. Either way the thread stays open so the next pass picks up the user's reply.
+**Invariant:** every attention-needing mark ends the pass with an agent reply in its thread. Unreplied = "still to do" — the next pass re-classifies it. This is what makes the loop idempotent without a sidecar: mark state *is* the state. Even when the agent disagrees or can't decide, reply (with reasoning or a question) rather than silently skip.
+**Parallelize independent thread ops.** `comment.reply` and `comment.resolve` across different marks don't conflict — they touch different thread state and a stale `baseToken` on one doesn't poison another (retry-on-`STALE_BASE` is cheap, per-mark, and local). When there are more than ~3 attention-needing marks that classify as plain replies or resolves, dispatch them in parallel — either via multiple tool calls in one turn or via sub-agents (`Agent`/`Task` in OpenCode, `spawn_agent` in Codex, `subagent` in Pi). Keep block-mutating edits (`suggestion.add`, `/edit/v2`) sequential or batch them through one `/edit/v2` call — concurrent block edits can stale one another's `baseToken` and force retries, and they interact in ways that are easier to reason about as an ordered sequence.
+### 2.4 Apply edits
+The user is collaborating in the doc, not waiting on approval. Every mutation works with live clients — only whole-doc `rewrite.apply` is gated. Pick the tool that matches intent:
+**Default: `suggestion.add` with `status: "accepted"`** for content changes anchored on a quote (reword, rename, clarify, correct, add a sentence inline). One call creates a tracked suggestion mark *and* commits the change. The user sees committed text (no pending approval needed), and the mark persists as audit trail with per-edit attribution and a one-click reject-to-revert. This is the right primitive for HITL auto-applied edits — it gives the user a reversible trail without asking them to re-review anything.
+```json
+{"type":"suggestion.add","kind":"replace","quote":"<anchor>","content":"<new>","by":"ai:systematic","status":"accepted","baseToken":"<token>"}
+```
+Use `kind: "insert" | "delete" | "replace"` as appropriate; all three support `status: "accepted"`.
+**Use `/edit/v2` silently** only when the trail is actively wrong or technically blocked:
+- **Atomicity is required** — multiple coordinated edits must commit together or not at all (e.g., insert new section + update a reference in another block + delete the obsolete paragraph). `/edit/v2` takes an `operations` array that commits atomically; separate `suggestion.add` calls can partially succeed.
+- **Pre-user self-correction** — the agent is fixing its own output *before* the user has looked at the doc (e.g., spotted a mistake mid-ingest-pass). A tracked mark would imply "there was an old version," which is misleading from the user's perspective.
+- **Pure structural insertion with no quote anchor** — adding an entirely new block/section where no existing text serves as an anchor. `suggestion.add` requires a `quote`; `/edit/v2` has `insert_before` / `insert_after` keyed on block `ref`.
+- **Structural list-item or block removal** — `suggestion.add` with `kind: "delete"` only deletes the text inside a list item; the bullet marker (`*`, `-`, or numeric `1.`) stays behind as an orphan line. Use `/edit/v2 delete_block` to remove an entire block, or `find_replace_in_block` to splice out the item plus its surrounding whitespace cleanly.
+```bash
+# Get snapshot for block refs + baseToken
+curl -s "https://www.proofeditor.ai/api/agent/{slug}/snapshot" -H "x-share-token: <token>"
+# Apply
+curl -X POST "https://www.proofeditor.ai/api/agent/{slug}/edit/v2" \
+  -H "Content-Type: application/json" -H "x-share-token: <token>" \
+  -H "X-Agent-Id: ai:systematic" -H "Idempotency-Key: <uuid>" \
+  -d '{"by":"ai:systematic","baseToken":"<token>","operations":[...]}'
+```
+Per-op body shape (singular `block` for `replace_block`; plural `blocks:[{markdown},...]` for anything that can add content; the server returns 422 on the wrong shape):
+```json
+{"op":"replace_block","ref":"b8","block":{"markdown":"new content"}}
+{"op":"insert_after","ref":"b3","blocks":[{"markdown":"new block"}]}
+{"op":"insert_before","ref":"b3","blocks":[{"markdown":"new block"}]}
+{"op":"delete_block","ref":"b6"}
+{"op":"find_replace_in_block","ref":"b4","find":"old","replace":"new","occurrence":"first"}
+{"op":"replace_range","fromRef":"b2","toRef":"b5","blocks":[{"markdown":"..."}]}
+```
+Block `ref` values drift across revisions — re-fetch `/snapshot` for fresh refs before each `/edit/v2` call if any writes have landed since the last snapshot.
+**Bulk mechanical sweep — prefer one `/edit/v2` call over N `suggestion.add` calls.** When a uniform change hits more than ~5 blocks (emdash sweep, terminology rename across a doc, heading-style normalization), batch it as a single `/edit/v2` with many `operations`. One round-trip, one atomic commit, one audit entry — versus N separate ops-endpoint calls, N baseToken reads (without the caching below), and N tracked marks for what is one logical change. Use `suggestion.add` + `accepted` when edits are distinct and anchored (each deserves its own reject-to-revert trail); use `/edit/v2` batch when they're variations of the same mechanical rule.
+**Use pending `suggestion.add` (no status)** when the change is judgment-sensitive enough that the agent wants explicit user approval before commit — rare in HITL, since the point of auto-applied edits is to reduce round-trips. Most judgment-sensitive cases are better handled by leaving the thread open with a clarifying question.
+**`rewrite.apply` is not needed during a live review.** It's blocked by `LIVE_CLIENTS_PRESENT` anyway.
+**Mutation requirements (every write, including replies and resolves):**
+- Top-level field is `type` on `/ops`; `operations[].op` on `/edit/v2`. Do not mix.
+- Include `baseToken` from `/state.mutationBase.token` (or `/snapshot.mutationBase.token` for `/edit/v2`).
+- Set `by: "ai:systematic"` and header `X-Agent-Id: ai:systematic`.
+- Include an `Idempotency-Key` header (fresh UUID per logical write). Reuse the same key on a proven retry of the same payload; use a new key for a new logical write.
+- Reply: `{"type":"comment.reply","markId":"<id>","by":"ai:systematic","text":"..."}`. Resolve: `{"type":"comment.resolve","markId":"<id>","by":"ai:systematic"}`. Reopen if needed: `{"type":"comment.unresolve", ...}`.
+**Retry after any error is verify-first, not retry-first.** The Proof API can commit canonically and still return a non-2xx or a 202 with `collab.status: "pending"`; network timeouts can hit after the server has already written. Retrying without verifying is the most common cause of duplicate marks (same comment twice, same suggestion twice) that then need a manual cleanup pass.
+- On `STALE_BASE` / `BASE_TOKEN_REQUIRED` / `MISSING_BASE` / `INVALID_BASE_TOKEN`: pre-commit, token-related. Re-read `/state`, send the same payload with a fresh `baseToken`, retry once. The `mutate()` helper below auto-retries these.
+- On `ANCHOR_NOT_FOUND` / `ANCHOR_AMBIGUOUS`: pre-commit, but the `quote` no longer matches uniquely. Re-read is not enough; the caller must tighten or regenerate the anchor before retrying. The helper surfaces the error instead of auto-retrying.
+- On `INVALID_OPERATIONS` / `INVALID_REQUEST` / `INVALID_REF` / `INVALID_BLOCK_MARKDOWN` / `INVALID_RANGE` / `INVALID_MARKDOWN` / 422: the payload is wrong. Do not retry — fix the payload and send a new write.
+- On `COLLAB_SYNC_FAILED` / `REWRITE_BARRIER_FAILED` / `PROJECTION_STALE` / `INTERNAL_ERROR` / 5xx / network error / timeout / **202 with `collab.status: "pending"`**: the write may have landed. Re-read `/state`, diff against the intended change (mark exists? suggestion applied? quote replaced?), and only retry if the server did not actually commit it. If the diff shows the write did land, treat the call as successful even though the response said otherwise.
+**When the loop breaks.** If a mutation keeps failing after a fresh read and a verified-needed retry, or two reads disagree about state, call `POST https://www.proofeditor.ai/api/bridge/report_bug` with the request ID, slug, and raw response body before falling back. Don't silently skip — that loses the audit trail the user is relying on.
+---
+## Phase 3: Terminal Report
+Exception-based. Don't replay what the user can already see in the Proof doc — the full reasoning for each thread lives there. The terminal is for the decisions the user needs to make next.
+Every report covers three things, phrased naturally for the current state:
+- **What got handled** (e.g., how many comments resolved, any edits auto-applied)
+- **What's still open** — if any escalations remain, each one gets one line of anchored quote plus one line of the agent's reply or question. Fuller context stays in the Proof thread
+- **The doc URL** — always include it; the user may have closed the tab
+Keep the whole report scannable at a glance. Three common shapes fall out of this naturally:
+- A clean pass with everything handled collapses to a single line plus the doc URL
+- An escalation pass lists the open threads compactly after a one-line summary of what was handled
+- A pass with no new feedback just notes that and points to the doc
+Phrase them in whatever voice matches the situation rather than matching a template — "handled 4, 1 still needs you" and "all 5 addressed, doc's ready" are both fine.
+---
+## Phase 4: Next-Signal Prompt
+Ask the user with the platform's blocking question tool: `question` in OpenCode (call `ToolSearch` with `select:question` first if its schema isn't loaded), `request_user_input` in Codex, `ask_user` in Gemini, `ask_user` in Pi (requires the `pi-ask-user` extension). Fall back to presenting options in chat only when no blocking tool exists in the harness or the call errors (e.g., Codex edit modes) — not because a schema load is required. Never silently skip the question.
+**Question:** "Proof review pass done. What's next?"
+Offer options that cover these intents — use concrete user-facing labels, not agent-internal jargon (no "end-sync", "ingest pass", etc.). Only include the options that fit the current state. Keep labels imperative and third-person (no "I'll" / "I'm" — it is ambiguous in a tool-mediated menu whether the speaker is the user or the agent) and keep the `[short label] — [description]` shape consistent across every option. A "still working, come back later" option is not offered: the blocking question already waits, so that option would be a no-op wrapper.
+- **Discuss** → `Discuss — walk through the open threads in terminal`
+  Talk through open threads in the terminal; the agent echoes decisions back to Proof threads. Only useful when escalations are open.
+- **Proceed** → `Save — save the reviewed doc back to the local file`
+  Go to Phase 5 end-sync. If escalations are still open, name that in the label (e.g., `Save with 3 threads still open`) so the user is accepting the tradeoff explicitly instead of via a nested confirm.
+- **Another pass** → `Re-check — look for new comments in Proof`
+  Re-read state and re-ingest. Worth offering even after a clean pass, since the user may have added comments while the report rendered.
+- **Done for now** → `Pause — stop without saving`
+  Stop without syncing; return to caller with `status: done_for_now`, no end-sync.
+The sync confirmation happens in Phase 5 regardless of whether threads are open — this step only asks what the user wants next, not whether to overwrite the local file.
+---
+## Phase 5: End-Sync
+Runs when the user selects **Proceed**. Before prompting anything, check whether the Proof content actually diverged from what was uploaded — if not, there's nothing to sync and no reason to ask.
+1. Fetch current state: `GET /api/agent/{slug}/state` with `x-share-token: <token>`. Save the full response body to a temp file (`$STATE_TMP`) so the markdown bytes can later be streamed to disk without passing through `$(...)` (which would strip trailing newlines). Extract `state.revision` from that file into `$REVISION`. Read `state.markdown` from that file for the comparison in step 2.
+2. Compare `state.markdown` to `uploadedMarkdown` (captured in Phase 1).
+   **If identical** — no content changes happened during the session. Skip the sync prompt entirely. Display:
+   ```
+   No changes to sync. Local file is unchanged.
+   Doc: <tokenUrl>
+   ```
+   Set presence `status: completed`, summary `"Review complete, no changes"`. Return to the caller with `status: proceeded`, `localSynced: true` (local matches Proof — no write needed, local is not stale), `revision: <state.revision>`, and the rest of the standard fields.
+   **If different** — continue to step 3.
+3. Ask with the platform's blocking question tool: `question` in OpenCode (call `ToolSearch` with `select:question` first if its schema isn't loaded), `request_user_input` in Codex, `ask_user` in Gemini, `ask_user` in Pi (requires the `pi-ask-user` extension). Fall back to presenting options in chat only when no blocking tool exists in the harness or the call errors (e.g., Codex edit modes) — not because a schema load is required. Never silently skip the question.
+   **Question:** "Sync the reviewed doc back to `<localPath>`? Proof has your review changes; local still has the pre-review copy."
+   **Options:**
+   - **Yes, sync now** (default, recommended)
+   - **Not yet, I'll pull it later** (returns to caller with `localSynced: false`)
+   Why the extra prompt: the user may have started review hours ago and lost track of the local file at stake. A brief confirm makes the file write visible rather than a silent side-effect of clicking Proceed earlier. The caller signals via `localSynced` so downstream workflows can warn that local is stale.
+4. On **Yes, sync now**, write the fetched markdown to local — see `Workflow: Pull a Proof Doc to Local` in `SKILL.md`:
+   ```bash
+   # $STATE_TMP is the temp file holding the /state response from step 1.
+   TMP="${SOURCE}.proof-sync.$$"
+   jq -jr '.markdown' "$STATE_TMP" > "$TMP" && mv "$TMP" "$SOURCE"
+   rm "$STATE_TMP"
+   ```
+   Stream `.markdown` bytes directly from the saved state file with `jq -jr` — do not capture the markdown into a shell variable, since `$(...)` would strip trailing newlines and corrupt the write. `$REVISION` (extracted separately in step 1) is safe to keep as a variable; it's an opaque scalar.
+   On **Not yet**, skip the write (still clean up `$STATE_TMP`).
+5. Set presence `status: completed`, summary `"Review synced to <localPath>"` (or `"Review complete, local not updated"` if sync was declined) so the Proof UI shows the loop has finished.
+6. Display one of:
+   Synced:
+   ```
+   Doc synced to <localPath> (revision <N>).
+   Doc: <tokenUrl>
+   ```
+   Declined:
+   ```
+   Review complete. Local file kept as-is — pull from Proof when ready.
+   Doc: <tokenUrl>
+   ```
+7. Return to the caller with:
+   ```
+   status: proceeded
+   localPath: <source>
+   localSynced: true | false
+   docUrl: <tokenUrl>
+   openThreadCount: <K>
+   revision: <N>
+   ```
+Do **not** delete the Proof doc. It remains the durable review record; the caller's workflow may want to link back to it.
+---
+## Recipes
+### BaseToken-aware mutation
+Reuse `baseToken` from the most recent `/state` read. Only re-read on `STALE_BASE` / `BASE_TOKEN_REQUIRED`. For an ingest pass this means one `/state` read at Phase 2.1 feeds every subsequent mutation, not N reads for N mutations.
+Two retry classes, and they behave differently. The helper below only covers the safe class; the ambiguous class needs a caller-supplied verifier because "did this write land?" depends on what the payload was (look for a markId, a quote replacement, a thread reply, etc.).
+```bash
+SLUG=<slug>
+TOKEN=<accessToken>
+AGENT_ID=ai:systematic
+BASE=<cached from most recent /state or /snapshot read>
+mutate() {
+  local PAYLOAD="$1"  # jq template without baseToken
+  local IDEM_KEY BODY RESP CODE
+  # Fresh key per logical write (per mutate() call). The same key is then
+  # reused below only within the retry of this single logical write — NOT
+  # across different payloads, which would make the server collapse
+  # distinct writes as duplicates and silently drop later edits.
+  IDEM_KEY=$(uuidgen)
+  BODY=$(jq -n --arg base "$BASE" --argjson payload "$PAYLOAD" '$payload + {baseToken: $base}')
+  RESP=$(curl -s -X POST "https://www.proofeditor.ai/api/agent/$SLUG/ops" \
+    -H "Content-Type: application/json" \
+    -H "x-share-token: $TOKEN" \
+    -H "X-Agent-Id: $AGENT_ID" \
+    -H "Idempotency-Key: $IDEM_KEY" \
+    -d "$BODY")
+  CODE=$(printf '%s' "$RESP" | jq -r '.code // .error // empty')
+  # Pre-commit token-related errors — safe to auto-retry with the same
+  # payload and a fresh baseToken. Anchor errors (ANCHOR_NOT_FOUND,
+  # ANCHOR_AMBIGUOUS) are also pre-commit but require a tighter quote,
+  # so they are surfaced instead of auto-retried.
+  if [ "$CODE" = "STALE_BASE" ] \
+    || [ "$CODE" = "BASE_TOKEN_REQUIRED" ] \
+    || [ "$CODE" = "MISSING_BASE" ] \
+    || [ "$CODE" = "INVALID_BASE_TOKEN" ]; then
+    BASE=$(curl -s "https://www.proofeditor.ai/api/agent/$SLUG/state" \
+      -H "x-share-token: $TOKEN" | jq -r '.mutationBase.token')
+    BODY=$(jq -n --arg base "$BASE" --argjson payload "$PAYLOAD" '$payload + {baseToken: $base}')
+    RESP=$(curl -s -X POST "https://www.proofeditor.ai/api/agent/$SLUG/ops" \
+      -H "Content-Type: application/json" \
+      -H "x-share-token: $TOKEN" \
+      -H "X-Agent-Id: $AGENT_ID" \
+      -H "Idempotency-Key: $IDEM_KEY" \
+      -d "$BODY")
+  fi
+  printf '%s' "$RESP"
+}
+```
+The `Idempotency-Key` is minted fresh at the top of each call (one key per logical write). The retry inside the same call reuses that key so the server can collapse a duplicate TCP-level send, but a later `mutate()` for a different payload gets its own key. Minting outside the function would make every call share one key, and the server would treat the second and subsequent writes as duplicates of the first and silently drop them.
+**Ambiguous failures (anything outside the pre-commit set above — `COLLAB_SYNC_FAILED`, `INTERNAL_ERROR`, 5xx, network timeout, 202 with `collab.status: "pending"`):** do not retry from this helper. Re-read `/state` in the caller, diff the marks/content against the intended change, and only re-issue the write if the diff proves nothing landed. Pattern:
+```bash
+# After an ambiguous failure on comment.add with quote "X" and text "Y":
+STATE=$(curl -s "https://www.proofeditor.ai/api/agent/$SLUG/state" \
+  -H "x-share-token: $TOKEN")
+LANDED=$(printf '%s' "$STATE" | jq --arg q "X" --arg t "Y" \
+  '[.marks[]? | select(.by == "ai:systematic" and .quote == $q and (.thread[0].text // .text) == $t)] | length')
+if [ "$LANDED" -gt 0 ]; then
+  echo "Already applied — skipping retry."
+else
+  # Safe to retry with a fresh BASE.
+  ...
+fi
+```
+Match on a payload-identifying field (quote + text for `comment.add`; quote + content for `suggestion.add`; markId + text for `comment.reply`; block ordinal + markdown for `/edit/v2`). When no such invariant is available, prefer leaving the write undone and surfacing it in the terminal report over risking a duplicate.
+### jq gotcha when inspecting responses
+When extracting fields from API responses with jq's `//` alternative operator, parenthesize inside object constructors — jq parses `{markId: .markId // .result.markId}` as a syntax error. Use `{markId: (.markId // .result.markId)}`, or pull the value outside the object: `jq -r '.markId // .result.markId'`.
+### Identity
+All ops must include:
+- `by: "ai:systematic"` in the request body
+- `X-Agent-Id: ai:systematic` in headers (required for presence; recommended for ops for consistent attribution)
+Display name `Systematic` is bound via `POST /presence` with `{"name":"Systematic", ...}`. Set this once after upload; it carries across subsequent ops.

package/skills/writing-systematic-skills/SKILL.md ADDED Viewed

@@ -0,0 +1,115 @@
+---
+name: writing-systematic-skills
+description: Use when creating, editing, auditing, or fixing bundled Systematic skills, especially when authoring SKILL.md files, adding skill reference files, resolving content-integrity frontmatter failures, or deciding which Systematic conventions apply beyond the general writing-skills guidance.
+---
+# Writing Systematic Skills
+Systematic skills are OpenCode-native workflow assets. Use this skill after loading the general `writing-skills` foundation; this file only covers the Systematic-specific delta.
+## When To Use
+Use this skill when you are:
+- Creating a new bundled skill under `skills/`
+- Editing an existing bundled skill's frontmatter or file layout
+- Fixing content-integrity frontmatter or sub-file failures
+- Deciding whether a skill needs references, scripts, assets, or templates
+- Auditing bundled skills for provider-portable defaults
+Do not use this as a replacement for `~/.agents/skills/writing-skills/SKILL.md`. Load that foundation first when authoring or substantially editing skill content.
+## Foundation
+`writing-skills` supplies the authoring discipline: pressure scenarios, clear trigger-oriented descriptions, concise bodies, and reference files only when depth earns its keep.
+Systematic adds repository-specific constraints:
+- Runtime-recognized frontmatter fields are fixed by the skill loader.
+- Skill sub-files live in a small set of conventional directories.
+- Bundled agents must use provider-portable model defaults.
+- Content-integrity enforces the mechanical parts in CI.
+For worked examples and judgment calls, read `references/foundation-conventions.md`.
+## Frontmatter Rules
+Every bundled skill must have YAML frontmatter with:
+- `name` - unprefixed skill identifier. The loader adds the `systematic:` command prefix automatically unless the skill intentionally belongs to another namespace such as `ce:`.
+- `description` - third-person trigger conditions. Prefer `Use when...`; describe when to load the skill, not its internal workflow.
+Optional fields are allowed only when the runtime loader recognizes them:
+| Field | Use |
+|---|---|
+| `argument-hint` | Shows expected invocation arguments. |
+| `disable-model-invocation` | Prevents direct model invocation for dispatcher-style skills. |
+| `allowed-tools` | Declares tool constraints for skill execution. |
+| `license` | Carries skill licensing metadata. |
+| `compatibility` | Notes platform or version compatibility. |
+| `metadata` | String-only metadata map. |
+| `user-invocable` | Marks whether users should invoke the skill directly. |
+| `agent` | Selects a companion agent when the loader supports it. |
+| `model` | Selects a model for skill execution when justified. |
+| `context` | Use `fork` when the skill should run in forked subtask context. |
+| `subtask` | Explicit forked-subtask marker recognized by the runtime. |
+`preconditions` is banned. It has no runtime consumer. Put prerequisite guidance in the skill body instead.
+## File Layout
+The required entry point is:
+```text
+skills/<skill-name>/SKILL.md
+```
+Optional sub-files must live under one of these directories:
+- `references/` - deeper guidance, decision tables, long examples, or API notes
+- `scripts/` - executable helpers an agent can run
+- `assets/` - static files used by the skill
+- `templates/` - reusable stubs or document templates
+Keep the main `SKILL.md` small enough to decide whether and how to proceed. Move heavy detail to `references/`, and cite it with a repo-local path such as `references/foundation-conventions.md` so the sub-file integrity gate can verify it exists.
+## Identity Defaults
+Bundled agents must declare:
+```yaml
+model: inherit
+```
+Hardcoded provider model IDs break users who do not use that provider. Use `model: inherit` unless a future design explicitly proves a provider-specific dependency and documents the tradeoff.
+For agent or API attribution, `ai:systematic` is the machine ID used by Systematic-owned operations, such as Proof's `by` field and `X-Agent-Id` header. It is not a skill cross-reference convention.
+## Validator
+Run the content-integrity gate before shipping skill changes:
+```bash
+bun 'scripts/content-integrity.ts'
+```
+The gate checks:
+- Skill frontmatter is present and uses only runtime-recognized fields.
+- Required `name` and `description` fields are non-empty.
+- Banned frontmatter such as `preconditions` is absent.
+- Bundled agents use `model: inherit`.
+- Skill references to `references/`, `scripts/`, `assets/`, and `templates/` resolve on disk.
+If the gate fails, fix the content rather than broadening the validator unless the runtime loader contract has actually changed.
+## Common Mistakes
+| Mistake | Fix |
+|---|---|
+| Adding a new frontmatter field because it reads well | Add body prose instead, unless the runtime loader consumes the field. |
+| Summarizing the whole workflow in `description` | Describe trigger conditions only. |
+| Adding provider-specific agent models | Use `model: inherit`. |
+| Linking to a non-existent reference file | Create the file or remove the link. |
+| Duplicating the general writing-skills guidance | Link to the foundation and document only the Systematic delta. |

package/skills/writing-systematic-skills/references/foundation-conventions.md ADDED Viewed

@@ -0,0 +1,143 @@
+# Foundation Conventions
+This reference expands the Systematic-specific rules from `SKILL.md`. The mechanical rules are enforced by `bun scripts/content-integrity.ts`; this file explains the judgment calls behind them.
+## Frontmatter
+Systematic skill frontmatter mirrors what the runtime loader actually reads. Do not invent fields for documentation structure. If the loader does not consume the field, put the information in the body.
+| Field | Required | When To Use | Example |
+|---|---:|---|---|
+| `name` | Yes | Every skill. Use the unprefixed skill identifier unless another namespace is intentional. | `name: writing-systematic-skills` |
+| `description` | Yes | Trigger-oriented discovery text. Third person. Prefer `Use when...`. | `description: Use when fixing bundled skill frontmatter failures...` |
+| `argument-hint` | No | The skill accepts meaningful invocation arguments. | `argument-hint: "[path/to/document.md]"` |
+| `disable-model-invocation` | No | Dispatcher or routing skills that should not be directly model-invoked. | `disable-model-invocation: true` |
+| `allowed-tools` | No | The skill needs an explicit tool allowlist. | `allowed-tools: Bash, Read` |
+| `license` | No | Licensing metadata matters for distribution. | `license: MIT` |
+| `compatibility` | No | A platform or version caveat is real and useful. | `compatibility: OpenCode` |
+| `metadata` | No | String-only metadata map. Keep it boring. | `metadata: { owner: systematic }` |
+| `user-invocable` | No | Direct user invocation should be explicitly advertised or suppressed. | `user-invocable: false` |
+| `agent` | No | A loader-supported companion agent is required. | `agent: general` |
+| `model` | No | A skill-level model choice is justified. Prefer inheritance elsewhere. | `model: inherit` |
+| `context` | No | Forked execution is required. `fork` derives subtask behavior at runtime. | `context: fork` |
+| `subtask` | No | Explicit forked-subtask dispatch marker. | `subtask: true` |
+### Required Fields
+`name` and `description` must be present. Empty strings are not valid values. Bare YAML keys such as `name:` parse as null and count as missing.
+### Banned Field
+`preconditions` is banned because it has no runtime consumer. Use a body section instead:
+```markdown
+## Prerequisites
+Only run this skill after the bug is reproduced and the fix has been verified.
+```
+### Runtime-Recognized Forking Fields
+`context: fork` and `subtask: true` are allowed. They are runtime-recognized and must not be treated as idiosyncratic frontmatter.
+Use them only when the skill genuinely needs forked subtask behavior. Do not cargo-cult them onto normal skills.
+### Description Style
+Good descriptions answer: should the agent load this skill now?
+Good:
+```yaml
+description: Use when creating, editing, auditing, or fixing bundled Systematic skills, especially when authoring SKILL.md files or resolving content-integrity frontmatter failures.
+```
+Bad:
+```yaml
+description: This skill explains frontmatter, file layout, validation, and identity defaults for Systematic skills.
+```
+The bad version describes content. The good version describes trigger conditions.
+## File Layout
+Every skill has exactly one entry point:
+```text
+skills/<skill-name>/SKILL.md
+```
+Use sub-files only when the content is too bulky or operationally distinct for the main skill.
+```text
+skills/<skill-name>/
+├── SKILL.md
+├── references/
+│   └── detailed-guidance.md
+├── scripts/
+│   └── helper.mjs
+├── assets/
+│   └── diagram.svg
+└── templates/
+    └── output-template.md
+```
+### Directory Choices
+| Directory | Use For | Do Not Use For |
+|---|---|---|
+| `references/` | Long explanations, decision tables, extended examples | Required setup that must be read before deciding to use the skill |
+| `scripts/` | Executable helpers the agent can run | Pseudocode or prose examples |
+| `assets/` | Static images, fixtures, or other bundled files | Markdown guidance that belongs in `references/` |
+| `templates/` | Fillable documents, prompts, or output stubs | Examples that should not be copied verbatim |
+### Sub-File Links
+When `SKILL.md` mentions a sub-file path, the content-integrity gate verifies it exists. Prefer direct repo-local paths:
+```markdown
+Read `references/foundation-conventions.md` for examples.
+```
+Avoid ambiguous references such as "the conventions doc". They are harder for agents to follow and impossible for the gate to verify.
+## Identity Defaults
+### Bundled Agent Model
+Bundled agents must use:
+```yaml
+model: inherit
+```
+This keeps Systematic provider-portable. Hardcoded provider IDs make an agent unusable for users on other providers and are almost never worth the lock-in.
+If a future agent truly depends on a specific provider, document the constraint in the plan and get explicit review before adding the hardcoded model.
+### Machine ID
+`ai:systematic` is a machine identity string for Systematic-owned operations. Proof uses it as the `by` field on operations and the `X-Agent-Id` header. Keep it lowercase and stable.
+Do not use `ai:systematic` as a skill-reference pattern. Skill and agent references use their own namespaces, such as `systematic:writing-systematic-skills` or `systematic:research:best-practices-researcher`.
+### Public-Facing Voice
+Skill text should be reusable guidance, not a session transcript. Avoid first-person process narration, internal note IDs, and references to how a particular implementation session unfolded.
+## Validator Workflow
+Run:
+```bash
+bun scripts/content-integrity.ts --verbose
+```
+Use the output as the source of truth for cleanup. If a violation surprises you, check the runtime loader before changing the validator:
+- Skill frontmatter rules mirror `src/lib/skills.ts`.
+- YAML parsing behavior comes from `src/lib/frontmatter.ts`.
+- Sub-file directories come from `SUBFILE_DIRECTORY_NAMES` in `scripts/content-integrity.ts`.
+When the runtime contract changes, update the loader and validator together. A validator allow-list that drifts from the loader creates false failures or misses real runtime-invisible fields.