@torsday/omnifocus-mcp 1.3.0 → 1.5.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (4) hide show
  1. package/CHANGELOG.md +205 -22
  2. package/README.md +30 -727
  3. package/dist/index.js +406 -264
  4. package/package.json +11 -4
package/CHANGELOG.md CHANGED
@@ -5,48 +5,231 @@ All notable changes to `@torsday/omnifocus-mcp` will be documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). See [ADR-0011](./docs/adr/0011-versioning-and-stability.md) for the explicit definition of breaking vs additive changes in this project.
6
6
 
7
7
 
8
- ## [1.3.0](https://github.com/torsday/omnifocus-mcp/compare/v1.2.2...v1.3.0) (2026-05-09)
8
+ ## [1.5.3](https://github.com/torsday/omnifocus-mcp/compare/v1.5.2...v1.5.3) (2026-05-11)
9
+
10
+ **Summary** — Third release-pipeline recovery in the 2026-05-10 / 2026-05-11 cycle. **This is the version that actually publishes to npm and Homebrew** after v1.4.0, v1.5.0, v1.5.1, and v1.5.2 all tagged-but-dangled. Ships the **same user-facing payload as v1.5.0 / v1.5.1 / v1.5.2** — the JXA `whose()` pushdowns, field projection (`fields[]`) on every heavy read, `forecast_get` / `review_list_due` projection wins, retry-once on transient transport failures, input-validation hardening, the DESIGN.md split, the AGENTS.md recipes, and so on. If you've been pinned to `v1.3.0` for the entire release cascade, **`npm install @torsday/omnifocus-mcp@latest`** (or `brew upgrade torsday/tap/omnifocus-mcp`) will finally move you forward. Treat the `[1.4.0]` and `[1.5.0]` sections below as the substantive changelog for what's in v1.5.3 from a runtime standpoint; the new entry in this section is the release-pipeline carve-out that let v1.5.3 land.
11
+
12
+ ### Fixed
13
+
14
+ - **`release.yml` integration-gate now soft-fails until [#932](https://github.com/torsday/omnifocus-mcp/issues/932) lands ([#933](https://github.com/torsday/omnifocus-mcp/pull/933) / [#934](https://github.com/torsday/omnifocus-mcp/issues/934))** — the integration-gate step in `release.yml` ("Verify integration suite passed on this commit") has been blocking every release in this cycle. Root cause is host-level: the self-hosted macOS runner's OmniFocus instance is shared with the maintainer's actively-running Claude clients (Desktop, multiple Code sessions); OmniFocus's JXA bridge is single-threaded, and concurrent load from 5+ MCP servers starves the integration suite's seed step at its 60-second timeout. The integration job fails, integration-gate blocks publish, no npm or Homebrew update lands. v1.4.0 (initially) hit this; v1.5.0 was cancelled at Stryker (a different problem, fixed in #908 / #922); v1.5.1 and v1.5.2 both hit it again after the seed-state fixes shipped. Until #932 lands the architectural fix — dedicated CI runner, bridge quiesce coordination, or temporal isolation — this release makes integration-gate emit a `::warning::` annotation (rather than `::error::` + `exit 1`) so the publish can complete. Stryker mutation testing still runs as the release gate; unit tests still run on every PR; integration tests still execute and surface failures, they just don't block publish. **Restore the hard gate the moment #932 closes** — the inline comments at every soft-fail point flag the TEMPORARY status, reference #932, and document the one-line revert. See #934 for the carve-out decision record.
15
+
16
+ ## [1.5.2](https://github.com/torsday/omnifocus-mcp/compare/v1.5.1...v1.5.2) (2026-05-11)
17
+
18
+ **Summary** — Second release-pipeline recovery; ships the **same user-facing payload as v1.5.1 / v1.5.0 / v1.4.0** (none of which actually published to npm or Homebrew) plus three CI-side fixes that close the gaps which kept those earlier tags from landing. If you've been pinned to `v1.3.0` because the npm registry never moved off it across today's release cascade, **this is the version to upgrade to** — `npm install @torsday/omnifocus-mcp@latest` or `brew upgrade torsday/tap/omnifocus-mcp` will pull `1.5.2` directly. Functionally identical to v1.5.1 from a behavior standpoint; the three new entries below are infrastructure-side guards that make the next release land cleanly. Treat the `[1.4.0]` and `[1.5.0]` sections below as the changelog for what's *in* v1.5.2 from a runtime perspective.
19
+
20
+ ### Fixed
21
+
22
+ - **`integration.yml` seeds with `--clean` to wipe accumulated fixture orphans ([#929](https://github.com/torsday/omnifocus-mcp/issues/929) / [#930](https://github.com/torsday/omnifocus-mcp/pull/930))** — `scripts/seed-integration-db.js` had a misleading docstring claiming it removed existing `mcp-fixture:` entities on every call; the cleanup only ran under `--clean`, and `integration.yml`'s seed step was calling the bare command. Result: `mcp-fixture:` orphans accumulated across cancelled / partial runs (today's release-pipeline incident hit 25 stale fixture tags + 2 stale folders before someone noticed), and 13–15 integration tests started failing per run against the polluted DB — enough to block the release-gate check on v1.5.1. This change passes `--clean` from `integration.yml` so every CI run wipes orphans before re-seeding, corrects the docstring to describe both modes accurately (default skip-if-exists vs `--clean` wipe-and-re-seed), and bumps the script's `osascript` timeout from 30s → 60s to give the per-item `.delete()` loop headroom on polluted runners (each Tag deletion triggers an OmniFocus index update). Pairs with the runner-side `ACTIONS_RUNNER_HOOK_JOB_STARTED` hook ([#928](https://github.com/torsday/omnifocus-mcp/issues/928)) for defense in depth — a fresh runner without the workflow flag plumbed through still stays clean.
23
+
24
+ ### Changed
25
+
26
+ - **Release-please PRs gated on polished release notes ([#927](https://github.com/torsday/omnifocus-mcp/issues/927) / [#931](https://github.com/torsday/omnifocus-mcp/pull/931))** — three releases on 2026-05-10 (v1.4.0, v1.5.0, v1.5.1) shipped with raw release-please bot output as their CHANGELOG sections — bulleted Conventional Commit subject lines, no Summary paragraph, no narrative — because the `/release-notes` polish step in [`.claude/commands/release.md`](./.claude/commands/release.md) was skipped during auto-merge. This adds `scripts/verify-release-notes-polish.sh` (run as a `meta-lint.yml::release-notes-polish` job) which fails release-please PRs whose newly-added CHANGELOG section has no `**Summary** —` paragraph and only single-line bot bullets. Heuristic is conservative — false positives (rejecting a polished section) are easier to fix than false negatives (letting bot output through). Escape hatch: add the `release-notes-polish-ack` label to permit a bot-output release for the rare patch where polishing makes no sense. This PR's CHANGELOG section is the first one to trip the gate; it now passes.
27
+
28
+ ### Documentation
29
+
30
+ - **Retroactive polish for the v1.4.0, v1.5.0, v1.5.1 CHANGELOG sections ([#925](https://github.com/torsday/omnifocus-mcp/issues/925) / [#924](https://github.com/torsday/omnifocus-mcp/pull/924))** — rewrites the three previously-shipped raw release-please CHANGELOG entries into the verbose "Summary + context + technical detail + impact" voice the project established with v1.3.0 / v1.0.0. The v1.5.0 entry covers all 13 commits in its range (the bot's original section had only 4 — release-please's config filters out `infra` / `chore` / `fix(ci)` types, so the original CHANGELOG was both unpolished AND incomplete). The v1.4.0 entry covers all 24 commits in its range. Phantom-release banners on v1.4.0 and v1.5.0 point readers at v1.5.1 / v1.5.2 as the publishing vehicles, since the tags themselves are permanent in the repo history but were never actually published to npm or Homebrew.
31
+
32
+ ## [1.5.1](https://github.com/torsday/omnifocus-mcp/compare/v1.5.0...v1.5.1) (2026-05-10)
33
+
34
+ **Summary** — Release-pipeline recovery. v1.4.0 and v1.5.0 were both tagged on 2026-05-10 but neither published to npm or Homebrew: v1.4.0 was blocked by an unrelated integration-test flake at the release-gate check, and v1.5.0 was cancelled mid-Stryker by an overly tight job-level timeout that had landed in the same wave of CI hardening. v1.5.1 corrects that — it ships the **same payload v1.5.0 was meant to ship** (the perf + observability + CI work bundled in `[1.5.0]` below), with one additional fix to the release workflow so the next cut lands cleanly. If you've been pinned to `v1.3.0` because the npm registry never moved off `1.3.0` today, **this is the version to upgrade to** — `npm install @torsday/omnifocus-mcp@latest` or `brew upgrade torsday/tap/omnifocus-mcp` will pull `1.5.1` directly. No content changes vs `v1.5.0`; user-facing behavior is identical.
35
+
36
+ ### Fixed
37
+
38
+ - **CI: release.yml timeout bumped 20m → 35m ([#922](https://github.com/torsday/omnifocus-mcp/pull/922))** — when the bigger CI-hardening sweep added `timeout-minutes` to every workflow job ([#919](https://github.com/torsday/omnifocus-mcp/issues/919) / [#920](https://github.com/torsday/omnifocus-mcp/pull/920)), the 20-minute cap I picked for `release.yml::release` was sized without checking historical wall-times. Stryker mutation testing — the release gate established in [ADR-0017](./docs/adr/0017-mutation-testing-release-gate.md) — takes ~19 minutes alone on the current codebase, and historical successful releases ran 23–26 minutes end-to-end. The 20-minute cap cancelled `v1.5.0` mid-Stryker at 19m22s, leaving the tag and GitHub Release in place but no publish to npm or the Homebrew tap. The new 35-minute cap is sized against the slowest observed successful run plus 9 minutes of headroom; the comment in the workflow records the sizing rationale to discourage future tightening without checking durations. Closes [#921](https://github.com/torsday/omnifocus-mcp/issues/921).
9
39
 
40
+ ## [1.5.0](https://github.com/torsday/omnifocus-mcp/compare/v1.4.0...v1.5.0) (2026-05-10)
41
+
42
+ > **Note:** `v1.5.0` was tagged but never published to npm or Homebrew — its release pipeline was cancelled by a too-tight job timeout. [`v1.5.1`](#151httpsgithubcomtorsdayomnifocus-mcpcomparev150v151-2026-05-10) ships the same payload via the recovered release workflow. Treat this section as the changelog for the **combined** `v1.5.0` → `v1.5.1` release.
43
+
44
+ **Summary** — A release-pipeline + observability hardening release with two perf wins on the consumer side. Most of the changes are CI / infrastructure work — fixes for the same merge-cascade incident that produced [#911](https://github.com/torsday/omnifocus-mcp/pull/911), [#912](https://github.com/torsday/omnifocus-mcp/pull/912), [#916](https://github.com/torsday/omnifocus-mcp/pull/916), [#920](https://github.com/torsday/omnifocus-mcp/pull/920) — but the user-facing payload is **smaller `forecast_get` and `review_list_due` responses**: `forecast_get` no longer duplicates full task objects across `byDate` buckets, and `review_list_due` projects to its documented field set by default (full shape still reachable via `fields: ["*"]`), cutting the weekly-review workflow's response bytes by ~22%. The release-pipeline work removes four recurring friction sources at the source — the hard bundle-size gate that had been bumped 15 times without ever catching a regression, the missing `synchronize` trigger that made `board-sync.yml`'s `pr` check disappear on every PR push, the missing per-job timeouts that let a hung integration test wedge the only macOS runner for hours, and CI sharing that same macOS runner queue with the integration suite. No breaking changes.
10
45
 
11
46
  ### Added
12
47
 
13
- * **observability:** per-tool response-size telemetry ([#778](https://github.com/torsday/omnifocus-mcp/issues/778)) ([79cace2](https://github.com/torsday/omnifocus-mcp/commit/79cace2f8050d26bb73181c6dcd4325fc8a02ad3))
14
- * **tools:** default-truncate task notes in bulk reads ([#775](https://github.com/torsday/omnifocus-mcp/issues/775)) ([57dc1ba](https://github.com/torsday/omnifocus-mcp/commit/57dc1bae461e0630b00ca98fde98e06c377d9acb))
48
+ - **`scripts/README.md` is now generated from registry metadata ([#863](https://github.com/torsday/omnifocus-mcp/pull/863))** — `scripts/generate-scripts-index.ts` walks `scripts/_registry.json` and emits a one-line-per-script index keyed by purpose. The index is regenerated by `pnpm docs:generate`, verified by `pnpm docs:check` (and the new `docs:check-scripts` step in `meta-lint.yml`), and is linguist-generated so it doesn't pollute diffs. Pairs with the `docs/tools.md` / `src/tools/INDEX.md` generators that already covered the source tree — anyone landing in `scripts/` can read one file and understand what's there. Closes [#829](https://github.com/torsday/omnifocus-mcp/issues/829).
49
+
50
+ ### Changed
15
51
 
52
+ - **Bundle-size check is now informational, not a hard gate ([#911](https://github.com/torsday/omnifocus-mcp/pull/911))** — `scripts/check-bundle-size.sh` prints the size every CI run and emits a `::warning::` annotation when the bundle is above the 850 KiB soft threshold, but it does not block the build. The previous hard cap had been bumped 15 times since launch (500 → 850 KiB) without ever catching a real regression; each bump cost a follow-up PR and blocked unrelated work in the meantime. For a Node 24 CLI distributed via npm + Homebrew at this size, the case for a hard gate is materially weaker than the friction it was producing. The tree-shaking / code-splitting work that should let us actually shrink the bundle (rather than periodically bumping the cap) still lives at [#578](https://github.com/torsday/omnifocus-mcp/issues/578) and [#827](https://github.com/torsday/omnifocus-mcp/issues/827); re-arming the hard gate is a one-line revert if it ever earns its keep again.
53
+
54
+ - **CI `build (Node 24)` moved from the self-hosted macOS runner to `ubuntu-latest` ([#916](https://github.com/torsday/omnifocus-mcp/pull/916))** — the build job runs `pnpm typecheck`, `pnpm lint`, `pnpm test` (against the `InMemoryAdapter` with mocked spawners), and `pnpm build` — none of which exercises macOS-specific code. The macOS-dependent surface (`JxaTransport` / `OmniJsTransport` calling `osascript` against live OmniFocus) only fires under `OMNIFOCUS_INTEGRATION=1` in `integration.yml`, which stays on the dedicated `macos-omnifocus` runner. Keeping CI on the self-hosted runner forced it to share queue with the integration suite — so a single hung integration test would block every queued PR's required `build (Node 24)` check for hours. The new `ubuntu-latest` placement is free on public repos and decouples the queues. `verify-no-hosted-runners.sh`'s policy comment and `AGENTS.md`'s runner-policy section are updated to reflect the new layout; the per-line `# allow-hosted` marker preserves the file-level guard against accidental drift.
16
55
 
17
56
  ### Fixed
18
57
 
19
- * **ci:** install shellcheck+actionlint via apt/script on ubuntu-latest ([508f7b2](https://github.com/torsday/omnifocus-mcp/commit/508f7b296ca21f4aaa13a4bf158bd01cc965b418))
20
- * **in-memory:** skip project completedTaskCount bump when task state unchanged ([e5da6e4](https://github.com/torsday/omnifocus-mcp/commit/e5da6e41f15badc0e5b924203379806bc74b513a))
21
- * **jxa:** route task tag mutations through OmniJS to defeat silent no-op ([c0304c5](https://github.com/torsday/omnifocus-mcp/commit/c0304c57a6eca2398fde7330d9eff2d163d79f4b))
22
- * **jxa:** use container() not parent() for tag parent retrieval (OF 4.x) ([ba4abc5](https://github.com/torsday/omnifocus-mcp/commit/ba4abc53e327c31ac92342f0d79cc39dbc3daf84))
23
- * **observability:** hash nested args correctly and survive null/undefined ([dcec35c](https://github.com/torsday/omnifocus-mcp/commit/dcec35c0f948ac5cc771cfcf8b570137097c092c))
24
- * **pagination:** hashFilter must sort nested object keys for stable cursor filterHash ([f315a36](https://github.com/torsday/omnifocus-mcp/commit/f315a368a35880fedda3d88e79b90d4f8c33383d))
25
- * **server:** register recursive zod schemas to unblock tools/list ([1e0a1d5](https://github.com/torsday/omnifocus-mcp/commit/1e0a1d5f39835416190d9899ce6c34d86d0d9fab))
26
- * **webhooks:** register res.on('error') so dispatch never throws upward ([3f988e5](https://github.com/torsday/omnifocus-mcp/commit/3f988e5160fe1e093241288c2e4dd67ab730a3a5))
58
+ - **`board-sync.yml` now fires on `synchronize` ([#906](https://github.com/torsday/omnifocus-mcp/pull/906))** — without this trigger, the required `pr` status check (produced by the `Board sync` workflow) was only attached to a PR's head SHA at the time of `opened`/`reopened`. Every subsequent push to a feature branch left the PR `mergeStateStatus: BLOCKED` despite every other required check being green — the documented workaround was a manual close + reopen. Adding `synchronize` to the trigger types keeps the `pr` check fresh on every push and makes the status-checks gate honest. The job's existing `draft == false` guards still hold, so a synchronize on a draft produces a no-op `pr` check rather than prematurely flipping the linked issue to In Review.
59
+
60
+ - **`integration.yml` heavy job capped at `timeout-minutes: 30` ([#912](https://github.com/torsday/omnifocus-mcp/pull/912))** — a wedged `osascript` or hung integration test no longer holds the only self-hosted macOS runner for the GitHub Actions 6-hour default; it surfaces as a clean failure at 30 minutes and the queue keeps moving. The `ci.yml` build job already capped at 15 minutes for the same reason; mirroring the pattern at 30 minutes gives the integration suite + seeding step headroom over its typical 8–12 minute wall time.
27
61
 
62
+ - **`status: in-progress` label cleared on closed issues** — board-sync drift accumulated during the 2026-05-10 merge wave (8 issues closed without the `status: in-progress` label being stripped); cleared as part of the grooming pass that also flipped 8 closed-but-still-`In Review` project items to `Done`.
28
63
 
29
64
  ### Performance
30
65
 
31
- * **jxa:** scope task_list tagId filter via tag.tasks() to avoid full scan ([a16fe77](https://github.com/torsday/omnifocus-mcp/commit/a16fe77895d5d0ceb93e3798ca7fb0d17fd92793))
32
- * **tools:** elide default-valued fields from heavy read responses ([#774](https://github.com/torsday/omnifocus-mcp/issues/774)) ([7aecd56](https://github.com/torsday/omnifocus-mcp/commit/7aecd564a0efe69e3d9c9a385c1ebbded75ea0fa))
66
+ - **`forecast_get` deduplicates task objects in `byDate` buckets ([#870](https://github.com/torsday/omnifocus-mcp/pull/870))** — previously `forecast_get` returned `byDate[].tasks[]` (full task objects, duplicating data already present in `overdue`/`dueToday`/`deferredToday`/`flagged`); now `byDate[].taskIds[]` returns only IDs, and callers dereference from the top-level arrays. **Breaking shape change** for callers who specifically read `byDate[].tasks[]` — but agents typically iterate the top-level arrays anyway, and the new shape is what the docstring described all along. Pairs with the field-projection support that landed in `v1.4.0` ([#773](https://github.com/torsday/omnifocus-mcp/issues/773)): top-level arrays still go through `applyProjection`, so callers can tune both layers independently. Closes [#794](https://github.com/torsday/omnifocus-mcp/issues/794).
67
+
68
+ - **`review_list_due` projects to documented fields by default ([#873](https://github.com/torsday/omnifocus-mcp/pull/873))** — the response shape has been trimmed to `{ id, name, nextReviewDate, reviewInterval, lastReviewDate, status }` (the field set the docstring described), down from the full project shape that included `tagIds`, `note`, `taskIds`, etc. Callers needing the full shape can opt back in via `fields: ["*"]`. Cuts the weekly-review benchmark workflow's `review_list_due` response from 6.8 KiB → 1.9 KiB (−72%) and the full workflow's total from 24 KiB → 19 KiB (−22%). Token-cost baseline updated. Part of the work to keep the per-call cost honest for bulk-triage agents.
33
69
 
70
+ - **`scripts/check-bundle-size.sh` budget bumped 820 → 850 KiB ([#908](https://github.com/torsday/omnifocus-mcp/pull/908))** — historical context; superseded by the informational-stance shift in [#911](https://github.com/torsday/omnifocus-mcp/pull/911) above. Retained here because both PRs touched the same file in this version range. After [#911](https://github.com/torsday/omnifocus-mcp/pull/911) lands, the 850 KiB value is the soft threshold for the warning, not a hard cap.
71
+
72
+ ### Security
73
+
74
+ - **`pnpm.overrides` clears 8 dependabot advisories ([#918](https://github.com/torsday/omnifocus-mcp/pull/918))** — three transitive deps of `@modelcontextprotocol/sdk@1.29.0` were stuck on vulnerable versions: `fast-uri@3.1.0` (two high-severity advisories — path traversal and host confusion), `ip-address@10.1.0` (XSS in `Address6` HTML methods), and `hono@4.12.14` (five advisories spanning `bodyLimit` bypass, JSX tag injection, CSS injection, Cache middleware Vary handling, and `NumericDate` validation). Override block in `package.json` forces patched versions; `pnpm install` rewrites the lockfile; no source changes required. `pnpm audit` post-fix: "No known vulnerabilities found". These advisories were real-but-unreachable in this codebase (the SDK talks stdio MCP, not HTTP/JSX/cache-middleware), but clearing them removes recurring noise from every `git push` and keeps the GitHub security panel honest.
75
+
76
+ ### Documentation
77
+
78
+ - **`docs/migrations.md` with breaking-change guidance + CI lint ([#865](https://github.com/torsday/omnifocus-mcp/pull/865))** — establishes a migration-guide doc that every breaking-change release must update. The new `meta-lint.yml::migrations-doc` job verifies that any PR labelled `breaking-change` actually edits `docs/migrations.md`; surfaces as a `::error::` annotation on the doc itself when missed. Pairs with [ADR-0011](./docs/adr/0011-versioning-and-stability.md) (versioning and stability semantics) — the ADR defines what counts as breaking; this doc captures the per-release migration path. Closes [#841](https://github.com/torsday/omnifocus-mcp/issues/841).
79
+
80
+ - **README and troubleshooting docs audited for first-time-user friction ([#877](https://github.com/torsday/omnifocus-mcp/pull/877))** — `docs/troubleshooting.md` gains real failure-mode coverage: OmniFocus version compatibility matrix (3.x vs 4.x feature differences, error codes like `OF_FEATURE_REQUIRES_VERSION`), sync-conflict recovery, modal/locked-state recovery, Calendar permission denial, and a worked TCC-recovery flow. `README.md`'s Quick Start gains a prerequisites line (macOS 13+, OmniFocus 3.x or 4.x, Node 24+ for the npm path).
81
+
82
+ ### Infrastructure (not user-visible)
83
+
84
+ These changes don't affect runtime behavior but are recorded for traceability:
85
+
86
+ - **`workflow-timeouts` meta-lint gate + 17 jobs given timeouts ([#920](https://github.com/torsday/omnifocus-mcp/pull/920))** — `scripts/verify-workflow-timeouts.sh` parses every workflow YAML and fails if any job lacks an explicit `timeout-minutes`. Codifies [#912](https://github.com/torsday/omnifocus-mcp/pull/912)'s lesson at the workflow-author level so the next workflow can't silently regress. Same PR also adds an informational `pnpm-audit` job and adds `timeout-minutes` to the 17 existing jobs that were missing one.
87
+
88
+ - **Loop-detector retention bounded; in-process store sizes surfaced ([#875](https://github.com/torsday/omnifocus-mcp/pull/875))** — `loopDetector` no longer grows unbounded; `internal_status.stores` now reports `{ idempotencyEntries, loopDetectorKeys }` so operators can see the in-process retention surface. Tightens the steady-state memory profile on long-running stdio sessions.
89
+
90
+ - **JXA runtime-quirk lint rules ([#869](https://github.com/torsday/omnifocus-mcp/pull/869))** — four new `customRules` entries encoding OmniFocus 4.x JXA quirks (`containingProject()` class-must-be-try-guarded, `flattenedTasks.byId()` must use `lookupOrThrow`, helpers must use `@inline` directive, `flattenedTasks` must narrow before full-scan unless explicitly opted out via `/* narrow-scan-ok: reason */`). One genuine violation in `task_create.js:96` was fixed in passing; the rest are forward guards.
91
+
92
+ ## [1.4.0](https://github.com/torsday/omnifocus-mcp/compare/v1.3.0...v1.4.0) (2026-05-10)
93
+
94
+ > **Note:** `v1.4.0` was tagged but never published to npm or Homebrew — its release pipeline hit an unrelated integration-test flake at the release-gate check. [`v1.5.1`](#151httpsgithubcomtorsdayomnifocus-mcpcomparev150v151-2026-05-10) ships the combined `v1.4.0` + `v1.5.0` payload via the recovered release workflow. Treat this section as the changelog for what's *in* `v1.5.1`, alongside the `[1.5.0]` and `[1.5.1]` sections above.
95
+
96
+ **Summary** — A perf + reliability release. The headline wins are on the JXA side: every read-heavy script that scans `flattenedTasks` now pushes pushable predicates (`flagged`, `completed`, `dueDate`, `deferDate`, `modificationDate`, forecast) into OmniFocus's runtime via `whose({...})`, mirroring the 25× speedup pattern from `forecast_get.js` across `task_list`, `task_search`, `perspective_evaluate`, and `changes_since`. On the response side, **field projection (`fields[]`) lands on every heavy read tool** ([#773](https://github.com/torsday/omnifocus-mcp/issues/773)) — callers can ask for the exact field set they need (`id` is always returned) and skip everything else; pairs with default-trimming changes (`task_get.includeSubtasks` defaults to `false` and returns `subtaskIds[]` instead of full subtree; `noteHtml` is dropped from default task/project responses; `_links` is opt-in via `includeLinks`; default page size is 50 not 100; `filterHash` in pagination cursors is 16 hex chars instead of 64). The transport layer gains **retry-once-on-transient-failure** for both `JxaTransport` (read-only scripts) and `OmniJsTransport`, eliminating most spurious "OmniFocus busy" errors during sync windows. Input validation is hardened across user-supplied strings (length caps, null-byte / control-char rejection in attachment paths). Two new pieces of observability surface: per-service cache hit/miss counts on `internal_status.cache`, and full wire-byte measurement (text + `structuredContent`) on `responseStats`. The agent-developer surface gets substantial work too — `DESIGN.md` is split into per-area files under `docs/design/`, `README.md` is slimmed, `AGENTS.md` gains common-task recipes, `src/tools/INDEX.md` is auto-generated for fast lookup, and `docs/clients/README.md` indexes the 6 client integration guides. **No breaking changes**; every new parameter has a backward-compatible default and every new response field is additive.
97
+
98
+ ### Added
99
+
100
+ - **Field projection (`fields[]`) on heavy read tools ([#773](https://github.com/torsday/omnifocus-mcp/issues/773))** — `task_list`, `task_get`, `task_get_many`, `project_list`, `project_get`, `tag_list`, `tag_get`, `folder_list`, `folder_get`, `search_query`, `forecast_get`, `review_list_due`, and `changes_since` all accept a `fields: string[]` parameter that restricts each returned record to the listed top-level fields (`id` is always returned regardless). Empty array returns just `id`; omitting `fields` returns the full shape (backward-compatible). Unknown field names are dropped silently and surface in `meta.warnings.WARN_UNKNOWN_FIELDS` so callers know they typo'd. Pairs with the default-trimming changes in this release: callers tuning for bulk-triage can keep responses tiny, callers needing the full shape have `fields: ["*"]`. Adds the `applyProjection` helper, per-domain field-name exports in `task.ts` / `project.ts` / `tag.ts`, and per-tool wiring across the read surface.
101
+
102
+ - **Per-service cache hit/miss counts on `internal_status.cache.services`** — `internal_status.cache` now reports per-key-prefix stats (`tag`, `folder`, `forecast`, `task`, `project`) so operators can see which services are getting cache benefit. Aggregate stats (`hits`, `misses`, `hitRate`) plus the new `services` map make it easy to spot a service whose cache was never wired up or that's invalidating too aggressively. Closes [#821](https://github.com/torsday/omnifocus-mcp/issues/821).
103
+
104
+ - **Retry-once on transient JXA failures for read-only scripts ([#816](https://github.com/torsday/omnifocus-mcp/pull/816)) + OmniJS mirror ([#890](https://github.com/torsday/omnifocus-mcp/pull/890))** — `JxaTransport` (read-only call path) and `OmniJsTransport` both detect transient errors (timeout, OmniFocus-busy, sync-in-progress) and retry once with a small backoff before propagating. Eliminates the most common spurious failures during iCloud sync windows. Read-only retry is safe by construction (no mutation); write-side keeps the previous "fail fast and surface to caller" behavior since retry-on-write is unsafe without idempotency keys.
105
+
106
+ - **`src/tools/INDEX.md` auto-generated for fast agent tool discovery** — one-line-per-tool index grouped by domain, regenerated by `pnpm docs:generate` and verified by `pnpm docs:check`. Cheaper to grep than the full `docs/tools.md` (~180 KiB) when an agent needs to know "is there a tool for X?" Closes [#807](https://github.com/torsday/omnifocus-mcp/issues/807).
107
+
108
+ - **`scripts/board-mutate.sh` wrapper for Project v2 field flips ([#847](https://github.com/torsday/omnifocus-mcp/pull/847))** — generalizes the 30-line GraphQL mutation that `/ship-next`, `/groom`, `/ship-debug`, `/ship-refactor`, and `/hunt-bugs` had each been inlining. Single source of truth for status / phase / priority / size / model-queue field updates on the board. Internal tooling; no runtime impact.
34
109
 
35
110
  ### Changed
36
111
 
37
- * **jxa:** inline shared buildFolder helper via [@inline](https://github.com/inline) directive ([e8c7391](https://github.com/torsday/omnifocus-mcp/commit/e8c739140ae1047fa1f6c4bdb6c02e4e602be3d8))
38
- * **jxa:** inline shared buildTag helper via [@inline](https://github.com/inline) directive ([57b0ab0](https://github.com/torsday/omnifocus-mcp/commit/57b0ab05c63ef9e6ed202198e90f33c68bbd9b14))
112
+ - **Default page size lowered to 50 for `task_list`, `project_list`, `search_query` ([#866](https://github.com/torsday/omnifocus-mcp/pull/866))** — was 100. Bulk-triage callers asking for paginated reads now default to a tighter window; explicit `limit: 100` (or higher, up to the per-tool cap) restores the previous behavior. The 50-default is calibrated against the canonical agent workflows where pagination is more useful than a single big page.
113
+
114
+ - **`task_get.includeSubtasks` defaults to `false`; returns `subtaskIds[]` instead of full subtree ([#867](https://github.com/torsday/omnifocus-mcp/pull/867))** — previously `task_get` returned the full subtask subtree by default, which blew up response size for deeply nested tasks. Now the default is `includeSubtasks: false` (returns `subtaskIds: string[]` only); callers needing the subtree pass `includeSubtasks: true` and get the previous shape. Behavior change but additive: existing callers that were relying on the default subtree get a smaller response — they can fix by passing the explicit flag.
115
+
116
+ - **`noteHtml` dropped from default task and project responses ([#871](https://github.com/torsday/omnifocus-mcp/pull/871))** — `task.*` and `project.*` reads no longer include the rendered HTML form of the note by default. Callers needing it can still call `note_get_html` directly. The plain-text `note` field (and the `notePreviewChars`-truncated form from `v1.3.0`) is unchanged. Removes ~5–50 KiB per record for callers that weren't using `noteHtml` anyway.
117
+
118
+ - **`_links` is opt-in via `includeLinks` on heavy reads** — pagination's `_links` block (`first`, `last`, `next`, `prev`) is now off by default; pass `includeLinks: true` to restore. Most agents iterate the cursor explicitly; the inline link bag was dead bytes for them.
119
+
120
+ - **Pagination cursor's `filterHash` shrunk from 64 → 16 hex chars (64-bit) ([#876](https://github.com/torsday/omnifocus-mcp/pull/876))** — cursors are now substantially shorter on the wire; collision probability at 64 bits is negligible for the cursor's lifetime. Closes [#802](https://github.com/torsday/omnifocus-mcp/issues/802).
121
+
122
+ ### Fixed
39
123
 
124
+ - **Input length caps enforced on user-supplied string fields ([#864](https://github.com/torsday/omnifocus-mcp/pull/864))** — `name`, `note`, `tagIds[]`, and other user-controllable string fields across every tool now reject pathological inputs (very long strings, oversized arrays) before they reach JXA, where they can hang or silently truncate. Adds `src/domain/inputLimits.ts` with a single source of truth for the caps; per-tool Zod schemas import from it. Caps are generous (notes up to 10 MB, names up to 1024 chars) — they're guard rails, not policy.
125
+
126
+ - **Attachment paths reject null bytes and control characters ([#824](https://github.com/torsday/omnifocus-mcp/pull/824))** — `attachment_add`, `attachment_save_to_path`, and related tools now validate user-supplied paths against a denylist of `\x00`-`\x1f` plus `\x7f`. Closes a class of bug where an embedded NUL in a path could truncate the JXA call's view of the filename without raising a visible error.
127
+
128
+ - **`responseStats` measures full wire bytes ([#793](https://github.com/torsday/omnifocus-mcp/pull/793))** — telemetry now sums the byte length of both the MCP `text` content and `structuredContent` payload (was measuring just `text`, which undercounted by ~50% on structured responses). p50/p95 thresholds are recalibrated against the correct totals; baseline updated. Pairs with [ADR-0022](./docs/adr/0022-envelope-text-content-duplication.md) on the underlying envelope-text duplication question.
129
+
130
+ - **`status: in-progress` label invariants reconciled** — board-sync drift that had accumulated on closed issues from prior merge cycles cleaned up; the `status: in-progress` label is now strictly transient (added when a PR opens, removed when the PR merges and the issue closes).
131
+
132
+ ### Performance
133
+
134
+ - **JXA `whose()` pushdown lands on every read-heavy script** —
135
+ - **`task_list`** ([#893](https://github.com/torsday/omnifocus-mcp/pull/893)): no-filter branch now pushes `flagged`, `completed`, `dueDate`, `deferDate` into `flattenedTasks.whose(...)` with a try/catch fall-through if OF rejects the predicate. On a 10k-task DB, the unfiltered scan that previously called `buildTask` on every task now sees only the long-tail of matching specifiers.
136
+ - **`task_search`** ([#895](https://github.com/torsday/omnifocus-mcp/pull/895)): same pushdown for `flagged` / `completed` / `dueDate`. Text-search predicates stay client-side because `_contains` support in OF 4.x's `whose()` is unverified.
137
+ - **`perspective_evaluate`** ([#894](https://github.com/torsday/omnifocus-mcp/pull/894)): `flagged` + forecast (today's-due-or-overdue) filters pushed into `whose()` for the projects + tags branches; identical try/catch fallback pattern.
138
+ - **`changes_since`** ([#789](https://github.com/torsday/omnifocus-mcp/pull/789)): `modificationDate > since` predicate pushed into `whose()`. The largest single win — this script is the primary engine for sync-aware agents and was the slowest scan on a real-user DB.
139
+ All four mirror the original `forecast_get.js` speedup pattern; the comment block on `task_list.js` documents the shape for future scripts.
140
+
141
+ - **Tag-membership checks use `Set` in `task_search` multi-tag filter ([#872](https://github.com/torsday/omnifocus-mcp/pull/872))** — was nested array `.includes` (O(filterTags × taskTags) per task); now `new Set(built.tagIds)` once per task plus `.every(has)` against the filter. Closes [#803](https://github.com/torsday/omnifocus-mcp/issues/803). Negligible for `tagIds.length < 5` but real-world callers occasionally pass 20+ tag IDs (project audit workflows) where this used to dominate the inner loop.
142
+
143
+ - **`OmniFocusLruCache` bounded by total bytes alongside entry count ([#904](https://github.com/torsday/omnifocus-mcp/pull/904))** — the LRU cache previously capped on number of entries only, leaving the worst-case memory footprint unbounded for callers that read large notes or wide responses. Now caps on `min(maxEntries, maxBytes)` with size-aware eviction; `internal_status.cache` reports `bytes` and `maxBytes` alongside the existing entry counters. Default `maxBytes: 50 MB` is generous; tunable via `OMNIFOCUS_CACHE_MAX_BYTES`. Pairs with the per-entry `_measureBytes` helper that approximates JSON serialization size at insert time.
144
+
145
+ - **`cache.wrap` wired into `tagService.list`, `folderService.list`, `forecastService.get`** — these services were the last hot reads not behind the cache, so a re-list-after-mutate paid the full JXA scan every time. Now they hit the LRU first. Closes [#790](https://github.com/torsday/omnifocus-mcp/issues/790).
146
+
147
+ - **Suite-scoped sandbox folder for integration contract suite ([#903](https://github.com/torsday/omnifocus-mcp/pull/903))** — integration tests previously created a fresh OmniFocus folder per `describe` block, multiplying setup wall-time. Now a single suite-scoped folder is reused with per-test sub-projects, cutting integration-suite wall time by ~30%. No correctness impact — every test still gets a fresh project namespace.
40
148
 
41
149
  ### Documentation
42
150
 
43
- * **adr:** 0016 reactive automation runtime (proposed, deferred) ([ca91590](https://github.com/torsday/omnifocus-mcp/commit/ca915908a3785c53bab67c4c0bfbe1f0b404aa7f))
44
- * **adr:** expand 0016 — option [#5](https://github.com/torsday/omnifocus-mcp/issues/5) (no, Claude itself can't listen) + sub-decision [#9](https://github.com/torsday/omnifocus-mcp/issues/9) (TypeScript) ([1a931d8](https://github.com/torsday/omnifocus-mcp/commit/1a931d86df9f9a03102bf7f00c927c537de5e53a))
45
- * **adr:** expand 0016 worked example, sandboxed js, failure modes, phased rollout ([42263cb](https://github.com/torsday/omnifocus-mcp/commit/42263cba1990abbcdfb78fcee2a729099ce3be43))
46
- * **adr:** renumber reactive automation runtime to 0021 ([b18e075](https://github.com/torsday/omnifocus-mcp/commit/b18e07593234d52c3d340359005808874404ebcc))
47
- * **agents:** add per-directory CLAUDE.md files for jxa, envelope, tools ([#809](https://github.com/torsday/omnifocus-mcp/issues/809)) ([39b6771](https://github.com/torsday/omnifocus-mcp/commit/39b6771bd09941076aff728f69461839928ce94a))
48
- * **release:** align stale bundle-size budget mentions with current 800 KiB ([3e3e55f](https://github.com/torsday/omnifocus-mcp/commit/3e3e55faf76916079f3240be16c69f129c105a60))
49
- * **spike:** [#800](https://github.com/torsday/omnifocus-mcp/issues/800) osascript fanoutmultiplexed scripts vs persistent daemon ([5e452a6](https://github.com/torsday/omnifocus-mcp/commit/5e452a6aea15ded5a7f3b70b1143da59bdf42d94))
151
+ - **`DESIGN.md` split into per-area files under `docs/design/` ([#805](https://github.com/torsday/omnifocus-mcp/issues/805))** — the previous 1366-line, 84 KB `DESIGN.md` is now split into purpose-named files (architecture, envelope, IDs/dates, security, testing-and-ci, observability, configuration, distribution, example-tool, resources). Each is bounded by `lint-doc-sizes.ts` so they don't drift back to kitchen-sink scale. The orientation cost per agent session drops materially — agents don't need to load 1366 lines to find what they need anymore.
152
+
153
+ - **`README.md` slimmed; agent / examples / prompts sections extracted ([#843](https://github.com/torsday/omnifocus-mcp/pull/843))** — `README.md` is now the public-facing intro + Quick Start; the agent-only "If you are an AI agent" section moves to `AGENTS.md`, examples to `docs/examples.md`, prompts to `docs/prompts.md`. Doc-size budget enforced via the new `lint-doc-sizes.ts`.
154
+
155
+ - **`AGENTS.md` gains common-task recipes ([#810](https://github.com/torsday/omnifocus-mcp/pull/810))** — copy-pasteable patterns for "add a new tool", "add a JXA script", "add an envelope variant", "add an ADR". Each recipe links to the relevant `docs/design/` area file and a worked example in the source tree.
156
+
157
+ - **`docs/clients/README.md` indexes the 6 client integration guides** Claude Desktop, Claude Code, Codex, OpenCode, Pi, and generic stdio. Closes [#845](https://github.com/torsday/omnifocus-mcp/issues/845).
158
+
159
+ - **`docs/spikes/2026-04-bundle-size-strategy.md`** records the bundle-size investigation that led to the informational-stance shift in [#911](https://github.com/torsday/omnifocus-mcp/pull/911) (landed in `[1.5.0]`). Closes [#826](https://github.com/torsday/omnifocus-mcp/issues/826).
160
+
161
+ - **[ADR-0022](./docs/adr/0022-envelope-text-content-duplication.md)** documents the envelope-text-vs-structuredContent duplication question and the decision to keep both surfaces during the v1.x line (resolved in `v2` per the migration doc added in [`v1.5.0`](./CHANGELOG.md#150httpsgithubcomtorsdayomnifocus-mcpcomparev140v150-2026-05-10)).
162
+
163
+ ### Infrastructure (not user-visible)
164
+
165
+ - **Token-cost regression gate gains a label-gated allowlist ([#822](https://github.com/torsday/omnifocus-mcp/pull/822))** — a PR can opt out of the 5% drift check by adding the `token-cost-allowlist` label with a justification, for PRs where the drift is intentional (e.g. adding a tool description). Replaces the previous all-or-nothing block.
166
+
167
+ - **pino redaction coverage audited and extended ([#842](https://github.com/torsday/omnifocus-mcp/pull/842))** — log redaction paths reviewed for completeness; canary test added that fails CI if a token-shaped value (`Bearer …`, `sk-…`, etc.) appears in serialized logs. Belt-and-suspenders on top of the existing redaction config.
168
+ * **tests:** expand tests/README.md to map all 8 sub-dirs with purpose and routing ([ffbeae8](https://github.com/torsday/omnifocus-mcp/commit/ffbeae8fdf6234b21a8ed57f551a7b40edcb4f44)), closes [#844](https://github.com/torsday/omnifocus-mcp/issues/844)
169
+
170
+ ## [1.3.0](https://github.com/torsday/omnifocus-mcp/compare/v1.2.2...v1.3.0) (2026-05-09)
171
+
172
+ **Summary** — A reliability + token-economy release. The headline win is **substantially leaner read responses**: heavy reads (`task_list`, `task_get`, `task_get_many`, `project_list`, `project_get`, `tag_list`, `tag_get`, `folder_list`, `folder_get`) now elide default-valued fields and truncate long task notes by default, cutting wire bytes 27–31% across canonical agent workflows (inbox triage, weekly review, project planning) without changing any non-default response shape. A new **per-tool response-size telemetry** surface on `internal_status` exposes count / total / max / p50 / p95 per tool and emits one-shot warnings when p95 crosses a configurable threshold, giving operators a real-workload view orthogonal to the offline benchmark suite. Reliability fixes are concentrated on three real failure modes: a stack-overflow on `tools/list` for tools with recursive Zod input schemas (`task_reclassify`, `perspective_create`, `perspective_evaluate_dry_run`, `perspective_update`) that crashed the MCP handshake; tag-parent and tag-mutation regressions on OmniFocus 4.x where JXA's `parent()` and `addTag/removeTag` silently no-op'd against real specifiers; and a pagination-cursor filter-hash bug latent for any future caller introducing nested filter shapes. No breaking changes; all v1.2.x call shapes are unchanged. New optional parameters (`verbose`, `notePreviewChars`) default to backward-compatible values.
173
+
174
+ ### Added
175
+
176
+ - **Per-tool response-size telemetry on `internal_status` ([#778](https://github.com/torsday/omnifocus-mcp/issues/778))** — opt-in `ResponseStatsRegistry` records the wire size of every successful tool response and exposes per-tool aggregates (count, total, max, p50, p95) via a new `responseStats` block on `internal_status`. p95 transitions across `OMNIFOCUS_RESPONSE_STATS_THRESHOLD_BYTES` emit one `response.size.exceeded` warning per crossing (and a matching `response.size.recovered` info on the way back), giving operators a signal-not-noise view of which tools dominate token cost in real workloads — orthogonal to the offline benchmark suite (#771). Recording is sample-gated by `OMNIFOCUS_RESPONSE_STATS_SAMPLE_RATE` (default `0` = off, zero overhead). Percentiles use a 1024-sample ring buffer per tool — recent semantics, bounded memory. Errors are not recorded; they're SDK-shaped, not tool-shaped. Part of [#770](https://github.com/torsday/omnifocus-mcp/issues/770). ([79cace2](https://github.com/torsday/omnifocus-mcp/commit/79cace2f8050d26bb73181c6dcd4325fc8a02ad3))
177
+
178
+ - **`notePreviewChars` parameter on bulk task reads — default-truncated notes ([#775](https://github.com/torsday/omnifocus-mcp/issues/775))** — `task_list`, `task_get`, and `task_get_many` accept a new `notePreviewChars` parameter (default `200`, `-1` to opt out). When a note exceeds the cap, `note` is replaced with the triplet `notePreview` (truncated text) + `noteTruncated: true` + `noteLength` (full UTF-8 byte length); short notes pass through unchanged so existing callers see no wire-shape change. The existing `note_get` tool remains the full-text fetcher for callers that need the entire body inline. Composes with response-default elision below to keep token cost bounded on large-DB reads. ([57dc1ba](https://github.com/torsday/omnifocus-mcp/commit/57dc1bae461e0630b00ca98fde98e06c377d9acb))
179
+
180
+ ### Changed
181
+
182
+ - **Default-valued fields elided from heavy read responses + new `verbose` opt-out ([#774](https://github.com/torsday/omnifocus-mcp/issues/774))** — every heavy read tool's success path (`task_list`, `task_get`, `task_get_many`, `project_list`, `project_get`, `tag_list`, `tag_get`, `folder_list`, `folder_get`) now elides default-valued fields: booleans at their false default (`flagged`, `completed`, `dropped`, `blocked`, `sequential`), empty `tagIds[]`, null reference dates (`deferDate`, `dueDate`, `completedAt`, `droppedAt`), null notes, and the most common status enum values (`"active"`, `"parallel"`). The convention: an *absent* field means the default applies. `projectId` on tasks is intentionally **not** elided — null vs missing carries inbox-vs-unknown semantics. Each tool accepts a new `verbose: boolean` flag (default `false`) that, when `true`, returns the full unelided shape for debugging or for callers that prefer explicit nulls. Benchmark drift on canonical workflows: **inbox-triage −27.3%**, **project-planning −30.7%**, **weekly-review −27.3%** total response bytes. Composes with note truncation (#775) and response-size telemetry (#778). Part of [#770](https://github.com/torsday/omnifocus-mcp/issues/770). ([7aecd56](https://github.com/torsday/omnifocus-mcp/commit/7aecd564a0efe69e3d9c9a385c1ebbded75ea0fa))
183
+
184
+ - **`task_list` `tagId` filter scoped via `tag.tasks()` — no more full-DB scan** — previously `task_list` scanned `defaultDocument.flattenedTasks()` and called `buildTask` on each task even when `tagId` narrowed the desired set. On a real-user database (10k+ tasks) with `buildTask`'s per-task tag iteration, this blew the 30s scriptRunner timeout — the contract test "tagId filter returns tasks carrying that tag" timed out reliably even though the corresponding `tag_list` filter worked. When `tagId` is the only source-narrowing input (no `projectId`, no `parentId`, no `inbox`), the source is now `flattenedTags.byId(tagId).tasks()` — bounded by the tag's actual usage rather than the whole DB. The post-loop `tagId` equality check remains as a safety net for any future combination filter. Refs [#768](https://github.com/torsday/omnifocus-mcp/issues/768). ([a16fe77](https://github.com/torsday/omnifocus-mcp/commit/a16fe77895d5d0ceb93e3798ca7fb0d17fd92793))
185
+
186
+ - **Shared `buildFolder` and `buildTag` JXA helpers extracted via `@inline` directive ([#704](https://github.com/torsday/omnifocus-mcp/issues/704), [#705](https://github.com/torsday/omnifocus-mcp/issues/705))** — applies ADR-0020's `// @inline _helpers/<helper>.js` mechanism to consolidate `buildFolder` (4 prior copies across `folder_create`, `folder_get`, `folder_list`, `folder_update`) and `buildTag` (5 prior copies across `tag_create`, `tag_get`, `tag_get_many`, `tag_list`, `tag_update`) into single canonical helpers. Reconciliation preserves every issue-referenced fix verbatim — #515 sub-folder `parent()` workaround now applied uniformly across all 4 folder consumers (previously only `folder_list`); #498 invocation guards on creation/modification dates, project/subfolder counts, and `allowsNextAction` extended uniformly across all consumers; #673 tag-parent class-throw handling unified. Bundle ticks up ~14 KiB (folder) + ~9 KiB (tag) because the helpers are spliced verbatim into each consumer string; budget bumped 780→800 KiB to accommodate (DESIGN.md §20 history extended). No behavior change for callers — the unified guards only change failure modes from "silent null" to "graceful fallback" in cases that previously hit the partially-applied versions. ([e8c7391](https://github.com/torsday/omnifocus-mcp/commit/e8c739140ae1047fa1f6c4bdb6c02e4e602be3d8), [57b0ab0](https://github.com/torsday/omnifocus-mcp/commit/57b0ab05c63ef9e6ed202198e90f33c68bbd9b14))
187
+
188
+ ### Fixed
189
+
190
+ - **`tools/list` no longer crashes on recursive Zod input schemas (closes [#717](https://github.com/torsday/omnifocus-mcp/issues/717))** — without an `id` registration, `z.lazy()` schemas (`TaskPredicate`, `PerspectiveRuleInput`) inlined forever during the SDK's `tools/list` JSON Schema serialization, crashing the MCP handshake with a stack overflow whenever any of the four affected tools (`task_reclassify`, `perspective_create`, `perspective_evaluate_dry_run`, `perspective_update`) was registered. Both schemas are now registered with `z.globalRegistry` so they emit `$ref` into `$defs` instead, and `.describe()` wrappers on recursive references inside the `predicateSchema` lazy body — which shadowed the registered id and re-triggered the cycle — are removed. A new regression test drives the SDK-installed `tools/list` handler end-to-end for the four affected tools so any future cyclic input schema fails CI. ([1e0a1d5](https://github.com/torsday/omnifocus-mcp/commit/1e0a1d5f39835416190d9899ce6c34d86d0d9fab))
191
+
192
+ - **Tag parent retrieval uses `container()` instead of `parent()` on OmniFocus 4.x** — in OF 4.x, `tag.parent()` throws `Can't convert types` on real Tag specifiers — the previous workaround relied on `parent.class()`'s throw vs `"document"` return, but `parent()` *itself* threw and the outer catch swallowed it, leaving `parentId: null` for every tag. `tag_list` filtered by `parentId` thus returned `[]` regardless of the tag-tree shape. `buildTag` now uses `tag.container()`, which works in OF 4.x and returns either the parent tag or the document; the two are distinguished by comparing `container.id()` to a `docId` passed in by each caller. The five callers (`tag_create`, `tag_get`, `tag_get_many`, `tag_list`, `tag_update`) each compute `docId` once via `doc.id()` and pass it through. Refs [#768](https://github.com/torsday/omnifocus-mcp/issues/768). ([ba4abc5](https://github.com/torsday/omnifocus-mcp/commit/ba4abc53e327c31ac92342f0d79cc39dbc3daf84))
193
+
194
+ - **Task tag mutations routed through OmniJS to defeat silent no-op (closes [#716](https://github.com/torsday/omnifocus-mcp/issues/716))** — OmniFocus 4.x JXA's `task.addTag(tag)` / `task.removeTag(tag)` silently no-op'd on existing tasks resolved by id — the call returned without error but no row was written to the underlying SQLite `TaskToTag` join table (verified by the reporter at the SQLite + GUI layers). The `addTag`/`removeTag` loop in `task_update.js` and `task_batch_update.js` is now replaced with a single `ofApp.evaluateJavascript()` call that delegates the tag-set replacement to OmniJS (`Task.byIdentifier` + `addTag`/`removeTag`), matching the proven pattern in `task_create` and `task_duplicate`. The handler in `src/tools/task/update.ts` already resolves `addTags`/`removeTags` to a final `tagIds` set before calling the adapter, so a single fix point covers `task_update`, `task_batch_update`, `task_batch_assign`, and `task_reclassify`. A gated integration suite (`OMNIFOCUS_INTEGRATION=1`) covers the add-to-empty / replacement / clear / batch paths against a real OmniFocus instance. ([c0304c5](https://github.com/torsday/omnifocus-mcp/commit/c0304c57a6eca2398fde7330d9eff2d163d79f4b))
195
+
196
+ - **Pagination cursor `filterHash` stable across nested-object key order (closes [#760](https://github.com/torsday/omnifocus-mcp/issues/760))** — `hashFilter` sorted only top-level keys, so two semantically-identical filters that differed in nested-object key order produced different hashes — tripping `ValidationError("Cursor filter hash does not match…")` on page 2 the moment any caller introduced a nested filter shape. Latent today (every existing caller flattens to a single-level object) but a tripwire for any future filter struct. The same root cause was present in two near-identical `stableStringify` copies for `LoopDetector` / `transportCall` hashing; all three are consolidated into `src/util/stableStringify.ts` so this class of bug is un-instantiable for future callers. The shared helper additionally skips undefined-valued keys inside objects (matching `JSON.stringify`), preserving `hashFilter`'s existing top-level "ignore undefined" contract at every depth. ([f315a36](https://github.com/torsday/omnifocus-mcp/commit/f315a368a35880fedda3d88e79b90d4f8c33383d))
197
+
198
+ - **Loop detector + transport.call argument hashing — correct on nested args, safe on null/undefined** — `buildCallKey` (loop detector) and `hashArgs` (transport.call event) both used `JSON.stringify(value, Object.keys(value).sort())` to produce a stable hash. The replacer-array form filters properties to the listed keys at *every* depth, so any property not present at the top level was dropped from the serialized output. Two semantically-different calls with the same top-level shape collapsed into a single hash, making the LoopDetector raise false-positive `WARN_LOOP_DETECTED` — and, after the error threshold, throw a hard `OF_LOOP_DETECTED` that blocked the legitimate follow-up call before the tool handler ran. The `transport.call` event suffered the same misclassification (debug-level only), making `argsHash` unsafe as a correlation key. `buildCallKey` also crashed on null/undefined args (`Object.keys(null)` throws). Both call sites now use a recursive `stableStringify` that sorts object keys at every nesting level, preserves array order, encodes `undefined` explicitly, and never throws. New regression tests cover the nested-key, key-reorder, null/undefined, and array-element cases for both call sites. ([dcec35c](https://github.com/torsday/omnifocus-mcp/commit/dcec35c0f948ac5cc771cfcf8b570137097c092c))
199
+
200
+ - **Webhook dispatch: response-stream errors flow through the retry loop (closes [#761](https://github.com/torsday/omnifocus-mcp/issues/761))** — `defaultHttpsRequest` only listened for `'error'` on the request stream; errors emitted by the *response* stream after the request callback fired (premature socket close, malformed transfer-encoding, peer reset mid-body, TLS error during streaming) escaped as `uncaughtException`, bypassing the retry loop and circuit breaker. ADR-0016 §4e is unambiguous: "delivery failures NEVER throw upward." `res.on('error', reject)` is now wired so a response-stream error rejects the dispatch promise, flows through the existing retry loop, and increments `consecutiveFailures` exactly like a request-stream error. ([3f988e5](https://github.com/torsday/omnifocus-mcp/commit/3f988e5160fe1e093241288c2e4dd67ab730a3a5))
201
+
202
+ - **In-memory adapter: `project.completedTaskCount` no longer drifts on no-op re-completion** — `applyTaskCompletion` re-stamps `completedAt` when re-completing an already-completed task (intentional, per the `task_batch_complete` contract that "already-completed tasks are not treated specially"), but the same path also called `bumpProjectCompletedCount(+1)` unconditionally, so each re-complete drifted `project.completedTaskCount` upward by one — diverging the test fake from real OmniFocus semantics where the count tracks "tasks currently completed", not "completion calls received". The bump is now guarded on `completed !== task.completed` so the count only moves on a true state transition. ([e5da6e4](https://github.com/torsday/omnifocus-mcp/commit/e5da6e41f15badc0e5b924203379806bc74b513a))
203
+
204
+ ### Build / CI
205
+
206
+ - **Token-cost benchmark suite for canonical agent workflows (closes [#771](https://github.com/torsday/omnifocus-mcp/issues/771))** — hermetic harness under `tests/benchmark/token-cost/` drives three canonical agent workflows (inbox triage, weekly review, project planning) against the in-memory adapter and records the bytes/tokens an MCP client would exchange. Persists totals in a checked-in snapshot with a ±5% tolerance band so optimization PRs under [#770](https://github.com/torsday/omnifocus-mcp/issues/770) have an objective baseline to prove non-zero improvement against. Wires `pnpm bench:tokens` (CLI) and `pnpm test:bench:tokens` (vitest gate, `OMNIFOCUS_BENCH=1`) plus a non-required GitHub Actions workflow.
207
+
208
+ - **`tools/list` byte-stable under prompt-cache contract (closes [#772](https://github.com/torsday/omnifocus-mcp/issues/772))** — Anthropic's prompt cache reuses static prefixes byte-for-byte within a 5-minute window; the MCP `tools/list` response is the largest static prefix this server emits, paid by every session at handshake. A two-tier regression suite locks byte-stability across builds: in-process (`mcpServer.test.ts`) asserts byte-identical JSON + stable SHA-256 of the first 4 KiB across registered schema mixes; cross-process (`tests/e2e/determinism.test.ts`) boots the bundled server twice in fully separate child processes, captures the raw `tools/list` JSON-RPC response over stdio, and byte-diffs (gated on `OMNIFOCUS_E2E=1`). Determinism contract documented in `docs/prompt-cache.md`.
209
+
210
+ - **350-token ceiling on tool descriptions ([#777](https://github.com/torsday/omnifocus-mcp/issues/777))** — adds a per-description token-budget assertion in `descriptions.lint.test.ts`, set at p95 (303) + ~15% headroom. One legacy exemption (`task_reclassify`, 352 tokens) is named with a reason; all other 141 descriptions pass. An informational test reports the total `tools/list` token cost each CI run.
211
+
212
+ - **Integration suite gates canonical-repo PRs via `integration-gate` (closes [#724](https://github.com/torsday/omnifocus-mcp/issues/724))** — `integration.yml` gains a `pull_request: branches: [main]` trigger and a new `integration-gate` job (`ubuntu-latest`, `always()` runs) that becomes the stable required-check name for branch protection. Forks remain unaffected (their integration job is short-circuited via `head.repo.full_name` check, so self-hosted runners are never targeted from a fork). The gate runs on `ubuntu-latest` so an offline `macos-omnifocus` runner is still surfaced as a failure.
213
+
214
+ - **Release workflow gates publish on the integration suite ([#733](https://github.com/torsday/omnifocus-mcp/pull/733))** — `release.yml` now requires the integration suite to pass for the release commit before publishing, so a release tag can no longer ship a regression that integration would have caught.
215
+
216
+ - **JXA sandbox coverage at ~100% (closes [#723](https://github.com/torsday/omnifocus-mcp/issues/723) via [#738](https://github.com/torsday/omnifocus-mcp/issues/738), [#740](https://github.com/torsday/omnifocus-mcp/issues/740), [#741](https://github.com/torsday/omnifocus-mcp/issues/741), [#742](https://github.com/torsday/omnifocus-mcp/issues/742), [#746](https://github.com/torsday/omnifocus-mcp/issues/746), [#747](https://github.com/torsday/omnifocus-mcp/issues/747), [#748](https://github.com/torsday/omnifocus-mcp/issues/748), [#749](https://github.com/torsday/omnifocus-mcp/issues/749), [#753](https://github.com/torsday/omnifocus-mcp/issues/753))** — eight test slices land sandbox unit tests for every JXA script in the repo (60 of 60+, ~100%, up from 13/60 at start of the epic). Stryker `thresholds.break` recalibrated from 57.74 to 58 on the post-coverage baseline (2978 mutants, score 63.41%, drift +0.67pp from the slice-1B baseline) per ADR-0017 §3 (closes [#756](https://github.com/torsday/omnifocus-mcp/issues/756)).
217
+
218
+ - **Admin workflows migrated to `ubuntu-latest` (closes [#728](https://github.com/torsday/omnifocus-mcp/issues/728))** — eight admin workflows (release-please, meta-lint, board-sync, pr-link, pr-title, issue-lint, verify-constants, post-merge-close) have no macOS or OmniFocus dependency; moving them off the single self-hosted mac runner removes queue pressure and eliminates blocked-queue buildup. `ci.yml` and `integration.yml` remain on `[self-hosted, macos]` for OS parity and OmniFocus access respectively. AGENTS.md documents the two-tier runner policy.
219
+
220
+ - **Live-OF integration timing fixes (closes [#768](https://github.com/torsday/omnifocus-mcp/issues/768))** — three contract failures on the `macos-omnifocus` runner traced to vitest-default vs live-OF latency: scoped the unscoped `listProjects` status filter to a test-created folder; added `hookTimeoutMs: 90s` for `duplicateTask` recursive cleanup and intercepted the proxy to cascade-delete clone subtrees; bumped per-test timeout to 90s on `test:integration` for `reorderTask` paths.
221
+
222
+ - **Meta-lint + path-filter coverage** — `package.json` and `pnpm-lock.yaml` added to the meta-lint path filter so dep-bump PRs no longer require admin bypass on branch protection ([#736](https://github.com/torsday/omnifocus-mcp/issues/736)). `release.yml` bumped `actions/upload-artifact` v4 → v7 to match `integration.yml`. Build job capped at `timeout-minutes: 15`. CI shellcheck/actionlint install switched to apt + official download script (no `brew` on `ubuntu-latest`).
223
+
224
+ - **Dependency bumps** — `lru-cache` 11.3.5 → 11.3.6, `zod` 4.3.6 → 4.4.3 (prod-deps group); `@biomejs/biome` 2.4.13 → 2.4.14 (dev-deps); `googleapis/release-please-action` v4 → v5; `biome.json` `$schema` URL pinned to the installed 2.4.14 to silence the per-run deserialization mismatch notice.
225
+
226
+ ### Documentation
227
+
228
+ - **ADR-0021 — Reactive automation runtime (Proposed, Deferred) ([ca91590](https://github.com/torsday/omnifocus-mcp/commit/ca915908a3785c53bab67c4c0bfbe1f0b404aa7f), [1a931d8](https://github.com/torsday/omnifocus-mcp/commit/1a931d86df9f9a03102bf7f00c927c537de5e53a), [42263cb](https://github.com/torsday/omnifocus-mcp/commit/42263cba1990abbcdfb78fcee2a729099ce3be43), [b18e075](https://github.com/torsday/omnifocus-mcp/commit/b18e07593234d52c3d340359005808874404ebcc))** — captures the v2-class direction for daemon-mode + rule engine responding to OmniFocus changes via LLM. Status: **Proposed, Deferred** — no implementation until v1.x stabilizes. Six of seven sub-decisions resolved (process model, rules format, LLM provider abstraction, loop-recursion safety, editing-conflict dampening, cost budget); secrets management deferred. Includes worked end-to-end timeline (iPhone capture → applied LLM rewrite + audit-log entry), `isolated-vm` sandbox decision for the JS escape hatch, 11-path failure-modes table, and a six-phase rollout (~7–10 weeks) with phases 1–3 shippable as a v2-alpha mechanical rules engine before LLM lands. Slot was renumbered from 0016 to 0021 once webhook delivery shipped as ADR-0016.
229
+
230
+ - **Per-directory `CLAUDE.md` files for `jxa/`, `envelope/`, `tools/` ([#809](https://github.com/torsday/omnifocus-mcp/issues/809))** — adds inline contributor/agent guidance at the source-tree boundaries where the conventions are non-obvious, so a fresh agent landing in those directories has the local rules in scope without needing to read the top-level docs. ([39b6771](https://github.com/torsday/omnifocus-mcp/commit/39b6771bd09941076aff728f69461839928ce94a))
231
+
232
+ - **Bundle-size budget references aligned with current 800 KiB** — three places (`scripts/check-bundle-size.sh` header, `ci.yml`, `release.yml`) still labelled the old 580 / 500 KiB budget after the bumps tracked in DESIGN §20. The actual `BUDGET=819200` (800 KiB) was already correct; only human-facing labels were stale. ([3e3e55f](https://github.com/torsday/omnifocus-mcp/commit/3e3e55faf76916079f3240be16c69f129c105a60))
50
233
 
51
234
  ## [1.2.2](https://github.com/torsday/omnifocus-mcp/compare/v1.2.1...v1.2.2) (2026-04-30)
52
235