npm - @openparachute/vault - Versions diffs - 0.6.0 → 0.6.2-rc.1 - Mend

@openparachute/vault 0.6.0 → 0.6.2-rc.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (28)

package/README.md +31 -6
package/core/src/content-range.test.ts +374 -0
package/core/src/content-range.ts +185 -0
package/core/src/links.ts +76 -21
package/core/src/mcp.ts +53 -1
package/core/src/notes.ts +128 -40
package/core/src/query-perf-routing.test.ts +208 -0
package/core/src/schema.ts +30 -1
package/package.json +1 -1
package/src/cli.ts +90 -25
package/src/content-range-routes.test.ts +178 -0
package/src/github-device-flow.test.ts +265 -6
package/src/github-device-flow.ts +297 -45
package/src/init-summary.test.ts +125 -125
package/src/init-summary.ts +89 -54
package/src/init.test.ts +128 -0
package/src/mirror-credentials.test.ts +20 -0
package/src/mirror-credentials.ts +6 -2
package/src/mirror-remote-guard.test.ts +269 -0
package/src/mirror-remote-guard.ts +273 -0
package/src/mirror-routes.test.ts +1118 -46
package/src/mirror-routes.ts +405 -32
package/src/routes.ts +69 -3
package/src/routing.ts +8 -0
package/src/vault.test.ts +56 -0
package/web/ui/dist/assets/index-BPgyIjR7.js +61 -0
package/web/ui/dist/index.html +1 -1
package/web/ui/dist/assets/index-CGL256oe.js +0 -60

package/README.md CHANGED

@@ -22,7 +22,7 @@ bun install
 bun src/cli.ts vault init
 ```
-`vault init` creates a vault, generates an API key, starts a background daemon (launchd on Mac, systemd on Linux), and configures Claude Code's MCP — all in one command. Start a new Claude Code session and your vault's tools show up. For other local MCP clients (Codex, Goose, OpenCode, Cursor, Zed, Cline, your own agent), point them at `http://127.0.0.1:1940/vault/default/mcp` — the API key is printed once at init; save it for anything that isn't Claude Code.
+`vault init` creates a vault, starts a background daemon (launchd on Mac, systemd on Linux), and prints how to connect your AI — the web setup wizard URL, your vault's connector URL (`http://127.0.0.1:1940/vault/default/mcp`), and a ready-to-paste `claude mcp add` command. It does **not** write any client config for you by default. Pass `--configure-claude-code` if you want init to add the Claude Code MCP entry; pass `--token` if you also need a header-auth API token for non-OAuth clients (Codex, Goose, OpenCode, Cursor, Zed, Cline, scripts, `curl`). OAuth-capable clients just need the connector URL and sign in on first connect.
 For remote access from Claude Desktop or mobile apps, see [Deployment](#deployment) below.
@@ -93,11 +93,11 @@ The daemon binds `0.0.0.0:1940` (or whatever you set in `PORT`) and serves REST,
 ### `~/.claude.json`
-`vault init` adds one entry — `mcpServers["parachute-vault"]` — pointing at `http://127.0.0.1:<port>/vault/<default-vault>/mcp` with a baked-in `Authorization: Bearer <hub-jwt>` header (a hub-minted JWT — vault#282 Stage 2). Next Claude Code session picks it up; there's no further wiring. See [Connecting a client](#connecting-a-client) for rotating that token or pointing it elsewhere.
+When you opt in (`--configure-claude-code`), `vault init` adds one entry — `mcpServers["parachute-vault"]` — pointing at `http://127.0.0.1:<port>/vault/<default-vault>/mcp`. By default that entry uses OAuth (browser sign-in on first connect); add `--token` and init bakes a scope-narrow `Authorization: Bearer <hub-jwt>` header instead (a hub-minted JWT — vault#282 Stage 2). Next Claude Code session picks it up. See [Connecting a client](#connecting-a-client) for rotating that token or pointing it elsewhere.
 ### Your API token
-`vault init` asks two explicit questions: (1) install vault as an MCP server in `~/.claude.json`? (2) also surface the access token so you can paste it into other MCP clients (Codex, Goose, OpenCode, Cursor, Zed, Cline), scripts, or `curl`? Both default yes. Pass `--mcp` / `--no-mcp` and `--token` / `--no-token` for non-interactive installs.
+`vault init` asks two explicit questions: (1) write the Claude Code MCP entry in `~/.claude.json`? (2) also mint + surface a header-auth API token so you can paste it into non-OAuth MCP clients (Codex, Goose, OpenCode, Cursor, Zed, Cline), scripts, or `curl`? **Both default no** — init's job is to get you to the web wizard and print the connector URL + a paste-ready `claude mcp add` command, not to write client config behind your back. Opt in with `--configure-claude-code` (alias `--mcp`) and/or `--token`; `--no-mcp` / `--no-token` are the explicit opt-outs for non-interactive installs.
 If you said yes to (2), the hub-issued JWT is printed prominently at the end — it's the same token baked into `~/.claude.json` (if you also said yes to (1)). It's not stored anywhere retrievable — save it if you need it for `curl`, cron, or any other script. Lost it? Mint a fresh one with `parachute auth mint-token --scope vault:<name>:<verb>` (or rewire an MCP client with `parachute-vault mcp-install`, or use the admin SPA Tokens page). As of vault 0.5.0 (vault#282 Stage 2) vault no longer mints its own `pvt_*` tokens — minting is the hub's job.
@@ -126,7 +126,7 @@ As of 0.5.0 (vault#282 Stage 2) vault is a **pure hub resource-server**: both pa
 ### Claude Code
-`vault init` fully auto-configures `~/.claude.json` — there's nothing else to do. The entry it writes bakes in a hub-minted JWT rather than running the interactive OAuth browser flow:
+`vault init` does **not** touch `~/.claude.json` by default — connecting is self-serve (paste the `claude mcp add` command init prints, or add the connector in your client). To have init write the entry for you, pass `--configure-claude-code`. The entry it writes uses OAuth by default (browser sign-in on first connect); add `--token` and it bakes a scope-narrow hub-minted JWT instead:
 ```json
 {
@@ -140,9 +140,9 @@ As of 0.5.0 (vault#282 Stage 2) vault is a **pure hub resource-server**: both pa
 }
 ```
-Where `{name}` is `default` on a fresh install, or whatever vault you pointed `vault init` at. **First MCP call after `vault init` requires no browser handoff — Claude Code uses the baked-in token and the vault's tools show up in your next session.** This is intentional: for an owner connecting their own machine's vault to their own Claude Code, the token is already there and the OAuth browser handshake would add friction.
+Where `{name}` is `default` on a fresh install, or whatever vault you pointed `vault init` at. **With `--configure-claude-code --token`, the first MCP call needs no browser handoff** — Claude Code uses the baked-in token and the vault's tools show up in your next session. Without `--token`, the opted-in entry uses OAuth and Claude Code does a one-time browser sign-in on first connect. This is a deliberate trade: OAuth-first by default (no long-lived token sitting in a dotfile), with the baked-token path one flag away for an owner wiring their own machine.
-To re-point Claude Code at a different vault, change `default_vault` in `~/.parachute/vault/config.yaml` and re-run `parachute-vault init` — which re-mints an API token and re-writes the `~/.claude.json` entry end-to-end. To rotate the token only, run `parachute-vault mcp-install` (defaults to `--mint`, which mints a fresh scope-narrow hub JWT via `~/.parachute/operator.token` and writes it into `~/.claude.json` with an `Authorization: Bearer …` header). See the [cookbook](#install-vault-mcp-into-a-client-config) section below for the full flag surface — token paste, scope narrowing, project-level install, multi-vault.
+To re-point Claude Code at a different vault, change `default_vault` in `~/.parachute/vault/config.yaml` and re-run `parachute-vault init --configure-claude-code` (add `--token` to bake a fresh token). To rotate the token only, run `parachute-vault mcp-install` (defaults to `--mint`, which mints a fresh scope-narrow hub JWT via `~/.parachute/operator.token` and writes it into `~/.claude.json` with an `Authorization: Bearer …` header). See the [cookbook](#install-vault-mcp-into-a-client-config) section below for the full flag surface — token paste, scope narrowing, project-level install, multi-vault.
 ### Claude Desktop (OAuth)
@@ -233,6 +233,14 @@ parachute-vault config                     # show current configuration
 parachute-vault config set KEY value       # set an env var (e.g. PORT=1940)
 parachute-vault config unset KEY           # remove an env var
 parachute-vault restart                    # apply config changes (bounces the daemon)
+# Env vars live in ~/.parachute/vault/.env. Notable ones:
+#   PORT                          — server port (default 1940)
+#   PARACHUTE_GITHUB_CLIENT_ID +
+#   PARACHUTE_GITHUB_APP_SLUG     — bring-your-own GitHub App for the mirror
+#                                   "Back up to GitHub" flow (defaults to the
+#                                   shared Parachute app). Set BOTH or NEITHER:
+#                                   the pair must name the same app — mixing
+#                                   apps breaks the install probe.
 # Server
 parachute-vault serve                      # run the server in the foreground (no daemon)
@@ -522,6 +530,23 @@ curl -H "Authorization: Bearer $VAULT_TOKEN" \
 Caller-tunable preview length is a future enhancement — file an issue if 120 chars isn't enough.
+### Read a large note in chunks (content range)
+A 100KB+ transcript won't fit in one MCP response. Pass `content_offset` / `content_length` (UTF-8 bytes) for a bounded read — the response carries the slice plus `content_total_length` and `content_next_offset` (`null` when complete). Loop, feeding `content_next_offset` back in as `content_offset`; concatenating the slices reconstructs the content byte-for-byte. Slices end on a codepoint boundary within the budget (never over `content_length`, at most 3 bytes under).
+```bash
+curl -H "Authorization: Bearer $VAULT_TOKEN" \
+  "http://localhost:1940/vault/default/api/notes/Meetings%2F2026-06-09?content_offset=0&content_length=65536"
+# → { ..., "content": "<first ≤64KB>", "content_total_length": 118034, "content_next_offset": 65530 }
+```
+```jsonc
+// MCP — same params on query-notes; works per-note on lists with include_content: true
+{ "name": "query-notes", "arguments": { "id": "Meetings/2026-06-09", "content_offset": 0, "content_length": 65536 } }
+```
+Range params require content in the response — with `include_content=false` (or a list query left on its lean default) they error rather than silently no-op. Full semantics in [docs/HTTP_API.md](./docs/HTTP_API.md) ("Content range — bounded reads for large notes").
 ### Incremental rebuilds: "what changed since X"
 The SSG / sync pattern. Two equivalent forms — bracket-style is canonical going forward; the flat form is the same shape that ships through the REST/MCP date filter today.
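
The chunked-read loop described in the README hunk above ("Read a large note in chunks") can be sketched from the client side as follows. This is a minimal sketch, not code from the package: `ContentPage`, `FetchPage`, `readWholeNote`, and `demoFetch` are hypothetical names introduced for illustration, and the transport is abstracted behind a callback so the loop itself does not depend on REST or MCP. Only the response fields named in the README (`content`, `content_total_length`, `content_next_offset`) are assumed.

```typescript
// Hypothetical page shape: only the fields the README hunk names.
interface ContentPage {
  content: string;
  content_total_length: number;
  content_next_offset: number | null;
}

// Hypothetical transport: resolves one bounded read. In practice this would
// wrap the REST GET or the MCP `query-notes` call shown in the README.
type FetchPage = (offset: number, length: number) => Promise<ContentPage>;

// Follow content_next_offset until it is null, concatenating the slices.
async function readWholeNote(fetchPage: FetchPage, pageBytes = 65536): Promise<string> {
  let offset = 0;
  let assembled = "";
  for (;;) {
    const page = await fetchPage(offset, pageBytes);
    assembled += page.content;
    if (page.content_next_offset === null) return assembled;
    // Defensive guard (an assumption of this sketch): refuse to loop forever
    // if a server ever returned a non-advancing offset.
    if (page.content_next_offset <= offset) {
      throw new Error("content_next_offset did not advance");
    }
    offset = page.content_next_offset;
  }
}

// In-memory stand-in for a server. ASCII-only on purpose, so a character
// index equals a UTF-8 byte offset and no boundary alignment is needed.
const DEMO_TEXT = "0123456789abcdef";
const demoFetch: FetchPage = async (offset, length) => {
  const start = Math.min(offset, DEMO_TEXT.length);
  const end = Math.min(start + length, DEMO_TEXT.length);
  return {
    content: DEMO_TEXT.slice(start, end),
    content_total_length: DEMO_TEXT.length,
    content_next_offset: end >= DEMO_TEXT.length ? null : end,
  };
};

readWholeNote(demoFetch, 4).then((text) => console.log(text === DEMO_TEXT));
```

Keeping the transport behind a callback means the same loop can be pointed at either face; a real `FetchPage` would pass `offset` and `length` through as `content_offset` and `content_length`.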

package/core/src/content-range.test.ts ADDED

@@ -0,0 +1,374 @@
+/**
+ * Content range / pagination tests — bounded reads for large notes.
+ *
+ * Three layers:
+ *   1. Unit tests of the parse + slice helpers (boundary cases: offset
+ *      past end, sub-minimum budget, multi-byte codepoints at the cut).
+ *   2. Property test of the reassembly invariant: walking a string from
+ *      offset 0 via `content_next_offset` and concatenating the slices is
+ *      byte-identical to the full content, for arbitrary unicode content
+ *      and budgets — and no slice ever exceeds the byte budget.
+ *   3. MCP face (`query-notes` execute): single + list shapes, the
+ *      include_content interaction, and the no-params regression
+ *      (response shape byte-identical to pre-pagination behavior).
+ *
+ * The REST face is exercised in src/content-range-routes.test.ts.
+ */
+import { describe, it, expect, beforeEach } from "bun:test";
+import { Database } from "bun:sqlite";
+import { SqliteStore } from "./store.js";
+import { generateMcpTools } from "./mcp.js";
+import {
+  parseContentRange,
+  sliceContentRange,
+  applyContentRange,
+  MIN_CONTENT_LENGTH,
+} from "./content-range.js";
+// ---------------------------------------------------------------------------
+// 1. parseContentRange
+// ---------------------------------------------------------------------------
+describe("parseContentRange", () => {
+  it("returns null when neither param is present (range mode off)", () => {
+    expect(parseContentRange(undefined, undefined)).toBeNull();
+    expect(parseContentRange(null, null)).toBeNull();
+  });
+  it("treats empty strings as absent (REST `?content_offset=`)", () => {
+    expect(parseContentRange("", "")).toBeNull();
+  });
+  it("offset only → length omitted (read to end)", () => {
+    expect(parseContentRange(10, undefined)).toEqual({ offset: 10 });
+  });
+  it("length only → offset defaults to 0", () => {
+    expect(parseContentRange(undefined, 64)).toEqual({ offset: 0, length: 64 });
+  });
+  it("accepts decimal strings (REST query params)", () => {
+    expect(parseContentRange("5", "1024")).toEqual({ offset: 5, length: 1024 });
+  });
+  it("rejects negative offset", () => {
+    expect(() => parseContentRange(-1, undefined)).toThrow(/content_offset/);
+  });
+  it("rejects non-integer values", () => {
+    expect(() => parseContentRange(1.5, undefined)).toThrow(/content_offset/);
+    expect(() => parseContentRange(undefined, 7.2)).toThrow(/content_length/);
+    expect(() => parseContentRange("abc", undefined)).toThrow(/content_offset/);
+    expect(() => parseContentRange(undefined, "-4")).toThrow(/content_length/);
+  });
+  it(`rejects zero / negative / sub-minimum budget (< ${MIN_CONTENT_LENGTH})`, () => {
+    expect(() => parseContentRange(undefined, 0)).toThrow(/content_length/);
+    expect(() => parseContentRange(undefined, -8)).toThrow(/content_length/);
+    expect(() => parseContentRange(undefined, MIN_CONTENT_LENGTH - 1)).toThrow(/content_length/);
+    // The minimum itself is fine.
+    expect(parseContentRange(undefined, MIN_CONTENT_LENGTH)).toEqual({
+      offset: 0,
+      length: MIN_CONTENT_LENGTH,
+    });
+  });
+  it("throws QueryError with INVALID_QUERY code", () => {
+    try {
+      parseContentRange(undefined, 2);
+      throw new Error("should have thrown");
+    } catch (e: any) {
+      expect(e.name).toBe("QueryError");
+      expect(e.code).toBe("INVALID_QUERY");
+    }
+  });
+});
+// ---------------------------------------------------------------------------
+// 2. sliceContentRange — boundary cases
+// ---------------------------------------------------------------------------
+describe("sliceContentRange", () => {
+  it("plain ASCII window", () => {
+    const r = sliceContentRange("hello world", { offset: 0, length: 5 });
+    expect(r.content).toBe("hello");
+    expect(r.content_offset).toBe(0);
+    expect(r.content_total_length).toBe(11);
+    expect(r.content_next_offset).toBe(5);
+  });
+  it("continuation window reaches the end → next_offset null", () => {
+    const r = sliceContentRange("hello world", { offset: 5, length: 100 });
+    expect(r.content).toBe(" world");
+    expect(r.content_next_offset).toBeNull();
+  });
+  it("offset with no length reads to the end", () => {
+    const r = sliceContentRange("hello world", { offset: 6 });
+    expect(r.content).toBe("world");
+    expect(r.content_total_length).toBe(11);
+    expect(r.content_next_offset).toBeNull();
+  });
+  it("offset exactly at end → empty slice, complete", () => {
+    const r = sliceContentRange("abc", { offset: 3 });
+    expect(r.content).toBe("");
+    expect(r.content_offset).toBe(3);
+    expect(r.content_total_length).toBe(3);
+    expect(r.content_next_offset).toBeNull();
+  });
+  it("offset past end → empty slice, complete (graceful loop termination)", () => {
+    const r = sliceContentRange("abc", { offset: 999, length: 16 });
+    expect(r.content).toBe("");
+    expect(r.content_offset).toBe(3); // clamped to total
+    expect(r.content_total_length).toBe(3);
+    expect(r.content_next_offset).toBeNull();
+  });
+  it("empty content → empty slice, total 0", () => {
+    const r = sliceContentRange("", { offset: 0, length: 16 });
+    expect(r.content).toBe("");
+    expect(r.content_total_length).toBe(0);
+    expect(r.content_next_offset).toBeNull();
+  });
+  it("budget cutting mid-codepoint backs off to the boundary (never over budget)", () => {
+    // "ab😀cd" — bytes: a=0, b=1, 😀=2..5 (4 bytes), c=6, d=7; total 8.
+    const s = "ab\u{1F600}cd";
+    expect(Buffer.byteLength(s, "utf8")).toBe(8);
+    // Budget 5 would cut mid-emoji → slice backs off to byte 2.
+    const r1 = sliceContentRange(s, { offset: 0, length: 5 });
+    expect(r1.content).toBe("ab");
+    expect(Buffer.byteLength(r1.content, "utf8")).toBeLessThanOrEqual(5);
+    expect(r1.content_next_offset).toBe(2);
+    // Next window picks up the whole emoji.
+    const r2 = sliceContentRange(s, { offset: 2, length: 4 });
+    expect(r2.content).toBe("\u{1F600}");
+    expect(r2.content_next_offset).toBe(6);
+    // Final window.
+    const r3 = sliceContentRange(s, { offset: 6, length: 4 });
+    expect(r3.content).toBe("cd");
+    expect(r3.content_next_offset).toBeNull();
+  });
+  it("offset landing mid-codepoint aligns DOWN (no bytes skipped) and echoes the effective offset", () => {
+    const s = "ab\u{1F600}cd"; // emoji occupies bytes 2..5
+    const r = sliceContentRange(s, { offset: 4, length: 8 });
+    expect(r.content_offset).toBe(2); // aligned down to the emoji's lead byte
+    expect(r.content).toBe("\u{1F600}cd");
+    expect(r.content_next_offset).toBeNull();
+  });
+  it("minimum budget always makes progress, even on a 4-byte codepoint", () => {
+    const s = "\u{1F600}\u{1F601}"; // two 4-byte emoji
+    const r = sliceContentRange(s, { offset: 0, length: MIN_CONTENT_LENGTH });
+    expect(r.content).toBe("\u{1F600}");
+    expect(r.content_next_offset).toBe(4);
+  });
+  it("applyContentRange mutates the shaped result in place", () => {
+    const result: any = { id: "n1", content: "hello world", tags: ["x"] };
+    applyContentRange(result, { offset: 0, length: 5 });
+    expect(result.content).toBe("hello");
+    expect(result.content_offset).toBe(0);
+    expect(result.content_total_length).toBe(11);
+    expect(result.content_next_offset).toBe(5);
+    expect(result.tags).toEqual(["x"]); // untouched
+  });
+});
+// ---------------------------------------------------------------------------
+// 2b. Property test — reassembly invariant
+// ---------------------------------------------------------------------------
+/** Deterministic PRNG (mulberry32) so failures reproduce. */
+function mulberry32(seed: number): () => number {
+  let a = seed >>> 0;
+  return () => {
+    a |= 0;
+    a = (a + 0x6d2b79f5) | 0;
+    let t = Math.imul(a ^ (a >>> 15), 1 | a);
+    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
+    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
+  };
+}
+describe("content range — reassembly property", () => {
+  // Mixed-width pool: 1-byte ASCII, 2-byte (é, ψ), 3-byte (你, ‱), 4-byte
+  // (😀, 𝄞) — plus whitespace, so windows land on every codepoint width.
+  const POOL = ["a", "Z", "9", " ", "\n", "é", "ψ", "你", "‱", "\u{1F600}", "\u{1D11E}"];
+  it("concatenating all slices is byte-identical to the full content; no slice exceeds the budget", () => {
+    const rand = mulberry32(0xc0ffee);
+    for (let iter = 0; iter < 60; iter++) {
+      const charCount = Math.floor(rand() * 120); // includes 0 (empty content)
+      let content = "";
+      for (let i = 0; i < charCount; i++) {
+        content += POOL[Math.floor(rand() * POOL.length)]!;
+      }
+      const budget = MIN_CONTENT_LENGTH + Math.floor(rand() * 13); // 4..16 bytes
+      const totalBytes = Buffer.byteLength(content, "utf8");
+      let offset = 0;
+      let assembled = "";
+      let lastTotal: number | null = null;
+      // Hard ceiling on iterations: every window must advance by >= 1 byte.
+      for (let step = 0; step <= totalBytes + 2; step++) {
+        const slice = sliceContentRange(content, { offset, length: budget });
+        expect(Buffer.byteLength(slice.content, "utf8")).toBeLessThanOrEqual(budget);
+        expect(slice.content_total_length).toBe(totalBytes);
+        lastTotal = slice.content_total_length;
+        assembled += slice.content;
+        if (slice.content_next_offset === null) break;
+        // Progress guarantee — next offset strictly advances.
+        expect(slice.content_next_offset).toBeGreaterThan(offset);
+        offset = slice.content_next_offset;
+      }
+      expect(assembled).toBe(content);
+      expect(Buffer.from(assembled, "utf8").equals(Buffer.from(content, "utf8"))).toBe(true);
+      expect(lastTotal).toBe(totalBytes);
+    }
+  });
+});
+// ---------------------------------------------------------------------------
+// 3. MCP face — query-notes
+// ---------------------------------------------------------------------------
+describe("MCP query-notes — content range", () => {
+  let db: Database;
+  let store: SqliteStore;
+  beforeEach(() => {
+    db = new Database(":memory:");
+    store = new SqliteStore(db);
+  });
+  function queryTool() {
+    const tools = generateMcpTools(store);
+    return tools.find((t) => t.name === "query-notes")!;
+  }
+  it("single note: paged read loop reassembles the full content", async () => {
+    // Mixed-width content so windows hit multi-byte boundaries.
+    const content = ("section \u{1F600} 你好 " .repeat(40)).trim();
+    const note = await store.createNote(content);
+    const query = queryTool();
+    let offset = 0;
+    let assembled = "";
+    for (;;) {
+      const r: any = await query.execute({ id: note.id, content_offset: offset, content_length: 64 });
+      expect(r.content_total_length).toBe(Buffer.byteLength(content, "utf8"));
+      assembled += r.content;
+      if (r.content_next_offset === null) break;
+      offset = r.content_next_offset;
+    }
+    expect(assembled).toBe(content);
+  });
+  it("single note: response carries the range fields and the slice", async () => {
+    const note = await store.createNote("0123456789");
+    const r: any = await queryTool().execute({ id: note.id, content_length: 4 });
+    expect(r.content).toBe("0123");
+    expect(r.content_offset).toBe(0);
+    expect(r.content_total_length).toBe(10);
+    expect(r.content_next_offset).toBe(4);
+    expect(r.id).toBe(note.id); // rest of the note shape intact
+  });
+  it("single note, no range params → byte-identical to today (regression)", async () => {
+    const note = await store.createNote("full body here");
+    const r: any = await queryTool().execute({ id: note.id });
+    expect(r.content).toBe("full body here");
+    expect("content_total_length" in r).toBe(false);
+    expect("content_next_offset" in r).toBe(false);
+    expect("content_offset" in r).toBe(false);
+  });
+  it("single note: content_offset past end → empty slice, complete", async () => {
+    const note = await store.createNote("abc");
+    const r: any = await queryTool().execute({ id: note.id, content_offset: 999 });
+    expect(r.content).toBe("");
+    expect(r.content_total_length).toBe(3);
+    expect(r.content_next_offset).toBeNull();
+  });
+  it("single note: include_content=false + range params → loud error", async () => {
+    const note = await store.createNote("abc");
+    expect(
+      queryTool().execute({ id: note.id, include_content: false, content_length: 8 }),
+    ).rejects.toThrow(/include_content/);
+  });
+  it("rejects sub-minimum / invalid budgets before any query work", async () => {
+    const note = await store.createNote("abc");
+    expect(queryTool().execute({ id: note.id, content_length: 0 })).rejects.toThrow(/content_length/);
+    expect(queryTool().execute({ id: note.id, content_length: -5 })).rejects.toThrow(/content_length/);
+    expect(queryTool().execute({ id: note.id, content_length: 2 })).rejects.toThrow(/content_length/);
+    expect(queryTool().execute({ id: note.id, content_offset: -1 })).rejects.toThrow(/content_offset/);
+  });
+  it("list query: include_content=true applies the window per note", async () => {
+    await store.createNote("alpha alpha alpha", { tags: ["big"] });
+    await store.createNote("beta beta beta beta", { tags: ["big"] });
+    const out: any[] = (await queryTool().execute({
+      tag: "big",
+      include_content: true,
+      content_length: 5,
+    })) as any[];
+    expect(out.length).toBe(2);
+    for (const n of out) {
+      expect(Buffer.byteLength(n.content, "utf8")).toBeLessThanOrEqual(5);
+      expect(typeof n.content_total_length).toBe("number");
+      expect(n.content_next_offset).toBe(5);
+    }
+  });
+  it("list query: lean default (no include_content) + range params → loud error", async () => {
+    await store.createNote("alpha", { tags: ["big"] });
+    expect(queryTool().execute({ tag: "big", content_length: 8 })).rejects.toThrow(
+      /include_content/,
+    );
+  });
+  it("list query, no range params → no range fields injected (regression)", async () => {
+    await store.createNote("alpha", { tags: ["big"] });
+    const out: any[] = (await queryTool().execute({ tag: "big", include_content: true })) as any[];
+    expect(out.length).toBe(1);
+    expect("content_total_length" in out[0]).toBe(false);
+    expect("content_next_offset" in out[0]).toBe(false);
+  });
+  it("expand_links: the range applies to the EXPANDED content", async () => {
+    await store.createNote("inlined body of B", { path: "B" });
+    const a = await store.createNote("A says: [[B]]", { path: "A" });
+    const query = queryTool();
+    const unpaged: any = await query.execute({ id: a.id, expand_links: true });
+    const paged: any = await query.execute({
+      id: a.id,
+      expand_links: true,
+      content_offset: 0,
+      content_length: 100000,
+    });
+    expect(paged.content).toBe(unpaged.content);
+    expect(paged.content_total_length).toBe(Buffer.byteLength(unpaged.content, "utf8"));
+    expect(paged.content_next_offset).toBeNull();
+  });
+  it("query-notes schema advertises the params (MCP discovery)", () => {
+    const query = queryTool();
+    const props = (query.inputSchema as any).properties;
+    expect(props.content_offset).toBeDefined();
+    expect(props.content_length).toBeDefined();
+    expect(query.description).toContain("content_offset");
+    expect(query.description).toContain("content_next_offset");
+  });
+});
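
The mid-codepoint tests above rely on the byte layout of `"ab😀cd"`. The sketch below prints that layout and reproduces the align-down step by hand. It is an illustration only: `isContinuation` and `alignDown` are hypothetical helper names, and `TextEncoder` is used here in place of the `Buffer` calls in the diff so the snippet does not depend on Node-specific APIs.

```typescript
const sample = new TextEncoder().encode("ab\u{1F600}cd");

// True for UTF-8 continuation bytes (0b10xxxxxx): the same mask the diff's
// isContinuationByte helper applies.
const isContinuation = (b: number): boolean => (b & 0xc0) === 0x80;

// Step back from `pos` to the nearest byte that starts a codepoint.
function alignDown(buf: Uint8Array, pos: number): number {
  let p = pos;
  while (p > 0 && p < buf.length && isContinuation(buf[p]!)) p--;
  return p;
}

const layout = Array.from(
  sample,
  (b, i) => `${i}:${b.toString(16).padStart(2, "0")}${isContinuation(b) ? " (cont)" : ""}`,
);
console.log(layout.join("  "));
// → 0:61  1:62  2:f0  3:9f (cont)  4:98 (cont)  5:80 (cont)  6:63  7:64

console.log(alignDown(sample, 5)); // → 2 (a budget of 5 from offset 0 backs off to the emoji's lead byte)
console.log(alignDown(sample, 4)); // → 2 (a hand-computed offset of 4 aligns down, so no bytes are skipped)
console.log(alignDown(sample, 6)); // → 6 (already on a boundary)
```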

package/core/src/content-range.ts ADDED

@@ -0,0 +1,185 @@
+/**
+ * Content range / pagination — bounded reads for large notes.
+ *
+ * MCP responses are size-limited: a 100KB+ transcript can't come back in
+ * one `query-notes` call, and a remote MCP client has no `curl | head -c`
+ * escape hatch. These helpers let a caller read note content in byte
+ * windows:
+ *
+ *   request:  `content_offset` (default 0) + `content_length` (byte budget)
+ *   response: `content` (the slice) + `content_offset` (effective start)
+ *             + `content_total_length` + `content_next_offset`
+ *             (`null` when the slice reaches the end)
+ *
+ * Unit is **UTF-8 bytes** — the same unit as `byteSize` on the lean
+ * NoteIndex shape, and the natural unit for budgeting response size. But
+ * naive byte-slicing can split a multi-byte codepoint, which would corrupt
+ * the JSON string. So slices always end on a codepoint boundary *within*
+ * the budget: a slice never exceeds `content_length` bytes but may come up
+ * to 3 bytes short when a multi-byte character straddles the cut. A
+ * `content_offset` that lands mid-codepoint (only possible when the caller
+ * computes offsets by hand — chained `content_next_offset` values are
+ * always boundary-aligned) is aligned DOWN to the codepoint's leading byte
+ * so no bytes are ever skipped; the effective start is echoed back as
+ * `content_offset` on the response.
+ *
+ * Reassembly invariant (pinned by content-range.test.ts): starting at
+ * offset 0 and following `content_next_offset` until `null`, the
+ * concatenation of slices is byte-identical to the full content.
+ */
+import { QueryError } from "./query-operators.js";
+/**
+ * Minimum accepted `content_length`. A UTF-8 codepoint is at most 4 bytes,
+ * so any budget >= 4 is guaranteed to make progress (the codepoint at the
+ * window start always fits). Budgets 1–3 could stall forever on a 4-byte
+ * emoji (empty slice, next_offset == offset); rejecting them up front is
+ * deterministic and simpler than a runtime "no progress" error.
+ */
+export const MIN_CONTENT_LENGTH = 4;
+export interface ContentRange {
+  /** Byte offset (UTF-8) to start reading from. */
+  offset: number;
+  /** Max bytes to return. Absent = read to the end. */
+  length?: number;
+}
+export interface ContentRangeFields {
+  content: string;
+  /** Effective start (requested offset aligned down to a codepoint boundary). */
+  content_offset: number;
+  /** Full content size in UTF-8 bytes. */
+  content_total_length: number;
+  /** Byte offset to resume from, or null when the slice reaches the end. */
+  content_next_offset: number | null;
+}
+function toNonNegativeInt(raw: unknown, name: string): number | undefined {
+  if (raw === undefined || raw === null) return undefined;
+  let n: number;
+  if (typeof raw === "number") {
+    n = raw;
+  } else if (typeof raw === "string") {
+    if (raw.trim() === "") return undefined; // `?content_offset=` — treat empty as absent
+    if (!/^\d+$/.test(raw.trim())) {
+      throw new QueryError(
+        `invalid \`${name}\` value ${JSON.stringify(raw)} — must be a non-negative integer (UTF-8 byte count).`,
+        "INVALID_QUERY",
+      );
+    }
+    n = Number(raw.trim());
+  } else {
+    throw new QueryError(
+      `invalid \`${name}\` value — must be a non-negative integer (UTF-8 byte count).`,
+      "INVALID_QUERY",
+    );
+  }
+  if (!Number.isSafeInteger(n) || n < 0) {
+    throw new QueryError(
+      `invalid \`${name}\` value ${JSON.stringify(raw)} — must be a non-negative integer (UTF-8 byte count).`,
+      "INVALID_QUERY",
+    );
+  }
+  return n;
+}
+/**
+ * Parse the `content_offset` / `content_length` pair. Returns `null` when
+ * neither is present (range mode off — response shape byte-identical to
+ * the no-pagination behavior). Throws `QueryError` (INVALID_QUERY) on
+ * negative / non-integer values or a `content_length` below
+ * {@link MIN_CONTENT_LENGTH}.
+ *
+ * Accepts numbers (MCP params) and decimal strings (REST query params);
+ * empty strings count as absent.
+ */
+export function parseContentRange(offsetRaw: unknown, lengthRaw: unknown): ContentRange | null {
+  const offset = toNonNegativeInt(offsetRaw, "content_offset");
+  const length = toNonNegativeInt(lengthRaw, "content_length");
+  if (offset === undefined && length === undefined) return null;
+  if (length !== undefined && length < MIN_CONTENT_LENGTH) {
+    throw new QueryError(
+      `invalid \`content_length\` value ${JSON.stringify(lengthRaw)} — must be at least ${MIN_CONTENT_LENGTH} bytes (the size of the largest UTF-8 codepoint, so every window makes progress).`,
+      "INVALID_QUERY",
+    );
+  }
+  return { offset: offset ?? 0, ...(length !== undefined ? { length } : {}) };
+}
+/**
+ * The error both faces raise when range params are combined with a
+ * response shape that excludes content (`include_content=false`, or a
+ * list query left on its lean default). Centralized so MCP and REST emit
+ * the same message.
+ */
+export function contentRangeRequiresContent(): QueryError {
+  return new QueryError(
+    `content_offset/content_length apply to note content, but content is not included in this response shape. Pass include_content=true (lists default to false) or drop the range params.`,
+    "INVALID_QUERY",
+  );
+}
+/** True for UTF-8 continuation bytes (0b10xxxxxx). */
+function isContinuationByte(b: number): boolean {
+  return (b & 0xc0) === 0x80;
+}
+/**
+ * Slice `content` to the requested byte window, never splitting a UTF-8
+ * codepoint and never exceeding `range.length` bytes. See module doc for
+ * the alignment rules.
+ */
+export function sliceContentRange(content: string, range: ContentRange): ContentRangeFields {
+  const bytes = Buffer.from(content, "utf8");
+  const total = bytes.byteLength;
+  // At/past the end: empty slice, complete. Graceful (not an error) so a
+  // pagination loop that overshoots — e.g. the note shrank between calls —
+  // terminates cleanly on `content_next_offset: null`.
+  if (range.offset >= total) {
+    return {
+      content: "",
+      content_offset: total,
+      content_total_length: total,
+      content_next_offset: null,
+    };
+  }
+  // Align the start DOWN to the leading byte of the codepoint containing
+  // `offset` — never skip bytes (a forward-align would drop them from
+  // every window and break reassembly).
+  let start = range.offset;
+  while (start > 0 && isContinuationByte(bytes[start]!)) start--;
+  // Window end: budget capped at total. Align DOWN so the slice doesn't
+  // end mid-codepoint — under the budget, never over.
+  let end = range.length === undefined ? total : Math.min(start + range.length, total);
+  while (end > start && end < total && isContinuationByte(bytes[end]!)) end--;
+  return {
+    content: bytes.subarray(start, end).toString("utf8"),
+    content_offset: start,
+    content_total_length: total,
+    content_next_offset: end >= total ? null : end,
+  };
+}
+/**
+ * Apply a content range to a shaped response object in place: replaces
+ * `content` with the slice and adds `content_offset`,
+ * `content_total_length`, `content_next_offset`. No-op fields are never
+ * added when range mode is off — callers only invoke this with a parsed
+ * (non-null) range.
+ */
+export function applyContentRange(
+  result: { content?: unknown; [key: string]: unknown },
+  range: ContentRange,
+): void {
+  const fields = sliceContentRange(typeof result.content === "string" ? result.content : "", range);
+  result.content = fields.content;
+  result.content_offset = fields.content_offset;
+  result.content_total_length = fields.content_total_length;
+  result.content_next_offset = fields.content_next_offset;
+}
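
The reasoning behind `MIN_CONTENT_LENGTH = 4` in the file above can be checked with a small worked example. This sketch re-implements only the window-end rule under stated assumptions: `windowEnd` and `MIN_BUDGET` are hypothetical names, and the function mirrors the align-down loop in `sliceContentRange` rather than calling it. With a budget of 1 to 3 bytes, a window that starts on a 4-byte codepoint ends where it began, so a paging loop would never advance.

```typescript
const MIN_BUDGET = 4; // mirrors MIN_CONTENT_LENGTH in the diff

// End of a window starting at `start` with `budget` bytes, aligned down so
// the slice never ends in the middle of a codepoint.
function windowEnd(buf: Uint8Array, start: number, budget: number): number {
  let end = Math.min(start + budget, buf.length);
  while (end > start && end < buf.length && (buf[end]! & 0xc0) === 0x80) end--;
  return end;
}

// Two 4-byte emoji, 8 bytes in total.
const twoEmoji = new TextEncoder().encode("\u{1F600}\u{1F601}");

// Budgets below the minimum: the end collapses back to the start (empty slice).
const stalled = [1, 2, 3].map((budget) => windowEnd(twoEmoji, 0, budget));
console.log(stalled); // → [ 0, 0, 0 ]

// The minimum budget always covers the codepoint at the window start.
console.log(windowEnd(twoEmoji, 0, MIN_BUDGET)); // → 4
```

Rejecting sub-minimum budgets at parse time, as the diff does, avoids having to detect this empty-slice stall at runtime.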