ada-agent 0.6.1 → 0.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10)

package/README.md +7 -0
package/docs/enterprise.md +83 -0
package/package.json +1 -1
package/src/client/cli.ts +39 -2
package/src/client/settings.ts +21 -3
package/src/selfcheck.ts +73 -0
package/src/server/enterprise.ts +346 -0
package/src/server/index.ts +163 -25
package/src/server/providers/anthropic.ts +10 -1
package/src/server/providers/openai-compat.ts +23 -4

package/README.md CHANGED

@@ -245,6 +245,13 @@ See **[docs/architecture.md](docs/architecture.md)** for the design (adapters, r
 flow, file layout), **[docs/orchestration.md](docs/orchestration.md)** for the agent strategies, and
 **[docs/integrations.md](docs/integrations.md)** for the HTTP API / SDK / ACP.
+## Enterprise
+`ada-server` doubles as an org **control plane**: per-user seat keys, an org policy (server-enforced
+model allowlist + tool rules pushed to every client), per-user usage metering, and an audit log —
+activated only when you create seats, file-backed, self-hosted in your own network. See
+**[docs/enterprise.md](docs/enterprise.md)** for the 2-minute bootstrap.
 ## Benchmarks
 ada can run **SWE-bench Verified** — it generates patches for real GitHub issues (one isolated repo

package/docs/enterprise.md ADDED

@@ -0,0 +1,83 @@
+# Enterprise: seats, policy, metering, audit
+`ada-server` doubles as an org **control plane**: per-user seat keys, an org policy the backend
+enforces (and clients apply locally), per-user usage metering, and an audit log. It's all
+file-backed under `~/.ada/server/` (override with `ADA_DATA_DIR`) — fine to ~50 seats; a database
+is the upgrade path, not the starting point.
+**Enterprise mode activates only when a seat exists or `ADA_ADMIN_KEY` is set.** With neither, the
+backend behaves exactly as before (dev-open, or `ADA_CLIENT_KEYS`/login).
+## Bootstrap (2 minutes)
+```bash
+# 1. start the backend with a bootstrap admin key (any long random string)
+export ADA_ADMIN_KEY=$(openssl rand -hex 24)
+export CLOUDFLARE_ACCOUNT_ID=... CLOUDFLARE_API_TOKEN=...     # your provider keys
+ada-server                     # banner shows: [ENTERPRISE (0 seats + admin key)]
+# 2. create a seat per developer — the key is shown ONCE
+curl -s -X POST -H "Authorization: Bearer $ADA_ADMIN_KEY" \
+     -d '{"name":"alice"}' http://localhost:8787/v1/users
+# → { "key": "ada_sk_…", "name": "alice", "role": "dev", "note": "shown once — store it now" }
+```
+Each developer sets their seat key and points at the backend:
+```bash
+export ADA_BACKEND_URL=https://ada.yourcompany.com/v1
+export ADA_CLIENT_KEY=ada_sk_…
+ada
+```
+## Seats
+```bash
+GET    /v1/users               # list: name, role, keyPrefix, created, disabled   (admin)
+POST   /v1/users               # {"name":"bob","role":"dev"|"admin"} → full key, once   (admin)
+DELETE /v1/users/<keyPrefix>   # disable a seat (≥12 chars of its key; kept for the audit trail)   (admin)
+```
+Full keys are never listed after creation — only a 14-char prefix. `ADA_ADMIN_KEY` is the
+break-glass admin; create an admin *seat* for day-to-day and keep the env key in a vault.
+## Org policy
+```bash
+curl -X PUT -H "Authorization: Bearer $ADA_ADMIN_KEY" http://localhost:8787/v1/policy -d '{
+  "models": ["@cf/*", "claude-*"],
+  "permissions": [
+    { "tool": "web_*",  "action": "deny" },
+    { "tool": "bash",   "pattern": "*curl*", "action": "ask" }
+  ]
+}'
+```
+- **`models`** — allowlist (`*` wildcards). Enforced **server-side** (403 + audit entry), so a
+  modified client can't route around it. Empty/absent = all models.
+- **`permissions`** — tool rules **pushed to clients** (fetched from `GET /v1/policy` at startup —
+  interactive, `-p` headless, `serve`, and `acp` alike). Merged restrictive-wins with local config:
+  an org `deny` beats any local `allow`, an org `ask` upgrades a local `allow`, and an org `allow`
+  can never *loosen* a local deny or the default gating. **Honest caveat:** tool rules run in the
+  *client*, so they govern well-behaved clients — and only **model-allowlist** denials are audited
+  server-side; tool-rule outcomes are not visible to `/v1/audit`. The **hard, server-enforced**
+  guarantees are: authentication, the model allowlist, provider pinning (when an allowlist is set,
+  the client's `provider` hint is ignored so a request can't be re-routed off-policy), and metering.
+## Usage & audit
+```bash
+GET /v1/usage?days=30    # totals + per-user + per-model {requests, promptTokens, completionTokens}   (admin)
+GET /v1/audit?limit=200  # seat_created / seat_disabled / policy_updated / policy_denied_model …      (admin)
+```
+Metering is captured server-side by teeing every chat response (streamed or not) and recording the
+upstream's reported token usage per user — clients can't underreport. Join with
+`ada catalog` prices for cost.
+## Deployment notes
+- Run behind TLS (caddy/nginx) — seat keys travel as bearer tokens.
+- `ADA_DATA_DIR` on a persistent volume; back it up (it's 4 small JSON/JSONL files).
+- One deployment = one org. Multi-org/SaaS is deliberately out of scope for v1.
+- Compliance paperwork (SOC 2, DPA) is process, not code — start it when a buyer asks.
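
Note on the "Usage & audit" section of `docs/enterprise.md`: the doc says to join usage with `ada catalog` prices for cost but does not show the join. A minimal sketch follows, assuming `GET /v1/usage` returns the `UsageSummary` shape declared in `src/server/enterprise.ts`. The `Price` type, the `costByModel` helper, and the price figures are hypothetical illustrations, not part of the package or of `ada catalog`.

```typescript
// Mirrors the Bucket / UsageSummary interfaces in src/server/enterprise.ts.
interface Bucket {
  requests: number;
  promptTokens: number;
  completionTokens: number;
}
interface UsageSummary {
  since: number;
  totals: Bucket;
  byUser: Record<string, Bucket>;
  byModel: Record<string, Bucket>;
}

// HYPOTHETICAL price entry: USD per million tokens. Real prices would come from `ada catalog`.
interface Price {
  promptPerMTok: number;
  completionPerMTok: number;
}

/** Cost in USD per model; null when the price table has no entry for that model. */
function costByModel(summary: UsageSummary, prices: Record<string, Price>): Record<string, number | null> {
  const out: Record<string, number | null> = {};
  for (const [model, bucket] of Object.entries(summary.byModel)) {
    const price = prices[model];
    out[model] = price
      ? (bucket.promptTokens * price.promptPerMTok + bucket.completionTokens * price.completionPerMTok) / 1_000_000
      : null;
  }
  return out;
}

// Example input with made-up numbers (same token counts as the selfcheck rows for "alice").
const exampleSummary: UsageSummary = {
  since: 0,
  totals: { requests: 2, promptTokens: 150, completionTokens: 30 },
  byUser: { alice: { requests: 2, promptTokens: 150, completionTokens: 30 } },
  byModel: { m1: { requests: 2, promptTokens: 150, completionTokens: 30 } },
};
const examplePrices: Record<string, Price> = { m1: { promptPerMTok: 2, completionPerMTok: 10 } };
console.log(costByModel(exampleSummary, examplePrices));
```

The `byUser` buckets in the summary are not broken down by model, so an exact per-user cost would need the raw rows in `usage.jsonl` rather than this summary.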

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "ada-agent",
-  "version": "0.6.1",
+  "version": "0.7.0",
   "description": "A from-zero terminal coding agent with a Cursor-style routing backend, ~285 skills, MCP connectors, and ask/plan/auto modes",
   "type": "module",
   "license": "MIT",

package/src/client/cli.ts CHANGED

@@ -3,7 +3,8 @@
 import { createInterface } from "node:readline/promises";
 import { spawnSync } from "node:child_process";
 import { basename, dirname, join, resolve } from "node:path";
-import { readFileSync } from "node:fs";
+import { mkdirSync, readFileSync, writeFileSync } from "node:fs";
+import { homedir } from "node:os";
 import { fileURLToPath } from "node:url";
 import { stdin, stdout } from "node:process";
 import OpenAI from "openai";
@@ -13,7 +14,7 @@ import { expandPrompt, loadPrompts } from "./prompts.ts";
 import { Session, list, type SessionMeta } from "./session.ts";
 import { deleteCredential, getCredential, listCredentials } from "../server/credentials.ts";
 import { deviceLogin, oauthConfig } from "../server/oauth.ts";
-import { addTrust, isTrusted, loadSettings, setActiveAgentPermissions, type Settings } from "./settings.ts";
+import { addTrust, isTrusted, loadSettings, setActiveAgentPermissions, setOrgPermissions, type PermRule, type Settings } from "./settings.ts";
 import { getCommands, loadExtensions } from "./extensions.ts";
 import { registerTool, setAsker } from "./tools.ts";
 import { addRemoteSkill, loadSkills, registerSkillTool } from "./skills.ts";
@@ -368,6 +369,38 @@ async function loginFlow(provider: string): Promise<boolean> {
   }
 }
+/** Fetch org policy from an enterprise backend and apply its tool rules locally (restrictive-wins
+ *  merge in settings.permissionFor; the backend enforces the model allowlist regardless). Caches the
+ *  last-good policy under ~/.ada so a transient fetch failure falls back to known rules instead of
+ *  silently dropping them. No-op against a non-enterprise backend. */
+async function applyOrgPolicy(): Promise<void> {
+  const cacheFile = join(homedir(), ".ada", "org-policy.json");
+  const enterprise = clientKey().startsWith("ada_sk_"); // a seat key ⇒ this is an enterprise backend
+  try {
+    const r = await fetch(`${BACKEND}/policy`, { headers: { authorization: `Bearer ${clientKey()}` }, signal: AbortSignal.timeout(3000) });
+    if (!r.ok) throw new Error(`HTTP ${r.status}`);
+    const policy = (await r.json()) as { permissions?: PermRule[] };
+    setOrgPermissions(policy.permissions ?? null);
+    try {
+      mkdirSync(join(homedir(), ".ada"), { recursive: true });
+      writeFileSync(cacheFile, JSON.stringify(policy));
+    } catch {
+      /* cache best-effort */
+    }
+    if (policy.permissions?.length) console.error(`\x1b[2m↳ org policy applied (${policy.permissions.length} rule${policy.permissions.length === 1 ? "" : "s"})\x1b[0m`);
+  } catch (e) {
+    if (!enterprise) return; // non-enterprise backend — local rules only, silently
+    // Enterprise backend unreachable — fall back to the last policy we saw, and say so loudly.
+    try {
+      const cached = JSON.parse(readFileSync(cacheFile, "utf8")) as { permissions?: PermRule[] };
+      setOrgPermissions(cached.permissions ?? null);
+      console.error(`\x1b[33m[warn] could not fetch org policy (${e instanceof Error ? e.message : e}) — using cached rules.\x1b[0m`);
+    } catch {
+      console.error(`\x1b[33m[warn] could not fetch org policy (${e instanceof Error ? e.message : e}) and no cache — org tool rules NOT applied this session.\x1b[0m`);
+    }
+  }
+}
 /** Startup login check: probe the backend; if it says 401, offer to sign in and rebuild the client. */
 async function ensureAuth(rl: RL, client: OpenAI): Promise<OpenAI> {
   let status: number;
@@ -658,6 +691,7 @@ async function main(): Promise<void> {
     registerSkillTool(loadSkills(trusted));
     await loadMcpServers(trusted);
     const client = makeClient();
+    await applyOrgPolicy(); // enterprise org rules apply to acp sessions too
     let model = process.env.ADA_MODEL || settings.model || "";
     if (!model) {
       try {
@@ -741,6 +775,7 @@ async function main(): Promise<void> {
     registerSkillTool(loadSkills(trusted));
     await loadMcpServers(trusted);
     const client = makeClient();
+    await applyOrgPolicy(); // enterprise org rules apply to serve sessions too
     let model = (process.argv[3] && !process.argv[3].startsWith("--") ? process.argv[3] : "") || process.env.ADA_MODEL || settings.model || "";
     if (!model) {
       try {
@@ -1082,6 +1117,7 @@ async function main(): Promise<void> {
   if (flags.print !== undefined) {
     const trusted = isTrusted(process.cwd());
     const settings = loadSettings(trusted);
+    await applyOrgPolicy(); // org tool rules bind headless runs too (CI is the classic bypass path)
     let pm = flags.model ?? process.env.ADA_MODEL ?? settings.model ?? scoped[0] ?? "";
     if (!pm) {
       try {
@@ -1125,6 +1161,7 @@ async function main(): Promise<void> {
   const mcp = await loadMcpServers(includeProject);
   client = await ensureAuth(rl, client); // always check login at startup; prompt if the backend says 401
+  await applyOrgPolicy(); // enterprise backends push org tool rules; no-op otherwise
   let session: Session;
   let history: Msg[] = [];

package/src/client/settings.ts CHANGED

@@ -66,9 +66,15 @@ export function setActiveAgentPermissions(rules: PermRule[] | null): void {
   activeAgentPerms = rules;
 }
-/** Evaluate the configured permission rules for a tool call. null = no matching rule (use defaults). */
-export function permissionFor(toolName: string, summary: string): PermAction | null {
-  const rules = activeAgentPerms ?? loadSettings(isTrusted(process.cwd())).permissions ?? [];
+// Org policy pushed by an enterprise backend (fetched from /v1/policy at startup). Merged
+// restrictive-wins: an org "deny" beats any local "allow"; an org "ask" upgrades a local "allow".
+// A local "deny" always stands — the org can tighten a user's setup, never loosen it.
+let orgPerms: PermRule[] | null = null;
+export function setOrgPermissions(rules: PermRule[] | null): void {
+  orgPerms = rules?.length ? rules : null;
+}
+function evalRules(rules: PermRule[], toolName: string, summary: string): PermAction | null {
   let result: PermAction | null = null;
   for (const r of rules) {
     const toolOk = !r.tool || r.tool === toolName || globMatch(r.tool, toolName);
@@ -78,6 +84,18 @@ export function permissionFor(toolName: string, summary: string): PermAction | n
   return result;
 }
+const STRICTNESS: Record<PermAction, number> = { allow: 0, ask: 1, deny: 2 };
+/** Evaluate the configured permission rules for a tool call. null = no matching rule (use defaults). */
+export function permissionFor(toolName: string, summary: string): PermAction | null {
+  const local = evalRules(activeAgentPerms ?? loadSettings(isTrusted(process.cwd())).permissions ?? [], toolName, summary);
+  if (!orgPerms) return local;
+  const org = evalRules(orgPerms, toolName, summary);
+  if (org === null) return local;
+  if (local === null) return org === "allow" ? null : org; // org can't LOOSEN the default gating, only tighten
+  return STRICTNESS[org] > STRICTNESS[local] ? org : local;
+}
 export function addTrust(dir: string): void {
   const g = readJson(GLOBAL);
   const dirs = new Set(g.trustedDirs ?? []);
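
Note on the `permissionFor` change in this file: the restrictive-wins merge is easier to review as a full decision table. The sketch below restates only the final three branches of `permissionFor` as a standalone function so the table can be printed without loading settings. The name `mergeDecision` is hypothetical, and rule matching (`evalRules`, `globMatch`) is deliberately left out.

```typescript
type PermAction = "allow" | "ask" | "deny";
type Decision = PermAction | null; // null = no matching rule, so the default gating applies

const STRICTNESS: Record<PermAction, number> = { allow: 0, ask: 1, deny: 2 };

// Hypothetical helper: same branch order as the tail of permissionFor in the diff above.
function mergeDecision(local: Decision, org: Decision): Decision {
  if (org === null) return local; // the org has no opinion on this call
  if (local === null) return org === "allow" ? null : org; // org may tighten the default, not loosen it
  return STRICTNESS[org] > STRICTNESS[local] ? org : local; // the stricter action wins
}

// Print every (local, org) combination.
const choices: Decision[] = [null, "allow", "ask", "deny"];
for (const local of choices) {
  const row = choices.map((org) => `org=${org ?? "none"} -> ${mergeDecision(local, org) ?? "default"}`);
  console.log(`local=${local ?? "none"}: ${row.join(", ")}`);
}
```

Reading the table row by row: a local `deny` stays `deny` in every column, and an org `allow` never changes the result, which matches the four assertions added to `selfcheck.ts`.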

package/src/selfcheck.ts CHANGED

@@ -293,6 +293,79 @@ async function main(): Promise<void> {
     assert.equal(route("anything-else"), "openrouter", "unmatched → openrouter");
   }
+  // --- enterprise control plane: seats, policy, metering, audit (temp data dir, no HTTP) ---
+  {
+    const dir = join(tmpdir(), `ada-ent-${Date.now()}`);
+    const ent = await import("./server/enterprise.ts");
+    process.env.ADA_DATA_DIR = dir;
+    try {
+      assert.equal(ent.enterpriseMode(dir), false, "no seats + no admin key → enterprise mode off");
+      const key = ent.createSeat("alice", "admin", dir);
+      assert.ok(key.startsWith("ada_sk_") && key.length > 40, "seat keys are long and prefixed");
+      assert.equal(ent.enterpriseMode(dir), true, "a seat activates enterprise mode");
+      assert.deepEqual(ent.identifySeat(key, dir), { user: "alice", role: "admin" }, "seat key resolves to its identity");
+      assert.equal(ent.identifySeat("ada_sk_wrong", dir), null, "unknown key → null");
+      // The auth-bypass the review caught: Object.prototype keys must NOT authenticate.
+      for (const evil of ["toString", "constructor", "__proto__", "valueOf", "hasOwnProperty"]) {
+        assert.equal(ent.identifySeat(evil, dir), null, `prototype key "${evil}" must not authenticate`);
+      }
+      assert.equal(ent.listSeats(dir)[0]!.keyPrefix.length, 14, "listing exposes only a key prefix");
+      assert.equal(ent.disableSeat(key.slice(0, 8), dir), null, "too-short prefix refused");
+      assert.equal(ent.disableSeat(key.slice(0, 14), dir), "alice", "disable by unique prefix");
+      assert.equal(ent.identifySeat(key, dir), null, "disabled seat no longer authenticates");
+      assert.ok(ent.modelAllowed("claude-opus-4-8", {}), "empty policy allows everything");
+      const pol = { models: ["@cf/*", "claude-*"] };
+      assert.ok(ent.modelAllowed("@cf/moonshotai/kimi-k2.7-code", pol), "wildcard allowlist matches");
+      assert.ok(!ent.modelAllowed("gpt-5", pol), "non-listed model denied");
+      ent.appendUsage({ ts: Date.now(), user: "alice", model: "m1", provider: "p", promptTokens: 100, completionTokens: 20 }, dir);
+      ent.appendUsage({ ts: Date.now(), user: "alice", model: "m1", provider: "p", promptTokens: 50, completionTokens: 10 }, dir);
+      ent.appendUsage({ ts: Date.now() - 90 * 86_400_000, user: "old", model: "m1", provider: "p", promptTokens: 999, completionTokens: 999 }, dir);
+      const sum = ent.usageSummary(30, dir);
+      assert.equal(sum.byUser.alice!.requests, 2, "usage aggregates per user");
+      assert.equal(sum.totals.promptTokens, 150, "old rows fall outside the window");
+      assert.ok(ent.auditTail(10, dir).some((e) => e.event === "seat_created"), "audit log records seat creation");
+      const sse = 'data: {"choices":[]}\n\ndata: {"choices":[],"usage":{"prompt_tokens":11,"completion_tokens":7,"completion_tokens_details":{"reasoning_tokens":2}}}\n\ndata: [DONE]\n\n';
+      assert.deepEqual(ent.extractLastUsage(sse), { promptTokens: 11, completionTokens: 7 }, "usage extracted from SSE tail (nested details ok)");
+      assert.equal(ent.extractLastUsage("no usage here"), null, "no usage → null");
+      // A trailing "usage": null must not hide the real one earlier in the stream.
+      assert.deepEqual(ent.extractLastUsage('{"usage":{"prompt_tokens":5,"completion_tokens":3}}\n{"usage":null}'), { promptTokens: 5, completionTokens: 3 }, "trailing usage:null skipped, real one found");
+      // policy validation rejects malformed shapes, accepts good ones
+      assert.ok("error" in ent.validatePolicy({ models: [1, 2] }), "non-string models rejected");
+      assert.ok("error" in ent.validatePolicy({ permissions: [{ tool: "x" }] }), "permission without action rejected");
+      assert.ok("policy" in ent.validatePolicy({ models: ["@cf/*"], permissions: [{ tool: "bash", action: "deny" }] }), "valid policy accepted");
+      // corrupt users.json → CorruptStore (fail-closed), NOT an empty map that unlocks the backend
+      writeFileSync(join(dir, "users.json"), "{ this is not json");
+      assert.throws(() => ent.loadSeats(dir), (e: unknown) => e instanceof ent.CorruptStore, "corrupt users.json throws CorruptStore");
+      assert.equal(ent.enterpriseMode(dir), true, "corrupt store → still enterprise (locked), never open");
+    } finally {
+      delete process.env.ADA_DATA_DIR;
+      rmSync(dir, { recursive: true, force: true });
+    }
+  }
+  // --- org policy merge: restrictive wins, org can tighten but never loosen ---
+  {
+    const { permissionFor, setActiveAgentPermissions, setOrgPermissions } = await import("./client/settings.ts");
+    setActiveAgentPermissions([{ tool: "bash", action: "allow" }]);
+    setOrgPermissions([{ tool: "bash", action: "deny" }]);
+    assert.equal(permissionFor("bash", "x"), "deny", "org deny beats local allow");
+    setOrgPermissions([{ tool: "bash", action: "ask" }]);
+    assert.equal(permissionFor("bash", "x"), "ask", "org ask upgrades local allow");
+    setActiveAgentPermissions([{ tool: "bash", action: "deny" }]);
+    setOrgPermissions([{ tool: "bash", action: "allow" }]);
+    assert.equal(permissionFor("bash", "x"), "deny", "org allow cannot loosen a local deny");
+    setActiveAgentPermissions([]);
+    assert.equal(permissionFor("bash", "x"), null, "org allow cannot loosen the default gating");
+    setOrgPermissions(null);
+    setActiveAgentPermissions(null);
+  }
   // --- @codebase semantic search: pure parts (no network / no embedding model needed) ---
   {
     const { chunkText, cosine, walkFiles } = await import("./client/embed-index.ts");

package/src/server/enterprise.ts ADDED

@@ -0,0 +1,346 @@
+// Enterprise control plane: seats (per-user client keys), org policy, usage metering, audit log.
+// One deployment = one org (it's self-hosted; multi-org is the SaaS upgrade path, not v1).
+//
+// Enterprise mode ACTIVATES when a seat exists or ADA_ADMIN_KEY is set — with neither, the backend
+// behaves exactly as before (dev-open / ADA_CLIENT_KEYS / login). Bootstrap:
+//
+//   ADA_ADMIN_KEY=<random> ada-server
+//   curl -X POST -H "Authorization: Bearer $ADA_ADMIN_KEY" localhost:8787/v1/users -d '{"name":"alice"}'
+//
+// Security posture (hardened after an adversarial review):
+//   - lookups are own-property + format-guarded (no prototype-key auth bypass);
+//   - writes are atomic (tmp + rename); a corrupt/unreadable store fails CLOSED (never dev-open);
+//   - key comparisons are timing-safe.
+// ponytail: file-backed under ~/.ada/server — fine to ~50 seats. Postgres + rotating usage logs are
+// the upgrade path when an org outgrows files (usageSummary/auditTail read whole files: OK to
+// low-millions of rows, then rotate).
+import { randomBytes, timingSafeEqual } from "node:crypto";
+import { appendFileSync, existsSync, mkdirSync, readFileSync, renameSync, writeFileSync } from "node:fs";
+import { homedir } from "node:os";
+import { join, resolve } from "node:path";
+// Resolved once: an empty ADA_DATA_DIR means "unset", and a relative path can't scatter the auth
+// store across working directories (which would itself be a fail-open).
+const DATA_DIR = resolve(process.env.ADA_DATA_DIR || join(homedir(), ".ada", "server"));
+function dataDir(): string {
+  return DATA_DIR;
+}
+/** Thrown when a store file exists but can't be read/parsed — callers must fail CLOSED, never open. */
+export class CorruptStore extends Error {}
+export interface Seat {
+  name: string;
+  role: "admin" | "dev";
+  created: string;
+  disabled?: boolean;
+}
+export interface PolicyRule {
+  tool?: string;
+  pattern?: string;
+  action: "allow" | "ask" | "deny";
+}
+export interface Policy {
+  models?: string[];
+  permissions?: PolicyRule[];
+}
+export interface UsageRow {
+  ts: number;
+  user: string;
+  model: string;
+  provider: string;
+  promptTokens: number;
+  completionTokens: number;
+}
+export interface Identity {
+  user: string;
+  role: "admin" | "dev";
+}
+const usersFile = (dir: string): string => join(dir, "users.json");
+const policyFile = (dir: string): string => join(dir, "policy.json");
+const usageFile = (dir: string): string => join(dir, "usage.jsonl");
+const auditFile = (dir: string): string => join(dir, "audit.jsonl");
+function atomicWrite(file: string, data: string): void {
+  mkdirSync(join(file, ".."), { recursive: true });
+  const tmp = `${file}.tmp.${process.pid}`;
+  writeFileSync(tmp, data);
+  renameSync(tmp, file); // atomic on the same filesystem — a crash can't leave a torn file
+}
+function errno(e: unknown): string | undefined {
+  return (e as NodeJS.ErrnoException).code;
+}
+/** Seats keyed by full key. Missing file → empty (prototype-free) map. Any OTHER read/parse error
+ *  → CorruptStore (callers fail closed — a torn users.json must not silently disable auth). */
+export function loadSeats(dir = dataDir()): Record<string, Seat> {
+  const map: Record<string, Seat> = Object.create(null); // no Object.prototype — belt-and-suspenders with the own-property check
+  let text: string;
+  try {
+    text = readFileSync(usersFile(dir), "utf8");
+  } catch (e) {
+    if (errno(e) === "ENOENT") return map;
+    throw new CorruptStore(`users.json unreadable: ${e instanceof Error ? e.message : e}`);
+  }
+  try {
+    const parsed = (JSON.parse(text) as { users?: Record<string, Seat> }).users ?? {};
+    for (const [k, v] of Object.entries(parsed)) map[k] = v;
+    return map;
+  } catch (e) {
+    throw new CorruptStore(`users.json corrupt: ${e instanceof Error ? e.message : e}`);
+  }
+}
+function saveSeats(seats: Record<string, Seat>, dir = dataDir()): void {
+  atomicWrite(usersFile(dir), JSON.stringify({ users: seats }, null, 2));
+}
+/** Enterprise mode = admin key set, or seats exist. A corrupt seat store counts as enterprise
+ *  (locked), never as "no seats" — fail closed. */
+export function enterpriseMode(dir = dataDir()): boolean {
+  if (process.env.ADA_ADMIN_KEY) return true;
+  try {
+    return Object.keys(loadSeats(dir)).length > 0;
+  } catch {
+    return true;
+  }
+}
+function timingEqual(a: string, b: string): boolean {
+  const ab = Buffer.from(a);
+  const bb = Buffer.from(b);
+  return ab.length === bb.length && timingSafeEqual(ab, bb);
+}
+/** Resolve a bearer token to a seat identity (or the bootstrap admin). Null = not a seat. Throws
+ *  CorruptStore if the seat store can't be read (caller returns 503, never dev-open). */
+export function identifySeat(token: string, dir = dataDir()): Identity | null {
+  const admin = process.env.ADA_ADMIN_KEY;
+  if (admin && timingEqual(token, admin)) return { user: "admin", role: "admin" };
+  if (!token.startsWith("ada_sk_")) return null; // format guard — "toString"/"__proto__"/… never reach the map
+  const seats = loadSeats(dir); // may throw CorruptStore
+  if (!Object.prototype.hasOwnProperty.call(seats, token)) return null; // own-property only
+  const seat = seats[token]!;
+  return seat.disabled ? null : { user: seat.name, role: seat.role };
+}
+/** Create a seat; returns its full key (shown once — only a prefix is ever listed again). */
+export function createSeat(name: string, role: "admin" | "dev" = "dev", dir = dataDir()): string {
+  const key = `ada_sk_${randomBytes(24).toString("hex")}`;
+  const seats = loadSeats(dir);
+  seats[key] = { name, role, created: new Date().toISOString() };
+  saveSeats(seats, dir);
+  appendAudit({ ts: Date.now(), user: "-", event: "seat_created", detail: `${name} (${role})` }, dir);
+  return key;
+}
+/** Disable (not delete — the audit trail keeps the history) the seat whose key starts with prefix. */
+export function disableSeat(prefix: string, dir = dataDir()): string | null {
+  if (prefix.length < 12) return null; // too short to be safely unique
+  const seats = loadSeats(dir);
+  const keys = Object.keys(seats).filter((k) => k.startsWith(prefix));
+  if (keys.length !== 1) return null;
+  seats[keys[0]!]!.disabled = true;
+  saveSeats(seats, dir);
+  appendAudit({ ts: Date.now(), user: "-", event: "seat_disabled", detail: seats[keys[0]!]!.name }, dir);
+  return seats[keys[0]!]!.name;
+}
+/** Key prefixes + metadata for listing — full keys are never returned after creation. Display-only,
+ *  so a corrupt store yields [] rather than crashing the banner. */
+export function listSeats(dir = dataDir()): Array<Seat & { keyPrefix: string }> {
+  try {
+    return Object.entries(loadSeats(dir)).map(([k, s]) => ({ ...s, keyPrefix: k.slice(0, 14) }));
+  } catch {
+    return [];
+  }
+}
+let lastGoodPolicy: Policy | null = null;
+/** Missing file → {} (no policy = allow all, legitimate). A corrupt EXISTING file → last-known-good
+ *  if we have one, else CorruptStore (fail closed — a security control must not degrade to allow-all). */
+export function loadPolicy(dir = dataDir()): Policy {
+  let text: string;
+  try {
+    text = readFileSync(policyFile(dir), "utf8");
+  } catch (e) {
+    if (errno(e) === "ENOENT") return {};
+    if (lastGoodPolicy) return lastGoodPolicy;
+    throw new CorruptStore(`policy.json unreadable: ${e instanceof Error ? e.message : e}`);
+  }
+  try {
+    lastGoodPolicy = JSON.parse(text) as Policy;
+    return lastGoodPolicy;
+  } catch (e) {
+    if (lastGoodPolicy) return lastGoodPolicy;
+    throw new CorruptStore(`policy.json corrupt: ${e instanceof Error ? e.message : e}`);
+  }
+}
+export function savePolicy(p: Policy, dir = dataDir()): void {
+  atomicWrite(policyFile(dir), JSON.stringify(p, null, 2));
+  lastGoodPolicy = p;
+  appendAudit({ ts: Date.now(), user: "-", event: "policy_updated", detail: JSON.stringify(p).slice(0, 300) }, dir);
+}
+/** Validate a policy shape from the wire. Returns the typed policy or an error message. */
+export function validatePolicy(raw: unknown): { policy: Policy } | { error: string } {
+  if (!raw || typeof raw !== "object" || Array.isArray(raw)) return { error: "policy must be a JSON object" };
+  const r = raw as Record<string, unknown>;
+  const out: Policy = {};
+  if (r.models !== undefined) {
+    if (!Array.isArray(r.models) || r.models.some((m) => typeof m !== "string" || !m.trim())) return { error: "models must be an array of non-empty strings" };
+    out.models = r.models as string[];
+  }
+  if (r.permissions !== undefined) {
+    if (!Array.isArray(r.permissions)) return { error: "permissions must be an array" };
+    for (const p of r.permissions) {
+      const rule = p as Record<string, unknown>;
+      if (!rule || typeof rule !== "object" || !["allow", "ask", "deny"].includes(rule.action as string)) return { error: "each permission needs action: allow|ask|deny" };
+      if (rule.tool !== undefined && typeof rule.tool !== "string") return { error: "permission.tool must be a string" };
+      if (rule.pattern !== undefined && typeof rule.pattern !== "string") return { error: "permission.pattern must be a string" };
+    }
+    out.permissions = r.permissions as PolicyRule[];
+  }
+  return { policy: out };
+}
+function globMatch(pattern: string, s: string): boolean {
+  const re = new RegExp(`^${pattern.split("*").map((p) => p.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")).join(".*")}$`, "i");
+  return re.test(s);
+}
+/** Is this model allowed by org policy? No/empty allowlist = everything allowed. */
+export function modelAllowed(model: string, policy: Policy): boolean {
+  if (!Array.isArray(policy.models) || !policy.models.length) return true;
+  return policy.models.some((p) => globMatch(p, model));
+}
+export function appendUsage(row: UsageRow, dir = dataDir()): void {
+  try {
+    mkdirSync(dir, { recursive: true });
+    appendFileSync(usageFile(dir), `${JSON.stringify(row)}\n`);
+  } catch {
+    /* metering is best-effort; never fail a request over it */
+  }
+}
+export interface AuditRow {
+  ts: number;
+  user: string;
+  event: string;
+  detail: string;
+}
+export function appendAudit(row: AuditRow, dir = dataDir()): void {
+  try {
+    mkdirSync(dir, { recursive: true });
+    appendFileSync(auditFile(dir), `${JSON.stringify(row)}\n`);
+  } catch {
+    /* best-effort */
+  }
+}
+export function auditTail(limit = 200, dir = dataDir()): AuditRow[] {
+  let lines: string[];
+  try {
+    lines = readFileSync(auditFile(dir), "utf8").split("\n").filter(Boolean);
+  } catch {
+    return [];
+  }
+  const out: AuditRow[] = [];
+  for (const l of lines.slice(-limit)) {
+    try {
+      out.push(JSON.parse(l) as AuditRow); // skip a torn last line instead of losing the whole view
+    } catch {
+      /* skip corrupt line */
+    }
+  }
+  return out;
+}
+interface Bucket {
+  requests: number;
+  promptTokens: number;
+  completionTokens: number;
+}
+export interface UsageSummary {
+  since: number;
+  totals: Bucket;
+  byUser: Record<string, Bucket>;
+  byModel: Record<string, Bucket>;
+}
+export function usageSummary(days = 30, dir = dataDir()): UsageSummary {
+  const since = Date.now() - days * 86_400_000;
+  const zero = (): Bucket => ({ requests: 0, promptTokens: 0, completionTokens: 0 });
+  const out: UsageSummary = { since, totals: zero(), byUser: Object.create(null), byModel: Object.create(null) };
+  let lines: string[] = [];
+  try {
+    lines = readFileSync(usageFile(dir), "utf8").split("\n").filter(Boolean);
+  } catch {
+    return out;
+  }
+  for (const l of lines) {
+    let r: UsageRow;
+    try {
+      r = JSON.parse(l) as UsageRow;
+    } catch {
+      continue;
+    }
+    if (r.ts < since) continue;
+    for (const b of [out.totals, (out.byUser[r.user] ??= zero()), (out.byModel[r.model] ??= zero())]) {
+      b.requests++;
+      b.promptTokens += r.promptTokens || 0;
+      b.completionTokens += r.completionTokens || 0;
+    }
+  }
+  return out;
+}
+function matchBraces(text: string, start: number): string | null {
+  let depth = 0;
+  let inStr = false;
+  let esc = false;
+  for (let i = start; i < text.length; i++) {
+    const c = text[i];
+    if (inStr) {
+      if (esc) esc = false;
+      else if (c === "\\") esc = true;
+      else if (c === '"') inStr = false;
+    } else if (c === '"') inStr = true;
+    else if (c === "{") depth++;
+    else if (c === "}" && --depth === 0) return text.slice(start, i + 1);
+  }
+  return null;
+}
+/** Pull the LAST real `"usage": { … }` object out of streamed/response text. Skips a trailing
+ *  `"usage": null` and keeps scanning backwards, so a null in a late frame doesn't hide a real one. */
+export function extractLastUsage(text: string): { promptTokens: number; completionTokens: number } | null {
+  let at = text.lastIndexOf('"usage"');
+  while (at >= 0) {
+    const brace = text.indexOf("{", at + 7);
+    const colon = text.indexOf(":", at + 7);
+    if (brace >= 0 && colon >= 0 && text.slice(colon + 1, brace).trim() === "") {
+      const obj = matchBraces(text, brace);
+      if (obj) {
+        try {
+          const u = JSON.parse(obj) as { prompt_tokens?: number; completion_tokens?: number };
+          if (u.prompt_tokens != null || u.completion_tokens != null) return { promptTokens: u.prompt_tokens ?? 0, completionTokens: u.completion_tokens ?? 0 };
+        } catch {
+          /* malformed — keep looking backwards */
+        }
+      }
+    }
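+    // lastIndexOf clamps a negative start to 0, so a rejected match at index 0 would be found again forever.
+    at = at > 0 ? text.lastIndexOf('"usage"', at - 1) : -1;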
+  }
+  return null;
+}
+export function storeExists(dir = dataDir()): boolean {
+  return existsSync(usersFile(dir)) || existsSync(policyFile(dir));
+}
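
The backward scan in `extractLastUsage` can be easier to follow against a concrete stream. The block below is a condensed sketch modeled on the two functions above, not the package's exported code. The sample SSE frames are made up, and the `at > 0` guard in the loop is an addition of this sketch (`lastIndexOf` treats a negative start position as 0).

```typescript
// Sketch only: condensed restatement of the brace matcher and backward usage scan.
type Usage = { promptTokens: number; completionTokens: number };

function matchBraces(text: string, start: number): string | null {
  let depth = 0;
  let inStr = false;
  let esc = false;
  for (let i = start; i < text.length; i++) {
    const c = text[i];
    if (inStr) {
      // Inside a JSON string: only an unescaped quote ends it.
      if (esc) esc = false;
      else if (c === "\\") esc = true;
      else if (c === '"') inStr = false;
    } else if (c === '"') inStr = true;
    else if (c === "{") depth++;
    else if (c === "}" && --depth === 0) return text.slice(start, i + 1);
  }
  return null; // unbalanced, e.g. the object was cut off
}

function extractLastUsage(text: string): Usage | null {
  let at = text.lastIndexOf('"usage"');
  while (at >= 0) {
    const brace = text.indexOf("{", at + 7);
    const colon = text.indexOf(":", at + 7);
    // Accept only `"usage": {`, i.e. nothing but whitespace between the colon and the brace.
    if (brace >= 0 && colon >= 0 && text.slice(colon + 1, brace).trim() === "") {
      const obj = matchBraces(text, brace);
      if (obj) {
        try {
          const u = JSON.parse(obj) as { prompt_tokens?: number; completion_tokens?: number };
          if (u.prompt_tokens != null || u.completion_tokens != null) {
            return { promptTokens: u.prompt_tokens ?? 0, completionTokens: u.completion_tokens ?? 0 };
          }
        } catch {
          // malformed: keep looking backwards
        }
      }
    }
    // Guard added in this sketch: a negative start position is clamped to 0 by lastIndexOf.
    at = at > 0 ? text.lastIndexOf('"usage"', at - 1) : -1;
  }
  return null;
}

// Made-up SSE frames: the real usage object is followed by a later frame with "usage": null.
const sse = [
  'data: {"choices":[{"delta":{"content":"hi"}}],"usage":null}',
  'data: {"choices":[],"usage":{"prompt_tokens":12,"completion_tokens":3,"total_tokens":15}}',
  'data: {"choices":[],"usage":null}',
  "data: [DONE]",
].join("\n\n");

const found = extractLastUsage(sse);
console.log(found);
```

The trailing `"usage":null` frame is skipped because no `{` directly follows its colon, so the scan steps back to the frame that carries real counts.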

package/src/server/index.ts CHANGED

@@ -5,6 +5,7 @@
 import { createServer } from "node:http";
 import type { IncomingMessage, ServerResponse } from "node:http";
 import { PORT, PROVIDERS, clientKeys, configuredProviders, isConfigured } from "./config.ts";
+import { CorruptStore, type Identity, appendAudit, appendUsage, auditTail, createSeat, disableSeat, enterpriseMode, extractLastUsage, identifySeat, listSeats, loadPolicy, modelAllowed, savePolicy, usageSummary, validatePolicy } from "./enterprise.ts";
 import { allowedUsers, isAllowed, verifyIdentity } from "./identity.ts";
 import { adapterFor } from "./providers/registry.ts";
 import { route } from "./router.ts";
@@ -19,20 +20,32 @@ function readBody(req: IncomingMessage): Promise<string> {
 }
 function locked(): boolean {
-  return clientKeys() !== null || allowedUsers() !== null || !!process.env.ADA_REQUIRE_LOGIN;
+  return enterpriseMode() || clientKeys() !== null || allowedUsers() !== null || !!process.env.ADA_REQUIRE_LOGIN;
 }
-/** A request is allowed if it carries a known static client key, OR a valid GitHub/Google
- *  login token (allowlisted). With nothing configured, the backend is open (dev mode). */
-async function authorized(req: IncomingMessage): Promise<boolean> {
-  if (!locked()) return true; // dev mode: no auth configured
+/** Resolve a request to WHO is making it. Order: seat key / ADA_ADMIN_KEY (enterprise), legacy
+ *  static client key, GitHub/Google login. With no auth configured, the backend is open (dev mode).
+ *  Returns "corrupt" if the seat store can't be read — the caller MUST 503, never fall through to
+ *  dev-open. Null = unauthorized. */
+async function identify(req: IncomingMessage): Promise<Identity | "corrupt" | null> {
   const h = req.headers["authorization"];
   const token = typeof h === "string" && h.startsWith("Bearer ") ? h.slice(7) : "";
-  if (!token) return false;
-  const keys = clientKeys();
-  if (keys?.includes(token)) return true; // static client key
-  const id = await verifyIdentity(token); // GitHub / Google login
-  return !!id && isAllowed(id.user);
+  if (token) {
+    let seat: Identity | null;
+    try {
+      seat = identifySeat(token);
+    } catch (e) {
+      if (e instanceof CorruptStore) return "corrupt";
+      throw e;
+    }
+    if (seat) return seat;
+    // Legacy ADA_CLIENT_KEYS are NOT honored once seats/admin-key exist — enterprise supersedes them,
+    // so a disabled seat can't be resurrected via a still-configured shared key.
+    if (!enterpriseMode() && clientKeys()?.includes(token)) return { user: "team", role: "dev" };
+    const id = await verifyIdentity(token); // GitHub / Google login
+    if (id && isAllowed(id.user)) return { user: id.user, role: "dev" };
+  }
+  return locked() ? null : { user: "dev", role: "dev" }; // dev mode: open
 }
 function json(res: ServerResponse, status: number, obj: unknown): void {
@@ -49,7 +62,7 @@ async function handleModels(res: ServerResponse): Promise<void> {
   json(res, 200, { object: "list", data });
 }
-async function handleChat(req: IncomingMessage, res: ServerResponse): Promise<void> {
+async function handleChat(req: IncomingMessage, res: ServerResponse, who: Identity): Promise<void> {
   const raw = await readBody(req);
   let body: Record<string, unknown>;
   try {
@@ -61,32 +74,90 @@ async function handleChat(req: IncomingMessage, res: ServerResponse): Promise<vo
   const model = String(body.model ?? "");
   if (!model) return json(res, 400, { error: { message: "missing 'model'" } });
-  const provider = route(model, typeof body.provider === "string" ? body.provider : undefined);
+  // Org policy: model allowlist (enterprise). Enforced server-side so a modified client can't skip it.
+  let policy: import("./enterprise.ts").Policy;
+  try {
+    policy = loadPolicy();
+  } catch (e) {
+    if (e instanceof CorruptStore) return json(res, 503, { error: { message: "org policy unreadable — refusing requests (fail-closed)" } });
+    throw e;
+  }
+  if (!modelAllowed(model, policy)) {
+    appendAudit({ ts: Date.now(), user: who.user, event: "policy_denied_model", detail: model });
+    return json(res, 403, { error: { message: `model '${model}' is not allowed by org policy (allowed: ${policy.models!.join(", ")})` } });
+  }
+  // When an allowlist is active, IGNORE the client's `provider` hint — else a seat holder could
+  // send an allowlisted model id with a different provider and leak the body to it before the
+  // upstream rejects the id. Route by the model id only.
+  const explicit = policy.models?.length ? undefined : typeof body.provider === "string" ? body.provider : undefined;
+  const provider = route(model, explicit);
   if (!isConfigured(provider)) {
     return json(res, 400, {
       error: { message: `provider '${provider}' not configured — set ${PROVIDERS[provider].keyEnv} on the backend` },
     });
   }
+  // Metering must not be client-suppressible: force the upstream to emit a usage object on streams
+  // (OpenAI-compat only sends it when include_usage is set). Harmless for providers that ignore it.
+  if (body.stream) body.stream_options = { ...((body.stream_options as Record<string, unknown>) ?? {}), include_usage: true };
+  // Usage metering: tee the response (streamed or not) and record the last usage object the
+  // upstream reported. Wrapping res keeps this in ONE place for every adapter.
+  {
+    let tail = "";
+    const scan = (c: unknown): void => {
+      if (typeof c === "string" || Buffer.isBuffer(c)) tail = (tail + c.toString()).slice(-16_384);
+    };
+    const write = res.write.bind(res);
+    const end = res.end.bind(res);
+    res.write = ((c: never, ...a: never[]) => {
+      scan(c);
+      return write(c, ...a);
+    }) as typeof res.write;
+    res.end = ((c?: never, ...a: never[]) => {
+      scan(c);
+      const u = extractLastUsage(tail);
+      if (u) appendUsage({ ts: Date.now(), user: who.user, model, provider, promptTokens: u.promptTokens, completionTokens: u.completionTokens });
+      return end(c, ...a);
+    }) as typeof res.end;
+  }
   delete body.provider; // our routing hint; never forward it upstream
   await adapterFor(provider).chat({ provider, model, body, res });
 }
 /** Embeddings for @codebase semantic search — forwarded to the ollama provider's
- *  OpenAI-compatible endpoint (embedding models only live there for now). */
-async function handleEmbeddings(req: IncomingMessage, res: ServerResponse): Promise<void> {
+ *  OpenAI-compatible endpoint (embedding models only live there for now). Subject to the same org
+ *  model allowlist as chat, and metered/attributed. */
+async function handleEmbeddings(req: IncomingMessage, res: ServerResponse, who: Identity): Promise<void> {
   const raw = await readBody(req);
+  let body: Record<string, unknown>;
   try {
-    JSON.parse(raw);
+    body = JSON.parse(raw);
   } catch {
     return json(res, 400, { error: { message: "invalid JSON body" } });
   }
+  const model = String(body.model ?? "");
+  let policy: import("./enterprise.ts").Policy;
+  try {
+    policy = loadPolicy();
+  } catch (e) {
+    if (e instanceof CorruptStore) return json(res, 503, { error: { message: "org policy unreadable — refusing requests" } });
+    throw e;
+  }
+  if (model && !modelAllowed(model, policy)) {
+    appendAudit({ ts: Date.now(), user: who.user, event: "policy_denied_model", detail: `embeddings:${model}` });
+    return json(res, 403, { error: { message: `embedding model '${model}' is not allowed by org policy` } });
+  }
   const upstream = await fetch(`${PROVIDERS.ollama.baseURL}/embeddings`, {
     method: "POST",
     headers: { "content-type": "application/json" },
     body: raw,
   });
   const text = await upstream.text();
+  const u = extractLastUsage(text); // embedding responses report prompt_tokens
+  if (u) appendUsage({ ts: Date.now(), user: who.user, model, provider: "ollama", promptTokens: u.promptTokens, completionTokens: 0 });
   res.writeHead(upstream.status, { "content-type": "application/json" });
   res.end(text);
 }
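
The metering block in `handleChat` wraps `res.write` and `res.end` so that every adapter is metered in one place. A minimal sketch of that tee pattern follows, assuming a simplified string-only `Sink` interface in place of Node's `ServerResponse`. `Sink`, `teeTail`, and the 8-character window are hypothetical names and values chosen for illustration (the diff keeps a 16,384-character tail).

```typescript
// Hypothetical stand-in for the two ServerResponse methods the tee touches.
interface Sink {
  write(chunk: string): boolean;
  end(chunk?: string): void;
}

// Wrap a sink so each chunk is also appended to a bounded tail buffer.
// onEnd receives the final tail exactly once, just before the real end() runs.
function teeTail(sink: Sink, maxTail: number, onEnd: (tail: string) => void): void {
  let tail = "";
  const write = sink.write.bind(sink);
  const end = sink.end.bind(sink);
  sink.write = (chunk: string): boolean => {
    tail = (tail + chunk).slice(-maxTail); // keep only the last maxTail characters
    return write(chunk);
  };
  sink.end = (chunk?: string): void => {
    if (chunk !== undefined) tail = (tail + chunk).slice(-maxTail);
    onEnd(tail);
    end(chunk);
  };
}

// Usage: a fake sink that records what was actually forwarded.
const sent: string[] = [];
const sink: Sink = {
  write(chunk: string): boolean {
    sent.push(chunk);
    return true;
  },
  end(chunk?: string): void {
    if (chunk !== undefined) sent.push(chunk);
  },
};

let seenTail = "";
teeTail(sink, 8, (t) => {
  seenTail = t;
});
sink.write("abcdef");
sink.write("ghij");
sink.end("kl");
console.log(seenTail, sent.join(""));
```

Bounding the tail keeps memory flat on long streams. The trade-off is that a usage object larger than the window, or one that arrives earlier than the last window of output, would not be seen.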
@@ -98,22 +169,85 @@ const server = createServer(async (req, res) => {
       res.writeHead(200, { "content-type": "text/plain" });
       return res.end("ada backend ok");
     }
+    const who = await identify(req);
+    if (who === "corrupt") return json(res, 503, { error: { message: "auth store unreadable — refusing all requests (fail-closed). Fix ~/.ada/server/users.json." } });
+    if (!who) return json(res, 401, { error: { message: "unauthorized — invalid client key, seat key, or login" } });
     if (req.method === "GET" && url.pathname === "/v1/whoami") {
-      if (!(await authorized(req))) return json(res, 401, { error: { message: "not logged in" } });
-      return json(res, 200, { ok: true });
+      return json(res, 200, { ok: true, user: who.user, role: who.role });
     }
     if (req.method === "GET" && url.pathname === "/v1/models") {
-      if (!(await authorized(req))) return json(res, 401, { error: { message: "unauthorized — invalid client key or login" } });
       return await handleModels(res);
     }
     if (req.method === "POST" && url.pathname === "/v1/chat/completions") {
-      if (!(await authorized(req))) return json(res, 401, { error: { message: "unauthorized — invalid client key or login" } });
-      return await handleChat(req, res);
+      return await handleChat(req, res, who);
     }
     if (req.method === "POST" && url.pathname === "/v1/embeddings") {
-      if (!(await authorized(req))) return json(res, 401, { error: { message: "unauthorized — invalid client key or login" } });
-      return await handleEmbeddings(req, res);
+      return await handleEmbeddings(req, res, who);
     }
+    // ---- enterprise control plane ----
+    if (url.pathname === "/v1/policy") {
+      if (req.method === "GET") {
+        // any seat — clients fetch this and apply the tool rules locally
+        let policy: unknown;
+        try {
+          policy = loadPolicy();
+        } catch (e) {
+          if (e instanceof CorruptStore) return json(res, 503, { error: { message: "org policy unreadable" } });
+          throw e;
+        }
+        appendAudit({ ts: Date.now(), user: who.user, event: "policy_fetched", detail: "" }); // spot seats that never fetch
+        return json(res, 200, policy);
+      }
+      if (req.method === "PUT") {
+        if (who.role !== "admin") return json(res, 403, { error: { message: "admin only" } });
+        let parsed: unknown;
+        try {
+          parsed = JSON.parse(await readBody(req));
+        } catch {
+          return json(res, 400, { error: { message: "invalid JSON body" } });
+        }
+        const v = validatePolicy(parsed);
+        if ("error" in v) return json(res, 400, { error: { message: v.error } });
+        savePolicy(v.policy);
+        return json(res, 200, { ok: true });
+      }
+    }
+    if (url.pathname === "/v1/users") {
+      if (who.role !== "admin") return json(res, 403, { error: { message: "admin only" } });
+      if (req.method === "GET") return json(res, 200, { users: listSeats() });
+      if (req.method === "POST") {
+        let name = "";
+        let role: "admin" | "dev" = "dev";
+        try {
+          const b = JSON.parse(await readBody(req)) as { name?: string; role?: string };
+          name = String(b.name ?? "").trim();
+          if (b.role === "admin") role = "admin";
+        } catch {
+          /* falls through to the name check */
+        }
+        if (!name) return json(res, 400, { error: { message: "missing 'name'" } });
+        return json(res, 200, { key: createSeat(name, role), name, role, note: "shown once — store it now" });
+      }
+    }
+    {
+      const m = req.method === "DELETE" && url.pathname.match(/^\/v1\/users\/([\w]+)$/);
+      if (m) {
+        if (who.role !== "admin") return json(res, 403, { error: { message: "admin only" } });
+        const name = disableSeat(m[1]!);
+        return json(res, name ? 200 : 404, name ? { ok: true, disabled: name } : { error: { message: "unknown or ambiguous key prefix (send ≥12 chars)" } });
+      }
+    }
+    if (req.method === "GET" && url.pathname === "/v1/usage") {
+      if (who.role !== "admin") return json(res, 403, { error: { message: "admin only" } });
+      return json(res, 200, usageSummary(Math.min(Number(url.searchParams.get("days")) || 30, 365)));
+    }
+    if (req.method === "GET" && url.pathname === "/v1/audit") {
+      if (who.role !== "admin") return json(res, 403, { error: { message: "admin only" } });
+      return json(res, 200, { events: auditTail(Math.min(Number(url.searchParams.get("limit")) || 200, 2000)) });
+    }
     return json(res, 404, { error: { message: "not found" } });
   } catch (err) {
     if (!res.headersSent) json(res, 500, { error: { message: err instanceof Error ? err.message : String(err) } });
@@ -127,9 +261,13 @@ const server = createServer(async (req, res) => {
 });
 server.listen(PORT, () => {
-  const auth = locked()
-    ? `auth ON (client keys + GitHub/Google login${allowedUsers() ? `, allowlist: ${allowedUsers()!.length}` : ""})`
-    : "AUTH DISABLED (dev) — set ADA_CLIENT_KEYS or ADA_ALLOWED_USERS to lock down";
+  if (enterpriseMode() && clientKeys()) console.warn("\x1b[33m[warn] ADA_CLIENT_KEYS is set but ignored in enterprise mode (seats supersede it) — unset it to avoid confusion.\x1b[0m");
+  const seats = listSeats().filter((s) => !s.disabled).length;
+  const auth = enterpriseMode()
+    ? `ENTERPRISE (${seats} seat${seats === 1 ? "" : "s"}${process.env.ADA_ADMIN_KEY ? " + admin key" : ""})`
+    : locked()
+      ? `auth ON (client keys + GitHub/Google login${allowedUsers() ? `, allowlist: ${allowedUsers()!.length}` : ""})`
+      : "AUTH DISABLED (dev) — set ADA_CLIENT_KEYS or ADA_ADMIN_KEY to lock down";
   const provs = configuredProviders();
   console.log(`ada backend → http://localhost:${PORT}  [${auth}]`);
   console.log(`providers: ${provs.length ? provs.join(", ") : "(none configured — set provider API keys)"}`);
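
The control-plane routes added above (`/v1/users`, `/v1/usage`, `/v1/audit`, `/v1/policy`) all take a bearer token and, except for `GET /v1/policy`, require an admin identity. The sketch below only builds request descriptions rather than sending them. `adminCall` is a hypothetical helper, and the base URL, key, and key prefix are placeholders rather than values from the package.

```typescript
// Hypothetical helper: describes a request without sending it, so the shape is easy to inspect.
type Call = { method: string; url: string; headers: Record<string, string>; body?: string };

function adminCall(base: string, adminKey: string, method: string, path: string, payload?: unknown): Call {
  const call: Call = {
    method,
    url: base + path,
    headers: { authorization: `Bearer ${adminKey}` }, // the server reads "Bearer <token>"
  };
  if (payload !== undefined) {
    call.headers["content-type"] = "application/json";
    call.body = JSON.stringify(payload);
  }
  return call;
}

const base = "http://localhost:8787"; // assumption: wherever the backend listens
const key = "example-admin-key"; // assumption: placeholder, not a real key format

// POST /v1/users creates a seat; the response carries the key once.
const createSeatCall = adminCall(base, key, "POST", "/v1/users", { name: "alice", role: "dev" });
// GET /v1/usage?days=N returns the usage summary (the server caps days at 365).
const usageCall = adminCall(base, key, "GET", "/v1/usage?days=7");
// DELETE /v1/users/<key prefix> disables a seat; the prefix here is a placeholder.
const disableCall = adminCall(base, key, "DELETE", "/v1/users/abcdef012345");

console.log(createSeatCall, usageCall, disableCall);
```

Each description maps directly onto `fetch(call.url, { method: call.method, headers: call.headers, body: call.body })` once a real base URL and key are substituted.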

package/src/server/providers/anthropic.ts CHANGED

@@ -114,6 +114,8 @@ export const anthropicAdapter: Adapter = {
     let stop = "stop";
     let toolIndex = -1;
+    let inTokens = 0; // Anthropic reports input on message_start, cumulative output on message_delta
+    let outTokens = 0;
     try {
       const client = await getClient();
@@ -148,7 +150,9 @@ export const anthropicAdapter: Adapter = {
       );
       for await (const event of stream) {
-        if (event.type === "content_block_start") {
+        if (event.type === "message_start") {
+          inTokens = (event.message as { usage?: { input_tokens?: number } }).usage?.input_tokens ?? 0;
+        } else if (event.type === "content_block_start") {
           const cb = event.content_block as { type: string; id?: string; name?: string };
           if (cb.type === "tool_use") {
             toolIndex++;
@@ -161,10 +165,15 @@ export const anthropicAdapter: Adapter = {
         } else if (event.type === "message_delta") {
           const reason = (event.delta as { stop_reason?: string | null }).stop_reason;
           if (reason) stop = mapStop(reason);
+          const ot = (event as { usage?: { output_tokens?: number } }).usage?.output_tokens;
+          if (typeof ot === "number") outTokens = ot; // cumulative — take the latest
         }
       }
       chunk({}, stop);
+      // Emit an OpenAI-shaped usage chunk so the backend's metering (and the client's own token
+      // counters) work for Claude too — Anthropic doesn't send one in this wire format.
+      writeChunk(res, { id, object: "chat.completion.chunk", created, model, choices: [], usage: { prompt_tokens: inTokens, completion_tokens: outTokens, total_tokens: inTokens + outTokens } });
       endStream(res);
     } catch (err) {
       chunk({ content: `\n[backend: anthropic error: ${err instanceof Error ? err.message : String(err)}]` }, "stop");
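
The adapter change above takes input tokens from `message_start` and the latest cumulative output count from `message_delta`, then emits one OpenAI-shaped usage object. A minimal sketch of that folding step as a pure function follows. `StreamEvent` and `foldUsage` are hypothetical names, the event types are reduced to the fields this diff reads, and the sample numbers are invented.

```typescript
// Reduced event shapes: only the fields the adapter reads for usage (illustrative types).
type StreamEvent =
  | { type: "message_start"; message: { usage?: { input_tokens?: number } } }
  | { type: "message_delta"; usage?: { output_tokens?: number } }
  | { type: "content_block_delta" };

type OpenAIUsage = { prompt_tokens: number; completion_tokens: number; total_tokens: number };

function foldUsage(events: StreamEvent[]): OpenAIUsage {
  let inTokens = 0;
  let outTokens = 0;
  for (const event of events) {
    if (event.type === "message_start") {
      inTokens = event.message.usage?.input_tokens ?? 0;
    } else if (event.type === "message_delta") {
      const ot = event.usage?.output_tokens;
      if (typeof ot === "number") outTokens = ot; // cumulative: keep the latest, do not sum
    }
  }
  return { prompt_tokens: inTokens, completion_tokens: outTokens, total_tokens: inTokens + outTokens };
}

// Invented sample: two deltas report 4 and then 9 cumulative output tokens.
const folded = foldUsage([
  { type: "message_start", message: { usage: { input_tokens: 25 } } },
  { type: "content_block_delta" },
  { type: "message_delta", usage: { output_tokens: 4 } },
  { type: "message_delta", usage: { output_tokens: 9 } },
]);
console.log(folded);
```

Summing the deltas instead would give 13 output tokens here, which is why the adapter overwrites `outTokens` rather than adding to it.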

package/src/server/providers/openai-compat.ts CHANGED

@@ -42,14 +42,25 @@ export const openAICompatAdapter: Adapter = {
     // endpoint wants the bare id. (Cloudflare's "@cf/…" ids aren't "cloudflare/…", so they pass through.)
     const prefix = `${provider}/`;
     const outBody = typeof body.model === "string" && body.model.startsWith(prefix) ? { ...body, model: body.model.slice(prefix.length) } : body;
+    // If the client goes away, abort the upstream too — else the full completion is generated,
+    // billed, and (for enterprise) metered against a request nobody is reading.
+    const ac = new AbortController();
+    res.on("close", () => {
+      if (!res.writableEnded) ac.abort();
+    });
     let upstream: Awaited<ReturnType<typeof fetch>>;
     try {
       upstream = await fetch(`${def.baseURL}/chat/completions`, {
         method: "POST",
         headers: { "content-type": "application/json", ...(await authHeaders(provider)) },
         body: JSON.stringify(outBody),
+        signal: ac.signal,
       });
     } catch (e) {
+      if (ac.signal.aborted) {
+        res.end();
+        return;
+      }
       res.writeHead(502, { "content-type": "application/json" });
       res.end(
         JSON.stringify({
@@ -71,10 +82,18 @@ export const openAICompatAdapter: Adapter = {
     if (body.stream) {
       res.writeHead(200, SSE_HEADERS);
       const reader = upstream.body.getReader();
-      for (;;) {
-        const { done, value } = await reader.read();
-        if (done) break;
-        if (value) res.write(Buffer.from(value));
+      try {
+        for (;;) {
+          const { done, value } = await reader.read();
+          if (done) break;
+          if (res.destroyed) {
+            await reader.cancel(); // client gone → stop pulling tokens from upstream
+            break;
+          }
+          if (value) res.write(Buffer.from(value));
+        }
+      } catch {
+        /* aborted mid-stream (client closed) — nothing more to do */
       }
       res.end();
     } else {