@llamaventures/cli 1.2.4 → 1.3.1

Files changed (10)

package/AGENT_BRIEFING.md CHANGED

@@ -76,8 +76,8 @@ Default to action. Ask only for genuine judgment.
 | Error | What to do |
 |---|---|
-| `Error[NO_AUTH]` | Tell user: mint a token at `command.llamaventures.vc/settings/tokens`, then `llama token set <llc_...>`. |
-| `Error[UNAUTHORIZED]` | Credentials rejected (revoked / expired / wrong account). Same recovery — re-mint. |
+| `Error[NO_AUTH]` | Tell user: run `llama auth login` (browser sign-in via Google, OAuth tokens stored in OS Keychain). For unattended/CI: mint a long-lived PAT at `command.llamaventures.vc/settings/tokens` and `llama token set <llc_...>`. |
+| `Error[UNAUTHORIZED]` | Credentials rejected (revoked / expired / wrong account). If using OAuth: `llama auth login` again. If using PAT: re-mint. |
 | HTTP 5xx | Wait 5s, retry once. Two failures → tell the user "Command unavailable, will retry later." |
 | `Too many failed authentication attempts` (HTTP 429) | IP rate-limit. Wait until next UTC hour, OR switch network (e.g. tether to phone). |
@@ -87,7 +87,9 @@ Default to action. Ask only for genuine judgment.
 ```bash
 # Auth
-llama auth status
+llama auth login              # browser PKCE flow → OAuth tokens in OS Keychain (recommended)
+llama auth logout             # revoke + clear local
+llama auth status             # show identity + active method
 # Pipeline — read
 llama deal search "<name>"
@@ -137,7 +139,6 @@ Tools available:
 - `timeline` / `post`
 - `mentions_list`
 - `pitch_start` / `pitch_send_message` / `pitch_upload_file` / `pitch_status` / `pitch_finalize` — public intake (no Llama token needed; for founders / EAs / external agents)
-- `llama_api` — escape hatch for any endpoint not yet wrapped (path must start `/api/`)
 You can also fetch this exact briefing as an MCP prompt named `agent_briefing`.
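
The recovery table in this hunk could be applied by an agent with a small lookup like the one below. This is only a sketch: `recoveryFor` and `RECOVERY` are hypothetical names that do not exist in the package, and the matching assumes errors keep the `Error[CODE]` prefix shown in the briefing.

```javascript
// Hypothetical helper, not part of @llamaventures/cli.
// Maps a structured error message to the recovery step from the briefing table.
const RECOVERY = {
  NO_AUTH:
    "Run `llama auth login`; for CI, mint a PAT and run `llama token set <llc_...>`.",
  UNAUTHORIZED:
    "OAuth: run `llama auth login` again. PAT: re-mint the token.",
};

function recoveryFor(message) {
  // Assumption: structured errors carry an "Error[CODE]" marker somewhere in the text.
  const match = /Error\[([A-Z_]+)\]/.exec(message);
  if (match && Object.hasOwn(RECOVERY, match[1])) return RECOVERY[match[1]];
  return null; // unknown error: no scripted recovery
}

console.log(recoveryFor("Error[NO_AUTH]: no credentials found"));
```

Returning `null` for unknown codes keeps the caller in charge of the fallback, for example the "wait 5s, retry once" rule for HTTP 5xx.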

package/README.md CHANGED

@@ -103,16 +103,37 @@ The client tries credentials **in this order**, on every call:
 | # | Source | Header sent | Best for |
 |---|--------|-------------|----------|
-| 1 | `gcloud auth print-identity-token` | `Authorization: Bearer …` | Team members on a workstation (zero config) |
-| 2 | `$LLAMA_TOKEN` env var | `X-Llama-Token` | CI runners, sandboxed cloud agents |
-| 3 | `~/.llama/token` (mode `0600`) | `X-Llama-Token` | Persistent local install |
-| 4 | `~/.llama-command/config.json` | `X-Llama-Token` | Legacy CLI v0.1 — auto-migrates to `~/.llama/token` on first read |
+| 1 | `llama auth login` (OAuth 2.1, OS Keychain) | `Authorization: Bearer …` | **Recommended for everyone.** One-shot browser login; tokens auto-refresh and survive reboots. |
+| 2 | `gcloud auth print-identity-token` | `Authorization: Bearer …` | Workstations with gcloud already wired (zero config) |
+| 3 | `$LLAMA_TOKEN` env var | `X-Llama-Token` | CI runners, sandboxed cloud agents |
+| 4 | `~/.llama/token` (mode `0600`) | `X-Llama-Token` | Persistent local install (legacy PATs) |
+| 5 | `~/.llama-command/config.json` | `X-Llama-Token` | CLI v0.1 — auto-migrates to `~/.llama/token` |
-Both Bearer and X-Llama-Token are sent if both exist. The server tries Bearer
-first; on verification failure it falls through to X-Llama-Token. Inspect the
-resolved identity any time with `llama auth status`.
+If both Bearer and X-Llama-Token are present, both are sent — the server tries
+Bearer first and falls through to X-Llama-Token on verification failure.
+Inspect the resolved identity any time with `llama auth status`.
-### Zero-config — recommended for team members
+### Browser sign-in — recommended
+```bash
+llama auth login           # opens browser → Google sign-in → consent → done
+llama auth status          # → activeMethod=oauth, scope, identity
+llama deal search acme-ai  # ready
+```
+`llama auth login` runs an OAuth 2.1 PKCE + RFC 8252 loopback flow against
+`https://command.llamaventures.vc`, exchanges the code for an access + refresh
+token pair, and stores them in the OS Keychain (macOS Keychain / Windows
+Credential Manager / Linux Secret Service via [`@napi-rs/keyring`](https://www.npmjs.com/package/@napi-rs/keyring)).
+Linux containers without libsecret use a 0600-mode file at `~/.llama/oauth.json`
+— same posture `gcloud` / `gh` / `aws` ship with on Linux servers. Refresh
+tokens rotate transparently when the access token nears expiry; a cross-process
+file lock prevents two shells from burning each other's refresh during
+concurrent calls.
+`llama auth logout` revokes server-side via RFC 7009 and clears local storage.
+### gcloud — for machines already wired with `gcloud auth login`
 ```bash
 gcloud auth login          # one-time; pick your @llamaventures.vc account
@@ -120,7 +141,7 @@ llama auth status          # → role + email
 llama deal search acme-ai  # ready
 ```
-### Manual token — for machines without `gcloud`, or stable CI
+### Long-lived PAT — for CI / unattended environments
 1. Sign in to https://command.llamaventures.vc.
 2. Open `/settings/tokens` → **Mint Token**.
@@ -321,9 +342,8 @@ llama pitch upload ./deck.pdf
 llama pitch                       # interactive REPL
 ```
-Server-enforced caps (same as the web flow): 5 sessions/IP/day,
-3 sessions/email/day, 30 min idle timeout, 100 messages/session,
-1 M tokens/session.
+Server-enforced rate limits apply (per-IP, per-email, per-session). If you
+hit a limit, the CLI surfaces the server's response message.
 This is genuine **agent-to-agent**: your AI helps you tell the story, our
 intake agent extracts the structured fields and produces the verdict.
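
As a rough illustration of the credential order in the README table, the sketch below builds the header set from already-resolved credentials. The function and argument names (`buildAuthHeaders`, `oauthAccess`, `gcloudToken`, `pat`) are assumptions for this example; the package's real logic lives in `getAuthHeaders()` in `lib/client.mjs` and also handles token refresh.

```javascript
// Sketch of the documented precedence; names are illustrative only.
// An OAuth access token wins the Bearer slot, gcloud is the fallback,
// and a PAT (if any) is always sent alongside in X-Llama-Token.
function buildAuthHeaders({ oauthAccess = null, gcloudToken = null, pat = null } = {}) {
  const headers = {};
  const bearer = oauthAccess || gcloudToken;
  if (bearer) headers["Authorization"] = `Bearer ${bearer}`;
  if (pat) headers["X-Llama-Token"] = pat;
  return headers;
}

console.log(buildAuthHeaders({ gcloudToken: "id-token", pat: "llc_example" }));
```

Sending both headers is what lets the server fall through to the PAT when Bearer verification fails.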

package/bin/llama-mcp.mjs CHANGED

@@ -364,9 +364,8 @@ server.registerTool(
 // founder's agent talks to ours, structured intake gets captured, and a
 // 12-dimension verdict is returned.
 //
-// Anti-abuse caps are server-enforced (5 sessions/IP/day, 3/email/day,
-// 30min idle, 100 msg cap, 1M token cap, global daily cap). The MCP tools
-// surface those rejections as text back to the agent.
+// Anti-abuse rate limits are server-enforced. The MCP tools surface
+// any server-side rejections as text back to the agent.
 function asTextResult(text, isError = false) {
   return {
@@ -383,8 +382,8 @@ server.registerTool(
       "when a founder (the user) wants to pitch their company to Llama. " +
       "Requires their name + email. Returns a session_id; the conversation " +
       "is then maintained via pitch_send_message until the agent finalizes. " +
-      "Caps (server-enforced): 5 sessions/IP/day, 3 sessions/email/day, " +
-      "30min idle timeout. No Llama Command token needed.",
+      "Server-enforced rate limits apply (per-IP, per-email, per-session). " +
+      "No Llama Command token needed.",
     inputSchema: {
       name: z.string().describe("the founder's full name (max 100 chars)"),
       email: z.string().describe("the founder's email (deliverable, not a disposable domain)"),
@@ -447,8 +446,9 @@ server.registerTool(
     description:
       "Attach a file (deck, one-pager, deck PDF, screenshot, etc.) to the " +
       "active pitch session. Server allows pdf / pptx / ppt / docx / doc / " +
-      "xlsx / xls / png / jpg / webp / heic / heif / txt / md, max 50 MB, " +
-      "10 files per session. Returns a drive_file_id; the intake agent will " +
+      "xlsx / xls / png / jpg / webp / heic / heif / txt / md, with " +
+      "server-enforced size and per-session count limits. " +
+      "Returns a drive_file_id; the intake agent will " +
       "pick the file up via list_uploaded_files / read_uploaded_file on its " +
       "next turn (so call pitch_send_message with a one-line note like " +
       "'I just uploaded our pitch deck' so the agent knows to look).",
@@ -493,7 +493,7 @@ server.registerTool(
       "server-side intake agent to finalize — the agent decides that on its " +
       "own once the pitch is sufficient. Use this for cleanup after a session " +
       "ends, or to abandon a session early. The server-side session will " +
-      "naturally expire after 30min of idle.",
+      "naturally expire after the server's idle timeout.",
     inputSchema: {},
   },
   async () => {
@@ -505,7 +505,7 @@ server.registerTool(
           {
             cleared: before.active,
             previous_session: before.active ? before : null,
-            note: "Local pitch session state cleared. Server-side session may still be active for ~30min until idle timeout.",
+            note: "Local pitch session state cleared. Server-side session may still be active until its idle timeout.",
           },
           null,
           2

package/bin/llama.mjs CHANGED

@@ -28,6 +28,8 @@ import {
   startExternalSession,
   uploadExternalFile,
 } from "../lib/external.mjs";
+import { LLAMA_CLI_CLIENT_ID, pkceLoopbackFlow, revokeToken as revokeOAuthToken } from "../lib/oauth-flow.mjs";
+import { deleteBundle, detectBackend, readBundle, writeBundle } from "../lib/oauth-storage.mjs";
 function parseFlags(args) {
   const flags = {};
@@ -240,7 +242,7 @@ Skill corrections (persona-owner pushback — read by persona-watcher):
   llama skill-correction add <skill-slug> "<correction text>" [--deal <uuid>] [--block <blockId>]
   llama skill-correction delete <id>
   Server enforces persona owner OR system admin on POST/DELETE; GET is open.
-  External personas (owner_email=null, e.g. virtual-liu-yi) are admin-only for write.
+  External personas (owner_email=null) are admin-only for write.
 Mentions / Inbox:
   llama mentions                                       # default: my unresolved cues
@@ -316,9 +318,9 @@ Inspect / clean up:
   llama pitch status         # session id, idle minutes, finalized?
   llama pitch end            # clear local session state
-Caps (server-enforced):
-  5 sessions per IP per day, 3 per email per day, 60min idle timeout,
-  100 messages per session, 1M tokens per session.
+Caps:
+  Server-enforced per-IP / per-email / per-session rate limits apply.
+  The CLI surfaces server messages if a limit is hit.
 Environment:
   LLAMA_API_URL              override base URL (dev: http://localhost:3000)
@@ -409,7 +411,7 @@ Environment:
       cleared: !!had,
       session_file: EXTERNAL_SESSION_FILE,
       note: had
-        ? "Local session state cleared. Server-side session may still be active until idle timeout (60min)."
+        ? "Local session state cleared. Server-side session may still be active until idle timeout."
         : "No local session was active.",
     });
     return;
@@ -697,8 +699,11 @@ https://command.llamaventures.vc/settings/tokens, run
           ? "~/.llama-command/config.json (legacy)"
           : null;
+    const oauthBundle = await readBundle();
+    const oauthBackend = oauthBundle ? await detectBackend() : null;
     let serverCheck = "skipped (no credentials)";
-    if (bearer || token) {
+    if (oauthBundle?.access_token || bearer || token) {
       try {
         const me = await request("GET", "/api/me");
         serverCheck = `ok — authenticated as ${me?.email ?? "unknown"} (role: ${me?.role ?? "unknown"})`;
@@ -707,12 +712,101 @@ https://command.llamaventures.vc/settings/tokens, run
       }
     }
-    print({
+    const out = {
       baseUrl: getBaseUrl(),
+      activeMethod: oauthBundle?.access_token
+        ? "oauth"
+        : bearer
+          ? "gcloud-bearer"
+          : token
+            ? "llama-token"
+            : "none",
+      oauth: oauthBundle
+        ? {
+            storage: oauthBackend,
+            client_id: oauthBundle.client_id,
+            scope: oauthBundle.scope,
+            issuer: oauthBundle.issuer,
+            expires_in_seconds: Math.max(0, Math.round((oauthBundle.expires_at - Date.now()) / 1000)),
+          }
+        : "absent (run `llama auth login`)",
       gcloudIdentityToken: bearer ? "present" : "absent",
       llamaToken: token ? `${token.slice(0, 8)}...${token.slice(-4)}` : "absent",
       llamaTokenSource: tokenSrc,
       serverCheck,
+    };
+    print(out);
+    return;
+  }
+  // ============================================================
+  // auth login — PKCE + loopback browser flow
+  // ============================================================
+  if (area === "auth" && action === "login") {
+    const { flags } = parseFlags(rest);
+    const requestedScope = typeof flags.scope === "string" && flags.scope.trim()
+      ? flags.scope.trim()
+      : "read write";
+    const baseUrl = getBaseUrl();
+    const resource = baseUrl; // general API audience (oauthApiResource on the server)
+    console.error(`Signing in to ${baseUrl} as Llama CLI (client_id=${LLAMA_CLI_CLIENT_ID})...`);
+    const bundle = await pkceLoopbackFlow({ baseUrl, scope: requestedScope, resource });
+    const stored = await writeBundle({
+      access_token: bundle.access_token,
+      refresh_token: bundle.refresh_token,
+      expires_at: Date.now() + (bundle.expires_in ?? 3600) * 1000,
+      scope: bundle.scope,
+      client_id: bundle.client_id,
+      issuer: bundle.issuer,
+      resource: bundle.resource,
+      created_at: Date.now(),
+    });
+    // Verify by hitting /api/me with the new token.
+    let identity = "(unable to verify — /api/me did not respond)";
+    try {
+      const me = await request("GET", "/api/me");
+      identity = `${me?.email ?? "unknown"} (role: ${me?.role ?? "unknown"})`;
+    } catch (e) {
+      identity = `verification failed: ${e.message.split("\n")[0]}`;
+    }
+    print({
+      ok: true,
+      message: "Signed in",
+      identity,
+      storage: stored.backend,
+      scope: bundle.scope,
+      expires_in_seconds: bundle.expires_in,
+    });
+    return;
+  }
+  // ============================================================
+  // auth logout — revoke + clear local
+  // ============================================================
+  if (area === "auth" && action === "logout") {
+    const bundle = await readBundle();
+    if (!bundle) {
+      print({ ok: true, message: "No OAuth credentials to clear" });
+      return;
+    }
+    let revoked = false;
+    try {
+      revoked = await revokeOAuthToken({
+        baseUrl: bundle.issuer ?? getBaseUrl(),
+        token: bundle.refresh_token,
+        tokenTypeHint: "refresh_token",
+      });
+    } catch {
+      revoked = false;
+    }
+    await deleteBundle();
+    print({
+      ok: true,
+      message: "Signed out — local credentials cleared",
+      serverRevoke: revoked ? "succeeded" : "failed (server unreachable or token already invalid; local state cleared anyway)",
     });
     return;
   }
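
The `auth status` fields added in this file can be reproduced in isolation. The sketch below is a simplified, pure version under stated assumptions: `summarizeAuth` is a hypothetical name, the clock is passed in as `now` so the result is deterministic, and only three of the printed fields are modelled.

```javascript
// Hypothetical, simplified model of the `llama auth status` output fields.
function summarizeAuth({ oauthBundle = null, bearer = null, token = null, now = Date.now() } = {}) {
  // Same precedence as the diff: OAuth, then gcloud Bearer, then PAT.
  const activeMethod = oauthBundle?.access_token
    ? "oauth"
    : bearer
      ? "gcloud-bearer"
      : token
        ? "llama-token"
        : "none";
  return {
    activeMethod,
    // Clamp at zero so an expired token never reports a negative lifetime.
    expires_in_seconds: oauthBundle
      ? Math.max(0, Math.round((oauthBundle.expires_at - now) / 1000))
      : null,
    // Show only the first 8 and last 4 characters of a PAT.
    llamaToken: token ? `${token.slice(0, 8)}...${token.slice(-4)}` : "absent",
  };
}

console.log(summarizeAuth({ token: "llc_abcdefghijklmnop" }));
```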

package/lib/client.mjs CHANGED

@@ -139,18 +139,54 @@ export async function tryGcloudIdentityToken() {
   }
 }
-// Build the auth header set. If both Bearer and X-Llama-Token are available,
-// send both — the server tries Bearer first and falls through to
-// X-Llama-Token on verification failure.
+// Build the auth header set. Priority order (server tries them in this
+// order too and falls through on failure):
+//
+//   1. OAuth access token from Keychain (`llama auth login`) — Bearer
+//      header. Auto-refreshes if near expiry. Highest priority because
+//      it's scope-aware + revocable.
+//   2. gcloud identity token — Bearer header. Falls back if no OAuth.
+//   3. X-Llama-Token PAT — sent alongside whatever Bearer was set, so
+//      server's authenticate() can fall through on Bearer-verify failure.
 export async function getAuthHeaders() {
   const headers = {};
-  const bearer = await tryGcloudIdentityToken();
-  if (bearer) headers["Authorization"] = `Bearer ${bearer}`;
+  // Lazy import — keeps zero-OAuth call paths fast and avoids loading
+  // @napi-rs/keyring's native binding when the user isn't using OAuth.
+  let oauthAccess = null;
+  try {
+    const { getValidAccessToken } = await import("./oauth-refresh.mjs");
+    oauthAccess = await getValidAccessToken();
+  } catch {
+    // OAuth modules failed to load (e.g. keyring native binding missing
+    // on this platform) — fall through to gcloud / PAT silently.
+  }
+  if (oauthAccess) {
+    headers["Authorization"] = `Bearer ${oauthAccess}`;
+  } else {
+    const bearer = await tryGcloudIdentityToken();
+    if (bearer) headers["Authorization"] = `Bearer ${bearer}`;
+  }
   const token = getToken();
   if (token) headers["X-Llama-Token"] = token;
   return headers;
 }
+/**
+ * Was the Bearer header on this request set from an OAuth access token?
+ * `request()` uses this to decide whether a 401 should trigger a
+ * refresh-and-retry-once path (only meaningful when we sent an OAuth
+ * token; gcloud / PAT 401s should NOT retry blindly).
+ */
+async function bearerCameFromOAuth() {
+  try {
+    const { readBundle } = await import("./oauth-storage.mjs");
+    const bundle = await readBundle();
+    return Boolean(bundle?.access_token);
+  } catch {
+    return false;
+  }
+}
 // Structured no-credential error. Format is stable so agents can pattern-match
 // `Error[NO_AUTH]` and trigger a recovery flow.
 function noAuthError() {
@@ -181,6 +217,10 @@ function unauthorizedError() {
 }
 export async function request(method, endpoint, body) {
+  return requestWithRetry(method, endpoint, body, /* allowRetry */ true);
+}
+async function requestWithRetry(method, endpoint, body, allowRetry) {
   const authHeaders = await getAuthHeaders();
   if (Object.keys(authHeaders).length === 0) throw noAuthError();
   const res = await fetch(`${getBaseUrl()}${endpoint}`, {
@@ -191,7 +231,29 @@ export async function request(method, endpoint, body) {
     },
     body: body === undefined ? undefined : JSON.stringify(body),
   });
+  // 401 + we sent an OAuth Bearer + this is the first attempt → try a
+  // forced refresh once. Covers two cases: (a) clock skew between client
+  // and server pushed us past expiry mid-request, (b) server-side
+  // revocation occurred between the client cache and now. Either way,
+  // the refresh either succeeds (we retry once with the new access
+  // token) or fails (refresh token also dead — bubble UNAUTHORIZED).
+  if (res.status === 401 && allowRetry && (await bearerCameFromOAuth())) {
+    let refreshed = null;
+    try {
+      const { forceRefresh } = await import("./oauth-refresh.mjs");
+      refreshed = await forceRefresh();
+    } catch {
+      refreshed = null;
+    }
+    if (refreshed) {
+      return requestWithRetry(method, endpoint, body, /* allowRetry */ false);
+    }
+    throw unauthorizedError();
+  }
   if (res.status === 401) throw unauthorizedError();
   const text = await res.text();
   let data;
   try {
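
The refresh-and-retry-once path in this hunk can be exercised without a network. In the sketch below, `send` and `refresh` are injected stand-ins (assumed names) for the real `fetch` call and for `forceRefresh()`, and `usedOAuth` stands in for `bearerCameFromOAuth()`; it illustrates the control flow only and is not the package's implementation.

```javascript
// Sketch of "retry once after a forced refresh"; all parameter names are
// illustrative. `send` performs the HTTP call, `refresh` returns a new
// access token or null.
async function requestOnce({ send, refresh, usedOAuth }, allowRetry = true) {
  const res = await send();
  if (res.status === 401 && allowRetry && usedOAuth) {
    let refreshed = null;
    try {
      refreshed = await refresh();
    } catch {
      refreshed = null; // a failing refresh is treated like "no new token"
    }
    // The second attempt passes allowRetry = false, so it can never loop.
    if (refreshed) return requestOnce({ send, refresh, usedOAuth }, false);
    throw new Error("Error[UNAUTHORIZED]");
  }
  if (res.status === 401) throw new Error("Error[UNAUTHORIZED]");
  return res;
}
```

Passing `allowRetry` explicitly bounds the recursion at one extra attempt, which is why a 401 from gcloud or a PAT surfaces immediately instead of retrying.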

package/lib/external.mjs CHANGED

@@ -18,14 +18,11 @@ import { getBaseUrl } from "./client.mjs";
 const SESSION_DIR = path.join(os.homedir(), ".llama");
 const SESSION_FILE = path.join(SESSION_DIR, "external-session.json");
-// Server-side proof-of-work prefix. Must agree with
-// llama-command/src/lib/external-pow-client.ts. ~65k iterations average on
-// commodity hardware (~50–500ms in node).
+// Server-side proof-of-work prefix. Server-validated; tune in tandem
+// with the server policy if changed.
 const POW_DIFFICULTY_PREFIX = "0000";
-// Server requires ts_rendered to be at least 3s old (anti-replay). We
-// backdate by 4s when computing PoW so the request lands inside the
-// validity window without waiting.
+// Backdate offset for the rendered-at timestamp passed to the server.
 const POW_BACKDATE_MS = 4_000;
 // ============================================================
@@ -360,13 +357,13 @@ export async function uploadExternalFile(filePath) {
   if (!res.ok) {
     if (res.status === 413) {
-      throw new Error("File too large (max 50 MB).");
+      throw new Error("File too large.");
     }
     if (res.status === 415) {
       throw new Error(`MIME type "${mimetype}" not in server allowlist.`);
     }
     if (res.status === 429) {
-      throw new Error("Upload cap reached (10 files per session).");
+      throw new Error("Upload cap reached.");
     }
     if (res.status === 401 || res.status === 403) {
       throw new Error(

package/lib/oauth-flow.mjs ADDED

@@ -0,0 +1,245 @@
+// OAuth 2.1 PKCE + loopback flow for the Llama CLI.
+//
+// Mirrors `gh auth login` / `gcloud auth login`: the CLI binds an
+// ephemeral HTTP server on 127.0.0.1, opens the browser to the
+// authorization endpoint with a PKCE challenge + state, and waits for
+// the user to approve. The browser redirects to the loopback URL
+// carrying the auth code; the local server captures it and shuts down.
+// The CLI then exchanges the code (with the PKCE verifier) for tokens.
+//
+// Pure stdlib: node:crypto for PKCE, node:http for the loopback server,
+// child_process for the platform-specific browser open. No third-party
+// HTTP/OAuth client.
+//
+// RFC compliance: OAuth 2.1 + RFC 7636 PKCE S256 + RFC 8252 native-app
+// loopback flow + RFC 8707 audience parameter.
+import { createHash, randomBytes } from "crypto";
+import http from "http";
+import { spawn } from "child_process";
+const CLIENT_ID = "llama-cli-official";
+const REDIRECT_PATH = "/callback";
+const FLOW_TIMEOUT_MS = 5 * 60 * 1000; // 5 min — generous for slow Google sign-in
+// ============================================================
+// PKCE primitives
+// ============================================================
+function base64url(buf) {
+  return buf
+    .toString("base64")
+    .replace(/=/g, "")
+    .replace(/\+/g, "-")
+    .replace(/\//g, "_");
+}
+export function generateVerifier() {
+  // RFC 7636 §4.1: 43-128 chars from unreserved alphabet. 32 random bytes
+  // → 43 base64url chars (256 bits entropy).
+  return base64url(randomBytes(32));
+}
+export function challengeFor(verifier) {
+  return base64url(createHash("sha256").update(verifier).digest());
+}
+// ============================================================
+// Browser launcher
+// ============================================================
+function openBrowser(url) {
+  // Platform-native open. We never block on it (the user closes the
+  // browser when they're done; the loopback server is what we wait for).
+  let cmd, args;
+  if (process.platform === "darwin") {
+    cmd = "open";
+    args = [url];
+  } else if (process.platform === "win32") {
+    cmd = "cmd";
+    args = ["/c", "start", "", url];
+  } else {
+    cmd = "xdg-open";
+    args = [url];
+  }
+  try {
+    spawn(cmd, args, { detached: true, stdio: "ignore" }).unref();
+  } catch {
+    // Best-effort — if we can't open the browser, the user can copy the
+    // URL from stderr. The loopback server keeps listening either way.
+  }
+}
+// ============================================================
+// Loopback server response page
+// ============================================================
+function respondHtml(res, ok, message) {
+  const color = ok ? "#16a34a" : "#dc2626";
+  const title = ok ? "Llama CLI — Signed in" : "Llama CLI — Sign-in failed";
+  res.statusCode = ok ? 200 : 400;
+  res.setHeader("Content-Type", "text/html; charset=utf-8");
+  res.end(`<!doctype html>
+<html><head><meta charset="utf-8"><title>${title}</title>
+<style>
+  body { font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", sans-serif;
+         background: #fafaf9; color: #292524; display: grid; place-items: center;
+         min-height: 100vh; margin: 0; }
+  .card { background: white; border: 1px solid #e7e5e4; border-radius: 8px;
+          padding: 32px 40px; max-width: 400px; text-align: center; }
+  h1 { margin: 0 0 12px; font-size: 18px; color: ${color}; }
+  p  { margin: 0; color: #57534e; font-size: 14px; }
+</style></head><body>
+<div class="card"><h1>${title}</h1><p>${message}</p></div>
+</body></html>`);
+}
+// ============================================================
+// PKCE + loopback driver
+// ============================================================
+/**
+ * Run the full PKCE + loopback OAuth flow.
+ *
+ * @param {Object} opts
+ * @param {string} opts.baseUrl   AS issuer (e.g. https://command.llamaventures.vc)
+ * @param {string} opts.scope     Space-separated scope request (e.g. "read write")
+ * @param {string} opts.resource  RFC 8707 audience the access token will bind to
+ * @returns {Promise<Object>}     {access_token, refresh_token, expires_in, scope, token_type, redirect_uri}
+ */
+export async function pkceLoopbackFlow({ baseUrl, scope, resource }) {
+  const verifier = generateVerifier();
+  const challenge = challengeFor(verifier);
+  const state = base64url(randomBytes(16));
+  // Bind the loopback server FIRST so we know the port for redirect_uri.
+  const server = http.createServer();
+  await new Promise((resolve, reject) => {
+    server.once("error", reject);
+    server.listen(0, "127.0.0.1", resolve);
+  });
+  const { port } = server.address();
+  const redirectUri = `http://127.0.0.1:${port}${REDIRECT_PATH}`;
+  // Set up the request handler now that we have the port.
+  const codePromise = new Promise((resolve, reject) => {
+    const timeoutId = setTimeout(() => {
+      try { server.close(); } catch { /* */ }
+      reject(new Error(
+        "Error[OAUTH_TIMEOUT]: Browser flow did not complete within " +
+        Math.round(FLOW_TIMEOUT_MS / 1000) + "s. Re-run `llama auth login`."
+      ));
+    }, FLOW_TIMEOUT_MS);
+    server.on("request", (req, res) => {
+      const url = new URL(req.url, "http://127.0.0.1");
+      if (url.pathname !== REDIRECT_PATH) {
+        res.statusCode = 404;
+        res.end();
+        return;
+      }
+      const code = url.searchParams.get("code");
+      const respState = url.searchParams.get("state");
+      const error = url.searchParams.get("error");
+      const errorDescription = url.searchParams.get("error_description") ?? "";
+      if (error) {
+        respondHtml(res, false, `${error}: ${errorDescription}`);
+        clearTimeout(timeoutId);
+        server.close();
+        reject(new Error(`Error[OAUTH_DENIED]: ${error} — ${errorDescription}`));
+        return;
+      }
+      if (respState !== state) {
+        respondHtml(res, false, "state parameter mismatch (CSRF defense)");
+        clearTimeout(timeoutId);
+        server.close();
+        reject(new Error("Error[OAUTH_BAD_STATE]: state mismatch — possible CSRF or stale callback"));
+        return;
+      }
+      if (!code) {
+        respondHtml(res, false, "missing code parameter");
+        clearTimeout(timeoutId);
+        server.close();
+        reject(new Error("Error[OAUTH_BAD_CALLBACK]: callback missing code parameter"));
+        return;
+      }
+      respondHtml(res, true, "You can close this window and return to the terminal.");
+      clearTimeout(timeoutId);
+      server.close();
+      resolve(code);
+    });
+  });
+  // Build authorize URL and open browser.
+  const authorizeUrl = new URL(`${baseUrl}/api/oauth/authorize`);
+  authorizeUrl.searchParams.set("response_type", "code");
+  authorizeUrl.searchParams.set("client_id", CLIENT_ID);
+  authorizeUrl.searchParams.set("redirect_uri", redirectUri);
+  authorizeUrl.searchParams.set("scope", scope);
+  authorizeUrl.searchParams.set("state", state);
+  authorizeUrl.searchParams.set("code_challenge", challenge);
+  authorizeUrl.searchParams.set("code_challenge_method", "S256");
+  authorizeUrl.searchParams.set("resource", resource);
+  console.error(`Opening browser to ${baseUrl} for sign-in...`);
+  console.error(`(If the browser does not open, visit this URL manually:\n  ${authorizeUrl.toString()}\n)`);
+  openBrowser(authorizeUrl.toString());
+  const code = await codePromise;
+  // Exchange code → tokens.
+  const tokenBody = new URLSearchParams({
+    grant_type: "authorization_code",
+    code,
+    redirect_uri: redirectUri,
+    client_id: CLIENT_ID,
+    code_verifier: verifier,
+    resource,
+  }).toString();
+  const tokenRes = await fetch(`${baseUrl}/api/oauth/token`, {
+    method: "POST",
+    headers: { "Content-Type": "application/x-www-form-urlencoded" },
+    body: tokenBody,
+  });
+  const tokenJson = await tokenRes.json().catch(() => ({}));
+  if (!tokenRes.ok) {
+    throw new Error(
+      `Error[OAUTH_TOKEN_EXCHANGE_FAILED]: ${tokenJson.error ?? tokenRes.status} — ${tokenJson.error_description ?? "no description"}`
+    );
+  }
+  return {
+    access_token: tokenJson.access_token,
+    refresh_token: tokenJson.refresh_token,
+    expires_in: tokenJson.expires_in ?? 3600,
+    scope: tokenJson.scope ?? scope,
+    token_type: tokenJson.token_type ?? "Bearer",
+    client_id: CLIENT_ID,
+    resource,
+    issuer: baseUrl,
+  };
+}
+// ============================================================
+// Token revoke (used by `llama auth logout`)
+// ============================================================
+export async function revokeToken({ baseUrl, token, tokenTypeHint }) {
+  const body = new URLSearchParams({
+    token,
+    client_id: CLIENT_ID,
+    ...(tokenTypeHint ? { token_type_hint: tokenTypeHint } : {}),
+  }).toString();
+  const res = await fetch(`${baseUrl}/api/oauth/revoke`, {
+    method: "POST",
+    headers: { "Content-Type": "application/x-www-form-urlencoded" },
+    body,
+  });
+  // RFC 7009 §2.2: 200 on success OR unknown token. Anything else is unexpected.
+  return res.ok;
+}
+export const LLAMA_CLI_CLIENT_ID = CLIENT_ID;
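
The order of checks in the loopback request handler (error, then state, then code) can be pulled out into a pure function for testing. The sketch below assumes that extraction: `classifyCallback` is a hypothetical name, and the returned `error` strings reuse the error codes that appear in the added file.

```javascript
// Hypothetical pure version of the loopback callback checks.
// `requestUrl` is the path + query the browser hit, e.g. "/callback?code=...&state=...".
function classifyCallback(requestUrl, expectedState) {
  const url = new URL(requestUrl, "http://127.0.0.1");
  if (url.pathname !== "/callback") return { ok: false, error: "NOT_FOUND" };

  // 1. An explicit error from the authorization server wins.
  if (url.searchParams.get("error")) return { ok: false, error: "OAUTH_DENIED" };

  // 2. The state must match what this process generated (CSRF defense).
  if (url.searchParams.get("state") !== expectedState) {
    return { ok: false, error: "OAUTH_BAD_STATE" };
  }

  // 3. Only then is the authorization code accepted.
  const authCode = url.searchParams.get("code");
  if (!authCode) return { ok: false, error: "OAUTH_BAD_CALLBACK" };
  return { ok: true, authCode };
}

console.log(classifyCallback("/callback?code=abc&state=s1", "s1"));
```

Checking `state` before reading `code` means a stale or forged callback is rejected without the code ever being exchanged.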

package/lib/oauth-refresh.mjs ADDED

@@ -0,0 +1,87 @@
+// OAuth refresh-token rotation for the Llama CLI.
+//
+// Called from lib/client.mjs::request when an OAuth-bearing call returns
+// 401. We exchange the stored refresh_token for a new (access, refresh)
+// pair via POST /api/oauth/token, persist the new bundle, and surface
+// the new access_token so the caller can retry once.
+//
+// Cross-process locking via oauth-storage.withRefreshLock so two shells
+// hitting 401 simultaneously don't burn each other's refresh token.
+// After acquiring the lock we re-read the bundle in case the other
+// shell has already refreshed.
+import { LLAMA_CLI_CLIENT_ID } from "./oauth-flow.mjs";
+import { readBundle, withRefreshLock, writeBundle } from "./oauth-storage.mjs";
+const ACCESS_TOKEN_SKEW_MS = 30_000; // refresh proactively 30s before expiry
+/**
+ * Returns the current access token if non-expired, else attempts
+ * refresh. Returns null if no bundle is stored, refresh fails, or the
+ * refresh token itself is expired/revoked (caller should fall through
+ * to the next auth method or surface NO_AUTH).
+ */
+export async function getValidAccessToken() {
+  const bundle = await readBundle();
+  if (!bundle?.access_token) return null;
+  if (bundle.expires_at - Date.now() > ACCESS_TOKEN_SKEW_MS) {
+    return bundle.access_token;
+  }
+  // Near or past expiry — refresh under lock.
+  return refreshUnderLock();
+}
+/**
+ * Force a refresh regardless of expiry. Used by client.mjs on a 401
+ * with an OAuth bundle present (the access token may have been revoked
+ * server-side, in which case the refresh might still work).
+ */
+export async function forceRefresh() {
+  return refreshUnderLock();
+}
+async function refreshUnderLock() {
+  return withRefreshLock(async () => {
+    // Re-read inside the lock — another shell may have refreshed already.
+    const fresh = await readBundle();
+    if (!fresh?.refresh_token) return null;
+    if (fresh.expires_at - Date.now() > ACCESS_TOKEN_SKEW_MS) {
+      // Another shell already refreshed; we're good.
+      return fresh.access_token;
+    }
+    return performRefresh(fresh);
+  });
+}
+async function performRefresh(bundle) {
+  const body = new URLSearchParams({
+    grant_type: "refresh_token",
+    refresh_token: bundle.refresh_token,
+    client_id: bundle.client_id ?? LLAMA_CLI_CLIENT_ID,
+    resource: bundle.resource,
+  }).toString();
+  const res = await fetch(`${bundle.issuer}/api/oauth/token`, {
+    method: "POST",
+    headers: { "Content-Type": "application/x-www-form-urlencoded" },
+    body,
+  });
+  if (!res.ok) {
+    // Refresh failed — most likely refresh expired or grant was revoked.
+    // Don't delete the bundle automatically; the user might want to
+    // inspect it or `llama auth logout` themselves to clear it.
+    return null;
+  }
+  const json = await res.json().catch(() => null);
+  if (!json?.access_token || !json?.refresh_token) return null;
+  const newBundle = {
+    ...bundle,
+    access_token: json.access_token,
+    refresh_token: json.refresh_token,
+    expires_at: Date.now() + (json.expires_in ?? 3600) * 1000,
+    scope: json.scope ?? bundle.scope,
+  };
+  await writeBundle(newBundle);
+  return newBundle.access_token;
+}
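
As the diff reads, `forceRefresh()` goes through the same `refreshUnderLock()` path as the proactive refresh, and that path returns the stored access token whenever it is more than 30s from expiry. A 401 on a token that was revoked but has not yet expired would therefore appear to get the same token back. The sketch below shows one possible way to separate the two cases. It is not taken from the package: `canReuseStoredToken` and its `rejectedToken` parameter are hypothetical names, and only the skew constant mirrors the source.

```javascript
const ACCESS_TOKEN_SKEW_MS = 30_000; // same 30s skew as in the diff

// Hypothetical helper (not in the package): decide, after re-reading the
// bundle inside the lock, whether the stored access token can be reused.
// `rejectedToken` is the token the server just answered 401 to, or null
// when the refresh is proactive (expiry-driven).
function canReuseStoredToken(bundle, nowMs, rejectedToken = null) {
  if (!bundle?.access_token) return false;
  const notNearExpiry = bundle.expires_at - nowMs > ACCESS_TOKEN_SKEW_MS;
  if (!notNearExpiry) return false;
  // On the 401 path, reuse only if the stored token differs from the one
  // that was rejected, i.e. another process has already rotated it.
  return bundle.access_token !== rejectedToken;
}

const bundle = { access_token: "tok-A", expires_at: 1_000_000 };
const proactive = canReuseStoredToken(bundle, 0);             // token still valid
const afterReject = canReuseStoredToken(bundle, 0, "tok-A");  // same token was rejected
const afterRotate = canReuseStoredToken(bundle, 0, "tok-B");  // another shell rotated
const nearExpiry = canReuseStoredToken({ ...bundle, expires_at: 10_000 }, 0);
```

With a check of this shape, the 401 path would fall through to the token exchange unless another shell had already stored a different access token.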

package/lib/oauth-storage.mjs ADDED

@@ -0,0 +1,191 @@
+// OAuth credential storage for the Llama CLI.
+//
+// Persists the access_token / refresh_token / expires_at bundle returned
+// by the Llama Command authorization server. Two backends, in order:
+//
+//   1. OS Keychain via @napi-rs/keyring — macOS Keychain, Windows
+//      Credential Manager, Linux Secret Service (libsecret). Industry
+//      standard for desktop CLIs (gh, gcloud, Azure SDK).
+//
+//   2. Plain file `~/.llama/oauth.json` mode 0600 — used when the
+//      Keychain backend isn't available (Linux container with no
+//      libsecret, headless CI runner). Same posture as the existing
+//      `~/.llama/token` for PATs, and the same posture gh/gcloud/aws
+//      ship with on Linux servers.
+//
+// Cross-process lock: the refresh-token rotation contract requires that
+// two shells refreshing simultaneously don't burn each other's refresh
+// token. We coordinate via atomic O_CREAT|O_EXCL on `~/.llama/oauth.lock`
+// with a short retry window, and after acquiring re-read the credentials
+// in case the other shell already refreshed.
+import fs from "fs";
+import os from "os";
+import path from "path";
+const SERVICE = "com.llamaventures.cli";
+const ACCOUNT = "oauth";
+const STORE_DIR = path.join(os.homedir(), ".llama");
+const FILE_PATH = path.join(STORE_DIR, "oauth.json");
+const LOCK_PATH = path.join(STORE_DIR, "oauth.lock");
+// ============================================================
+// Keychain backend (lazy-loaded — keep startup fast)
+// ============================================================
+let _keychainEntry = null;
+let _keychainTried = false;
+async function getKeychainEntry() {
+  if (_keychainTried) return _keychainEntry;
+  _keychainTried = true;
+  try {
+    const { Entry } = await import("@napi-rs/keyring");
+    _keychainEntry = new Entry(SERVICE, ACCOUNT);
+    // Probe — if the platform backend is missing (e.g. Linux without
+    // libsecret), the Entry methods throw on first use. Surface that
+    // here so callers route to the file backend.
+    try {
+      _keychainEntry.getPassword();
+    } catch (err) {
+      const msg = String(err?.message ?? err);
+      // "no entry" / "not found" is fine — backend works, just empty.
+      // Any other error means the backend itself is unavailable.
+      if (!/no entry|not found|no such/i.test(msg)) {
+        _keychainEntry = null;
+      }
+    }
+  } catch {
+    _keychainEntry = null;
+  }
+  return _keychainEntry;
+}
+// ============================================================
+// Bundle shape
+// ============================================================
+/**
+ * @typedef {Object} OAuthBundle
+ * @property {string} access_token
+ * @property {string} refresh_token
+ * @property {number} expires_at         absolute ms epoch when access_token expires
+ * @property {string} scope              space-separated, OAuth wire format
+ * @property {string} client_id          which AS client minted this bundle
+ * @property {string} issuer             AS issuer URL — bundle is bound to it
+ * @property {string} resource           RFC 8707 audience the access_token is for
+ * @property {number} created_at         ms epoch when bundle was first stored
+ */
+// ============================================================
+// Read / write / delete
+// ============================================================
+export async function readBundle() {
+  const entry = await getKeychainEntry();
+  if (entry) {
+    try {
+      const raw = entry.getPassword();
+      if (raw) return JSON.parse(raw);
+    } catch {
+      // fall through to file
+    }
+  }
+  try {
+    const raw = fs.readFileSync(FILE_PATH, "utf8");
+    return JSON.parse(raw);
+  } catch {
+    return null;
+  }
+}
+export async function writeBundle(bundle) {
+  const json = JSON.stringify(bundle);
+  const entry = await getKeychainEntry();
+  if (entry) {
+    try {
+      entry.setPassword(json);
+      // Best-effort cleanup: if a stale plaintext file exists from a
+      // pre-Keychain install, remove it so we don't have two copies of
+      // the credential drifting.
+      try { fs.unlinkSync(FILE_PATH); } catch { /* not present */ }
+      return { backend: "keychain" };
+    } catch {
+      // fall through to file
+    }
+  }
+  fs.mkdirSync(STORE_DIR, { recursive: true, mode: 0o700 });
+  fs.writeFileSync(FILE_PATH, `${json}\n`, { mode: 0o600 });
+  fs.chmodSync(FILE_PATH, 0o600);
+  return { backend: "file" };
+}
+export async function deleteBundle() {
+  const entry = await getKeychainEntry();
+  if (entry) {
+    try { entry.deletePassword(); } catch { /* may not be present */ }
+  }
+  try { fs.unlinkSync(FILE_PATH); } catch { /* may not be present */ }
+}
+export async function detectBackend() {
+  const entry = await getKeychainEntry();
+  return entry ? "keychain" : "file";
+}
+// ============================================================
+// Cross-process lock
+// ============================================================
+//
+// Refresh rotation requires that only ONE process at a time exchange
+// the current refresh token. Without a lock, two CLI invocations racing
+// on token expiry would both POST /oauth/token; the first wins, the
+// second gets `invalid_grant` (because the first already rotated), and
+// the user sees a confusing failure.
+//
+// Pattern: atomic O_CREAT | O_EXCL on a sentinel file. If we get the
+// fd, we own the lock; on EEXIST, another process owns it — wait briefly
+// and retry. After acquiring, ALWAYS re-read the bundle from storage in
+// case the other process has refreshed in the meantime (then we don't
+// need to refresh ourselves).
+const LOCK_RETRY_MS = 100;
+const LOCK_TIMEOUT_MS = 5_000;
+export async function withRefreshLock(fn) {
+  fs.mkdirSync(STORE_DIR, { recursive: true, mode: 0o700 });
+  const start = Date.now();
+  let fd;
+  while (true) {
+    try {
+      fd = fs.openSync(LOCK_PATH, "wx", 0o600);
+      break;
+    } catch (err) {
+      if (err.code !== "EEXIST") throw err;
+      // Stale lock cleanup: if the lock file is older than the timeout,
+      // the holding process likely crashed. Remove and retry.
+      try {
+        const stat = fs.statSync(LOCK_PATH);
+        if (Date.now() - stat.mtimeMs > LOCK_TIMEOUT_MS) {
+          fs.unlinkSync(LOCK_PATH);
+          continue;
+        }
+      } catch { /* lock disappeared between EEXIST and stat — fine */ }
+      if (Date.now() - start > LOCK_TIMEOUT_MS) {
+        throw new Error(
+          "Error[OAUTH_LOCK_TIMEOUT]: Could not acquire OAuth refresh lock at " +
+          LOCK_PATH + ". Another `llama` process may be hung. Remove the " +
+          "lock file manually if you're sure no other CLI is running."
+        );
+      }
+      await new Promise((r) => setTimeout(r, LOCK_RETRY_MS));
+    }
+  }
+  try {
+    return await fn();
+  } finally {
+    try { fs.closeSync(fd); } catch { /* already closed */ }
+    try { fs.unlinkSync(LOCK_PATH); } catch { /* already gone */ }
+  }
+}
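
The sentinel-file pattern used by `withRefreshLock` can be tried in isolation. The sketch below is a minimal illustration and not the package's code: `tryAcquire` and `release` are hypothetical helpers, and it uses a throwaway temp directory instead of `~/.llama`. It relies only on the `"wx"` open flag, which makes `fs.openSync` fail with `EEXIST` when the path already exists.

```javascript
import fs from "node:fs";
import os from "node:os";
import path from "node:path";

// Hypothetical helper: returns a file descriptor if we created the lock
// file, or null if another holder already created it.
function tryAcquire(lockPath) {
  try {
    return fs.openSync(lockPath, "wx", 0o600); // exclusive create
  } catch (err) {
    if (err.code === "EEXIST") return null; // lock is held
    throw err; // anything else is a real error
  }
}

// Hypothetical helper: close the descriptor and remove the sentinel file.
function release(lockPath, fd) {
  fs.closeSync(fd);
  fs.unlinkSync(lockPath);
}

// Demo in a temp directory (assumption: any writable directory works).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "lock-demo-"));
const lockPath = path.join(dir, "demo.lock");

const first = tryAcquire(lockPath);  // file did not exist yet
const second = tryAcquire(lockPath); // file exists while the lock is held
release(lockPath, first);
const third = tryAcquire(lockPath);  // free again after release
release(lockPath, third);
fs.rmSync(dir, { recursive: true, force: true });
```

The version in the diff adds a retry loop with a 100 ms delay, a 5 s timeout, and removal of lock files whose mtime is older than that timeout, on top of this primitive.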

package/package.json CHANGED

@@ -1,7 +1,7 @@
 {
   "name": "@llamaventures/cli",
-  "version": "1.2.4",
-  "description": "Llama Ventures CLI + MCP server. Internal team tool for command.llamaventures.vc.",
+  "version": "1.3.1",
+  "description": "CLI + MCP server for the Llama Ventures investment workbench (command.llamaventures.vc).",
   "type": "module",
   "bin": {
     "llama": "bin/llama.mjs",
@@ -45,6 +45,7 @@
   },
   "dependencies": {
     "@modelcontextprotocol/sdk": "1.29.0",
+    "@napi-rs/keyring": "^1.3.0",
     "zod": "^4.4.3"
   }
 }