npm - @llamaventures/cli - Versions diffs - 1.2.3 → 1.3.0 - Mend

@llamaventures/cli 1.2.3 → 1.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9)

package/AGENT_BRIEFING.md CHANGED

@@ -76,8 +76,8 @@ Default to action. Ask only for genuine judgment.
 | Error | What to do |
 |---|---|
-| `Error[NO_AUTH]` | Tell user: mint a token at `command.llamaventures.vc/settings/tokens`, then `llama token set <llc_...>`. |
-| `Error[UNAUTHORIZED]` | Credentials rejected (revoked / expired / wrong account). Same recovery — re-mint. |
+| `Error[NO_AUTH]` | Tell user: run `llama auth login` (browser sign-in via Google, OAuth tokens stored in OS Keychain). For unattended/CI: mint a long-lived PAT at `command.llamaventures.vc/settings/tokens` and `llama token set <llc_...>`. |
+| `Error[UNAUTHORIZED]` | Credentials rejected (revoked / expired / wrong account). If using OAuth: `llama auth login` again. If using PAT: re-mint. |
 | HTTP 5xx | Wait 5s, retry once. Two failures → tell the user "Command unavailable, will retry later." |
 | `Too many failed authentication attempts` (HTTP 429) | IP rate-limit. Wait until next UTC hour, OR switch network (e.g. tether to phone). |
@@ -87,7 +87,9 @@ Default to action. Ask only for genuine judgment.
 ```bash
 # Auth
-llama auth status
+llama auth login              # browser PKCE flow → OAuth tokens in OS Keychain (recommended)
+llama auth logout             # revoke + clear local
+llama auth status             # show identity + active method
 # Pipeline — read
 llama deal search "<name>"

package/README.md CHANGED

@@ -103,16 +103,37 @@ The client tries credentials **in this order**, on every call:
 | # | Source | Header sent | Best for |
 |---|--------|-------------|----------|
-| 1 | `gcloud auth print-identity-token` | `Authorization: Bearer …` | Team members on a workstation (zero config) |
-| 2 | `$LLAMA_TOKEN` env var | `X-Llama-Token` | CI runners, sandboxed cloud agents |
-| 3 | `~/.llama/token` (mode `0600`) | `X-Llama-Token` | Persistent local install |
-| 4 | `~/.llama-command/config.json` | `X-Llama-Token` | Legacy CLI v0.1 — auto-migrates to `~/.llama/token` on first read |
+| 1 | `llama auth login` (OAuth 2.1, OS Keychain) | `Authorization: Bearer …` | **Recommended for everyone.** One-shot browser login; tokens auto-refresh and survive reboots. |
+| 2 | `gcloud auth print-identity-token` | `Authorization: Bearer …` | Workstations with gcloud already wired (zero config) |
+| 3 | `$LLAMA_TOKEN` env var | `X-Llama-Token` | CI runners, sandboxed cloud agents |
+| 4 | `~/.llama/token` (mode `0600`) | `X-Llama-Token` | Persistent local install (legacy PATs) |
+| 5 | `~/.llama-command/config.json` | `X-Llama-Token` | CLI v0.1 — auto-migrates to `~/.llama/token` |
-Both Bearer and X-Llama-Token are sent if both exist. The server tries Bearer
-first; on verification failure it falls through to X-Llama-Token. Inspect the
-resolved identity any time with `llama auth status`.
+If both Bearer and X-Llama-Token are present, both are sent — the server tries
+Bearer first and falls through to X-Llama-Token on verification failure.
+Inspect the resolved identity any time with `llama auth status`.
-### Zero-config — recommended for team members
+### Browser sign-in — recommended
+```bash
+llama auth login           # opens browser → Google sign-in → consent → done
+llama auth status          # → activeMethod=oauth, scope, identity
+llama deal search acme-ai  # ready
+```
+`llama auth login` runs an OAuth 2.1 PKCE + RFC 8252 loopback flow against
+`https://command.llamaventures.vc`, exchanges the code for an access + refresh
+token pair, and stores them in the OS Keychain (macOS Keychain / Windows
+Credential Manager / Linux Secret Service via [`@napi-rs/keyring`](https://www.npmjs.com/package/@napi-rs/keyring)).
+Linux containers without libsecret use a 0600-mode file at `~/.llama/oauth.json`
+— same posture `gcloud` / `gh` / `aws` ship with on Linux servers. Refresh
+tokens rotate transparently when the access token nears expiry; a cross-process
+file lock prevents two shells from burning each other's refresh during
+concurrent calls.
+`llama auth logout` revokes server-side via RFC 7009 and clears local storage.
+### gcloud — for machines already wired with `gcloud auth login`
 ```bash
 gcloud auth login          # one-time; pick your @llamaventures.vc account
@@ -120,7 +141,7 @@ llama auth status          # → role + email
 llama deal search acme-ai  # ready
 ```
-### Manual token — for machines without `gcloud`, or stable CI
+### Long-lived PAT — for CI / unattended environments
 1. Sign in to https://command.llamaventures.vc.
 2. Open `/settings/tokens` → **Mint Token**.
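
The precedence table in the README hunk above can be read as a small pure function. The following is a minimal sketch, not the package's code: `buildAuthHeaders` and its argument names are hypothetical, and the three inputs stand in for whatever the OAuth store, `gcloud`, and the PAT lookup returned.

```javascript
// Hypothetical helper: each field is the credential string found for that
// source, or null/undefined when the source is unavailable.
function buildAuthHeaders({ oauthAccess, gcloudIdentity, pat }) {
  const headers = {};
  // Only one Bearer value can be sent, so OAuth takes precedence over gcloud.
  const bearer = oauthAccess || gcloudIdentity;
  if (bearer) headers["Authorization"] = `Bearer ${bearer}`;
  // The PAT travels alongside any Bearer so the server can fall through to it.
  if (pat) headers["X-Llama-Token"] = pat;
  return headers;
}

console.log(buildAuthHeaders({ oauthAccess: "oa", gcloudIdentity: "gc", pat: "llc_x" }));
console.log(buildAuthHeaders({ gcloudIdentity: "gc" }));
console.log(buildAuthHeaders({}));
```

An empty result corresponds to the `Error[NO_AUTH]` case: no source produced a credential, so there is nothing to send.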

package/bin/llama.mjs CHANGED

@@ -28,6 +28,8 @@ import {
   startExternalSession,
   uploadExternalFile,
 } from "../lib/external.mjs";
+import { LLAMA_CLI_CLIENT_ID, pkceLoopbackFlow, revokeToken as revokeOAuthToken } from "../lib/oauth-flow.mjs";
+import { deleteBundle, detectBackend, readBundle, writeBundle } from "../lib/oauth-storage.mjs";
 function parseFlags(args) {
   const flags = {};
@@ -309,13 +311,19 @@ Upload a file (deck / pitch / one-pager):
 Interactive REPL (requires existing session):
   llama pitch
+Wrap up the pitch (asks the agent to call finalize_intake immediately):
+  llama pitch finalize       # use when you're done — agent stops asking
 Inspect / clean up:
   llama pitch status         # session id, idle minutes, finalized?
   llama pitch end            # clear local session state
 Caps (server-enforced):
-  5 sessions per IP per day, 3 per email per day, 30min idle timeout,
+  5 sessions per IP per day, 3 per email per day, 60min idle timeout,
   100 messages per session, 1M tokens per session.
+Environment:
+  LLAMA_API_URL              override base URL (dev: http://localhost:3000)
 `);
     return;
   }
@@ -403,14 +411,49 @@ Caps (server-enforced):
       cleared: !!had,
       session_file: EXTERNAL_SESSION_FILE,
       note: had
-        ? "Local session state cleared. Server-side session may still be active until idle timeout (30min)."
+        ? "Local session state cleared. Server-side session may still be active until idle timeout (60min)."
         : "No local session was active.",
     });
     return;
   }
+  if (action === "finalize") {
+    // Founder-initiated finalize: send a sentinel token in the chat
+    // stream that the system prompt recognizes as "wrap up now." The
+    // intake agent calls finalize_intake on this turn with whatever
+    // fields are recorded — no extra questions, no confirmation prompt.
+    // Local session is left as-is; on next read its `finalized=true`
+    // reflects the server's status.
+    const session = readExternalSession();
+    if (!session) {
+      throw new Error(
+        "No active pitch session. Run `llama pitch start --name \"...\" --email \"...\"` first."
+      );
+    }
+    if (session.finalized) {
+      throw new Error(
+        "This pitch session is already finalized. Run `llama pitch end` to clear local state."
+      );
+    }
+    process.stderr.write("Asking the agent to wrap up...\n");
+    const result = await sendExternalMessage("[FOUNDER_FINALIZE_REQUEST]");
+    process.stdout.write(result.text + "\n");
+    if (result.finalized) {
+      process.stderr.write("\n--- Pitch session finalized ---\n");
+      if (result.finalize_payload) {
+        process.stderr.write(JSON.stringify(result.finalize_payload, null, 2) + "\n");
+      }
+    } else {
+      process.stderr.write(
+        "\n⚠ Agent did not call finalize_intake on this turn. " +
+        "Try `llama pitch finalize` once more, or `llama pitch end` to abandon.\n"
+      );
+    }
+    return;
+  }
   // No action → REPL mode (requires existing session)
-  if (action === undefined || (rest.length === 0 && !["start", "say", "upload", "status", "end"].includes(action))) {
+  if (action === undefined || (rest.length === 0 && !["start", "say", "upload", "status", "end", "finalize"].includes(action))) {
     // Treat any unknown bare action as "join existing session in REPL mode"
     const session = readExternalSession();
     if (!session) {
@@ -656,8 +699,11 @@ https://command.llamaventures.vc/settings/tokens, run
           ? "~/.llama-command/config.json (legacy)"
           : null;
+    const oauthBundle = await readBundle();
+    const oauthBackend = oauthBundle ? await detectBackend() : null;
     let serverCheck = "skipped (no credentials)";
-    if (bearer || token) {
+    if (oauthBundle?.access_token || bearer || token) {
       try {
         const me = await request("GET", "/api/me");
         serverCheck = `ok — authenticated as ${me?.email ?? "unknown"} (role: ${me?.role ?? "unknown"})`;
@@ -666,12 +712,101 @@ https://command.llamaventures.vc/settings/tokens, run
       }
     }
-    print({
+    const out = {
       baseUrl: getBaseUrl(),
+      activeMethod: oauthBundle?.access_token
+        ? "oauth"
+        : bearer
+          ? "gcloud-bearer"
+          : token
+            ? "llama-token"
+            : "none",
+      oauth: oauthBundle
+        ? {
+            storage: oauthBackend,
+            client_id: oauthBundle.client_id,
+            scope: oauthBundle.scope,
+            issuer: oauthBundle.issuer,
+            expires_in_seconds: Math.max(0, Math.round((oauthBundle.expires_at - Date.now()) / 1000)),
+          }
+        : "absent (run `llama auth login`)",
       gcloudIdentityToken: bearer ? "present" : "absent",
       llamaToken: token ? `${token.slice(0, 8)}...${token.slice(-4)}` : "absent",
       llamaTokenSource: tokenSrc,
       serverCheck,
+    };
+    print(out);
+    return;
+  }
+  // ============================================================
+  // auth login — PKCE + loopback browser flow
+  // ============================================================
+  if (area === "auth" && action === "login") {
+    const { flags } = parseFlags(rest);
+    const requestedScope = typeof flags.scope === "string" && flags.scope.trim()
+      ? flags.scope.trim()
+      : "read write";
+    const baseUrl = getBaseUrl();
+    const resource = baseUrl; // general API audience (oauthApiResource on the server)
+    console.error(`Signing in to ${baseUrl} as Llama CLI (client_id=${LLAMA_CLI_CLIENT_ID})...`);
+    const bundle = await pkceLoopbackFlow({ baseUrl, scope: requestedScope, resource });
+    const stored = await writeBundle({
+      access_token: bundle.access_token,
+      refresh_token: bundle.refresh_token,
+      expires_at: Date.now() + (bundle.expires_in ?? 3600) * 1000,
+      scope: bundle.scope,
+      client_id: bundle.client_id,
+      issuer: bundle.issuer,
+      resource: bundle.resource,
+      created_at: Date.now(),
+    });
+    // Verify by hitting /api/me with the new token.
+    let identity = "(unable to verify — /api/me did not respond)";
+    try {
+      const me = await request("GET", "/api/me");
+      identity = `${me?.email ?? "unknown"} (role: ${me?.role ?? "unknown"})`;
+    } catch (e) {
+      identity = `verification failed: ${e.message.split("\n")[0]}`;
+    }
+    print({
+      ok: true,
+      message: "Signed in",
+      identity,
+      storage: stored.backend,
+      scope: bundle.scope,
+      expires_in_seconds: bundle.expires_in,
+    });
+    return;
+  }
+  // ============================================================
+  // auth logout — revoke + clear local
+  // ============================================================
+  if (area === "auth" && action === "logout") {
+    const bundle = await readBundle();
+    if (!bundle) {
+      print({ ok: true, message: "No OAuth credentials to clear" });
+      return;
+    }
+    let revoked = false;
+    try {
+      revoked = await revokeOAuthToken({
+        baseUrl: bundle.issuer ?? getBaseUrl(),
+        token: bundle.refresh_token,
+        tokenTypeHint: "refresh_token",
+      });
+    } catch {
+      revoked = false;
+    }
+    await deleteBundle();
+    print({
+      ok: true,
+      message: "Signed out — local credentials cleared",
+      serverRevoke: revoked ? "succeeded" : "failed (server unreachable or token already invalid; local state cleared anyway)",
     });
     return;
   }
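
The `auth status` hunk above derives two values from the stored credentials: which method is active, and how many seconds the OAuth access token has left. A minimal sketch of that derivation as a pure function follows; `summarizeAuth` is a hypothetical name, and `now` is passed in only so the result is deterministic.

```javascript
// Hypothetical pure version of the status summary from the hunk above.
function summarizeAuth({ oauthBundle, bearer, token }, now = Date.now()) {
  // Same precedence as the diff: OAuth, then gcloud Bearer, then PAT.
  const activeMethod = oauthBundle?.access_token
    ? "oauth"
    : bearer
      ? "gcloud-bearer"
      : token
        ? "llama-token"
        : "none";
  // Remaining lifetime is clamped at zero so an expired token never
  // reports a negative number of seconds.
  const expiresInSeconds = oauthBundle
    ? Math.max(0, Math.round((oauthBundle.expires_at - now) / 1000))
    : null;
  return { activeMethod, expiresInSeconds };
}

const now = 1_000_000;
console.log(summarizeAuth({ oauthBundle: { access_token: "a", expires_at: now + 90_000 } }, now));
console.log(summarizeAuth({ oauthBundle: { access_token: "a", expires_at: now - 5_000 } }, now));
console.log(summarizeAuth({ token: "llc_x" }, now));
```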

package/lib/client.mjs CHANGED

@@ -139,18 +139,54 @@ export async function tryGcloudIdentityToken() {
   }
 }
-// Build the auth header set. If both Bearer and X-Llama-Token are available,
-// send both — the server tries Bearer first and falls through to
-// X-Llama-Token on verification failure.
+// Build the auth header set. Priority order (server tries them in this
+// order too and falls through on failure):
+//
+//   1. OAuth access token from Keychain (`llama auth login`) — Bearer
+//      header. Auto-refreshes if near expiry. Highest priority because
+//      it's scope-aware + revocable.
+//   2. gcloud identity token — Bearer header. Falls back if no OAuth.
+//   3. X-Llama-Token PAT — sent alongside whatever Bearer was set, so
+//      server's authenticate() can fall through on Bearer-verify failure.
 export async function getAuthHeaders() {
   const headers = {};
-  const bearer = await tryGcloudIdentityToken();
-  if (bearer) headers["Authorization"] = `Bearer ${bearer}`;
+  // Lazy import — keeps zero-OAuth call paths fast and avoids loading
+  // @napi-rs/keyring's native binding when the user isn't using OAuth.
+  let oauthAccess = null;
+  try {
+    const { getValidAccessToken } = await import("./oauth-refresh.mjs");
+    oauthAccess = await getValidAccessToken();
+  } catch {
+    // OAuth modules failed to load (e.g. keyring native binding missing
+    // on this platform) — fall through to gcloud / PAT silently.
+  }
+  if (oauthAccess) {
+    headers["Authorization"] = `Bearer ${oauthAccess}`;
+  } else {
+    const bearer = await tryGcloudIdentityToken();
+    if (bearer) headers["Authorization"] = `Bearer ${bearer}`;
+  }
   const token = getToken();
   if (token) headers["X-Llama-Token"] = token;
   return headers;
 }
+/**
+ * Was the Bearer header on this request set from an OAuth access token?
+ * `request()` uses this to decide whether a 401 should trigger a
+ * refresh-and-retry-once path (only meaningful when we sent an OAuth
+ * token; gcloud / PAT 401s should NOT retry blindly).
+ */
+async function bearerCameFromOAuth() {
+  try {
+    const { readBundle } = await import("./oauth-storage.mjs");
+    const bundle = await readBundle();
+    return Boolean(bundle?.access_token);
+  } catch {
+    return false;
+  }
+}
 // Structured no-credential error. Format is stable so agents can pattern-match
 // `Error[NO_AUTH]` and trigger a recovery flow.
 function noAuthError() {
@@ -181,6 +217,10 @@ function unauthorizedError() {
 }
 export async function request(method, endpoint, body) {
+  return requestWithRetry(method, endpoint, body, /* allowRetry */ true);
+}
+async function requestWithRetry(method, endpoint, body, allowRetry) {
   const authHeaders = await getAuthHeaders();
   if (Object.keys(authHeaders).length === 0) throw noAuthError();
   const res = await fetch(`${getBaseUrl()}${endpoint}`, {
@@ -191,7 +231,29 @@ export async function request(method, endpoint, body) {
     },
     body: body === undefined ? undefined : JSON.stringify(body),
   });
+  // 401 + we sent an OAuth Bearer + this is the first attempt → try a
+  // forced refresh once. Covers two cases: (a) clock skew between client
+  // and server pushed us past expiry mid-request, (b) server-side
+  // revocation occurred between the client cache and now. Either way,
+  // the refresh either succeeds (we retry once with the new access
+  // token) or fails (refresh token also dead — bubble UNAUTHORIZED).
+  if (res.status === 401 && allowRetry && (await bearerCameFromOAuth())) {
+    let refreshed = null;
+    try {
+      const { forceRefresh } = await import("./oauth-refresh.mjs");
+      refreshed = await forceRefresh();
+    } catch {
+      refreshed = null;
+    }
+    if (refreshed) {
+      return requestWithRetry(method, endpoint, body, /* allowRetry */ false);
+    }
+    throw unauthorizedError();
+  }
   if (res.status === 401) throw unauthorizedError();
   const text = await res.text();
   let data;
   try {
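
The refresh-and-retry-once behaviour added to `request()` above can be isolated from the HTTP details. The sketch below is illustrative only: `send`, `refresh`, and `sentOAuthBearer` are hypothetical stand-ins for the real fetch call, `forceRefresh()`, and `bearerCameFromOAuth()`.

```javascript
// Hypothetical stand-ins: `send` performs one HTTP attempt and resolves to
// an object with a `status`; `refresh` resolves to a new access token or null.
async function requestWithRetry({ send, refresh, sentOAuthBearer }, allowRetry = true) {
  const res = await send();
  if (res.status === 401 && allowRetry && sentOAuthBearer) {
    // A failed refresh is treated the same as "no new token".
    const refreshed = await refresh().catch(() => null);
    if (refreshed) {
      // Second attempt runs with allowRetry=false, so it cannot loop.
      return requestWithRetry({ send, refresh, sentOAuthBearer }, false);
    }
    throw new Error("Error[UNAUTHORIZED]");
  }
  if (res.status === 401) throw new Error("Error[UNAUTHORIZED]");
  return res;
}

// First attempt returns 401, the retry returns 200.
let calls = 0;
const flaky = async () => ({ status: ++calls === 1 ? 401 : 200 });
requestWithRetry({ send: flaky, refresh: async () => "new-token", sentOAuthBearer: true })
  .then((res) => console.log(res.status, calls)); // prints "200 2"
```

Passing `allowRetry` as a plain boolean bounds the recursion at one extra attempt, which is why a server that keeps answering 401 surfaces `UNAUTHORIZED` after exactly two requests.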

package/lib/external.mjs CHANGED

@@ -100,6 +100,10 @@ export async function startExternalSession({ name, email }) {
       pow_nonce: powNonce,
       user_agent: "@llamaventures/cli",
     }),
+    // Cap at 60s — start-session is PoW + DB insert, never legitimate
+    // beyond a few seconds. Without this, a network hang freezes the CLI
+    // indefinitely.
+    signal: AbortSignal.timeout(60_000),
   });
   if (!res.ok) {
@@ -218,6 +222,11 @@ export async function sendExternalMessage(message, { attachments, onChunk } = {}
       message,
       ...(attachments ? { attachments } : {}),
     }),
+    // 180s ceiling — covers a legitimate slow agent turn (multi-tool
+    // call + deck read + Sonnet ~2k token reply ≈ 90-120s in practice)
+    // while still detecting a dead connection. Without this, a hung
+    // SSE stream freezes the CLI indefinitely.
+    signal: AbortSignal.timeout(180_000),
   });
   if (!res.ok) {
@@ -343,6 +352,10 @@ export async function uploadExternalFile(filePath) {
     method: "POST",
     headers: { Cookie: `external_session=${session.session_id}` },
     body: formData,
+    // 180s ceiling — covers a 50MB upload over a slow tether (~280KB/s).
+    // Faster networks return in seconds; this only kicks in on a dead
+    // connection so the CLI doesn't hang forever.
+    signal: AbortSignal.timeout(180_000),
   });
   if (!res.ok) {
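
All three hunks above add the same guard, an `AbortSignal.timeout(...)` passed as `signal` to `fetch`. A minimal sketch of how a caller might turn the resulting abort into a readable error follows; `fetchWithCeiling` and its injectable `fetchImpl` parameter are hypothetical and exist only so the sketch can be exercised without a network.

```javascript
// Hypothetical wrapper around fetch with a hard time ceiling in milliseconds.
async function fetchWithCeiling(url, init, ms, fetchImpl = fetch) {
  try {
    return await fetchImpl(url, { ...init, signal: AbortSignal.timeout(ms) });
  } catch (err) {
    // AbortSignal.timeout() aborts with a DOMException named "TimeoutError",
    // which distinguishes a timeout from other network failures.
    if (err?.name === "TimeoutError") {
      throw new Error(`Request to ${url} exceeded ${ms} ms`);
    }
    throw err;
  }
}
```

With the values from the diff, session start would use a 60 000 ms ceiling and message or upload calls 180 000 ms. Other failures, such as DNS errors, pass through unchanged.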

package/lib/oauth-flow.mjs ADDED

@@ -0,0 +1,245 @@
+// OAuth 2.1 PKCE + loopback flow for the Llama CLI.
+//
+// Mirrors `gh auth login` / `gcloud auth login`: the CLI binds an
+// ephemeral HTTP server on 127.0.0.1, opens the browser to the
+// authorization endpoint with a PKCE challenge + state, and waits for
+// the user to approve. The browser redirects to the loopback URL
+// carrying the auth code; the local server captures it and shuts down.
+// The CLI then exchanges the code (with the PKCE verifier) for tokens.
+//
+// Pure stdlib: node:crypto for PKCE, node:http for the loopback server,
+// child_process for the platform-specific browser open. No third-party
+// HTTP/OAuth client.
+//
+// RFC compliance: OAuth 2.1 + RFC 7636 PKCE S256 + RFC 8252 native-app
+// loopback flow + RFC 8707 audience parameter.
+import { createHash, randomBytes } from "crypto";
+import http from "http";
+import { spawn } from "child_process";
+const CLIENT_ID = "llama-cli-official";
+const REDIRECT_PATH = "/callback";
+const FLOW_TIMEOUT_MS = 5 * 60 * 1000; // 5 min — generous for slow Google sign-in
+// ============================================================
+// PKCE primitives
+// ============================================================
+function base64url(buf) {
+  return buf
+    .toString("base64")
+    .replace(/=/g, "")
+    .replace(/\+/g, "-")
+    .replace(/\//g, "_");
+}
+export function generateVerifier() {
+  // RFC 7636 §4.1: 43-128 chars from unreserved alphabet. 32 random bytes
+  // → 43 base64url chars (256 bits entropy).
+  return base64url(randomBytes(32));
+}
+export function challengeFor(verifier) {
+  return base64url(createHash("sha256").update(verifier).digest());
+}
+// ============================================================
+// Browser launcher
+// ============================================================
+function openBrowser(url) {
+  // Platform-native open. We never block on it (the user closes the
+  // browser when they're done; the loopback server is what we wait for).
+  let cmd, args;
+  if (process.platform === "darwin") {
+    cmd = "open";
+    args = [url];
+  } else if (process.platform === "win32") {
+    cmd = "cmd";
+    args = ["/c", "start", "", url];
+  } else {
+    cmd = "xdg-open";
+    args = [url];
+  }
+  try {
+    spawn(cmd, args, { detached: true, stdio: "ignore" }).unref();
+  } catch {
+    // Best-effort — if we can't open the browser, the user can copy the
+    // URL from stderr. The loopback server keeps listening either way.
+  }
+}
+// ============================================================
+// Loopback server response page
+// ============================================================
+function respondHtml(res, ok, message) {
+  const color = ok ? "#16a34a" : "#dc2626";
+  const title = ok ? "Llama CLI — Signed in" : "Llama CLI — Sign-in failed";
+  res.statusCode = ok ? 200 : 400;
+  res.setHeader("Content-Type", "text/html; charset=utf-8");
+  res.end(`<!doctype html>
+<html><head><meta charset="utf-8"><title>${title}</title>
+<style>
+  body { font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", sans-serif;
+         background: #fafaf9; color: #292524; display: grid; place-items: center;
+         min-height: 100vh; margin: 0; }
+  .card { background: white; border: 1px solid #e7e5e4; border-radius: 8px;
+          padding: 32px 40px; max-width: 400px; text-align: center; }
+  h1 { margin: 0 0 12px; font-size: 18px; color: ${color}; }
+  p  { margin: 0; color: #57534e; font-size: 14px; }
+</style></head><body>
+<div class="card"><h1>${title}</h1><p>${message}</p></div>
+</body></html>`);
+}
+// ============================================================
+// PKCE + loopback driver
+// ============================================================
+/**
+ * Run the full PKCE + loopback OAuth flow.
+ *
+ * @param {Object} opts
+ * @param {string} opts.baseUrl   AS issuer (e.g. https://command.llamaventures.vc)
+ * @param {string} opts.scope     Space-separated scope request (e.g. "read write")
+ * @param {string} opts.resource  RFC 8707 audience the access token will bind to
+ * @returns {Promise<Object>}     {access_token, refresh_token, expires_in, scope, token_type, redirect_uri}
+ */
+export async function pkceLoopbackFlow({ baseUrl, scope, resource }) {
+  const verifier = generateVerifier();
+  const challenge = challengeFor(verifier);
+  const state = base64url(randomBytes(16));
+  // Bind the loopback server FIRST so we know the port for redirect_uri.
+  const server = http.createServer();
+  await new Promise((resolve, reject) => {
+    server.once("error", reject);
+    server.listen(0, "127.0.0.1", resolve);
+  });
+  const { port } = server.address();
+  const redirectUri = `http://127.0.0.1:${port}${REDIRECT_PATH}`;
+  // Set up the request handler now that we have the port.
+  const codePromise = new Promise((resolve, reject) => {
+    const timeoutId = setTimeout(() => {
+      try { server.close(); } catch { /* */ }
+      reject(new Error(
+        "Error[OAUTH_TIMEOUT]: Browser flow did not complete within " +
+        Math.round(FLOW_TIMEOUT_MS / 1000) + "s. Re-run `llama auth login`."
+      ));
+    }, FLOW_TIMEOUT_MS);
+    server.on("request", (req, res) => {
+      const url = new URL(req.url, "http://127.0.0.1");
+      if (url.pathname !== REDIRECT_PATH) {
+        res.statusCode = 404;
+        res.end();
+        return;
+      }
+      const code = url.searchParams.get("code");
+      const respState = url.searchParams.get("state");
+      const error = url.searchParams.get("error");
+      const errorDescription = url.searchParams.get("error_description") ?? "";
+      if (error) {
+        respondHtml(res, false, `${error}: ${errorDescription}`);
+        clearTimeout(timeoutId);
+        server.close();
+        reject(new Error(`Error[OAUTH_DENIED]: ${error} — ${errorDescription}`));
+        return;
+      }
+      if (respState !== state) {
+        respondHtml(res, false, "state parameter mismatch (CSRF defense)");
+        clearTimeout(timeoutId);
+        server.close();
+        reject(new Error("Error[OAUTH_BAD_STATE]: state mismatch — possible CSRF or stale callback"));
+        return;
+      }
+      if (!code) {
+        respondHtml(res, false, "missing code parameter");
+        clearTimeout(timeoutId);
+        server.close();
+        reject(new Error("Error[OAUTH_BAD_CALLBACK]: callback missing code parameter"));
+        return;
+      }
+      respondHtml(res, true, "You can close this window and return to the terminal.");
+      clearTimeout(timeoutId);
+      server.close();
+      resolve(code);
+    });
+  });
+  // Build authorize URL and open browser.
+  const authorizeUrl = new URL(`${baseUrl}/api/oauth/authorize`);
+  authorizeUrl.searchParams.set("response_type", "code");
+  authorizeUrl.searchParams.set("client_id", CLIENT_ID);
+  authorizeUrl.searchParams.set("redirect_uri", redirectUri);
+  authorizeUrl.searchParams.set("scope", scope);
+  authorizeUrl.searchParams.set("state", state);
+  authorizeUrl.searchParams.set("code_challenge", challenge);
+  authorizeUrl.searchParams.set("code_challenge_method", "S256");
+  authorizeUrl.searchParams.set("resource", resource);
+  console.error(`Opening browser to ${baseUrl} for sign-in...`);
+  console.error(`(If the browser does not open, visit this URL manually:\n  ${authorizeUrl.toString()}\n)`);
+  openBrowser(authorizeUrl.toString());
+  const code = await codePromise;
+  // Exchange code → tokens.
+  const tokenBody = new URLSearchParams({
+    grant_type: "authorization_code",
+    code,
+    redirect_uri: redirectUri,
+    client_id: CLIENT_ID,
+    code_verifier: verifier,
+    resource,
+  }).toString();
+  const tokenRes = await fetch(`${baseUrl}/api/oauth/token`, {
+    method: "POST",
+    headers: { "Content-Type": "application/x-www-form-urlencoded" },
+    body: tokenBody,
+  });
+  const tokenJson = await tokenRes.json().catch(() => ({}));
+  if (!tokenRes.ok) {
+    throw new Error(
+      `Error[OAUTH_TOKEN_EXCHANGE_FAILED]: ${tokenJson.error ?? tokenRes.status} — ${tokenJson.error_description ?? "no description"}`
+    );
+  }
+  return {
+    access_token: tokenJson.access_token,
+    refresh_token: tokenJson.refresh_token,
+    expires_in: tokenJson.expires_in ?? 3600,
+    scope: tokenJson.scope ?? scope,
+    token_type: tokenJson.token_type ?? "Bearer",
+    client_id: CLIENT_ID,
+    resource,
+    issuer: baseUrl,
+  };
+}
+// ============================================================
+// Token revoke (used by `llama auth logout`)
+// ============================================================
+export async function revokeToken({ baseUrl, token, tokenTypeHint }) {
+  const body = new URLSearchParams({
+    token,
+    client_id: CLIENT_ID,
+    ...(tokenTypeHint ? { token_type_hint: tokenTypeHint } : {}),
+  }).toString();
+  const res = await fetch(`${baseUrl}/api/oauth/revoke`, {
+    method: "POST",
+    headers: { "Content-Type": "application/x-www-form-urlencoded" },
+    body,
+  });
+  // RFC 7009 §2.2: 200 on success OR unknown token. Anything else is unexpected.
+  return res.ok;
+}
+export const LLAMA_CLI_CLIENT_ID = CLIENT_ID;
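
Two pieces of the flow above are pure and easy to check in isolation: the unpadded base64url encoding used for the verifier, challenge, and state, and the order of checks applied to the loopback callback. The sketch below restates them under assumptions; `classifyCallback` and its result shape are hypothetical, not part of the package.

```javascript
// base64url without padding, the encoding PKCE (RFC 7636) uses.
function base64url(buf) {
  return buf.toString("base64").replace(/=/g, "").replace(/\+/g, "-").replace(/\//g, "_");
}

// Hypothetical pure version of the callback checks, in the same order as
// the request handler in the diff: path, error, state, code.
function classifyCallback(requestUrl, expectedState) {
  const url = new URL(requestUrl, "http://127.0.0.1");
  if (url.pathname !== "/callback") return { kind: "not_found" };
  const error = url.searchParams.get("error");
  if (error) return { kind: "denied", error };
  if (url.searchParams.get("state") !== expectedState) return { kind: "bad_state" };
  const code = url.searchParams.get("code");
  if (!code) return { kind: "bad_callback" };
  return { kind: "ok", code };
}

console.log(base64url(Buffer.alloc(32)).length); // 43, the RFC 7636 minimum
console.log(classifyCallback("/callback?code=abc&state=s1", "s1"));
console.log(classifyCallback("/callback?code=abc&state=other", "s1"));
```

Recent Node versions also accept `buf.toString("base64url")`, which produces the same unpadded output as the three `replace` calls.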

package/lib/oauth-refresh.mjs ADDED

@@ -0,0 +1,87 @@
+// OAuth refresh-token rotation for the Llama CLI.
+//
+// Called from lib/client.mjs::request when an OAuth-bearing call returns
+// 401. We exchange the stored refresh_token for a new (access, refresh)
+// pair via POST /api/oauth/token, persist the new bundle, and surface
+// the new access_token so the caller can retry once.
+//
+// Cross-process locking via oauth-storage.withRefreshLock so two shells
+// hitting 401 simultaneously don't burn each other's refresh token.
+// After acquiring the lock we re-read the bundle in case the other
+// shell has already refreshed.
+import { LLAMA_CLI_CLIENT_ID } from "./oauth-flow.mjs";
+import { readBundle, withRefreshLock, writeBundle } from "./oauth-storage.mjs";
+const ACCESS_TOKEN_SKEW_MS = 30_000; // refresh proactively 30s before expiry
+/**
+ * Returns the current access token if non-expired, else attempts
+ * refresh. Returns null if no bundle is stored, refresh fails, or the
+ * refresh token itself is expired/revoked (caller should fall through
+ * to the next auth method or surface NO_AUTH).
+ */
+export async function getValidAccessToken() {
+  const bundle = await readBundle();
+  if (!bundle?.access_token) return null;
+  if (bundle.expires_at - Date.now() > ACCESS_TOKEN_SKEW_MS) {
+    return bundle.access_token;
+  }
+  // Near or past expiry — refresh under lock.
+  return refreshUnderLock();
+}
+/**
+ * Force a refresh regardless of expiry. Used by client.mjs on a 401
+ * with an OAuth bundle present (the access token may have been revoked
+ * server-side, in which case the refresh might still work).
+ */
+export async function forceRefresh() {
+  return refreshUnderLock();
+}
+async function refreshUnderLock() {
+  return withRefreshLock(async () => {
+    // Re-read inside the lock — another shell may have refreshed already.
+    const fresh = await readBundle();
+    if (!fresh?.refresh_token) return null;
+    if (fresh.expires_at - Date.now() > ACCESS_TOKEN_SKEW_MS) {
+      // Another shell already refreshed; we're good.
+      return fresh.access_token;
+    }
+    return performRefresh(fresh);
+  });
+}
+async function performRefresh(bundle) {
+  const body = new URLSearchParams({
+    grant_type: "refresh_token",
+    refresh_token: bundle.refresh_token,
+    client_id: bundle.client_id ?? LLAMA_CLI_CLIENT_ID,
+    resource: bundle.resource,
+  }).toString();
+  const res = await fetch(`${bundle.issuer}/api/oauth/token`, {
+    method: "POST",
+    headers: { "Content-Type": "application/x-www-form-urlencoded" },
+    body,
+  });
+  if (!res.ok) {
+    // Refresh failed — most likely refresh expired or grant was revoked.
+    // Don't delete the bundle automatically; the user might want to
+    // inspect it or `llama auth logout` themselves to clear it.
+    return null;
+  }
+  const json = await res.json().catch(() => null);
+  if (!json?.access_token || !json?.refresh_token) return null;
+  const newBundle = {
+    ...bundle,
+    access_token: json.access_token,
+    refresh_token: json.refresh_token,
+    expires_at: Date.now() + (json.expires_in ?? 3600) * 1000,
+    scope: json.scope ?? bundle.scope,
+  };
+  await writeBundle(newBundle);
+  return newBundle.access_token;
+}
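
In the file above, both `getValidAccessToken()` and `forceRefresh()` go through `refreshUnderLock()`, which returns the stored access token whenever it is more than 30 s from expiry. The sketch below restates that expiry rule as a pure decision and adds one hypothetical variation for the forced path: comparing against the token that just failed, so a 401 on a not-yet-expired token still leads to a refresh unless another process has already rotated it. `decideRefresh` and `failedToken` are illustrative names, not part of the package.

```javascript
const ACCESS_TOKEN_SKEW_MS = 30_000; // same proactive window as the diff

// Hypothetical decision helper. `failedToken` is the access token that just
// received a 401, or undefined on the proactive (pre-request) path.
function decideRefresh(bundle, now, failedToken) {
  if (!bundle?.refresh_token) return "none";
  const fresh = bundle.expires_at - now > ACCESS_TOKEN_SKEW_MS;
  // Proactive path: the rule from the diff.
  if (failedToken === undefined) return fresh ? "reuse" : "refresh";
  // Forced path (assumed variation): reuse only when the stored token is
  // fresh AND differs from the one that failed, meaning another process
  // already refreshed while this one waited for the lock.
  return fresh && bundle.access_token !== failedToken ? "reuse" : "refresh";
}

const now = 1_000_000;
const bundle = { access_token: "a1", refresh_token: "r1", expires_at: now + 600_000 };
console.log(decideRefresh(bundle, now));                                   // "reuse"
console.log(decideRefresh({ ...bundle, expires_at: now + 10_000 }, now));  // "refresh"
console.log(decideRefresh(bundle, now, "a1"));                             // "refresh"
console.log(decideRefresh(bundle, now, "a0"));                             // "reuse"
```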

package/lib/oauth-storage.mjs ADDED

@@ -0,0 +1,191 @@
+// OAuth credential storage for the Llama CLI.
+//
+// Persists the access_token / refresh_token / expires_at bundle returned
+// by the Llama Command authorization server. Two backends, in order:
+//
+//   1. OS Keychain via @napi-rs/keyring — macOS Keychain, Windows
+//      Credential Manager, Linux Secret Service (libsecret). Industry
+//      standard for desktop CLIs (gh, gcloud, Azure SDK).
+//
+//   2. Plain file `~/.llama/oauth.json` mode 0600 — used when the
+//      Keychain backend isn't available (Linux container with no
+//      libsecret, headless CI runner). Same posture as the existing
+//      `~/.llama/token` for PATs, and the same posture gh/gcloud/aws
+//      ship with on Linux servers.
+//
+// Cross-process lock: the refresh-token rotation contract requires that
+// two shells refreshing simultaneously don't burn each other's refresh
+// token. We coordinate via atomic O_CREAT|O_EXCL on `~/.llama/oauth.lock`
+// with a short retry window, and after acquiring re-read the credentials
+// in case the other shell already refreshed.
+import fs from "fs";
+import os from "os";
+import path from "path";
+const SERVICE = "com.llamaventures.cli";
+const ACCOUNT = "oauth";
+const STORE_DIR = path.join(os.homedir(), ".llama");
+const FILE_PATH = path.join(STORE_DIR, "oauth.json");
+const LOCK_PATH = path.join(STORE_DIR, "oauth.lock");
+// ============================================================
+// Keychain backend (lazy-loaded — keep startup fast)
+// ============================================================
+let _keychainEntry = null;
+let _keychainTried = false;
+async function getKeychainEntry() {
+  if (_keychainTried) return _keychainEntry;
+  _keychainTried = true;
+  try {
+    const { Entry } = await import("@napi-rs/keyring");
+    _keychainEntry = new Entry(SERVICE, ACCOUNT);
+    // Probe — if the platform backend is missing (e.g. Linux without
+    // libsecret), the Entry methods throw on first use. Surface that
+    // here so callers route to the file backend.
+    try {
+      _keychainEntry.getPassword();
+    } catch (err) {
+      const msg = String(err?.message ?? err);
+      // "no entry" / "not found" is fine — backend works, just empty.
+      // Any other error means the backend itself is unavailable.
+      if (!/no entry|not found|no such/i.test(msg)) {
+        _keychainEntry = null;
+      }
+    }
+  } catch {
+    _keychainEntry = null;
+  }
+  return _keychainEntry;
+}
+// ============================================================
+// Bundle shape
+// ============================================================
+/**
+ * @typedef {Object} OAuthBundle
+ * @property {string} access_token
+ * @property {string} refresh_token
+ * @property {number} expires_at         absolute ms epoch when access_token expires
+ * @property {string} scope              space-separated, OAuth wire format
+ * @property {string} client_id          which AS client minted this bundle
+ * @property {string} issuer             AS issuer URL — bundle is bound to it
+ * @property {string} resource           RFC 8707 audience the access_token is for
+ * @property {number} created_at         ms epoch when bundle was first stored
+ */
+// ============================================================
+// Read / write / delete
+// ============================================================
+export async function readBundle() {
+  const entry = await getKeychainEntry();
+  if (entry) {
+    try {
+      const raw = entry.getPassword();
+      if (raw) return JSON.parse(raw);
+    } catch {
+      // fall through to file
+    }
+  }
+  try {
+    const raw = fs.readFileSync(FILE_PATH, "utf8");
+    return JSON.parse(raw);
+  } catch {
+    return null;
+  }
+}
+export async function writeBundle(bundle) {
+  const json = JSON.stringify(bundle);
+  const entry = await getKeychainEntry();
+  if (entry) {
+    try {
+      entry.setPassword(json);
+      // Best-effort cleanup: if a stale plaintext file exists from a
+      // pre-Keychain install, remove it so we don't have two copies of
+      // the credential drifting.
+      try { fs.unlinkSync(FILE_PATH); } catch { /* not present */ }
+      return { backend: "keychain" };
+    } catch {
+      // fall through to file
+    }
+  }
+  fs.mkdirSync(STORE_DIR, { recursive: true, mode: 0o700 });
+  fs.writeFileSync(FILE_PATH, `${json}\n`, { mode: 0o600 });
+  fs.chmodSync(FILE_PATH, 0o600);
+  return { backend: "file" };
+}
+export async function deleteBundle() {
+  const entry = await getKeychainEntry();
+  if (entry) {
+    try { entry.deletePassword(); } catch { /* may not be present */ }
+  }
+  try { fs.unlinkSync(FILE_PATH); } catch { /* may not be present */ }
+}
+export async function detectBackend() {
+  const entry = await getKeychainEntry();
+  return entry ? "keychain" : "file";
+}
+// ============================================================
+// Cross-process lock
+// ============================================================
+//
+// Refresh rotation requires that only ONE process at a time exchange
+// the current refresh token. Without a lock, two CLI invocations racing
+// on token expiry would both POST /oauth/token; the first wins, the
+// second gets `invalid_grant` (because the first already rotated), and
+// the user sees a confusing failure.
+//
+// Pattern: atomic O_CREAT | O_EXCL on a sentinel file. If we get the
+// fd, we own the lock; on EEXIST, another process owns it — wait briefly
+// and retry. After acquiring, ALWAYS re-read the bundle from storage in
+// case the other process has refreshed in the meantime (then we don't
+// need to refresh ourselves).
+const LOCK_RETRY_MS = 100;
+const LOCK_TIMEOUT_MS = 5_000;
+export async function withRefreshLock(fn) {
+  fs.mkdirSync(STORE_DIR, { recursive: true, mode: 0o700 });
+  const start = Date.now();
+  let fd;
+  while (true) {
+    try {
+      fd = fs.openSync(LOCK_PATH, "wx", 0o600);
+      break;
+    } catch (err) {
+      if (err.code !== "EEXIST") throw err;
+      // Stale lock cleanup: if the lock file is older than the timeout,
+      // the holding process likely crashed. Remove and retry.
+      try {
+        const stat = fs.statSync(LOCK_PATH);
+        if (Date.now() - stat.mtimeMs > LOCK_TIMEOUT_MS) {
+          fs.unlinkSync(LOCK_PATH);
+          continue;
+        }
+      } catch { /* lock disappeared between EEXIST and stat — fine */ }
+      if (Date.now() - start > LOCK_TIMEOUT_MS) {
+        throw new Error(
+          "Error[OAUTH_LOCK_TIMEOUT]: Could not acquire OAuth refresh lock at " +
+          LOCK_PATH + ". Another `llama` process may be hung. Remove the " +
+          "lock file manually if you're sure no other CLI is running."
+        );
+      }
+      await new Promise((r) => setTimeout(r, LOCK_RETRY_MS));
+    }
+  }
+  try {
+    return await fn();
+  } finally {
+    try { fs.closeSync(fd); } catch { /* already closed */ }
+    try { fs.unlinkSync(LOCK_PATH); } catch { /* already gone */ }
+  }
+}
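
Note on the lock comments above: they say that after acquiring the lock a process should always re-read the bundle, because another process may already have rotated the refresh token. A minimal sketch of a caller following that pattern is below. `getFreshAccessToken` and the injected `refresh` function are hypothetical. `readBundle` and `withRefreshLock` are passed in as parameters so the sketch runs without the real module, and the real call site in the package may differ.

```javascript
// Hypothetical caller, not taken from @llamaventures/cli.
// Dependencies are injected so the control flow can be exercised in isolation.
async function getFreshAccessToken({ readBundle, withRefreshLock, refresh, now = Date.now }) {
  const current = await readBundle();
  if (!current) return null;                        // not logged in
  if (now() < current.expires_at) return current.access_token;

  // Token looks expired: serialize the refresh across processes.
  return withRefreshLock(async () => {
    // Re-read under the lock. If another process refreshed while we were
    // waiting, reuse its result instead of spending the rotated token.
    const latest = await readBundle();
    if (!latest) return null;
    if (now() < latest.expires_at) return latest.access_token;
    return refresh(latest);                         // assumed to return the new access token or null
  });
}
```

The second read is what prevents the `invalid_grant` failure described in the comments: the process that loses the race finds a fresh bundle and never posts the stale refresh token.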

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@llamaventures/cli",
-  "version": "1.2.3",
+  "version": "1.3.0",
   "description": "Llama Ventures CLI + MCP server. Internal team tool for command.llamaventures.vc.",
   "type": "module",
   "bin": {
@@ -45,6 +45,7 @@
   },
   "dependencies": {
     "@modelcontextprotocol/sdk": "1.29.0",
+    "@napi-rs/keyring": "^1.3.0",
     "zod": "^4.4.3"
   }
 }