npm - @yul-labs/agent-relay - Versions diffs - 0.1.2 → 0.1.3 - Mend

@yul-labs/agent-relay 0.1.2 → 0.1.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5) hide show

package/README.md CHANGED Viewed

@@ -99,8 +99,10 @@ prompt ──> spawn agent in a PTY (its real TUI) ──> watch terminal output
 ```
 1. The agent is launched **interactively** in a pseudo-terminal under **pure
-   autonomy** (the project's concept): Claude with `--dangerously-skip-permissions`,
-   Codex with `-s workspace-write -a never`. The agent rarely asks — but the
+   autonomy** (the project's concept): Claude with `--dangerously-skip-permissions`
+   (+ `--effort xhigh` by default), Codex with `-s danger-full-access -a never`
+   (full bypass by default — tighten with `defaults.sandbox` /
+   `adapters.codex.sandbox`, e.g. `workspace-write`). The agent rarely asks — but the
    prompts that *still* appear (the directory-trust dialog, the occasional choice)
    are what the Decider answers. `approvalPolicy: "gated"` makes the agent ask
    more — and routes those asks to the Decider — via Claude `--permission-mode
@@ -161,15 +163,31 @@ attempted. Two things you *can* rely on:
   session transcript (`~/.claude/projects/…` for Claude, `~/.codex/sessions/…` for
   Codex) — so it works on every machine **regardless of TUI / status-line
   settings**, and the token counts are the API's real numbers, not a scrape.
-  Fields: `{ source, model, inputTokens, outputTokens, cachedInputTokens,
-  cacheCreationTokens, reasoningTokens, totalTokens }`. `contextPercent` and
-  `sessionCostUsd` are best-effort *extras* scraped from the status line only when
-  it is enabled (cost reads `0` on subscription/Team billing). `source` is
-  `"transcript"` (authoritative) or `"status-line"` (scrape) so you know which.
-The proper long-term fix for the *answer text* is the **structured-first** path
-(Claude `--input-format stream-json` / Codex's JSON protocol) — see *Remaining
-TODO*.
+  Token fields: `{ source, model, inputTokens, outputTokens, cachedInputTokens,
+  cacheCreationTokens, reasoningTokens, totalTokens }`.
+- **Cost is COMPUTED, not scraped** — `usage.costUsd` = Σ(tokens × the model's
+  list price) from a built-in table (override per `config.pricing`). It is `null`
+  (with a warning) when the model's price is unknown, so you can tell "unpriced"
+  from a real `0`. The scraped nominal "Session $" (often `0` on Team/Max seats) is
+  kept separately as `usage.subscriptionSessionCostUsd`; `usage.contextPercent` is
+  another status-line extra. `source` is `"transcript"` (authoritative tokens) or
+  `"status-line"`.
+  > **`costUsd` is a REFERENCE (shadow) cost** — what the run *would* bill via the
+  > API. agent-relay drives the **interactive TUI**, which on a Claude subscription
+  > is covered by the plan, so for those runs the real marginal cost is ~0 and
+  > `costUsd` is a reference, not the amount charged. Only the non-interactive paths
+  > (`claude -p` / Agent SDK — and, per Anthropic, from **2026-06-15** no longer
+  > counted against subscription limits) actually bill at these rates. Staying on
+  > the **interactive** PTY path is what keeps unattended runs on the subscription.
+A **structured-first** backend (Claude `--output-format stream-json`) *would*
+return the answer text + usage + cost natively — but it **requires `-p`/`--print`
+(non-interactive)**, which is a different execution model from the TUI and, on a
+Claude subscription, is **not covered by the plan** (per Anthropic, `claude -p` /
+Agent SDK fall outside subscription limits from 2026-06-15). So driving the
+**interactive TUI** here is deliberate: it keeps unattended runs on the
+subscription. The file-write pattern is the supported way to recover answers
+without leaving that path.
 ## The Decider
@@ -342,6 +360,28 @@ see [`agent-relay.config.example.json`](./agent-relay.config.example.json)):
   cancelled | waiting_approval }`. The `CompletionDetector` maps the run to a
   terminal status (timeout/idle → `timeout`, cancel → `cancelled`, success/exit
   0 → `completed`, else `failed`) and is extensible.
+- `meta.completedCleanly` (exit 0, not aborted) is the unambiguous success flag;
+  `meta.settled` is `true` when the run answered a prompt **or** completed cleanly
+  (so a 0-interaction skip-mode run is `settled: true`, not falsely `false`).
+## Operational notes
+- **It writes into the target repo.** `sessionsDir` / `logsDir` default to
+  `<root>/.agent-relay/{sessions,logs}` — add **`.agent-relay/`** to the target
+  repo's `.gitignore`, or point them elsewhere (absolute / out-of-root paths work).
+  `--dry-run` still records a session entry, clearly marked `meta.dryRun: true`.
+- **Pin the binary.** Without `adapters.<name>.command`, the first `claude` /
+  `codex` on `PATH` is used — ambiguous when several exist (version managers, shell
+  wrappers, `cmux`). Set `command` to an absolute path; either way the **resolved
+  path is logged** at run start (`resolved claude → …`).
+- **Effort/cost defaults.** Claude runs at `--effort xhigh` and Codex at
+  `-s danger-full-access` by **default** (full autonomy) — both raise cost/latency
+  and access; override with `adapters.<name>.args` / `defaults.sandbox`.
+- **Reasoning-model deciders.** An `api` decider pointed at a reasoning model
+  (e.g. gpt-oss) spends tokens on `reasoning_content` before the JSON answer; too
+  small a `--decider-max-tokens` empties `content`. agent-relay clamps it up to a
+  **512 floor** and falls back to parsing `reasoning_content`, so a normal config
+  won't return an empty decision — but give verbose reasoners more headroom.
 ## Architecture

package/dist/cli.js CHANGED Viewed

@@ -90,7 +90,7 @@ function resolveApprovalMode(defaults, adapter) {
   return adapter?.approvalPolicy ?? defaults.approvalPolicy ?? (defaults.requireApprovalOnRiskyActions ? "gated" : "auto");
 }
 function resolveSandbox(defaults, adapter) {
-  return adapter?.sandbox ?? defaults.sandbox ?? "workspace-write";
+  return adapter?.sandbox ?? defaults.sandbox ?? "danger-full-access";
 }
 var hooksSchema = z.object({
   /** Shell command run just before the agent starts. */
@@ -117,6 +117,14 @@ var deciderSchema = z.object({
   apiKey: z.string().optional(),
   maxTokens: z.number().int().positive().optional()
 }).strict();
+var pricingRuleSchema = z.object({
+  /** Regex (case-insensitive) matched against the model id; first match wins. */
+  match: z.string().min(1),
+  input: z.number().nonnegative(),
+  output: z.number().nonnegative(),
+  cacheWrite: z.number().nonnegative().optional(),
+  cacheRead: z.number().nonnegative().optional()
+}).strict();
 var configSchema = z.object({
   defaultAdapter: z.string().min(1),
   sessionsDir: z.string().min(1).default(".agent-relay/sessions"),
@@ -125,6 +133,8 @@ var configSchema = z.object({
   adapters: z.record(adapterConfigSchema),
   /** Which decider answers interactive prompts. */
   decider: deciderSchema.optional(),
+  /** Override the built-in per-model token pricing (USD per 1M tokens). */
+  pricing: z.array(pricingRuleSchema).optional(),
   /** Optional shell-command lifecycle hooks. */
   hooks: hooksSchema.optional()
 }).strict().superRefine((cfg, ctx) => {
@@ -145,9 +155,10 @@ function createDefaultConfig() {
       maxTurns: 20,
       timeoutMs: 18e5,
       idleTimeoutMs: 3e5,
-      // Autonomous by default: agents run unattended without asking.
+      // Autonomous by default: agents run unattended without asking, with full
+      // bypass so they can act freely (tighten `sandbox` for guarded runs).
       approvalPolicy: "auto",
-      sandbox: "workspace-write",
+      sandbox: "danger-full-access",
       requireApprovalOnRiskyActions: false
     },
     // Interactive (PTY) is the default mode: the agent runs in its real TUI and
@@ -1011,6 +1022,7 @@ function runPtySession(opts, ctx) {
     const triggerQuit = (reason) => {
       if (finished || quitting) return;
       quitting = true;
+      settled = true;
       if (completionTimer) clearTimeout(completionTimer);
       completionTimer = void 0;
       emit("status", `task appears complete (${reason})`);
@@ -1230,7 +1242,14 @@ function runPtySession(opts, ctx) {
         exitCode,
         error,
         sessionRef: void 0,
-        meta: { interactions, settled, ...usage ? { usage } : {} }
+        // `completedCleanly` is an unambiguous success flag (exit 0, not aborted/
+        // error) — use it instead of `settled` for "did the run finish well?".
+        meta: {
+          interactions,
+          settled,
+          completedCleanly: success,
+          ...usage ? { usage } : {}
+        }
       });
     });
     if (ctx.signal.aborted) onAbort();
@@ -1330,8 +1349,17 @@ var InteractivePtyAdapter = class {
     else args.push(input.prompt);
     return this.spawn(args, initialInput, input, ctx);
   }
-  spawn(args, initialInput, input, ctx, extra) {
+  async spawn(args, initialInput, input, ctx, extra) {
     const decider = ctx.decider ?? new RuleDecider();
+    try {
+      const resolved = await which(this.cfg.command);
+      ctx.onEvent({
+        type: "status",
+        timestamp: (this.cfg.now ?? (() => /* @__PURE__ */ new Date()))().toISOString(),
+        text: resolved ? `resolved ${this.cfg.command} \u2192 ${resolved}` : `${this.cfg.command} not found on PATH; relying on spawn-time resolution`
+      });
+    } catch {
+    }
     return runPtySession(
       {
         command: this.cfg.command,
@@ -1370,8 +1398,8 @@ function scrapeClaudeStatusLine(text) {
     u.raw = ctx[0].trim().replace(/\s+/g, " ");
   }
   const cost = text.match(/session\s*\$\s*([\d.]+)/i);
-  if (cost) u.sessionCostUsd = Number(cost[1]);
-  return u.contextPercent !== void 0 || u.sessionCostUsd !== void 0 ? u : void 0;
+  if (cost) u.subscriptionSessionCostUsd = Number(cost[1]);
+  return u.contextPercent !== void 0 || u.subscriptionSessionCostUsd !== void 0 ? u : void 0;
 }
 var DEFINITION = {
   name: "claude",
@@ -1617,7 +1645,7 @@ var CodexInteractiveAdapter = class _CodexInteractiveAdapter extends Interactive
       now: opts.now,
       installHint: "Install the Codex CLI (`npm i -g @openai/codex`), run `codex login`, and ensure `codex` is on your PATH.",
       preArgs: (input) => {
-        const sandbox = input.approvalMode === "readonly" ? "read-only" : input.sandbox ?? "workspace-write";
+        const sandbox = input.approvalMode === "readonly" ? "read-only" : input.sandbox ?? "danger-full-access";
         const approval = input.approvalMode === "gated" ? "on-request" : "never";
         return ["-s", sandbox, "-a", approval];
       },
@@ -2150,6 +2178,27 @@ var RunLogger = class {
     this.append(lines.join("\n"));
   }
 };
+// src/core/pricing.ts
+var DEFAULT_PRICING = [
+  { match: /opus/i, pricing: { input: 15, output: 75, cacheWrite: 18.75, cacheRead: 1.5 } },
+  { match: /sonnet/i, pricing: { input: 3, output: 15, cacheWrite: 3.75, cacheRead: 0.3 } },
+  { match: /haiku/i, pricing: { input: 1, output: 5, cacheWrite: 1.25, cacheRead: 0.1 } }
+];
+function pricingForModel(model, overrides = []) {
+  if (!model) return void 0;
+  for (const rule of [...overrides, ...DEFAULT_PRICING]) {
+    if (rule.match.test(model)) return rule.pricing;
+  }
+  return void 0;
+}
+function computeCostUsd(usage, overrides = []) {
+  const p = pricingForModel(usage.model, overrides);
+  if (!p) return null;
+  const M = 1e6;
+  const cost = ((usage.inputTokens ?? 0) * p.input + (usage.outputTokens ?? 0) * p.output + (usage.cacheCreationTokens ?? 0) * (p.cacheWrite ?? p.input) + (usage.cachedInputTokens ?? 0) * (p.cacheRead ?? 0)) / M;
+  return Math.round(cost * 1e6) / 1e6;
+}
 var SessionManager = class {
   constructor(sessionsDir) {
     this.sessionsDir = sessionsDir;
@@ -2471,6 +2520,28 @@ async function runAgent(options) {
     }
     externalSignal?.removeEventListener("abort", onExternalAbort);
   }
+  if (result?.meta && typeof result.meta === "object") {
+    const usage = result.meta.usage;
+    if (usage && (usage.inputTokens != null || usage.outputTokens != null)) {
+      const overrides = (config.pricing ?? []).map((r) => ({
+        match: new RegExp(r.match, "i"),
+        pricing: {
+          input: r.input,
+          output: r.output,
+          cacheWrite: r.cacheWrite,
+          cacheRead: r.cacheRead
+        }
+      }));
+      usage.costUsd = computeCostUsd(usage, overrides);
+      if (usage.costUsd === null) {
+        onEvent({
+          type: "status",
+          timestamp: iso(),
+          text: `usage: no price for model ${usage.model ? `"${usage.model}"` : "(unknown)"} \u2192 costUsd=null (set config.pricing to override)`
+        });
+      }
+    }
+  }
   const status = detector.detect({ result, events, abortReason, error: runError }) ?? "failed";
   const endedAt = iso();
   const error = runError && !result?.error ? { message: runError.message, code: "ADAPTER_THREW" } : result?.error ?? (abortReason === "timeout" || abortReason === "idle" ? {

package/dist/index.d.ts CHANGED Viewed

@@ -265,9 +265,11 @@ interface AgentRunInput {
  * Resource usage for a run, exposed under `result.meta.usage`. Every field is
  * optional. Token counts come from the agent's own session transcript/rollout
  * JSONL (the AUTHORITATIVE, device-independent source — written on every run
- * regardless of TUI/status-line settings); `contextPercent` / `sessionCostUsd`
- * are best-effort extras scraped from the TUI status line when it is enabled.
- * `source` records which path produced the token figures.
+ * regardless of TUI/status-line settings); `contextPercent` /
+ * `subscriptionSessionCostUsd` are best-effort extras scraped from the TUI status
+ * line when it is enabled. `costUsd` is COMPUTED from the token counts × the
+ * model's list price (see core/pricing.ts). `source` records which path produced
+ * the token figures.
  */
 interface AgentUsage {
     /**
@@ -291,10 +293,23 @@ interface AgentUsage {
     reasoningTokens?: number;
     /** Total tokens (the agent's own total when given, else the sum of the above). */
     totalTokens?: number;
+    /**
+     * API-equivalent REFERENCE cost in USD, COMPUTED as Σ(tokens × the model's list
+     * price) — see core/pricing.ts. This is what the run WOULD bill via the API
+     * (or `claude -p` / the Agent SDK). agent-relay drives the agent's INTERACTIVE
+     * TUI, which on a Claude subscription is covered by the plan — so for those runs
+     * the real marginal cost is ~0 and this is a SHADOW cost, not the amount charged.
+     * `null` when the model's price is unknown (so it isn't confused with a real 0).
+     */
+    costUsd?: number | null;
     /** Context-window usage as a percent, when surfaced by the status line. */
     contextPercent?: number;
-    /** Session cost (USD) when the agent reports it; reads 0 on subscription billing. */
-    sessionCostUsd?: number;
+    /**
+     * The NOMINAL session cost the agent prints in its status line ("Session $X").
+     * On Team/Max subscription seats this reads 0 / is not the API price — use
+     * {@link costUsd} for an actual cost estimate.
+     */
+    subscriptionSessionCostUsd?: number;
     /** Raw status-line snippet the scraped extras came from (status-line only). */
     raw?: string;
 }
@@ -498,7 +513,9 @@ declare const defaultsSchema: z.ZodObject<{
 type RelayDefaults = z.infer<typeof defaultsSchema>;
 /** Resolve the effective approval mode (adapter override > defaults > legacy). */
 declare function resolveApprovalMode(defaults: RelayDefaults, adapter?: AdapterConfig): ApprovalMode;
-/** Resolve the effective sandbox level (adapter override > defaults > default). */
+/** Resolve the effective sandbox level (adapter override > defaults > default).
+ *  The project default is full bypass (`danger-full-access`) so unattended runs
+ *  "just work"; tighten it per `defaults.sandbox` / `adapters.<name>.sandbox`. */
 declare function resolveSandbox(defaults: RelayDefaults, adapter?: AdapterConfig): SandboxLevel;
 declare const hooksSchema: z.ZodObject<{
     /** Shell command run just before the agent starts. */
@@ -674,6 +691,27 @@ declare const configSchema: z.ZodEffects<z.ZodObject<{
         apiKey?: string | undefined;
         maxTokens?: number | undefined;
     }>>;
+    /** Override the built-in per-model token pricing (USD per 1M tokens). */
+    pricing: z.ZodOptional<z.ZodArray<z.ZodObject<{
+        /** Regex (case-insensitive) matched against the model id; first match wins. */
+        match: z.ZodString;
+        input: z.ZodNumber;
+        output: z.ZodNumber;
+        cacheWrite: z.ZodOptional<z.ZodNumber>;
+        cacheRead: z.ZodOptional<z.ZodNumber>;
+    }, "strict", z.ZodTypeAny, {
+        match: string;
+        input: number;
+        output: number;
+        cacheWrite?: number | undefined;
+        cacheRead?: number | undefined;
+    }, {
+        match: string;
+        input: number;
+        output: number;
+        cacheWrite?: number | undefined;
+        cacheRead?: number | undefined;
+    }>, "many">>;
     /** Optional shell-command lifecycle hooks. */
     hooks: z.ZodOptional<z.ZodObject<{
         /** Shell command run just before the agent starts. */
@@ -723,6 +761,13 @@ declare const configSchema: z.ZodEffects<z.ZodObject<{
         apiKey?: string | undefined;
         maxTokens?: number | undefined;
     } | undefined;
+    pricing?: {
+        match: string;
+        input: number;
+        output: number;
+        cacheWrite?: number | undefined;
+        cacheRead?: number | undefined;
+    }[] | undefined;
     hooks?: {
         onStart?: string | undefined;
         onComplete?: string | undefined;
@@ -763,6 +808,13 @@ declare const configSchema: z.ZodEffects<z.ZodObject<{
         apiKey?: string | undefined;
         maxTokens?: number | undefined;
     } | undefined;
+    pricing?: {
+        match: string;
+        input: number;
+        output: number;
+        cacheWrite?: number | undefined;
+        cacheRead?: number | undefined;
+    }[] | undefined;
     hooks?: {
         onStart?: string | undefined;
         onComplete?: string | undefined;
@@ -803,6 +855,13 @@ declare const configSchema: z.ZodEffects<z.ZodObject<{
         apiKey?: string | undefined;
         maxTokens?: number | undefined;
     } | undefined;
+    pricing?: {
+        match: string;
+        input: number;
+        output: number;
+        cacheWrite?: number | undefined;
+        cacheRead?: number | undefined;
+    }[] | undefined;
     hooks?: {
         onStart?: string | undefined;
         onComplete?: string | undefined;
@@ -843,6 +902,13 @@ declare const configSchema: z.ZodEffects<z.ZodObject<{
         apiKey?: string | undefined;
         maxTokens?: number | undefined;
     } | undefined;
+    pricing?: {
+        match: string;
+        input: number;
+        output: number;
+        cacheWrite?: number | undefined;
+        cacheRead?: number | undefined;
+    }[] | undefined;
     hooks?: {
         onStart?: string | undefined;
         onComplete?: string | undefined;
@@ -893,6 +959,64 @@ declare class SessionNotFoundError extends AgentRelayError {
     constructor(sessionId: string);
 }
+/**
+ * Per-model token pricing → an API-equivalent USD cost for a run's token usage.
+ *
+ * agent-relay reads exact token counts from the agent's transcript (see
+ * `claude-session.ts` / `codex-session.ts`), but the transcript carries no dollar
+ * figure, and the TUI "Session $" line is a NOMINAL/subscription number (often $0
+ * on Team/Max seats) — NOT the API-equivalent cost. So we compute
+ * `cost = Σ(tokens × list rate)` from a built-in price table (overridable via
+ * `config.pricing`). Rates are USD per 1M tokens and are LIST prices, so the
+ * result is an ESTIMATE — override the table when your actual billing differs.
+ *
+ * IMPORTANT: this is a REFERENCE (shadow) cost — what the run would bill via the
+ * API. agent-relay drives the agent's INTERACTIVE TUI, which on a Claude
+ * subscription is covered by the plan (the real marginal cost is ~0). Only the
+ * non-interactive paths (`claude -p` / the Agent SDK — and, from 2026-06-15, no
+ * longer counted against subscription limits) bill at these rates. So treat
+ * `costUsd` as "what this would cost via API", not necessarily what you're charged.
+ */
+interface ModelPricing {
+    /** USD per 1M input (prompt) tokens. */
+    input: number;
+    /** USD per 1M output (completion) tokens. */
+    output: number;
+    /** USD per 1M cache-CREATION tokens (defaults to `input` when unset). */
+    cacheWrite?: number;
+    /** USD per 1M cache-READ tokens (defaults to 0 when unset). */
+    cacheRead?: number;
+}
+interface PricingRule {
+    /** Matched (case-insensitive) against the model id — first match wins. */
+    match: RegExp;
+    pricing: ModelPricing;
+}
+/**
+ * Built-in Anthropic list prices (USD / 1M tokens). These are estimates that
+ * drift as pricing changes — pass `config.pricing` to override per model. Codex
+ * (OpenAI) models aren't listed here, so a Codex run reports `costUsd: null`
+ * unless you add a rule.
+ */
+declare const DEFAULT_PRICING: PricingRule[];
+/** Resolve the price for a model id (overrides first, then the built-in table). */
+declare function pricingForModel(model: string | undefined, overrides?: PricingRule[]): ModelPricing | undefined;
+/** Token usage shape needed to price a run. */
+interface CostUsageInput {
+    model?: string;
+    inputTokens?: number;
+    outputTokens?: number;
+    cacheCreationTokens?: number;
+    cachedInputTokens?: number;
+}
+/**
+ * API-equivalent USD cost for a run's token usage, or `null` when the model's
+ * price is unknown — so callers WARN instead of reporting a misleading `0`.
+ * Assumes Claude-style additive cache accounting (input excludes cached reads;
+ * cache reads are billed separately at the cheap `cacheRead` rate).
+ */
+declare function computeCostUsd(usage: CostUsageInput, overrides?: PricingRule[]): number | null;
 /** Session metadata persistence: create, save, load, list. */
 interface CreateSessionInput {
@@ -1473,11 +1597,13 @@ declare class ClaudeInteractiveAdapter extends InteractivePtyAdapter {
 /**
  * Codex driven interactively in a PTY. The project's concept is PURE AUTONOMY:
- * by default Codex runs with `-a never` (never ask) within the chosen sandbox,
- * so it just works. The {@link Decider} still handles the prompts that appear
- * anyway (the directory-trust dialog, etc.). `approvalPolicy: "gated"` switches
- * Codex to `-a on-request` so the decider sees each action; `"readonly"` runs it
- * read-only. The prompt is a positional arg so the TUI starts immediately.
+ * by default Codex runs with `-a never` (never ask) and `-s danger-full-access`
+ * (full bypass — the project default), so it just works unattended. The
+ * {@link Decider} still handles the prompts that appear anyway (the directory-
+ * trust dialog, etc.). `approvalPolicy: "gated"` switches Codex to `-a on-request`
+ * so the decider sees each action; `"readonly"` runs it read-only. Tighten the
+ * sandbox with `defaults.sandbox` / `adapters.codex.sandbox` (e.g. workspace-write).
+ * The prompt is a positional arg so the TUI starts immediately.
  */
 interface CodexInteractiveOptions {
@@ -1765,4 +1891,4 @@ declare function cleanTerminalText(input: string): string;
 /** Return the last `n` non-empty lines of cleaned text. */
 declare function tailLines(text: string, n: number): string[];
-export { type AbortReason, type AdapterAvailability, type AdapterConfig, type AdapterFactory, type AdapterListItem, type AdapterMode, AdapterRegistry, type AdapterRunContext, type AgentAdapter, type AgentAdapterDefinition, type AgentErrorInfo, type AgentEvent, type AgentEventType, AgentRelayError, type AgentRunInput, type AgentRunResult, type AgentSessionRef, type AgentUsage, AlwaysApproveDecider, ApiDecider, type ApprovalMode, BUILTIN_ADAPTER_DEFINITIONS, CONFIG_FILENAME, ClaudeInteractiveAdapter, CodexInteractiveAdapter, CommandDecider, type CommandPreview, type CompletionContext, type CompletionDetector, CompositeCompletionDetector, ConfigError, type CreateSessionInput, DEFAULT_DENY_PATTERNS, type Decider, type DeciderConfig, type DeciderConfigSchema, type DeciderFlags, type DecisionAction, DefaultCompletionDetector, DefaultKeymap, type DetectedPrompt, type DoctorReport, type FakeAdapterOptions, FakeAgentAdapter, FunctionDecider, type HooksConfig, type InitResult, type InteractionDecision, type InteractionKind, type InteractionRequest, type InteractiveAdapterConfig, InteractivePtyAdapter, OutputPatternDetector, PromptDetector, type PromptDetectorOptions, type PruneOptions, type PruneResult, type PtyKeymap, type PtySessionOptions, type RelayConfig, type RelayDefaults, type ResumeCommandResult, RuleDecider, type RunHooks, RunLogger, type RunLoggerOptions, type RunOutcome, type RunnerOptions, type SandboxLevel, type SessionListItem, SessionManager, type SessionMetadata, SessionNotFoundError, type SessionStatus, type ShellHookContext, type ShellHooks, UnknownAdapterError, adapterConfigSchema, approvalPolicySchema, cleanTerminalText, configPath, configSchema, createAdapterFactory, createDecider, createDefaultConfig, deciderConfigFromFlags, deciderSchema, defaultRegistry, defaultsSchema, hooksSchema, listAdapters, listSessions, loadConfig, loadConfigOrDefault, parseCheckbox, parseConfig, parseDecisionReply, pruneSessions, renderDecisionPrompt, resolveApprovalMode, resolvePrompt, resolveSandbox, resumeCommand, runAgent, runCommand, runDoctor, runInit, runPtySession, runShellHook, sandboxSchema, saveConfig, stringifyConfig, stripAnsi, tailLines };
+export { type AbortReason, type AdapterAvailability, type AdapterConfig, type AdapterFactory, type AdapterListItem, type AdapterMode, AdapterRegistry, type AdapterRunContext, type AgentAdapter, type AgentAdapterDefinition, type AgentErrorInfo, type AgentEvent, type AgentEventType, AgentRelayError, type AgentRunInput, type AgentRunResult, type AgentSessionRef, type AgentUsage, AlwaysApproveDecider, ApiDecider, type ApprovalMode, BUILTIN_ADAPTER_DEFINITIONS, CONFIG_FILENAME, ClaudeInteractiveAdapter, CodexInteractiveAdapter, CommandDecider, type CommandPreview, type CompletionContext, type CompletionDetector, CompositeCompletionDetector, ConfigError, type CreateSessionInput, DEFAULT_DENY_PATTERNS, DEFAULT_PRICING, type Decider, type DeciderConfig, type DeciderConfigSchema, type DeciderFlags, type DecisionAction, DefaultCompletionDetector, DefaultKeymap, type DetectedPrompt, type DoctorReport, type FakeAdapterOptions, FakeAgentAdapter, FunctionDecider, type HooksConfig, type InitResult, type InteractionDecision, type InteractionKind, type InteractionRequest, type InteractiveAdapterConfig, InteractivePtyAdapter, type ModelPricing, OutputPatternDetector, type PricingRule, PromptDetector, type PromptDetectorOptions, type PruneOptions, type PruneResult, type PtyKeymap, type PtySessionOptions, type RelayConfig, type RelayDefaults, type ResumeCommandResult, RuleDecider, type RunHooks, RunLogger, type RunLoggerOptions, type RunOutcome, type RunnerOptions, type SandboxLevel, type SessionListItem, SessionManager, type SessionMetadata, SessionNotFoundError, type SessionStatus, type ShellHookContext, type ShellHooks, UnknownAdapterError, adapterConfigSchema, approvalPolicySchema, cleanTerminalText, computeCostUsd, configPath, configSchema, createAdapterFactory, createDecider, createDefaultConfig, deciderConfigFromFlags, deciderSchema, defaultRegistry, defaultsSchema, hooksSchema, listAdapters, listSessions, loadConfig, loadConfigOrDefault, parseCheckbox, parseConfig, parseDecisionReply, pricingForModel, pruneSessions, renderDecisionPrompt, resolveApprovalMode, resolvePrompt, resolveSandbox, resumeCommand, runAgent, runCommand, runDoctor, runInit, runPtySession, runShellHook, sandboxSchema, saveConfig, stringifyConfig, stripAnsi, tailLines };

package/dist/index.js CHANGED Viewed

@@ -92,7 +92,7 @@ function resolveApprovalMode(defaults, adapter) {
   return adapter?.approvalPolicy ?? defaults.approvalPolicy ?? (defaults.requireApprovalOnRiskyActions ? "gated" : "auto");
 }
 function resolveSandbox(defaults, adapter) {
-  return adapter?.sandbox ?? defaults.sandbox ?? "workspace-write";
+  return adapter?.sandbox ?? defaults.sandbox ?? "danger-full-access";
 }
 var hooksSchema = z.object({
   /** Shell command run just before the agent starts. */
@@ -119,6 +119,14 @@ var deciderSchema = z.object({
   apiKey: z.string().optional(),
   maxTokens: z.number().int().positive().optional()
 }).strict();
+var pricingRuleSchema = z.object({
+  /** Regex (case-insensitive) matched against the model id; first match wins. */
+  match: z.string().min(1),
+  input: z.number().nonnegative(),
+  output: z.number().nonnegative(),
+  cacheWrite: z.number().nonnegative().optional(),
+  cacheRead: z.number().nonnegative().optional()
+}).strict();
 var configSchema = z.object({
   defaultAdapter: z.string().min(1),
   sessionsDir: z.string().min(1).default(".agent-relay/sessions"),
@@ -127,6 +135,8 @@ var configSchema = z.object({
   adapters: z.record(adapterConfigSchema),
   /** Which decider answers interactive prompts. */
   decider: deciderSchema.optional(),
+  /** Override the built-in per-model token pricing (USD per 1M tokens). */
+  pricing: z.array(pricingRuleSchema).optional(),
   /** Optional shell-command lifecycle hooks. */
   hooks: hooksSchema.optional()
 }).strict().superRefine((cfg, ctx) => {
@@ -147,9 +157,10 @@ function createDefaultConfig() {
       maxTurns: 20,
       timeoutMs: 18e5,
       idleTimeoutMs: 3e5,
-      // Autonomous by default: agents run unattended without asking.
+      // Autonomous by default: agents run unattended without asking, with full
+      // bypass so they can act freely (tighten `sandbox` for guarded runs).
       approvalPolicy: "auto",
-      sandbox: "workspace-write",
+      sandbox: "danger-full-access",
       requireApprovalOnRiskyActions: false
     },
     // Interactive (PTY) is the default mode: the agent runs in its real TUI and
@@ -231,6 +242,27 @@ async function saveConfig(rootDir, config) {
   await promises.writeFile(file, stringifyConfig(config), "utf8");
   return file;
 }
+// src/core/pricing.ts
+var DEFAULT_PRICING = [
+  { match: /opus/i, pricing: { input: 15, output: 75, cacheWrite: 18.75, cacheRead: 1.5 } },
+  { match: /sonnet/i, pricing: { input: 3, output: 15, cacheWrite: 3.75, cacheRead: 0.3 } },
+  { match: /haiku/i, pricing: { input: 1, output: 5, cacheWrite: 1.25, cacheRead: 0.1 } }
+];
+function pricingForModel(model, overrides = []) {
+  if (!model) return void 0;
+  for (const rule of [...overrides, ...DEFAULT_PRICING]) {
+    if (rule.match.test(model)) return rule.pricing;
+  }
+  return void 0;
+}
+function computeCostUsd(usage, overrides = []) {
+  const p = pricingForModel(usage.model, overrides);
+  if (!p) return null;
+  const M = 1e6;
+  const cost = ((usage.inputTokens ?? 0) * p.input + (usage.outputTokens ?? 0) * p.output + (usage.cacheCreationTokens ?? 0) * (p.cacheWrite ?? p.input) + (usage.cachedInputTokens ?? 0) * (p.cacheRead ?? 0)) / M;
+  return Math.round(cost * 1e6) / 1e6;
+}
 var SessionManager = class {
   constructor(sessionsDir) {
     this.sessionsDir = sessionsDir;
@@ -1031,6 +1063,28 @@ async function runAgent(options) {
     }
     externalSignal?.removeEventListener("abort", onExternalAbort);
   }
+  if (result?.meta && typeof result.meta === "object") {
+    const usage = result.meta.usage;
+    if (usage && (usage.inputTokens != null || usage.outputTokens != null)) {
+      const overrides = (config.pricing ?? []).map((r) => ({
+        match: new RegExp(r.match, "i"),
+        pricing: {
+          input: r.input,
+          output: r.output,
+          cacheWrite: r.cacheWrite,
+          cacheRead: r.cacheRead
+        }
+      }));
+      usage.costUsd = computeCostUsd(usage, overrides);
+      if (usage.costUsd === null) {
+        onEvent({
+          type: "status",
+          timestamp: iso(),
+          text: `usage: no price for model ${usage.model ? `"${usage.model}"` : "(unknown)"} \u2192 costUsd=null (set config.pricing to override)`
+        });
+      }
+    }
+  }
   const status = detector.detect({ result, events, abortReason, error: runError }) ?? "failed";
   const endedAt = iso();
   const error = runError && !result?.error ? { message: runError.message, code: "ADAPTER_THREW" } : result?.error ?? (abortReason === "timeout" || abortReason === "idle" ? {
@@ -1557,6 +1611,7 @@ function runPtySession(opts, ctx) {
     const triggerQuit = (reason) => {
       if (finished || quitting) return;
       quitting = true;
+      settled = true;
       if (completionTimer) clearTimeout(completionTimer);
       completionTimer = void 0;
       emit("status", `task appears complete (${reason})`);
@@ -1776,7 +1831,14 @@ function runPtySession(opts, ctx) {
         exitCode,
         error,
         sessionRef: void 0,
-        meta: { interactions, settled, ...usage ? { usage } : {} }
+        // `completedCleanly` is an unambiguous success flag (exit 0, not aborted/
+        // error) — use it instead of `settled` for "did the run finish well?".
+        meta: {
+          interactions,
+          settled,
+          completedCleanly: success,
+          ...usage ? { usage } : {}
+        }
       });
     });
     if (ctx.signal.aborted) onAbort();
@@ -1876,8 +1938,17 @@ var InteractivePtyAdapter = class {
     else args.push(input.prompt);
     return this.spawn(args, initialInput, input, ctx);
   }
-  spawn(args, initialInput, input, ctx, extra) {
+  async spawn(args, initialInput, input, ctx, extra) {
     const decider = ctx.decider ?? new RuleDecider();
+    try {
+      const resolved = await which(this.cfg.command);
+      ctx.onEvent({
+        type: "status",
+        timestamp: (this.cfg.now ?? (() => /* @__PURE__ */ new Date()))().toISOString(),
+        text: resolved ? `resolved ${this.cfg.command} \u2192 ${resolved}` : `${this.cfg.command} not found on PATH; relying on spawn-time resolution`
+      });
+    } catch {
+    }
     return runPtySession(
       {
         command: this.cfg.command,
@@ -2010,8 +2081,8 @@ function scrapeClaudeStatusLine(text) {
     u.raw = ctx[0].trim().replace(/\s+/g, " ");
   }
   const cost = text.match(/session\s*\$\s*([\d.]+)/i);
-  if (cost) u.sessionCostUsd = Number(cost[1]);
-  return u.contextPercent !== void 0 || u.sessionCostUsd !== void 0 ? u : void 0;
+  if (cost) u.subscriptionSessionCostUsd = Number(cost[1]);
+  return u.contextPercent !== void 0 || u.subscriptionSessionCostUsd !== void 0 ? u : void 0;
 }
 var DEFINITION2 = {
   name: "claude",
@@ -2257,7 +2328,7 @@ var CodexInteractiveAdapter = class _CodexInteractiveAdapter extends Interactive
       now: opts.now,
       installHint: "Install the Codex CLI (`npm i -g @openai/codex`), run `codex login`, and ensure `codex` is on your PATH.",
       preArgs: (input) => {
-        const sandbox = input.approvalMode === "readonly" ? "read-only" : input.sandbox ?? "workspace-write";
+        const sandbox = input.approvalMode === "readonly" ? "read-only" : input.sandbox ?? "danger-full-access";
         const approval = input.approvalMode === "gated" ? "on-request" : "never";
         return ["-s", sandbox, "-a", approval];
       },
@@ -2742,4 +2813,4 @@ async function pruneSessions(opts) {
   };
 }
-export { AdapterRegistry, AgentRelayError, AlwaysApproveDecider, ApiDecider, BUILTIN_ADAPTER_DEFINITIONS, CONFIG_FILENAME, ClaudeInteractiveAdapter, CodexInteractiveAdapter, CommandDecider, CompositeCompletionDetector, ConfigError, DEFAULT_DENY_PATTERNS, DefaultCompletionDetector, DefaultKeymap, FakeAgentAdapter, FunctionDecider, InteractivePtyAdapter, OutputPatternDetector, PromptDetector, RuleDecider, RunLogger, SessionManager, SessionNotFoundError, UnknownAdapterError, adapterConfigSchema, approvalPolicySchema, cleanTerminalText, configPath, configSchema, createAdapterFactory, createDecider, createDefaultConfig, deciderConfigFromFlags, deciderSchema, defaultRegistry, defaultsSchema, hooksSchema, listAdapters, listSessions, loadConfig, loadConfigOrDefault, parseCheckbox, parseConfig, parseDecisionReply, pruneSessions, renderDecisionPrompt, resolveApprovalMode, resolvePrompt, resolveSandbox, resumeCommand, runAgent, runCommand, runDoctor, runInit, runPtySession, runShellHook, sandboxSchema, saveConfig, stringifyConfig, stripAnsi, tailLines };
+export { AdapterRegistry, AgentRelayError, AlwaysApproveDecider, ApiDecider, BUILTIN_ADAPTER_DEFINITIONS, CONFIG_FILENAME, ClaudeInteractiveAdapter, CodexInteractiveAdapter, CommandDecider, CompositeCompletionDetector, ConfigError, DEFAULT_DENY_PATTERNS, DEFAULT_PRICING, DefaultCompletionDetector, DefaultKeymap, FakeAgentAdapter, FunctionDecider, InteractivePtyAdapter, OutputPatternDetector, PromptDetector, RuleDecider, RunLogger, SessionManager, SessionNotFoundError, UnknownAdapterError, adapterConfigSchema, approvalPolicySchema, cleanTerminalText, computeCostUsd, configPath, configSchema, createAdapterFactory, createDecider, createDefaultConfig, deciderConfigFromFlags, deciderSchema, defaultRegistry, defaultsSchema, hooksSchema, listAdapters, listSessions, loadConfig, loadConfigOrDefault, parseCheckbox, parseConfig, parseDecisionReply, pricingForModel, pruneSessions, renderDecisionPrompt, resolveApprovalMode, resolvePrompt, resolveSandbox, resumeCommand, runAgent, runCommand, runDoctor, runInit, runPtySession, runShellHook, sandboxSchema, saveConfig, stringifyConfig, stripAnsi, tailLines };

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@yul-labs/agent-relay",
-  "version": "0.1.2",
+  "version": "0.1.3",
   "description": "Vendor-agnostic LLM coding agent session orchestrator (Claude, Codex, ...)",
   "type": "module",
   "packageManager": "pnpm@10.25.0",