@yul-labs/agent-relay 0.1.2 → 0.1.3

package/README.md CHANGED
@@ -99,8 +99,10 @@ prompt ──> spawn agent in a PTY (its real TUI) ──> watch terminal output
  ```
 
  1. The agent is launched **interactively** in a pseudo-terminal under **pure
- autonomy** (the project's concept): Claude with `--dangerously-skip-permissions`,
- Codex with `-s workspace-write -a never`. The agent rarely asks — but the
+ autonomy** (the project's concept): Claude with `--dangerously-skip-permissions`
+ (+ `--effort xhigh` by default), Codex with `-s danger-full-access -a never`
+ (full bypass by default — tighten with `defaults.sandbox` /
+ `adapters.codex.sandbox`, e.g. `workspace-write`). The agent rarely asks — but the
  prompts that *still* appear (the directory-trust dialog, the occasional choice)
  are what the Decider answers. `approvalPolicy: "gated"` makes the agent ask
  more — and routes those asks to the Decider — via Claude `--permission-mode
@@ -161,15 +163,31 @@ attempted. Two things you *can* rely on:
  session transcript (`~/.claude/projects/…` for Claude, `~/.codex/sessions/…` for
  Codex) — so it works on every machine **regardless of TUI / status-line
  settings**, and the token counts are the API's real numbers, not a scrape.
- Fields: `{ source, model, inputTokens, outputTokens, cachedInputTokens,
- cacheCreationTokens, reasoningTokens, totalTokens }`. `contextPercent` and
- `sessionCostUsd` are best-effort *extras* scraped from the status line only when
- it is enabled (cost reads `0` on subscription/Team billing). `source` is
- `"transcript"` (authoritative) or `"status-line"` (scrape) so you know which.
-
- The proper long-term fix for the *answer text* is the **structured-first** path
- (Claude `--input-format stream-json` / Codex's JSON protocol) — see *Remaining
- TODO*.
+ Token fields: `{ source, model, inputTokens, outputTokens, cachedInputTokens,
+ cacheCreationTokens, reasoningTokens, totalTokens }`.
+ - **Cost is COMPUTED, not scraped** `usage.costUsd` = Σ(tokens × the model's
+ list price) from a built-in table (override per `config.pricing`). It is `null`
+ (with a warning) when the model's price is unknown, so you can tell "unpriced"
+ from a real `0`. The scraped nominal "Session $" (often `0` on Team/Max seats) is
+ kept separately as `usage.subscriptionSessionCostUsd`; `usage.contextPercent` is
+ another status-line extra. `source` is `"transcript"` (authoritative tokens) or
+ `"status-line"`.
+ > **`costUsd` is a REFERENCE (shadow) cost** — what the run *would* bill via the
+ > API. agent-relay drives the **interactive TUI**, which on a Claude subscription
+ > is covered by the plan, so for those runs the real marginal cost is ~0 and
+ > `costUsd` is a reference, not the amount charged. Only the non-interactive paths
+ > (`claude -p` / Agent SDK — and, per Anthropic, from **2026-06-15** no longer
+ > counted against subscription limits) actually bill at these rates. Staying on
+ > the **interactive** PTY path is what keeps unattended runs on the subscription.
+
+ A **structured-first** backend (Claude `--output-format stream-json`) *would*
+ return the answer text + usage + cost natively — but it **requires `-p`/`--print`
+ (non-interactive)**, which is a different execution model from the TUI and, on a
+ Claude subscription, is **not covered by the plan** (per Anthropic, `claude -p` /
+ Agent SDK fall outside subscription limits from 2026-06-15). So driving the
+ **interactive TUI** here is deliberate: it keeps unattended runs on the
+ subscription. The file-write pattern is the supported way to recover answers
+ without leaving that path.
 
  ## The Decider
 
@@ -342,6 +360,28 @@ see [`agent-relay.config.example.json`](./agent-relay.config.example.json)):
  cancelled | waiting_approval }`. The `CompletionDetector` maps the run to a
  terminal status (timeout/idle → `timeout`, cancel → `cancelled`, success/exit
  0 → `completed`, else `failed`) and is extensible.
+ - `meta.completedCleanly` (exit 0, not aborted) is the unambiguous success flag;
+ `meta.settled` is `true` when the run answered a prompt **or** completed cleanly
+ (so a 0-interaction skip-mode run is `settled: true`, not falsely `false`).
+
+ ## Operational notes
+
+ - **It writes into the target repo.** `sessionsDir` / `logsDir` default to
+ `<root>/.agent-relay/{sessions,logs}` — add **`.agent-relay/`** to the target
+ repo's `.gitignore`, or point them elsewhere (absolute / out-of-root paths work).
+ `--dry-run` still records a session entry, clearly marked `meta.dryRun: true`.
+ - **Pin the binary.** Without `adapters.<name>.command`, the first `claude` /
+ `codex` on `PATH` is used — ambiguous when several exist (version managers, shell
+ wrappers, `cmux`). Set `command` to an absolute path; either way the **resolved
+ path is logged** at run start (`resolved claude → …`).
+ - **Effort/cost defaults.** Claude runs at `--effort xhigh` and Codex at
+ `-s danger-full-access` by **default** (full autonomy) — both raise cost/latency
+ and access; override with `adapters.<name>.args` / `defaults.sandbox`.
+ - **Reasoning-model deciders.** An `api` decider pointed at a reasoning model
+ (e.g. gpt-oss) spends tokens on `reasoning_content` before the JSON answer; too
+ small a `--decider-max-tokens` empties `content`. agent-relay clamps it up to a
+ **512 floor** and falls back to parsing `reasoning_content`, so a normal config
+ won't return an empty decision — but give verbose reasoners more headroom.
 
  ## Architecture
 
package/dist/cli.js CHANGED
@@ -90,7 +90,7 @@ function resolveApprovalMode(defaults, adapter) {
  return adapter?.approvalPolicy ?? defaults.approvalPolicy ?? (defaults.requireApprovalOnRiskyActions ? "gated" : "auto");
  }
  function resolveSandbox(defaults, adapter) {
- return adapter?.sandbox ?? defaults.sandbox ?? "workspace-write";
+ return adapter?.sandbox ?? defaults.sandbox ?? "danger-full-access";
  }
  var hooksSchema = z.object({
  /** Shell command run just before the agent starts. */
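The fallback chain changed in this hunk can be exercised in isolation. The following is a minimal TypeScript sketch, not the package source: the `SandboxLevel` union is an assumption assembled from the three values that appear in this diff, and `SandboxSource` and `resolveSandboxSketch` are hypothetical stand-ins for the real `RelayDefaults` / `AdapterConfig` types and `resolveSandbox`.

```typescript
// Assumed union: these three values appear in this diff, but the package's
// real SandboxLevel type may contain others.
type SandboxLevel = "read-only" | "workspace-write" | "danger-full-access";

// Hypothetical stand-in for RelayDefaults / AdapterConfig.
interface SandboxSource {
  sandbox?: SandboxLevel;
}

// Adapter override first, then defaults, then the 0.1.3 built-in fallback.
function resolveSandboxSketch(defaults: SandboxSource, adapter?: SandboxSource): SandboxLevel {
  return adapter?.sandbox ?? defaults.sandbox ?? "danger-full-access";
}

const unset = resolveSandboxSketch({});
const tightened = resolveSandboxSketch({ sandbox: "workspace-write" });
const adapterWins = resolveSandboxSketch({ sandbox: "workspace-write" }, { sandbox: "read-only" });
console.log(unset, tightened, adapterWins);
```

The practical consequence is that a config which never set `sandbox` and relied on the old implicit `workspace-write` fallback now resolves to full access; setting `defaults.sandbox` explicitly restores the previous behaviour.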
@@ -117,6 +117,14 @@ var deciderSchema = z.object({
  apiKey: z.string().optional(),
  maxTokens: z.number().int().positive().optional()
  }).strict();
+ var pricingRuleSchema = z.object({
+ /** Regex (case-insensitive) matched against the model id; first match wins. */
+ match: z.string().min(1),
+ input: z.number().nonnegative(),
+ output: z.number().nonnegative(),
+ cacheWrite: z.number().nonnegative().optional(),
+ cacheRead: z.number().nonnegative().optional()
+ }).strict();
  var configSchema = z.object({
  defaultAdapter: z.string().min(1),
  sessionsDir: z.string().min(1).default(".agent-relay/sessions"),
@@ -125,6 +133,8 @@ var configSchema = z.object({
  adapters: z.record(adapterConfigSchema),
  /** Which decider answers interactive prompts. */
  decider: deciderSchema.optional(),
+ /** Override the built-in per-model token pricing (USD per 1M tokens). */
+ pricing: z.array(pricingRuleSchema).optional(),
  /** Optional shell-command lifecycle hooks. */
  hooks: hooksSchema.optional()
  }).strict().superRefine((cfg, ctx) => {
@@ -145,9 +155,10 @@ function createDefaultConfig() {
  maxTurns: 20,
  timeoutMs: 18e5,
  idleTimeoutMs: 3e5,
- // Autonomous by default: agents run unattended without asking.
+ // Autonomous by default: agents run unattended without asking, with full
+ // bypass so they can act freely (tighten `sandbox` for guarded runs).
  approvalPolicy: "auto",
- sandbox: "workspace-write",
+ sandbox: "danger-full-access",
  requireApprovalOnRiskyActions: false
  },
  // Interactive (PTY) is the default mode: the agent runs in its real TUI and
@@ -1011,6 +1022,7 @@ function runPtySession(opts, ctx) {
  const triggerQuit = (reason) => {
  if (finished || quitting) return;
  quitting = true;
+ settled = true;
  if (completionTimer) clearTimeout(completionTimer);
  completionTimer = void 0;
  emit("status", `task appears complete (${reason})`);
@@ -1230,7 +1242,14 @@ function runPtySession(opts, ctx) {
  exitCode,
  error,
  sessionRef: void 0,
- meta: { interactions, settled, ...usage ? { usage } : {} }
+ // `completedCleanly` is an unambiguous success flag (exit 0, not aborted/
+ // error) — use it instead of `settled` for "did the run finish well?".
+ meta: {
+ interactions,
+ settled,
+ completedCleanly: success,
+ ...usage ? { usage } : {}
+ }
  });
  });
  if (ctx.signal.aborted) onAbort();
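A caller that consumes `result.meta` might separate the two flags roughly as follows. This is a hedged sketch only: `RunMetaFlags`, `RunVerdict`, and `classifyRun` are hypothetical names introduced for illustration, and the three-way split is one reading of the README text in this diff rather than anything the package exports.

```typescript
// Hypothetical subset of result.meta: only the two flags touched by this hunk.
interface RunMetaFlags {
  settled?: boolean;
  completedCleanly?: boolean;
}

type RunVerdict = "succeeded" | "settled-but-not-clean" | "unsettled";

// completedCleanly is treated as the success signal; settled alone only says
// that the run answered a prompt or reached its completion path.
function classifyRun(meta: RunMetaFlags): RunVerdict {
  if (meta.completedCleanly === true) return "succeeded";
  return meta.settled === true ? "settled-but-not-clean" : "unsettled";
}

const clean = classifyRun({ settled: true, completedCleanly: true });
const answeredThenFailed = classifyRun({ settled: true, completedCleanly: false });
const neither = classifyRun({});
console.log(clean, answeredThenFailed, neither);
```

Checking `completedCleanly` first matches the comment added in this hunk, which recommends it over `settled` for the question of whether the run finished well.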
@@ -1330,8 +1349,17 @@ var InteractivePtyAdapter = class {
  else args.push(input.prompt);
  return this.spawn(args, initialInput, input, ctx);
  }
- spawn(args, initialInput, input, ctx, extra) {
+ async spawn(args, initialInput, input, ctx, extra) {
  const decider = ctx.decider ?? new RuleDecider();
+ try {
+ const resolved = await which(this.cfg.command);
+ ctx.onEvent({
+ type: "status",
+ timestamp: (this.cfg.now ?? (() => /* @__PURE__ */ new Date()))().toISOString(),
+ text: resolved ? `resolved ${this.cfg.command} \u2192 ${resolved}` : `${this.cfg.command} not found on PATH; relying on spawn-time resolution`
+ });
+ } catch {
+ }
  return runPtySession(
  {
  command: this.cfg.command,
@@ -1370,8 +1398,8 @@ function scrapeClaudeStatusLine(text) {
  u.raw = ctx[0].trim().replace(/\s+/g, " ");
  }
  const cost = text.match(/session\s*\$\s*([\d.]+)/i);
- if (cost) u.sessionCostUsd = Number(cost[1]);
- return u.contextPercent !== void 0 || u.sessionCostUsd !== void 0 ? u : void 0;
+ if (cost) u.subscriptionSessionCostUsd = Number(cost[1]);
+ return u.contextPercent !== void 0 || u.subscriptionSessionCostUsd !== void 0 ? u : void 0;
  }
  var DEFINITION = {
  name: "claude",
@@ -1617,7 +1645,7 @@ var CodexInteractiveAdapter = class _CodexInteractiveAdapter extends Interactive
  now: opts.now,
  installHint: "Install the Codex CLI (`npm i -g @openai/codex`), run `codex login`, and ensure `codex` is on your PATH.",
  preArgs: (input) => {
- const sandbox = input.approvalMode === "readonly" ? "read-only" : input.sandbox ?? "workspace-write";
+ const sandbox = input.approvalMode === "readonly" ? "read-only" : input.sandbox ?? "danger-full-access";
  const approval = input.approvalMode === "gated" ? "on-request" : "never";
  return ["-s", sandbox, "-a", approval];
  },
@@ -2150,6 +2178,27 @@ var RunLogger = class {
  this.append(lines.join("\n"));
  }
  };
+
+ // src/core/pricing.ts
+ var DEFAULT_PRICING = [
+ { match: /opus/i, pricing: { input: 15, output: 75, cacheWrite: 18.75, cacheRead: 1.5 } },
+ { match: /sonnet/i, pricing: { input: 3, output: 15, cacheWrite: 3.75, cacheRead: 0.3 } },
+ { match: /haiku/i, pricing: { input: 1, output: 5, cacheWrite: 1.25, cacheRead: 0.1 } }
+ ];
+ function pricingForModel(model, overrides = []) {
+ if (!model) return void 0;
+ for (const rule of [...overrides, ...DEFAULT_PRICING]) {
+ if (rule.match.test(model)) return rule.pricing;
+ }
+ return void 0;
+ }
+ function computeCostUsd(usage, overrides = []) {
+ const p = pricingForModel(usage.model, overrides);
+ if (!p) return null;
+ const M = 1e6;
+ const cost = ((usage.inputTokens ?? 0) * p.input + (usage.outputTokens ?? 0) * p.output + (usage.cacheCreationTokens ?? 0) * (p.cacheWrite ?? p.input) + (usage.cachedInputTokens ?? 0) * (p.cacheRead ?? 0)) / M;
+ return Math.round(cost * 1e6) / 1e6;
+ }
  var SessionManager = class {
  constructor(sessionsDir) {
  this.sessionsDir = sessionsDir;
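To make the cost formula added in this hunk concrete, here is a small self-contained TypeScript sketch that re-implements the same arithmetic against the Sonnet row of the built-in table. The interfaces mirror the declarations shown elsewhere in this diff; `computeCostSketch`, `SONNET_RULE`, and the model id `claude-sonnet-example` are hypothetical names for illustration, and the token counts are made up.

```typescript
interface ModelPricing {
  input: number;
  output: number;
  cacheWrite?: number;
  cacheRead?: number;
}
interface PricingRule {
  match: RegExp;
  pricing: ModelPricing;
}
interface CostUsageInput {
  model?: string;
  inputTokens?: number;
  outputTokens?: number;
  cacheCreationTokens?: number;
  cachedInputTokens?: number;
}

// Sonnet rates copied from DEFAULT_PRICING in this diff (USD per 1M tokens).
const SONNET_RULE: PricingRule = {
  match: /sonnet/i,
  pricing: { input: 3, output: 15, cacheWrite: 3.75, cacheRead: 0.3 },
};

// First matching rule wins; an unmatched or missing model yields null, not 0.
function computeCostSketch(usage: CostUsageInput, rules: PricingRule[]): number | null {
  const model = usage.model;
  if (!model) return null;
  const rule = rules.find((r) => r.match.test(model));
  if (!rule) return null;
  const p = rule.pricing;
  const cost =
    ((usage.inputTokens ?? 0) * p.input +
      (usage.outputTokens ?? 0) * p.output +
      (usage.cacheCreationTokens ?? 0) * (p.cacheWrite ?? p.input) +
      (usage.cachedInputTokens ?? 0) * (p.cacheRead ?? 0)) /
    1e6;
  // Round to six decimal places, as the diffed function does.
  return Math.round(cost * 1e6) / 1e6;
}

const priced = computeCostSketch(
  {
    model: "claude-sonnet-example", // hypothetical model id
    inputTokens: 1000,
    outputTokens: 500,
    cacheCreationTokens: 2000,
    cachedInputTokens: 10000,
  },
  [SONNET_RULE],
);
const unpriced = computeCostSketch({ model: "some-other-model", inputTokens: 1000 }, [SONNET_RULE]);
console.log(priced, unpriced);
```

For the first call the four terms are 1000 × 3 = 3000, 500 × 15 = 7500, 2000 × 3.75 = 7500, and 10000 × 0.3 = 3000, which sum to 21000 and divide by 1,000,000 to give 0.021 USD. The second call returns `null` because no rule matches the model id.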
@@ -2471,6 +2520,28 @@ async function runAgent(options) {
  }
  externalSignal?.removeEventListener("abort", onExternalAbort);
  }
+ if (result?.meta && typeof result.meta === "object") {
+ const usage = result.meta.usage;
+ if (usage && (usage.inputTokens != null || usage.outputTokens != null)) {
+ const overrides = (config.pricing ?? []).map((r) => ({
+ match: new RegExp(r.match, "i"),
+ pricing: {
+ input: r.input,
+ output: r.output,
+ cacheWrite: r.cacheWrite,
+ cacheRead: r.cacheRead
+ }
+ }));
+ usage.costUsd = computeCostUsd(usage, overrides);
+ if (usage.costUsd === null) {
+ onEvent({
+ type: "status",
+ timestamp: iso(),
+ text: `usage: no price for model ${usage.model ? `"${usage.model}"` : "(unknown)"} \u2192 costUsd=null (set config.pricing to override)`
+ });
+ }
+ }
+ }
  const status = detector.detect({ result, events, abortReason, error: runError }) ?? "failed";
  const endedAt = iso();
  const error = runError && !result?.error ? { message: runError.message, code: "ADAPTER_THREW" } : result?.error ?? (abortReason === "timeout" || abortReason === "idle" ? {
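The hunk above compiles each `config.pricing` entry with `new RegExp(r.match, "i")` inline, and the schema in this diff only checks that `match` is a non-empty string. A malformed pattern would therefore presumably throw at that point, although the surrounding code is not fully visible here. A defensive variant could look like the TypeScript sketch below; `compilePricingRules` and both interface names are hypothetical, and the rates in the example call are placeholders rather than real prices.

```typescript
// Shape of one config.pricing entry as validated by pricingRuleSchema.
interface PricingRuleConfig {
  match: string;
  input: number;
  output: number;
  cacheWrite?: number;
  cacheRead?: number;
}
interface CompiledRule {
  match: RegExp;
  pricing: { input: number; output: number; cacheWrite?: number; cacheRead?: number };
}

// Compile each rule case-insensitively; collect patterns that fail to parse
// instead of letting the SyntaxError escape.
function compilePricingRules(rules: PricingRuleConfig[]): { compiled: CompiledRule[]; invalid: string[] } {
  const compiled: CompiledRule[] = [];
  const invalid: string[] = [];
  for (const r of rules) {
    try {
      compiled.push({
        match: new RegExp(r.match, "i"),
        pricing: { input: r.input, output: r.output, cacheWrite: r.cacheWrite, cacheRead: r.cacheRead },
      });
    } catch {
      invalid.push(r.match);
    }
  }
  return { compiled, invalid };
}

const outcome = compilePricingRules([
  { match: "gpt-", input: 1, output: 2 }, // placeholder rates, not real prices
  { match: "(", input: 1, output: 1 }, // unterminated group: not a valid regex
]);
console.log(outcome.compiled.length, outcome.invalid);
```

Because the compiled overrides are placed ahead of the built-in table, a rule such as `gpt-` is also how a Codex run would get a non-null `costUsd`, per the note in the type declarations that OpenAI models are not in the default list.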
package/dist/index.d.ts CHANGED
@@ -265,9 +265,11 @@ interface AgentRunInput {
  * Resource usage for a run, exposed under `result.meta.usage`. Every field is
  * optional. Token counts come from the agent's own session transcript/rollout
  * JSONL (the AUTHORITATIVE, device-independent source — written on every run
- * regardless of TUI/status-line settings); `contextPercent` / `sessionCostUsd`
- * are best-effort extras scraped from the TUI status line when it is enabled.
- * `source` records which path produced the token figures.
+ * regardless of TUI/status-line settings); `contextPercent` /
+ * `subscriptionSessionCostUsd` are best-effort extras scraped from the TUI status
+ * line when it is enabled. `costUsd` is COMPUTED from the token counts × the
+ * model's list price (see core/pricing.ts). `source` records which path produced
+ * the token figures.
  */
  interface AgentUsage {
  /**
  /**
@@ -291,10 +293,23 @@ interface AgentUsage {
  reasoningTokens?: number;
  /** Total tokens (the agent's own total when given, else the sum of the above). */
  totalTokens?: number;
+ /**
+ * API-equivalent REFERENCE cost in USD, COMPUTED as Σ(tokens × the model's list
+ * price) — see core/pricing.ts. This is what the run WOULD bill via the API
+ * (or `claude -p` / the Agent SDK). agent-relay drives the agent's INTERACTIVE
+ * TUI, which on a Claude subscription is covered by the plan — so for those runs
+ * the real marginal cost is ~0 and this is a SHADOW cost, not the amount charged.
+ * `null` when the model's price is unknown (so it isn't confused with a real 0).
+ */
+ costUsd?: number | null;
  /** Context-window usage as a percent, when surfaced by the status line. */
  contextPercent?: number;
- /** Session cost (USD) when the agent reports it; reads 0 on subscription billing. */
- sessionCostUsd?: number;
+ /**
+ * The NOMINAL session cost the agent prints in its status line ("Session $X").
+ * On Team/Max subscription seats this reads 0 / is not the API price — use
+ * {@link costUsd} for an actual cost estimate.
+ */
+ subscriptionSessionCostUsd?: number;
  /** Raw status-line snippet the scraped extras came from (status-line only). */
  raw?: string;
  }
@@ -498,7 +513,9 @@ declare const defaultsSchema: z.ZodObject<{
  type RelayDefaults = z.infer<typeof defaultsSchema>;
  /** Resolve the effective approval mode (adapter override > defaults > legacy). */
  declare function resolveApprovalMode(defaults: RelayDefaults, adapter?: AdapterConfig): ApprovalMode;
- /** Resolve the effective sandbox level (adapter override > defaults > default). */
+ /** Resolve the effective sandbox level (adapter override > defaults > default).
+ * The project default is full bypass (`danger-full-access`) so unattended runs
+ * "just work"; tighten it per `defaults.sandbox` / `adapters.<name>.sandbox`. */
  declare function resolveSandbox(defaults: RelayDefaults, adapter?: AdapterConfig): SandboxLevel;
  declare const hooksSchema: z.ZodObject<{
  /** Shell command run just before the agent starts. */
@@ -674,6 +691,27 @@ declare const configSchema: z.ZodEffects<z.ZodObject<{
  apiKey?: string | undefined;
  maxTokens?: number | undefined;
  }>>;
+ /** Override the built-in per-model token pricing (USD per 1M tokens). */
+ pricing: z.ZodOptional<z.ZodArray<z.ZodObject<{
+ /** Regex (case-insensitive) matched against the model id; first match wins. */
+ match: z.ZodString;
+ input: z.ZodNumber;
+ output: z.ZodNumber;
+ cacheWrite: z.ZodOptional<z.ZodNumber>;
+ cacheRead: z.ZodOptional<z.ZodNumber>;
+ }, "strict", z.ZodTypeAny, {
+ match: string;
+ input: number;
+ output: number;
+ cacheWrite?: number | undefined;
+ cacheRead?: number | undefined;
+ }, {
+ match: string;
+ input: number;
+ output: number;
+ cacheWrite?: number | undefined;
+ cacheRead?: number | undefined;
+ }>, "many">>;
  /** Optional shell-command lifecycle hooks. */
  hooks: z.ZodOptional<z.ZodObject<{
  /** Shell command run just before the agent starts. */
@@ -723,6 +761,13 @@ declare const configSchema: z.ZodEffects<z.ZodObject<{
  apiKey?: string | undefined;
  maxTokens?: number | undefined;
  } | undefined;
+ pricing?: {
+ match: string;
+ input: number;
+ output: number;
+ cacheWrite?: number | undefined;
+ cacheRead?: number | undefined;
+ }[] | undefined;
  hooks?: {
  onStart?: string | undefined;
  onComplete?: string | undefined;
@@ -763,6 +808,13 @@ declare const configSchema: z.ZodEffects<z.ZodObject<{
  apiKey?: string | undefined;
  maxTokens?: number | undefined;
  } | undefined;
+ pricing?: {
+ match: string;
+ input: number;
+ output: number;
+ cacheWrite?: number | undefined;
+ cacheRead?: number | undefined;
+ }[] | undefined;
  hooks?: {
  onStart?: string | undefined;
  onComplete?: string | undefined;
@@ -803,6 +855,13 @@ declare const configSchema: z.ZodEffects<z.ZodObject<{
  apiKey?: string | undefined;
  maxTokens?: number | undefined;
  } | undefined;
+ pricing?: {
+ match: string;
+ input: number;
+ output: number;
+ cacheWrite?: number | undefined;
+ cacheRead?: number | undefined;
+ }[] | undefined;
  hooks?: {
  onStart?: string | undefined;
  onComplete?: string | undefined;
@@ -843,6 +902,13 @@ declare const configSchema: z.ZodEffects<z.ZodObject<{
  apiKey?: string | undefined;
  maxTokens?: number | undefined;
  } | undefined;
+ pricing?: {
+ match: string;
+ input: number;
+ output: number;
+ cacheWrite?: number | undefined;
+ cacheRead?: number | undefined;
+ }[] | undefined;
  hooks?: {
  onStart?: string | undefined;
  onComplete?: string | undefined;
@@ -893,6 +959,64 @@ declare class SessionNotFoundError extends AgentRelayError {
  constructor(sessionId: string);
  }
 
+ /**
+ * Per-model token pricing → an API-equivalent USD cost for a run's token usage.
+ *
+ * agent-relay reads exact token counts from the agent's transcript (see
+ * `claude-session.ts` / `codex-session.ts`), but the transcript carries no dollar
+ * figure, and the TUI "Session $" line is a NOMINAL/subscription number (often $0
+ * on Team/Max seats) — NOT the API-equivalent cost. So we compute
+ * `cost = Σ(tokens × list rate)` from a built-in price table (overridable via
+ * `config.pricing`). Rates are USD per 1M tokens and are LIST prices, so the
+ * result is an ESTIMATE — override the table when your actual billing differs.
+ *
+ * IMPORTANT: this is a REFERENCE (shadow) cost — what the run would bill via the
+ * API. agent-relay drives the agent's INTERACTIVE TUI, which on a Claude
+ * subscription is covered by the plan (the real marginal cost is ~0). Only the
+ * non-interactive paths (`claude -p` / the Agent SDK — and, from 2026-06-15, no
+ * longer counted against subscription limits) bill at these rates. So treat
+ * `costUsd` as "what this would cost via API", not necessarily what you're charged.
+ */
+ interface ModelPricing {
+ /** USD per 1M input (prompt) tokens. */
+ input: number;
+ /** USD per 1M output (completion) tokens. */
+ output: number;
+ /** USD per 1M cache-CREATION tokens (defaults to `input` when unset). */
+ cacheWrite?: number;
+ /** USD per 1M cache-READ tokens (defaults to 0 when unset). */
+ cacheRead?: number;
+ }
+ interface PricingRule {
+ /** Matched (case-insensitive) against the model id — first match wins. */
+ match: RegExp;
+ pricing: ModelPricing;
+ }
+ /**
+ * Built-in Anthropic list prices (USD / 1M tokens). These are estimates that
+ * drift as pricing changes — pass `config.pricing` to override per model. Codex
+ * (OpenAI) models aren't listed here, so a Codex run reports `costUsd: null`
+ * unless you add a rule.
+ */
+ declare const DEFAULT_PRICING: PricingRule[];
+ /** Resolve the price for a model id (overrides first, then the built-in table). */
+ declare function pricingForModel(model: string | undefined, overrides?: PricingRule[]): ModelPricing | undefined;
+ /** Token usage shape needed to price a run. */
+ interface CostUsageInput {
+ model?: string;
+ inputTokens?: number;
+ outputTokens?: number;
+ cacheCreationTokens?: number;
+ cachedInputTokens?: number;
+ }
+ /**
+ * API-equivalent USD cost for a run's token usage, or `null` when the model's
+ * price is unknown — so callers WARN instead of reporting a misleading `0`.
+ * Assumes Claude-style additive cache accounting (input excludes cached reads;
+ * cache reads are billed separately at the cheap `cacheRead` rate).
+ */
+ declare function computeCostUsd(usage: CostUsageInput, overrides?: PricingRule[]): number | null;
+
  /** Session metadata persistence: create, save, load, list. */
 
  interface CreateSessionInput {
@@ -1473,11 +1597,13 @@ declare class ClaudeInteractiveAdapter extends InteractivePtyAdapter {
 
  /**
  * Codex driven interactively in a PTY. The project's concept is PURE AUTONOMY:
- * by default Codex runs with `-a never` (never ask) within the chosen sandbox,
- * so it just works. The {@link Decider} still handles the prompts that appear
- * anyway (the directory-trust dialog, etc.). `approvalPolicy: "gated"` switches
- * Codex to `-a on-request` so the decider sees each action; `"readonly"` runs it
- * read-only. The prompt is a positional arg so the TUI starts immediately.
+ * by default Codex runs with `-a never` (never ask) and `-s danger-full-access`
+ * (full bypass the project default), so it just works unattended. The
+ * {@link Decider} still handles the prompts that appear anyway (the directory-
+ * trust dialog, etc.). `approvalPolicy: "gated"` switches Codex to `-a on-request`
+ * so the decider sees each action; `"readonly"` runs it read-only. Tighten the
+ * sandbox with `defaults.sandbox` / `adapters.codex.sandbox` (e.g. workspace-write).
+ * The prompt is a positional arg so the TUI starts immediately.
  */
 
  interface CodexInteractiveOptions {
@@ -1765,4 +1891,4 @@ declare function cleanTerminalText(input: string): string;
  /** Return the last `n` non-empty lines of cleaned text. */
  declare function tailLines(text: string, n: number): string[];
 
- export { type AbortReason, type AdapterAvailability, type AdapterConfig, type AdapterFactory, type AdapterListItem, type AdapterMode, AdapterRegistry, type AdapterRunContext, type AgentAdapter, type AgentAdapterDefinition, type AgentErrorInfo, type AgentEvent, type AgentEventType, AgentRelayError, type AgentRunInput, type AgentRunResult, type AgentSessionRef, type AgentUsage, AlwaysApproveDecider, ApiDecider, type ApprovalMode, BUILTIN_ADAPTER_DEFINITIONS, CONFIG_FILENAME, ClaudeInteractiveAdapter, CodexInteractiveAdapter, CommandDecider, type CommandPreview, type CompletionContext, type CompletionDetector, CompositeCompletionDetector, ConfigError, type CreateSessionInput, DEFAULT_DENY_PATTERNS, type Decider, type DeciderConfig, type DeciderConfigSchema, type DeciderFlags, type DecisionAction, DefaultCompletionDetector, DefaultKeymap, type DetectedPrompt, type DoctorReport, type FakeAdapterOptions, FakeAgentAdapter, FunctionDecider, type HooksConfig, type InitResult, type InteractionDecision, type InteractionKind, type InteractionRequest, type InteractiveAdapterConfig, InteractivePtyAdapter, OutputPatternDetector, PromptDetector, type PromptDetectorOptions, type PruneOptions, type PruneResult, type PtyKeymap, type PtySessionOptions, type RelayConfig, type RelayDefaults, type ResumeCommandResult, RuleDecider, type RunHooks, RunLogger, type RunLoggerOptions, type RunOutcome, type RunnerOptions, type SandboxLevel, type SessionListItem, SessionManager, type SessionMetadata, SessionNotFoundError, type SessionStatus, type ShellHookContext, type ShellHooks, UnknownAdapterError, adapterConfigSchema, approvalPolicySchema, cleanTerminalText, configPath, configSchema, createAdapterFactory, createDecider, createDefaultConfig, deciderConfigFromFlags, deciderSchema, defaultRegistry, defaultsSchema, hooksSchema, listAdapters, listSessions, loadConfig, loadConfigOrDefault, parseCheckbox, parseConfig, parseDecisionReply, pruneSessions, renderDecisionPrompt, resolveApprovalMode, 
resolvePrompt, resolveSandbox, resumeCommand, runAgent, runCommand, runDoctor, runInit, runPtySession, runShellHook, sandboxSchema, saveConfig, stringifyConfig, stripAnsi, tailLines };
+ export { type AbortReason, type AdapterAvailability, type AdapterConfig, type AdapterFactory, type AdapterListItem, type AdapterMode, AdapterRegistry, type AdapterRunContext, type AgentAdapter, type AgentAdapterDefinition, type AgentErrorInfo, type AgentEvent, type AgentEventType, AgentRelayError, type AgentRunInput, type AgentRunResult, type AgentSessionRef, type AgentUsage, AlwaysApproveDecider, ApiDecider, type ApprovalMode, BUILTIN_ADAPTER_DEFINITIONS, CONFIG_FILENAME, ClaudeInteractiveAdapter, CodexInteractiveAdapter, CommandDecider, type CommandPreview, type CompletionContext, type CompletionDetector, CompositeCompletionDetector, ConfigError, type CreateSessionInput, DEFAULT_DENY_PATTERNS, DEFAULT_PRICING, type Decider, type DeciderConfig, type DeciderConfigSchema, type DeciderFlags, type DecisionAction, DefaultCompletionDetector, DefaultKeymap, type DetectedPrompt, type DoctorReport, type FakeAdapterOptions, FakeAgentAdapter, FunctionDecider, type HooksConfig, type InitResult, type InteractionDecision, type InteractionKind, type InteractionRequest, type InteractiveAdapterConfig, InteractivePtyAdapter, type ModelPricing, OutputPatternDetector, type PricingRule, PromptDetector, type PromptDetectorOptions, type PruneOptions, type PruneResult, type PtyKeymap, type PtySessionOptions, type RelayConfig, type RelayDefaults, type ResumeCommandResult, RuleDecider, type RunHooks, RunLogger, type RunLoggerOptions, type RunOutcome, type RunnerOptions, type SandboxLevel, type SessionListItem, SessionManager, type SessionMetadata, SessionNotFoundError, type SessionStatus, type ShellHookContext, type ShellHooks, UnknownAdapterError, adapterConfigSchema, approvalPolicySchema, cleanTerminalText, computeCostUsd, configPath, configSchema, createAdapterFactory, createDecider, createDefaultConfig, deciderConfigFromFlags, deciderSchema, defaultRegistry, defaultsSchema, hooksSchema, listAdapters, listSessions, loadConfig, loadConfigOrDefault, parseCheckbox, parseConfig, 
parseDecisionReply, pricingForModel, pruneSessions, renderDecisionPrompt, resolveApprovalMode, resolvePrompt, resolveSandbox, resumeCommand, runAgent, runCommand, runDoctor, runInit, runPtySession, runShellHook, sandboxSchema, saveConfig, stringifyConfig, stripAnsi, tailLines };
package/dist/index.js CHANGED
@@ -92,7 +92,7 @@ function resolveApprovalMode(defaults, adapter) {
  return adapter?.approvalPolicy ?? defaults.approvalPolicy ?? (defaults.requireApprovalOnRiskyActions ? "gated" : "auto");
  }
  function resolveSandbox(defaults, adapter) {
- return adapter?.sandbox ?? defaults.sandbox ?? "workspace-write";
+ return adapter?.sandbox ?? defaults.sandbox ?? "danger-full-access";
  }
  var hooksSchema = z.object({
  /** Shell command run just before the agent starts. */
@@ -119,6 +119,14 @@ var deciderSchema = z.object({
  apiKey: z.string().optional(),
  maxTokens: z.number().int().positive().optional()
  }).strict();
+ var pricingRuleSchema = z.object({
+ /** Regex (case-insensitive) matched against the model id; first match wins. */
+ match: z.string().min(1),
+ input: z.number().nonnegative(),
+ output: z.number().nonnegative(),
+ cacheWrite: z.number().nonnegative().optional(),
+ cacheRead: z.number().nonnegative().optional()
+ }).strict();
  var configSchema = z.object({
  defaultAdapter: z.string().min(1),
  sessionsDir: z.string().min(1).default(".agent-relay/sessions"),
@@ -127,6 +135,8 @@ var configSchema = z.object({
  adapters: z.record(adapterConfigSchema),
  /** Which decider answers interactive prompts. */
  decider: deciderSchema.optional(),
+ /** Override the built-in per-model token pricing (USD per 1M tokens). */
+ pricing: z.array(pricingRuleSchema).optional(),
  /** Optional shell-command lifecycle hooks. */
  hooks: hooksSchema.optional()
  }).strict().superRefine((cfg, ctx) => {
@@ -147,9 +157,10 @@ function createDefaultConfig() {
  maxTurns: 20,
  timeoutMs: 18e5,
  idleTimeoutMs: 3e5,
- // Autonomous by default: agents run unattended without asking.
+ // Autonomous by default: agents run unattended without asking, with full
+ // bypass so they can act freely (tighten `sandbox` for guarded runs).
  approvalPolicy: "auto",
- sandbox: "workspace-write",
+ sandbox: "danger-full-access",
  requireApprovalOnRiskyActions: false
  },
  // Interactive (PTY) is the default mode: the agent runs in its real TUI and
@@ -231,6 +242,27 @@ async function saveConfig(rootDir, config) {
  await promises.writeFile(file, stringifyConfig(config), "utf8");
  return file;
  }
+
+ // src/core/pricing.ts
+ var DEFAULT_PRICING = [
+ { match: /opus/i, pricing: { input: 15, output: 75, cacheWrite: 18.75, cacheRead: 1.5 } },
+ { match: /sonnet/i, pricing: { input: 3, output: 15, cacheWrite: 3.75, cacheRead: 0.3 } },
+ { match: /haiku/i, pricing: { input: 1, output: 5, cacheWrite: 1.25, cacheRead: 0.1 } }
+ ];
+ function pricingForModel(model, overrides = []) {
+ if (!model) return void 0;
+ for (const rule of [...overrides, ...DEFAULT_PRICING]) {
+ if (rule.match.test(model)) return rule.pricing;
+ }
+ return void 0;
+ }
+ function computeCostUsd(usage, overrides = []) {
+ const p = pricingForModel(usage.model, overrides);
+ if (!p) return null;
+ const M = 1e6;
+ const cost = ((usage.inputTokens ?? 0) * p.input + (usage.outputTokens ?? 0) * p.output + (usage.cacheCreationTokens ?? 0) * (p.cacheWrite ?? p.input) + (usage.cachedInputTokens ?? 0) * (p.cacheRead ?? 0)) / M;
+ return Math.round(cost * 1e6) / 1e6;
+ }
  var SessionManager = class {
  constructor(sessionsDir) {
  this.sessionsDir = sessionsDir;
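The pricing hunk above prices tokens in USD per 1M tokens, tries override rules before the built-in table, and returns `null` when no rule matches. A worked sketch of that arithmetic follows; the two functions are copied from the hunk so the example runs on its own, while the model ids and token counts are hypothetical values picked to keep the numbers round.

```javascript
// Copied from the hunk above (indentation added) so the example is self-contained.
const DEFAULT_PRICING = [
  { match: /opus/i, pricing: { input: 15, output: 75, cacheWrite: 18.75, cacheRead: 1.5 } },
  { match: /sonnet/i, pricing: { input: 3, output: 15, cacheWrite: 3.75, cacheRead: 0.3 } },
  { match: /haiku/i, pricing: { input: 1, output: 5, cacheWrite: 1.25, cacheRead: 0.1 } },
];
function pricingForModel(model, overrides = []) {
  if (!model) return undefined;
  for (const rule of [...overrides, ...DEFAULT_PRICING]) {
    if (rule.match.test(model)) return rule.pricing; // first match wins
  }
  return undefined;
}
function computeCostUsd(usage, overrides = []) {
  const p = pricingForModel(usage.model, overrides);
  if (!p) return null;
  const cost =
    ((usage.inputTokens ?? 0) * p.input +
      (usage.outputTokens ?? 0) * p.output +
      (usage.cacheCreationTokens ?? 0) * (p.cacheWrite ?? p.input) +
      (usage.cachedInputTokens ?? 0) * (p.cacheRead ?? 0)) / 1e6;
  return Math.round(cost * 1e6) / 1e6; // round to 6 decimal places
}

// Hypothetical usage record; the model id only needs to contain "opus".
const usage = { model: "claude-opus-example", inputTokens: 1_000_000, outputTokens: 100_000 };
const builtIn = computeCostUsd(usage); // 1M * $15/M + 0.1M * $75/M = 22.5

// Override without cache prices: cacheWrite falls back to `input`, cacheRead to 0.
const override = [{ match: /opus/i, pricing: { input: 10, output: 50 } }];
const overridden = computeCostUsd(
  { ...usage, cacheCreationTokens: 200_000, cachedInputTokens: 1_000_000 },
  override,
); // 10 + 5 + 2 + 0 = 17

// A model id that matches no rule yields null rather than 0.
const unknown = computeCostUsd({ model: "some-unlisted-model", inputTokens: 10 });

console.log(builtIn, overridden, unknown);
```

Returning `null` instead of `0` lets a caller distinguish "no price known" from "free", which appears to be why the run loop emits a status event in that case.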
@@ -1031,6 +1063,28 @@ async function runAgent(options) {
  }
  externalSignal?.removeEventListener("abort", onExternalAbort);
  }
+ if (result?.meta && typeof result.meta === "object") {
+ const usage = result.meta.usage;
+ if (usage && (usage.inputTokens != null || usage.outputTokens != null)) {
+ const overrides = (config.pricing ?? []).map((r) => ({
+ match: new RegExp(r.match, "i"),
+ pricing: {
+ input: r.input,
+ output: r.output,
+ cacheWrite: r.cacheWrite,
+ cacheRead: r.cacheRead
+ }
+ }));
+ usage.costUsd = computeCostUsd(usage, overrides);
+ if (usage.costUsd === null) {
+ onEvent({
+ type: "status",
+ timestamp: iso(),
+ text: `usage: no price for model ${usage.model ? `"${usage.model}"` : "(unknown)"} \u2192 costUsd=null (set config.pricing to override)`
+ });
+ }
+ }
+ }
  const status = detector.detect({ result, events, abortReason, error: runError }) ?? "failed";
  const endedAt = iso();
  const error = runError && !result?.error ? { message: runError.message, code: "ADAPTER_THREW" } : result?.error ?? (abortReason === "timeout" || abortReason === "idle" ? {
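The hunk above compiles each `config.pricing` rule's `match` string into a case-insensitive `RegExp` before passing the rules to `computeCostUsd`. A minimal sketch of that conversion, assuming a hypothetical helper name `toPricingOverrides`; the `^gpt-5` pattern and its prices are placeholders, not real rates.

```javascript
// Hypothetical helper mirroring the `.map(...)` in the hunk above.
function toPricingOverrides(rules = []) {
  return rules.map((r) => ({
    match: new RegExp(r.match, "i"), // the "i" flag makes matching case-insensitive
    pricing: {
      input: r.input,
      output: r.output,
      cacheWrite: r.cacheWrite,
      cacheRead: r.cacheRead,
    },
  }));
}

// Placeholder rule in the shape `pricingRuleSchema` accepts; the numbers are made up.
const overrides = toPricingOverrides([{ match: "^gpt-5", input: 1, output: 8 }]);

const matchesUpperCase = overrides[0].match.test("GPT-5-codex");
const cacheRead = overrides[0].pricing.cacheRead; // optional fields stay undefined

console.log(matchesUpperCase, cacheRead);
```

Since the schema shown only requires `match` to be a non-empty string, a syntactically invalid pattern would presumably not be caught at config-parse time and would instead throw when `new RegExp` runs.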
@@ -1557,6 +1611,7 @@ function runPtySession(opts, ctx) {
  const triggerQuit = (reason) => {
  if (finished || quitting) return;
  quitting = true;
+ settled = true;
  if (completionTimer) clearTimeout(completionTimer);
  completionTimer = void 0;
  emit("status", `task appears complete (${reason})`);
@@ -1776,7 +1831,14 @@ function runPtySession(opts, ctx) {
  exitCode,
  error,
  sessionRef: void 0,
- meta: { interactions, settled, ...usage ? { usage } : {} }
+ // `completedCleanly` is an unambiguous success flag (exit 0, not aborted/
+ // error) — use it instead of `settled` for "did the run finish well?".
+ meta: {
+ interactions,
+ settled,
+ completedCleanly: success,
+ ...usage ? { usage } : {}
+ }
  });
  });
  if (ctx.signal.aborted) onAbort();
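The comment added in the hunk above recommends `completedCleanly` over `settled` when asking whether a run finished well. A rough consumer-side sketch, assuming a hypothetical helper `describeRun` and made-up result objects; only the `meta` field names come from the diff, and the labels returned here are illustrative.

```javascript
// Hypothetical helper; the result shape beyond `meta` is not taken from the package.
function describeRun(result) {
  const meta = result?.meta ?? {};
  // Prefer the explicit success flag introduced in 0.1.3.
  if (meta.completedCleanly === true) return "clean";
  // `settled` alone does not say whether the run succeeded.
  return meta.settled ? "settled, not clean" : "not clean";
}

const clean = describeRun({ meta: { interactions: 2, settled: true, completedCleanly: true } });
const settledOnly = describeRun({ meta: { interactions: 0, settled: true, completedCleanly: false } });
const missingMeta = describeRun({});

console.log(clean, settledOnly, missingMeta);
```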
@@ -1876,8 +1938,17 @@ var InteractivePtyAdapter = class {
  else args.push(input.prompt);
  return this.spawn(args, initialInput, input, ctx);
  }
- spawn(args, initialInput, input, ctx, extra) {
+ async spawn(args, initialInput, input, ctx, extra) {
  const decider = ctx.decider ?? new RuleDecider();
+ try {
+ const resolved = await which(this.cfg.command);
+ ctx.onEvent({
+ type: "status",
+ timestamp: (this.cfg.now ?? (() => /* @__PURE__ */ new Date()))().toISOString(),
+ text: resolved ? `resolved ${this.cfg.command} \u2192 ${resolved}` : `${this.cfg.command} not found on PATH; relying on spawn-time resolution`
+ });
+ } catch {
+ }
  return runPtySession(
  {
  command: this.cfg.command,
@@ -2010,8 +2081,8 @@ function scrapeClaudeStatusLine(text) {
  u.raw = ctx[0].trim().replace(/\s+/g, " ");
  }
  const cost = text.match(/session\s*\$\s*([\d.]+)/i);
- if (cost) u.sessionCostUsd = Number(cost[1]);
- return u.contextPercent !== void 0 || u.sessionCostUsd !== void 0 ? u : void 0;
+ if (cost) u.subscriptionSessionCostUsd = Number(cost[1]);
+ return u.contextPercent !== void 0 || u.subscriptionSessionCostUsd !== void 0 ? u : void 0;
  }
  var DEFINITION2 = {
  name: "claude",
@@ -2257,7 +2328,7 @@ var CodexInteractiveAdapter = class _CodexInteractiveAdapter extends Interactive
  now: opts.now,
  installHint: "Install the Codex CLI (`npm i -g @openai/codex`), run `codex login`, and ensure `codex` is on your PATH.",
  preArgs: (input) => {
- const sandbox = input.approvalMode === "readonly" ? "read-only" : input.sandbox ?? "workspace-write";
+ const sandbox = input.approvalMode === "readonly" ? "read-only" : input.sandbox ?? "danger-full-access";
  const approval = input.approvalMode === "gated" ? "on-request" : "never";
  return ["-s", sandbox, "-a", approval];
  },
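The `preArgs` change above determines which `-s` and `-a` values the Codex adapter passes for a given input. A small sketch of the resulting mapping, with the body copied from the hunk into a hypothetical standalone function `codexPreArgs`; the input objects are illustrative.

```javascript
// Body copied from the `preArgs` callback above; the function name is hypothetical.
function codexPreArgs(input) {
  const sandbox =
    input.approvalMode === "readonly" ? "read-only" : input.sandbox ?? "danger-full-access";
  const approval = input.approvalMode === "gated" ? "on-request" : "never";
  return ["-s", sandbox, "-a", approval];
}

// No sandbox given: 0.1.3 falls back to full access and never asks.
const byDefault = codexPreArgs({});
// "readonly" forces read-only even when a looser sandbox is supplied.
const readonly = codexPreArgs({ approvalMode: "readonly", sandbox: "workspace-write" });
// "gated" keeps the supplied sandbox and switches approvals to on-request.
const gated = codexPreArgs({ approvalMode: "gated", sandbox: "workspace-write" });

console.log(byDefault.join(" "));
console.log(readonly.join(" "));
console.log(gated.join(" "));
```

Note that only the fallback changed between versions; an explicit `input.sandbox` behaves the same in 0.1.2 and 0.1.3.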
@@ -2742,4 +2813,4 @@ async function pruneSessions(opts) {
  };
  }
 
- export { AdapterRegistry, AgentRelayError, AlwaysApproveDecider, ApiDecider, BUILTIN_ADAPTER_DEFINITIONS, CONFIG_FILENAME, ClaudeInteractiveAdapter, CodexInteractiveAdapter, CommandDecider, CompositeCompletionDetector, ConfigError, DEFAULT_DENY_PATTERNS, DefaultCompletionDetector, DefaultKeymap, FakeAgentAdapter, FunctionDecider, InteractivePtyAdapter, OutputPatternDetector, PromptDetector, RuleDecider, RunLogger, SessionManager, SessionNotFoundError, UnknownAdapterError, adapterConfigSchema, approvalPolicySchema, cleanTerminalText, configPath, configSchema, createAdapterFactory, createDecider, createDefaultConfig, deciderConfigFromFlags, deciderSchema, defaultRegistry, defaultsSchema, hooksSchema, listAdapters, listSessions, loadConfig, loadConfigOrDefault, parseCheckbox, parseConfig, parseDecisionReply, pruneSessions, renderDecisionPrompt, resolveApprovalMode, resolvePrompt, resolveSandbox, resumeCommand, runAgent, runCommand, runDoctor, runInit, runPtySession, runShellHook, sandboxSchema, saveConfig, stringifyConfig, stripAnsi, tailLines };
+ export { AdapterRegistry, AgentRelayError, AlwaysApproveDecider, ApiDecider, BUILTIN_ADAPTER_DEFINITIONS, CONFIG_FILENAME, ClaudeInteractiveAdapter, CodexInteractiveAdapter, CommandDecider, CompositeCompletionDetector, ConfigError, DEFAULT_DENY_PATTERNS, DEFAULT_PRICING, DefaultCompletionDetector, DefaultKeymap, FakeAgentAdapter, FunctionDecider, InteractivePtyAdapter, OutputPatternDetector, PromptDetector, RuleDecider, RunLogger, SessionManager, SessionNotFoundError, UnknownAdapterError, adapterConfigSchema, approvalPolicySchema, cleanTerminalText, computeCostUsd, configPath, configSchema, createAdapterFactory, createDecider, createDefaultConfig, deciderConfigFromFlags, deciderSchema, defaultRegistry, defaultsSchema, hooksSchema, listAdapters, listSessions, loadConfig, loadConfigOrDefault, parseCheckbox, parseConfig, parseDecisionReply, pricingForModel, pruneSessions, renderDecisionPrompt, resolveApprovalMode, resolvePrompt, resolveSandbox, resumeCommand, runAgent, runCommand, runDoctor, runInit, runPtySession, runShellHook, sandboxSchema, saveConfig, stringifyConfig, stripAnsi, tailLines };
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@yul-labs/agent-relay",
- "version": "0.1.2",
+ "version": "0.1.3",
  "description": "Vendor-agnostic LLM coding agent session orchestrator (Claude, Codex, ...)",
  "type": "module",
  "packageManager": "pnpm@10.25.0",