@ljoukov/llm 2.1.0 → 3.0.0

package/README.md CHANGED
@@ -9,7 +9,8 @@ Unified TypeScript wrapper over:
 
  - **OpenAI Responses API** (`openai`)
  - **Google Gemini via Vertex AI** (`@google/genai`)
- - **ChatGPT subscription models** via `chatgpt-*` model ids (requires `CHATGPT_AUTH_JSON_B64`)
+ - **Fireworks chat-completions models** (`kimi-k2.5`, `glm-5`, `minimax-m2.1`)
+ - **ChatGPT subscription models** via `chatgpt-*` model ids (reuses Codex auth store, or a token provider)
 
  Designed around a single streaming API that yields:
 
@@ -69,12 +70,27 @@ If deploying to Cloudflare Workers/Pages:
  jq -c . < path/to/service-account.json | wrangler secret put GOOGLE_SERVICE_ACCOUNT_JSON
  ```
 
+ ### Fireworks
+
+ - `FIREWORKS_TOKEN` (or `FIREWORKS_API_KEY`)
+
  ### ChatGPT subscription models
 
- - `CHATGPT_AUTH_JSON_B64`
+ By default, `chatgpt-*` models reuse the ChatGPT OAuth tokens stored by the Codex CLI:
+
+ - `${CODEX_HOME:-~/.codex}/auth.json`
+
+ If you deploy to multiple environments (Vercel, GCP, local dev, etc.), use a centralized HTTPS token provider that owns
+ refresh-token rotation and serves short-lived access tokens.
 
- This is a base64url-encoded JSON blob containing the ChatGPT OAuth tokens + account id (RFC 4648):
- https://www.rfc-editor.org/rfc/rfc4648
+ - `CHATGPT_AUTH_TOKEN_PROVIDER_URL` (example: `https://chatgpt-auth.<your-domain>`)
+ - `CHATGPT_AUTH_API_KEY` (shared secret; sent as `Authorization: Bearer ...` and `x-chatgpt-auth: ...`)
+ - `CHATGPT_AUTH_TOKEN_PROVIDER_STORE` (`kv` or `d1`, defaults to `kv`)
+
+ This repo includes a Cloudflare Workers token provider implementation in `workers/chatgpt-auth/`.
+
+ If `CHATGPT_AUTH_TOKEN_PROVIDER_URL` + `CHATGPT_AUTH_API_KEY` are set, `chatgpt-*` models will fetch tokens from the
+ token provider and will not read the local Codex auth store.
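Reviewer note: the precedence described in the added lines above (token provider when both variables are set, Codex auth store otherwise) can be sketched as a small pure function. This is an illustrative sketch under stated assumptions, not the package's implementation. `resolveChatGptAuthSource` and `ChatGptAuthSource` are hypothetical names; treating empty strings as unset, sending the raw key as the `x-chatgpt-auth` value, and falling back to `kv` for unrecognized store values are all assumptions. Only the environment variable names, the two header names, the `kv` default, and the `${CODEX_HOME:-~/.codex}/auth.json` path come from the README text.

```typescript
// Hypothetical sketch of the auth-source selection described in the README diff.
// None of these names are exported by @ljoukov/llm.

type Env = Record<string, string | undefined>;

type ChatGptAuthSource =
  | {
      kind: "token-provider";
      url: string;
      store: "kv" | "d1";
      // The two headers the README says carry the shared secret.
      headers: { Authorization: string; "x-chatgpt-auth": string };
    }
  | { kind: "codex-auth-store"; path: string };

function resolveChatGptAuthSource(env: Env): ChatGptAuthSource {
  const url = env.CHATGPT_AUTH_TOKEN_PROVIDER_URL;
  const apiKey = env.CHATGPT_AUTH_API_KEY;

  // Both variables must be set for the token provider to take over.
  // Assumption: empty strings count as unset.
  if (url && apiKey) {
    // Assumption: anything other than "d1" falls back to the documented default, "kv".
    const store = env.CHATGPT_AUTH_TOKEN_PROVIDER_STORE === "d1" ? "d1" : "kv";
    return {
      kind: "token-provider",
      url,
      store,
      // Assumption: x-chatgpt-auth carries the raw shared secret.
      headers: { Authorization: `Bearer ${apiKey}`, "x-chatgpt-auth": apiKey },
    };
  }

  // Otherwise use the Codex CLI auth store: ${CODEX_HOME:-~/.codex}/auth.json.
  // Shell `:-` semantics: an unset or empty CODEX_HOME selects the default.
  const codexHome = env.CODEX_HOME ? env.CODEX_HOME : "~/.codex";
  return { kind: "codex-auth-store", path: `${codexHome}/auth.json` };
}
```

Passing the environment in as a plain record (rather than reading `process.env` directly) keeps the sketch testable and runtime-neutral, which matters if the same logic has to run on Cloudflare Workers as well as Node.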
 
  ## Usage
 
@@ -245,6 +261,21 @@ const result = await generateText({
  console.log(result.text);
  ```
 
+ ### Fireworks
+
+ Use Fireworks model ids directly (for example `kimi-k2.5`, `glm-5`, `minimax-m2.1`):
+
+ ```ts
+ import { generateText } from "@ljoukov/llm";
+
+ const result = await generateText({
+   model: "kimi-k2.5",
+   input: "Return exactly: OK",
+ });
+
+ console.log(result.text);
+ ```
+
  ### ChatGPT subscription models
 
  Use a `chatgpt-` prefix:
@@ -348,14 +379,21 @@ const { value } = await generateJson({
 
  ## Tools
 
- This library supports two kinds of tools:
+ There are three tool-enabled call patterns:
 
- - Model tools (server-side): `web-search` and `code-execution`
- - Your tools (JS/TS code): use `runToolLoop()` and `tool()`
+ 1. `generateText()` for provider-native/server-side tools (for example web search).
+ 2. `runToolLoop()` for your runtime JS/TS tools (function tools executed in your process).
+ 3. `runAgentLoop()` for filesystem tasks (a convenience wrapper around `runToolLoop()`).
 
- ### Model tools (web search / code execution)
+ Architecture note:
 
- These tools run on the provider side.
+ - Filesystem tools are not a separate execution system.
+ - `runAgentLoop()` constructs a filesystem toolset, merges your optional custom tools, then calls the same `runToolLoop()` engine.
+ - This behavior is model-agnostic at API level; profile selection only adapts tool shape for model compatibility.
+
+ ### Provider-Native Tools (`generateText()`)
+
+ Use this when the model provider executes the tool remotely (for example search/code-exec style tools).
 
  ```ts
  import { generateText } from "@ljoukov/llm";
@@ -369,9 +407,9 @@ const result = await generateText({
  console.log(result.text);
  ```
 
- ### Your tools (function calling)
+ ### Runtime Tools (`runToolLoop()`)
 
- `runToolLoop()` runs a simple function-calling loop until the model returns a final answer or the step limit is hit.
+ Use this when the model should call your local runtime functions.
 
  ```ts
  import { runToolLoop, tool } from "@ljoukov/llm";
@@ -392,47 +430,24 @@ const result = await runToolLoop({
  console.log(result.text);
  ```
 
- ### Built-in `apply_patch` tool
+ Use `customTool()` only when you need freeform/non-JSON tool input grammar.
 
- The library includes a Codex-style `apply_patch` tool with a pluggable filesystem adapter.
+ ### Filesystem Tasks (`runAgentLoop()`)
 
- ```ts
- import {
-   createApplyPatchTool,
-   createInMemoryAgentFilesystem,
-   runToolLoop,
- } from "@ljoukov/llm";
+ Use this for read/search/write tasks in a workspace. The library auto-selects filesystem tool profile by model when `profile: "auto"`:
 
- const fs = createInMemoryAgentFilesystem({
-   "/repo/index.ts": "export const value = 1;\n",
- });
+ - Codex-like models: Codex-compatible filesystem tool shape.
+ - Gemini models: Gemini-compatible filesystem tool shape.
+ - Other models: model-agnostic profile (currently Gemini-style).
 
- const result = await runToolLoop({
-   model: "chatgpt-gpt-5.3-codex",
-   input: "Use apply_patch to change value from 1 to 2.",
-   tools: {
-     apply_patch: createApplyPatchTool({
-       cwd: "/repo",
-       fs,
-       checkAccess: ({ path }) => {
-         if (!path.startsWith("/repo/")) {
-           throw new Error("Writes are allowed only inside /repo");
-         }
-       },
-     }),
-   },
- });
-
- console.log(result.text);
- ```
+ Confinement/policy is set through `filesystemTool.options`:
 
- ### `runAgentLoop()` with model-aware filesystem tools
+ - `cwd`: workspace root for path resolution.
+ - `fs`: backend (`createNodeAgentFilesystem()` or `createInMemoryAgentFilesystem()`).
+ - `checkAccess`: hook for allow/deny policy + audit.
+ - `allowOutsideCwd`: opt-out confinement (default is false).
 
- Use `runAgentLoop()` when you want a default filesystem toolset chosen by model:
-
- - Codex-like models -> `apply_patch`, `read_file`, `list_dir`, `grep_files`
- - Gemini models -> `read_file`, `write_file`, `replace`, `list_directory`, `grep_search`, `glob`
- - Other models -> model-agnostic (Gemini-style) set by default
+ Detailed reference: `docs/agent-filesystem-tools.md`.
 
  ```ts
  import { createInMemoryAgentFilesystem, runAgentLoop } from "@ljoukov/llm";
@@ -442,7 +457,7 @@ const fs = createInMemoryAgentFilesystem({
  });
 
  const result = await runAgentLoop({
-   model: "chatgpt-gpt-5.3-codex",
+   model: "chatgpt-gpt-5.3-codex-spark",
    input: "Change value from 1 to 2 using filesystem tools.",
    filesystemTool: {
      profile: "auto",
@@ -456,14 +471,42 @@ const result = await runAgentLoop({
  console.log(result.text);
  ```
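Reviewer note: the new README text describes `checkAccess` as a hook for allow/deny policy plus audit, but 3.0.0 drops the only inline example of one. A minimal sketch of such a hook follows. It assumes the hook receives `{ path }` with an absolute POSIX-style path and denies by throwing, as in the removed 2.1.0 example; whether 3.0.0 keeps that exact shape is not confirmed by this diff. `normalizeAbsolute` and `createConfinedCheckAccess` are hypothetical helpers, not package exports. The sketch normalizes `.` and `..` segments first, because a bare `startsWith("/repo/")` test would accept `/repo/../etc/passwd`.

```typescript
// Hypothetical helpers; not part of @ljoukov/llm.

// Collapse "." and ".." segments in an absolute POSIX-style path.
function normalizeAbsolute(path: string): string {
  const out: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      out.pop(); // popping at the root is a no-op, as in POSIX path resolution
    } else {
      out.push(segment);
    }
  }
  return "/" + out.join("/");
}

// Build a checkAccess-style hook that denies paths outside `root`
// and appends every decision to `audit`.
// Assumption: the hook is called with `{ path }` and signals denial by throwing.
function createConfinedCheckAccess(root: string, audit: string[]) {
  const normalizedRoot = normalizeAbsolute(root);
  const prefix = normalizedRoot === "/" ? "/" : normalizedRoot + "/";
  return ({ path }: { path: string }): void => {
    const resolved = normalizeAbsolute(path);
    const allowed = resolved === normalizedRoot || resolved.startsWith(prefix);
    audit.push(`${allowed ? "allow" : "deny"} ${resolved}`);
    if (!allowed) {
      throw new Error(`Access outside ${normalizedRoot} is not allowed: ${resolved}`);
    }
  };
}

const audit: string[] = [];
const checkAccess = createConfinedCheckAccess("/repo", audit);
checkAccess({ path: "/repo/src/index.ts" }); // inside the workspace, so this does not throw
```

If the hook shape matches, a function like this could be supplied as `checkAccess` in `filesystemTool.options` alongside `cwd: "/repo"`. Since `allowOutsideCwd` defaults to false, the library may already enforce confinement on its own; the sketch mainly illustrates the audit side and a stricter path check.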
 
- ## Agent benchmark (micro)
+ If you need exact control over tool definitions, build the filesystem toolset yourself and call `runToolLoop()` directly.
 
- For small edit-harness experiments with `chatgpt-gpt-5.3-codex`:
+ ```ts
+ import {
+   createFilesystemToolSetForModel,
+   createInMemoryAgentFilesystem,
+   runToolLoop,
+ } from "@ljoukov/llm";
+
+ const fs = createInMemoryAgentFilesystem({ "/repo/a.ts": "export const n = 1;\n" });
+ const tools = createFilesystemToolSetForModel("chatgpt-gpt-5.3-codex-spark", {
+   cwd: "/repo",
+   fs,
+ });
+
+ const result = await runToolLoop({
+   model: "chatgpt-gpt-5.3-codex-spark",
+   input: "Update n to 2.",
+   tools,
+ });
+ ```
+
+ ## Agent benchmark (filesystem extraction)
+
+ For filesystem extraction/summarization evaluation across Codex, Fireworks, and Gemini models:
 
  ```bash
  npm run bench:agent
  ```
 
+ Standard full refresh (all tasks, auto-write `LATEST_RESULTS.md`, refresh `traces/latest`, prune old traces):
+
+ ```bash
+ npm run bench:agent:latest
+ ```
+
  Estimate-only:
 
  ```bash