@ljoukov/llm 5.0.4 → 7.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +68 -38
- package/dist/index.cjs +836 -350
- package/dist/index.cjs.map +1 -1
- package/dist/index.d.cts +12 -9
- package/dist/index.d.ts +12 -9
- package/dist/index.js +841 -353
- package/dist/index.js.map +1 -1
- package/package.json +2 -1
package/README.md CHANGED
@@ -46,14 +46,20 @@ Use one backend:
 
 - `GEMINI_API_KEY` or `GOOGLE_API_KEY` for the Gemini Developer API
 - `GOOGLE_SERVICE_ACCOUNT_JSON` for Vertex AI (the contents of a service account JSON key file, not a file path)
+- `LLM_FILES_GCS_BUCKET` for canonical file storage used by `files.create()` and automatic large-attachment offload
+- `LLM_FILES_GCS_PREFIX` (optional object-name prefix inside `LLM_FILES_GCS_BUCKET`)
 - `VERTEX_GCS_BUCKET` for Vertex-backed Gemini file attachments / `file_id` inputs
 - `VERTEX_GCS_PREFIX` (optional object-name prefix inside `VERTEX_GCS_BUCKET`)
 
 If a Gemini API key is present, the library uses the Gemini Developer API. Otherwise it falls back to Vertex AI.
 
-
-
-
+Canonical files are stored in GCS with a default `48h` TTL. OpenAI and ChatGPT consume those files via signed HTTPS
+URLs. Gemini still mirrors canonical files lazily into provider-native storage when needed:
+
+- Gemini Developer API mirrors into Gemini Files
+- Vertex-backed Gemini mirrors into `VERTEX_GCS_BUCKET` and uses `gs://...` URIs
+
+Configure lifecycle rules on those buckets if you want hard 48-hour cleanup for mirrored objects.
 
 #### Vertex AI service account setup
 
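The key-detection rule in the hunk above ("Gemini API key present → Gemini Developer API, otherwise Vertex AI") can be illustrated with a standalone sketch. The `selectBackend` helper is hypothetical; the real logic lives inside `@ljoukov/llm`:

```typescript
// Sketch of the backend-selection rule described in the README hunk above
// (not the library's actual internals): prefer the Gemini Developer API
// when an API key is set, otherwise fall back to Vertex AI.
type Backend = "gemini-developer-api" | "vertex-ai";

function selectBackend(env: Record<string, string | undefined>): Backend {
  if (env.GEMINI_API_KEY || env.GOOGLE_API_KEY) {
    return "gemini-developer-api";
  }
  return "vertex-ai";
}

console.log(selectBackend({ GEMINI_API_KEY: "key" })); // "gemini-developer-api"
console.log(selectBackend({ GOOGLE_SERVICE_ACCOUNT_JSON: "{}" })); // "vertex-ai"
```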
@@ -137,7 +143,7 @@ configureModelConcurrency({
     fireworks: 8,
   },
   modelCaps: {
-    "gpt-5.
+    "gpt-5.4-mini": 24,
   },
   providerModelCaps: {
     google: {
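A hedged sketch of how caps like these could be resolved at request time. The precedence shown (provider+model, then model, then provider) and all field names are assumptions for illustration, not the library's documented behavior:

```typescript
// Illustrative only: one plausible lookup order for a concurrency config
// shaped like the configureModelConcurrency() call in the diff above.
interface ConcurrencyConfig {
  providerCaps: Record<string, number>;
  modelCaps: Record<string, number>;
  providerModelCaps: Record<string, Record<string, number>>;
}

function effectiveCap(
  cfg: ConcurrencyConfig,
  provider: string,
  model: string,
): number | undefined {
  // Most specific wins: provider+model, then model-wide, then provider-wide.
  return (
    cfg.providerModelCaps[provider]?.[model] ??
    cfg.modelCaps[model] ??
    cfg.providerCaps[provider]
  );
}

const cfg: ConcurrencyConfig = {
  providerCaps: { fireworks: 8 },
  modelCaps: { "gpt-5.4-mini": 24 },
  providerModelCaps: { google: { "gemini-pro": 4 } },
};

console.log(effectiveCap(cfg, "openai", "gpt-5.4-mini")); // 24
```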
@@ -168,7 +174,7 @@ Use OpenAI-style request fields:
 import { generateText } from "@ljoukov/llm";
 
 const result = await generateText({
-  model: "gpt-5.
+  model: "gpt-5.4-mini",
   input: "Write one sentence about TypeScript.",
 });
 
@@ -182,7 +188,7 @@ console.log(result.usage, result.costUsd);
 import { streamText } from "@ljoukov/llm";
 
 const call = streamText({
-  model: "gpt-5.
+  model: "gpt-5.4-mini",
   input: "Explain what a hash function is in one paragraph.",
 });
 
@@ -228,7 +234,7 @@ const input: LlmInputMessage[] = [
   },
 ];
 
-const result = await generateText({ model: "gpt-5.
+const result = await generateText({ model: "gpt-5.4-mini", input });
 console.log(result.text);
 ```
 
@@ -256,13 +262,13 @@ const input: LlmInputMessage[] = [
   },
 ];
 
-const result = await generateText({ model: "gpt-5.
+const result = await generateText({ model: "gpt-5.4-mini", input });
 console.log(result.text);
 ```
 
-Canonical storage
+Canonical storage now uses GCS-backed objects with a `48h` TTL.
 
-- OpenAI models
+- OpenAI and ChatGPT models resolve that `file_id` to a signed HTTPS URL.
 - Gemini Developer API mirrors the file lazily into Gemini Files when needed.
 - Vertex-backed Gemini mirrors the file lazily into `VERTEX_GCS_BUCKET` and uses `gs://...` URIs.
 
@@ -294,7 +300,7 @@ When the combined inline attachment payload in a single request would exceed abo
 the library automatically uploads those attachments to the canonical files store first and swaps the prompt to file
 references:
 
-- OpenAI:
+- OpenAI / ChatGPT: use signed HTTPS URLs for canonical files
 - Gemini Developer API: mirrors to Gemini Files and sends `fileData.fileUri`
 - Vertex AI: mirrors to `VERTEX_GCS_BUCKET` and sends `gs://...` URIs
 
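The offload behavior described in this hunk can be sketched as a threshold check over the combined payload size. The `Attachment` shape, the `ref` field, and the threshold value are all hypothetical; the diff elides the actual limit:

```typescript
// Sketch of the automatic large-attachment offload decision described above
// (made-up data shapes; the real threshold and upload path are internal to
// @ljoukov/llm). If the combined inline payload exceeds the threshold, every
// attachment is swapped to a file reference before the request is sent.
interface Attachment {
  bytes: number;
  ref?: string; // hypothetical: set once offloaded to the canonical file store
}

function offloadIfNeeded(attachments: Attachment[], thresholdBytes: number): Attachment[] {
  const total = attachments.reduce((sum, a) => sum + a.bytes, 0);
  if (total <= thresholdBytes) {
    return attachments; // small enough: stays inline
  }
  // Oversized turn: replace each attachment with a canonical-file reference.
  return attachments.map((a, i) => ({ ...a, ref: `file-${i}` }));
}

const small = offloadIfNeeded([{ bytes: 1_000 }], 10_000);
console.log(small[0].ref); // undefined (stays inline)
```

Note the batch-level check: several individually small attachments can still trip the combined threshold, which matches the parallel-tool-call re-check described later in the diff.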
@@ -314,14 +320,28 @@ const input: LlmInputMessage[] = [
   },
 ];
 
-const result = await generateText({ model: "gpt-5.
+const result = await generateText({ model: "gpt-5.4-mini", input });
 console.log(result.text);
 ```
 
 You can mix direct `file_id` parts with `inlineData`. Small attachments stay inline; oversized turns are upgraded to
 canonical files automatically. Tool loops do the same for large tool outputs, and they also re-check the combined size
-after parallel tool calls so a batch of individually-small images/files still gets upgraded to
-before the next model request if the aggregate payload is too large.
+after parallel tool calls so a batch of individually-small images/files still gets upgraded to canonical-file
+references before the next model request if the aggregate payload is too large.
+
+You can also control image analysis fidelity with request-level `mediaResolution`:
+
+- `low`, `medium`, `high`, `original`, `auto`
+- OpenAI / ChatGPT map this onto image `detail`
+- Gemini maps this onto media resolution/tokenization settings
+
+```ts
+const result = await generateText({
+  model: "gpt-5.4",
+  mediaResolution: "original",
+  input,
+});
+```
 
 OpenAI-style direct file-id example:
 
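The `mediaResolution` → OpenAI image `detail` mapping mentioned in this hunk might look roughly like the sketch below. The exact value mapping is an assumption; the diff only says the five values are mapped onto `detail`:

```typescript
// Hedged sketch: how the request-level mediaResolution values could collapse
// onto OpenAI's image detail levels. The real mapping is internal to the
// library and may differ.
type MediaResolution = "low" | "medium" | "high" | "original" | "auto";
type OpenAiImageDetail = "low" | "high" | "auto";

function toOpenAiDetail(resolution: MediaResolution): OpenAiImageDetail {
  switch (resolution) {
    case "low":
      return "low";
    case "medium": // assumption: anything above "low" requests full detail
    case "high":
    case "original":
      return "high";
    case "auto":
      return "auto";
  }
}

console.log(toOpenAiDetail("original")); // "high"
```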
@@ -364,7 +384,7 @@ const input: LlmInputMessage[] = [
   },
 ];
 
-const result = await generateText({ model: "gpt-5.
+const result = await generateText({ model: "gpt-5.4-mini", input });
 console.log(result.text);
 ```
 
@@ -390,7 +410,7 @@ const input: LlmInputMessage[] = [
   },
 ];
 
-const result = await generateText({ model: "gpt-5.
+const result = await generateText({ model: "gpt-5.4-mini", input });
 console.log(result.text);
 ```
 
@@ -439,6 +459,11 @@ console.log(result.text);
 
 `chatgpt-gpt-5.4-fast` is also supported as a convenience alias for ChatGPT-authenticated `gpt-5.4` with priority processing enabled (`service_tier="priority"`), matching Codex `/fast` semantics.
 
+Supported OpenAI text model ids are fixed literal unions in code, not arbitrary strings:
+
+- OpenAI API: `gpt-5.4`, `gpt-5.4-mini`, `gpt-5.4-nano`
+- ChatGPT auth: `chatgpt-gpt-5.4`, `chatgpt-gpt-5.4-fast`, `chatgpt-gpt-5.4-mini`, `chatgpt-gpt-5.3-codex-spark`
+
 ## JSON outputs
 
 `generateJson()` validates the output with Zod and returns the parsed value.
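The fixed literal unions described in this hunk can be written out as plain TypeScript types. This is illustrative: the ids come straight from the diff, but the type names and any exported runtime check are assumptions:

```typescript
// Standalone sketch of the literal-union model ids listed in the README hunk
// above. The real union types are defined inside @ljoukov/llm and may use
// different names.
type OpenAiTextModelId = "gpt-5.4" | "gpt-5.4-mini" | "gpt-5.4-nano";
type ChatGptTextModelId =
  | "chatgpt-gpt-5.4"
  | "chatgpt-gpt-5.4-fast"
  | "chatgpt-gpt-5.4-mini"
  | "chatgpt-gpt-5.3-codex-spark";

const supported: ReadonlyArray<OpenAiTextModelId | ChatGptTextModelId> = [
  "gpt-5.4",
  "gpt-5.4-mini",
  "gpt-5.4-nano",
  "chatgpt-gpt-5.4",
  "chatgpt-gpt-5.4-fast",
  "chatgpt-gpt-5.4-mini",
  "chatgpt-gpt-5.3-codex-spark",
];

// Hypothetical runtime guard: arbitrary strings are rejected at the type
// level, so a check like this is only needed for untyped input.
function isSupportedModelId(id: string): boolean {
  return (supported as readonly string[]).includes(id);
}

console.log(isSupportedModelId("gpt-5.4-mini")); // true
console.log(isSupportedModelId("gpt-4")); // false
```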
@@ -458,7 +483,7 @@ const schema = z.object({
 });
 
 const { value } = await generateJson({
-  model: "gpt-5.
+  model: "gpt-5.4-mini",
   input: "Return a JSON object with ok=true and message='hello'.",
   schema,
 });
@@ -481,7 +506,7 @@ const schema = z.object({
 });
 
 const call = streamJson({
-  model: "gpt-5.
+  model: "gpt-5.4-mini",
   input: "Return a JSON object with ok=true and message='hello'.",
   schema,
 });
@@ -503,7 +528,7 @@ If you only want thought deltas (no partial JSON), set `streamMode: "final"`.
 
 ```ts
 const call = streamJson({
-  model: "gpt-5.
+  model: "gpt-5.4-mini",
   input: "Return a JSON object with ok=true and message='hello'.",
   schema,
   streamMode: "final",
@@ -514,7 +539,7 @@ If you want to keep `generateJson()` but still stream thoughts, pass an `onEvent
 
 ```ts
 const { value } = await generateJson({
-  model: "gpt-5.
+  model: "gpt-5.4-mini",
   input: "Return a JSON object with ok=true and message='hello'.",
   schema,
   onEvent: (event) => {
@@ -551,13 +576,13 @@ configureTelemetry({
 });
 
 const { value } = await generateJson({
-  model: "gpt-5.
+  model: "gpt-5.4-mini",
   input: "Return { ok: true }.",
   schema: z.object({ ok: z.boolean() }),
 });
 
 await runAgentLoop({
-  model: "gpt-5.
+  model: "gpt-5.4-mini",
   input: "Inspect the repo and update the file.",
   filesystemTool: true,
 });
@@ -567,7 +592,7 @@ Per-call opt-out:
 
 ```ts
 await generateJson({
-  model: "gpt-5.
+  model: "gpt-5.4-mini",
   input: "Return { ok: true }.",
   schema: z.object({ ok: z.boolean() }),
   telemetry: false,
@@ -598,7 +623,7 @@ Use this when the model provider executes the tool remotely (for example search/
 import { generateText } from "@ljoukov/llm";
 
 const result = await generateText({
-  model: "gpt-5.
+  model: "gpt-5.4-mini",
   input: "Find 3 relevant sources about X and summarize them.",
   tools: [{ type: "web-search", mode: "live" }, { type: "code-execution" }],
 });
@@ -615,7 +640,7 @@ import { runToolLoop, tool } from "@ljoukov/llm";
 import { z } from "zod";
 
 const result = await runToolLoop({
-  model: "gpt-5.
+  model: "gpt-5.4-mini",
   input: "What is 12 * 9? Use the tool.",
   tools: {
     multiply: tool({
@@ -641,7 +666,7 @@ import { streamToolLoop, tool } from "@ljoukov/llm";
 import { z } from "zod";
 
 const call = streamToolLoop({
-  model: "chatgpt-gpt-5.3-codex",
+  model: "chatgpt-gpt-5.3-codex-spark",
   input: "Start implementing the feature.",
   tools: {
     echo: tool({
@@ -665,7 +690,7 @@ import { createToolLoopSteeringChannel, runAgentLoop } from "@ljoukov/llm";
 
 const steering = createToolLoopSteeringChannel();
 const run = runAgentLoop({
-  model: "chatgpt-gpt-5.3-codex",
+  model: "chatgpt-gpt-5.3-codex-spark",
   input: "Implement the task.",
   filesystemTool: true,
   steering,
@@ -683,13 +708,15 @@ const result = await run;
 - built-in subagent orchestration (delegate work across spawned agents),
 - your own custom runtime tools.
 
+Subagents always inherit the parent run model. The subagent control tools do not expose a model override.
+
 For interactive runs where you want to stream events and inject steering mid-run, use `streamAgentLoop()`:
 
 ```ts
 import { streamAgentLoop } from "@ljoukov/llm";
 
 const call = streamAgentLoop({
-  model: "chatgpt-gpt-5.3-codex",
+  model: "chatgpt-gpt-5.3-codex-spark",
   input: "Start implementation.",
   filesystemTool: true,
 });
@@ -704,7 +731,7 @@ console.log(result.text);
 For read/search/write tasks in a workspace, enable `filesystemTool`. The library auto-selects a tool profile by model
 when `profile: "auto"`:
 
-- Codex-like models (`gpt-5.4`, `chatgpt-gpt-5.4`, `chatgpt-gpt-5.4-fast`, and
+- Codex-like models (`gpt-5.4`, `chatgpt-gpt-5.4`, `chatgpt-gpt-5.4-fast`, and `chatgpt-gpt-5.3-codex-spark`): Codex-compatible filesystem tool shape.
 - Gemini models: Gemini-compatible filesystem tool shape.
 - Other models: model-agnostic profile (currently Gemini-style).
 
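The `profile: "auto"` rules in this hunk, sketched as a standalone function. Illustrative only: the profile names and model-matching details are assumptions, and the Codex-like list is the one quoted in the diff:

```typescript
// Sketch of the filesystem-tool profile auto-selection described above
// (not the library's actual implementation).
type FilesystemToolProfile = "codex" | "gemini";

const codexLikeModels = new Set([
  "gpt-5.4",
  "chatgpt-gpt-5.4",
  "chatgpt-gpt-5.4-fast",
  "chatgpt-gpt-5.3-codex-spark",
]);

function autoSelectProfile(model: string): FilesystemToolProfile {
  if (codexLikeModels.has(model)) {
    return "codex"; // Codex-compatible filesystem tool shape
  }
  // Gemini models and all other models fall back to the Gemini-style shape.
  return "gemini";
}

console.log(autoSelectProfile("chatgpt-gpt-5.3-codex-spark")); // "codex"
console.log(autoSelectProfile("gemini-pro")); // "gemini"
```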
@@ -714,6 +741,7 @@ Confinement/policy is set through `filesystemTool.options`:
 - `fs`: backend (`createNodeAgentFilesystem()` or `createInMemoryAgentFilesystem()`).
 - `checkAccess`: hook for allow/deny policy + audit.
 - `allowOutsideCwd`: opt-out confinement (default is false).
+- `mediaResolution`: default image fidelity for built-in `view_image` outputs.
 
 Detailed reference: `docs/agent-filesystem-tools.md`.
 
@@ -727,7 +755,7 @@ const fs = createInMemoryAgentFilesystem({
 });
 
 const result = await runAgentLoop({
-  model: "chatgpt-gpt-5.3-codex",
+  model: "chatgpt-gpt-5.3-codex-spark",
   input: "Change value from 1 to 2 using filesystem tools.",
   filesystemTool: {
     profile: "auto",
|
|
|
753
781
|
import { runAgentLoop } from "@ljoukov/llm";
|
|
754
782
|
|
|
755
783
|
const result = await runAgentLoop({
|
|
756
|
-
model: "chatgpt-gpt-5.3-codex",
|
|
784
|
+
model: "chatgpt-gpt-5.3-codex-spark",
|
|
757
785
|
input: "Plan the work, delegate in parallel where useful, and return a final merged result.",
|
|
758
786
|
subagentTool: {
|
|
759
787
|
enabled: true,
|
|
@@ -775,7 +803,7 @@ const fs = createInMemoryAgentFilesystem({
|
|
|
775
803
|
});
|
|
776
804
|
|
|
777
805
|
const result = await runAgentLoop({
|
|
778
|
-
model: "chatgpt-gpt-5.3-codex",
|
|
806
|
+
model: "chatgpt-gpt-5.3-codex-spark",
|
|
779
807
|
input: "Change value from 1 to 2 using filesystem tools.",
|
|
780
808
|
filesystemTool: {
|
|
781
809
|
profile: "auto",
|
|
@@ -819,7 +847,7 @@ import path from "node:path";
|
|
|
819
847
|
import { runAgentLoop } from "@ljoukov/llm";
|
|
820
848
|
|
|
821
849
|
await runAgentLoop({
|
|
822
|
-
model: "chatgpt-gpt-5.3-codex",
|
|
850
|
+
model: "chatgpt-gpt-5.3-codex-spark",
|
|
823
851
|
input: "Do the task",
|
|
824
852
|
filesystemTool: true,
|
|
825
853
|
logging: {
|
|
@@ -848,13 +876,13 @@ import {
 } from "@ljoukov/llm";
 
 const fs = createInMemoryAgentFilesystem({ "/repo/a.ts": "export const n = 1;\n" });
-const tools = createFilesystemToolSetForModel("chatgpt-gpt-5.3-codex", {
+const tools = createFilesystemToolSetForModel("chatgpt-gpt-5.3-codex-spark", {
   cwd: "/repo",
   fs,
 });
 
 const result = await runToolLoop({
-  model: "chatgpt-gpt-5.3-codex",
+  model: "chatgpt-gpt-5.3-codex-spark",
   input: "Update n to 2.",
   tools,
 });
@@ -905,15 +933,17 @@ Standard integration suite:
 npm run test:integration
 ```
 
-Large-file live integration tests are opt-in because they upload multi-megabyte fixtures to real provider
+Large-file live integration tests are opt-in because they upload multi-megabyte fixtures to real canonical/provider
+file stores:
 
 ```bash
 LLM_INTEGRATION_LARGE_FILES=1 npm run test:integration
 ```
 
-Those tests generate valid PDFs programmatically so the canonical upload path,
+Those tests generate valid PDFs programmatically so the canonical upload path, signed-URL reuse, and automatic large
 attachment offload all exercise real provider APIs. The unit suite also covers direct-call upload logging plus
-`runAgentLoop()` upload telemetry/logging for combined-image overflow
+`runAgentLoop()` upload telemetry/logging for combined-image overflow, and the integration suite includes provider
+format coverage for common document and image attachments.
 
 ## License
 