@ljoukov/llm 4.0.12 → 4.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +127 -4
- package/dist/index.cjs +2026 -547
- package/dist/index.cjs.map +1 -1
- package/dist/index.d.cts +85 -13
- package/dist/index.d.ts +85 -13
- package/dist/index.js +1986 -510
- package/dist/index.js.map +1 -1
- package/package.json +4 -3
package/README.md
CHANGED
@@ -8,7 +8,7 @@
 Unified TypeScript wrapper over:
 
 - **OpenAI Responses API** (`openai`)
-- **Google Gemini via Vertex AI** (`@google/genai`)
+- **Google Gemini** via **Vertex AI** or the **Gemini Developer API** (`@google/genai`)
 - **Fireworks chat-completions models** (`kimi-k2.5`, `glm-5`, `minimax-m2.1`, `gpt-oss-120b`)
 - **ChatGPT subscription models** via `chatgpt-*` model ids (reuses Codex auth store, or a token provider)
 - **Agentic orchestration with subagents** via `runAgentLoop()` + built-in delegation control tools
@@ -38,11 +38,22 @@ See Node.js docs on environment variables and dotenv files: https://nodejs.org/a
 - `OPENAI_RESPONSES_WEBSOCKET_MODE` (`auto` | `off` | `only`, default: `auto`)
 - `OPENAI_BASE_URL` (optional; defaults to `https://api.openai.com/v1`)
 
-### Gemini
+### Gemini
+
+Use one backend:
+
+- `GEMINI_API_KEY` or `GOOGLE_API_KEY` for the Gemini Developer API
+- `GOOGLE_SERVICE_ACCOUNT_JSON` for Vertex AI (the contents of a service account JSON key file, not a file path)
+- `VERTEX_GCS_BUCKET` for Vertex-backed Gemini file attachments / `file_id` inputs
+- `VERTEX_GCS_PREFIX` (optional object-name prefix inside `VERTEX_GCS_BUCKET`)
+
+If a Gemini API key is present, the library uses the Gemini Developer API. Otherwise it falls back to Vertex AI.
 
--
+For Vertex-backed Gemini file inputs, the library mirrors OpenAI-backed canonical files into GCS and then passes the
+resulting `gs://...` URI to Vertex. Configure a lifecycle rule on that bucket to delete objects after 2 days if you
+want hard 48-hour cleanup for mirrored objects.
 
-####
+#### Vertex AI service account setup
 
 You need a **Google service account key JSON** for your Firebase / GCP project (this is what you put into
 `GOOGLE_SERVICE_ACCOUNT_JSON`).
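The key-wins-over-Vertex selection rule described above can be sketched as a small predicate. This is a hypothetical illustration of the documented behavior; `resolveGeminiBackend` and its env-record shape are not part of the package API:

```typescript
// Hypothetical sketch of the documented rule: a Gemini API key
// (GEMINI_API_KEY or GOOGLE_API_KEY) selects the Gemini Developer API;
// otherwise fall back to Vertex AI via a service-account key.
type GeminiBackend = "gemini-developer-api" | "vertex-ai";

function resolveGeminiBackend(
  env: Record<string, string | undefined> = process.env,
): GeminiBackend {
  if (env.GEMINI_API_KEY || env.GOOGLE_API_KEY) {
    return "gemini-developer-api";
  }
  // Vertex AI needs the service-account JSON contents (not a file path).
  if (!env.GOOGLE_SERVICE_ACCOUNT_JSON) {
    throw new Error(
      "Set GEMINI_API_KEY/GOOGLE_API_KEY or GOOGLE_SERVICE_ACCOUNT_JSON",
    );
  }
  return "vertex-ai";
}
```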
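The 2-day lifecycle rule mentioned above can be expressed with a standard GCS lifecycle configuration. A hedged setup sketch: the file path is illustrative, and the `gcloud` step is left commented out because it requires authenticated access to your bucket:

```shell
# GCS lifecycle config: delete objects 2 days after creation, matching the
# library's 48h canonical-file TTL for mirrored objects.
cat > lifecycle.json <<'EOF'
{
  "rule": [
    { "action": { "type": "Delete" }, "condition": { "age": 2 } }
  ]
}
EOF

# Apply it to the mirror bucket (requires gcloud auth):
# gcloud storage buckets update "gs://$VERTEX_GCS_BUCKET" --lifecycle-file=lifecycle.json
```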
@@ -219,16 +230,72 @@ const result = await generateText({ model: "gpt-5.2", input });
 console.log(result.text);
 ```
 
+### Files API
+
+The library now exposes an OpenAI-like canonical files API:
+
+```ts
+import fs from "node:fs";
+import { files, generateText, type LlmInputMessage } from "@ljoukov/llm";
+
+const stored = await files.create({
+  data: fs.readFileSync("report.pdf"),
+  filename: "report.pdf",
+  mimeType: "application/pdf",
+});
+
+const input: LlmInputMessage[] = [
+  {
+    role: "user",
+    content: [
+      { type: "text", text: "Summarize the PDF in 5 bullets." },
+      { type: "input_file", file_id: stored.id, filename: stored.filename },
+    ],
+  },
+];
+
+const result = await generateText({ model: "gpt-5.2", input });
+console.log(result.text);
+```
+
+Canonical storage defaults to OpenAI Files with `purpose: "user_data"` and a `48h` TTL.
+
+- OpenAI models use that `file_id` directly.
+- Gemini Developer API mirrors the file lazily into Gemini Files when needed.
+- Vertex-backed Gemini mirrors the file lazily into `VERTEX_GCS_BUCKET` and uses `gs://...` URIs.
+
+Available methods:
+
+- `files.create({ path | data, filename?, mimeType? })`
+- `files.retrieve(fileId)`
+- `files.delete(fileId)`
+- `files.content(fileId)`
+
 ### Attachments (files / images)
 
 Use `inlineData` parts to attach base64-encoded bytes (intermixed with text). `inlineData.data` is base64 (not a data
 URL).
 
+Optional: set `filename` on `inlineData` to preserve the original file name when the provider supports it.
+
 Note: `inlineData` is mapped based on `mimeType`.
 
 - `image/*` -> image input (`input_image`)
 - otherwise -> file input (`input_file`, e.g. `application/pdf`)
 
+You can also pass OpenAI-style file/image parts directly:
+
+- `input_file` with `file_id`
+- `input_image` with `file_id`
+
+When the combined inline attachment payload in a single request would exceed about `20 MiB` of base64/data-URL text,
+the library automatically uploads those attachments to the canonical files store first and swaps the prompt to file
+references:
+
+- OpenAI: uses canonical OpenAI `file_id`s directly
+- Gemini Developer API: mirrors to Gemini Files and sends `fileData.fileUri`
+- Vertex AI: mirrors to `VERTEX_GCS_BUCKET` and sends `gs://...` URIs
+
 ```ts
 import fs from "node:fs";
 import { generateText, type LlmInputMessage } from "@ljoukov/llm";
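The `mimeType`-based mapping of `inlineData` parts described above can be sketched as a tiny helper. This is a hypothetical illustration of the documented rule; `mapInlineDataType` is not the package's internal code:

```typescript
// Hypothetical sketch of the documented mapping: image/* MIME types become
// input_image parts; everything else becomes an input_file part.
type MappedPartType = "input_image" | "input_file";

function mapInlineDataType(mimeType: string): MappedPartType {
  return mimeType.startsWith("image/") ? "input_image" : "input_file";
}
```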
@@ -249,6 +316,34 @@ const result = await generateText({ model: "gpt-5.2", input });
 console.log(result.text);
 ```
 
+You can mix direct `file_id` parts with `inlineData`. Small attachments stay inline; oversized turns are upgraded to
+canonical files automatically. Tool loops do the same for large tool outputs, and they also re-check the combined size
+after parallel tool calls so a batch of individually-small images/files still gets upgraded to `file_id` references
+before the next model request if the aggregate payload is too large.
+
+OpenAI-style direct file-id example:
+
+```ts
+import { files, generateText, type LlmInputMessage } from "@ljoukov/llm";
+
+const stored = await files.create({
+  path: "doc.pdf",
+});
+
+const input: LlmInputMessage[] = [
+  {
+    role: "user",
+    content: [
+      { type: "text", text: "Summarize the attachment." },
+      { type: "input_file", file_id: stored.id, filename: stored.filename },
+    ],
+  },
+];
+
+const result = await generateText({ model: "gemini-2.5-pro", input });
+console.log(result.text);
+```
+
 PDF attachment example:
 
 ```ts
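The aggregate-size check described above (roughly `20 MiB` of base64 text across a whole turn's attachments) can be estimated with the standard base64 expansion formula. A hedged sketch of the idea, not the library's implementation; `shouldUpgradeToFileIds` is a hypothetical helper:

```typescript
// Base64 encodes every 3 input bytes as 4 output characters, so inline
// attachments expand by about 4/3 before the size check is applied.
const OVERFLOW_LIMIT_BYTES = 20 * 1024 * 1024; // ~20 MiB of base64 text

function base64TextLength(rawBytes: number): number {
  return 4 * Math.ceil(rawBytes / 3);
}

// True when a batch of attachments (raw byte sizes) should be upgraded from
// inline base64 to canonical file_id references before the next request.
function shouldUpgradeToFileIds(attachmentRawSizes: number[]): boolean {
  const combined = attachmentRawSizes.reduce(
    (sum, raw) => sum + base64TextLength(raw),
    0,
  );
  return combined > OVERFLOW_LIMIT_BYTES;
}
```

Note that several individually small attachments can still cross the limit together, which is why the combined size is re-checked after parallel tool calls.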
@@ -664,6 +759,7 @@ const result = await runAgentLoop({
   emit: (event) => {
     // Forward to your backend (Cloud Logging, OpenTelemetry, Datadog, etc.)
     // event.type: "agent.run.started" | "agent.run.stream" | "agent.run.completed"
+    // agent.run.completed also includes uploadCount, uploadBytes, and uploadLatencyMs
     // event carries runId, parentRunId, depth, model, timestamp + payload
   },
   flush: async () => {
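A consumer of the completion telemetry above might aggregate the upload counters across runs. Only the field names (`uploadCount`, `uploadBytes`, `uploadLatencyMs`) come from the docs; the full event payload shape here is an assumption for illustration:

```typescript
// Assumed shape of a completion event; real events carry additional fields
// (runId, parentRunId, depth, model, timestamp, payload).
type AgentRunCompleted = {
  type: "agent.run.completed";
  runId: string;
  uploadCount: number;
  uploadBytes: number;
  uploadLatencyMs: number;
};

// Roll per-run upload telemetry into one summary for dashboards/alerts.
function summarizeUploads(events: AgentRunCompleted[]): {
  runs: number;
  uploadCount: number;
  uploadBytes: number;
} {
  return events.reduce(
    (acc, e) => ({
      runs: acc.runs + 1,
      uploadCount: acc.uploadCount + e.uploadCount,
      uploadBytes: acc.uploadBytes + e.uploadBytes,
    }),
    { runs: 0, uploadCount: 0, uploadBytes: 0 },
  );
}
```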
@@ -693,6 +789,9 @@ Each LLM call writes:
 - `error.txt` plus `response.metadata.json` on failure.
 
 `image_url` data URLs are redacted in text/metadata logs (`data:...,...`) so base64 payloads are not printed inline.
+Every canonical upload or provider mirror is also appended to `agent.log` as a `[upload] ...` line with source,
+backend, bytes, and latency. Direct `generateText()` / `streamText()` calls inherit the same upload logging when you run
+them inside an agent logging session, and their `response.metadata.json` includes an `uploads` summary.
 
 ```ts
 import path from "node:path";
@@ -771,6 +870,30 @@ the current directory, subagents enabled, and `Esc` interrupt support:
 npm run example:cli-chat
 ```
 
+## Testing
+
+Unit tests:
+
+```bash
+npm run test:unit
+```
+
+Standard integration suite:
+
+```bash
+npm run test:integration
+```
+
+Large-file live integration tests are opt-in because they upload multi-megabyte fixtures to real provider file stores:
+
+```bash
+LLM_INTEGRATION_LARGE_FILES=1 npm run test:integration
+```
+
+Those tests generate valid PDFs programmatically so the canonical upload path, `file_id` reuse, and automatic large
+attachment offload all exercise real provider APIs. The unit suite also covers direct-call upload logging plus
+`runAgentLoop()` upload telemetry/logging for combined-image overflow.
+
 ## License
 
 MIT