@ljoukov/llm 3.0.3 → 3.0.6
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +136 -8
- package/dist/index.cjs +1734 -278
- package/dist/index.cjs.map +1 -1
- package/dist/index.d.cts +89 -1
- package/dist/index.d.ts +89 -1
- package/dist/index.js +1732 -278
- package/dist/index.js.map +1 -1
- package/package.json +1 -1
package/README.md
CHANGED
@@ -11,6 +11,7 @@ Unified TypeScript wrapper over:
 - **Google Gemini via Vertex AI** (`@google/genai`)
 - **Fireworks chat-completions models** (`kimi-k2.5`, `glm-5`, `minimax-m2.1`, `gpt-oss-120b`)
 - **ChatGPT subscription models** via `chatgpt-*` model ids (reuses Codex auth store, or a token provider)
+- **Agentic orchestration with subagents** via `runAgentLoop()` + built-in delegation control tools

 Designed around a single streaming API that yields:

@@ -107,9 +108,42 @@ to HTTP/SSE automatically when needed.
 When fallback is triggered by an unsupported WebSocket upgrade response (for example `426`), the library keeps using
 SSE for the rest of the process to avoid repeated failing upgrade attempts.

+### Adaptive per-model concurrency
+
+Provider calls use adaptive, overload-aware concurrency (with retry/backoff where supported). Configure hard caps in
+code (clamped to `1..64`):
+
+```ts
+import { configureModelConcurrency } from "@ljoukov/llm";
+
+configureModelConcurrency({
+  globalCap: 8,
+  providerCaps: {
+    openai: 16,
+    google: 3,
+    fireworks: 8,
+  },
+  modelCaps: {
+    "gpt-5.2": 24,
+  },
+  providerModelCaps: {
+    google: {
+      "gemini-3.1-pro-preview": 2,
+    },
+  },
+});
+```
+
+Default caps (without configuration):
+
+- OpenAI: `12`
+- Google preview models (`*preview*`): `2`
+- Other Google models: `4`
+- Fireworks: `6`
+
 ## Usage

-
+Use OpenAI-style request fields:

 - `input`: string or message array
 - `instructions`: optional top-level system instructions
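The default caps and the `1..64` clamp documented in the hunk above can be expressed as a small resolver. This is an illustrative sketch of the documented rules only, not the library's internal code; the type and function names below are invented for the example:

```typescript
// Hypothetical resolver mirroring the README's documented rules.
// Not the @ljoukov/llm implementation: names here are illustrative only.
type Provider = "openai" | "google" | "fireworks";

function clampCap(cap: number): number {
  // Configured caps are clamped to the documented 1..64 range.
  return Math.min(64, Math.max(1, cap));
}

function defaultCap(provider: Provider, model: string): number {
  switch (provider) {
    case "openai":
      return 12;
    case "google":
      // Google preview models (`*preview*`) get a lower default.
      return model.includes("preview") ? 2 : 4;
    case "fireworks":
      return 6;
  }
}
```

Explicit caps set via `configureModelConcurrency` override these defaults; the clamp applies to configured values.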
@@ -312,8 +346,8 @@ console.log(result.text);

 - OpenAI API models use structured outputs (`json_schema`) when possible.
 - Gemini uses `responseJsonSchema`.
-- `chatgpt-*` models try to use structured outputs too; if
-  JSON parsing.
+- `chatgpt-*` models try to use structured outputs too; if the endpoint/account/model rejects `json_schema`, the call
+  retries with best-effort JSON parsing.

 ```ts
 import { generateJson } from "@ljoukov/llm";
@@ -398,12 +432,12 @@ There are three tool-enabled call patterns:

 1. `generateText()` for provider-native/server-side tools (for example web search).
 2. `runToolLoop()` for your runtime JS/TS tools (function tools executed in your process).
-3. `runAgentLoop()` for
+3. `runAgentLoop()` for full agentic loops (a convenience wrapper around `runToolLoop()` with built-in subagent orchestration and optional filesystem tools).

 Architecture note:

--
-- `runAgentLoop()`
+- Built-in filesystem tools are not a separate execution system.
+- `runAgentLoop()` constructs a filesystem toolset, merges your optional custom tools, and calls the same `runToolLoop()` engine.
 - This behavior is model-agnostic at API level; profile selection only adapts tool shape for model compatibility.

 ### Provider-Native Tools (`generateText()`)
@@ -447,9 +481,18 @@ console.log(result.text);

 Use `customTool()` only when you need freeform/non-JSON tool input grammar.

-###
+### Agentic Loop (`runAgentLoop()`)
+
+`runAgentLoop()` is the high-level agentic API. It supports:

-
+- optional filesystem workspace tools,
+- built-in subagent orchestration (delegate work across spawned agents),
+- your own custom runtime tools.
+
+#### 1) Filesystem agent loop
+
+For read/search/write tasks in a workspace, enable `filesystemTool`. The library auto-selects a tool profile by model
+when `profile: "auto"`:

 - Codex-like models: Codex-compatible filesystem tool shape.
 - Gemini models: Gemini-compatible filesystem tool shape.
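The `profile: "auto"` selection described in the hunk above amounts to choosing a tool shape from the model id. A minimal sketch of that kind of selection follows; the matching heuristic, the `"generic"` fallback, and all names are assumptions made for illustration, not the library's actual logic:

```typescript
// Hypothetical profile picker illustrating the documented auto-selection:
// Codex-like models get a Codex-compatible tool shape, Gemini models a
// Gemini-compatible one. The string matching below is an illustrative guess.
type ToolProfile = "codex" | "gemini" | "generic";

function pickToolProfile(model: string): ToolProfile {
  const id = model.toLowerCase();
  if (id.includes("codex")) return "codex";
  if (id.includes("gemini")) return "gemini";
  return "generic"; // assumed fallback shape for other models
}
```

The library's real matching rules may differ; when in doubt, set `profile` explicitly instead of relying on `"auto"`.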
@@ -464,6 +507,8 @@ Confinement/policy is set through `filesystemTool.options`:

 Detailed reference: `docs/agent-filesystem-tools.md`.

+Filesystem-only example:
+
 ```ts
 import { createInMemoryAgentFilesystem, runAgentLoop } from "@ljoukov/llm";

@@ -486,6 +531,89 @@ const result = await runAgentLoop({
 console.log(result.text);
 ```

+#### 2) Add subagent orchestration
+
+Enable `subagentTool` to allow delegation via Codex-style control tools:
+
+- `spawn_agent`, `send_input`, `resume_agent`, `wait`, `close_agent`
+- optional limits: `maxAgents`, `maxDepth`, wait timeouts
+
+```ts
+import { runAgentLoop } from "@ljoukov/llm";
+
+const result = await runAgentLoop({
+  model: "chatgpt-gpt-5.3-codex",
+  input: "Plan the work, delegate in parallel where useful, and return a final merged result.",
+  subagentTool: {
+    enabled: true,
+    maxAgents: 4,
+    maxDepth: 2,
+  },
+});
+
+console.log(result.text);
+```
+
+#### 3) Combine filesystem + subagents
+
+```ts
+import { createInMemoryAgentFilesystem, runAgentLoop } from "@ljoukov/llm";
+
+const fs = createInMemoryAgentFilesystem({
+  "/repo/src/a.ts": "export const value = 1;\n",
+});
+
+const result = await runAgentLoop({
+  model: "chatgpt-gpt-5.3-codex",
+  input: "Change value from 1 to 2 using filesystem tools.",
+  filesystemTool: {
+    profile: "auto",
+    options: {
+      cwd: "/repo",
+      fs,
+    },
+  },
+  subagentTool: {
+    enabled: true,
+    maxAgents: 4,
+    maxDepth: 2,
+  },
+});
+
+console.log(result.text);
+```
+
+### Agent Telemetry (Pluggable Backends)
+
+`runAgentLoop()` supports optional telemetry hooks that keep default behavior unchanged.
+You can attach any backend by implementing a sink with `emit(event)` and optional `flush()`.
+
+```ts
+import { runAgentLoop } from "@ljoukov/llm";
+
+const result = await runAgentLoop({
+  model: "chatgpt-gpt-5.3-codex",
+  input: "Summarize the report and update output JSON files.",
+  filesystemTool: true,
+  telemetry: {
+    includeLlmStreamEvents: false, // enable only if you need token/delta event fan-out
+    sink: {
+      emit: (event) => {
+        // Forward to your backend (Cloud Logging, OpenTelemetry, Datadog, etc.)
+        // event.type: "agent.run.started" | "agent.run.stream" | "agent.run.completed"
+        // event carries runId, parentRunId, depth, model, timestamp + payload
+      },
+      flush: async () => {
+        // Optional: flush buffered telemetry on run completion.
+      },
+    },
+  },
+});
+```
+
+Telemetry emits parent/child run correlation (`runId` + `parentRunId`) for subagents.
+See `docs/agent-telemetry.md` for event schema, design rationale, and backend adapter guidance.
+
 If you need exact control over tool definitions, build the filesystem toolset yourself and call `runToolLoop()` directly.

 ```ts