@ljoukov/llm 3.0.4 → 3.0.8
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +193 -19
- package/dist/index.cjs +1929 -788
- package/dist/index.cjs.map +1 -1
- package/dist/index.d.cts +149 -19
- package/dist/index.d.ts +149 -19
- package/dist/index.js +1924 -788
- package/dist/index.js.map +1 -1
- package/package.json +2 -1
package/README.md
CHANGED
@@ -11,6 +11,7 @@ Unified TypeScript wrapper over:
 - **Google Gemini via Vertex AI** (`@google/genai`)
 - **Fireworks chat-completions models** (`kimi-k2.5`, `glm-5`, `minimax-m2.1`, `gpt-oss-120b`)
 - **ChatGPT subscription models** via `chatgpt-*` model ids (reuses Codex auth store, or a token provider)
+- **Agentic orchestration with subagents** via `runAgentLoop()` + built-in delegation control tools

 Designed around a single streaming API that yields:
@@ -109,21 +110,40 @@ SSE for the rest of the process to avoid repeated failing upgrade attempts.

 ### Adaptive per-model concurrency

-Provider calls use adaptive, overload-aware concurrency (with retry/backoff where supported). Configure hard caps
+Provider calls use adaptive, overload-aware concurrency (with retry/backoff where supported). Configure hard caps in
+code (clamped to `1..64`):
+
+```ts
+import { configureModelConcurrency } from "@ljoukov/llm";
+
+configureModelConcurrency({
+  globalCap: 8,
+  providerCaps: {
+    openai: 16,
+    google: 3,
+    fireworks: 8,
+  },
+  modelCaps: {
+    "gpt-5.2": 24,
+  },
+  providerModelCaps: {
+    google: {
+      "gemini-3.1-pro-preview": 2,
+    },
+  },
+});
+```
+
+Default caps (without configuration):
+
+- OpenAI: `12`
+- Google preview models (`*preview*`): `2`
+- Other Google models: `4`
+- Fireworks: `6`

 ## Usage

+Use OpenAI-style request fields:

 - `input`: string or message array
 - `instructions`: optional top-level system instructions
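The hunk above layers a global cap, provider caps, model caps, and provider-model caps. As an illustration only (the most-specific-wins resolution order and the `8` fallback below are assumptions, not taken from the package), a per-call cap could resolve like this:

```typescript
// Hypothetical sketch of layered cap resolution: the most specific
// configured cap wins, and every result is clamped to 1..64.
type ConcurrencyConfig = {
  globalCap?: number;
  providerCaps?: Record<string, number>;
  modelCaps?: Record<string, number>;
  providerModelCaps?: Record<string, Record<string, number>>;
};

const clamp = (n: number): number => Math.min(64, Math.max(1, n));

function resolveCap(cfg: ConcurrencyConfig, provider: string, model: string): number {
  const candidate =
    cfg.providerModelCaps?.[provider]?.[model] ??
    cfg.modelCaps?.[model] ??
    cfg.providerCaps?.[provider] ??
    cfg.globalCap ??
    8; // fallback default, assumed for this sketch
  return clamp(candidate);
}

const cfg: ConcurrencyConfig = {
  globalCap: 8,
  providerCaps: { openai: 16, google: 3 },
  providerModelCaps: { google: { "gemini-3.1-pro-preview": 2 } },
};

console.log(resolveCap(cfg, "google", "gemini-3.1-pro-preview")); // 2 (provider-model cap)
console.log(resolveCap(cfg, "openai", "gpt-5.2")); // 16 (provider cap)
console.log(resolveCap(cfg, "fireworks", "kimi-k2.5")); // 8 (global cap)
```

The clamp matches the documented `1..64` range; everything else here is a plausible reading of the config shape, not the package's code.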
@@ -326,8 +346,8 @@ console.log(result.text);

 - OpenAI API models use structured outputs (`json_schema`) when possible.
 - Gemini uses `responseJsonSchema`.
-- `chatgpt-*` models try to use structured outputs too; if
-  JSON parsing.
+- `chatgpt-*` models try to use structured outputs too; if the endpoint/account/model rejects `json_schema`, the call
+  retries with best-effort JSON parsing.
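The fallback path above is only described as "best-effort JSON parsing"; the package's actual fallback is not shown in this diff. As a sketch of the general technique, a parser can try the raw text first and then the outermost `{...}` span:

```typescript
// Illustrative sketch (not the package's code) of best-effort JSON parsing:
// parse directly, otherwise extract the outermost {...} span from prose.
function bestEffortJson(text: string): unknown {
  try {
    return JSON.parse(text);
  } catch {
    const start = text.indexOf("{");
    const end = text.lastIndexOf("}");
    if (start >= 0 && end > start) {
      return JSON.parse(text.slice(start, end + 1));
    }
    throw new Error("no JSON object found in model output");
  }
}

console.log(bestEffortJson('Here is the result: {"ok": true} Done.')); // { ok: true }
```

This handles the common case of a model wrapping valid JSON in commentary; it does not handle multiple objects or truncated output.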
@@ -412,12 +432,12 @@ There are three tool-enabled call patterns:

 1. `generateText()` for provider-native/server-side tools (for example web search).
 2. `runToolLoop()` for your runtime JS/TS tools (function tools executed in your process).
-3. `runAgentLoop()` for
+3. `runAgentLoop()` for full agentic loops (a convenience wrapper around `runToolLoop()` with built-in subagent orchestration and optional filesystem tools).

 Architecture note:

-- `runAgentLoop()`
+- Built-in filesystem tools are not a separate execution system.
+- `runAgentLoop()` constructs a filesystem toolset, merges your optional custom tools, and calls the same `runToolLoop()` engine.
 - This behavior is model-agnostic at API level; profile selection only adapts tool shape for model compatibility.

 ### Provider-Native Tools (`generateText()`)
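The architecture note in the hunk above says `runAgentLoop()` builds a filesystem toolset, merges your custom tools, and hands everything to the same `runToolLoop()` engine. A hypothetical sketch of that merge step (the name-collision policy here is an assumption, not documented in the diff):

```typescript
// Hypothetical sketch of the merge described in the architecture note:
// built-in and user tools end up in one flat map handed to one engine.
// Rejecting collisions (rather than silently overriding) is assumed.
type ToolDef = { description: string; execute: (input: unknown) => unknown };

function mergeToolsets(
  builtIn: Record<string, ToolDef>,
  custom: Record<string, ToolDef> = {},
): Record<string, ToolDef> {
  for (const name of Object.keys(custom)) {
    if (name in builtIn) {
      throw new Error(`custom tool "${name}" collides with a built-in tool`);
    }
  }
  return { ...builtIn, ...custom };
}

const merged = mergeToolsets(
  { read_file: { description: "read a file", execute: () => "" } },
  { echo: { description: "echo input", execute: (x) => x } },
);
console.log(Object.keys(merged)); // [ 'read_file', 'echo' ]
```

The point of the sketch is the architecture claim itself: one tool map, one loop engine, regardless of where each tool came from.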
@@ -461,9 +481,78 @@ console.log(result.text);

 Use `customTool()` only when you need freeform/non-JSON tool input grammar.

-###
+### Mid-Run Steering (Queued Input)
+
+You can queue user steering while a tool loop is already running. Steering is applied on the next model step (it does
+not interrupt the current generation/tool execution).
+
+```ts
+import { streamToolLoop, tool } from "@ljoukov/llm";
+import { z } from "zod";
+
+const call = streamToolLoop({
+  model: "chatgpt-gpt-5.3-codex",
+  input: "Start implementing the feature.",
+  tools: {
+    echo: tool({
+      inputSchema: z.object({ text: z.string() }),
+      execute: ({ text }) => ({ text }),
+    }),
+  },
+});
+
+// Append steering while the run is active.
+call.append("Focus on tests first, then refactor.");
+
+const result = await call.result;
+console.log(result.text);
+```
+
+If you already manage your own run lifecycle, you can create and pass a steering channel directly:
+
+```ts
+import { createToolLoopSteeringChannel, runAgentLoop } from "@ljoukov/llm";
+
+const steering = createToolLoopSteeringChannel();
+const run = runAgentLoop({
+  model: "chatgpt-gpt-5.3-codex",
+  input: "Implement the task.",
+  filesystemTool: true,
+  steering,
+});
+
+steering.append("Do not interrupt; apply this guidance on the next turn.");
+const result = await run;
+```
+
+### Agentic Loop (`runAgentLoop()`)
+
+`runAgentLoop()` is the high-level agentic API. It supports:
+
+- optional filesystem workspace tools,
+- built-in subagent orchestration (delegate work across spawned agents),
+- your own custom runtime tools.
+
+For interactive runs where you want to stream events and inject steering mid-run, use `streamAgentLoop()`:
+
+```ts
+import { streamAgentLoop } from "@ljoukov/llm";
+
+const call = streamAgentLoop({
+  model: "chatgpt-gpt-5.3-codex",
+  input: "Start implementation.",
+  filesystemTool: true,
+});
+
+call.append("Prioritize a minimal diff and update tests.");
+const result = await call.result;
+console.log(result.text);
+```
+
+#### 1) Filesystem agent loop
+
+For read/search/write tasks in a workspace, enable `filesystemTool`. The library auto-selects a tool profile by model
+when `profile: "auto"`:

 - Codex-like models: Codex-compatible filesystem tool shape.
 - Gemini models: Gemini-compatible filesystem tool shape.
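The mid-run steering documented above is queued and drained at the next model step rather than interrupting the current one. That behavior can be modeled as a simple drain-on-next-step queue; this is an illustrative sketch, not the library's `createToolLoopSteeringChannel()` implementation:

```typescript
// Minimal sketch (assumed semantics) of queued steering: append() buffers
// messages while a step runs; the loop drains the queue at the start of the
// next model step instead of interrupting mid-step.
class SteeringQueue {
  private pending: string[] = [];

  append(message: string): void {
    this.pending.push(message);
  }

  // Called by the loop between steps: returns and clears everything queued.
  drain(): string[] {
    const drained = this.pending;
    this.pending = [];
    return drained;
  }
}

const steering = new SteeringQueue();
steering.append("Focus on tests first.");
steering.append("Then refactor.");
console.log(steering.drain()); // [ 'Focus on tests first.', 'Then refactor.' ]
console.log(steering.drain()); // []
```

The key property mirrored here is that multiple `append()` calls made during one step all land together on the following step.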
@@ -478,10 +567,55 @@ Confinement/policy is set through `filesystemTool.options`:

 Detailed reference: `docs/agent-filesystem-tools.md`.

+Filesystem-only example:
+
+```ts
+import { createInMemoryAgentFilesystem, runAgentLoop } from "@ljoukov/llm";
+
+const fs = createInMemoryAgentFilesystem({
+  "/repo/src/a.ts": "export const value = 1;\n",
+});
+
+const result = await runAgentLoop({
+  model: "chatgpt-gpt-5.3-codex",
+  input: "Change value from 1 to 2 using filesystem tools.",
+  filesystemTool: {
+    profile: "auto",
+    options: {
+      cwd: "/repo",
+      fs,
+    },
+  },
+});
+
+console.log(result.text);
+```
+
+#### 2) Add subagent orchestration
+
+Enable `subagentTool` to allow delegation via Codex-style control tools:

 - `spawn_agent`, `send_input`, `resume_agent`, `wait`, `close_agent`
+- optional limits: `maxAgents`, `maxDepth`, wait timeouts
+- `spawn_agent.agent_type` supports built-ins aligned with codex-rs-style roles: `default`, `researcher`, `worker`, `reviewer`
+
+```ts
+import { runAgentLoop } from "@ljoukov/llm";
+
+const result = await runAgentLoop({
+  model: "chatgpt-gpt-5.3-codex",
+  input: "Plan the work, delegate in parallel where useful, and return a final merged result.",
+  subagentTool: {
+    enabled: true,
+    maxAgents: 4,
+    maxDepth: 2,
+  },
+});
+
+console.log(result.text);
+```
+
+#### 3) Combine filesystem + subagents
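For context on the `maxAgents` / `maxDepth` limits documented in the subagent hunk above: the diff does not show how they are enforced, but one plausible reading (semantics assumed for illustration) is a registry that rejects `spawn_agent` once either budget is exhausted:

```typescript
// Hypothetical enforcement of subagent limits (assumed semantics):
// spawn fails when maxAgents concurrent children already exist, or when
// the requested child would sit deeper than maxDepth.
class SubagentRegistry {
  private active = 0;

  constructor(
    private readonly maxAgents: number,
    private readonly maxDepth: number,
  ) {}

  spawn(parentDepth: number): number {
    if (this.active >= this.maxAgents) throw new Error("maxAgents exceeded");
    if (parentDepth + 1 > this.maxDepth) throw new Error("maxDepth exceeded");
    this.active += 1;
    return parentDepth + 1; // depth of the spawned agent
  }

  close(): void {
    this.active -= 1;
  }
}

const registry = new SubagentRegistry(2, 2);
console.log(registry.spawn(0)); // 1 (root spawns a child)
console.log(registry.spawn(1)); // 2 (child spawns a grandchild)
registry.close(); // close_agent frees one slot
```

Whether the real limits count concurrent or total agents is not stated in the diff; this sketch assumes concurrent.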
@@ -510,6 +644,37 @@ const result = await runAgentLoop({
 console.log(result.text);
 ```

+### Agent Telemetry (Pluggable Backends)
+
+`runAgentLoop()` supports optional telemetry hooks that keep default behavior unchanged.
+You can attach any backend by implementing a sink with `emit(event)` and optional `flush()`.
+
+```ts
+import { runAgentLoop } from "@ljoukov/llm";
+
+const result = await runAgentLoop({
+  model: "chatgpt-gpt-5.3-codex",
+  input: "Summarize the report and update output JSON files.",
+  filesystemTool: true,
+  telemetry: {
+    includeLlmStreamEvents: false, // enable only if you need token/delta event fan-out
+    sink: {
+      emit: (event) => {
+        // Forward to your backend (Cloud Logging, OpenTelemetry, Datadog, etc.)
+        // event.type: "agent.run.started" | "agent.run.stream" | "agent.run.completed"
+        // event carries runId, parentRunId, depth, model, timestamp + payload
+      },
+      flush: async () => {
+        // Optional: flush buffered telemetry on run completion.
+      },
+    },
+  },
+});
+```
+
+Telemetry emits parent/child run correlation (`runId` + `parentRunId`) for subagents.
+See `docs/agent-telemetry.md` for event schema, design rationale, and backend adapter guidance.

 If you need exact control over tool definitions, build the filesystem toolset yourself and call `runToolLoop()` directly.
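The telemetry sink contract documented above (`emit(event)` plus optional `flush()`) is small enough to adapt to most backends. As one hypothetical adapter (the batching behavior is an assumption; only the sink shape and the event fields come from the diff), a buffering sink might collect events and write them in one batch on flush:

```typescript
// Hypothetical buffering adapter for the documented sink shape.
// Field names (type, runId, parentRunId, timestamp) mirror those listed
// in the diff; the batching-to-a-writer design is assumed.
type AgentTelemetryEvent = {
  type: string; // e.g. "agent.run.started"
  runId: string;
  parentRunId?: string;
  timestamp: number;
};

function createBufferingSink(write: (batch: AgentTelemetryEvent[]) => void) {
  const buffer: AgentTelemetryEvent[] = [];
  return {
    emit(event: AgentTelemetryEvent): void {
      buffer.push(event); // cheap, synchronous: safe to call per event
    },
    async flush(): Promise<void> {
      if (buffer.length > 0) {
        write(buffer.splice(0, buffer.length)); // drain and hand off one batch
      }
    },
  };
}

const batches: AgentTelemetryEvent[][] = [];
const sink = createBufferingSink((batch) => batches.push(batch));
sink.emit({ type: "agent.run.started", runId: "r1", timestamp: Date.now() });
sink.emit({ type: "agent.run.completed", runId: "r1", timestamp: Date.now() });
await sink.flush();
console.log(batches[0].length); // 2
```

An object of this shape could be passed as `telemetry.sink`; the `write` callback is where a real adapter would call its logging SDK.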
@@ -554,6 +719,15 @@ npm run bench:agent:estimate

 See `benchmarks/agent/README.md` for options and output format.

+## Examples
+
+Interactive CLI chat with mid-run steering, thought streaming, filesystem tools rooted at
+the current directory, subagents enabled, and `Esc` interrupt support:
+
+```bash
+npm run example:cli-chat
+```

 ## License

 MIT