@lightining/general.ai 1.0.0 → 1.1.0-beta.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,11 +1,15 @@
- <div align="center">
-
  # General.AI

- **Production-ready, TypeScript-first OpenAI orchestration for Node and Bun**
+ Beta-stage, TypeScript-first OpenAI-compatible orchestration runtime for Node and Bun.
+
+ Use `native` when you want exact SDK behavior.
+ Use `agent` when you want protocol-guided orchestration, tools, subagents, retries, context management, and cleaned output.
+
+ General.AI is not a thin wrapper. It is a protocol-guided orchestration runtime designed to make model behavior more stable and controllable.

- Native OpenAI passthrough when you want exact SDK behavior.
- An agent runtime when you want prompts, protocol parsing, tools, subagents, safety, memory, retries, and cleaned output.
+ Tested heavily on NVIDIA-compatible OpenAI-style endpoints. Broader provider validation is in progress.
+
+ > This README follows the current beta track of General.AI. If you are on the stable `latest` channel, newer capabilities such as context management/compression, structured checkpoints, parallel action batching, and `classic_v2` compatibility may not be available yet. Use the beta install instructions below when you want the features called out in the Beta Changelog.

  [![npm version](https://img.shields.io/npm/v/@lightining/general.ai?color=cb3837&label=npm)](https://npmjs.com/package/@lightining/general.ai)
  [![npm downloads](https://img.shields.io/npm/dm/@lightining/general.ai)](https://npmjs.com/package/@lightining/general.ai)
@@ -13,50 +17,55 @@ An agent runtime when you want prompts, protocol parsing, tools, subagents, safe
  [![Bun >=1.1](https://img.shields.io/badge/bun-%3E%3D1.1-000000)](https://bun.sh/)
  [![License: Apache-2.0](https://img.shields.io/badge/license-Apache%202.0-blue)](./LICENSE)

- [npm](https://npmjs.com/package/@lightining/general.ai) • [GitHub](https://github.com/nixaut-codelabs/general.ai)
-
- </div>
-
- ---
-
- ## What General.AI Is
-
- `@lightining/general.ai` exposes **two complementary surfaces**:
-
- - `native`: exact OpenAI SDK access with no request, response, or stream-shape mutation
- - `agent`: a structured orchestration runtime that layers prompt assembly, protocol parsing, retries, tools, subagents, safety, memory, streaming, and cleaned output on top of OpenAI models
-
- This split is intentional:
-
- - use **`native`** when you want raw provider behavior
- - use **`agent`** when you want a consistent runtime with higher-level orchestration
-
- > General.AI’s bundled prompts are written in English for consistency, but user-visible output still mirrors the user’s language unless the user explicitly asks for another one.
-
- ---
+ - npm: <https://npmjs.com/package/@lightining/general.ai>
+ - GitHub: <https://github.com/nixaut-codelabs/general.ai>

  ## Table Of Contents

- - [Install](#install)
  - [Why General.AI](#why-generalai)
- - [Feature Matrix](#feature-matrix)
+ - [Why Use It](#why-use-it)
+ - [Install](#install)
+ - [Beta Install](#beta-install)
  - [Quick Start](#quick-start)
- - [Native Surface](#native-surface)
- - [Agent Surface](#agent-surface)
+ - [Killer Demo](#killer-demo)
+ - [Native And Agent](#native-and-agent)
+ - [Compatibility Profiles](#compatibility-profiles)
  - [Tools](#tools)
  - [Subagents](#subagents)
- - [Prompt Packs And Overrides](#prompt-packs-and-overrides)
- - [Thinking, Safety, Personality, Memory](#thinking-safety-personality-memory)
+ - [Thinking, Safety, And Context](#thinking-safety-and-context)
+ - [Observability](#observability)
+ - [Prompt Overrides](#prompt-overrides)
  - [Streaming](#streaming)
- - [Compatibility Mode](#compatibility-mode)
- - [Protocol](#protocol)
- - [Examples](#examples)
  - [Testing](#testing)
- - [Publishing](#publishing)
+ - [Beta Changelog](#beta-changelog)
  - [Package Notes](#package-notes)
  - [License](#license)

- ---
+ ## Why General.AI
+
+ Most projects end up in one of two bad places:
+
+ - they stay very close to the raw provider API and rebuild orchestration from scratch
+ - or they use a wrapper that hides too much and makes advanced provider behavior harder to reach
+
+ General.AI tries to sit in the middle:
+
+ - `native` keeps the OpenAI client shape intact
+ - `agent` adds a controllable orchestration runtime on top
+
+ That means you can stay close to the transport layer when you want, and move up to a higher-level runtime when you need more stability, structure, and visibility.
+
+ ## Why Use It
+
+ Use General.AI when you want:
+
+ - more stable behavior from smaller or inconsistent models
+ - a protocol-guided runtime instead of ad hoc prompt glue
+ - tools, subagents, retries, cleaned output, and context handling in one place
+ - visibility into why the runtime called a tool, opened a subagent, or compacted context
+ - direct access to OpenAI-compatible APIs without losing provider-native escape hatches
+
+ Do not use it if all you want is a very thin helper around the OpenAI SDK. In that case, stay on `native`.

  ## Install

@@ -70,56 +79,60 @@ or:
  bun add @lightining/general.ai openai
  ```

- **Runtime targets**
+ Runtime targets:

  - Node `>=22`
  - Bun `>=1.1.0`

- General.AI is **ESM-only**.
+ General.AI is ESM-only.

- ---
+ ## Beta Install

- ## Why General.AI
+ If you want the current beta track:

- Most wrappers do one of two things badly:
+ ```bash
+ npm install @lightining/general.ai@beta openai
+ ```
+
+ or:

- - they hide the provider too much and make advanced OpenAI features harder to reach
- - or they stay so thin that you still have to rebuild orchestration yourself
+ ```bash
+ bun add @lightining/general.ai@beta openai
+ ```

- General.AI is designed to avoid both failures.
+ Channel guide:

- ### Design goals
+ - `latest`: slower-moving stable channel
+ - `beta`: newest runtime features, compatibility work, and beta-only capabilities documented in this README

- - **No lock-in at the transport layer**: `native` exposes the injected OpenAI client exactly
- - **Strong orchestration defaults**: `agent` ships with an opinionated runtime and robust prompts
- - **TypeScript-first**: public types are shipped from `dist/*.d.ts`
- - **OpenAI-first but provider-friendly**: supports official OpenAI and OpenAI-compatible providers
- - **Operationally pragmatic**: retries, parser tolerance, compatibility modes, tool gating, memory, and streaming are already built in
+ If you only want the stable channel, stay on `latest`.

- ---
+ ## Quick Start

- ## Feature Matrix
+ ### Simple Start

- | Capability | `native` | `agent` |
- | --- | --- | --- |
- | Exact OpenAI SDK shapes | Yes | No, returns General.AI runtime results |
- | `responses` endpoint | Yes | Yes |
- | `chat.completions` endpoint | Yes | Yes |
- | Streaming | Yes, exact provider events | Yes, parsed runtime events + cleaned deltas |
- | Prompt assembly | No | Yes |
- | Protocol parsing | No | Yes |
- | Cleaned user-visible output | No | Yes |
- | Tool loop | Provider-native only | Yes, protocol-driven |
- | Subagents | No | Yes |
- | Safety markers | No | Yes |
- | Thinking checkpoints | No | Yes |
- | Memory adapter | No | Yes |
- | Retry on malformed protocol / execution failures | No | Yes |
- | Compatibility mode for classic providers | N/A | Yes |
+ ```ts
+ import OpenAI from "openai";
+ import { GeneralAI } from "@lightining/general.ai";

- ---
+ const openai = new OpenAI({
+   apiKey: process.env.OPENAI_API_KEY,
+ });

- ## Quick Start
+ const generalAI = new GeneralAI({ openai });
+
+ const result = await generalAI.agent.generate({
+   endpoint: "chat_completions",
+   model: "gpt-5.4-mini",
+   messages: [
+     { role: "user", content: "Say hello briefly in Turkish." },
+   ],
+ });
+
+ console.log(result.cleaned);
+ ```
+
+ ### Advanced Start

  ```ts
  import OpenAI from "openai";
@@ -132,24 +145,26 @@ const openai = new OpenAI({
  const generalAI = new GeneralAI({ openai });

  const result = await generalAI.agent.generate({
-   endpoint: "responses",
+   endpoint: "chat_completions",
    model: "gpt-5.4-mini",
    messages: [
-     { role: "user", content: "Explain prompt caching briefly." },
+     { role: "user", content: "Say hello briefly in Turkish." },
    ],
+   compatibility: {
+     profile: "classic_v2",
+   },
  });

  console.log(result.cleaned);
- console.log(result.events);
- console.log(result.usage);
+ console.log(result.meta.warnings);
  ```

- ### Returned shape
+ Returned shape:

  ```ts
  type GeneralAIAgentResult = {
-   output: string; // full raw protocol output
-   cleaned: string; // only writing blocks
+   output: string;
+   cleaned: string;
    events: ProtocolEvent[];
    meta: {
      warnings: string[];
@@ -159,6 +174,9 @@ type GeneralAIAgentResult = {
      toolCallCount: number;
      subagentCallCount: number;
      protocolErrorCount: number;
+     contextOperations: string[];
+     contextSummaryCount: number;
+     contextDropCount: number;
      memorySessionId?: string;
      endpointResults: unknown[];
    };
@@ -173,22 +191,57 @@ type GeneralAIAgentResult = {
  };
  ```

- ---
-
- ## Native Surface
+ ## Killer Demo

- Use the native surface when you want **exact OpenAI SDK behavior**.
+ This is the kind of call where General.AI starts to feel different from a thin wrapper:

  ```ts
- import OpenAI from "openai";
- import { GeneralAI } from "@lightining/general.ai";
-
- const openai = new OpenAI({
-   apiKey: process.env.OPENAI_API_KEY,
+ const result = await generalAI.agent.generate({
+   endpoint: "chat_completions",
+   model: "gpt-5.4-mini",
+   messages: [
+     {
+       role: "user",
+       content: "Use tools if needed, delegate arithmetic to a subagent if useful, and give me a short final answer.",
+     },
+   ],
+   compatibility: {
+     profile: "classic_v2",
+   },
+   tools: {
+     registry: [weatherTool, calculatorTool],
+   },
+   subagents: {
+     registry: [mathHelper],
+   },
+   context: {
+     mode: "auto",
+     strategy: "hybrid",
+   },
  });

- const generalAI = new GeneralAI({ openai });
+ console.log(result.cleaned);
+ console.log(result.meta.contextOperations);
+ console.log(result.meta.warnings);
+ ```

+ In one runtime call, General.AI can:
+
+ - call one or more tools
+ - delegate to one or more subagents
+ - retry after malformed protocol output
+ - summarize or drop older context
+ - return cleaned user-visible output separately from raw protocol output
+
+ ## Native And Agent
+
+ General.AI exposes two surfaces.
+
+ ### `native`
+
+ Use `native` when you want exact OpenAI SDK behavior.
+
+ ```ts
  const response = await generalAI.native.responses.create({
    model: "gpt-5.4-mini",
    input: "Give a one-sentence explanation of prompt caching.",
@@ -200,88 +253,67 @@ const completion = await generalAI.native.chat.completions.create({
      { role: "user", content: "Say hello in one sentence." },
    ],
  });
-
- console.log(response.output_text);
- console.log(completion.choices[0]?.message?.content ?? "");
  ```

- ### Why this matters
+ This keeps:

- - request bodies stay OpenAI-native
- - response objects stay OpenAI-native
- - stream events stay OpenAI-native
- - advanced provider parameters stay available exactly where the SDK supports them
+ - request bodies OpenAI-native
+ - response objects OpenAI-native
+ - stream events OpenAI-native

- This is the right surface when you need:
+ ### `agent`

- - exact built-in OpenAI tool behavior
- - exact stream event handling
- - structured outputs or advanced endpoint fields without wrapper interpretation
- - minimal abstraction
+ Use `agent` when you want runtime orchestration.

- ---
+ The agent runtime can:

- ## Agent Surface
+ - assemble layered prompts
+ - enforce a structured text protocol
+ - parse runtime events from model output
+ - retry recoverable protocol failures
+ - call tools and subagents
+ - maintain optional memory
+ - summarize or drop old context before the model hits its limit
+ - return both raw protocol output and cleaned user-visible output

- Use the agent surface when you want **runtime orchestration** rather than raw provider behavior.
+ ## Compatibility Profiles

- ```ts
- const result = await generalAI.agent.generate({
-   endpoint: "chat_completions",
-   model: "gpt-5.4-mini",
-   messages: [
-     { role: "user", content: "Introduce yourself briefly." },
-   ],
-   compatibility: {
-     chatRoleMode: "classic",
-   },
- });
+ Some OpenAI-compatible providers are stricter than others about message roles and continuation shaping.

- console.log(result.cleaned);
+ General.AI supports compatibility profiles:
+
+ - `modern`
+ - `classic`
+ - `classic_v2`
+ - `auto`
+
+ Example:
+
+ ```ts
+ compatibility: {
+   profile: "classic_v2",
+ }
  ```

- ### Agent responsibilities
+ What they mean:

- - assemble a strong internal prompt stack
- - drive a strict protocol
- - parse runtime events from model output
- - retry recoverable protocol/execution failures
- - execute tools and subagents
- - maintain optional memory
- - return both raw protocol and cleaned output
-
- ### Core agent parameters
-
- | Field | Required | Description |
- | --- | --- | --- |
- | `endpoint` | Yes | `"responses"` or `"chat_completions"` |
- | `model` | Yes | Provider model name |
- | `messages` | Yes | Normalized conversation array |
- | `personality` | No | Persona, style, behavior, boundaries, prompt text |
- | `safety` | No | Input/output safety behavior |
- | `thinking` | No | Checkpointed thinking strategy |
- | `tools` | No | Runtime tool registry |
- | `subagents` | No | Delegated specialist registry |
- | `memory` | No | Session memory adapter config |
- | `prompts` | No | Prompt section overrides |
- | `limits` | No | Step/tool/subagent/protocol error limits |
- | `request` | No | Endpoint-native OpenAI pass-through values |
- | `compatibility` | No | Provider compatibility knobs such as classic chat role mode |
- | `metadata` | No | Extra metadata for prompt/task context |
- | `debug` | No | Enable debug-oriented prompt/runtime behavior |
-
- ---
+ - `modern`: modern OpenAI-style behavior
+ - `classic`: safer classic `system` / `user` / `assistant` shaping
+ - `classic_v2`: stricter provider-safe continuation shaping for gateways that dislike late system-style messages
+ - `auto`: currently resolves to `modern` unless explicitly overridden
+
+ If you are using stricter compatible gateways, `classic_v2` is the safest place to start.

  ## Tools

- General.AI tools are **runtime-defined JavaScript functions** triggered by protocol markers.
+ General.AI tools are runtime-defined JavaScript functions triggered by protocol markers.

  ```ts
  import { defineTool } from "@lightining/general.ai";

  const echoTool = defineTool({
    name: "echo",
-   description: "Echo a string back for runtime testing.",
+   description: "Echo text back for runtime testing.",
    inputSchema: {
      type: "object",
      additionalProperties: false,
@@ -296,13 +328,11 @@ const echoTool = defineTool({
  });
  ```

- ### Tool access policy
+ Tool access can be scoped:

- You can explicitly decide whether a tool is callable:
-
- - from the root agent
- - from all subagents
- - from selected subagents only
+ - root only
+ - all subagents
+ - selected subagents only

  ```ts
  const rootOnlyTool = defineTool({
@@ -315,40 +345,13 @@ const rootOnlyTool = defineTool({
      return { ok: true };
    },
  });
-
- const mathOnlyTool = defineTool({
-   name: "math_only",
-   description: "Only callable from the math_helper subagent.",
-   access: {
-     subagents: ["math_helper"],
-   },
-   async execute() {
-     return { ok: true };
-   },
- });
  ```

- ### Built-in helper
-
- General.AI also ships a helper for OpenAI web search via Responses:
-
- ```ts
- import OpenAI from "openai";
- import { createOpenAIWebSearchTool } from "@lightining/general.ai";
-
- const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
-
- const webSearch = createOpenAIWebSearchTool({
-   openai,
-   model: "gpt-5.4-mini",
- });
- ```
-
- ---
+ The runtime also supports multiple tool calls in the same step, with configurable parallel limits.

  ## Subagents

- Subagents are **bounded delegated General.AI runs** with their own instructions, model, limits, safety, and tool access.
+ Subagents are bounded delegated General.AI runs with their own config.

  ```ts
  import { defineSubagent } from "@lightining/general.ai";
@@ -356,123 +359,58 @@ import { defineSubagent } from "@lightining/general.ai";
  const mathHelper = defineSubagent({
    name: "math_helper",
    description: "A precise arithmetic specialist.",
-   instructions: [
-     "Solve delegated arithmetic carefully.",
-     "Return a concise answer.",
-     "Do not call nested subagents unless explicitly required.",
-   ].join(" "),
- });
- ```
-
- Use them in a run:
-
- ```ts
- const result = await generalAI.agent.generate({
-   endpoint: "chat_completions",
+   instructions: "Solve delegated arithmetic carefully and return a concise answer.",
    model: "gpt-5.4-mini",
-   messages: [
-     {
-       role: "system",
-       content: "Delegate arithmetic work to the available subagent when useful.",
-     },
-     {
-       role: "user",
-       content: "What is 17 multiplied by 23?",
-     },
-   ],
-   subagents: {
-     registry: [mathHelper],
-   },
-   compatibility: {
-     chatRoleMode: "classic",
-   },
- });
- ```
-
- ### What the runtime already handles for you
-
- - subagent instructions are automatically injected
- - subagents inherit compatibility mode
- - nested subagents can be disabled
- - tool visibility can be filtered per subagent
- - recoverable subagent execution failures can trigger retries
-
- ---
-
- ## Prompt Packs And Overrides
-
- General.AI renders a layered prompt stack in this order:
-
- 1. identity
- 2. endpoint adapter rules
- 3. protocol
- 4. safety
- 5. personality
- 6. thinking
- 7. tools and subagents
- 8. memory
- 9. task context
-
- Bundled prompts live in `prompts/*.txt`.
-
- ### Override a section
-
- ```ts
- const prompt = await generalAI.agent.renderPrompts({
-   endpoint: "responses",
-   model: "gpt-5.4-mini",
-   messages: [{ role: "user", content: "Hello" }],
-   prompts: {
-     sections: {
-       task: "Task override.\n{block:task_context}",
+   request: {
+     chat_completions: {
+       temperature: 0.1,
      },
    },
  });
  ```

- ### Placeholders
-
- - `{data:key}` for scalar values
- - `{block:key}` for multiline blocks
-
- ### Raw prompt overrides
+ Subagents can override:

- ```ts
- prompts: {
-   raw: {
-     prepend: "Extra preamble",
-     append: "Extra appendix",
-     replace: "Replace the full rendered prompt entirely",
-   },
- }
- ```
+ - `endpoint`
+ - `model`
+ - `request`
+ - `personality`
+ - `safety`
+ - `thinking`
+ - `context`
+ - `prompts`
+ - `limits`
+ - `tools`
+ - `subagents`
+ - `compatibility`
+ - `memory`

- ---
+ They can also participate in parallel action batches.

- ## Thinking, Safety, Personality, Memory
+ ## Thinking, Safety, And Context

  These systems are separate on purpose.

  ### Thinking

- Thinking defaults to a checkpointed strategy in agent mode.
-
  ```ts
  thinking: {
    enabled: true,
+   mode: "hybrid",
    strategy: "checkpointed",
+   checkpointFormat: "structured",
    effort: "high",
-   checkpoints: [
-     "Before the first writing block",
-     "After each tool result",
-     "Before final completion",
-   ],
  }
  ```

- ### Safety
+ Available thinking modes:
+
+ - `none`
+ - `inline`
+ - `orchestrated`
+ - `hybrid`

- Safety is configured independently for input and output.
+ ### Safety

  ```ts
  safety: {
@@ -480,61 +418,117 @@ safety: {
    mode: "balanced",
    input: {
      enabled: true,
-     instructions: "Inspect the user request carefully.",
    },
    output: {
      enabled: true,
-     instructions: "Inspect the final answer before completion.",
    },
  }
  ```

- ### Personality
+ Safety runs inside the agent protocol instead of forcing separate moderation-style API calls for every step.
+
+ ### Context Management

  ```ts
- personality: {
+ context: {
    enabled: true,
-   profile: "direct_technical",
-   persona: { honesty: "high" },
-   style: { verbosity: "medium", tone: "direct" },
-   behavior: { avoid_sycophancy: true },
-   boundaries: { insult_user: false },
-   instructions: "Be clear, direct, and technically precise.",
+   mode: "auto",
+   strategy: "hybrid",
+   trigger: {
+     contextRatio: 0.9,
+   },
  }
  ```

- ### Memory
+ Supported context strategies:

- General.AI ships with `InMemoryMemoryAdapter`, and you can inject your own adapter.
+ - `summarize`
+ - `drop_oldest`
+ - `drop_nonessential`
+ - `hybrid`

- ```ts
- import { GeneralAI, InMemoryMemoryAdapter } from "@lightining/general.ai";
+ Supported modes:

- const memoryAdapter = new InMemoryMemoryAdapter();
- const generalAI = new GeneralAI({ openai, memoryAdapter });
+ - `off`
+ - `auto`
+ - `manual`
+ - `hybrid`

- await generalAI.agent.generate({
-   endpoint: "chat_completions",
+ This is runtime-managed context control. It is not a built-in provider compression feature.
+
+ ## Observability
+
+ General.AI is designed to be inspectable.
+
+ You can already inspect:
+
+ - parsed protocol events
+ - warnings and retry reasons
+ - cleaned output and raw protocol output
+ - tool and subagent counts
+ - prompt rendering output
+ - context compaction operations
+ - endpoint result history
+
+ This helps answer questions like:
+
+ - why did it call a tool?
+ - why did it open a subagent?
+ - why did it summarize or drop old messages?
+ - why did it retry after malformed model output?
+
+ ## Prompt Overrides
+
+ General.AI renders a layered prompt stack in this order:
+
+ 1. identity
+ 2. endpoint adapter rules
+ 3. protocol
+ 4. safety
+ 5. personality
+ 6. thinking
+ 7. tools and subagents
+ 8. memory
+ 9. task context
+
+ Bundled prompts live in `prompts/*.txt`.
+
+ Prompt placeholders:
+
+ - `{data:key}` for scalar values
+ - `{block:key}` for multiline blocks
+
+ Example:
+
+ ```ts
+ const prompt = await generalAI.agent.renderPrompts({
+   endpoint: "responses",
    model: "gpt-5.4-mini",
-   messages: [{ role: "user", content: "Remember this preference." }],
-   memory: {
-     enabled: true,
-     sessionId: "user-123",
+   messages: [{ role: "user", content: "Hello" }],
+   prompts: {
+     sections: {
+       task: "Task override.\n{block:task_context}",
+     },
    },
  });
  ```

- ---
-
- ## Streaming
-
- ### Native streaming
+ Raw overrides are also supported:

- Use the OpenAI SDK directly through `native` when you want exact provider stream events.
+ ```ts
+ prompts: {
+   raw: {
+     prepend: "Extra preamble",
+     append: "Extra appendix",
+     replace: "Replace the entire rendered prompt",
+   },
+ }
+ ```

- ### Agent streaming
+ ## Streaming

- Use `agent.stream()` when you want parsed runtime events and cleaned writing deltas.
+ Use `native` for exact provider stream events.
+ Use `agent.stream()` for parsed runtime events and cleaned writing deltas.

  ```ts
  const stream = generalAI.agent.stream({
@@ -550,7 +544,7 @@ for await (const event of stream) {
  }
  ```

- Typical stream events include:
+ Common stream events:

  - `run_started`
  - `prompt_rendered`
@@ -558,161 +552,49 @@ Typical stream events include:
  - `raw_text_delta`
  - `writing_delta`
  - `protocol_event`
+ - `batch_started`
  - `tool_started`
  - `tool_result`
  - `subagent_started`
  - `subagent_result`
+ - `context_compacted`
  - `warning`
  - `run_completed`

- ---
-
- ## Compatibility Mode
-
- Some OpenAI-compatible providers do not fully support newer chat roles such as `developer`.
-
- For those providers, use:
-
- ```ts
- compatibility: {
-   chatRoleMode: "classic",
- }
- ```
-
- This enables safer continuation behavior for providers that expect classic `system` / `user` / `assistant` flows.
-
- This is especially useful with:
-
- - older compatible gateways
- - NVIDIA-style OpenAI-compatible endpoints
- - providers that reject post-assistant `system` or `developer` messages
-
- ---
-
- ## Protocol
-
- General.AI’s agent runtime uses a text protocol based on triple-bracket markers.
-
- ### Common markers
-
- - `[[[status:thinking]]]`
- - `[[[status:writing]]]`
- - `[[[status:input_safety:{...}]]]`
- - `[[[status:output_safety:{...}]]]`
- - `[[[status:call_tool:"name":{...}]]]`
- - `[[[status:call_subagent:"name":{...}]]]`
- - `[[[status:checkpoint]]]`
- - `[[[status:revise]]]`
- - `[[[status:error:{...}]]]`
- - `[[[status:done]]]`
-
- ### Important runtime rule
-
- Only `writing` blocks survive into `result.cleaned`.
-
- That means:
-
- - `thinking` is runtime-only
- - safety markers are runtime-only
- - tool and subagent markers are runtime-only
- - `cleaned` is the user-facing answer
-
- ### Parser behavior
-
- The parser is intentionally tolerant of real-world model behavior:
-
- - block-style JSON markers are supported
- - one-missing-bracket marker near-misses are tolerated
- - inline marker runs can be normalized onto separate lines
- - malformed protocol can trigger automatic retries up to `limits.maxProtocolErrors`
-
- ---
-
- ## Advanced OpenAI Pass-Through
-
- The `agent` surface owns the orchestration keys, but endpoint-native extra parameters still pass through via:
-
- - `request.responses`
- - `request.chat_completions`
-
- Example:
-
- ```ts
- const result = await generalAI.agent.generate({
-   endpoint: "responses",
-   model: "gpt-5.4-mini",
-   messages: [{ role: "user", content: "Summarize this." }],
-   request: {
-     responses: {
-       prompt_cache_key: "summary:v1",
-       reasoning: { effort: "medium" },
-       service_tier: "auto",
-       store: false,
-       background: false,
-     },
-   },
- });
- ```
-
- Reserved keys that would break agent orchestration, such as `input`, `messages`, or native tool transport fields, are stripped and reported in `result.meta.strippedRequestKeys`.
-
- ---
-
- ## Examples
-
- Included examples:
-
- - [examples/native-chat.mjs](./examples/native-chat.mjs)
- - [examples/native-responses.mjs](./examples/native-responses.mjs)
- - [examples/agent-basic.mjs](./examples/agent-basic.mjs)
-
- Run an example:
-
- ```bash
- npm run build
- node examples/native-chat.mjs
- ```
-
- ---
+ The streaming path also includes recovery for malformed protocol output from real models.

  ## Testing

- ### Deterministic test suite
+ Deterministic tests:

  ```bash
  npm test
  ```

- This runs:
-
- - build
- - unit and runtime integration tests in `test/**/*.test.js`
-
- ### Cross-runtime smoke tests
+ Cross-runtime smoke tests:

  ```bash
  npm run smoke
  ```

- ### Full public-surface and live smoke script
+ Manual public-surface walkthrough:

  ```bash
  bun run test.js
  ```

- The root [test.js](./test.js) is a comprehensive manual verification script that covers:
+ `test.js` can also exercise optional live provider checks when environment variables are set. It covers:

- - deterministic API surface checks with fake clients
- - parser behavior
- - prompt rendering
- - memory
- - tool gating
- - subagent execution
- - retry behavior
+ - native chat
+ - agent protocol generation
+ - parallel tool batching
+ - subagent delegation
+ - orchestrated thinking
+ - context summarization
+ - context dropping
  - streaming
- - live provider smoke tests

- #### Useful environment variables
+ Useful environment variables:

  ```bash
  GENERAL_AI_API_KEY=...
@@ -721,88 +603,56 @@ GENERAL_AI_MODEL=...
  GENERAL_AI_SKIP_LIVE=1
  ```

- If `GENERAL_AI_SKIP_LIVE=1` is set, `test.js` skips live provider checks.
-
- ---
-
- ## Publishing
+ If `GENERAL_AI_SKIP_LIVE=1` is set, the broader manual scripts skip live provider checks.

- The package is configured for production publishing with:
+ ## Beta Changelog

- - repository metadata
- - homepage and issue tracker links
- - Apache-2.0 license file
- - ESM entrypoints and declaration files
- - `sideEffects: false`
- - `prepublishOnly` checks
- - `publishConfig.provenance`
-
- ### Publish pipeline
+ Install the beta channel:

  ```bash
- npm test
- npm run smoke
- npm run pack:check
- npm publish
+ npm install @lightining/general.ai@beta openai
  ```

- Or rely on:
+ or:

  ```bash
- npm publish
+ bun add @lightining/general.ai@beta openai
  ```

- because `prepublishOnly` already runs:
+ Current beta highlights:

- - `npm test`
- - `npm run smoke`
- - `npm run pack:check`
+ - parallel tool and subagent action batching
+ - subagent-specific models and endpoint request parameters
+ - thinking modes: `inline`, `orchestrated`, `hybrid`
+ - structured `checkpoint` and `revise` support
+ - context management with summarize / drop / hybrid strategies
+ - stronger streaming fallback and retry behavior
+ - compatibility profiles including `classic_v2`

- ### Inspect the tarball
+ Features listed above are beta-track features. If you install `@lightining/general.ai` without `@beta`, you may be on an older stable release that does not include all of them yet.

- ```bash
- npm pack --dry-run
- ```
+ Beta reality check:

- ---
+ - protocol compliance still depends on model quality
+ - some providers are stricter than others about message shaping
+ - broader provider validation is still in progress

  ## Package Notes

- ### Internal prompt language
-
- Bundled prompts are English by default for consistency across providers and prompt packs.
+ Bundled prompts are written in English for consistency, but user-visible output still follows the user’s language unless they explicitly ask for another one.

- ### User-facing language
+ General.AI is ESM-only.

- The assistant should still answer in the user’s language unless the user explicitly asks for another language.
+ The current SDK baseline is `openai@^6.33.0`.

- ### ESM-only package
-
- Use `import`, not `require`.
-
- ### OpenAI SDK baseline
-
- General.AI currently targets the installed OpenAI Node SDK family represented by `openai@^6.33.0`.
-
- ### Production scope
-
- General.AI is built for:
+ General.AI beta is aimed at:

  - app backends
  - internal LLM runtimes
  - tool and subagent orchestration layers
- - OpenAI and OpenAI-compatible provider integrations
-
- It is **not** intended as a browser bundle.
-
- ---
-
- ## Links
-
- - npm: [npmjs.com/package/@lightining/general.ai](https://npmjs.com/package/@lightining/general.ai)
- - GitHub: [github.com/nixaut-codelabs/general.ai](https://github.com/nixaut-codelabs/general.ai)
+ - OpenAI-compatible provider integrations

- ---
+ It is not intended as a browser bundle.

  ## License