@lightining/general.ai 1.0.0 → 1.1.0-beta.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,11 +1,15 @@
- <div align="center">
-
  # General.AI

- **Production-ready, TypeScript-first OpenAI orchestration for Node and Bun**
+ Beta-stage, TypeScript-first OpenAI-compatible orchestration runtime for Node and Bun.
+
+ Use `native` when you want exact SDK behavior.
+ Use `agent` when you want protocol-guided orchestration, tools, subagents, retries, context management, and cleaned output.
+
+ General.AI is not a thin wrapper. It is a protocol-guided orchestration runtime designed to make model behavior more stable and controllable.

- Native OpenAI passthrough when you want exact SDK behavior.
- An agent runtime when you want prompts, protocol parsing, tools, subagents, safety, memory, retries, and cleaned output.
+ Tested heavily on NVIDIA-compatible OpenAI-style endpoints. Broader provider validation is in progress.
+
+ > This README follows the current beta track of General.AI. If you are on the stable `latest` channel, newer capabilities such as context management/compression, structured checkpoints, parallel action batching, and `classic_v2` compatibility may not be available yet. Use the beta install instructions below when you want the features called out in the Beta Changelog.

  [![npm version](https://img.shields.io/npm/v/@lightining/general.ai?color=cb3837&label=npm)](https://npmjs.com/package/@lightining/general.ai)
  [![npm downloads](https://img.shields.io/npm/dm/@lightining/general.ai)](https://npmjs.com/package/@lightining/general.ai)
@@ -13,50 +17,55 @@ An agent runtime when you want prompts, protocol parsing, tools, subagents, safe
  [![Bun >=1.1](https://img.shields.io/badge/bun-%3E%3D1.1-000000)](https://bun.sh/)
  [![License: Apache-2.0](https://img.shields.io/badge/license-Apache%202.0-blue)](./LICENSE)

- [npm](https://npmjs.com/package/@lightining/general.ai) • [GitHub](https://github.com/nixaut-codelabs/general.ai)
-
- </div>
-
- ---
-
- ## What General.AI Is
-
- `@lightining/general.ai` exposes **two complementary surfaces**:
-
- - `native`: exact OpenAI SDK access with no request, response, or stream-shape mutation
- - `agent`: a structured orchestration runtime that layers prompt assembly, protocol parsing, retries, tools, subagents, safety, memory, streaming, and cleaned output on top of OpenAI models
-
- This split is intentional:
-
- - use **`native`** when you want raw provider behavior
- - use **`agent`** when you want a consistent runtime with higher-level orchestration
-
- > General.AI’s bundled prompts are written in English for consistency, but user-visible output still mirrors the user’s language unless the user explicitly asks for another one.
-
- ---
+ - npm: <https://npmjs.com/package/@lightining/general.ai>
+ - GitHub: <https://github.com/nixaut-codelabs/general.ai>

  ## Table Of Contents

- - [Install](#install)
  - [Why General.AI](#why-generalai)
- - [Feature Matrix](#feature-matrix)
+ - [Why Use It](#why-use-it)
+ - [Install](#install)
+ - [Beta Install](#beta-install)
  - [Quick Start](#quick-start)
- - [Native Surface](#native-surface)
- - [Agent Surface](#agent-surface)
+ - [Killer Demo](#killer-demo)
+ - [Native And Agent](#native-and-agent)
+ - [Compatibility Profiles](#compatibility-profiles)
  - [Tools](#tools)
  - [Subagents](#subagents)
- - [Prompt Packs And Overrides](#prompt-packs-and-overrides)
- - [Thinking, Safety, Personality, Memory](#thinking-safety-personality-memory)
+ - [Thinking, Safety, And Context](#thinking-safety-and-context)
+ - [Observability](#observability)
+ - [Prompt Overrides](#prompt-overrides)
  - [Streaming](#streaming)
- - [Compatibility Mode](#compatibility-mode)
- - [Protocol](#protocol)
- - [Examples](#examples)
  - [Testing](#testing)
- - [Publishing](#publishing)
+ - [Beta Changelog](#beta-changelog)
  - [Package Notes](#package-notes)
  - [License](#license)

- ---
+ ## Why General.AI
+
+ Most projects end up in one of two bad places:
+
+ - they stay very close to the raw provider API and rebuild orchestration from scratch
+ - or they use a wrapper that hides too much and makes advanced provider behavior harder to reach
+
+ General.AI tries to sit in the middle:
+
+ - `native` keeps the OpenAI client shape intact
+ - `agent` adds a controllable orchestration runtime on top
+
+ That means you can stay close to the transport layer when you want, and move up to a higher-level runtime when you need more stability, structure, and visibility.
+
+ ## Why Use It
+
+ Use General.AI when you want:
+
+ - more stable behavior from smaller or inconsistent models
+ - a protocol-guided runtime instead of ad hoc prompt glue
+ - tools, subagents, retries, cleaned output, and context handling in one place
+ - visibility into why the runtime called a tool, opened a subagent, or compacted context
+ - direct access to OpenAI-compatible APIs without losing provider-native escape hatches
+
+ Do not use it if all you want is a very thin helper around the OpenAI SDK. In that case, stay on `native`.

  ## Install

@@ -70,56 +79,60 @@ or:
  bun add @lightining/general.ai openai
  ```

- **Runtime targets**
+ Runtime targets:

  - Node `>=22`
  - Bun `>=1.1.0`

- General.AI is **ESM-only**.
+ General.AI is ESM-only.

- ---
+ ## Beta Install

- ## Why General.AI
+ If you want the current beta track:

- Most wrappers do one of two things badly:
+ ```bash
+ npm install @lightining/general.ai@beta openai
+ ```
+
+ or:

- - they hide the provider too much and make advanced OpenAI features harder to reach
- - or they stay so thin that you still have to rebuild orchestration yourself
+ ```bash
+ bun add @lightining/general.ai@beta openai
+ ```

- General.AI is designed to avoid both failures.
+ Channel guide:

- ### Design goals
+ - `latest`: slower-moving stable channel
+ - `beta`: newest runtime features, compatibility work, and beta-only capabilities documented in this README

- - **No lock-in at the transport layer**: `native` exposes the injected OpenAI client exactly
- - **Strong orchestration defaults**: `agent` ships with an opinionated runtime and robust prompts
- - **TypeScript-first**: public types are shipped from `dist/*.d.ts`
- - **OpenAI-first but provider-friendly**: supports official OpenAI and OpenAI-compatible providers
- - **Operationally pragmatic**: retries, parser tolerance, compatibility modes, tool gating, memory, and streaming are already built in
+ If you only want the stable channel, stay on `latest`.

- ---
+ ## Quick Start

- ## Feature Matrix
+ ### Simple Start

- | Capability | `native` | `agent` |
- | --- | --- | --- |
- | Exact OpenAI SDK shapes | Yes | No, returns General.AI runtime results |
- | `responses` endpoint | Yes | Yes |
- | `chat.completions` endpoint | Yes | Yes |
- | Streaming | Yes, exact provider events | Yes, parsed runtime events + cleaned deltas |
- | Prompt assembly | No | Yes |
- | Protocol parsing | No | Yes |
- | Cleaned user-visible output | No | Yes |
- | Tool loop | Provider-native only | Yes, protocol-driven |
- | Subagents | No | Yes |
- | Safety markers | No | Yes |
- | Thinking checkpoints | No | Yes |
- | Memory adapter | No | Yes |
- | Retry on malformed protocol / execution failures | No | Yes |
- | Compatibility mode for classic providers | N/A | Yes |
+ ```ts
+ import OpenAI from "openai";
+ import { GeneralAI } from "@lightining/general.ai";

- ---
+ const openai = new OpenAI({
+   apiKey: process.env.OPENAI_API_KEY,
+ });

- ## Quick Start
+ const generalAI = new GeneralAI({ openai });
+
+ const result = await generalAI.agent.generate({
+   endpoint: "chat_completions",
+   model: "gpt-5.4-mini",
+   messages: [
+     { role: "user", content: "Say hello briefly in Turkish." },
+   ],
+ });
+
+ console.log(result.cleaned);
+ ```
+
+ ### Advanced Start

  ```ts
  import OpenAI from "openai";
@@ -132,24 +145,26 @@ const openai = new OpenAI({
  const generalAI = new GeneralAI({ openai });

  const result = await generalAI.agent.generate({
-   endpoint: "responses",
+   endpoint: "chat_completions",
    model: "gpt-5.4-mini",
    messages: [
-     { role: "user", content: "Explain prompt caching briefly." },
+     { role: "user", content: "Say hello briefly in Turkish." },
    ],
+   compatibility: {
+     profile: "classic_v2",
+   },
  });

  console.log(result.cleaned);
- console.log(result.events);
- console.log(result.usage);
+ console.log(result.meta.warnings);
  ```

- ### Returned shape
+ Returned shape:

  ```ts
  type GeneralAIAgentResult = {
-   output: string; // full raw protocol output
-   cleaned: string; // only writing blocks
+   output: string;
+   cleaned: string;
    events: ProtocolEvent[];
    meta: {
      warnings: string[];
@@ -159,6 +174,9 @@ type GeneralAIAgentResult = {
      toolCallCount: number;
      subagentCallCount: number;
      protocolErrorCount: number;
+     contextOperations: string[];
+     contextSummaryCount: number;
+     contextDropCount: number;
      memorySessionId?: string;
      endpointResults: unknown[];
    };
@@ -173,22 +191,57 @@ type GeneralAIAgentResult = {
  };
  ```

- ---
-
- ## Native Surface
+ ## Killer Demo

- Use the native surface when you want **exact OpenAI SDK behavior**.
+ This is the kind of call where General.AI starts to feel different from a thin wrapper:

  ```ts
- import OpenAI from "openai";
- import { GeneralAI } from "@lightining/general.ai";
-
- const openai = new OpenAI({
-   apiKey: process.env.OPENAI_API_KEY,
+ const result = await generalAI.agent.generate({
+   endpoint: "chat_completions",
+   model: "gpt-5.4-mini",
+   messages: [
+     {
+       role: "user",
+       content: "Use tools if needed, delegate arithmetic to a subagent if useful, and give me a short final answer.",
+     },
+   ],
+   compatibility: {
+     profile: "classic_v2",
+   },
+   tools: {
+     registry: [weatherTool, calculatorTool],
+   },
+   subagents: {
+     registry: [mathHelper],
+   },
+   context: {
+     mode: "auto",
+     strategy: "hybrid",
+   },
  });

- const generalAI = new GeneralAI({ openai });
+ console.log(result.cleaned);
+ console.log(result.meta.contextOperations);
+ console.log(result.meta.warnings);
+ ```

+ In one runtime call, General.AI can:
+
+ - call one or more tools
+ - delegate to one or more subagents
+ - retry after malformed protocol output
+ - summarize or drop older context
+ - return cleaned user-visible output separately from raw protocol output
+
+ ## Native And Agent
+
+ General.AI exposes two surfaces.
+
+ ### `native`
+
+ Use `native` when you want exact OpenAI SDK behavior.
+
+ ```ts
  const response = await generalAI.native.responses.create({
    model: "gpt-5.4-mini",
    input: "Give a one-sentence explanation of prompt caching.",
@@ -200,88 +253,67 @@ const completion = await generalAI.native.chat.completions.create({
      { role: "user", content: "Say hello in one sentence." },
    ],
  });
-
- console.log(response.output_text);
- console.log(completion.choices[0]?.message?.content ?? "");
  ```

- ### Why this matters
+ This keeps:

- - request bodies stay OpenAI-native
- - response objects stay OpenAI-native
- - stream events stay OpenAI-native
- - advanced provider parameters stay available exactly where the SDK supports them
+ - request bodies OpenAI-native
+ - response objects OpenAI-native
+ - stream events OpenAI-native

- This is the right surface when you need:
+ ### `agent`

- - exact built-in OpenAI tool behavior
- - exact stream event handling
- - structured outputs or advanced endpoint fields without wrapper interpretation
- - minimal abstraction
+ Use `agent` when you want runtime orchestration.

- ---
+ The agent runtime can:

- ## Agent Surface
+ - assemble layered prompts
+ - enforce a structured text protocol
+ - parse runtime events from model output
+ - retry recoverable protocol failures
+ - call tools and subagents
+ - maintain optional memory
+ - summarize or drop old context before the model hits its limit
+ - return both raw protocol output and cleaned user-visible output

- Use the agent surface when you want **runtime orchestration** rather than raw provider behavior.
+ ## Compatibility Profiles

- ```ts
- const result = await generalAI.agent.generate({
-   endpoint: "chat_completions",
-   model: "gpt-5.4-mini",
-   messages: [
-     { role: "user", content: "Introduce yourself briefly." },
-   ],
-   compatibility: {
-     chatRoleMode: "classic",
-   },
- });
+ Some OpenAI-compatible providers are stricter than others about message roles and continuation shaping.

- console.log(result.cleaned);
+ General.AI supports compatibility profiles:
+
+ - `modern`
+ - `classic`
+ - `classic_v2`
+ - `auto`
+
+ Example:
+
+ ```ts
+ compatibility: {
+   profile: "classic_v2",
+ }
  ```

- ### Agent responsibilities
+ What they mean:

- - assemble a strong internal prompt stack
- - drive a strict protocol
- - parse runtime events from model output
- - retry recoverable protocol/execution failures
- - execute tools and subagents
- - maintain optional memory
- - return both raw protocol and cleaned output
-
- ### Core agent parameters
-
- | Field | Required | Description |
- | --- | --- | --- |
- | `endpoint` | Yes | `"responses"` or `"chat_completions"` |
- | `model` | Yes | Provider model name |
- | `messages` | Yes | Normalized conversation array |
- | `personality` | No | Persona, style, behavior, boundaries, prompt text |
- | `safety` | No | Input/output safety behavior |
- | `thinking` | No | Checkpointed thinking strategy |
- | `tools` | No | Runtime tool registry |
- | `subagents` | No | Delegated specialist registry |
- | `memory` | No | Session memory adapter config |
- | `prompts` | No | Prompt section overrides |
- | `limits` | No | Step/tool/subagent/protocol error limits |
- | `request` | No | Endpoint-native OpenAI pass-through values |
- | `compatibility` | No | Provider compatibility knobs such as classic chat role mode |
- | `metadata` | No | Extra metadata for prompt/task context |
- | `debug` | No | Enable debug-oriented prompt/runtime behavior |
-
- ---
+ - `modern`: modern OpenAI-style behavior
+ - `classic`: safer classic `system` / `user` / `assistant` shaping
+ - `classic_v2`: stricter provider-safe continuation shaping for gateways that dislike late system-style messages
+ - `auto`: currently resolves to `modern` unless explicitly overridden
+
+ If you are using stricter compatible gateways, `classic_v2` is the safest place to start.

  ## Tools

- General.AI tools are **runtime-defined JavaScript functions** triggered by protocol markers.
+ General.AI tools are runtime-defined JavaScript functions triggered by protocol markers.

  ```ts
  import { defineTool } from "@lightining/general.ai";

  const echoTool = defineTool({
    name: "echo",
-   description: "Echo a string back for runtime testing.",
+   description: "Echo text back for runtime testing.",
    inputSchema: {
      type: "object",
      additionalProperties: false,
@@ -296,13 +328,11 @@ const echoTool = defineTool({
  });
  ```

- ### Tool access policy
+ Tool access can be scoped:

- You can explicitly decide whether a tool is callable:
-
- - from the root agent
- - from all subagents
- - from selected subagents only
+ - root only
+ - all subagents
+ - selected subagents only

  ```ts
  const rootOnlyTool = defineTool({
@@ -315,40 +345,13 @@ const rootOnlyTool = defineTool({
      return { ok: true };
    },
  });
-
- const mathOnlyTool = defineTool({
-   name: "math_only",
-   description: "Only callable from the math_helper subagent.",
-   access: {
-     subagents: ["math_helper"],
-   },
-   async execute() {
-     return { ok: true };
-   },
- });
  ```

- ### Built-in helper
-
- General.AI also ships a helper for OpenAI web search via Responses:
-
- ```ts
- import OpenAI from "openai";
- import { createOpenAIWebSearchTool } from "@lightining/general.ai";
-
- const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
-
- const webSearch = createOpenAIWebSearchTool({
-   openai,
-   model: "gpt-5.4-mini",
- });
- ```
-
- ---
+ The runtime also supports multiple tool calls in the same step, with configurable parallel limits.

  ## Subagents

- Subagents are **bounded delegated General.AI runs** with their own instructions, model, limits, safety, and tool access.
+ Subagents are bounded delegated General.AI runs with their own config.

  ```ts
  import { defineSubagent } from "@lightining/general.ai";
@@ -356,123 +359,58 @@ import { defineSubagent } from "@lightining/general.ai";
  const mathHelper = defineSubagent({
    name: "math_helper",
    description: "A precise arithmetic specialist.",
-   instructions: [
-     "Solve delegated arithmetic carefully.",
-     "Return a concise answer.",
-     "Do not call nested subagents unless explicitly required.",
-   ].join(" "),
- });
- ```
-
- Use them in a run:
-
- ```ts
- const result = await generalAI.agent.generate({
-   endpoint: "chat_completions",
+   instructions: "Solve delegated arithmetic carefully and return a concise answer.",
    model: "gpt-5.4-mini",
-   messages: [
-     {
-       role: "system",
-       content: "Delegate arithmetic work to the available subagent when useful.",
-     },
-     {
-       role: "user",
-       content: "What is 17 multiplied by 23?",
-     },
-   ],
-   subagents: {
-     registry: [mathHelper],
-   },
-   compatibility: {
-     chatRoleMode: "classic",
-   },
- });
- ```
-
- ### What the runtime already handles for you
-
- - subagent instructions are automatically injected
- - subagents inherit compatibility mode
- - nested subagents can be disabled
- - tool visibility can be filtered per subagent
- - recoverable subagent execution failures can trigger retries
-
- ---
-
- ## Prompt Packs And Overrides
-
- General.AI renders a layered prompt stack in this order:
-
- 1. identity
- 2. endpoint adapter rules
- 3. protocol
- 4. safety
- 5. personality
- 6. thinking
- 7. tools and subagents
- 8. memory
- 9. task context
-
- Bundled prompts live in `prompts/*.txt`.
-
- ### Override a section
-
- ```ts
- const prompt = await generalAI.agent.renderPrompts({
-   endpoint: "responses",
-   model: "gpt-5.4-mini",
-   messages: [{ role: "user", content: "Hello" }],
-   prompts: {
-     sections: {
-       task: "Task override.\n{block:task_context}",
+   request: {
+     chat_completions: {
+       temperature: 0.1,
      },
    },
  });
  ```

- ### Placeholders
-
- - `{data:key}` for scalar values
- - `{block:key}` for multiline blocks
-
- ### Raw prompt overrides
+ Subagents can override:

- ```ts
- prompts: {
-   raw: {
-     prepend: "Extra preamble",
-     append: "Extra appendix",
-     replace: "Replace the full rendered prompt entirely",
-   },
- }
- ```
+ - `endpoint`
+ - `model`
+ - `request`
+ - `personality`
+ - `safety`
+ - `thinking`
+ - `context`
+ - `prompts`
+ - `limits`
+ - `tools`
+ - `subagents`
+ - `compatibility`
+ - `memory`

- ---
+ They can also participate in parallel action batches.

- ## Thinking, Safety, Personality, Memory
+ ## Thinking, Safety, And Context

  These systems are separate on purpose.

  ### Thinking

- Thinking defaults to a checkpointed strategy in agent mode.
-
  ```ts
  thinking: {
    enabled: true,
+   mode: "hybrid",
    strategy: "checkpointed",
+   checkpointFormat: "structured",
    effort: "high",
-   checkpoints: [
-     "Before the first writing block",
-     "After each tool result",
-     "Before final completion",
-   ],
  }
  ```

- ### Safety
+ Available thinking modes:
+
+ - `none`
+ - `inline`
+ - `orchestrated`
+ - `hybrid`

- Safety is configured independently for input and output.
+ ### Safety

  ```ts
  safety: {
@@ -480,61 +418,117 @@ safety: {
    mode: "balanced",
    input: {
      enabled: true,
-     instructions: "Inspect the user request carefully.",
    },
    output: {
      enabled: true,
-     instructions: "Inspect the final answer before completion.",
    },
  }
  ```

- ### Personality
+ Safety runs inside the agent protocol instead of forcing separate moderation-style API calls for every step.
+
+ ### Context Management

  ```ts
- personality: {
+ context: {
    enabled: true,
-   profile: "direct_technical",
-   persona: { honesty: "high" },
-   style: { verbosity: "medium", tone: "direct" },
-   behavior: { avoid_sycophancy: true },
-   boundaries: { insult_user: false },
-   instructions: "Be clear, direct, and technically precise.",
+   mode: "auto",
+   strategy: "hybrid",
+   trigger: {
+     contextRatio: 0.9,
+   },
  }
  ```

- ### Memory
+ Supported context strategies:

- General.AI ships with `InMemoryMemoryAdapter`, and you can inject your own adapter.
+ - `summarize`
+ - `drop_oldest`
+ - `drop_nonessential`
+ - `hybrid`

- ```ts
- import { GeneralAI, InMemoryMemoryAdapter } from "@lightining/general.ai";
+ Supported modes:

- const memoryAdapter = new InMemoryMemoryAdapter();
- const generalAI = new GeneralAI({ openai, memoryAdapter });
+ - `off`
+ - `auto`
+ - `manual`
+ - `hybrid`

- await generalAI.agent.generate({
-   endpoint: "chat_completions",
+ This is runtime-managed context control. It is not a built-in provider compression feature.
+
+ ## Observability
+
+ General.AI is designed to be inspectable.
+
+ You can already inspect:
+
+ - parsed protocol events
+ - warnings and retry reasons
+ - cleaned output and raw protocol output
+ - tool and subagent counts
+ - prompt rendering output
+ - context compaction operations
+ - endpoint result history
+
+ This helps answer questions like:
+
+ - why did it call a tool?
+ - why did it open a subagent?
+ - why did it summarize or drop old messages?
+ - why did it retry after malformed model output?
+
+ ## Prompt Overrides
+
+ General.AI renders a layered prompt stack in this order:
+
+ 1. identity
+ 2. endpoint adapter rules
+ 3. protocol
+ 4. safety
+ 5. personality
+ 6. thinking
+ 7. tools and subagents
+ 8. memory
+ 9. task context
+
+ Bundled prompts live in `prompts/*.txt`.
+
+ Prompt placeholders:
+
+ - `{data:key}` for scalar values
+ - `{block:key}` for multiline blocks
+
+ Example:
+
+ ```ts
+ const prompt = await generalAI.agent.renderPrompts({
+   endpoint: "responses",
    model: "gpt-5.4-mini",
-   messages: [{ role: "user", content: "Remember this preference." }],
-   memory: {
-     enabled: true,
-     sessionId: "user-123",
+   messages: [{ role: "user", content: "Hello" }],
+   prompts: {
+     sections: {
+       task: "Task override.\n{block:task_context}",
+     },
    },
  });
  ```

- ---
-
- ## Streaming
-
- ### Native streaming
+ Raw overrides are also supported:

- Use the OpenAI SDK directly through `native` when you want exact provider stream events.
+ ```ts
+ prompts: {
+   raw: {
+     prepend: "Extra preamble",
+     append: "Extra appendix",
+     replace: "Replace the entire rendered prompt",
+   },
+ }
+ ```

- ### Agent streaming
+ ## Streaming

- Use `agent.stream()` when you want parsed runtime events and cleaned writing deltas.
+ Use `native` for exact provider stream events.
+ Use `agent.stream()` for parsed runtime events and cleaned writing deltas.

  ```ts
  const stream = generalAI.agent.stream({
@@ -550,7 +544,7 @@ for await (const event of stream) {
  }
  ```

- Typical stream events include:
+ Common stream events:

  - `run_started`
  - `prompt_rendered`
@@ -558,161 +552,49 @@ Typical stream events include:
  - `raw_text_delta`
  - `writing_delta`
  - `protocol_event`
+ - `batch_started`
  - `tool_started`
  - `tool_result`
  - `subagent_started`
  - `subagent_result`
+ - `context_compacted`
  - `warning`
  - `run_completed`

- ---
-
- ## Compatibility Mode
-
- Some OpenAI-compatible providers do not fully support newer chat roles such as `developer`.
-
- For those providers, use:
-
- ```ts
- compatibility: {
-   chatRoleMode: "classic",
- }
- ```
-
- This enables safer continuation behavior for providers that expect classic `system` / `user` / `assistant` flows.
-
- This is especially useful with:
-
- - older compatible gateways
- - NVIDIA-style OpenAI-compatible endpoints
- - providers that reject post-assistant `system` or `developer` messages
-
- ---
-
- ## Protocol
-
- General.AI’s agent runtime uses a text protocol based on triple-bracket markers.
-
- ### Common markers
-
- - `[[[status:thinking]]]`
- - `[[[status:writing]]]`
- - `[[[status:input_safety:{...}]]]`
- - `[[[status:output_safety:{...}]]]`
- - `[[[status:call_tool:"name":{...}]]]`
- - `[[[status:call_subagent:"name":{...}]]]`
- - `[[[status:checkpoint]]]`
- - `[[[status:revise]]]`
- - `[[[status:error:{...}]]]`
- - `[[[status:done]]]`
-
- ### Important runtime rule
-
- Only `writing` blocks survive into `result.cleaned`.
-
- That means:
-
- - `thinking` is runtime-only
- - safety markers are runtime-only
- - tool and subagent markers are runtime-only
- - `cleaned` is the user-facing answer
-
- ### Parser behavior
-
- The parser is intentionally tolerant of real-world model behavior:
-
- - block-style JSON markers are supported
- - one-missing-bracket marker near-misses are tolerated
- - inline marker runs can be normalized onto separate lines
- - malformed protocol can trigger automatic retries up to `limits.maxProtocolErrors`
-
- ---
-
- ## Advanced OpenAI Pass-Through
-
- The `agent` surface owns the orchestration keys, but endpoint-native extra parameters still pass through via:
-
- - `request.responses`
- - `request.chat_completions`
-
- Example:
-
- ```ts
- const result = await generalAI.agent.generate({
-   endpoint: "responses",
-   model: "gpt-5.4-mini",
-   messages: [{ role: "user", content: "Summarize this." }],
-   request: {
-     responses: {
-       prompt_cache_key: "summary:v1",
-       reasoning: { effort: "medium" },
-       service_tier: "auto",
-       store: false,
-       background: false,
-     },
-   },
- });
- ```
-
- Reserved keys that would break agent orchestration, such as `input`, `messages`, or native tool transport fields, are stripped and reported in `result.meta.strippedRequestKeys`.
-
- ---
-
- ## Examples
-
- Included examples:
-
- - [examples/native-chat.mjs](./examples/native-chat.mjs)
- - [examples/native-responses.mjs](./examples/native-responses.mjs)
- - [examples/agent-basic.mjs](./examples/agent-basic.mjs)
-
- Run an example:
-
- ```bash
- npm run build
- node examples/native-chat.mjs
- ```
-
- ---
+ The streaming path also includes recovery for malformed protocol output from real models.

  ## Testing

- ### Deterministic test suite
+ Deterministic tests:

  ```bash
  npm test
  ```

- This runs:
-
- - build
- - unit and runtime integration tests in `test/**/*.test.js`
-
- ### Cross-runtime smoke tests
+ Cross-runtime smoke tests:

  ```bash
  npm run smoke
  ```

- ### Full public-surface and live smoke script
+ Manual public-surface walkthrough:

  ```bash
  bun run test.js
  ```

- The root [test.js](./test.js) is a comprehensive manual verification script that covers:
+ `test.js` can also exercise optional live provider checks when environment variables are set. It covers:

- - deterministic API surface checks with fake clients
- - parser behavior
- - prompt rendering
- - memory
- - tool gating
- - subagent execution
- - retry behavior
+ - native chat
+ - agent protocol generation
+ - parallel tool batching
+ - subagent delegation
+ - orchestrated thinking
+ - context summarization
+ - context dropping
  - streaming
- - live provider smoke tests

- #### Useful environment variables
+ Useful environment variables:

  ```bash
  GENERAL_AI_API_KEY=...
@@ -721,88 +603,56 @@ GENERAL_AI_MODEL=...
  GENERAL_AI_SKIP_LIVE=1
  ```

- If `GENERAL_AI_SKIP_LIVE=1` is set, `test.js` skips live provider checks.
-
- ---
-
- ## Publishing
+ If `GENERAL_AI_SKIP_LIVE=1` is set, the broader manual scripts skip live provider checks.

- The package is configured for production publishing with:
+ ## Beta Changelog

- - repository metadata
- - homepage and issue tracker links
- - Apache-2.0 license file
- - ESM entrypoints and declaration files
- - `sideEffects: false`
- - `prepublishOnly` checks
- - `publishConfig.provenance`
-
- ### Publish pipeline
+ Install the beta channel:

  ```bash
- npm test
- npm run smoke
- npm run pack:check
- npm publish
+ npm install @lightining/general.ai@beta openai
  ```

- Or rely on:
+ or:

  ```bash
- npm publish
+ bun add @lightining/general.ai@beta openai
  ```

- because `prepublishOnly` already runs:
+ Current beta highlights:

- - `npm test`
- - `npm run smoke`
- - `npm run pack:check`
+ - parallel tool and subagent action batching
+ - subagent-specific models and endpoint request parameters
+ - thinking modes: `inline`, `orchestrated`, `hybrid`
+ - structured `checkpoint` and `revise` support
+ - context management with summarize / drop / hybrid strategies
+ - stronger streaming fallback and retry behavior
+ - compatibility profiles including `classic_v2`

- ### Inspect the tarball
+ Features listed above are beta-track features. If you install `@lightining/general.ai` without `@beta`, you may be on an older stable release that does not include all of them yet.

- ```bash
- npm pack --dry-run
- ```
+ Beta reality check:

- ---
+ - protocol compliance still depends on model quality
+ - some providers are stricter than others about message shaping
+ - broader provider validation is still in progress

  ## Package Notes

- ### Internal prompt language
-
- Bundled prompts are English by default for consistency across providers and prompt packs.
+ Bundled prompts are written in English for consistency, but user-visible output still follows the user’s language unless they explicitly ask for another one.

- ### User-facing language
+ General.AI is ESM-only.

- The assistant should still answer in the user’s language unless the user explicitly asks for another language.
+ The current SDK baseline is `openai@^6.33.0`.

- ### ESM-only package
-
- Use `import`, not `require`.
-
- ### OpenAI SDK baseline
-
- General.AI currently targets the installed OpenAI Node SDK family represented by `openai@^6.33.0`.
-
- ### Production scope
-
- General.AI is built for:
+ General.AI beta is aimed at:

  - app backends
  - internal LLM runtimes
  - tool and subagent orchestration layers
- - OpenAI and OpenAI-compatible provider integrations
-
- It is **not** intended as a browser bundle.
-
- ---
-
- ## Links
-
- - npm: [npmjs.com/package/@lightining/general.ai](https://npmjs.com/package/@lightining/general.ai)
- - GitHub: [github.com/nixaut-codelabs/general.ai](https://github.com/nixaut-codelabs/general.ai)
+ - OpenAI-compatible provider integrations

- ---
+ It is not intended as a browser bundle.

  ## License