extrait 0.5.6 → 0.6.0

package/README.md CHANGED
@@ -1,6 +1,6 @@
 # extrait
 
-Structured JSON extraction from LLMs with validation, repair, and streaming.
+High-level LLM text generation and structured JSON extraction with validation, repair, and streaming.
 
 <p align="left">
   <a href="https://www.npmjs.com/package/extrait">
@@ -68,6 +68,7 @@ console.log(result.data);
 These examples cover the most common usage patterns in the repository.
 
 - [`examples/simple.ts`](examples/simple.ts) - Basic structured output with streaming
+- [`examples/generate.ts`](examples/generate.ts) - High-level text generation
 - [`examples/streaming.ts`](examples/streaming.ts) - Real-time partial output and snapshot updates
 - [`examples/calculator-tool.ts`](examples/calculator-tool.ts) - Structured extraction with MCP tools
 - [`examples/conversation.ts`](examples/conversation.ts) - Multi-turn prompts and multimodal content
@@ -76,6 +77,7 @@ These examples cover the most common usage patterns in the repository.
 
 ```bash
 bun run dev simple "Bun.js runtime"
+bun run dev generate "Bun.js runtime"
 bun run dev streaming
 bun run dev calculator-tool
 ```
@@ -107,6 +109,8 @@ const llm = createLLM({
   mode: "loose" | "strict", // loose allows repair
   selfHeal: 1, // optional retry attempts
   debug: false, // optional structured debug output
+  // or:
+  // debug: { enabled: true, verbose: true },
   systemPrompt: "You are a helpful assistant.",
   timeout: {
     request: 30_000,
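
The new type declarations at the bottom of this diff normalize `debug` into `{ enabled, colors, verbose, logger }`. A sketch of routing debug lines to a custom sink, assuming the public `debug` option accepts those same optional fields (this diff only confirms `enabled` and `verbose`):

```typescript
// Sketch only: `colors` and `logger` mirror NormalizedDebugConfig from the
// declarations below; whether createLLM's `debug` accepts them is an assumption.
const llm = createLLM({
  // ...other createLLM options as in the snippet above
  debug: {
    enabled: true,
    verbose: true, // also print parseSource (see the env-vars note below)
    colors: false, // plain lines, convenient for log files
    logger: (line: string) => console.error(`[extrait] ${line}`),
  },
});
```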
@@ -219,7 +223,17 @@ const result = await llm.structured(
     stream: {
       to: "stdout",
       onData: (event) => {
-        console.log("Partial data:", event.data);
+        if (event.delta.text) {
+          console.log("New visible text:", event.delta.text);
+        }
+        if (event.delta.reasoning) {
+          console.log("New reasoning text:", event.delta.reasoning);
+        }
+
+        console.log("Current visible text:", event.snapshot.text);
+        console.log("Current reasoning:", event.snapshot.reasoning);
+        console.log("Current structured snapshot:", event.snapshot.data);
+
         if (event.done) {
           console.log("Streaming done.");
         }
@@ -227,6 +241,7 @@ const result = await llm.structured(
     },
     request: {
       signal: AbortSignal.timeout(30_000), // optional AbortSignal
+      reasoningEffort: "medium", // optional reasoning effort hint
     },
     timeout: {
       request: 30_000, // ms per LLM HTTP request
@@ -238,6 +253,21 @@ const result = await llm.structured(
 
 `prompt()` builds an ordered `messages` payload. Use ``prompt`...` `` for a single string prompt, or the fluent builder for multi-turn conversations. The `LLMMessage` type is exported if you need to type your own message arrays.
 
+In `stream.onData`, the event is split into two layers:
+
+- `event.delta.text` is only the newly received visible text since the previous event.
+- `event.delta.reasoning` is only the newly received reasoning text since the previous event.
+- `event.snapshot.text` is the full visible text accumulated so far.
+- `event.snapshot.reasoning` is the full normalized reasoning accumulated so far.
+- `event.snapshot.data` is the best structured JSON snapshot that can be parsed from the stream so far. It may stay unchanged while `event.delta.text` continues to grow.
+
+Typical usage is:
+
+- render `event.delta.text` directly to a terminal or chat UI
+- optionally render `event.delta.reasoning` in a separate reasoning panel
+- use `event.snapshot.data` to drive partial structured UI state
+- use `event.snapshot.text` / `event.snapshot.reasoning` when you need the full accumulated state instead of only the latest increment
+
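A minimal consumer sketch of this contract, using only the event fields documented above (`render` is a hypothetical UI hook, and the accumulation invariant is an assumption implied by the docs, not verified here):

```typescript
// Sketch: deltas are increments, snapshots are accumulated state.
let accumulated = "";

const onData = (event: {
  delta: { text: string; reasoning: string };
  snapshot: { text: string; reasoning: string; data: unknown };
  done: boolean;
}) => {
  // Append only the new visible text.
  accumulated += event.delta.text;

  // Implied invariant (assumption): the snapshot equals all deltas so far.
  if (accumulated !== event.snapshot.text) {
    console.warn("delta/snapshot drift");
  }

  // Drive partial structured UI state from the best parse so far.
  render(event.snapshot.data);

  if (event.done) console.log("final text length:", event.snapshot.text.length);
};

declare function render(data: unknown): void; // hypothetical UI hook
```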
 You can also pass provider request options through `request`:
 
 ```typescript
@@ -254,6 +284,89 @@ const result = await llm.structured(
 );
 ```
 
+### Making Text Calls
+
+`generate()` is the high-level API for non-structured generation. It accepts the same prompt shapes as `structured()`, but does not inject any schema or parse the output.
+
+```typescript
+// Simple prompt
+const result = await llm.generate(
+  prompt`Write a short summary of ${topic}.`
+);
+
+// Multi-message prompt
+const result = await llm.generate(
+  prompt()
+    .system`You are a concise assistant.`
+    .user`Summarize: """${text}"""`
+);
+
+// Raw messages payload
+const result = await llm.generate({
+  prompt: {
+    messages: [
+      { role: "user", content: "Say hello in one sentence." },
+    ],
+  },
+});
+```
+
+Streaming mirrors `structured()`, except the snapshot only contains `text` and `reasoning`:
+
+```typescript
+const result = await llm.generate(
+  prompt`Explain ${topic} in one short paragraph.`,
+  {
+    stream: {
+      enabled: true,
+      onData: (event) => {
+        process.stdout.write(event.delta.text);
+
+        console.log("Full text so far:", event.snapshot.text);
+        console.log("Full reasoning so far:", event.snapshot.reasoning);
+
+        if (event.done) {
+          console.log("Streaming done.");
+        }
+      },
+    },
+  }
+);
+```
+
+Provider request options and MCP tools still go through `request`:
+
+```typescript
+const result = await llm.generate(
+  prompt`Use tools if needed and answer the user clearly.`,
+  {
+    request: {
+      temperature: 0,
+      maxTokens: 800,
+      reasoningEffort: "medium",
+      mcpClients: [calculatorMCP],
+      maxToolRounds: 8,
+    },
+  }
+);
+```
+
+On `openai-compatible` providers, `reasoningEffort` is sent as `reasoning_effort`, with `max` mapped to `xhigh`. On `anthropic-compatible` providers, it is sent as `output_config.effort` and auto-enables `thinking: { type: "adaptive" }`.
+
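A sketch of the mapping just described; the real logic lives inside the provider adapters and is not part of this diff, and the full effort union is an assumption (only `"medium"` and `"max"` appear here):

```typescript
// Sketch of the documented reasoningEffort wire mapping (not the shipped code).
type ReasoningEffort = "low" | "medium" | "high" | "max"; // assumed union

function mapReasoningEffort(
  provider: "openai-compatible" | "anthropic-compatible",
  effort: ReasoningEffort,
): Record<string, unknown> {
  if (provider === "openai-compatible") {
    // OpenAI-style APIs take `reasoning_effort`; `max` becomes `xhigh`.
    return { reasoning_effort: effort === "max" ? "xhigh" : effort };
  }
  // Anthropic-style APIs take `output_config.effort`,
  // and adaptive thinking is auto-enabled alongside it.
  return {
    output_config: { effort },
    thinking: { type: "adaptive" },
  };
}
```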
+For existing history or multi-turn conversations, pass `messages` directly:
+
+```typescript
+const messages = conversation("You are a helpful assistant.", [
+  { role: "user", text: "What is the speed of light?" },
+  { role: "assistant", text: "Approximately 299,792 km/s in a vacuum." },
+  { role: "user", text: "How long does light take to reach Earth from the Sun?" },
+]);
+
+const result = await llm.generate({ prompt: { messages } });
+```
+
+Use `llm.adapter.complete(...)` or `llm.adapter.stream(...)` only when you need the raw low-level provider interface.
+
 ### Images (multimodal)
 
 Use `images()` to build base64 image content blocks for vision-capable models.
@@ -310,8 +423,8 @@ const messages = conversation("You are a helpful assistant.", [
   { role: "user", text: "How long does light take to reach Earth from the Sun?" },
 ]);
 
-// Pass to adapter directly
-const response = await llm.adapter.complete({ messages });
+// High-level text generation
+const response = await llm.generate({ prompt: { messages } });
 
 // Or to structured extraction
 const result = await llm.structured(Schema, { messages });
@@ -331,13 +444,43 @@ const messages = conversation("You are a vision assistant.", [
 
 ### Result Object
 
-Successful structured calls return validated data plus the raw response and trace metadata.
+Successful `generate()` calls return normalized text/reasoning plus request metadata:
+
+```typescript
+{
+  text: string,
+  reasoning: string,
+  attempts: GenerateAttempt[],
+  usage?: {
+    inputTokens?: number,
+    outputTokens?: number,
+    totalTokens?: number,
+    cost?: number,
+  },
+  finishReason?: string,
+}
+```
+
+Each `attempts` entry includes:
+
+```typescript
+{
+  attempt: number,
+  via: "complete" | "stream",
+  text: string,
+  reasoning: string,
+  usage?: LLMUsage,
+  finishReason?: string,
+}
+```
+
+Successful `structured()` calls return validated data plus normalized text/reasoning and trace metadata.
 
 ```typescript
 {
   data: T, // Validated data matching schema
-  raw: string, // Raw LLM response
-  thinkBlocks: ThinkBlock[], // Extracted <think> blocks
+  text: string, // Visible model text, without inline <think> blocks
+  reasoning: string, // Normalized reasoning across dedicated fields and inline <think>
   json: unknown | null, // Parsed JSON before validation
   attempts: StructuredAttempt<T>[], // One entry per parse / self-heal attempt
   usage?: {
@@ -357,8 +500,8 @@ Each `attempts` entry includes:
   attempt: number,
   selfHeal: boolean,
   via: "complete" | "stream",
-  raw: string,
-  thinkBlocks: ThinkBlock[],
+  text: string,
+  reasoning: string,
   json: unknown | null,
   candidates: string[],
   repairLog: string[],
@@ -370,6 +513,8 @@ Each `attempts` entry includes:
 }
 ```
 
+Legacy inline `<think>...</think>` blocks are still supported, but the high-level `structured()` API now folds them into `reasoning` internally instead of exposing block metadata.
+
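A small consumer sketch for the result shapes above, using only documented fields; `llm`, `prompt`, and `topic` are assumed to be set up as in the earlier snippets:

```typescript
// Consume a GenerateResult: final text, reasoning, and per-attempt accounting.
const result = await llm.generate(prompt`Summarize ${topic} in two sentences.`);

console.log(result.text);
if (result.reasoning) console.log("(reasoning)", result.reasoning);

// Each attempt records how it ran and what it used.
for (const a of result.attempts) {
  console.log(`attempt ${a.attempt} via ${a.via}: ${a.usage?.totalTokens ?? "?"} tokens`);
}

console.log("total:", result.usage?.totalTokens, "tokens, cost:", result.usage?.cost);
```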
 ### Error Handling
 
 Catch `StructuredParseError` when repair and validation still fail.
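
A minimal sketch of that, assuming `StructuredParseError` is exported from the package root (`llm`, `Schema`, and `input` as in earlier snippets); only standard `Error` fields are used:

```typescript
import { StructuredParseError } from "extrait"; // assumed export location

try {
  const result = await llm.structured(Schema, prompt`Extract fields from ${input}.`);
  console.log(result.data);
} catch (err) {
  if (err instanceof StructuredParseError) {
    // Repair, self-heal, and validation all failed.
    console.error("structured extraction failed:", err.message);
  } else {
    throw err; // network/timeout/abort errors, etc.
  }
}
```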
@@ -547,6 +692,7 @@ const llm = createLLM({
 Run repository examples with `bun run dev <example-name>`.
 
 Available examples:
+- `generate` - High-level text generation ([generate.ts](examples/generate.ts))
 - `streaming` - Real LLM streaming + snapshot self-check ([streaming.ts](examples/streaming.ts))
 - `streaming-with-tools` - Real text streaming with MCP tools + self-check ([streaming-with-tools.ts](examples/streaming-with-tools.ts))
 - `abort-signal` - Start a generation then cancel quickly with `AbortSignal` ([abort-signal.ts](examples/abort-signal.ts))
@@ -563,6 +709,7 @@ Available examples:
 
 Pass arguments after the example name:
 ```bash
+bun run dev generate "Why Bun is fast"
 bun run dev streaming
 bun run dev streaming-with-tools
 bun run dev abort-signal 120 "JSON cancellation demo"
@@ -582,6 +729,9 @@ These environment variables are used across the examples and common client setup
 - `LLM_MODEL` - Model name (default: `gpt-5-nano`)
 - `LLM_API_KEY` - API key for the provider
 - `STRUCTURED_DEBUG=1` - Enable debug output
+  By default, structured debug prints `text` (public visible output) and
+  `reasoning` (normalized reasoning). `parseSource` (the internal source used by
+  parsing and self-heal) is only printed when `debug.verbose` is enabled.
 
 ## Testing
 
@@ -0,0 +1,79 @@
+import type { LLMAdapter, LLMMessage, LLMRequest, LLMUsage, MCPToolClient, StructuredDebugOptions, StructuredPromptBuilder, StructuredPromptContext, StructuredPromptPayload, StructuredPromptValue, StructuredTimeoutOptions, ThinkBlock } from "./types";
+export type PromptRequestOptions = Omit<LLMRequest, "prompt" | "systemPrompt" | "messages">;
+export interface StreamDelta {
+    text: string;
+    reasoning: string;
+}
+export interface NormalizedStreamConfig<TSnapshot> {
+    enabled: boolean;
+    onData?: (event: {
+        delta: StreamDelta;
+        snapshot: TSnapshot;
+        done: boolean;
+        usage?: LLMUsage;
+        finishReason?: string;
+    }) => void;
+    to?: "stdout";
+}
+export interface NormalizedDebugConfig {
+    enabled: boolean;
+    colors: boolean;
+    verbose: boolean;
+    logger: (line: string) => void;
+}
+export interface NormalizedModelOutput {
+    text: string;
+    reasoning: string;
+    thinkBlocks: ThinkBlock[];
+    parseSource: string;
+}
+export interface ModelCallOptions<TSnapshot, TTraceEvent> {
+    prompt?: string;
+    messages?: LLMMessage[];
+    systemPrompt?: string;
+    request?: PromptRequestOptions;
+    stream: NormalizedStreamConfig<TSnapshot>;
+    observe?: (event: TTraceEvent) => void;
+    buildEvent: (input: {
+        stage: "llm.request" | "llm.response" | "llm.stream.delta" | "llm.stream.data";
+        message: string;
+        details?: unknown;
+    }) => TTraceEvent;
+    buildSnapshot: (input: NormalizedModelOutput) => TSnapshot;
+    debug: NormalizedDebugConfig;
+    debugLabel: string;
+    attempt: number;
+    selfHeal: boolean;
+    selfHealEnabled: boolean;
+    timeout?: StructuredTimeoutOptions;
+}
+export interface ModelCallResult {
+    text: string;
+    reasoning: string;
+    thinkBlocks: ThinkBlock[];
+    parseSource: string;
+    via: "complete" | "stream";
+    usage?: LLMUsage;
+    finishReason?: string;
+}
+export declare function resolvePrompt(prompt: StructuredPromptBuilder, context: StructuredPromptContext): StructuredPromptPayload;
+export declare function normalizePromptValue(value: StructuredPromptValue, _context: StructuredPromptContext): StructuredPromptPayload;
+export declare function normalizePromptPayload(value: StructuredPromptPayload): StructuredPromptPayload;
+export declare function applyPromptOutdent(payload: StructuredPromptPayload, enabled: boolean): StructuredPromptPayload;
+export declare function applyOutdentToOptionalPrompt(value: string | undefined, enabled: boolean): string | undefined;
+export declare function mergeSystemPrompts(primary?: string, secondary?: string): string | undefined;
+export declare function normalizeStreamConfig<TSnapshot>(option: boolean | {
+    enabled?: boolean;
+    onData?: NormalizedStreamConfig<TSnapshot>["onData"];
+    to?: "stdout";
+} | undefined): NormalizedStreamConfig<TSnapshot>;
+export declare function normalizeDebugConfig(option: StructuredDebugOptions | boolean | undefined): NormalizedDebugConfig;
+export declare function withToolTimeout(client: MCPToolClient, toolTimeoutMs: number): MCPToolClient;
+export declare function applyToolTimeout(clients: MCPToolClient[], toolTimeoutMs: number): MCPToolClient[];
+export declare function callModel<TSnapshot, TTraceEvent>(adapter: LLMAdapter, options: ModelCallOptions<TSnapshot, TTraceEvent>): Promise<ModelCallResult>;
+export declare function normalizeModelOutput(text: string, dedicatedReasoning?: string): NormalizedModelOutput;
+export declare function composeParseSource(text: string, reasoning?: string): string;
+export declare function aggregateUsage<T extends {
+    usage?: LLMUsage;
+}>(attempts: T[]): LLMUsage | undefined;
+export declare function mergeUsage(base: LLMUsage | undefined, next: LLMUsage | undefined): LLMUsage | undefined;
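
A sketch of plausible `mergeUsage` / `aggregateUsage` semantics, inferred from the signatures above and the usage shape documented in the README; the shipped implementation is not shown in this diff:

```typescript
// Assumed LLMUsage shape, matching the README's usage fields.
interface LLMUsage {
  inputTokens?: number;
  outputTokens?: number;
  totalTokens?: number;
  cost?: number;
}

// Sketch: merge two usage records by summing defined fields.
function mergeUsageSketch(base?: LLMUsage, next?: LLMUsage): LLMUsage | undefined {
  if (!base) return next;
  if (!next) return base;
  const add = (a?: number, b?: number) =>
    a === undefined && b === undefined ? undefined : (a ?? 0) + (b ?? 0);
  return {
    inputTokens: add(base.inputTokens, next.inputTokens),
    outputTokens: add(base.outputTokens, next.outputTokens),
    totalTokens: add(base.totalTokens, next.totalTokens),
    cost: add(base.cost, next.cost),
  };
}

// Sketch: fold per-attempt usage into one total, as aggregateUsage implies.
const aggregateUsageSketch = <T extends { usage?: LLMUsage }>(attempts: T[]) =>
  attempts.reduce<LLMUsage | undefined>((acc, a) => mergeUsageSketch(acc, a.usage), undefined);
```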
@@ -0,0 +1,3 @@
+import type { GenerateCallOptions, GenerateOptions, GenerateResult, LLMAdapter, StructuredPromptBuilder } from "./types";
+export declare function generate(adapter: LLMAdapter, prompt: StructuredPromptBuilder, options?: GenerateCallOptions): Promise<GenerateResult>;
+export declare function generate(adapter: LLMAdapter, options: GenerateOptions): Promise<GenerateResult>;