llmbic 1.0.0 → 1.2.0
- package/CHANGELOG.md +41 -0
- package/README.md +177 -29
- package/dist/extractor.d.ts +13 -7
- package/dist/extractor.d.ts.map +1 -1
- package/dist/extractor.js +50 -22
- package/dist/extractor.js.map +1 -1
- package/dist/index.d.ts +2 -2
- package/dist/index.d.ts.map +1 -1
- package/dist/merge.d.ts.map +1 -1
- package/dist/merge.js +50 -4
- package/dist/merge.js.map +1 -1
- package/dist/prompt.d.ts +20 -13
- package/dist/prompt.d.ts.map +1 -1
- package/dist/prompt.js +97 -39
- package/dist/prompt.js.map +1 -1
- package/dist/rules.d.ts +9 -2
- package/dist/rules.d.ts.map +1 -1
- package/dist/rules.js +19 -15
- package/dist/rules.js.map +1 -1
- package/dist/types/extractor.types.d.ts +54 -4
- package/dist/types/extractor.types.d.ts.map +1 -1
- package/dist/types/merge.types.d.ts +42 -0
- package/dist/types/merge.types.d.ts.map +1 -1
- package/dist/types/prompt.types.d.ts +29 -0
- package/dist/types/prompt.types.d.ts.map +1 -1
- package/dist/types/rule.types.d.ts +16 -0
- package/dist/types/rule.types.d.ts.map +1 -1
- package/package.json +4 -1
package/CHANGELOG.md
CHANGED
@@ -5,6 +5,47 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [1.2.0] - 2026-04-16
+
+Non-breaking. Production-readiness pass: per-field provenance, per-field merge policy, and pre/post LLM transformers. Token / cost tracking deliberately stays out of scope - `LlmProvider` keeps observability as a caller concern; wrap your `complete` for telemetry.
+
+### Added
+
+- `ExtractionResult.sources` - per-field origin of the kept value, as a `FieldSource` discriminated union (`'rule' | 'llm' | 'agreement' | 'flag'`). Variants involving a rule carry the `ruleId` of the rule that produced the match. Use it to attribute extractions back to specific rules, monitor rule quality at scale, or filter results on agreement vs LLM-only fields.
+- `ExtractionRule.id` and `rule.create` / `rule.regex` `options.id` - declare a stable identifier surfaced in `ExtractionResult.sources`. When omitted, `rule.apply` auto-generates `${field}#${declarationIndex}` based on the rule's position in the array.
+- `MergeApplyOptions.policyByField` and `ExtractorConfig.policyByField` - per-field overrides of `FieldMergePolicy` (strategy, confidences, compare). Precedence: defaults < `policy` < `policyByField[field]`. TypeScript validates field names against the schema. Lets a single extractor flag conflicts on critical fields, prefer rules on parser-friendly fields, and prefer the LLM on free-form fields without writing custom merge code.
+- `ExtractorLlmConfig.transformRequest` / `transformResponse` - async hooks called around `provider.complete`. `transformRequest` rewrites the built `LlmRequest` (PII redaction, locale tagging); `transformResponse` rewrites the parsed `LlmResult` before the merge step (PII restoration, post-processing). Errors propagate, no implicit catch.
+- `examples/pii-redaction.ts` - runnable, offline demo of the redact-then-restore pattern using `transformRequest` + `transformResponse` (also wired as `npm run example:pii-redaction`).
+
+### Public types
+
+- `FieldSource` exported from the package root.
+- `RulesResult.sourceIds` (optional) - populated by `rule.apply`, consumed by `merge.apply` to compute `ExtractionResult.sources`. External callers building `RulesResult` by hand can omit it; provenance simply falls back to an empty `ruleId`.
+- `ExtractionRule` gains optional `id`. `RuleMatch`, `FieldMergeResult` and `merge.field`'s signature are unchanged - provenance is computed from the merge outcome plus the policy, not stored on the per-field primitive.
+
+## [1.1.0] - 2026-04-16
+
+Non-breaking. Unblocks hybrid workflows that rely on nested schemas, agreement/conflict detection, and extractor-level merge options.
+
+### Added
+
+- `prompt.build` now supports `z.array(...)`, `z.object(...)`, `z.optional(...)` and `z.default(...)` in the response JSON Schema. Optional fields are preserved in `properties` but excluded from `required`.
+- Cross-check mode on `prompt.build` and `ExtractorLlmConfig`: `mode: 'cross-check'` asks the LLM about every schema field, not just `partial.missing`, enabling the per-field agreement / conflict machinery in `merge.apply`. `crossCheckHints: 'bias' | 'unbiased'` (default `unbiased`) controls whether rule values are surfaced to the LLM as hints.
+- `ExtractorConfig` now accepts `normalizers`, `validators`, `policy` and `logger` directly; previously these had to be threaded into a manual `merge.apply` call. The options are forwarded to every internal merge, so `extract`, `extractSync` and `extractor.merge` all honor them.
+- Zod `.describe("...")` (equivalent to `.meta({ description })`) is now propagated to the generated JSON Schema at the level it was declared; providers' structured-output features consume it natively, so per-field prompt guidance no longer requires an expanded system prompt.
+- README "Batch / async mode" section expanded with a worked OpenAI Batch API example (JSONL shape, upload / poll / download / merge), plus a full runnable script at `examples/openai-batch.ts`.
+
+### Fixed
+
+- Object schemas emitted by `prompt.build` now carry `additionalProperties: false`, matching the requirement of OpenAI Chat Completions Structured Outputs with `strict: true`. Other providers (Anthropic tool use, Ollama JSON Schema) ignore the extra key. Aligned with `prompt.parse` which already drops unexpected fields with a warning.
+- `createExtractor` was not forwarding the configured `logger` to `rule.apply`, so schema-rejection warnings from the rules pass were silently dropped. The logger is now plumbed through every phase.
+
+### Public types
+
+- `PromptBuildMode`, `CrossCheckHints`, `PromptBuildOptions` exported from the package root.
+- `ExtractorConfig<S>` gains optional `normalizers`, `validators`, `policy`, `logger`.
+- `ExtractorLlmConfig` gains optional `mode`, `crossCheckHints`.
+
 ## [1.0.0] — 2026-04-15
 
 Initial public release.
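Note on the 1.2.0 entry: the `FieldSource` union it describes can be pictured with a local stand-in type. The sketch below does not import llmbic. The union shape is inferred from the changelog wording and may differ from the exported type, and `countByKind` and `ruleIdsIn` are hypothetical helpers added only for illustration.

```typescript
// Local stand-in for llmbic's FieldSource, inferred from the changelog text;
// the real exported type may differ in detail.
type FieldSource =
  | { kind: 'rule'; ruleId: string }
  | { kind: 'llm' }
  | { kind: 'agreement'; ruleId: string }
  | { kind: 'flag'; ruleId: string };

type Sources = Record<string, FieldSource | null>;

// Hypothetical helper: tally how many fields came from each origin.
function countByKind(sources: Sources): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const source of Object.values(sources)) {
    const key = source === null ? 'missing' : source.kind;
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}

// Hypothetical helper: collect the rule ids of every variant that carries one.
function ruleIdsIn(sources: Sources): string[] {
  const ids: string[] = [];
  for (const source of Object.values(sources)) {
    if (source !== null && 'ruleId' in source) ids.push(source.ruleId);
  }
  return ids;
}

// Sample data shaped like a `result.sources` object.
const sampleSources: Sources = {
  total: { kind: 'agreement', ruleId: 'total-eur' },
  currency: { kind: 'rule', ruleId: 'currency#1' },
  vendor: { kind: 'llm' },
  date: { kind: 'flag', ruleId: 'date-iso' },
  notes: null,
};
```

Narrowing with `'ruleId' in source` mirrors the check that appears in the `extractor.js` diff, so the same pattern should carry over to the real type if its variants match this sketch.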
package/README.md
CHANGED
@@ -29,7 +29,7 @@ Llmbic has a single dependency: [Zod](https://zod.dev). No vendor SDK is pulled
 
 ```typescript
 import { z } from 'zod';
-import { createExtractor, rule
+import { createExtractor, rule } from 'llmbic';
 
 const InvoiceSchema = z.object({
   total: z.number().nullable(),
@@ -41,14 +41,14 @@ const InvoiceSchema = z.object({
 const extractor = createExtractor({
   schema: InvoiceSchema,
   rules: [
-    rule('total', (text) => {
+    rule.create('total', (text) => {
       const m = text.match(/Total[:\s]*(\d[\d.,\s]+)\s*€/i);
       if (!m) return null;
-      return confidence(parseFloat(m[1].replace(/[\s.]/g, '').replace(',', '.')), 1.0);
+      return rule.confidence(parseFloat(m[1].replace(/[\s.]/g, '').replace(',', '.')), 1.0);
     }),
-    rule('currency', (text) => {
-      if (/€|EUR/i.test(text)) return confidence('EUR', 1.0);
-      if (/\$|USD/i.test(text)) return confidence('USD', 1.0);
+    rule.create('currency', (text) => {
+      if (/€|EUR/i.test(text)) return rule.confidence('EUR', 1.0);
+      if (/\$|USD/i.test(text)) return rule.confidence('USD', 1.0);
       return null;
     }),
   ],
@@ -69,7 +69,7 @@ console.log(result.missing);
 ### Rules + LLM
 
 ```typescript
-import { createExtractor, rule
+import { createExtractor, rule } from 'llmbic';
 import type { LlmProvider } from 'llmbic';
 import OpenAI from 'openai';
 
@@ -135,6 +135,48 @@ const llmResult = extractor.parse(rawJsonResponse);
 const result = extractor.merge(partial, llmResult, markdown);
 ```
 
+Steps 1, 2 and 4 are pure and synchronous: persist `partial` between (2) and (4); the merge re-runs the rules internally so no private state leaks across the async gap.
+
+#### Worked example: OpenAI Batch API
+
+The Batch API expects a JSONL file where each line is a Chat Completions request. Using `extractor.prompt(...)` as the per-document payload builder maps 1:1 onto that format:
+
+```typescript
+// For each document, build one JSONL line:
+const partial = extractor.extractSync(doc.markdown);
+const request = extractor.prompt(doc.markdown, partial);
+
+const line = JSON.stringify({
+  custom_id: doc.id, // how you'll re-match later
+  method: 'POST',
+  url: '/v1/chat/completions',
+  body: {
+    model: 'gpt-4o-mini',
+    messages: [
+      { role: 'system', content: request.systemPrompt },
+      { role: 'user', content: request.userContent },
+    ],
+    response_format: {
+      type: 'json_schema',
+      json_schema: { name: 'extraction', strict: true, schema: request.responseSchema },
+    },
+  },
+});
+```
+
+Upload the JSONL, create the batch, poll until `status === 'completed'`, download the output file. Each output line carries the same `custom_id` so you can map back to the `partial` you kept in memory (or in Redis, or on disk):
+
+```typescript
+for (const entry of prepared) {
+  const raw = responsesById.get(entry.id); // from output JSONL
+  const llmResult = extractor.parse(raw);
+  const result = extractor.merge(entry.partial, llmResult, entry.markdown);
+  // ... persist result ...
+}
+```
+
+End-to-end runnable example (upload + poll + download + merge): [`examples/openai-batch.ts`](./examples/openai-batch.ts). At current OpenAI pricing the Batch API is ~50% cheaper than realtime Chat Completions, with a 24h completion window.
+
 ## Features
 
 ### Per-field confidence scoring
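Note on the Batch API hunk: the README snippet reads from a `responsesById` map without showing how it is built. A minimal sketch of that step follows, assuming each output line carries `custom_id` plus a `response.body` holding a Chat Completions payload, which is how OpenAI documents the batch output file as far as I know. `BatchOutputLine` and `indexBatchOutput` are hypothetical names, and the diff does not say whether `extractor.parse` expects the raw string or an already parsed object, so the sketch stores the message content untouched.

```typescript
// Assumed shape of one line in the Batch API output file; only the fields
// used here are modelled.
interface BatchOutputLine {
  custom_id: string;
  response: {
    status_code: number;
    body: { choices: { message: { content: string | null } }[] };
  } | null;
}

// Hypothetical helper: index the assistant message content by custom_id,
// skipping failed requests and blank lines.
function indexBatchOutput(jsonl: string): Map<string, string> {
  const byId = new Map<string, string>();
  for (const line of jsonl.split('\n')) {
    if (line.trim() === '') continue; // tolerate a trailing newline
    const entry = JSON.parse(line) as BatchOutputLine;
    const content = entry.response?.body.choices[0]?.message.content;
    if (entry.response?.status_code === 200 && typeof content === 'string') {
      byId.set(entry.custom_id, content);
    }
  }
  return byId;
}

// Two sample output lines: one success, one request that produced no response.
const sampleOutput =
  [
    JSON.stringify({
      custom_id: 'doc-1',
      response: {
        status_code: 200,
        body: { choices: [{ message: { content: '{"vendor":"ACME"}' } }] },
      },
    }),
    JSON.stringify({ custom_id: 'doc-2', response: null }),
  ].join('\n') + '\n';

const responsesById = indexBatchOutput(sampleOutput);
```

Documents missing from the map (here `doc-2`) can keep their rules-only `partial`, since that value is already a complete result for the fields the rules resolved.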
@@ -150,6 +192,28 @@ Every field in the result carries a confidence score (0.0–1.0):
 | Rule + LLM disagree | 0.3 (flagged as conflict) |
 | No source | `null` |
 
+### Per-field provenance
+
+Alongside `confidence`, every field carries a `source` describing where the kept value came from. Useful for attributing extractions back to the rule that produced them, monitoring rule quality at scale, or filtering on agreement vs LLM-only fields:
+
+```typescript
+result.sources;
+// {
+//   total: { kind: 'agreement', ruleId: 'total-eur' }, // rule + LLM agreed
+//   currency: { kind: 'rule', ruleId: 'currency#1' }, // only the rule produced a value
+//   vendor: { kind: 'llm' }, // only the LLM produced a value
+//   date: { kind: 'flag', ruleId: 'date-iso' }, // rule and LLM disagreed under flag strategy
+//   notes: null, // missing
+// }
+```
+
+`ruleId` defaults to `${field}#${declarationIndex}` based on the rule's position in the array - stable as long as you don't reorder. For long-lived production code, declare ids explicitly so refactors don't break observability:
+
+```typescript
+rule.create('total', extractTotal, { id: 'total-eur' });
+rule.regex('date', /(\d{4}-\d{2}-\d{2})/, 0.95, undefined, { id: 'date-iso' });
+```
+
 ### Conflict detection
 
 When a rule and the LLM extract different values for the same field, Llmbic flags it:
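Note on the provenance hunk: the fallback id `${field}#${declarationIndex}` is easy to misread, so here is a small sketch of how it behaves under reordering. It assumes `declarationIndex` is the zero-based position in the whole rules array, which is consistent with the README sample where the second rule yields `currency#1`, but the diff does not show the implementation. `RuleLike` and `resolveRuleIds` are hypothetical names.

```typescript
// Minimal stand-in for a rule: only the members relevant to id resolution.
interface RuleLike {
  field: string;
  id?: string;
}

// Assumption: declarationIndex is the zero-based position in the rules array.
// An explicit id always wins over the generated one.
function resolveRuleIds(rules: RuleLike[]): string[] {
  return rules.map((r, index) => r.id ?? `${r.field}#${index}`);
}

// Generated ids follow array order...
const idsBefore = resolveRuleIds([{ field: 'total' }, { field: 'currency' }]);

// ...so swapping the two rules changes both ids.
const idsAfter = resolveRuleIds([{ field: 'currency' }, { field: 'total' }]);

// Explicit ids survive the same reordering unchanged.
const idsPinned = resolveRuleIds([
  { field: 'currency', id: 'currency-symbol' },
  { field: 'total', id: 'total-eur' },
]);
```

Under this assumption, any dashboard keyed on generated ids would silently split its history after a reorder, which is the observability risk the README warns about.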
@@ -161,6 +225,53 @@ result.conflicts;
 
 Three conflict strategies: `'flag'` (default — keep rule value, record conflict), `'prefer-rule'`, or `'prefer-llm'`.
 
+In the default `'fill-gaps'` mode the LLM is only asked about fields the rules could not resolve, so conflicts are impossible. To actually trigger conflict detection, opt into cross-check (see below).
+
+#### Per-field strategies
+
+`policy` is a single strategy applied to every field. When fields have different criticality (a `price` you want to flag vs a `postal_code` your regex always nails vs a free-form `description` you'd rather defer to the LLM), use `policyByField` to override per field. Precedence: library defaults < `policy` < `policyByField[field]`.
+
+```typescript
+const extractor = createExtractor({
+  schema: ListingSchema,
+  rules: [...],
+  policy: { strategy: 'flag' }, // default for every field
+  policyByField: {
+    postal_code: { strategy: 'prefer-rule' },
+    description: { strategy: 'prefer-llm' },
+  },
+});
+```
+
+You can override any subset of `FieldMergePolicy` per field - strategy, confidences, even the `compare` callback (e.g. fuzzy equality for free-form strings). TypeScript validates field names against your schema, so typos surface at compile time.
+
+### Cross-check mode
+
+Switch the LLM call from fill-gaps (ask only about missing fields) to cross-check (ask about every schema field, whether the rules resolved it or not):
+
+```typescript
+const extractor = createExtractor({
+  schema: InvoiceSchema,
+  rules: [...],
+  llm: {
+    provider,
+    mode: 'cross-check',
+    crossCheckHints: 'unbiased', // default; hides rule values from the LLM
+  },
+});
+```
+
+The merge step now sees two candidates per field and surfaces real disagreements through `result.conflicts`. `crossCheckHints: 'bias'` re-exposes the rule values as hints to save tokens, at the cost of confirmation bias (the LLM tends to agree with what it was shown).
+
+### Rich schemas
+
+The JSON Schema handed to the LLM supports the Zod constructs that show up in real-world extraction targets:
+
+- Primitives: `z.string()`, `z.number()`, `z.boolean()`, `z.enum([...])`.
+- Composition: `z.array(...)`, `z.object({...})`, nested arbitrarily.
+- Wrappers: `.nullable()`, `.optional()`, `.default(...)`.
+- Descriptions: `z.string().describe("price in EUR, tax included")` propagates to the JSON Schema `description` at the declared level (array root vs items, object root vs property), and providers' structured-output features consume it natively. No need to inflate the system prompt with per-field hints.
+
 ### Normalizers
 
 Post-merge transformations. Run in sequence, receive the merged data + original content:
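Note on the per-field strategies hunk: the stated precedence (library defaults < `policy` < `policyByField[field]`) reads naturally as three layered object spreads. The sketch below shows that reading on a local stand-in type. It is an assumption about behaviour, not llmbic's merge code: `PolicyLike`, `conflictConfidence`, `DEFAULTS` and `resolvePolicy` are hypothetical names, with the `'flag'` default and the 0.3 conflict score borrowed from the README.

```typescript
type Strategy = 'flag' | 'prefer-rule' | 'prefer-llm';

// Local stand-in; the README says llmbic's FieldMergePolicy also covers
// confidences and a compare callback, which are reduced to one number here.
interface PolicyLike {
  strategy: Strategy;
  conflictConfidence: number; // hypothetical member name
}

// Values taken from the README: 'flag' is the default strategy and a
// rule/LLM disagreement scores 0.3.
const DEFAULTS: PolicyLike = { strategy: 'flag', conflictConfidence: 0.3 };

// Later spreads win, giving: defaults < policy < policyByField[field].
function resolvePolicy(
  field: string,
  policy: Partial<PolicyLike> = {},
  policyByField: Record<string, Partial<PolicyLike> | undefined> = {},
): PolicyLike {
  return { ...DEFAULTS, ...policy, ...policyByField[field] };
}

const globalPolicy: Partial<PolicyLike> = { strategy: 'prefer-rule' };
const overrides = { description: { strategy: 'prefer-llm' as Strategy } };

const pricePolicy = resolvePolicy('price', globalPolicy, overrides);
const descriptionPolicy = resolvePolicy('description', globalPolicy, overrides);
const untouchedPolicy = resolvePolicy('anything');
```

A partial override only replaces the members it names, so `descriptionPolicy` changes strategy while keeping the default conflict score.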
@@ -184,9 +295,9 @@ const extractor = createExtractor({
 Check the final output for logical consistency:
 
 ```typescript
-import {
+import { validator } from 'llmbic';
 
-const { field, crossField } =
+const { field, crossField } = validator.of<MySchemaShape>();
 
 const extractor = createExtractor({
   schema: MySchema,
@@ -202,6 +313,30 @@ result.validation;
 // or { valid: false, violations: [{ field: 'price', rule: 'price_positive', message: '...', severity: 'error' }] }
 ```
 
+### Request / response transformers
+
+Two optional hooks let you intercept the LLM exchange without wrapping the provider yourself: `transformRequest` runs after `prompt.build` and before `provider.complete`; `transformResponse` runs after `prompt.parse` and before the merge. Both can be async; errors propagate.
+
+```typescript
+const extractor = createExtractor({
+  schema: ContactSchema,
+  rules: [...],
+  llm: {
+    provider,
+    transformRequest: (request, content) => ({
+      ...request,
+      systemPrompt: `Language: ${detectLocale(content)}\n${request.systemPrompt}`,
+    }),
+  },
+});
+```
+
+Common patterns:
+
+- **PII redaction (RGPD)**: replace emails / phones / IDs with placeholders in `userContent`, stash the originals in `knownValues` under a private key, restore them in `transformResponse`. Worked example: [`examples/pii-redaction.ts`](./examples/pii-redaction.ts).
+- **Locale tagging**: prepend `Language: ...` to `systemPrompt` after caller-side detection.
+- **Caching**: wrap your `LlmProvider.complete` directly - cleaner than short-circuiting in a hook, since it sits at the actual transport boundary.
+
 ## Writing a provider
 
 Llmbic does not ship vendor-specific adapters. The `LlmProvider` contract is a single method — wiring to any backend (OpenAI, Anthropic, Ollama, vLLM, Gemini, custom HTTP, ...) is ~10 lines you write and own.
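Note on the transformers hunk: the redact-then-restore pattern named under "PII redaction" can be sketched without llmbic at all. The code below is not the contents of `examples/pii-redaction.ts`, which this diff does not show. `redactEmails`, `restore` and the `[EMAIL_n]` placeholder format are hypothetical, the email pattern is deliberately loose, and wiring the two functions into `transformRequest` / `transformResponse` is left out because the `LlmRequest` and `LlmResult` shapes are only partly visible here.

```typescript
// Matches simple email addresses; loose on purpose, for illustration only.
const EMAIL = /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g;

interface Redaction {
  text: string;
  originals: Map<string, string>; // placeholder -> original value
}

// Replace each email with a numbered placeholder and remember the original.
function redactEmails(text: string): Redaction {
  const originals = new Map<string, string>();
  const redacted = text.replace(EMAIL, (match) => {
    const placeholder = `[EMAIL_${originals.size + 1}]`;
    originals.set(placeholder, match);
    return placeholder;
  });
  return { text: redacted, originals };
}

// Put the originals back into a value that came back from the model.
function restore(value: string, originals: Map<string, string>): string {
  let restored = value;
  originals.forEach((original, placeholder) => {
    restored = restored.split(placeholder).join(original);
  });
  return restored;
}

const redaction = redactEmails('Contact: jane.doe@example.com, bob@test.org.');
const restoredEmail = restore('[EMAIL_2]', redaction.originals);
```

In this sketch the placeholder map lives only on the caller's side, so the provider never sees the original addresses. Where to stash it between the two hooks (the README suggests `knownValues` under a private key) depends on types not shown in this diff.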
@@ -227,7 +362,7 @@ const provider: LlmProvider = {
 
 Ready-made snippets for common backends:
 
-**OpenAI** (Chat Completions + Structured Outputs):
+**OpenAI** (Chat Completions + Structured Outputs). The response schema llmbic emits always carries `additionalProperties: false`, so `strict: true` works out of the box:
 
 ```typescript
 const client = new OpenAI();
@@ -307,28 +442,41 @@ Creates an extractor instance. Config:
 
 | Field | Type | Required | Description |
 |-------|------|----------|-------------|
-| `schema` | `ZodObject` | yes | Output schema |
-| `rules` | `ExtractionRule[]` | yes | Deterministic extraction rules |
-| `llm` | `
-| `normalizers` | `Normalizer[]` | no | Post-merge transformations |
-| `validators` | `Validator[]` | no |
-| `
-| `
+| `schema` | `ZodObject` | yes | Output schema (drives field enumeration and re-validation). |
+| `rules` | `ExtractionRule[]` | yes | Deterministic extraction rules. |
+| `llm` | `ExtractorLlmConfig` | no | LLM fallback. Omit for rules-only mode. See below. |
+| `normalizers` | `Normalizer<T>[]` | no | Post-merge transformations, run in declared order. |
+| `validators` | `Validator<ExtractedData<T>>[]` | no | Invariants populating `result.validation`. |
+| `policy` | `Partial<FieldMergePolicy>` | no | Overrides the per-field merge policy (conflict strategy, confidence defaults, equality) for every field. |
+| `policyByField` | `{ [K in keyof T]?: Partial<FieldMergePolicy> }` | no | Per-field overrides applied on top of `policy`. Precedence: defaults < `policy` < `policyByField[field]`. |
+| `logger` | `Logger` | no | Pino/Winston/console-compatible. Warnings from `rule.apply` and `merge.apply` flow through. |
 
-
+`ExtractorLlmConfig`:
 
-
+| Field | Type | Required | Description |
+|-------|------|----------|-------------|
+| `provider` | `LlmProvider` | yes | Single-method adapter the extractor calls. |
+| `systemPrompt` | `string` | no | Overrides the built-in system prompt. |
+| `mode` | `'fill-gaps' \| 'cross-check'` | no | `'fill-gaps'` (default) asks the LLM only about fields the rules did not resolve. `'cross-check'` asks about every schema field so `merge.apply` can surface agreements / conflicts. |
+| `crossCheckHints` | `'bias' \| 'unbiased'` | no | In cross-check mode only. `'unbiased'` (default) hides rule values from the LLM for genuine disagreement detection; `'bias'` re-exposes them to save tokens. |
+| `transformRequest` | `(request, content) => LlmRequest \| Promise<LlmRequest>` | no | Hook called with the built request before `provider.complete`. PII redaction, locale tagging, etc. |
+| `transformResponse` | `(result, request) => LlmResult \| Promise<LlmResult>` | no | Hook called with the parsed LLM result before the merge. PII restoration, post-processing, etc. |
 
-### `
+### `rule` namespace
 
-
+| Member | Signature | Description |
+|---|---|---|
+| `rule.create` | `(field, extract, options?) => ExtractionRule` | Declare a rule. `extract(content)` returns a `RuleMatch` or `null`. `options.id` sets the stable identifier surfaced in `result.sources`. |
+| `rule.regex` | `(field, pattern, score, transform?, options?) => ExtractionRule` | Regex-based rule. On match, capture group 1 (or the full match) is fed to `transform`. `options.id` sets the stable identifier surfaced in `result.sources`. |
+| `rule.confidence` | `(value, score) => RuleMatch` | Wrap a value and a confidence score; sugar for custom `extract` callbacks. |
+| `rule.apply` | `(content, rules, schema, logger?) => RulesResult` | Run every rule, pick the highest-confidence match per field, type-check against the schema. |
 
-### `
+### `validator.of<T>()`
 
-
+Binds a target data shape `T` and returns two validator builders:
 
-- `field(name,
-- `crossField(
+- `field(name, ruleName, check, message, severity?)`: single-field validator. `check(value, data)` receives the precise type of the field (`T[name]`) as first argument.
+- `crossField(ruleName, check, message, severity?)`: whole-object validator, produces a violation without a `field` property.
 
 Binding `T` once lets TypeScript infer each field's type from the field name, so predicates are fully typed without manual annotations.
 
@@ -336,10 +484,10 @@ Binding `T` once lets TypeScript infer each field's type from the field name, so
 
 | Method | Sync | Description |
 |--------|------|-------------|
-| `extract(content)` | async | Full pipeline: rules
-| `extractSync(content)` | sync | Rules only. Returns partial result + missing fields. |
-| `prompt(content, partial)` | sync | Builds LLM
-| `parse(raw)` | sync | Parses raw LLM JSON response. |
+| `extract(content)` | async | Full pipeline: rules -> LLM (if configured) -> merge -> normalize -> validate. |
+| `extractSync(content)` | sync | Rules only. Returns the partial result + `missing` fields. |
+| `prompt(content, partial)` | sync | Builds the LLM request. Covers `partial.missing` in fill-gaps mode, every schema field in cross-check mode. |
+| `parse(raw)` | sync | Parses a raw LLM JSON response, validating each field individually. Never throws. |
 | `merge(partial, llmResult, content)` | sync | Merges rules + LLM, detects conflicts, normalizes, validates. |
 
 ## License
package/dist/extractor.d.ts
CHANGED
@@ -1,18 +1,24 @@
 import type { z } from 'zod';
 import type { Extractor, ExtractorConfig } from './types/extractor.types.js';
 /**
- * Bind a schema, deterministic rules and
+ * Bind a schema, deterministic rules and their merge-time options into an
  * {@link Extractor}. The returned object exposes the extraction pipeline as
  * pre-configured methods; call sites stop having to thread `schema`,
- * `rules` and provider wiring through
+ * `rules`, `policy`, normalizers/validators and provider wiring through
+ * every step.
  *
- * {@link Extractor.extract} runs {@link rule.apply}, then
- * configured
- *
- *
+ * {@link Extractor.extract} runs {@link rule.apply}, then - when an LLM is
+ * configured - asks the provider either for the missing fields only
+ * (`mode: 'fill-gaps'`, default) or for every schema field
+ * (`mode: 'cross-check'`, which always triggers the LLM call so conflicts
+ * can be detected even when the rules resolved every field). The response
+ * is parsed with {@link prompt.parse} and fused through {@link merge.apply}.
  *
  * @typeParam S - A Zod object schema describing the target data shape.
- * @param config - Schema, deterministic rules, and optional LLM fallback
+ * @param config - Schema, deterministic rules, and optional LLM fallback,
+ * plus `policy`, `normalizers`, `validators` and `logger` forwarded to
+ * every internal {@link merge.apply} call. The logger is also forwarded
+ * to {@link rule.apply} so schema-rejection warnings stay visible.
  * @returns An {@link Extractor} bound to `config.schema`.
  */
 export declare function createExtractor<S extends z.ZodObject<z.ZodRawShape>>(config: ExtractorConfig<S>): Extractor<z.infer<S>>;
package/dist/extractor.d.ts.map
CHANGED
|
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"file":"extractor.d.ts","sourceRoot":"","sources":["../src/extractor.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AAI7B,OAAO,KAAK,EAAE,SAAS,EAAE,eAAe,EAAE,MAAM,4BAA4B,CAAC;
|
|
1
|
+
{"version":3,"file":"extractor.d.ts","sourceRoot":"","sources":["../src/extractor.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AAI7B,OAAO,KAAK,EAAE,SAAS,EAAE,eAAe,EAAE,MAAM,4BAA4B,CAAC;AAmD7E;;;;;;;;;;;;;;;;;;;;GAoBG;AACH,wBAAgB,eAAe,CAAC,CAAC,SAAS,CAAC,CAAC,SAAS,CAAC,CAAC,CAAC,WAAW,CAAC,EAClE,MAAM,EAAE,eAAe,CAAC,CAAC,CAAC,GACzB,SAAS,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,CA4EvB"}
|
package/dist/extractor.js
CHANGED
@@ -10,6 +10,7 @@ import { prompt } from './prompt.js';
 function rulesResultFromPartial(partial, allFields) {
     const values = {};
     const confidence = {};
+    const sourceIds = {};
     for (const field of allFields) {
         const value = partial.data[field];
         if (value === null) {
@@ -20,8 +21,12 @@ function rulesResultFromPartial(partial, allFields) {
         if (fieldConfidence !== null) {
             confidence[field] = fieldConfidence;
         }
+        const source = partial.sources[field];
+        if (source !== null && 'ruleId' in source) {
+            sourceIds[field] = source.ruleId;
+        }
     }
-    return { values, confidence, missing: [...partial.missing] };
+    return { values, confidence, sourceIds, missing: [...partial.missing] };
 }
 /**
  * Stamp `result.meta.durationMs` with the wall-clock elapsed since `startedAt`.
@@ -36,18 +41,24 @@ function stampDuration(result, startedAt) {
     };
 }
 /**
- * Bind a schema, deterministic rules and
+ * Bind a schema, deterministic rules and their merge-time options into an
  * {@link Extractor}. The returned object exposes the extraction pipeline as
  * pre-configured methods; call sites stop having to thread `schema`,
- * `rules` and provider wiring through
+ * `rules`, `policy`, normalizers/validators and provider wiring through
+ * every step.
  *
- * {@link Extractor.extract} runs {@link rule.apply}, then
- * configured
- *
- *
+ * {@link Extractor.extract} runs {@link rule.apply}, then - when an LLM is
+ * configured - asks the provider either for the missing fields only
+ * (`mode: 'fill-gaps'`, default) or for every schema field
+ * (`mode: 'cross-check'`, which always triggers the LLM call so conflicts
+ * can be detected even when the rules resolved every field). The response
+ * is parsed with {@link prompt.parse} and fused through {@link merge.apply}.
  *
  * @typeParam S - A Zod object schema describing the target data shape.
- * @param config - Schema, deterministic rules, and optional LLM fallback
+ * @param config - Schema, deterministic rules, and optional LLM fallback,
+ * plus `policy`, `normalizers`, `validators` and `logger` forwarded to
+ * every internal {@link merge.apply} call. The logger is also forwarded
+ * to {@link rule.apply} so schema-rejection warnings stay visible.
  * @returns An {@link Extractor} bound to `config.schema`.
  */
 export function createExtractor(config) {
@@ -55,32 +66,49 @@ export function createExtractor(config) {
     if (allFields.length === 0) {
         throw new Error('createExtractor: schema must declare at least one field');
     }
+    const buildOptions = {
+        systemPrompt: config.llm?.systemPrompt,
+        mode: config.llm?.mode ?? 'fill-gaps',
+        crossCheckHints: config.llm?.crossCheckHints ?? 'unbiased',
+    };
+    const mergeOptions = {
+        policy: config.policy,
+        policyByField: config.policyByField,
+        normalizers: config.normalizers,
+        validators: config.validators,
+        logger: config.logger,
+    };
     return {
         async extract(content) {
             const startedAt = performance.now();
-            const rulesResult = rule.apply(content, config.rules, config.schema);
-            const partial = merge.apply(config.schema, rulesResult, null, content);
-
+            const rulesResult = rule.apply(content, config.rules, config.schema, config.logger);
+            const partial = merge.apply(config.schema, rulesResult, null, content, mergeOptions);
+            const shouldCallLlm = config.llm !== undefined &&
+                (buildOptions.mode === 'cross-check' || partial.missing.length > 0);
+            if (!shouldCallLlm) {
                 return stampDuration(partial, startedAt);
             }
-            const
-
-
+            const builtRequest = prompt.build(config.schema, partial, content, buildOptions);
+            const request = config.llm.transformRequest
+                ? await config.llm.transformRequest(builtRequest, content)
+                : builtRequest;
             const completion = await config.llm.provider.complete(request);
-            const
-            const
+            const llmTargetFields = buildOptions.mode === 'cross-check' ? allFields : partial.missing;
+            const parsedLlmResult = prompt.parse(config.schema, llmTargetFields, completion.values);
+            const llmResult = config.llm.transformResponse
+                ? await config.llm.transformResponse(parsedLlmResult, request)
+                : parsedLlmResult;
+            const final = merge.apply(config.schema, rulesResult, llmResult, content, mergeOptions);
             return stampDuration(final, startedAt);
         },
         extractSync(content) {
             const startedAt = performance.now();
-            const rulesResult = rule.apply(content, config.rules, config.schema);
-            const partial = merge.apply(config.schema, rulesResult, null, content);
+            const rulesResult = rule.apply(content, config.rules, config.schema, config.logger);
+            const partial = merge.apply(config.schema, rulesResult, null, content, mergeOptions);
             return stampDuration(partial, startedAt);
         },
         prompt(content, partial) {
-            return prompt.build(config.schema, partial, content,
-                systemPrompt: config.llm?.systemPrompt,
-            });
+            return prompt.build(config.schema, partial, content, buildOptions);
         },
         parse(raw) {
             return prompt.parse(config.schema, allFields, raw);
@@ -88,7 +116,7 @@ export function createExtractor(config) {
         merge(partial, llmResult, content) {
             const startedAt = performance.now();
             const rulesResult = rulesResultFromPartial(partial, allFields);
-            const result = merge.apply(config.schema, rulesResult, llmResult, content);
+            const result = merge.apply(config.schema, rulesResult, llmResult, content, mergeOptions);
             return stampDuration(result, startedAt);
         },
     };
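The new `extract` path in `extractor.js` decides whether to call the LLM, and which fields to ask it for, from two inputs: the configured mode and the fields the rules left missing. The sketch below isolates that decision as a reading aid. It is not part of llmbic: `GateInput`, `shouldCallLlm` and `llmTargetFields` as free functions are hypothetical names introduced here, and fields are modelled as plain strings.

```typescript
// Hypothetical stand-alone sketch of the gating logic in the extractor.js diff.
// None of these names are llmbic exports; they only mirror the diff's expressions.
type Mode = 'fill-gaps' | 'cross-check';

interface GateInput {
    llmConfigured: boolean; // stands in for `config.llm !== undefined`
    mode: Mode;             // stands in for `buildOptions.mode`
    allFields: string[];    // every field declared by the schema
    missing: string[];      // stands in for `partial.missing`
}

// Mirrors the `shouldCallLlm` condition: an LLM must be configured, and either
// the mode is 'cross-check' or the rules left at least one field missing.
function shouldCallLlm(input: GateInput): boolean {
    return input.llmConfigured &&
        (input.mode === 'cross-check' || input.missing.length > 0);
}

// Mirrors `llmTargetFields`: 'cross-check' targets every field,
// any other mode targets only the missing ones.
function llmTargetFields(input: GateInput): string[] {
    return input.mode === 'cross-check' ? input.allFields : input.missing;
}

// Usage: rules resolved everything, so 'fill-gaps' skips the LLM call.
const resolved: GateInput = {
    llmConfigured: true,
    mode: 'fill-gaps',
    allFields: ['title', 'price'],
    missing: [],
};
console.log(shouldCallLlm(resolved)); // false
console.log(shouldCallLlm({ ...resolved, mode: 'cross-check' })); // true
```

Under this reading, `'fill-gaps'` (the default when `config.llm?.mode` is unset) behaves like the 1.0.0 flow whenever rules leave gaps, while `'cross-check'` calls the LLM even when nothing is missing.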
package/dist/extractor.js.map
CHANGED
|
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"file":"extractor.js","sourceRoot":"","sources":["../src/extractor.ts"],"names":[],"mappings":"AACA,OAAO,EAAE,IAAI,EAAE,MAAM,YAAY,CAAC;AAClC,OAAO,EAAE,KAAK,EAAE,MAAM,YAAY,CAAC;AACnC,OAAO,EAAE,MAAM,EAAE,MAAM,aAAa,CAAC;AAKrC;;;;;GAKG;AACH,SAAS,sBAAsB,CAC7B,OAA4B,EAC5B,SAA+B;IAE/B,MAAM,MAAM,GAAe,EAAE,CAAC;IAC9B,MAAM,UAAU,GAAqC,EAAE,CAAC;IACxD,KAAK,MAAM,KAAK,IAAI,SAAS,EAAE,CAAC;QAC9B,MAAM,KAAK,GAAG,OAAO,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC;QAClC,IAAI,KAAK,KAAK,IAAI,EAAE,CAAC;YACnB,SAAS;QACX,CAAC;QACD,MAAM,CAAC,KAAK,CAAC,GAAG,KAAmB,CAAC;QACpC,MAAM,eAAe,GAAG,OAAO,CAAC,UAAU,CAAC,KAAK,CAAC,CAAC;QAClD,IAAI,eAAe,KAAK,IAAI,EAAE,CAAC;YAC7B,UAAU,CAAC,KAAK,CAAC,GAAG,eAAe,CAAC;QACtC,CAAC;IACH,CAAC;IACD,OAAO,EAAE,MAAM,EAAE,UAAU,EAAE,OAAO,EAAE,CAAC,GAAG,OAAO,CAAC,OAAO,CAAC,EAAE,CAAC;
|
|
1
|
+
{"version":3,"file":"extractor.js","sourceRoot":"","sources":["../src/extractor.ts"],"names":[],"mappings":"AACA,OAAO,EAAE,IAAI,EAAE,MAAM,YAAY,CAAC;AAClC,OAAO,EAAE,KAAK,EAAE,MAAM,YAAY,CAAC;AACnC,OAAO,EAAE,MAAM,EAAE,MAAM,aAAa,CAAC;AAKrC;;;;;GAKG;AACH,SAAS,sBAAsB,CAC7B,OAA4B,EAC5B,SAA+B;IAE/B,MAAM,MAAM,GAAe,EAAE,CAAC;IAC9B,MAAM,UAAU,GAAqC,EAAE,CAAC;IACxD,MAAM,SAAS,GAAqC,EAAE,CAAC;IACvD,KAAK,MAAM,KAAK,IAAI,SAAS,EAAE,CAAC;QAC9B,MAAM,KAAK,GAAG,OAAO,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC;QAClC,IAAI,KAAK,KAAK,IAAI,EAAE,CAAC;YACnB,SAAS;QACX,CAAC;QACD,MAAM,CAAC,KAAK,CAAC,GAAG,KAAmB,CAAC;QACpC,MAAM,eAAe,GAAG,OAAO,CAAC,UAAU,CAAC,KAAK,CAAC,CAAC;QAClD,IAAI,eAAe,KAAK,IAAI,EAAE,CAAC;YAC7B,UAAU,CAAC,KAAK,CAAC,GAAG,eAAe,CAAC;QACtC,CAAC;QACD,MAAM,MAAM,GAAG,OAAO,CAAC,OAAO,CAAC,KAAK,CAAC,CAAC;QACtC,IAAI,MAAM,KAAK,IAAI,IAAI,QAAQ,IAAI,MAAM,EAAE,CAAC;YAC1C,SAAS,CAAC,KAAK,CAAC,GAAG,MAAM,CAAC,MAAM,CAAC;QACnC,CAAC;IACH,CAAC;IACD,OAAO,EAAE,MAAM,EAAE,UAAU,EAAE,SAAS,EAAE,OAAO,EAAE,CAAC,GAAG,OAAO,CAAC,OAAO,CAAC,EAAE,CAAC;AAC1E,CAAC;AAED;;;;;GAKG;AACH,SAAS,aAAa,CACpB,MAA2B,EAC3B,SAAiB;IAEjB,OAAO;QACL,GAAG,MAAM;QACT,IAAI,EAAE,EAAE,GAAG,MAAM,CAAC,IAAI,EAAE,UAAU,EAAE,WAAW,CAAC,GAAG,EAAE,GAAG,SAAS,EAAE;KACpE,CAAC;AACJ,CAAC;AAED;;;;;;;;;;;;;;;;;;;;GAoBG;AACH,MAAM,UAAU,eAAe,CAC7B,MAA0B;IAG1B,MAAM,SAAS,GAAG,MAAM,CAAC,IAAI,CAAC,MAAM,CAAC,MAAM,CAAC,KAAK,CAAmB,CAAC;IAErE,IAAI,SAAS,CAAC,MAAM,KAAK,CAAC,EAAE,CAAC;QAC3B,MAAM,IAAI,KAAK,CAAC,yDAAyD,CAAC,CAAC;IAC7E,CAAC;IAED,MAAM,YAAY,GAAG;QACnB,YAAY,EAAE,MAAM,CAAC,GAAG,EAAE,YAAY;QACtC,IAAI,EAAE,MAAM,CAAC,GAAG,EAAE,IAAI,IAAI,WAAW;QACrC,eAAe,EAAE,MAAM,CAAC,GAAG,EAAE,eAAe,IAAI,UAAU;KAClD,CAAC;IAEX,MAAM,YAAY,GAA4B;QAC5C,MAAM,EAAE,MAAM,CAAC,MAAM;QACrB,aAAa,EAAE,MAAM,CAAC,aAAa;QACnC,WAAW,EAAE,MAAM,CAAC,WAAW;QAC/B,UAAU,EAAE,MAAM,CAAC,UAAU;QAC7B,MAAM,EAAE,MAAM,CAAC,MAAM;KACtB,CAAC;IAEF,OAAO;QACL,KAAK,CAAC,OAAO,CAAC,OAAO;YACnB,MAAM,SAAS,GAAG,WAAW,CAAC,GAAG,EAAE,CAAC;YACpC,MAAM,WAAW,GAAG,IAAI,CAAC,KAAK,CAAC,OAAO,EAAE,MAAM,CAAC,KAAK,EAAE,MAAM,CAAC,MAAM,EAAE,MAAM,CAAC,MAAM,CAAC,CAAC
;YACpF,MAAM,OAAO,GAAG,KAAK,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,WAAW,EAAE,IAAI,EAAE,OAAO,EAAE,YAAY,CAAC,CAAC;YAErF,MAAM,aAAa,GACjB,MAAM,CAAC,GAAG,KAAK,SAAS;gBACxB,CAAC,YAAY,CAAC,IAAI,KAAK,aAAa,IAAI,OAAO,CAAC,OAAO,CAAC,MAAM,GAAG,CAAC,CAAC,CAAC;YACtE,IAAI,CAAC,aAAa,EAAE,CAAC;gBACnB,OAAO,aAAa,CAAC,OAAO,EAAE,SAAS,CAAC,CAAC;YAC3C,CAAC;YAED,MAAM,YAAY,GAAG,MAAM,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,OAAO,EAAE,OAAO,EAAE,YAAY,CAAC,CAAC;YACjF,MAAM,OAAO,GAAG,MAAM,CAAC,GAAI,CAAC,gBAAgB;gBAC1C,CAAC,CAAC,MAAM,MAAM,CAAC,GAAI,CAAC,gBAAgB,CAAC,YAAY,EAAE,OAAO,CAAC;gBAC3D,CAAC,CAAC,YAAY,CAAC;YACjB,MAAM,UAAU,GAAG,MAAM,MAAM,CAAC,GAAI,CAAC,QAAQ,CAAC,QAAQ,CAAC,OAAO,CAAC,CAAC;YAChE,MAAM,eAAe,GACnB,YAAY,CAAC,IAAI,KAAK,aAAa,CAAC,CAAC,CAAC,SAAS,CAAC,CAAC,CAAC,OAAO,CAAC,OAAO,CAAC;YACpE,MAAM,eAAe,GAAG,MAAM,CAAC,KAAK,CAClC,MAAM,CAAC,MAAM,EACb,eAAe,EACf,UAAU,CAAC,MAAM,CAClB,CAAC;YACF,MAAM,SAAS,GAAG,MAAM,CAAC,GAAI,CAAC,iBAAiB;gBAC7C,CAAC,CAAC,MAAM,MAAM,CAAC,GAAI,CAAC,iBAAiB,CAAC,eAAe,EAAE,OAAO,CAAC;gBAC/D,CAAC,CAAC,eAAe,CAAC;YACpB,MAAM,KAAK,GAAG,KAAK,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,WAAW,EAAE,SAAS,EAAE,OAAO,EAAE,YAAY,CAAC,CAAC;YACxF,OAAO,aAAa,CAAC,KAAK,EAAE,SAAS,CAAC,CAAC;QACzC,CAAC;QAED,WAAW,CAAC,OAAO;YACjB,MAAM,SAAS,GAAG,WAAW,CAAC,GAAG,EAAE,CAAC;YACpC,MAAM,WAAW,GAAG,IAAI,CAAC,KAAK,CAAC,OAAO,EAAE,MAAM,CAAC,KAAK,EAAE,MAAM,CAAC,MAAM,EAAE,MAAM,CAAC,MAAM,CAAC,CAAC;YACpF,MAAM,OAAO,GAAG,KAAK,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,WAAW,EAAE,IAAI,EAAE,OAAO,EAAE,YAAY,CAAC,CAAC;YACrF,OAAO,aAAa,CAAC,OAAO,EAAE,SAAS,CAAC,CAAC;QAC3C,CAAC;QAED,MAAM,CAAC,OAAO,EAAE,OAAO;YACrB,OAAO,MAAM,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,OAAO,EAAE,OAAO,EAAE,YAAY,CAAC,CAAC;QACrE,CAAC;QAED,KAAK,CAAC,GAAG;YACP,OAAO,MAAM,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,SAAS,EAAE,GAAG,CAAC,CAAC;QACrD,CAAC;QAED,KAAK,CAAC,OAAO,EAAE,SAAS,EAAE,OAAO;YAC/B,MAAM,SAAS,GAAG,WAAW,CAAC,GAAG,EAAE,CAAC;YACpC,MAAM,WAAW,GAAG,sBAAsB,CAAC,OAAO,EAAE,SAAS,CAAC,CAAC;YAC/D,MAAM,MAAM,GAAG,KAAK,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,WAAW,EAAE,SAAS,EAAE,OAAO,EAAE,YAAY,CAAC,CAAC;YACzF,
OAAO,aAAa,CAAC,MAAM,EAAE,SAAS,CAAC,CAAC;QAC1C,CAAC;KACF,CAAC;AACJ,CAAC"}
|
package/dist/index.d.ts
CHANGED
|
@@ -16,9 +16,9 @@ export { prompt } from './prompt.js';
 export { validator } from './validate.js';
 export type { ExtractionRule, RuleMatch, RulesResult, } from './types/rule.types.js';
 export type { Extractor, ExtractorConfig, ExtractorLlmConfig, } from './types/extractor.types.js';
-export type { LlmRequest } from './types/prompt.types.js';
+export type { CrossCheckHints, LlmRequest, PromptBuildMode, PromptBuildOptions, } from './types/prompt.types.js';
 export type { LlmProvider } from './types/provider.types.js';
 export type { Logger } from './types/logger.types.js';
 export type { Severity, Violation, Validator, } from './types/validate.types.js';
-export type { Conflict, ConflictStrategy, ExtractedData, ExtractionMeta, ExtractionResult, FieldCompare, FieldMergePolicy, FieldMergeResult, LlmResult, MergeApplyOptions, Normalizer, ValidationResult, } from './types/merge.types.js';
+export type { Conflict, ConflictStrategy, ExtractedData, ExtractionMeta, ExtractionResult, FieldCompare, FieldMergePolicy, FieldMergeResult, FieldSource, LlmResult, MergeApplyOptions, Normalizer, ValidationResult, } from './types/merge.types.js';
 //# sourceMappingURL=index.d.ts.map
package/dist/index.d.ts.map
CHANGED
|
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":"AAAA;;;;;;;;;;GAUG;AAEH,OAAO,EAAE,eAAe,EAAE,MAAM,gBAAgB,CAAC;AACjD,OAAO,EAAE,IAAI,EAAE,MAAM,YAAY,CAAC;AAClC,OAAO,EAAE,KAAK,EAAE,MAAM,YAAY,CAAC;AACnC,OAAO,EAAE,MAAM,EAAE,MAAM,aAAa,CAAC;AACrC,OAAO,EAAE,SAAS,EAAE,MAAM,eAAe,CAAC;AAE1C,YAAY,EACV,cAAc,EACd,SAAS,EACT,WAAW,GACZ,MAAM,uBAAuB,CAAC;AAE/B,YAAY,EACV,SAAS,EACT,eAAe,EACf,kBAAkB,GACnB,MAAM,4BAA4B,CAAC;AAEpC,YAAY,
|
|
1
|
+
{"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":"AAAA;;;;;;;;;;GAUG;AAEH,OAAO,EAAE,eAAe,EAAE,MAAM,gBAAgB,CAAC;AACjD,OAAO,EAAE,IAAI,EAAE,MAAM,YAAY,CAAC;AAClC,OAAO,EAAE,KAAK,EAAE,MAAM,YAAY,CAAC;AACnC,OAAO,EAAE,MAAM,EAAE,MAAM,aAAa,CAAC;AACrC,OAAO,EAAE,SAAS,EAAE,MAAM,eAAe,CAAC;AAE1C,YAAY,EACV,cAAc,EACd,SAAS,EACT,WAAW,GACZ,MAAM,uBAAuB,CAAC;AAE/B,YAAY,EACV,SAAS,EACT,eAAe,EACf,kBAAkB,GACnB,MAAM,4BAA4B,CAAC;AAEpC,YAAY,EACV,eAAe,EACf,UAAU,EACV,eAAe,EACf,kBAAkB,GACnB,MAAM,yBAAyB,CAAC;AACjC,YAAY,EAAE,WAAW,EAAE,MAAM,2BAA2B,CAAC;AAC7D,YAAY,EAAE,MAAM,EAAE,MAAM,yBAAyB,CAAC;AAEtD,YAAY,EACV,QAAQ,EACR,SAAS,EACT,SAAS,GACV,MAAM,2BAA2B,CAAC;AAEnC,YAAY,EACV,QAAQ,EACR,gBAAgB,EAChB,aAAa,EACb,cAAc,EACd,gBAAgB,EAChB,YAAY,EACZ,gBAAgB,EAChB,gBAAgB,EAChB,WAAW,EACX,SAAS,EACT,iBAAiB,EACjB,UAAU,EACV,gBAAgB,GACjB,MAAM,wBAAwB,CAAC"}
|
package/dist/merge.d.ts.map
CHANGED
|
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"file":"merge.d.ts","sourceRoot":"","sources":["../src/merge.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AAC7B,OAAO,KAAK,EAAE,MAAM,EAAE,MAAM,yBAAyB,CAAC;AACtD,OAAO,KAAK,EAAE,SAAS,EAAE,WAAW,EAAE,MAAM,uBAAuB,CAAC;AAEpE,OAAO,KAAK,EAGV,gBAAgB,EAChB,gBAAgB,EAChB,gBAAgB,
|
|
1
|
+
{"version":3,"file":"merge.d.ts","sourceRoot":"","sources":["../src/merge.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AAC7B,OAAO,KAAK,EAAE,MAAM,EAAE,MAAM,yBAAyB,CAAC;AACtD,OAAO,KAAK,EAAE,SAAS,EAAE,WAAW,EAAE,MAAM,uBAAuB,CAAC;AAEpE,OAAO,KAAK,EAGV,gBAAgB,EAChB,gBAAgB,EAChB,gBAAgB,EAEhB,SAAS,EACT,iBAAiB,EAElB,MAAM,wBAAwB,CAAC;AAuKhC;;;;;GAKG;AACH,eAAO,MAAM,KAAK;IAChB;;;;;;OAMG;;QAED,6CAA6C;;QAE7C,yDAAyD;;QAEzD,sDAAsD;;QAEtD,wDAAwD;;QAExD,qGAAqG;qBACxF,OAAO,KAAK,OAAO,KAAG,OAAO;;IAQ5C;;;;;;;;;;;;;;;;;;;;OAoBG;UACG,CAAC,SACE,MAAM,aACF,SAAS,CAAC,CAAC,CAAC,GAAG,IAAI,YACpB,OAAO,WACR,OAAO,CAAC,gBAAgB,CAAC,WACzB,MAAM,GACd,gBAAgB,CAAC,CAAC,CAAC;IAgEtB;;;;;;;;;;;;;;;;;;;;OAoBG;UACG,CAAC,SAAS,CAAC,CAAC,SAAS,CAAC,CAAC,CAAC,WAAW,CAAC,UAChC,CAAC,eACI,WAAW,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,aACzB,SAAS,GAAG,IAAI,WAClB,MAAM,YACL,iBAAiB,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,GACtC,gBAAgB,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC;CAqChC,CAAC"}
|