llmbic 1.0.0 → 1.2.0

package/CHANGELOG.md CHANGED
@@ -5,6 +5,47 @@ All notable changes to this project will be documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
+ ## [1.2.0] - 2026-04-16
9
+
10
+ Non-breaking. Production-readiness pass: per-field provenance, per-field merge policy, and pre/post LLM transformers. Token / cost tracking deliberately stays out of scope - `LlmProvider` keeps observability as a caller concern; wrap your `complete` for telemetry.
11
+
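Reviewer note: the "wrap your `complete` for telemetry" advice above can be sketched as a small decorator around a provider. The `LlmRequest`, `LlmCompletion` and `LlmProvider` shapes below are reduced, hypothetical stand-ins inferred from the fields this diff mentions (`systemPrompt`, `userContent`, `responseSchema`, `completion.values`); `withTelemetry` and `CallRecord` are illustrative names, not part of llmbic.

```typescript
// Hypothetical minimal shapes; the real llmbic types may carry more fields.
interface LlmRequest {
  systemPrompt: string;
  userContent: string;
  responseSchema: unknown;
}
interface LlmCompletion {
  values: Record<string, unknown>;
}
interface LlmProvider {
  complete(request: LlmRequest): Promise<LlmCompletion>;
}

// One record per provider call (illustrative, not an llmbic type).
interface CallRecord {
  durationMs: number;
  ok: boolean;
}

// Wrap any provider so each `complete` call is timed and reported,
// whether it resolves or rejects. Errors are re-thrown untouched.
function withTelemetry(
  inner: LlmProvider,
  report: (record: CallRecord) => void,
): LlmProvider {
  return {
    async complete(request: LlmRequest): Promise<LlmCompletion> {
      const startedAt = Date.now();
      try {
        const completion = await inner.complete(request);
        report({ durationMs: Date.now() - startedAt, ok: true });
        return completion;
      } catch (error) {
        report({ durationMs: Date.now() - startedAt, ok: false });
        throw error;
      }
    },
  };
}
```

Because the wrapper returns another provider with the same single method, it can presumably be passed wherever the original provider was, and token or cost accounting can be added to `report` without touching the extractor.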
12
+ ### Added
13
+
14
+ - `ExtractionResult.sources` - per-field origin of the kept value, as a `FieldSource` discriminated union (`'rule' | 'llm' | 'agreement' | 'flag'`). Variants involving a rule carry the `ruleId` of the rule that produced the match. Use it to attribute extractions back to specific rules, monitor rule quality at scale, or filter results on agreement vs LLM-only fields.
15
+ - `ExtractionRule.id` and `rule.create` / `rule.regex` `options.id` - declare a stable identifier surfaced in `ExtractionResult.sources`. When omitted, `rule.apply` auto-generates `${field}#${declarationIndex}` based on the rule's position in the array.
16
+ - `MergeApplyOptions.policyByField` and `ExtractorConfig.policyByField` - per-field overrides of `FieldMergePolicy` (strategy, confidences, compare). Precedence: defaults < `policy` < `policyByField[field]`. TypeScript validates field names against the schema. Lets a single extractor flag conflicts on critical fields, prefer rules on parser-friendly fields, and prefer the LLM on free-form fields without writing custom merge code.
17
+ - `ExtractorLlmConfig.transformRequest` / `transformResponse` - async hooks called around `provider.complete`. `transformRequest` rewrites the built `LlmRequest` (PII redaction, locale tagging); `transformResponse` rewrites the parsed `LlmResult` before the merge step (PII restoration, post-processing). Errors propagate, no implicit catch.
18
+ - `examples/pii-redaction.ts` - runnable, offline demo of the redact-then-restore pattern using `transformRequest` + `transformResponse` (also wired as `npm run example:pii-redaction`).
19
+
20
+ ### Public types
21
+
22
+ - `FieldSource` exported from the package root.
23
+ - `RulesResult.sourceIds` (optional) - populated by `rule.apply`, consumed by `merge.apply` to compute `ExtractionResult.sources`. External callers building `RulesResult` by hand can omit it; provenance simply falls back to an empty `ruleId`.
24
+ - `ExtractionRule` gains optional `id`. `RuleMatch`, `FieldMergeResult` and `merge.field`'s signature are unchanged - provenance is computed from the merge outcome plus the policy, not stored on the per-field primitive.
25
+
26
+ ## [1.1.0] - 2026-04-16
27
+
28
+ Non-breaking. Unblocks hybrid workflows that rely on nested schemas, agreement/conflict detection, and extractor-level merge options.
29
+
30
+ ### Added
31
+
32
+ - `prompt.build` now supports `z.array(...)`, `z.object(...)`, `z.optional(...)` and `z.default(...)` in the response JSON Schema. Optional fields are preserved in `properties` but excluded from `required`.
33
+ - Cross-check mode on `prompt.build` and `ExtractorLlmConfig`: `mode: 'cross-check'` asks the LLM about every schema field, not just `partial.missing`, enabling the per-field agreement / conflict machinery in `merge.apply`. `crossCheckHints: 'bias' | 'unbiased'` (default `unbiased`) controls whether rule values are surfaced to the LLM as hints.
34
+ - `ExtractorConfig` now accepts `normalizers`, `validators`, `policy` and `logger` directly; previously these had to be threaded into a manual `merge.apply` call. The options are forwarded to every internal merge, so `extract`, `extractSync` and `extractor.merge` all honor them.
35
+ - Zod `.describe("...")` (equivalent to `.meta({ description })`) is now propagated to the generated JSON Schema at the level it was declared; providers' structured-output features consume it natively, so per-field prompt guidance no longer requires an expanded system prompt.
36
+ - README "Batch / async mode" section expanded with a worked OpenAI Batch API example (JSONL shape, upload / poll / download / merge), plus a full runnable script at `examples/openai-batch.ts`.
37
+
38
+ ### Fixed
39
+
40
+ - Object schemas emitted by `prompt.build` now carry `additionalProperties: false`, matching the requirement of OpenAI Chat Completions Structured Outputs with `strict: true`. Other providers (Anthropic tool use, Ollama JSON Schema) ignore the extra key. Aligned with `prompt.parse` which already drops unexpected fields with a warning.
41
+ - `createExtractor` was not forwarding the configured `logger` to `rule.apply`, so schema-rejection warnings from the rules pass were silently dropped. The logger is now plumbed through every phase.
42
+
43
+ ### Public types
44
+
45
+ - `PromptBuildMode`, `CrossCheckHints`, `PromptBuildOptions` exported from the package root.
46
+ - `ExtractorConfig<S>` gains optional `normalizers`, `validators`, `policy`, `logger`.
47
+ - `ExtractorLlmConfig` gains optional `mode`, `crossCheckHints`.
48
+
8
49
  ## [1.0.0] — 2026-04-15
9
50
 
10
51
  Initial public release.
package/README.md CHANGED
@@ -29,7 +29,7 @@ Llmbic has a single dependency: [Zod](https://zod.dev). No vendor SDK is pulled
29
29
 
30
30
  ```typescript
31
31
  import { z } from 'zod';
32
- import { createExtractor, rule, confidence } from 'llmbic';
32
+ import { createExtractor, rule } from 'llmbic';
33
33
 
34
34
  const InvoiceSchema = z.object({
35
35
  total: z.number().nullable(),
@@ -41,14 +41,14 @@ const InvoiceSchema = z.object({
41
41
  const extractor = createExtractor({
42
42
  schema: InvoiceSchema,
43
43
  rules: [
44
- rule('total', (text) => {
44
+ rule.create('total', (text) => {
45
45
  const m = text.match(/Total[:\s]*(\d[\d.,\s]+)\s*€/i);
46
46
  if (!m) return null;
47
- return confidence(parseFloat(m[1].replace(/[\s.]/g, '').replace(',', '.')), 1.0);
47
+ return rule.confidence(parseFloat(m[1].replace(/[\s.]/g, '').replace(',', '.')), 1.0);
48
48
  }),
49
- rule('currency', (text) => {
50
- if (/€|EUR/i.test(text)) return confidence('EUR', 1.0);
51
- if (/\$|USD/i.test(text)) return confidence('USD', 1.0);
49
+ rule.create('currency', (text) => {
50
+ if (/€|EUR/i.test(text)) return rule.confidence('EUR', 1.0);
51
+ if (/\$|USD/i.test(text)) return rule.confidence('USD', 1.0);
52
52
  return null;
53
53
  }),
54
54
  ],
@@ -69,7 +69,7 @@ console.log(result.missing);
69
69
  ### Rules + LLM
70
70
 
71
71
  ```typescript
72
- import { createExtractor, rule, confidence } from 'llmbic';
72
+ import { createExtractor, rule } from 'llmbic';
73
73
  import type { LlmProvider } from 'llmbic';
74
74
  import OpenAI from 'openai';
75
75
 
@@ -135,6 +135,48 @@ const llmResult = extractor.parse(rawJsonResponse);
135
135
  const result = extractor.merge(partial, llmResult, markdown);
136
136
  ```
137
137
 
138
+ Steps 1, 2 and 4 are pure and synchronous. Persist `partial` between (2) and (4): `merge` rebuilds the rule-side values, confidences and rule ids from `partial` itself, so no private state has to survive the async gap.
139
+
140
+ #### Worked example: OpenAI Batch API
141
+
142
+ The Batch API expects a JSONL file where each line is a Chat Completions request. Using `extractor.prompt(...)` as the per-document payload builder maps 1:1 onto that format:
143
+
144
+ ```typescript
145
+ // For each document, build one JSONL line:
146
+ const partial = extractor.extractSync(doc.markdown);
147
+ const request = extractor.prompt(doc.markdown, partial);
148
+
149
+ const line = JSON.stringify({
150
+ custom_id: doc.id, // how you'll re-match later
151
+ method: 'POST',
152
+ url: '/v1/chat/completions',
153
+ body: {
154
+ model: 'gpt-4o-mini',
155
+ messages: [
156
+ { role: 'system', content: request.systemPrompt },
157
+ { role: 'user', content: request.userContent },
158
+ ],
159
+ response_format: {
160
+ type: 'json_schema',
161
+ json_schema: { name: 'extraction', strict: true, schema: request.responseSchema },
162
+ },
163
+ },
164
+ });
165
+ ```
166
+
167
+ Upload the JSONL, create the batch, poll until `status === 'completed'`, download the output file. Each output line carries the same `custom_id` so you can map back to the `partial` you kept in memory (or in Redis, or on disk):
168
+
169
+ ```typescript
170
+ for (const entry of prepared) {
171
+ const raw = responsesById.get(entry.id); // from output JSONL
172
+ const llmResult = extractor.parse(raw);
173
+ const result = extractor.merge(entry.partial, llmResult, entry.markdown);
174
+ // ... persist result ...
175
+ }
176
+ ```
177
+
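Reviewer note: the merge loop above reads from a `responsesById` map that the README excerpt does not build. A minimal sketch of that step follows. It relies on the Batch API output layout (one JSON object per line with `custom_id`, `response.status_code` and a Chat Completions `response.body`). The helper name `indexBatchOutput` is hypothetical. It also assumes `extractor.parse` accepts the already-decoded JSON value; if it expects the raw string, store `content` instead.

```typescript
// One line of a Batch API output file, reduced to the fields used here.
interface BatchOutputLine {
  custom_id: string;
  response: {
    status_code: number;
    body: { choices: { message: { content: string | null } }[] };
  } | null;
  error: unknown;
}

// Index the decoded model output by `custom_id`.
// Failed requests and empty messages are skipped, so callers should treat
// a missing id as "no LLM result for this document".
function indexBatchOutput(jsonl: string): Map<string, unknown> {
  const responsesById = new Map<string, unknown>();
  for (const rawLine of jsonl.split('\n')) {
    const trimmed = rawLine.trim();
    if (trimmed === '') continue;
    const line = JSON.parse(trimmed) as BatchOutputLine;
    const content = line.response?.body.choices[0]?.message.content;
    if (line.response?.status_code !== 200 || content == null) continue;
    // With structured outputs the message content is itself a JSON document.
    responsesById.set(line.custom_id, JSON.parse(content));
  }
  return responsesById;
}
```

Skipping failures rather than throwing keeps one bad line from aborting the whole batch; whether that is the right policy depends on how missing results are handled downstream.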
178
+ End-to-end runnable example (upload + poll + download + merge): [`examples/openai-batch.ts`](./examples/openai-batch.ts). At current OpenAI pricing the Batch API is ~50% cheaper than realtime Chat Completions, with a 24h completion window.
179
+
138
180
  ## Features
139
181
 
140
182
  ### Per-field confidence scoring
@@ -150,6 +192,28 @@ Every field in the result carries a confidence score (0.0–1.0):
150
192
  | Rule + LLM disagree | 0.3 (flagged as conflict) |
151
193
  | No source | `null` |
152
194
 
195
+ ### Per-field provenance
196
+
197
+ Alongside `confidence`, every field carries a `source` describing where the kept value came from. Useful for attributing extractions back to the rule that produced them, monitoring rule quality at scale, or filtering on agreement vs LLM-only fields:
198
+
199
+ ```typescript
200
+ result.sources;
201
+ // {
202
+ // total: { kind: 'agreement', ruleId: 'total-eur' }, // rule + LLM agreed
203
+ // currency: { kind: 'rule', ruleId: 'currency#1' }, // only the rule produced a value
204
+ // vendor: { kind: 'llm' }, // only the LLM produced a value
205
+ // date: { kind: 'flag', ruleId: 'date-iso' }, // rule and LLM disagreed under flag strategy
206
+ // notes: null, // missing
207
+ // }
208
+ ```
209
+
210
+ `ruleId` defaults to `${field}#${declarationIndex}` based on the rule's position in the array - stable as long as you don't reorder. For long-lived production code, declare ids explicitly so refactors don't break observability:
211
+
212
+ ```typescript
213
+ rule.create('total', extractTotal, { id: 'total-eur' });
214
+ rule.regex('date', /(\d{4}-\d{2}-\d{2})/, 0.95, undefined, { id: 'date-iso' });
215
+ ```
216
+
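Reviewer note: a small sketch of why auto-generated ids are sensitive to reordering. It assumes `declarationIndex` is the rule's zero-based position in the whole `rules` array, which matches the `currency#1` example above but is not confirmed by this diff. `RuleLike` and `resolveRuleIds` are hypothetical names.

```typescript
// Hypothetical minimal rule shape: only the two properties the id logic needs.
interface RuleLike {
  field: string;
  id?: string;
}

// Resolve the identifier reported for each rule: the explicit `id` when
// present, otherwise `${field}#${index}` from the rule's array position.
function resolveRuleIds(rules: readonly RuleLike[]): string[] {
  return rules.map((rule, index) => rule.id ?? `${rule.field}#${index}`);
}

// Same two rules, declared in a different order: the generated ids change.
const idsBefore = resolveRuleIds([{ field: 'total' }, { field: 'currency' }]);
// -> ['total#0', 'currency#1']
const idsAfterReorder = resolveRuleIds([{ field: 'currency' }, { field: 'total' }]);
// -> ['currency#0', 'total#1']

// An explicit id survives the reorder.
const idsPinned = resolveRuleIds([
  { field: 'currency', id: 'currency-symbol' },
  { field: 'total' },
]);
// -> ['currency-symbol', 'total#1']
```

Dashboards keyed on `currency#1` would silently start tracking nothing after the reorder, which is the argument for declaring ids explicitly.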
153
217
  ### Conflict detection
154
218
 
155
219
  When a rule and the LLM extract different values for the same field, Llmbic flags it:
@@ -161,6 +225,53 @@ result.conflicts;
161
225
 
162
226
  Three conflict strategies: `'flag'` (default — keep rule value, record conflict), `'prefer-rule'`, or `'prefer-llm'`.
163
227
 
228
+ In the default `'fill-gaps'` mode the LLM is only asked about fields the rules could not resolve, so conflicts are impossible. To actually trigger conflict detection, opt into cross-check (see below).
229
+
230
+ #### Per-field strategies
231
+
232
+ `policy` is a single strategy applied to every field. When fields have different criticality (a `price` you want to flag vs a `postal_code` your regex always nails vs a free-form `description` you'd rather defer to the LLM), use `policyByField` to override per field. Precedence: library defaults < `policy` < `policyByField[field]`.
233
+
234
+ ```typescript
235
+ const extractor = createExtractor({
236
+ schema: ListingSchema,
237
+ rules: [...],
238
+ policy: { strategy: 'flag' }, // default for every field
239
+ policyByField: {
240
+ postal_code: { strategy: 'prefer-rule' },
241
+ description: { strategy: 'prefer-llm' },
242
+ },
243
+ });
244
+ ```
245
+
246
+ You can override any subset of `FieldMergePolicy` per field - strategy, confidences, even the `compare` callback (e.g. fuzzy equality for free-form strings). TypeScript validates field names against your schema, so typos surface at compile time.
247
+
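Reviewer note: the precedence rule (defaults < `policy` < `policyByField[field]`) behaves like three layered object spreads. The sketch below illustrates that layering only. `PolicySketch`, `conflictConfidence`, `LIBRARY_DEFAULTS` and `resolvePolicy` are hypothetical names, and the default values are taken from the README text (`'flag'` strategy, 0.3 on disagreement) rather than from the library source.

```typescript
type Strategy = 'flag' | 'prefer-rule' | 'prefer-llm';

// Hypothetical reduced policy: per the README, the real `FieldMergePolicy`
// also carries further confidence settings and a `compare` callback.
interface PolicySketch {
  strategy: Strategy;
  conflictConfidence: number;
}

const LIBRARY_DEFAULTS: PolicySketch = { strategy: 'flag', conflictConfidence: 0.3 };

// Later spreads win, which yields: defaults < policy < policyByField[field].
// Spreading `undefined` (no override for this field) is a no-op.
function resolvePolicy(
  field: string,
  policy: Partial<PolicySketch> = {},
  policyByField: Readonly<Record<string, Partial<PolicySketch>>> = {},
): PolicySketch {
  return { ...LIBRARY_DEFAULTS, ...policy, ...policyByField[field] };
}
```

This sketch keys overrides by plain `string`, so it loses the compile-time check on field names that the README describes for the real option.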
248
+ ### Cross-check mode
249
+
250
+ Switch the LLM call from fill-gaps (ask only about missing fields) to cross-check (ask about every schema field, whether the rules resolved it or not):
251
+
252
+ ```typescript
253
+ const extractor = createExtractor({
254
+ schema: InvoiceSchema,
255
+ rules: [...],
256
+ llm: {
257
+ provider,
258
+ mode: 'cross-check',
259
+ crossCheckHints: 'unbiased', // default; hides rule values from the LLM
260
+ },
261
+ });
262
+ ```
263
+
264
+ The merge step now sees two candidates per field and surfaces real disagreements through `result.conflicts`. `crossCheckHints: 'bias'` re-exposes the rule values as hints to save tokens, at the cost of confirmation bias (the LLM tends to agree with what it was shown).
265
+
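Reviewer note: the two modes differ in when the provider is called and which fields it is asked about. The sketch below restates the decision made in `dist/extractor.js` later in this diff (`shouldCallLlm` and `llmTargetFields`) as a standalone function; `planLlmCall` and its parameter names are hypothetical.

```typescript
type Mode = 'fill-gaps' | 'cross-check';

interface LlmCallPlan {
  call: boolean;
  targetFields: readonly string[];
}

// Decide whether the provider is called and which fields it covers.
// - fill-gaps: only when some fields are missing, and only for those fields.
// - cross-check: always (when an LLM is configured), for every schema field.
function planLlmCall(
  hasLlm: boolean,
  mode: Mode,
  allFields: readonly string[],
  missing: readonly string[],
): LlmCallPlan {
  const call = hasLlm && (mode === 'cross-check' || missing.length > 0);
  const targetFields = mode === 'cross-check' ? allFields : missing;
  return { call, targetFields: call ? targetFields : [] };
}
```

One consequence worth noting: in cross-check mode a document fully resolved by rules still costs an LLM call, which is the price of being able to detect conflicts at all.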
266
+ ### Rich schemas
267
+
268
+ The JSON Schema handed to the LLM supports the Zod constructs that show up in real-world extraction targets:
269
+
270
+ - Primitives: `z.string()`, `z.number()`, `z.boolean()`, `z.enum([...])`.
271
+ - Composition: `z.array(...)`, `z.object({...})`, nested arbitrarily.
272
+ - Wrappers: `.nullable()`, `.optional()`, `.default(...)`.
273
+ - Descriptions: `z.string().describe("price in EUR, tax included")` propagates to the JSON Schema `description` at the declared level (array root vs items, object root vs property), and providers' structured-output features consume it natively. No need to inflate the system prompt with per-field hints.
274
+
164
275
  ### Normalizers
165
276
 
166
277
  Post-merge transformations. Run in sequence, receive the merged data + original content:
@@ -184,9 +295,9 @@ const extractor = createExtractor({
184
295
  Check the final output for logical consistency:
185
296
 
186
297
  ```typescript
187
- import { validators } from 'llmbic';
298
+ import { validator } from 'llmbic';
188
299
 
189
- const { field, crossField } = validators<MySchemaShape>();
300
+ const { field, crossField } = validator.of<MySchemaShape>();
190
301
 
191
302
  const extractor = createExtractor({
192
303
  schema: MySchema,
@@ -202,6 +313,30 @@ result.validation;
202
313
  // or { valid: false, violations: [{ field: 'price', rule: 'price_positive', message: '...', severity: 'error' }] }
203
314
  ```
204
315
 
316
+ ### Request / response transformers
317
+
318
+ Two optional hooks let you intercept the LLM exchange without wrapping the provider yourself: `transformRequest` runs after `prompt.build` and before `provider.complete`; `transformResponse` runs after `prompt.parse` and before the merge. Both can be async; errors propagate.
319
+
320
+ ```typescript
321
+ const extractor = createExtractor({
322
+ schema: ContactSchema,
323
+ rules: [...],
324
+ llm: {
325
+ provider,
326
+ transformRequest: (request, content) => ({
327
+ ...request,
328
+ systemPrompt: `Language: ${detectLocale(content)}\n${request.systemPrompt}`,
329
+ }),
330
+ },
331
+ });
332
+ ```
333
+
334
+ Common patterns:
335
+
336
+ - **PII redaction (GDPR)**: replace emails / phones / IDs with placeholders in `userContent`, stash the originals in `knownValues` under a private key, restore them in `transformResponse`. Worked example: [`examples/pii-redaction.ts`](./examples/pii-redaction.ts).
337
+ - **Locale tagging**: prepend `Language: ...` to `systemPrompt` after caller-side detection.
338
+ - **Caching**: wrap your `LlmProvider.complete` directly - cleaner than short-circuiting in a hook, since it sits at the actual transport boundary.
339
+
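Reviewer note: the redact-then-restore pattern from the list above, sketched as two standalone helpers. This is not the content of `examples/pii-redaction.ts`, which this diff does not show. `redactEmails`, `restoreValues`, the `[EMAIL_n]` placeholder format and the email pattern are all assumptions made for illustration, and only emails are handled.

```typescript
// Matches simple email addresses; deliberately narrow for the sketch.
const EMAIL_PATTERN = /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g;

interface Redaction {
  text: string;
  originals: Map<string, string>; // placeholder -> original value
}

// Replace each distinct email with a stable placeholder such as `[EMAIL_0]`.
// The same address always maps to the same placeholder.
function redactEmails(text: string): Redaction {
  const originals = new Map<string, string>();
  const placeholderByValue = new Map<string, string>();
  const redacted = text.replace(EMAIL_PATTERN, (match) => {
    let placeholder = placeholderByValue.get(match);
    if (placeholder === undefined) {
      placeholder = `[EMAIL_${placeholderByValue.size}]`;
      placeholderByValue.set(match, placeholder);
      originals.set(placeholder, match);
    }
    return placeholder;
  });
  return { text: redacted, originals };
}

// Put the original values back into every top-level string field of the
// LLM output. Non-string and nested values are passed through unchanged.
function restoreValues(
  values: Record<string, unknown>,
  originals: ReadonlyMap<string, string>,
): Record<string, unknown> {
  const restored: Record<string, unknown> = {};
  for (const field of Object.keys(values)) {
    let value = values[field];
    if (typeof value === 'string') {
      let text = value;
      originals.forEach((original, placeholder) => {
        text = text.split(placeholder).join(original);
      });
      value = text;
    }
    restored[field] = value;
  }
  return restored;
}
```

In an extractor, `redactEmails` would presumably run inside `transformRequest` on `userContent` and `restoreValues` inside `transformResponse`, with the `originals` map carried between the two hooks as the bullet above describes.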
205
340
  ## Writing a provider
206
341
 
207
342
  Llmbic does not ship vendor-specific adapters. The `LlmProvider` contract is a single method — wiring to any backend (OpenAI, Anthropic, Ollama, vLLM, Gemini, custom HTTP, ...) is ~10 lines you write and own.
@@ -227,7 +362,7 @@ const provider: LlmProvider = {
227
362
 
228
363
  Ready-made snippets for common backends:
229
364
 
230
- **OpenAI** (Chat Completions + Structured Outputs):
365
+ **OpenAI** (Chat Completions + Structured Outputs). The response schema llmbic emits always carries `additionalProperties: false`, so `strict: true` works out of the box:
231
366
 
232
367
  ```typescript
233
368
  const client = new OpenAI();
@@ -307,28 +442,41 @@ Creates an extractor instance. Config:
307
442
 
308
443
  | Field | Type | Required | Description |
309
444
  |-------|------|----------|-------------|
310
- | `schema` | `ZodObject` | yes | Output schema |
311
- | `rules` | `ExtractionRule[]` | yes | Deterministic extraction rules |
312
- | `llm` | `{ provider, systemPrompt, defaultConfidence? }` | no | LLM configuration. Omit for rules-only mode. |
313
- | `normalizers` | `Normalizer[]` | no | Post-merge transformations |
314
- | `validators` | `Validator[]` | no | Output invariants |
315
- | `conflictStrategy` | `'flag' \| 'prefer-rule' \| 'prefer-llm'` | no | Default: `'flag'` |
316
- | `logger` | `Logger` | no | Injectable logger (compatible with Pino, Winston, console) |
445
+ | `schema` | `ZodObject` | yes | Output schema (drives field enumeration and re-validation). |
446
+ | `rules` | `ExtractionRule[]` | yes | Deterministic extraction rules. |
447
+ | `llm` | `ExtractorLlmConfig` | no | LLM fallback. Omit for rules-only mode. See below. |
448
+ | `normalizers` | `Normalizer<T>[]` | no | Post-merge transformations, run in declared order. |
449
+ | `validators` | `Validator<ExtractedData<T>>[]` | no | Invariants populating `result.validation`. |
450
+ | `policy` | `Partial<FieldMergePolicy>` | no | Overrides the per-field merge policy (conflict strategy, confidence defaults, equality) for every field. |
451
+ | `policyByField` | `{ [K in keyof T]?: Partial<FieldMergePolicy> }` | no | Per-field overrides applied on top of `policy`. Precedence: defaults < `policy` < `policyByField[field]`. |
452
+ | `logger` | `Logger` | no | Pino/Winston/console-compatible. Warnings from `rule.apply` and `merge.apply` flow through. |
317
453
 
318
- ### `rule(field, extractFn)`
454
+ `ExtractorLlmConfig`:
319
455
 
320
- Factory to create an `ExtractionRule`.
456
+ | Field | Type | Required | Description |
457
+ |-------|------|----------|-------------|
458
+ | `provider` | `LlmProvider` | yes | Single-method adapter the extractor calls. |
459
+ | `systemPrompt` | `string` | no | Overrides the built-in system prompt. |
460
+ | `mode` | `'fill-gaps' \| 'cross-check'` | no | `'fill-gaps'` (default) asks the LLM only about fields the rules did not resolve. `'cross-check'` asks about every schema field so `merge.apply` can surface agreements / conflicts. |
461
+ | `crossCheckHints` | `'bias' \| 'unbiased'` | no | In cross-check mode only. `'unbiased'` (default) hides rule values from the LLM for genuine disagreement detection; `'bias'` re-exposes them to save tokens. |
462
+ | `transformRequest` | `(request, content) => LlmRequest \| Promise<LlmRequest>` | no | Hook called with the built request before `provider.complete`. PII redaction, locale tagging, etc. |
463
+ | `transformResponse` | `(result, request) => LlmResult \| Promise<LlmResult>` | no | Hook called with the parsed LLM result before the merge. PII restoration, post-processing, etc. |
321
464
 
322
- ### `confidence(value, score)`
465
+ ### `rule` namespace
323
466
 
324
- Factory to create a `RuleMatch` with a confidence score.
467
+ | Member | Signature | Description |
468
+ |---|---|---|
469
+ | `rule.create` | `(field, extract, options?) => ExtractionRule` | Declare a rule. `extract(content)` returns a `RuleMatch` or `null`. `options.id` sets the stable identifier surfaced in `result.sources`. |
470
+ | `rule.regex` | `(field, pattern, score, transform?, options?) => ExtractionRule` | Regex-based rule. On match, capture group 1 (or the full match) is fed to `transform`. `options.id` sets the stable identifier surfaced in `result.sources`. |
471
+ | `rule.confidence` | `(value, score) => RuleMatch` | Wrap a value and a confidence score; sugar for custom `extract` callbacks. |
472
+ | `rule.apply` | `(content, rules, schema, logger?) => RulesResult` | Run every rule, pick the highest-confidence match per field, type-check against the schema. |
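Reviewer note: two behaviours described in the table above, sketched without the library. The first is the "capture group 1, or the full match" fallback of `rule.regex`; the second is the "highest-confidence match per field" selection of `rule.apply`. `MatchSketch`, `regexMatch` and `pickBest` are hypothetical names. The tie-breaking shown (first declared wins) is an assumption, and the sketch expects a non-global pattern.

```typescript
// Illustrative stand-in for a rule match: a value plus its confidence score.
interface MatchSketch<T> {
  value: T;
  confidence: number;
}

// Apply `pattern` to `content`. On a match, feed capture group 1 to
// `transform`, or the full match when group 1 is absent.
// Assumes `pattern` has no `g` flag (with `g`, index 1 is the second match).
function regexMatch<T>(
  content: string,
  pattern: RegExp,
  score: number,
  transform: (raw: string) => T,
): MatchSketch<T> | null {
  const m = content.match(pattern);
  if (m === null) return null;
  const raw = m[1] ?? m[0];
  return { value: transform(raw), confidence: score };
}

// Keep the highest-confidence candidate; on a tie the earlier one is kept.
function pickBest<T>(matches: readonly MatchSketch<T>[]): MatchSketch<T> | null {
  let best: MatchSketch<T> | null = null;
  for (const match of matches) {
    if (best === null || match.confidence > best.confidence) best = match;
  }
  return best;
}
```

For example, `regexMatch('Date: 2026-04-16', /(\d{4}-\d{2}-\d{2})/, 0.95, (s) => s)` yields the captured date, while a pattern without a group such as `/EUR/` yields the whole match.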
325
473
 
326
- ### `validators<T>()`
474
+ ### `validator.of<T>()`
327
475
 
328
- Factory bound to the data shape `T`. Returns `{ field, crossField }`:
476
+ Binds a target data shape `T` and returns two validator builders:
329
477
 
330
- - `field(name, rule, checkFn, message, severity?)`: single-field validator. `checkFn` receives the precise type of the field (`T[name]`).
331
- - `crossField(rule, checkFn, message, severity?)`: whole-object validator, produces a violation without a `field` property.
478
+ - `field(name, ruleName, check, message, severity?)`: single-field validator. `check(value, data)` receives the precise type of the field (`T[name]`) as first argument.
479
+ - `crossField(ruleName, check, message, severity?)`: whole-object validator, produces a violation without a `field` property.
332
480
 
333
481
  Binding `T` once lets TypeScript infer each field's type from the field name, so predicates are fully typed without manual annotations.
334
482
 
@@ -336,10 +484,10 @@ Binding `T` once lets TypeScript infer each field's type from the field name, so
336
484
 
337
485
  | Method | Sync | Description |
338
486
  |--------|------|-------------|
339
- | `extract(content)` | async | Full pipeline: rules → LLM → merge → validate |
340
- | `extractSync(content)` | sync | Rules only. Returns partial result + missing fields. |
341
- | `prompt(content, partial)` | sync | Builds LLM prompt for missing fields only. |
342
- | `parse(raw)` | sync | Parses raw LLM JSON response. |
487
+ | `extract(content)` | async | Full pipeline: rules -> LLM (if configured) -> merge -> normalize -> validate. |
488
+ | `extractSync(content)` | sync | Rules only. Returns the partial result + `missing` fields. |
489
+ | `prompt(content, partial)` | sync | Builds the LLM request. Covers `partial.missing` in fill-gaps mode, every schema field in cross-check mode. |
490
+ | `parse(raw)` | sync | Parses a raw LLM JSON response, validating each field individually. Never throws. |
343
491
  | `merge(partial, llmResult, content)` | sync | Merges rules + LLM, detects conflicts, normalizes, validates. |
344
492
 
345
493
  ## License
package/dist/extractor.d.ts CHANGED
@@ -1,18 +1,24 @@
1
1
  import type { z } from 'zod';
2
2
  import type { Extractor, ExtractorConfig } from './types/extractor.types.js';
3
3
  /**
4
- * Bind a schema, deterministic rules and an optional LLM fallback into an
4
+ * Bind a schema, deterministic rules and their merge-time options into an
5
5
  * {@link Extractor}. The returned object exposes the extraction pipeline as
6
6
  * pre-configured methods; call sites stop having to thread `schema`,
7
- * `rules` and provider wiring through every step.
7
+ * `rules`, `policy`, normalizers/validators and provider wiring through
8
+ * every step.
8
9
  *
9
- * {@link Extractor.extract} runs {@link rule.apply}, then when an LLM is
10
- * configured and some fields are still missing — asks the provider for those
11
- * fields only, parses the response with {@link prompt.parse} and fuses
12
- * everything through {@link merge.apply}.
10
+ * {@link Extractor.extract} runs {@link rule.apply}, then - when an LLM is
11
+ * configured - asks the provider either for the missing fields only
12
+ * (`mode: 'fill-gaps'`, default) or for every schema field
13
+ * (`mode: 'cross-check'`, which always triggers the LLM call so conflicts
14
+ * can be detected even when the rules resolved every field). The response
15
+ * is parsed with {@link prompt.parse} and fused through {@link merge.apply}.
13
16
  *
14
17
  * @typeParam S - A Zod object schema describing the target data shape.
15
- * @param config - Schema, deterministic rules, and optional LLM fallback.
18
+ * @param config - Schema, deterministic rules, and optional LLM fallback,
19
+ * plus `policy`, `normalizers`, `validators` and `logger` forwarded to
20
+ * every internal {@link merge.apply} call. The logger is also forwarded
21
+ * to {@link rule.apply} so schema-rejection warnings stay visible.
16
22
  * @returns An {@link Extractor} bound to `config.schema`.
17
23
  */
18
24
  export declare function createExtractor<S extends z.ZodObject<z.ZodRawShape>>(config: ExtractorConfig<S>): Extractor<z.infer<S>>;
package/dist/extractor.d.ts.map CHANGED
@@ -1 +1 @@
1
- {"version":3,"file":"extractor.d.ts","sourceRoot":"","sources":["../src/extractor.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AAI7B,OAAO,KAAK,EAAE,SAAS,EAAE,eAAe,EAAE,MAAM,4BAA4B,CAAC;AA8C7E;;;;;;;;;;;;;;GAcG;AACH,wBAAgB,eAAe,CAAC,CAAC,SAAS,CAAC,CAAC,SAAS,CAAC,CAAC,CAAC,WAAW,CAAC,EAClE,MAAM,EAAE,eAAe,CAAC,CAAC,CAAC,GACzB,SAAS,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,CAuDvB"}
1
+ {"version":3,"file":"extractor.d.ts","sourceRoot":"","sources":["../src/extractor.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AAI7B,OAAO,KAAK,EAAE,SAAS,EAAE,eAAe,EAAE,MAAM,4BAA4B,CAAC;AAmD7E;;;;;;;;;;;;;;;;;;;;GAoBG;AACH,wBAAgB,eAAe,CAAC,CAAC,SAAS,CAAC,CAAC,SAAS,CAAC,CAAC,CAAC,WAAW,CAAC,EAClE,MAAM,EAAE,eAAe,CAAC,CAAC,CAAC,GACzB,SAAS,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,CA4EvB"}
package/dist/extractor.js CHANGED
@@ -10,6 +10,7 @@ import { prompt } from './prompt.js';
10
10
  function rulesResultFromPartial(partial, allFields) {
11
11
  const values = {};
12
12
  const confidence = {};
13
+ const sourceIds = {};
13
14
  for (const field of allFields) {
14
15
  const value = partial.data[field];
15
16
  if (value === null) {
@@ -20,8 +21,12 @@ function rulesResultFromPartial(partial, allFields) {
20
21
  if (fieldConfidence !== null) {
21
22
  confidence[field] = fieldConfidence;
22
23
  }
24
+ const source = partial.sources[field];
25
+ if (source !== null && 'ruleId' in source) {
26
+ sourceIds[field] = source.ruleId;
27
+ }
23
28
  }
24
- return { values, confidence, missing: [...partial.missing] };
29
+ return { values, confidence, sourceIds, missing: [...partial.missing] };
25
30
  }
26
31
  /**
27
32
  * Stamp `result.meta.durationMs` with the wall-clock elapsed since `startedAt`.
@@ -36,18 +41,24 @@ function stampDuration(result, startedAt) {
36
41
  };
37
42
  }
38
43
  /**
39
- * Bind a schema, deterministic rules and an optional LLM fallback into an
44
+ * Bind a schema, deterministic rules and their merge-time options into an
40
45
  * {@link Extractor}. The returned object exposes the extraction pipeline as
41
46
  * pre-configured methods; call sites stop having to thread `schema`,
42
- * `rules` and provider wiring through every step.
47
+ * `rules`, `policy`, normalizers/validators and provider wiring through
48
+ * every step.
43
49
  *
44
- * {@link Extractor.extract} runs {@link rule.apply}, then when an LLM is
45
- * configured and some fields are still missing — asks the provider for those
46
- * fields only, parses the response with {@link prompt.parse} and fuses
47
- * everything through {@link merge.apply}.
50
+ * {@link Extractor.extract} runs {@link rule.apply}, then - when an LLM is
51
+ * configured - asks the provider either for the missing fields only
52
+ * (`mode: 'fill-gaps'`, default) or for every schema field
53
+ * (`mode: 'cross-check'`, which always triggers the LLM call so conflicts
54
+ * can be detected even when the rules resolved every field). The response
55
+ * is parsed with {@link prompt.parse} and fused through {@link merge.apply}.
48
56
  *
49
57
  * @typeParam S - A Zod object schema describing the target data shape.
50
- * @param config - Schema, deterministic rules, and optional LLM fallback.
58
+ * @param config - Schema, deterministic rules, and optional LLM fallback,
59
+ * plus `policy`, `normalizers`, `validators` and `logger` forwarded to
60
+ * every internal {@link merge.apply} call. The logger is also forwarded
61
+ * to {@link rule.apply} so schema-rejection warnings stay visible.
51
62
  * @returns An {@link Extractor} bound to `config.schema`.
52
63
  */
53
64
  export function createExtractor(config) {
@@ -55,32 +66,49 @@ export function createExtractor(config) {
55
66
  if (allFields.length === 0) {
56
67
  throw new Error('createExtractor: schema must declare at least one field');
57
68
  }
69
+ const buildOptions = {
70
+ systemPrompt: config.llm?.systemPrompt,
71
+ mode: config.llm?.mode ?? 'fill-gaps',
72
+ crossCheckHints: config.llm?.crossCheckHints ?? 'unbiased',
73
+ };
74
+ const mergeOptions = {
75
+ policy: config.policy,
76
+ policyByField: config.policyByField,
77
+ normalizers: config.normalizers,
78
+ validators: config.validators,
79
+ logger: config.logger,
80
+ };
58
81
  return {
59
82
  async extract(content) {
60
83
  const startedAt = performance.now();
61
- const rulesResult = rule.apply(content, config.rules, config.schema);
62
- const partial = merge.apply(config.schema, rulesResult, null, content);
63
- if (!config.llm || partial.missing.length === 0) {
84
+ const rulesResult = rule.apply(content, config.rules, config.schema, config.logger);
85
+ const partial = merge.apply(config.schema, rulesResult, null, content, mergeOptions);
86
+ const shouldCallLlm = config.llm !== undefined &&
87
+ (buildOptions.mode === 'cross-check' || partial.missing.length > 0);
88
+ if (!shouldCallLlm) {
64
89
  return stampDuration(partial, startedAt);
65
90
  }
66
- const request = prompt.build(config.schema, partial, content, {
67
- systemPrompt: config.llm.systemPrompt,
68
- });
91
+ const builtRequest = prompt.build(config.schema, partial, content, buildOptions);
92
+ const request = config.llm.transformRequest
93
+ ? await config.llm.transformRequest(builtRequest, content)
94
+ : builtRequest;
69
95
  const completion = await config.llm.provider.complete(request);
70
- const llmResult = prompt.parse(config.schema, partial.missing, completion.values);
71
- const final = merge.apply(config.schema, rulesResult, llmResult, content);
96
+ const llmTargetFields = buildOptions.mode === 'cross-check' ? allFields : partial.missing;
97
+ const parsedLlmResult = prompt.parse(config.schema, llmTargetFields, completion.values);
98
+ const llmResult = config.llm.transformResponse
99
+ ? await config.llm.transformResponse(parsedLlmResult, request)
100
+ : parsedLlmResult;
101
+ const final = merge.apply(config.schema, rulesResult, llmResult, content, mergeOptions);
72
102
  return stampDuration(final, startedAt);
73
103
  },
74
104
  extractSync(content) {
75
105
  const startedAt = performance.now();
76
- const rulesResult = rule.apply(content, config.rules, config.schema);
77
- const partial = merge.apply(config.schema, rulesResult, null, content);
106
+ const rulesResult = rule.apply(content, config.rules, config.schema, config.logger);
107
+ const partial = merge.apply(config.schema, rulesResult, null, content, mergeOptions);
78
108
  return stampDuration(partial, startedAt);
79
109
  },
80
110
  prompt(content, partial) {
81
- return prompt.build(config.schema, partial, content, {
82
- systemPrompt: config.llm?.systemPrompt,
83
- });
111
+ return prompt.build(config.schema, partial, content, buildOptions);
84
112
  },
85
113
  parse(raw) {
86
114
  return prompt.parse(config.schema, allFields, raw);
@@ -88,7 +116,7 @@ export function createExtractor(config) {
88
116
  merge(partial, llmResult, content) {
89
117
  const startedAt = performance.now();
90
118
  const rulesResult = rulesResultFromPartial(partial, allFields);
91
- const result = merge.apply(config.schema, rulesResult, llmResult, content);
119
+ const result = merge.apply(config.schema, rulesResult, llmResult, content, mergeOptions);
92
120
  return stampDuration(result, startedAt);
93
121
  },
94
122
  };
package/dist/extractor.js.map CHANGED
@@ -1 +1 @@
1
- {"version":3,"file":"extractor.js","sourceRoot":"","sources":["../src/extractor.ts"],"names":[],"mappings":"AACA,OAAO,EAAE,IAAI,EAAE,MAAM,YAAY,CAAC;AAClC,OAAO,EAAE,KAAK,EAAE,MAAM,YAAY,CAAC;AACnC,OAAO,EAAE,MAAM,EAAE,MAAM,aAAa,CAAC;AAKrC;;;;;GAKG;AACH,SAAS,sBAAsB,CAC7B,OAA4B,EAC5B,SAA+B;IAE/B,MAAM,MAAM,GAAe,EAAE,CAAC;IAC9B,MAAM,UAAU,GAAqC,EAAE,CAAC;IACxD,KAAK,MAAM,KAAK,IAAI,SAAS,EAAE,CAAC;QAC9B,MAAM,KAAK,GAAG,OAAO,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC;QAClC,IAAI,KAAK,KAAK,IAAI,EAAE,CAAC;YACnB,SAAS;QACX,CAAC;QACD,MAAM,CAAC,KAAK,CAAC,GAAG,KAAmB,CAAC;QACpC,MAAM,eAAe,GAAG,OAAO,CAAC,UAAU,CAAC,KAAK,CAAC,CAAC;QAClD,IAAI,eAAe,KAAK,IAAI,EAAE,CAAC;YAC7B,UAAU,CAAC,KAAK,CAAC,GAAG,eAAe,CAAC;QACtC,CAAC;IACH,CAAC;IACD,OAAO,EAAE,MAAM,EAAE,UAAU,EAAE,OAAO,EAAE,CAAC,GAAG,OAAO,CAAC,OAAO,CAAC,EAAE,CAAC;AAC/D,CAAC;AAED;;;;;GAKG;AACH,SAAS,aAAa,CACpB,MAA2B,EAC3B,SAAiB;IAEjB,OAAO;QACL,GAAG,MAAM;QACT,IAAI,EAAE,EAAE,GAAG,MAAM,CAAC,IAAI,EAAE,UAAU,EAAE,WAAW,CAAC,GAAG,EAAE,GAAG,SAAS,EAAE;KACpE,CAAC;AACJ,CAAC;AAED;;;;;;;;;;;;;;GAcG;AACH,MAAM,UAAU,eAAe,CAC7B,MAA0B;IAG1B,MAAM,SAAS,GAAG,MAAM,CAAC,IAAI,CAAC,MAAM,CAAC,MAAM,CAAC,KAAK,CAAmB,CAAC;IAErE,IAAI,SAAS,CAAC,MAAM,KAAK,CAAC,EAAE,CAAC;QAC3B,MAAM,IAAI,KAAK,CAAC,yDAAyD,CAAC,CAAC;IAC7E,CAAC;IAED,OAAO;QACL,KAAK,CAAC,OAAO,CAAC,OAAO;YACnB,MAAM,SAAS,GAAG,WAAW,CAAC,GAAG,EAAE,CAAC;YACpC,MAAM,WAAW,GAAG,IAAI,CAAC,KAAK,CAAC,OAAO,EAAE,MAAM,CAAC,KAAK,EAAE,MAAM,CAAC,MAAM,CAAC,CAAC;YACrE,MAAM,OAAO,GAAG,KAAK,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,WAAW,EAAE,IAAI,EAAE,OAAO,CAAC,CAAC;YAEvE,IAAI,CAAC,MAAM,CAAC,GAAG,IAAI,OAAO,CAAC,OAAO,CAAC,MAAM,KAAK,CAAC,EAAE,CAAC;gBAChD,OAAO,aAAa,CAAC,OAAO,EAAE,SAAS,CAAC,CAAC;YAC3C,CAAC;YAED,MAAM,OAAO,GAAG,MAAM,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,OAAO,EAAE,OAAO,EAAE;gBAC5D,YAAY,EAAE,MAAM,CAAC,GAAG,CAAC,YAAY;aACtC,CAAC,CAAC;YACH,MAAM,UAAU,GAAG,MAAM,MAAM,CAAC,GAAG,CAAC,QAAQ,CAAC,QAAQ,CAAC,OAAO,CAAC,CAAC;YAC/D,MAAM,SAAS,GAAG,MAAM,CAAC,KAAK,CAC5B,MAAM,CAAC,MAAM,EACb,OAAO,CAAC,OAAO,EACf,UAAU,CAAC,MAAM,CAClB,CAAC;YACF,MAAM,KAAK,GAAG,KAAK,CAAC,KAAK,CAAC
,MAAM,CAAC,MAAM,EAAE,WAAW,EAAE,SAAS,EAAE,OAAO,CAAC,CAAC;YAC1E,OAAO,aAAa,CAAC,KAAK,EAAE,SAAS,CAAC,CAAC;QACzC,CAAC;QAED,WAAW,CAAC,OAAO;YACjB,MAAM,SAAS,GAAG,WAAW,CAAC,GAAG,EAAE,CAAC;YACpC,MAAM,WAAW,GAAG,IAAI,CAAC,KAAK,CAAC,OAAO,EAAE,MAAM,CAAC,KAAK,EAAE,MAAM,CAAC,MAAM,CAAC,CAAC;YACrE,MAAM,OAAO,GAAG,KAAK,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,WAAW,EAAE,IAAI,EAAE,OAAO,CAAC,CAAC;YACvE,OAAO,aAAa,CAAC,OAAO,EAAE,SAAS,CAAC,CAAC;QAC3C,CAAC;QAED,MAAM,CAAC,OAAO,EAAE,OAAO;YACrB,OAAO,MAAM,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,OAAO,EAAE,OAAO,EAAE;gBACnD,YAAY,EAAE,MAAM,CAAC,GAAG,EAAE,YAAY;aACvC,CAAC,CAAC;QACL,CAAC;QAED,KAAK,CAAC,GAAG;YACP,OAAO,MAAM,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,SAAS,EAAE,GAAG,CAAC,CAAC;QACrD,CAAC;QAED,KAAK,CAAC,OAAO,EAAE,SAAS,EAAE,OAAO;YAC/B,MAAM,SAAS,GAAG,WAAW,CAAC,GAAG,EAAE,CAAC;YACpC,MAAM,WAAW,GAAG,sBAAsB,CAAC,OAAO,EAAE,SAAS,CAAC,CAAC;YAC/D,MAAM,MAAM,GAAG,KAAK,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,WAAW,EAAE,SAAS,EAAE,OAAO,CAAC,CAAC;YAC3E,OAAO,aAAa,CAAC,MAAM,EAAE,SAAS,CAAC,CAAC;QAC1C,CAAC;KACF,CAAC;AACJ,CAAC"}
+ {"version":3,"file":"extractor.js","sourceRoot":"","sources":["../src/extractor.ts"],"names":[],"mappings":"AACA,OAAO,EAAE,IAAI,EAAE,MAAM,YAAY,CAAC;AAClC,OAAO,EAAE,KAAK,EAAE,MAAM,YAAY,CAAC;AACnC,OAAO,EAAE,MAAM,EAAE,MAAM,aAAa,CAAC;AAKrC;;;;;GAKG;AACH,SAAS,sBAAsB,CAC7B,OAA4B,EAC5B,SAA+B;IAE/B,MAAM,MAAM,GAAe,EAAE,CAAC;IAC9B,MAAM,UAAU,GAAqC,EAAE,CAAC;IACxD,MAAM,SAAS,GAAqC,EAAE,CAAC;IACvD,KAAK,MAAM,KAAK,IAAI,SAAS,EAAE,CAAC;QAC9B,MAAM,KAAK,GAAG,OAAO,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC;QAClC,IAAI,KAAK,KAAK,IAAI,EAAE,CAAC;YACnB,SAAS;QACX,CAAC;QACD,MAAM,CAAC,KAAK,CAAC,GAAG,KAAmB,CAAC;QACpC,MAAM,eAAe,GAAG,OAAO,CAAC,UAAU,CAAC,KAAK,CAAC,CAAC;QAClD,IAAI,eAAe,KAAK,IAAI,EAAE,CAAC;YAC7B,UAAU,CAAC,KAAK,CAAC,GAAG,eAAe,CAAC;QACtC,CAAC;QACD,MAAM,MAAM,GAAG,OAAO,CAAC,OAAO,CAAC,KAAK,CAAC,CAAC;QACtC,IAAI,MAAM,KAAK,IAAI,IAAI,QAAQ,IAAI,MAAM,EAAE,CAAC;YAC1C,SAAS,CAAC,KAAK,CAAC,GAAG,MAAM,CAAC,MAAM,CAAC;QACnC,CAAC;IACH,CAAC;IACD,OAAO,EAAE,MAAM,EAAE,UAAU,EAAE,SAAS,EAAE,OAAO,EAAE,CAAC,GAAG,OAAO,CAAC,OAAO,CAAC,EAAE,CAAC;AAC1E,CAAC;AAED;;;;;GAKG;AACH,SAAS,aAAa,CACpB,MAA2B,EAC3B,SAAiB;IAEjB,OAAO;QACL,GAAG,MAAM;QACT,IAAI,EAAE,EAAE,GAAG,MAAM,CAAC,IAAI,EAAE,UAAU,EAAE,WAAW,CAAC,GAAG,EAAE,GAAG,SAAS,EAAE;KACpE,CAAC;AACJ,CAAC;AAED;;;;;;;;;;;;;;;;;;;;GAoBG;AACH,MAAM,UAAU,eAAe,CAC7B,MAA0B;IAG1B,MAAM,SAAS,GAAG,MAAM,CAAC,IAAI,CAAC,MAAM,CAAC,MAAM,CAAC,KAAK,CAAmB,CAAC;IAErE,IAAI,SAAS,CAAC,MAAM,KAAK,CAAC,EAAE,CAAC;QAC3B,MAAM,IAAI,KAAK,CAAC,yDAAyD,CAAC,CAAC;IAC7E,CAAC;IAED,MAAM,YAAY,GAAG;QACnB,YAAY,EAAE,MAAM,CAAC,GAAG,EAAE,YAAY;QACtC,IAAI,EAAE,MAAM,CAAC,GAAG,EAAE,IAAI,IAAI,WAAW;QACrC,eAAe,EAAE,MAAM,CAAC,GAAG,EAAE,eAAe,IAAI,UAAU;KAClD,CAAC;IAEX,MAAM,YAAY,GAA4B;QAC5C,MAAM,EAAE,MAAM,CAAC,MAAM;QACrB,aAAa,EAAE,MAAM,CAAC,aAAa;QACnC,WAAW,EAAE,MAAM,CAAC,WAAW;QAC/B,UAAU,EAAE,MAAM,CAAC,UAAU;QAC7B,MAAM,EAAE,MAAM,CAAC,MAAM;KACtB,CAAC;IAEF,OAAO;QACL,KAAK,CAAC,OAAO,CAAC,OAAO;YACnB,MAAM,SAAS,GAAG,WAAW,CAAC,GAAG,EAAE,CAAC;YACpC,MAAM,WAAW,GAAG,IAAI,CAAC,KAAK,CAAC,OAAO,EAAE,MAAM,CAAC,KAAK,EAAE,MAAM,CAAC,MAAM,EAAE,MAAM,CAAC,MAAM,CAAC,CA
AC;YACpF,MAAM,OAAO,GAAG,KAAK,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,WAAW,EAAE,IAAI,EAAE,OAAO,EAAE,YAAY,CAAC,CAAC;YAErF,MAAM,aAAa,GACjB,MAAM,CAAC,GAAG,KAAK,SAAS;gBACxB,CAAC,YAAY,CAAC,IAAI,KAAK,aAAa,IAAI,OAAO,CAAC,OAAO,CAAC,MAAM,GAAG,CAAC,CAAC,CAAC;YACtE,IAAI,CAAC,aAAa,EAAE,CAAC;gBACnB,OAAO,aAAa,CAAC,OAAO,EAAE,SAAS,CAAC,CAAC;YAC3C,CAAC;YAED,MAAM,YAAY,GAAG,MAAM,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,OAAO,EAAE,OAAO,EAAE,YAAY,CAAC,CAAC;YACjF,MAAM,OAAO,GAAG,MAAM,CAAC,GAAI,CAAC,gBAAgB;gBAC1C,CAAC,CAAC,MAAM,MAAM,CAAC,GAAI,CAAC,gBAAgB,CAAC,YAAY,EAAE,OAAO,CAAC;gBAC3D,CAAC,CAAC,YAAY,CAAC;YACjB,MAAM,UAAU,GAAG,MAAM,MAAM,CAAC,GAAI,CAAC,QAAQ,CAAC,QAAQ,CAAC,OAAO,CAAC,CAAC;YAChE,MAAM,eAAe,GACnB,YAAY,CAAC,IAAI,KAAK,aAAa,CAAC,CAAC,CAAC,SAAS,CAAC,CAAC,CAAC,OAAO,CAAC,OAAO,CAAC;YACpE,MAAM,eAAe,GAAG,MAAM,CAAC,KAAK,CAClC,MAAM,CAAC,MAAM,EACb,eAAe,EACf,UAAU,CAAC,MAAM,CAClB,CAAC;YACF,MAAM,SAAS,GAAG,MAAM,CAAC,GAAI,CAAC,iBAAiB;gBAC7C,CAAC,CAAC,MAAM,MAAM,CAAC,GAAI,CAAC,iBAAiB,CAAC,eAAe,EAAE,OAAO,CAAC;gBAC/D,CAAC,CAAC,eAAe,CAAC;YACpB,MAAM,KAAK,GAAG,KAAK,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,WAAW,EAAE,SAAS,EAAE,OAAO,EAAE,YAAY,CAAC,CAAC;YACxF,OAAO,aAAa,CAAC,KAAK,EAAE,SAAS,CAAC,CAAC;QACzC,CAAC;QAED,WAAW,CAAC,OAAO;YACjB,MAAM,SAAS,GAAG,WAAW,CAAC,GAAG,EAAE,CAAC;YACpC,MAAM,WAAW,GAAG,IAAI,CAAC,KAAK,CAAC,OAAO,EAAE,MAAM,CAAC,KAAK,EAAE,MAAM,CAAC,MAAM,EAAE,MAAM,CAAC,MAAM,CAAC,CAAC;YACpF,MAAM,OAAO,GAAG,KAAK,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,WAAW,EAAE,IAAI,EAAE,OAAO,EAAE,YAAY,CAAC,CAAC;YACrF,OAAO,aAAa,CAAC,OAAO,EAAE,SAAS,CAAC,CAAC;QAC3C,CAAC;QAED,MAAM,CAAC,OAAO,EAAE,OAAO;YACrB,OAAO,MAAM,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,OAAO,EAAE,OAAO,EAAE,YAAY,CAAC,CAAC;QACrE,CAAC;QAED,KAAK,CAAC,GAAG;YACP,OAAO,MAAM,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,SAAS,EAAE,GAAG,CAAC,CAAC;QACrD,CAAC;QAED,KAAK,CAAC,OAAO,EAAE,SAAS,EAAE,OAAO;YAC/B,MAAM,SAAS,GAAG,WAAW,CAAC,GAAG,EAAE,CAAC;YACpC,MAAM,WAAW,GAAG,sBAAsB,CAAC,OAAO,EAAE,SAAS,CAAC,CAAC;YAC/D,MAAM,MAAM,GAAG,KAAK,CAAC,KAAK,CAAC,MAAM,CAAC,MAAM,EAAE,WAAW,EAAE,SAAS,EAAE,OAAO,EAAE,YAAY,CAAC,CAAC;YACz
F,OAAO,aAAa,CAAC,MAAM,EAAE,SAAS,CAAC,CAAC;QAC1C,CAAC;KACF,CAAC;AACJ,CAAC"}
package/dist/index.d.ts CHANGED
@@ -16,9 +16,9 @@ export { prompt } from './prompt.js';
  export { validator } from './validate.js';
  export type { ExtractionRule, RuleMatch, RulesResult, } from './types/rule.types.js';
  export type { Extractor, ExtractorConfig, ExtractorLlmConfig, } from './types/extractor.types.js';
- export type { LlmRequest } from './types/prompt.types.js';
+ export type { CrossCheckHints, LlmRequest, PromptBuildMode, PromptBuildOptions, } from './types/prompt.types.js';
  export type { LlmProvider } from './types/provider.types.js';
  export type { Logger } from './types/logger.types.js';
  export type { Severity, Violation, Validator, } from './types/validate.types.js';
- export type { Conflict, ConflictStrategy, ExtractedData, ExtractionMeta, ExtractionResult, FieldCompare, FieldMergePolicy, FieldMergeResult, LlmResult, MergeApplyOptions, Normalizer, ValidationResult, } from './types/merge.types.js';
+ export type { Conflict, ConflictStrategy, ExtractedData, ExtractionMeta, ExtractionResult, FieldCompare, FieldMergePolicy, FieldMergeResult, FieldSource, LlmResult, MergeApplyOptions, Normalizer, ValidationResult, } from './types/merge.types.js';
  //# sourceMappingURL=index.d.ts.map
  //# sourceMappingURL=index.d.ts.map
@@ -1 +1 @@
- {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":"AAAA;;;;;;;;;;GAUG;AAEH,OAAO,EAAE,eAAe,EAAE,MAAM,gBAAgB,CAAC;AACjD,OAAO,EAAE,IAAI,EAAE,MAAM,YAAY,CAAC;AAClC,OAAO,EAAE,KAAK,EAAE,MAAM,YAAY,CAAC;AACnC,OAAO,EAAE,MAAM,EAAE,MAAM,aAAa,CAAC;AACrC,OAAO,EAAE,SAAS,EAAE,MAAM,eAAe,CAAC;AAE1C,YAAY,EACV,cAAc,EACd,SAAS,EACT,WAAW,GACZ,MAAM,uBAAuB,CAAC;AAE/B,YAAY,EACV,SAAS,EACT,eAAe,EACf,kBAAkB,GACnB,MAAM,4BAA4B,CAAC;AAEpC,YAAY,EAAE,UAAU,EAAE,MAAM,yBAAyB,CAAC;AAC1D,YAAY,EAAE,WAAW,EAAE,MAAM,2BAA2B,CAAC;AAC7D,YAAY,EAAE,MAAM,EAAE,MAAM,yBAAyB,CAAC;AAEtD,YAAY,EACV,QAAQ,EACR,SAAS,EACT,SAAS,GACV,MAAM,2BAA2B,CAAC;AAEnC,YAAY,EACV,QAAQ,EACR,gBAAgB,EAChB,aAAa,EACb,cAAc,EACd,gBAAgB,EAChB,YAAY,EACZ,gBAAgB,EAChB,gBAAgB,EAChB,SAAS,EACT,iBAAiB,EACjB,UAAU,EACV,gBAAgB,GACjB,MAAM,wBAAwB,CAAC"}
+ {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":"AAAA;;;;;;;;;;GAUG;AAEH,OAAO,EAAE,eAAe,EAAE,MAAM,gBAAgB,CAAC;AACjD,OAAO,EAAE,IAAI,EAAE,MAAM,YAAY,CAAC;AAClC,OAAO,EAAE,KAAK,EAAE,MAAM,YAAY,CAAC;AACnC,OAAO,EAAE,MAAM,EAAE,MAAM,aAAa,CAAC;AACrC,OAAO,EAAE,SAAS,EAAE,MAAM,eAAe,CAAC;AAE1C,YAAY,EACV,cAAc,EACd,SAAS,EACT,WAAW,GACZ,MAAM,uBAAuB,CAAC;AAE/B,YAAY,EACV,SAAS,EACT,eAAe,EACf,kBAAkB,GACnB,MAAM,4BAA4B,CAAC;AAEpC,YAAY,EACV,eAAe,EACf,UAAU,EACV,eAAe,EACf,kBAAkB,GACnB,MAAM,yBAAyB,CAAC;AACjC,YAAY,EAAE,WAAW,EAAE,MAAM,2BAA2B,CAAC;AAC7D,YAAY,EAAE,MAAM,EAAE,MAAM,yBAAyB,CAAC;AAEtD,YAAY,EACV,QAAQ,EACR,SAAS,EACT,SAAS,GACV,MAAM,2BAA2B,CAAC;AAEnC,YAAY,EACV,QAAQ,EACR,gBAAgB,EAChB,aAAa,EACb,cAAc,EACd,gBAAgB,EAChB,YAAY,EACZ,gBAAgB,EAChB,gBAAgB,EAChB,WAAW,EACX,SAAS,EACT,iBAAiB,EACjB,UAAU,EACV,gBAAgB,GACjB,MAAM,wBAAwB,CAAC"}
@@ -1 +1 @@
- {"version":3,"file":"merge.d.ts","sourceRoot":"","sources":["../src/merge.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AAC7B,OAAO,KAAK,EAAE,MAAM,EAAE,MAAM,yBAAyB,CAAC;AACtD,OAAO,KAAK,EAAE,SAAS,EAAE,WAAW,EAAE,MAAM,uBAAuB,CAAC;AAEpE,OAAO,KAAK,EAGV,gBAAgB,EAChB,gBAAgB,EAChB,gBAAgB,EAChB,SAAS,EACT,iBAAiB,EAElB,MAAM,wBAAwB,CAAC;AA+GhC;;;;;GAKG;AACH,eAAO,MAAM,KAAK;IAChB;;;;;;OAMG;;QAED,6CAA6C;;QAE7C,yDAAyD;;QAEzD,sDAAsD;;QAEtD,wDAAwD;;QAExD,qGAAqG;qBACxF,OAAO,KAAK,OAAO,KAAG,OAAO;;IAQ5C;;;;;;;;;;;;;;;;;;;;OAoBG;UACG,CAAC,SACE,MAAM,aACF,SAAS,CAAC,CAAC,CAAC,GAAG,IAAI,YACpB,OAAO,WACR,OAAO,CAAC,gBAAgB,CAAC,WACzB,MAAM,GACd,gBAAgB,CAAC,CAAC,CAAC;IAgEtB;;;;;;;;;;;;;;;;;;;;OAoBG;UACG,CAAC,SAAS,CAAC,CAAC,SAAS,CAAC,CAAC,CAAC,WAAW,CAAC,UAChC,CAAC,eACI,WAAW,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,aACzB,SAAS,GAAG,IAAI,WAClB,MAAM,YACL,iBAAiB,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,GACtC,gBAAgB,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC;CAmChC,CAAC"}
+ {"version":3,"file":"merge.d.ts","sourceRoot":"","sources":["../src/merge.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AAC7B,OAAO,KAAK,EAAE,MAAM,EAAE,MAAM,yBAAyB,CAAC;AACtD,OAAO,KAAK,EAAE,SAAS,EAAE,WAAW,EAAE,MAAM,uBAAuB,CAAC;AAEpE,OAAO,KAAK,EAGV,gBAAgB,EAChB,gBAAgB,EAChB,gBAAgB,EAEhB,SAAS,EACT,iBAAiB,EAElB,MAAM,wBAAwB,CAAC;AAuKhC;;;;;GAKG;AACH,eAAO,MAAM,KAAK;IAChB;;;;;;OAMG;;QAED,6CAA6C;;QAE7C,yDAAyD;;QAEzD,sDAAsD;;QAEtD,wDAAwD;;QAExD,qGAAqG;qBACxF,OAAO,KAAK,OAAO,KAAG,OAAO;;IAQ5C;;;;;;;;;;;;;;;;;;;;OAoBG;UACG,CAAC,SACE,MAAM,aACF,SAAS,CAAC,CAAC,CAAC,GAAG,IAAI,YACpB,OAAO,WACR,OAAO,CAAC,gBAAgB,CAAC,WACzB,MAAM,GACd,gBAAgB,CAAC,CAAC,CAAC;IAgEtB;;;;;;;;;;;;;;;;;;;;OAoBG;UACG,CAAC,SAAS,CAAC,CAAC,SAAS,CAAC,CAAC,CAAC,WAAW,CAAC,UAChC,CAAC,eACI,WAAW,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,aACzB,SAAS,GAAG,IAAI,WAClB,MAAM,YACL,iBAAiB,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,GACtC,gBAAgB,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC;CAqChC,CAAC"}