npm - @talonic/docs - Versions diffs - 0.20.8 → 0.20.9 - Mend

@talonic/docs 0.20.8 → 0.20.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (2)

package/dist/content.js +1560 -276
package/package.json +1 -1

package/dist/content.js CHANGED

@@ -427,6 +427,26 @@ var sections = [
       {
         type: "paragraph",
         text: "**Supported Formats:** 25+ file types. **Resolution:** 4-phase pipeline. **Instant Matches:** ~30% of cells (free)."
+      },
+      {
+        type: "paragraph",
+        text: "Talonic is an **agentic data structuring platform**. It ingests documents of any type, discovers every data point inside them, builds a knowledge graph of canonical fields, and deploys AI agents to fill structured output schemas. Every cell in the output carries provenance metadata \u2014 which pipeline phase filled it, the confidence score, and an AI reasoning trace linking back to the source document."
+      },
+      {
+        type: "list",
+        ordered: false,
+        items: [
+          "**25+ file formats** \u2014 PDF, DOCX, XLSX, images, HTML, JSON, CSV, email formats (EML, MSG), and ZIP archives.",
+          "**4-phase extraction pipeline** \u2014 resolve from the knowledge graph, extract with AI agents, re-resolve, then transform and validate.",
+          "**~30% instant matches** \u2014 cells filled from graph lookup are free and instant, reducing both cost and latency.",
+          "**Per-cell provenance** \u2014 every value traces back to its source with confidence scores and reasoning.",
+          "**Batch mode** \u2014 process large backlogs at 50% cost with a 48-hour delivery window."
+        ]
+      },
+      {
+        type: "callout",
+        variant: "info",
+        text: "Talonic uses Anthropic Claude for intelligent extraction and reasoning. The platform handles OCR, classification, field discovery, and schema generation automatically \u2014 you provide documents and define what output you need."
       }
     ],
     related: [
@@ -442,6 +462,14 @@ var sections = [
       {
         question: "How many file formats does Talonic support?",
         answer: "Talonic supports 25+ file types including PDF, DOCX, XLSX, images (PNG, JPG), plain text, HTML, JSON, CSV, email formats (EML, MSG), and ZIP archives."
+      },
+      {
+        question: 'What does "per-cell provenance" mean?',
+        answer: "Every cell in the structured output carries metadata about which pipeline phase filled it, a confidence score, an AI reasoning trace, and references back to the source document. This makes every value auditable and explainable."
+      },
+      {
+        question: "How much do instant graph matches cost?",
+        answer: "Graph matches (approximately 30% of cells) are free. They are filled from the knowledge graph through deterministic lookup, so no LLM call is needed. Only cells that require AI extraction incur cost."
       }
     ],
     mentions: [
@@ -449,7 +477,9 @@ var sections = [
       "structured data",
       "provenance",
       "AI reasoning traces",
-      "4-phase pipeline"
+      "4-phase pipeline",
+      "agentic platform",
+      "knowledge graph"
     ]
   },
   {
@@ -463,6 +493,10 @@ var sections = [
         type: "paragraph",
         text: "The platform revolves around a small set of interconnected concepts. Understanding these will help you navigate every feature."
       },
+      {
+        type: "paragraph",
+        text: "Each concept builds on the previous ones. **Sources** produce **Documents**, documents yield **Fields**, fields aggregate into the **Field Registry**, the registry powers **Schemas**, schemas drive **Jobs**, and jobs produce structured output with **Provenance**. **Cases** emerge organically when documents share entities like names, reference numbers, or project codes."
+      },
       {
         type: "param-table",
         params: [
@@ -507,6 +541,11 @@ var sections = [
             description: "Per-cell metadata: which phase filled it, confidence score, reasoning trace, source references."
           }
         ]
+      },
+      {
+        type: "callout",
+        variant: "info",
+        text: "The **Field Registry** is the heart of the platform. As you process more documents, the registry grows \u2014 fields are clustered semantically, promoted through tiers, and enriched with master extraction instructions. This accumulated knowledge makes every subsequent extraction faster and more accurate."
       }
     ],
     related: [
@@ -522,6 +561,14 @@ var sections = [
       {
         question: "What is provenance in Talonic?",
         answer: "Provenance is per-cell metadata that tracks which pipeline phase filled the value, the confidence score, an AI reasoning trace, and source references back to the original document."
+      },
+      {
+        question: "How do Cases form?",
+        answer: "Cases form automatically through the linking system. When two or more documents share entities \u2014 like a person's name, a reference number, or a project code \u2014 they are connected into a case via a bipartite document-entity graph."
+      },
+      {
+        question: "What is the difference between a Generated Schema and a Template Schema?",
+        answer: "Generated Schemas are created automatically by the platform based on the document types it discovers. Template Schemas are user-defined for specific output needs \u2014 you choose which fields to include and how they map to the Field Registry."
       }
     ],
     mentions: [
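Editor's sketch of the case-linking idea described in the "How do Cases form?" answer above. It treats documents and entities as the two sides of a bipartite graph and groups documents that are connected through shared entities, using a union-find structure. The input shape (`id`, `entities`) and the function name `groupIntoCases` are assumptions for illustration and do not come from the package.

```javascript
// Hypothetical input: each document lists the entity strings found in it.
function groupIntoCases(documents) {
  const parent = new Map();

  // Find the root of a node, registering unseen nodes on the way.
  const find = (x) => {
    if (!parent.has(x)) parent.set(x, x);
    let root = x;
    while (parent.get(root) !== root) root = parent.get(root);
    // Path compression: point every node on the path at the root.
    let cur = x;
    while (parent.get(cur) !== root) {
      const next = parent.get(cur);
      parent.set(cur, root);
      cur = next;
    }
    return root;
  };

  const union = (a, b) => {
    const ra = find(a);
    const rb = find(b);
    if (ra !== rb) parent.set(ra, rb);
  };

  // Each (document, entity) pair is one edge of the bipartite graph.
  for (const doc of documents) {
    const docNode = `doc:${doc.id}`;
    find(docNode);
    for (const entity of doc.entities) union(docNode, `entity:${entity}`);
  }

  // Collect document ids per connected component.
  const groups = new Map();
  for (const doc of documents) {
    const root = find(`doc:${doc.id}`);
    if (!groups.has(root)) groups.set(root, []);
    groups.get(root).push(doc.id);
  }

  // A case needs two or more linked documents.
  return [...groups.values()].filter((ids) => ids.length >= 2);
}
```

Documents linked only transitively (A shares a name with B, B shares a reference number with C) land in the same group here; whether the platform behaves the same way is not stated in the diff.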
@@ -532,7 +579,8 @@ var sections = [
       "schema",
       "job",
       "case",
-      "provenance"
+      "provenance",
+      "knowledge graph"
     ]
   },
   {
@@ -564,6 +612,15 @@ var sections = [
           "**Review & approve** \u2014 Review with confidence indicators, provenance, and validation flags. Correct any values.",
           "**Deliver** \u2014 Push approved data to webhooks, REST APIs, SFTP, email, or cloud storage."
         ]
+      },
+      {
+        type: "paragraph",
+        text: "The pipeline is designed to be **progressive** \u2014 results appear as each phase completes rather than waiting for the entire job to finish. Phase 1 (graph resolve) fills ~30% of cells instantly and for free. Phase 2 (AI extraction) fills the remaining gaps. Phases 3 and 4 handle re-resolution and transformation. You can start reviewing early results while later phases are still running."
+      },
+      {
+        type: "callout",
+        variant: "info",
+        text: "The **Dashboard** provides a real-time view of your pipeline progress with telemetry on strategy distribution, tier funnel, capture hit rate, and per-field state distribution. Use it to understand how well the knowledge graph is performing."
       }
     ],
     related: [
@@ -579,6 +636,14 @@ var sections = [
       {
         question: "What percentage of cells are filled by graph matches?",
         answer: "Approximately 30% of cells are filled instantly from graph matches at no AI cost. The remaining ~70% are filled by AI agents in subsequent pipeline phases."
+      },
+      {
+        question: "Can I review results before the full pipeline completes?",
+        answer: "Yes. The pipeline is progressive \u2014 Phase 1 graph matches appear instantly, and you can start reviewing while Phase 2 (AI extraction) and later phases are still running."
+      },
+      {
+        question: "What delivery destinations are supported?",
+        answer: "Six live connectors: webhook (with HMAC-SHA256 signing), SFTP, Amazon S3, Azure Blob Storage, Google Drive, and OneDrive. Additional integrations for Sheets, SharePoint, Gmail, Outlook, and HubSpot are planned."
       }
     ],
     mentions: [
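Editor's sketch of what HMAC-SHA256 webhook signing, mentioned in the delivery-destinations answer above, typically looks like on the receiving side. The diff does not say how the signature is transported or encoded, so the hex encoding and the function names `sign` and `verify` are assumptions. Only the Node.js `crypto` calls are real APIs.

```javascript
const nodeCrypto = require("node:crypto");

// Compute a hex-encoded HMAC-SHA256 over the raw request body.
function sign(body, secret) {
  return nodeCrypto.createHmac("sha256", secret).update(body).digest("hex");
}

// Compare the received signature against the expected one in constant time.
function verify(body, secret, signature) {
  const expected = Buffer.from(sign(body, secret), "hex");
  const received = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === received.length &&
    nodeCrypto.timingSafeEqual(expected, received);
}
```

A receiver would call `verify` with the raw, unparsed body bytes, because re-serializing parsed JSON can change the byte sequence and invalidate the signature.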
@@ -588,7 +653,9 @@ var sections = [
       "field registry",
       "schema",
       "job execution",
-      "delivery"
+      "delivery",
+      "progressive results",
+      "dashboard"
     ]
   },
   {
@@ -602,6 +669,10 @@ var sections = [
         type: "paragraph",
         text: "Navigate using the sidebar. The platform is organized into three primary sections: **Sources** (ingest), **Structuring** (process & validate), and **Outputs** (deliver)."
       },
+      {
+        type: "paragraph",
+        text: "**Sources** is where documents enter the system \u2014 through manual upload, connected cloud storage, email inboxes, or the API. **Structuring** is where you define schemas, run extraction jobs, and review results. **Outputs** is where approved data is delivered to downstream systems through configured bindings."
+      },
       {
         type: "ui-excerpt",
         id: "sidebar-navigation",
@@ -611,6 +682,17 @@ var sections = [
       {
         type: "callout",
         text: "The fastest path to results: upload documents in **Sources**, then go to **Structuring &rarr; Runs &rarr; New** to create your first extraction job."
+      },
+      {
+        type: "list",
+        ordered: true,
+        items: [
+          "Upload a few sample documents in **Sources** (drag and drop or use a connected source).",
+          "Wait for extraction to complete \u2014 documents are OCR'd, classified, and fields are extracted automatically.",
+          "Navigate to **Structuring** and review the auto-generated schemas or create a custom template.",
+          "Create a new **Run** by selecting a schema and the documents to process.",
+          "Review results in the run view \u2014 each cell shows confidence, provenance, and reasoning."
+        ]
       }
     ],
     related: [
@@ -626,9 +708,17 @@ var sections = [
       {
         question: "How is the Talonic platform organized?",
         answer: "The platform is organized into three primary sections: Sources (document ingest), Structuring (processing & validation), and Outputs (delivery to downstream systems)."
+      },
+      {
+        question: "Do I need to define a schema before processing documents?",
+        answer: "No. Talonic auto-generates schemas based on the document types it discovers during extraction. You can use these generated schemas directly or create custom template schemas for specific output needs."
+      },
+      {
+        question: "What source connections are available?",
+        answer: "Ten source connectors: Google Drive, Gmail, SharePoint, OneDrive, Outlook, Teams, Notion, SQL databases (MSSQL/PostgreSQL), Amazon S3, and Azure Blob Storage. You can also upload files manually or ingest via the REST API."
       }
     ],
-    mentions: ["sidebar", "sources", "structuring", "outputs", "navigation", "Cmd+K"]
+    mentions: ["sidebar", "sources", "structuring", "outputs", "navigation", "Cmd+K", "source connectors"]
   }
 ];
@@ -3449,6 +3539,27 @@ var sections11 = [
       {
         type: "paragraph",
         text: "Navigate to **Workspace Settings &rarr; Shared Dialects** to manage workspace-level formatting. Individual schemas can override these defaults with inline dialect definitions when needed."
+      },
+      {
+        type: "paragraph",
+        text: "Dialects ensure consistency across all your structured output. When your downstream systems expect dates in `YYYY-MM-DD` format, numbers with `.` as the decimal separator, and CSVs delimited by `;`, you configure this once in the shared dialect rather than repeating it in every schema."
+      },
+      {
+        type: "list",
+        ordered: false,
+        items: [
+          "**Date format** \u2014 control how dates are serialized (e.g., `YYYY-MM-DD`, `DD/MM/YYYY`, `MM-DD-YYYY`).",
+          "**Number locale** \u2014 set the decimal separator and thousands grouping.",
+          "**CSV delimiter** \u2014 choose comma, semicolon, tab, or pipe for CSV exports.",
+          "**Null representation** \u2014 define how missing values appear (empty string, `NULL`, `N/A`, etc.).",
+          "**Boolean format** \u2014 choose between `true/false`, `yes/no`, `1/0`, or custom values.",
+          "**Encoding** \u2014 set the character encoding for file exports (UTF-8, ISO-8859-1, etc.)."
+        ]
+      },
+      {
+        type: "callout",
+        variant: "info",
+        text: "When a schema defines an inline dialect, it takes precedence over the shared dialect for that schema only. All other schemas continue using the workspace defaults. This lets you handle special cases without affecting the rest of your output."
       }
     ],
     related: [
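Editor's sketch of the shared-versus-inline dialect precedence described in the hunk above. The dialect key names (`dateFormat`, `decimalSeparator`, `csvDelimiter`, `nullRepresentation`) and the helper functions are hypothetical; the diff only describes the settings in prose.

```javascript
// Hypothetical workspace-level dialect; key names are illustrative only.
const sharedDialect = {
  dateFormat: "YYYY-MM-DD",
  decimalSeparator: ".",
  csvDelimiter: ";",
  nullRepresentation: "",
};

// An inline dialect overrides only the keys it defines.
function resolveDialect(shared, inline = {}) {
  return { ...shared, ...inline };
}

// Render a Date (read as UTC) with a YYYY / MM / DD pattern.
function formatDate(date, pattern) {
  const parts = {
    YYYY: String(date.getUTCFullYear()),
    MM: String(date.getUTCMonth() + 1).padStart(2, "0"),
    DD: String(date.getUTCDate()).padStart(2, "0"),
  };
  return pattern.replace(/YYYY|MM|DD/g, (token) => parts[token]);
}

// Serialize one row. Quoting of values that contain the delimiter is omitted.
function toCsvRow(values, dialect) {
  return values
    .map((v) => {
      if (v === null || v === undefined) return dialect.nullRepresentation;
      if (v instanceof Date) return formatDate(v, dialect.dateFormat);
      if (typeof v === "number") return String(v).replace(".", dialect.decimalSeparator);
      return String(v);
    })
    .join(dialect.csvDelimiter);
}
```

Calling `resolveDialect(sharedDialect, { dateFormat: "DD/MM/YYYY" })` changes date output for one schema while the delimiter, decimal separator, and null representation still come from the workspace default.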
@@ -3464,6 +3575,14 @@ var sections11 = [
       {
         question: "Can individual schemas override shared dialects?",
         answer: "Yes. Individual schemas can override workspace-level shared dialect settings with inline dialect definitions when specific formatting is needed."
+      },
+      {
+        question: "When should I use a shared dialect vs an inline dialect?",
+        answer: "Use shared dialects for workspace-wide defaults that apply to most schemas. Use inline dialects only when a specific schema needs different formatting \u2014 for example, a schema that outputs dates in a different format for a particular downstream system."
+      },
+      {
+        question: "Do shared dialects affect the extraction process?",
+        answer: "No. Dialects only affect output formatting \u2014 how extracted values are serialized in exports and deliveries. The extraction and validation phases work with normalized internal representations regardless of dialect settings."
       }
     ],
     mentions: [
@@ -3471,7 +3590,9 @@ var sections11 = [
       "workspace settings",
       "output formatting",
       "date format",
-      "number locale"
+      "number locale",
+      "CSV delimiter",
+      "null representation"
     ]
   },
   {
@@ -3488,6 +3609,25 @@ var sections11 = [
       {
         type: "paragraph",
         text: "Navigate to **Workspace Settings &rarr; Reference Primitives** to manage workspace-level lookup tables. Each primitive is versioned independently, and schemas reference a specific version to ensure stable resolution."
+      },
+      {
+        type: "paragraph",
+        text: "The lookup convention follows a `key` / `value` structure where the `key` is the output code and the `value` is the human-readable label. During extraction, the platform maps FROM labels found in documents TO the canonical codes defined in the reference primitive. This ensures consistent, machine-readable output regardless of how values appear in source documents."
+      },
+      {
+        type: "callout",
+        variant: "info",
+        text: "Versioning protects production stability. When you update a reference primitive, existing schemas continue using their pinned version until you explicitly update the version reference. This prevents unexpected changes to live extraction pipelines."
+      },
+      {
+        type: "list",
+        ordered: false,
+        items: [
+          "**Key-value structure** \u2014 `key` is the canonical code, `value` is the label the platform matches against.",
+          "**Independent versioning** \u2014 each update creates a new version; schemas pin to a specific version.",
+          "**Cross-schema sharing** \u2014 one primitive can be referenced by any number of schemas.",
+          "**3-tier lookup cascade** \u2014 string normalization, token fuzzy matching, and Haiku LLM fallback for ambiguous values."
+        ]
       }
     ],
     related: [
@@ -3503,6 +3643,14 @@ var sections11 = [
       {
         question: "How do reference primitives differ from schema reference tables?",
         answer: "Reference primitives are workspace-level and shared across schemas with independent versioning. Schema reference tables are defined inline within a specific schema."
+      },
+      {
+        question: "How does the lookup cascade work?",
+        answer: "The platform tries three tiers in order. First, an exact match after string normalization (whitespace and case). If that fails, token-based fuzzy matching. If the fuzzy match is below the confidence threshold, a Haiku LLM call resolves the ambiguity."
+      },
+      {
+        question: "What happens when I update a reference primitive?",
+        answer: "A new version is created. Existing schemas continue using their pinned version. You must explicitly update the version reference in each schema to use the new data, which protects production pipelines from unexpected changes."
       }
     ],
     mentions: [
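Editor's sketch of the 3-tier lookup cascade described in the hunk above: normalized exact match, then token fuzzy match, then an LLM fallback. The Jaccard token score, the `0.5` threshold, and the `llmFallback` callback are stand-ins chosen for illustration; the diff does not specify the scoring method, the threshold value, or how the Haiku call is made.

```javascript
// Tier 1 helper: collapse whitespace and ignore case.
const normalize = (s) => s.trim().toLowerCase().replace(/\s+/g, " ");

// Tier 2 helper: Jaccard overlap of word tokens (an assumed scoring method).
function tokenScore(a, b) {
  const ta = new Set(normalize(a).split(" "));
  const tb = new Set(normalize(b).split(" "));
  const shared = [...ta].filter((t) => tb.has(t)).length;
  return shared / (ta.size + tb.size - shared);
}

// primitive maps canonical code (key) to label (value), as in the docs text.
// llmFallback is a hypothetical callback standing in for the LLM call.
function lookupCode(raw, primitive, { threshold = 0.5, llmFallback = () => null } = {}) {
  const entries = Object.entries(primitive);

  // Tier 1: exact match after normalization.
  const exact = entries.find(([, label]) => normalize(label) === normalize(raw));
  if (exact) return { code: exact[0], tier: "normalized" };

  // Tier 2: best fuzzy match, accepted only at or above the threshold.
  let best = { code: null, score: 0 };
  for (const [code, label] of entries) {
    const score = tokenScore(raw, label);
    if (score > best.score) best = { code, score };
  }
  if (best.score >= threshold) return { code: best.code, tier: "fuzzy" };

  // Tier 3: defer to the fallback for ambiguous values.
  return { code: llmFallback(raw, primitive), tier: "llm" };
}
```

The lookup direction matches the convention in the text: labels found in documents go in, canonical codes come out.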
@@ -3510,7 +3658,8 @@ var sections11 = [
       "lookup tables",
       "versioned",
       "workspace-level",
-      "code mapping"
+      "code mapping",
+      "lookup cascade"
     ]
   },
   {
@@ -3527,6 +3676,24 @@ var sections11 = [
       {
         type: "paragraph",
         text: "Navigate to **Workspace Settings &rarr; Change Review** to configure review requirements. When enabled, changes are queued for approval before being applied, ensuring that production data pipelines are not disrupted by unreviewed modifications."
+      },
+      {
+        type: "paragraph",
+        text: "Change review is particularly important for workspaces that feed downstream systems through delivery bindings. A small change to a schema field mapping or a reference primitive value can ripple through to every document processed after that point. The review process creates a checkpoint where a second pair of eyes can verify the change before it goes live."
+      },
+      {
+        type: "list",
+        ordered: false,
+        items: [
+          "**Schema changes** \u2014 field additions, removals, mapping updates, and format constraint modifications.",
+          "**Shared dialect changes** \u2014 date format, number locale, delimiter, and encoding updates.",
+          "**Reference primitive changes** \u2014 new versions of lookup tables and key-value modifications."
+        ]
+      },
+      {
+        type: "callout",
+        variant: "warning",
+        text: "Change review is recommended for any workspace that delivers data to production systems. Without it, a schema modification takes effect immediately and applies to all future extractions \u2014 there is no undo for documents already processed with the new configuration."
       }
     ],
     related: [
@@ -3542,6 +3709,14 @@ var sections11 = [
       {
         question: "When should I enable change review?",
         answer: "Enable change review for production workspaces where unreviewed modifications could disrupt live data pipelines. It ensures all changes are approved before being applied."
+      },
+      {
+        question: "What types of changes require review?",
+        answer: "Three categories: schema changes (field additions, removals, mapping updates, format constraints), shared dialect changes (formatting rules), and reference primitive changes (lookup table updates and new versions)."
+      },
+      {
+        question: "Can I bypass change review for urgent fixes?",
+        answer: "Change review can be disabled temporarily from Workspace Settings if an urgent fix is needed. However, this should be done with caution in production workspaces, and the review requirement should be re-enabled afterward."
       }
     ],
     mentions: [
@@ -3549,7 +3724,8 @@ var sections11 = [
       "governance",
       "production workspace",
       "approval workflow",
-      "review process"
+      "review process",
+      "delivery bindings"
     ]
   }
 ];
@@ -3566,6 +3742,26 @@ var sections12 = [
       {
         type: "paragraph",
         text: "Press `Cmd+K` (or `Ctrl+K`) from any page to open global search. Searches across documents, extracted values, field names, schema names, and sources simultaneously."
+      },
+      {
+        type: "paragraph",
+        text: "Omnisearch is designed to be the single entry point for finding anything in the platform. Rather than navigating to specific pages to search within them, Omnisearch queries a **materialized values index** that aggregates data across all your content. Results are grouped by category so you can quickly distinguish between a document match and a field name match."
+      },
+      {
+        type: "callout",
+        variant: "info",
+        text: "Omnisearch results update as you type. The materialized index is rebuilt automatically whenever documents are processed or schemas change, so results are always current."
+      },
+      {
+        type: "list",
+        ordered: false,
+        items: [
+          "**Documents** \u2014 matches against file names, extracted text, and metadata.",
+          "**Extracted values** \u2014 finds specific data points across all processed documents.",
+          "**Field names** \u2014 searches the Field Registry for canonical field definitions.",
+          "**Schema names** \u2014 locates generated and template schemas by title.",
+          "**Sources** \u2014 matches source connection names and configurations."
+        ]
       }
     ],
     related: [
@@ -3581,9 +3777,17 @@ var sections12 = [
       {
         question: "What does Omnisearch search across?",
         answer: "Omnisearch searches across documents, extracted values, field names, schema names, and sources simultaneously, providing results from all categories in a single query."
+      },
+      {
+        question: "Does Omnisearch work on extracted data or just file names?",
+        answer: "Omnisearch queries a materialized values index that includes both file-level metadata and individual extracted data points. You can search for a specific invoice number, date, or name and find the document that contains it."
+      },
+      {
+        question: "How quickly are new documents searchable in Omnisearch?",
+        answer: "Documents become searchable as soon as extraction completes. The materialized index is updated automatically during document processing, so there is no manual reindex step."
       }
     ],
-    mentions: ["omnisearch", "global search", "Cmd+K", "Ctrl+K", "document search"]
+    mentions: ["omnisearch", "global search", "Cmd+K", "Ctrl+K", "document search", "materialized values index"]
   },
   {
     slug: "document-filters",
@@ -3595,6 +3799,30 @@ var sections12 = [
       {
         type: "paragraph",
         text: "The Documents page supports advanced filtering by extracted field values. Build conditions with field autocomplete, comparison operators (eq, contains, gt, between, is_empty, etc.), and combine multiple conditions. Filter state is URL-serializable so you can share filtered views. Save frequently-used filters as presets."
+      },
+      {
+        type: "paragraph",
+        text: "Filters operate on the **materialized values index** \u2014 a flattened representation of every extracted field value across your documents. This means filtering is fast even across thousands of documents because queries run against pre-indexed data rather than scanning raw extractions at query time."
+      },
+      {
+        type: "callout",
+        variant: "info",
+        text: 'Combine multiple filter conditions to build precise queries. For example, filter by `document_type eq "Invoice"` AND `total_amount gt 10000` AND `date between "2025-01-01" "2025-12-31"` to find high-value invoices from a specific year.'
+      },
+      {
+        type: "list",
+        ordered: false,
+        items: [
+          "`eq` \u2014 exact match (case-insensitive for strings).",
+          "`contains` \u2014 substring match within field values.",
+          "`gt` / `lt` \u2014 greater than or less than for numeric and date fields.",
+          "`between` \u2014 range match with lower and upper bounds.",
+          "`is_empty` \u2014 finds documents where a specific field has no extracted value."
+        ]
+      },
+      {
+        type: "paragraph",
+        text: "Filter state is encoded in the URL query string, and the backend translates it into a query through dynamic SQL generation. Because the state lives in the URL, you can bookmark filtered views, share them with teammates via a link, or save them as **presets** for one-click access to commonly used queries."
       }
     ],
     related: [
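Editor's sketch of how the filter operators listed in the hunk above could be evaluated against a flattened record, and how a set of conditions could round-trip through a URL query string. The condition shape (`field`, `op`, `args`), the `filters` query parameter, and the inclusive bounds for `between` are assumptions; the real backend is described as generating SQL, which this in-memory version does not attempt.

```javascript
const isEmpty = (v) => v === undefined || v === null || v === "";

// One predicate per operator named in the docs text.
const operators = {
  // Case-insensitive for strings, strict equality otherwise.
  eq: (v, [x]) =>
    typeof v === "string" && typeof x === "string"
      ? v.toLowerCase() === x.toLowerCase()
      : v === x,
  contains: (v, [x]) => String(v).includes(x),
  gt: (v, [x]) => v > x,
  lt: (v, [x]) => v < x,
  // Inclusive bounds are an assumption. ISO dates compare correctly as strings.
  between: (v, [lo, hi]) => v >= lo && v <= hi,
  is_empty: (v) => isEmpty(v),
};

// All conditions must hold (logical AND).
function matches(record, conditions) {
  return conditions.every(({ field, op, args = [] }) => {
    const value = record[field];
    if (op !== "is_empty" && isEmpty(value)) return false;
    return operators[op](value, args);
  });
}

// Serialize conditions into a shareable query string and back.
function toQueryString(conditions) {
  return new URLSearchParams({ filters: JSON.stringify(conditions) }).toString();
}

function fromQueryString(qs) {
  return JSON.parse(new URLSearchParams(qs).get("filters") ?? "[]");
}
```

The high-value invoice example from the callout would be expressed as three conditions: `eq` on `document_type`, `gt` on `total_amount`, and `between` on `date`.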
@@ -3610,6 +3838,14 @@ var sections12 = [
       {
         question: "Can I share filtered views with my team?",
         answer: "Yes. Filter state is URL-serializable, so you can share filtered views by copying the URL. You can also save frequently-used filters as presets."
+      },
+      {
+        question: "How does field autocomplete work in filters?",
+        answer: "When building a filter condition, the field selector shows all canonical fields from your Field Registry with autocomplete. Start typing a field name and matching fields appear, ensuring you always filter on valid, extracted fields."
+      },
+      {
+        question: "Can I filter on fields that have no value?",
+        answer: "Yes. The is_empty operator lets you find documents where a specific field was not extracted or has no value. This is useful for identifying documents that may need reprocessing or manual review."
       }
     ],
     mentions: [
@@ -3617,7 +3853,9 @@ var sections12 = [
       "field autocomplete",
       "comparison operators",
       "URL-serializable",
-      "filter presets"
+      "filter presets",
+      "materialized values index",
+      "dynamic SQL"
     ]
   }
 ];
@@ -3930,6 +4168,10 @@ var sections14 = [
         type: "paragraph",
         text: "Organizations support role-based access control:"
       },
+      {
+        type: "paragraph",
+        text: "Every user in your organization is assigned one of four roles that determine what they can see and do. Roles are hierarchical \u2014 each level includes all permissions of the levels below it. Choose the most restrictive role that still lets a team member do their job."
+      },
       {
         type: "param-table",
         title: "Roles",
@@ -3959,6 +4201,22 @@ var sections14 = [
       {
         type: "paragraph",
         text: "New members are added via domain matching: company email domains auto-match to your org with **pending** status requiring admin approval. Manage from the Team page."
+      },
+      {
+        type: "callout",
+        variant: "info",
+        text: "Domain matching streamlines onboarding for larger teams. When a new user signs up with an email address matching your organization's domain (e.g., `@yourcompany.com`), they are automatically associated with your org in a **pending** state. An admin must approve them before they gain access."
+      },
+      {
+        type: "list",
+        ordered: true,
+        items: [
+          "Navigate to the **Team** page from the sidebar.",
+          "Review any **pending** members waiting for approval.",
+          "Approve or reject each pending member.",
+          "Assign the appropriate role based on their responsibilities.",
+          "Optionally, change roles later from the same Team page."
+        ]
       }
     ],
     related: [
@@ -3974,6 +4232,14 @@ var sections14 = [
       {
         question: "How are new team members added?",
         answer: "New members are added via domain matching: company email domains auto-match to your organization with pending status. Admin approval is required before access is granted."
+      },
+      {
+        question: "Can I change a team member's role after they join?",
+        answer: "Yes. Navigate to the Team page, find the member, and update their role. Changes take effect immediately. Only Admins and Owners can modify roles."
+      },
+      {
+        question: "What happens if I remove a team member?",
+        answer: "Removing a team member revokes their access to the organization immediately. Their past actions (edits, uploads, approvals) remain in the audit trail. They can be re-added later through the same domain matching process."
       }
     ],
     mentions: [
@@ -3984,7 +4250,8 @@ var sections14 = [
       "Member",
       "Admin",
       "Owner",
-      "domain matching"
+      "domain matching",
+      "pending approval"
     ]
   },
   {
@@ -3998,6 +4265,10 @@ var sections14 = [
         type: "paragraph",
         text: "The Usage & Registry page replaces the legacy credits view with a comprehensive cost breakdown. It shows per-feature cost (extraction, OCR, batch, matching), a daily cost chart, and a full call log with model, tokens, and cost per request. The **Master view** (admin only) shows per-customer breakdowns and platform-wide statistics."
       },
+      {
+        type: "paragraph",
+        text: "Understanding your usage patterns helps optimize costs. For example, if extraction dominates your spend, consider using **batch mode** for non-urgent documents to cut that cost in half. The daily cost chart makes it easy to spot usage spikes and correlate them with specific ingestion events."
+      },
       {
         type: "param-table",
         title: "Usage views",
@@ -4023,6 +4294,11 @@ var sections14 = [
             description: "Per-customer breakdown and platform-wide aggregates. Accessible only in master (all-tenant) mode."
           }
         ]
+      },
+      {
+        type: "callout",
+        variant: "info",
+        text: "The call log records every LLM and OCR call with full detail \u2014 model name, input/output token counts, latency, and cost. Use it to audit individual extractions or investigate unexpected cost increases."
       }
     ],
     related: [
@@ -4038,6 +4314,10 @@ var sections14 = [
       {
         question: "What is the Master view?",
         answer: "The Master view is an admin-only feature that shows per-customer breakdowns and platform-wide statistics. It is accessible only in master (all-tenant) mode."
+      },
+      {
+        question: "How can I reduce my usage costs?",
+        answer: "Use batch mode for non-urgent documents to cut extraction costs by 50%. Review the per-feature breakdown to identify your highest-cost operations, and use the daily cost chart to spot and investigate usage spikes."
       }
     ],
     mentions: [
@@ -4045,7 +4325,8 @@ var sections14 = [
       "cost breakdown",
       "daily cost chart",
       "call log",
-      "master view"
+      "master view",
+      "per-feature costs"
     ]
   },
   {
@@ -4058,6 +4339,26 @@ var sections14 = [
       {
         type: "paragraph",
         text: "Accessible from the user menu for admins and superadmins. Provides: customer management, user management, usage statistics, data clear & rebuild, and cross-tenant master registry view."
+      },
+      {
+        type: "paragraph",
+        text: "The Admin Panel is the central hub for platform-wide operations. **Customer management** lets you create, view, and delete organizations. **User management** provides a cross-tenant view of all platform users with the ability to remove accounts. The **data clear & rebuild** function wipes all data for a specific customer and reprocesses from scratch \u2014 useful during onboarding or after significant schema changes."
+      },
+      {
+        type: "list",
+        ordered: false,
+        items: [
+          "**Customer management** \u2014 create, list, and delete organizations.",
+          "**User management** \u2014 view all users across tenants, remove accounts.",
+          "**Usage statistics** \u2014 platform-wide cost and usage aggregates.",
+          "**Data clear & rebuild** \u2014 wipe and reprocess all data for a customer.",
+          "**Master registry** \u2014 cross-tenant view of the Field Registry and schemas."
+        ]
+      },
+      {
+        type: "callout",
+        variant: "warning",
+        text: "The **data clear** operation is irreversible. It deletes all documents, extractions, jobs, and results for the selected customer. Use with caution and only when a full reprocessing is genuinely needed."
       }
     ],
     related: [
@@ -4072,6 +4373,14 @@ var sections14 = [
       {
         question: "Who can access the Admin Panel?",
         answer: "The Admin Panel is accessible only to users with admin or superadmin roles, via the user menu in the platform navigation."
+      },
+      {
+        question: "What does the data clear operation do?",
+        answer: "Data clear wipes all documents, extractions, jobs, results, and related data for a specific customer. It is irreversible and intended for full reprocessing scenarios during onboarding or after major schema changes."
+      },
+      {
+        question: "Can I view usage across all customers?",
+        answer: "Yes. The Admin Panel's usage statistics show cross-tenant usage, per-customer cost breakdowns, and platform-wide aggregates. The master registry is a separate cross-tenant view of the Field Registry and schemas."
       }
     ],
     mentions: [
@@ -4079,7 +4388,8 @@ var sections14 = [
       "customer management",
       "user management",
       "data clear",
-      "master registry"
+      "master registry",
+      "superadmin"
     ]
   },
   {
@@ -4089,6 +4399,10 @@ var sections14 = [
     seoTitle: "Keyboard Shortcuts \u2014 Talonic Docs",
     description: "Global keyboard shortcuts: Cmd+K / Ctrl+K for Omnisearch, Cmd+J / Ctrl+J for quick extract, and Escape to close overlays. Speed up your workflow with keyboard navigation.",
     content: [
+      {
+        type: "paragraph",
+        text: "Talonic provides global keyboard shortcuts that work from any page in the platform. These shortcuts let you access common actions without leaving your current context, significantly speeding up daily workflows."
+      },
       {
         type: "param-table",
         title: "Shortcuts",
@@ -4109,6 +4423,11 @@ var sections14 = [
             description: "Close overlays, modals, and search."
           }
         ]
+      },
+      {
+        type: "callout",
+        variant: "info",
+        text: "The **quick extract** shortcut (`Cmd+J` / `Ctrl+J`) is the fastest way to upload a single document. It opens a streamlined upload interface that lets you drag a file and start processing immediately."
       }
     ],
     related: [
@@ -4123,6 +4442,10 @@ var sections14 = [
       {
         question: "What does the quick extract shortcut do?",
         answer: "Cmd+J / Ctrl+J opens the quick extract interface, allowing you to upload and process a document directly from any page."
+      },
+      {
+        question: "Do shortcuts work inside modals or overlays?",
+        answer: "The Escape shortcut works inside any modal or overlay to close it. Omnisearch (Cmd+K) works globally, even when other overlays are open. Quick extract (Cmd+J) is available from the main interface."
       }
     ],
     mentions: ["keyboard shortcuts", "Cmd+K", "Cmd+J", "Escape", "quick extract"]
@@ -4145,6 +4468,21 @@ var sections15 = [
       {
         type: "callout",
         text: "Batch mode cuts extraction cost in half. Stage 1 (OCR + classify) still runs immediately \u2014 only Stage 2 (Claude extraction) is deferred."
+      },
+      {
+        type: "paragraph",
+        text: "Under the hood, batch inference leverages the provider's native batch API (Anthropic Message Batches or AWS Bedrock invocation jobs). Documents accumulate in a queue and are submitted together, allowing the provider to schedule processing during off-peak capacity. This is why the cost reduction is possible without any loss in extraction quality."
+      },
+      {
+        type: "list",
+        ordered: false,
+        items: [
+          "**50% cost reduction** on all Claude extraction calls in Stage 2.",
+          "**48-hour delivery window** \u2014 most batches complete well within this timeframe.",
+          "**No quality difference** \u2014 the same extraction model and prompts are used as in real-time mode.",
+          "**Immediate visibility** \u2014 documents appear in your library right after Stage 1 (OCR + classification).",
+          "**Automatic result application** \u2014 when the batch completes, results are applied and documents transition to their final status."
+        ]
       }
     ],
     related: [
@@ -4160,9 +4498,17 @@ var sections15 = [
       {
         question: "When should I use batch mode?",
         answer: "Batch mode is ideal for large backlog ingestion where real-time results are not required. It cuts extraction cost in half compared to real-time processing."
+      },
+      {
+        question: "Is there a minimum number of documents for batch processing?",
+        answer: "The batch system requires a minimum of 100 items per batch (a Bedrock requirement). If fewer documents are uploaded in batch mode, the system falls back to real-time processing with a warning."
+      },
+      {
+        question: "Does batch mode affect extraction quality?",
+        answer: "No. Batch mode uses the same Claude extraction model and prompts as real-time processing. The only difference is timing: extraction is deferred so the provider can schedule it during off-peak capacity, which is where the cost reduction comes from."
       }
     ],
-    mentions: ["batch inference", "50% cost", "48-hour delivery", "backlog ingestion"]
+    mentions: ["batch inference", "50% cost", "48-hour delivery", "backlog ingestion", "Message Batches API"]
   },
   {
     slug: "batch-processing",
@@ -4175,6 +4521,10 @@ var sections15 = [
         type: "paragraph",
         text: 'Set `processing_mode=batch` on upload (API) or toggle the "Batch" switch in the upload UI. Stage 1 (OCR + classification) runs immediately so documents appear in your library right away. Stage 2 (Claude extraction) is deferred to the provider\'s batch API for asynchronous processing.'
       },
+      {
+        type: "paragraph",
+        text: "The two-stage architecture means you get immediate feedback on what was uploaded. Documents are OCR'd, classified by type, and triaged within seconds. Only the AI extraction step \u2014 where Claude reads the document and fills structured fields \u2014 is deferred to the batch queue for cost savings."
+      },
       {
         type: "param-table",
         title: "Batch stages",
@@ -4190,6 +4540,15 @@ var sections15 = [
             description: "Claude extraction is queued for batch processing. Items accumulate, then submit to the batch API on a timer or threshold."
           }
         ]
+      },
+      {
+        type: "callout",
+        variant: "warning",
+        text: "Image-only documents (PNG, JPG) cannot be batched because the batch payload is text-only. These are automatically routed to real-time processing even when batch mode is enabled."
+      },
+      {
+        type: "paragraph",
+        text: "While waiting for batch results, documents show a status of `batch_queued`. Once the provider returns results, the platform applies them through the same post-processing pipeline as real-time extraction \u2014 including markdown pre-processing, field parsing, quality metrics, and extraction metadata computation."
       }
     ],
     related: [
@@ -4205,6 +4564,14 @@ var sections15 = [
       {
         question: "What runs immediately in batch mode?",
         answer: "Stage 1 (OCR, classification, and triage) runs in real-time. Only Stage 2 (Claude extraction) is deferred to the batch API for cost savings."
+      },
+      {
+        question: "What happens if a batch extraction fails to parse?",
+        answer: "Parse failures in batch mode are retried through the real-time extraction path \u2014 never as a new batch. This ensures the original 48-hour SLA is maintained while still recovering from transient issues."
+      },
+      {
+        question: "Can I enable batch mode per source?",
+        answer: "Yes. Each source connection has a batch processing toggle. When enabled, all documents ingested through that source are automatically processed in batch mode."
       }
     ],
     mentions: [
@@ -4212,7 +4579,8 @@ var sections15 = [
       "processing_mode",
       "Stage 1",
       "Stage 2",
-      "deferred extraction"
+      "deferred extraction",
+      "batch_queued"
     ]
   },
   {
@@ -4226,6 +4594,10 @@ var sections15 = [
         type: "paragraph",
         text: "The Batches page at `/sources/batches` shows the status of all batch jobs. Each batch progresses through three states: **accumulating** (items collecting), **submitted** (sent to provider), and **completed** (results applied). The page live-syncs with the provider for real-time status updates."
       },
+      {
+        type: "paragraph",
+        text: "Batches are submitted automatically when the accumulation timer fires (every 15 minutes by default) or when the item count threshold is reached. Once submitted, the platform polls the provider hourly to check for completion. When results arrive, they are applied to the corresponding documents and the batch transitions to **completed** status."
+      },
       {
         type: "param-table",
         title: "Batch statuses",
@@ -4246,6 +4618,11 @@ var sections15 = [
             description: "All results have been received and applied to the corresponding documents."
           }
         ]
+      },
+      {
+        type: "callout",
+        variant: "info",
+        text: 'If a batch gets stuck in "processing" due to an unexpected interruption, the platform automatically recovers it on startup. Batches stuck for more than 15 minutes are reverted to "submitted" so the next poll cycle retries them.'
       }
     ],
     related: [
@@ -4261,6 +4638,14 @@ var sections15 = [
       {
         question: "What are the batch statuses?",
         answer: "Three statuses: Accumulating (items collecting), Submitted (sent to provider, polled hourly), and Completed (results received and applied)."
+      },
+      {
+        question: "How often are batches submitted to the provider?",
+        answer: "Batches are submitted on a 15-minute timer or when the item count threshold is reached, whichever comes first. These intervals are configurable in the pipeline settings."
+      },
+      {
+        question: "What happens if a batch gets stuck?",
+        answer: 'The platform includes crash recovery logic. Batches stuck in "processing" for more than 15 minutes are automatically reverted to "submitted" so the next poll cycle retries them. No manual intervention is needed.'
       }
     ],
     mentions: [
@@ -4269,7 +4654,8 @@ var sections15 = [
       "submitted",
       "completed",
       "live-sync",
-      "provider polling"
+      "provider polling",
+      "crash recovery"
     ]
   }
 ];
@@ -4286,6 +4672,25 @@ var sections16 = [
       {
         type: "paragraph",
         text: "Upload CSV or Excel files as lookup tables. These reference datasets are used by the matching engine and by reference strategies in schemas. Each reference dataset is versioned and can be shared across multiple schemas."
+      },
+      {
+        type: "paragraph",
+        text: 'Reference data is the foundation of the matching system. It represents your "ground truth" \u2014 the known records you want to match extracted document data against. Common examples include customer lists, product catalogs, vendor registries, and contract databases.'
+      },
+      {
+        type: "callout",
+        variant: "info",
+        text: "You can also import reference data directly from a SQL database connection. The import runs asynchronously \u2014 rows are streamed in batches of 500 and column headers appear immediately so you can preview the structure while the import runs."
+      },
+      {
+        type: "list",
+        ordered: false,
+        items: [
+          "**CSV and Excel (XLSX)** file uploads for quick one-time imports.",
+          "**SQL database imports** for live reference data from connected sources.",
+          "**Versioning** \u2014 each dataset tracks versions independently.",
+          "**Cross-schema sharing** \u2014 one dataset can be referenced by multiple schemas and matching configurations."
+        ]
       }
     ],
     related: [
@@ -4301,6 +4706,14 @@ var sections16 = [
       {
         question: "How is reference data used?",
         answer: "Reference datasets are used by the matching engine for field-to-field comparisons and by reference strategies in schemas for code mapping and value resolution."
+      },
+      {
+        question: "Can I import reference data from a database?",
+        answer: "Yes. Use the SQL import option to stream rows from a connected SQL database (MSSQL or PostgreSQL). The import runs asynchronously and you can monitor progress while it loads."
+      },
+      {
+        question: "What happens if I delete a source connection that was used for a SQL import?",
+        answer: 'The reference data remains intact. Deleting a source connection does not cascade to reference datasets \u2014 the UI shows a "source disconnected" indicator, but the imported data continues to work for matching.'
       }
     ],
     mentions: [
@@ -4308,7 +4721,8 @@ var sections16 = [
       "CSV upload",
       "Excel upload",
       "versioned datasets",
-      "matching engine"
+      "matching engine",
+      "SQL import"
     ]
   },
   {
@@ -4351,6 +4765,15 @@ var sections16 = [
       {
         type: "paragraph",
         text: "You can also use **AI strategy generation** to let the platform suggest field mappings and strategies automatically based on the schema and reference data structure."
+      },
+      {
+        type: "paragraph",
+        text: "Each field comparison carries a **weight** that determines how much it contributes to the overall confidence score. Set high weights on fields that are strong identifiers (like reference numbers or unique IDs) and lower weights on fields that are common or prone to variation (like names or descriptions). The weighted aggregate produces a final score between 0% and 100%."
+      },
+      {
+        type: "callout",
+        variant: "info",
+        text: "Use **AI strategy generation** when setting up matching for the first time. The platform analyzes your schema fields and reference data columns, then suggests which fields to compare and which strategy to use for each. You can review and adjust the suggestions before saving."
       }
     ],
     related: [
@@ -4366,6 +4789,14 @@ var sections16 = [
       {
         question: "Can Talonic suggest matching configurations?",
         answer: "Yes. AI strategy generation can suggest field mappings and strategies automatically based on the schema and reference data structure."
+      },
+      {
+        question: "How do weights affect matching scores?",
+        answer: "Each field comparison carries a weight that determines its contribution to the overall confidence score. Fields with higher weights have more influence on the final score. The weighted aggregate produces a score between 0% and 100%."
+      },
+      {
+        question: "What is the difference between fuzzy and exact matching?",
+        answer: "Exact matching requires an identical string (case-insensitive). Fuzzy matching uses token-based comparison with a configurable similarity threshold, making it suitable for fields with minor variations like misspellings, abbreviations, or word reordering."
       }
     ],
     mentions: [
@@ -4374,7 +4805,8 @@ var sections16 = [
       "exact match",
       "fuzzy matching",
       "date_range",
-      "numeric_range"
+      "numeric_range",
+      "AI strategy generation"
     ]
   },
   {
@@ -4387,6 +4819,26 @@ var sections16 = [
       {
         type: "paragraph",
         text: "Execute a matching run against a reference dataset. Matching runs are processed asynchronously via BullMQ. You can monitor progress from the matching page and cancel running jobs if needed."
+      },
+      {
+        type: "paragraph",
+        text: "There are two types of runs. **Manual runs** use only the deterministic matching strategies (exact, fuzzy, date_range, numeric_range) and complete quickly. **Smart runs** add an AI resolution pass: after the initial matching, an embedding-based search with a Haiku LLM resolver attempts to improve low-confidence results."
+      },
+      {
+        type: "list",
+        ordered: true,
+        items: [
+          "Navigate to the **Matching** page and select a matching configuration.",
+          "Click **Run** for a standard match or **Smart Run** for AI-enhanced matching.",
+          "Monitor progress in real-time on the matching page.",
+          "Cancel the run at any time if needed \u2014 partial results are preserved.",
+          "Review results when the run completes."
+        ]
+      },
+      {
+        type: "callout",
+        variant: "info",
+        text: "Smart runs take longer but can significantly improve match quality for ambiguous data. The AI resolver uses document embeddings and a Haiku LLM to evaluate low-confidence candidates that the deterministic strategies could not resolve."
       }
     ],
     related: [
@@ -4402,9 +4854,17 @@ var sections16 = [
       {
         question: "Are matching runs synchronous or asynchronous?",
         answer: "Matching runs are processed asynchronously via a job queue. You can monitor progress from the matching page in real-time."
+      },
+      {
+        question: "What is the difference between a manual run and a smart run?",
+        answer: "A manual run uses only deterministic strategies (exact, fuzzy, date_range, numeric_range). A smart run adds an AI resolution pass using embeddings and a Haiku LLM to improve low-confidence results."
+      },
+      {
+        question: "Can I cancel a matching run in progress?",
+        answer: "Yes. You can cancel a running match job from the matching page. Partial results from documents already processed are preserved and available for review."
       }
     ],
-    mentions: ["matching runs", "async execution", "BullMQ", "progress monitoring"]
+    mentions: ["matching runs", "async execution", "BullMQ", "progress monitoring", "smart run", "AI resolution"]
   },
   {
     slug: "matching-results",
@@ -4417,6 +4877,10 @@ var sections16 = [
         type: "paragraph",
         text: "Results are presented per document with the top 5 match candidates. Each candidate includes a confidence score and field-level evidence showing which comparisons contributed to the match and how each field scored."
       },
+      {
+        type: "paragraph",
+        text: "The evidence view is designed to make match decisions transparent. For each candidate, you can see exactly which fields matched, what strategy was used, the individual field score, and the actual values that were compared. This makes it straightforward to verify correct matches and investigate false positives."
+      },
       {
         type: "param-table",
         title: "Result fields",
@@ -4437,6 +4901,11 @@ var sections16 = [
             description: "The five highest-scoring reference records for each document."
           }
         ]
+      },
+      {
+        type: "callout",
+        variant: "info",
+        text: "You can **approve or reject** individual match results. Approved matches can be used downstream in delivery pipelines. Rejected matches are excluded from future consideration for that document."
       }
     ],
     related: [
@@ -4452,6 +4921,14 @@ var sections16 = [
       {
         question: "What evidence is provided for each match?",
         answer: "Per-field evidence shows the strategy used (exact, fuzzy, date_range, numeric_range), the individual score, and the matched values for both the extracted data and the reference record."
+      },
+      {
+        question: "Can I approve or reject match results?",
+        answer: "Yes. Each match result can be individually approved or rejected. Approved matches flow through to downstream delivery pipelines, while rejected matches are excluded from future consideration for that document."
+      },
+      {
+        question: "Why does a match have a low confidence score?",
+        answer: "Low confidence usually means the fields being compared have significant differences or the matching strategies produced weak scores. Check the per-field evidence to identify which comparisons dragged the score down, then consider adjusting weights or strategies in the matching configuration."
       }
     ],
     mentions: [
@@ -4459,7 +4936,9 @@ var sections16 = [
       "top 5 candidates",
       "confidence score",
       "field-level evidence",
-      "weighted aggregate"
+      "weighted aggregate",
+      "approve",
+      "reject"
     ]
   }
 ];
@@ -5331,6 +5810,8 @@ var sections19 = [
     description: "List all uploaded and processed documents with filtering by source, status, date range, and full-text search with pagination support.",
     content: [
       { type: "paragraph", text: "Documents represent files that have been uploaded and processed. Each document retains its original file, extracted text, and metadata." },
+      { type: "paragraph", text: "Use query parameters to filter by source connection, processing status, date range, or full-text search across filenames and extracted content. Results are paginated with cursor-based navigation." },
+      { type: "callout", variant: "info", text: "Documents include **triage metadata** (sensitivity, department, jurisdiction, PII detection) when available. Triage is populated automatically during ingestion via Document AI annotations or Haiku LLM calls." },
       {
         type: "endpoint",
         method: "GET",
@@ -5443,9 +5924,11 @@ var sections19 = [
       { label: "List Extractions", slug: "list-extractions" }
     ],
     faq: [
-      { question: "How do I list documents in Talonic?", answer: "Send a GET request to /v1/documents with optional filters for source, status, date range, and search." }
+      { question: "How do I list documents in Talonic?", answer: "Send a GET request to /v1/documents with optional filters for source, status, date range, and search." },
+      { question: "What document statuses are possible?", answer: "Documents progress through `pending` (uploaded, awaiting processing), `processing` (OCR and extraction in progress), `completed` (ready for use), or `error` (processing failed). Batch-mode documents may also show `batch_queued`." },
+      { question: "How does full-text search work?", answer: "The `search` parameter matches against the document filename and extracted text content. It uses full-text search, so partial word matches are supported." }
     ],
-    mentions: ["documents", "pagination", "filtering"]
+    mentions: ["documents", "pagination", "filtering", "triage", "document status"]
   },
   {
     slug: "get-document",
@@ -5454,6 +5937,7 @@ var sections19 = [
     seoTitle: "Get Document Endpoint \u2014 Talonic Docs",
     description: "Retrieve a single document by ID with full metadata including file size, extracted text length, extraction count, and timestamps.",
     content: [
+      { type: "paragraph", text: "Retrieve full metadata for a single document, including processing status, detected type and language, triage information, and links to related resources. The response includes a `dashboard` link for viewing the document in the Talonic platform UI." },
       {
         type: "endpoint",
         method: "GET",
@@ -5535,8 +6019,11 @@ var sections19 = [
       { label: "List Documents", slug: "list-documents" },
       { label: "Delete Document", slug: "delete-document" }
     ],
-    faq: [],
-    mentions: ["document metadata"]
+    faq: [
+      { question: "How do I get the extracted text for a document?", answer: "Use the `GET /v1/documents/:id/markdown` endpoint to retrieve the OCR markdown output. For structured extraction results, use the extractions endpoint linked in the response." },
+      { question: "What does extraction_count indicate?", answer: "The number of times extraction has been performed on this document. Typically 0 (not yet extracted) or 1 (extracted). Re-extraction increments this count." }
+    ],
+    mentions: ["document metadata", "document detail", "triage"]
   },
   {
     slug: "delete-document",
@@ -5545,6 +6032,8 @@ var sections19 = [
     seoTitle: "Delete Document Endpoint \u2014 Talonic Docs",
     description: "Permanently delete a document and all associated extractions. This action is irreversible and removes the original file and all extraction results.",
     content: [
+      { type: "paragraph", text: "Permanently delete a document along with its original file, extracted text, and all associated extraction results. This operation cannot be undone." },
+      { type: "callout", variant: "warning", text: "Deletion is **irreversible**. The original file, OCR output, and all extraction results are permanently removed. If the document is part of a case or entity group, its links are also removed." },
       {
         type: "endpoint",
         method: "DELETE",
@@ -5584,8 +6073,11 @@ var sections19 = [
     related: [
       { label: "Get Document", slug: "get-document" }
     ],
-    faq: [],
-    mentions: ["delete", "irreversible"]
+    faq: [
+      { question: "Can I recover a deleted document?", answer: "No. Document deletion is permanent and irreversible. There is no soft-delete or trash mechanism. Ensure you have a backup before deleting." },
+      { question: "What happens to entity links and cases when a document is deleted?", answer: "Entity links referencing the deleted document are removed. If the document was part of a case, the case may be recomputed with the remaining documents during the next backfill." }
+    ],
+    mentions: ["delete", "irreversible", "permanent deletion"]
   }
 ];
@@ -5599,6 +6091,8 @@ var sections20 = [
     description: "List extraction results with optional filters by document, schema, and status. Supports pagination for browsing large extraction sets.",
     content: [
       { type: "paragraph", text: "An extraction is the result of applying a schema to a document. A single document can have multiple extractions if different schemas are applied to it." },
+      { type: "paragraph", text: "Use this endpoint to browse extraction results across your organization. Filter by **document**, **status**, or **time range** to find specific results. Each extraction summary includes an overall confidence score and links to the full result." },
+      { type: "callout", variant: "info", text: "Extractions are returned in descending order by `created_at` by default. Use the `after` and `before` parameters to narrow results to a specific time window." },
       {
         type: "endpoint",
         method: "GET",
@@ -5679,8 +6173,12 @@ var sections20 = [
       { label: "Get Extraction", slug: "get-extraction" },
       { label: "POST /v1/extract", slug: "post-extract" }
     ],
-    faq: [],
-    mentions: ["extractions", "pagination"]
+    faq: [
+      { question: "Can a single document have multiple extractions?", answer: "Yes. A document can have multiple extractions if different schemas are applied to it, or if it is re-extracted with updated settings." },
+      { question: "What does the confidence_overall score represent?", answer: "It is the average confidence across all extracted fields (0 to 1). A score of 0.94 means the extraction engine is highly confident in the accuracy of the results." },
+      { question: "How do I filter extractions by date range?", answer: "Use the `after` and `before` query parameters with ISO 8601 datetime strings to retrieve extractions within a specific time window." }
+    ],
+    mentions: ["extractions", "pagination", "confidence score"]
   },
   {
     slug: "get-extraction",
@@ -5689,6 +6187,8 @@ var sections20 = [
     seoTitle: "Get Extraction Endpoint \u2014 Talonic Docs",
     description: "Retrieve the full extraction result by ID including extracted data, per-field confidence scores, and document metadata.",
     content: [
+      { type: "paragraph", text: "Retrieve the complete extraction result for a single document, including all **extracted field values**, **per-field confidence scores**, and **processing metadata**. This is the primary endpoint for consuming structured data produced by the extraction pipeline." },
+      { type: "paragraph", text: "The response includes the source document summary, a `data` object with field values, and a `confidence` object with both an overall score and individual scores for each field. Fields that have been manually corrected appear in the `locked_fields` array." },
       {
         type: "endpoint",
         method: "GET",
@@ -5769,8 +6269,11 @@ var sections20 = [
       { label: "List Extractions", slug: "list-extractions" },
       { label: "Get Extraction Data", slug: "get-extraction-fields" }
     ],
-    faq: [],
-    mentions: ["extraction result", "confidence scores"]
+    faq: [
+      { question: "What does a locked_fields entry mean?", answer: "A locked field has been manually corrected via the Correct Fields endpoint. Locked fields always have a confidence of 1.0 and will not be overwritten by re-extraction." },
+      { question: "What processing metadata is available?", answer: "The `processing` object includes `duration_ms` (total extraction time), `pages_processed` (number of document pages analyzed), and `region` (the datacenter region that handled the request)." }
+    ],
+    mentions: ["extraction result", "confidence scores", "locked fields", "processing metadata"]
   },
   {
     slug: "get-extraction-fields",
@@ -5779,6 +6282,8 @@ var sections20 = [
     seoTitle: "Get Extraction Data Endpoint \u2014 Talonic Docs",
     description: "Retrieve just the extracted data from an extraction result without metadata. Supports CSV export format for download.",
     content: [
+      { type: "paragraph", text: "Retrieve only the extracted field values from an extraction, without metadata, confidence scores, or processing details. This is the lightest-weight endpoint for consuming extraction output and is ideal for downstream integrations that only need the structured data." },
+      { type: "callout", variant: "info", text: "Use `?format=csv` to download the data as a CSV file. The response Content-Type changes to `text/csv` and includes a `Content-Disposition` header for browser downloads." },
       {
         type: "endpoint",
         method: "GET",
@@ -5825,10 +6330,14 @@ var sections20 = [
       }
     ],
     related: [
-      { label: "Get Extraction", slug: "get-extraction" }
+      { label: "Get Extraction", slug: "get-extraction" },
+      { label: "Correct Fields", slug: "get-extraction-markdown" }
     ],
-    faq: [],
-    mentions: ["CSV export", "extraction data"]
+    faq: [
+      { question: "What is the difference between this endpoint and GET /v1/extractions/:id?", answer: "This endpoint returns only the extracted key-value data. The full extraction endpoint also includes confidence scores, processing metadata, document details, and locked fields." },
+      { question: "Does the CSV export include column headers?", answer: "Yes. The CSV format includes field names as the header row and extracted values as the data row." }
+    ],
+    mentions: ["CSV export", "extraction data", "data download"]
   },
   {
     slug: "get-extraction-markdown",
@@ -5837,6 +6346,9 @@ var sections20 = [
     seoTitle: "Correct Extraction Fields \u2014 Talonic Docs",
     description: "Submit corrections for specific fields in an extraction result. Corrections are logged and can be propagated to similar extractions.",
     content: [
+      { type: "paragraph", text: "Submit corrections to specific fields in a completed extraction. Corrected fields are **locked** at confidence 1.0 and will not be overwritten by future re-extractions. Use this to fix extraction errors and optionally propagate corrections to similar documents." },
+      { type: "paragraph", text: "Each correction includes the field name, new value, and an optional reason. The `propagate` parameter controls whether a correction applies only to this extraction or to all similar extractions." },
+      { type: "callout", variant: "warning", text: "Corrected fields are locked and cannot be overridden by re-extraction. To unlock a field, submit a new correction or contact support." },
       {
         type: "endpoint",
         method: "PATCH",
@@ -5942,10 +6454,15 @@ var sections20 = [
       }
     ],
     related: [
-      { label: "Get Extraction", slug: "get-extraction" }
+      { label: "Get Extraction", slug: "get-extraction" },
+      { label: "Get Extraction Data", slug: "get-extraction-fields" }
     ],
-    faq: [],
-    mentions: ["field corrections", "propagation"]
+    faq: [
+      { question: "What does propagate: all_similar do?", answer: "When set to `all_similar`, the correction is applied to all extractions from documents with the same detected type and matching field values. Use this to fix systematic extraction errors across a batch." },
+      { question: "Can I correct multiple fields in a single request?", answer: "Yes. The `corrections` array accepts multiple entries. Each correction is applied atomically in the same request." },
+      { question: "What happens to the confidence score of a corrected field?", answer: "Corrected fields are automatically set to confidence 1.0 and added to the `locked_fields` array. The overall confidence score is recalculated to reflect the correction." }
+    ],
+    mentions: ["field corrections", "propagation", "locked fields", "confidence override"]
   }
 ];
@@ -5959,6 +6476,8 @@ var sections21 = [
     description: "List all saved schemas with field counts and extraction counts. Schemas define the structure you want to extract from documents.",
     content: [
       { type: "paragraph", text: "Schemas define the structure you want to extract from documents. Save schemas to reuse them across extractions and maintain consistency." },
+      { type: "paragraph", text: "Use this endpoint to browse your organization's schemas, search by name, and paginate through large collections. Each schema in the response includes its **field count**, **version number**, and links to related extractions." },
+      { type: "callout", variant: "info", text: "Schemas are organization-scoped. You will see only the schemas created by your organization, never those of other tenants." },
       {
         type: "endpoint",
         method: "GET",
@@ -6052,8 +6571,12 @@ var sections21 = [
       { label: "Create Schema", slug: "create-schema" },
       { label: "Get Schema", slug: "get-schema" }
     ],
-    faq: [],
-    mentions: ["schemas", "extraction"]
+    faq: [
+      { question: "How do I search for a schema by name?", answer: "Pass the `search` query parameter with a partial name string. The search is case-insensitive and matches any substring of the schema name." },
+      { question: "What is the maximum number of schemas I can retrieve in one request?", answer: "Set `limit` up to 100 per request. Use the `next_cursor` value from the response to paginate through additional results." },
+      { question: "Does listing schemas include the full field definitions?", answer: "Yes. Each schema object in the response includes the complete `definition` with all field properties, types, and required markers." }
+    ],
+    mentions: ["schemas", "extraction", "pagination", "cursor"]
   },
   {
     slug: "get-schema",
@@ -6062,6 +6585,8 @@ var sections21 = [
     seoTitle: "Get Schema Endpoint \u2014 Talonic Docs",
     description: "Retrieve a schema by ID including its full definition with all field names, data types, extraction instructions, and configuration options. Returns version history and extraction count.",
     content: [
+      { type: "paragraph", text: "Retrieve the full details of a single schema, including its **JSON Schema definition**, field count, and version number. Use this to inspect a schema before running extractions or to verify that an update was applied correctly." },
+      { type: "callout", variant: "info", text: "You can pass either a UUID or a `SCH-` prefixed short ID (e.g. `SCH-A1B2C3D4`) as the `:id` path parameter." },
       {
         type: "endpoint",
         method: "GET",
@@ -6132,8 +6657,11 @@ var sections21 = [
       { label: "List Schemas", slug: "list-schemas" },
       { label: "Update Schema", slug: "update-schema" }
     ],
-    faq: [],
-    mentions: ["schema definition"]
+    faq: [
+      { question: "Can I use the short ID instead of the UUID?", answer: "Yes. Both the full UUID and the `SCH-` prefixed short ID (e.g. `SCH-A1B2C3D4`) are accepted as the `:id` path parameter." },
+      { question: "Does this endpoint return the schema version history?", answer: "The response includes the current `version` number. To see previous versions, compare schemas over time using the `updated_at` timestamp." }
+    ],
+    mentions: ["schema definition", "short ID", "UUID"]
   },
   {
     slug: "create-schema",
@@ -6142,6 +6670,16 @@ var sections21 = [
     seoTitle: "Create Schema Endpoint \u2014 Talonic Docs",
     description: "Create a new schema with a name and definition in any of the three supported formats: JSON Schema, simplified fields, or flat key-type map.",
     content: [
+      { type: "paragraph", text: "Create a new schema to define the fields you want to extract from documents. The schema definition is **normalized** to JSON Schema format on creation, regardless of which input format you use." },
+      { type: "paragraph", text: "Talonic accepts three definition formats to make schema creation flexible. You can pass a full JSON Schema object, a simplified fields array, or a flat key-type map. All three are converted to the same internal representation." },
+      {
+        type: "list",
+        items: [
+          '**JSON Schema** \u2014 Standard `{ "type": "object", "properties": { ... } }` format with full control over types and required fields.',
+          '**Simplified fields** \u2014 An array of `{ "name": "field_name", "type": "string" }` objects for quick creation.',
+          '**Flat key-type map** \u2014 A simple `{ "field_name": "string", "amount": "number" }` object for the fastest path.'
+        ]
+      },
       {
         type: "endpoint",
         method: "POST",
@@ -6219,8 +6757,12 @@ var sections21 = [
       { label: "Schema Formats", slug: "extract-schemas" },
       { label: "List Schemas", slug: "list-schemas" }
     ],
-    faq: [],
-    mentions: ["create schema"]
+    faq: [
+      { question: "Which schema definition format should I use?", answer: 'Use the flat key-type map (`{ "field": "type" }`) for quick prototyping and JSON Schema for production schemas that need `required` fields or descriptions.' },
+      { question: "What happens if my definition has no valid fields?", answer: "The API returns a `400 validation_error`. Every schema must produce at least one extractable field after normalization." },
+      { question: "Is the version always 1 for new schemas?", answer: "Yes. New schemas always start at version 1. The version increments when you update the schema definition via `PUT /v1/schemas/:id`." }
+    ],
+    mentions: ["create schema", "JSON Schema", "definition format"]
   },
   {
     slug: "update-schema",
@@ -6229,6 +6771,8 @@ var sections21 = [
     seoTitle: "Update Schema Endpoint \u2014 Talonic Docs",
     description: "Replace a schema definition, creating a new version internally. Existing extractions retain their original schema version.",
     content: [
+      { type: "paragraph", text: "Replace a schema's definition, name, or description. Each update creates a **new version** internally, so you can track how a schema evolves over time. Existing extractions are not affected by schema updates \u2014 they retain the schema version that was active when they were created." },
+      { type: "callout", variant: "warning", text: "Updating a schema definition changes the fields used for **future** extractions. Already-completed extractions continue to reference their original schema version." },
       {
         type: "endpoint",
         method: "PUT",
@@ -6307,10 +6851,14 @@ var sections21 = [
       }
     ],
     related: [
-      { label: "Get Schema", slug: "get-schema" }
+      { label: "Get Schema", slug: "get-schema" },
+      { label: "Create Schema", slug: "create-schema" }
     ],
-    faq: [],
-    mentions: ["schema versioning"]
+    faq: [
+      { question: "Do existing extractions change when I update a schema?", answer: "No. Existing extractions retain the schema version that was active when they were created. Only new extractions use the updated definition." },
+      { question: "Can I update just the name without changing the definition?", answer: "Yes. All body parameters are optional. You can send only `name` or only `description` to update those fields without touching the definition." }
+    ],
+    mentions: ["schema versioning", "schema update"]
   },
   {
     slug: "delete-schema",
@@ -6319,6 +6867,8 @@ var sections21 = [
     seoTitle: "Delete Schema Endpoint \u2014 Talonic Docs",
     description: "Delete a schema by ID. Associated extractions are retained and not deleted when the schema is removed.",
     content: [
+      { type: "paragraph", text: "Permanently delete a schema from your organization. This operation is **irreversible**. Associated extractions are retained and remain accessible \u2014 only the schema definition itself is removed." },
+      { type: "callout", variant: "warning", text: "Deleting a schema does not delete its associated extractions. Extraction results remain available via the extractions API." },
       {
         type: "endpoint",
         method: "DELETE",
@@ -6356,10 +6906,14 @@ var sections21 = [
       }
     ],
     related: [
-      { label: "List Schemas", slug: "list-schemas" }
+      { label: "List Schemas", slug: "list-schemas" },
+      { label: "Get Schema", slug: "get-schema" }
     ],
-    faq: [],
-    mentions: ["delete schema"]
+    faq: [
+      { question: "What happens to extractions when I delete a schema?", answer: "Extractions are retained and remain accessible. Only the schema definition is removed. You can still query extraction results via the extractions API." },
+      { question: "Can I recover a deleted schema?", answer: "No. Schema deletion is permanent. If you need the same structure again, create a new schema with the same definition." }
+    ],
+    mentions: ["delete schema", "extraction retention"]
   }
 ];
@@ -6373,6 +6927,7 @@ var sections22 = [
     description: "List all extraction jobs with status, progress, and pagination. Filter by status to find queued, processing, completed, failed, or cancelled jobs.",
     content: [
       { type: "paragraph", text: "Jobs track asynchronous extraction work. Create a job with a schema and document set, then poll for progress. Each job runs the full 4-phase extraction pipeline." },
+      { type: "paragraph", text: "The 4-phase pipeline consists of **Phase 1 (Resolve)**, which fills 60-80% of cells using registry transfer, raw extraction mapping, lookup cascades, and deterministic compute; **Phase 2 (Agent)**, which uses Claude to extract the remaining values; **Phase 3 (Resolve II)**, which normalizes LLM-extracted values to canonical codes; and **Phase 4 (Transform)**, which applies deterministic transforms, validation, and format constraints." },
       {
         type: "endpoint",
         method: "GET",
@@ -6472,8 +7027,12 @@ var sections22 = [
       { label: "Create Job", slug: "create-job" },
       { label: "Get Job", slug: "get-job" }
     ],
-    faq: [],
-    mentions: ["jobs", "extraction pipeline"]
+    faq: [
+      { question: "What does fill_rate in grid_stats represent?", answer: "Fill rate is the ratio of filled cells to total cells in the extraction grid, which has one cell per document per schema field. A fill rate of 0.94 means 94% of expected field values were successfully extracted." },
+      { question: "What does current_phase tell me?", answer: "It indicates which pipeline phase the job is currently executing: `phase_1_resolve`, `phase_2_execute`, `phase_3_resolve`, or `phase_4_transform`. It is `null` when the job is complete or has not started." },
+      { question: "How does cursor-based pagination work?", answer: "Pass the `next_cursor` value from the previous response as the `cursor` query parameter to fetch the next page. When `has_more` is `false`, you have reached the last page." }
+    ],
+    mentions: ["jobs", "extraction pipeline", "4-phase pipeline", "grid stats"]
   },
   {
     slug: "create-job",
@@ -6482,6 +7041,8 @@ var sections22 = [
     seoTitle: "Create Job Endpoint \u2014 Talonic Docs",
     description: "Create and run an extraction job with a schema and optional document set. Returns a job ID for polling progress and retrieving results.",
     content: [
+      { type: "paragraph", text: "Create a new extraction job targeting a specific schema. The job immediately enters `pending` status and begins processing asynchronously. If `document_ids` is omitted, the job processes all completed documents in your organization. Poll the job status endpoint to track progress." },
+      { type: "callout", variant: "info", text: "Jobs are limited to 2,000 documents per run. If you need to process more, split your document set across multiple jobs." },
       {
         type: "endpoint",
         method: "POST",
@@ -6539,8 +7100,11 @@ var sections22 = [
       { label: "List Jobs", slug: "list-jobs" },
       { label: "Get Job", slug: "get-job" }
     ],
-    faq: [],
-    mentions: ["create job", "extraction"]
+    faq: [
+      { question: "What happens if I omit document_ids?", answer: "The job processes all documents with `completed` status in your organization. This is convenient for initial runs but may include documents you do not want to process. Use `document_ids` for targeted extraction." },
+      { question: "Can I run multiple jobs simultaneously?", answer: "Yes. Jobs are processed independently. However, concurrent jobs compete for the same compute resources, so running many large jobs in parallel may increase processing times." }
+    ],
+    mentions: ["create job", "extraction", "schema_id"]
   },
   {
     slug: "get-job",
@@ -6549,6 +7113,7 @@ var sections22 = [
     seoTitle: "Get Job Endpoint \u2014 Talonic Docs",
     description: "Get job status, progress percentage, current phase, document counts, grid statistics, and estimated completion time for an extraction job.",
     content: [
+      { type: "paragraph", text: "Retrieve the current status and progress of a specific extraction job. Use this endpoint to poll for completion after creating a job. The `grid_stats` field provides real-time cell fill rates, and `current_phase` indicates which pipeline phase is actively executing." },
       {
         type: "endpoint",
         method: "GET",
@@ -6623,8 +7188,11 @@ var sections22 = [
       { label: "List Jobs", slug: "list-jobs" },
       { label: "Create Job", slug: "create-job" }
     ],
-    faq: [],
-    mentions: ["job status", "grid stats"]
+    faq: [
+      { question: "How often should I poll for job status?", answer: "Every 2-5 seconds is a reasonable polling interval. The `progress` field updates as cells are filled across pipeline phases, giving you real-time feedback." },
+      { question: "What does it mean when grid_stats is null?", answer: "The grid has not been initialized yet. This happens when the job is still in `pending` status or has just started processing before Phase 1 begins." }
+    ],
+    mentions: ["job status", "grid stats", "polling"]
   },
   {
     slug: "cancel-job",
@@ -6633,6 +7201,8 @@ var sections22 = [
     seoTitle: "Cancel Job Endpoint \u2014 Talonic Docs",
     description: "Cancel a pending or processing job. Returns the final job state with status set to failed and an error message indicating cancellation.",
     content: [
+      { type: "paragraph", text: "Cancel a job that is currently `pending` or `processing`. The job is immediately marked as `failed` with an error message indicating it was cancelled via the API. Any results that were already processed are preserved and can still be retrieved via the results endpoint." },
+      { type: "callout", variant: "info", text: "Cancellation is best-effort. If the job completes between your cancel request and the server processing it, you will receive a `409 conflict` error. Already-completed or already-failed jobs cannot be cancelled." },
       {
         type: "endpoint",
         method: "POST",
@@ -6697,8 +7267,11 @@ var sections22 = [
       { label: "Get Job", slug: "get-job" },
       { label: "List Jobs", slug: "list-jobs" }
     ],
-    faq: [],
-    mentions: ["cancel job"]
+    faq: [
+      { question: "Can I retrieve partial results from a cancelled job?", answer: "Yes. Any results that were written before cancellation are preserved. Use `GET /v1/jobs/:id/results` to retrieve whatever was processed before the job was stopped." },
+      { question: "Does cancelling a job refund credits?", answer: "No. Credits consumed by processing that already occurred are not refunded. Cancelling saves only the credits that the remaining, unexecuted phases would have consumed." }
+    ],
+    mentions: ["cancel job", "cancellation"]
   },
   {
     slug: "get-job-results",
@@ -6707,6 +7280,8 @@ var sections22 = [
     seoTitle: "Job Results Endpoint \u2014 Talonic Docs",
     description: "Retrieve the extracted result rows for a job. Each row corresponds to one document and contains extracted field values, confidence score, and validation flags.",
     content: [
+      { type: "paragraph", text: "Retrieve the extracted result rows for a completed (or partially completed) job. Each row corresponds to one document and contains the extracted field values keyed by field name, a row-level confidence score, and any validation flags raised during Phase 4. Results are available progressively as each pipeline phase flushes to the database." },
+      { type: "callout", variant: "info", text: "Results are available even while the job is still processing. The grid flushes to the database after each phase, so you can read partial results before the job completes." },
       {
         type: "endpoint",
         method: "GET",
@@ -6795,8 +7370,12 @@ var sections22 = [
       { label: "Get Job", slug: "get-job" },
       { label: "Create Job", slug: "create-job" }
     ],
-    faq: [],
-    mentions: ["job results", "extracted values", "validation flags"]
+    faq: [
+      { question: "What do validation_flags contain?", answer: "Validation flags are strings indicating data quality issues detected in Phase 4. Common flags include `missing_required_field:<field_name>` and `format_mismatch:<field_name>`. An empty array means all validations passed." },
+      { question: 'What is the difference between status "approved" and "review"?', answer: "Rows with `approved` status passed all validation checks and are ready for delivery. Rows with `review` status have one or more validation flags and may require manual inspection before use." },
+      { question: "Can I get results for a failed job?", answer: "Yes. If the job processed some documents before failing, those partial results are available. Check `job_status` in the response to understand the context." }
+    ],
+    mentions: ["job results", "extracted values", "validation flags", "confidence score"]
   }
 ];
@@ -6810,6 +7389,8 @@ var sections23 = [
     description: "List all input sources that group documents by origin. Each source has its own API key for programmatic document ingestion.",
     content: [
       { type: "paragraph", text: "Inputs group documents by origin. Each input source has its own API key for programmatic document ingestion." },
+      { type: "paragraph", text: "Sources are the entry point for document ingestion in Talonic. Each source represents a distinct pipeline or integration \u2014 for example, an invoice processing pipeline or a contract review workflow. Documents ingested through a source inherit its configuration, including any default schema." },
+      { type: "callout", variant: "info", text: "Each source has a unique `endpoint` URL for document ingestion. Use the source-scoped API key (returned at creation) to authenticate uploads to that endpoint." },
       {
         type: "endpoint",
         method: "GET",
@@ -6871,8 +7452,11 @@ var sections23 = [
       { label: "Create Input", slug: "create-source" },
       { label: "Get Source", slug: "get-source" }
     ],
-    faq: [],
-    mentions: ["sources", "input"]
+    faq: [
+      { question: "What is the difference between a source and a schema?", answer: "A source groups documents by origin (e.g. an API integration or upload pipeline). A schema defines the fields to extract from documents. A source can optionally have a default schema applied to all its documents." },
+      { question: "Can I see the API key for an existing source?", answer: "No. The source-scoped API key is shown only once, in the creation response. If you lose it, delete the source and create a new one." }
+    ],
+    mentions: ["sources", "input", "document ingestion"]
   },
   {
     slug: "create-source",
@@ -6881,6 +7465,8 @@ var sections23 = [
     seoTitle: "Create Source Endpoint \u2014 Talonic Docs",
     description: "Create a new input source and receive a source-scoped API key. The key is only shown once in the creation response \u2014 store it securely.",
     content: [
+      { type: "paragraph", text: "Create a new source to start ingesting documents. The response includes a **source-scoped API key** (`tlnc_sk_*`) that authenticates uploads to this source's endpoint. This key is shown only once \u2014 store it securely immediately after creation." },
+      { type: "callout", variant: "warning", text: "The `api_key` is only returned in the creation response. It cannot be retrieved later. If you lose it, delete the source and create a new one." },
       {
         type: "endpoint",
         method: "POST",
@@ -6948,10 +7534,14 @@ var sections23 = [
       }
     ],
     related: [
-      { label: "List Inputs", slug: "list-sources" }
+      { label: "List Inputs", slug: "list-sources" },
+      { label: "Source Documents", slug: "update-source" }
     ],
-    faq: [],
-    mentions: ["API key", "source creation"]
+    faq: [
+      { question: "What source types are available?", answer: "The `type` field accepts `api` (programmatic ingestion via REST), `upload` (manual file uploads), and `connector` (third-party integrations like Google Drive or SharePoint)." },
+      { question: "Can I set a default schema when creating a source?", answer: "Yes. Pass a `default_schema_id` in the request body to automatically apply a schema to all documents ingested through this source." }
+    ],
+    mentions: ["API key", "source creation", "source-scoped key"]
   },
   {
     slug: "get-source",
@@ -6960,6 +7550,7 @@ var sections23 = [
     seoTitle: "Manage Source Endpoint \u2014 Talonic Docs",
     description: "Get source details, update a source name, or delete a source. Documents are retained but unlinked when a source is deleted.",
     content: [
+      { type: "paragraph", text: "Manage an individual source with GET, PATCH, and DELETE operations on the same path. Retrieve source details, update its name, or permanently delete it. When a source is deleted, its documents are **retained** but unlinked from the source." },
       {
         type: "endpoint",
         method: "GET",
@@ -7108,10 +7699,14 @@ var sections23 = [
       }
     ],
     related: [
-      { label: "List Inputs", slug: "list-sources" }
+      { label: "List Inputs", slug: "list-sources" },
+      { label: "Source Documents", slug: "update-source" }
     ],
-    faq: [],
-    mentions: ["source management"]
+    faq: [
+      { question: "What happens to documents when I delete a source?", answer: "Documents are retained in your organization but unlinked from the deleted source. They remain accessible via the documents API and any existing extractions are preserved." },
+      { question: "Does deleting a source revoke its API key?", answer: "Yes. The source-scoped API key is immediately invalidated when the source is deleted. Any subsequent upload attempts using that key will return 401." }
+    ],
+    mentions: ["source management", "source deletion", "document retention"]
   },
   {
     slug: "update-source",
@@ -7120,6 +7715,9 @@ var sections23 = [
     seoTitle: "Source Documents Endpoint \u2014 Talonic Docs",
     description: "Ingest documents into a specific source or list documents belonging to a source. Supports batch processing mode at 50% cost discount.",
     content: [
+      { type: "paragraph", text: "Ingest documents into a source for processing, or list all documents that belong to a source. The ingestion endpoint accepts a file upload or a URL, processes the document through the extraction pipeline, and returns the document ID for status tracking." },
+      { type: "paragraph", text: "Documents can be processed in **realtime** (default, results in seconds) or **batch** mode (50% cost discount, results within 48 hours). Duplicate files are detected via SHA-256 hash and rejected with a `duplicate` status." },
+      { type: "callout", variant: "info", text: "Batch processing mode reduces cost by 50% but delivers results within 48 hours. Use `processing_mode=batch` for large ingestion jobs where latency is not critical." },
       {
         type: "endpoint",
         method: "POST",
@@ -7254,8 +7852,12 @@ var sections23 = [
       { label: "List Inputs", slug: "list-sources" },
       { label: "List Documents", slug: "list-documents" }
     ],
-    faq: [],
-    mentions: ["document ingestion", "batch processing"]
+    faq: [
+      { question: "What happens if I upload a duplicate file?", answer: 'Duplicate files are detected via SHA-256 hash comparison. The response returns `status: "duplicate"` with `existing_document_id` pointing to the original document. No new document is created.' },
+      { question: "Can I use both file and file_url in the same request?", answer: "No. Provide either `file` (binary upload) or `file_url` (URL fetch), but not both. The API returns a 400 error if neither is provided." },
+      { question: "What file formats are supported for document ingestion?", answer: "Talonic supports 25+ formats including PDF, DOCX, XLSX, CSV, PPTX, MSG, EML, PNG, JPG, HTML, XML, JSON, and more. See the supported file types documentation for the full list." }
+    ],
+    mentions: ["document ingestion", "batch processing", "duplicate detection", "file upload"]
   }
 ];
@@ -7269,6 +7871,7 @@ var sections24 = [
     description: "Autocomplete field names from the registry ranked by relevance and occurrence count. Power field picker UIs with type-ahead search.",
     content: [
       { type: "paragraph", text: "Search and filter documents by their extracted field values. Includes field autocomplete, document filtering with composable conditions, global omnisearch, and saved filter management." },
+      { type: "paragraph", text: "The field autocomplete endpoint enables type-ahead search over the field registry. Results are ranked by a combination of name relevance and occurrence count, making frequently seen fields surface first. Use this to power field picker dropdowns and search-as-you-type UIs." },
       {
         type: "endpoint",
         method: "GET",
@@ -7335,8 +7938,12 @@ var sections24 = [
       { label: "Field Values", slug: "field-values" },
       { label: "Filter Documents", slug: "filter-documents" }
     ],
-    faq: [],
-    mentions: ["autocomplete", "field registry"]
+    faq: [
+      { question: "How does the ranking work?", answer: "Results are ranked by a combined score of name match relevance (canonical name, display name, and alias matches) and `occurrenceCount`. Fields that appear in more documents rank higher for equivalent relevance." },
+      { question: "What does the tier field represent?", answer: "Tier 1 fields have high confidence and consistent extraction across documents. Higher tiers indicate lower confidence or less frequent occurrence. Use tier to decide which fields to display prominently." },
+      { question: "Can I search by alias names?", answer: "Yes. The autocomplete matches against canonical names, display names, and registered aliases. The `matchSource` field indicates how the match was made." }
+    ],
+    mentions: ["autocomplete", "field registry", "type-ahead search"]
   },
   {
     slug: "field-values",
@@ -7345,6 +7952,7 @@ var sections24 = [
     seoTitle: "Field Values Endpoint \u2014 Talonic Docs",
     description: "List distinct values for a field across documents with counts. Useful for building filter dropdowns and faceted search interfaces.",
     content: [
+      { type: "paragraph", text: "Retrieve the distinct values for a specific field across all documents in your workspace. Results are sorted by count in descending order, so the most common values appear first. Use this to populate filter dropdowns, build faceted search interfaces, or analyze value distributions for data quality." },
       {
         type: "endpoint",
         method: "GET",
@@ -7400,8 +8008,11 @@ var sections24 = [
       { label: "Field Autocomplete", slug: "field-autocomplete" },
       { label: "Filter Documents", slug: "filter-documents" }
     ],
-    faq: [],
-    mentions: ["field values", "faceted search"]
+    faq: [
+      { question: "Are values case-sensitive?", answer: 'Values are returned as extracted. The `q` substring filter is case-insensitive, so searching for "acme" will match "Acme Corp".' },
+      { question: "What does totalDistinct represent when a limit is applied?", answer: 'It shows the total number of unique values for this field across all documents, regardless of the `limit` parameter. Use it to indicate "showing 20 of 156 values" in your UI.' }
+    ],
+    mentions: ["field values", "faceted search", "value distribution"]
   },
   {
     slug: "filter-documents",
@@ -7410,6 +8021,8 @@ var sections24 = [
     seoTitle: "Filter Documents Endpoint \u2014 Talonic Docs",
     description: "Filter documents by field value conditions using composable operators including equality, comparison, range, containment, and emptiness checks.",
     content: [
+      { type: "paragraph", text: "Filter documents by composing conditions on extracted field values. Each condition targets a specific field and applies an operator to test its value. Multiple conditions are AND-combined. The endpoint also supports free-text search across document content and sorting by any field." },
+      { type: "callout", variant: "info", text: "Field names in conditions must be valid field registry IDs (e.g. `fld_a1b2c3d4`). Use the field autocomplete endpoint to discover available field IDs for your workspace." },
       {
         type: "endpoint",
         method: "POST",
@@ -7497,8 +8110,12 @@ var sections24 = [
       { label: "Field Autocomplete", slug: "field-autocomplete" },
       { label: "Omnisearch", slug: "omnisearch" }
     ],
-    faq: [],
-    mentions: ["filter", "conditions", "operators"]
+    faq: [
+      { question: "How do I use the between operator?", answer: 'Provide both `value` (lower bound, inclusive) and `valueTo` (upper bound, inclusive) in the condition. Works with dates, numbers, and strings. Example: `{ "fieldId": "fld_x", "operator": "between", "value": "2024-01-01", "valueTo": "2024-12-31" }`.' },
+      { question: "What happens if a document does not have a value for a filtered field?", answer: "Documents missing the filtered field are excluded from results unless you use the `is_empty` operator, which explicitly matches documents where the field is null or absent." },
+      { question: "Can I combine free-text search with field conditions?", answer: "Yes. Set the `search` parameter alongside `conditions`. Both are AND-combined \u2014 documents must match the search text and all conditions." }
+    ],
+    mentions: ["filter", "conditions", "operators", "composable filters"]
   },
   {
     slug: "filter-documents-export",
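The condition semantics described in the filter FAQ above (inclusive `between` bounds, exclusion of documents missing the field, `is_empty` as the one operator that matches missing values, and AND-combination) can be sketched as a small client-side evaluator. This is an illustration of the documented behaviour under stated assumptions, not the server's implementation: the function names and the flat `fields` object are hypothetical, and only the two operators named in the FAQ are covered.

```javascript
// Hypothetical client-side evaluator for the documented condition shape:
// { fieldId, operator, value, valueTo }.
function matchesCondition(fields, cond) {
  const v = fields[cond.fieldId];
  const missing = v === undefined || v === null;
  // `is_empty` is the only operator that matches a null or absent field.
  if (cond.operator === "is_empty") return missing;
  // For every other operator, documents missing the field are excluded.
  if (missing) return false;
  if (cond.operator === "between") {
    // Both bounds are inclusive; ISO dates compare correctly as strings.
    return v >= cond.value && v <= cond.valueTo;
  }
  throw new Error(`operator not covered by this sketch: ${cond.operator}`);
}

// Multiple conditions are AND-combined.
function matchesAll(fields, conditions) {
  return conditions.every((cond) => matchesCondition(fields, cond));
}

const inYear2024 = {
  fieldId: "fld_x",
  operator: "between",
  value: "2024-01-01",
  valueTo: "2024-12-31",
};
```

With this sketch, `matchesAll({ fld_x: "2024-12-31" }, [inYear2024])` is true because the upper bound is inclusive, while a document with no `fld_x` value matches only an `is_empty` condition.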
@@ -7507,6 +8124,7 @@ var sections24 = [
     seoTitle: "Omnisearch Endpoint \u2014 Talonic Docs",
     description: "Global omnisearch across documents, fields, sources, and schemas. Unified search endpoint that powers the Cmd+K search experience.",
     content: [
+      { type: "paragraph", text: "The omnisearch endpoint provides a unified search across all entity types in your workspace: documents, field values, source connections, schemas, and registry fields. A single query returns categorized results, making it ideal for building global search UIs like Cmd+K palettes." },
       {
         type: "endpoint",
         method: "GET",
@@ -7576,8 +8194,11 @@ var sections24 = [
       { label: "Filter Documents", slug: "filter-documents" },
       { label: "Field Autocomplete", slug: "field-autocomplete" }
     ],
-    faq: [],
-    mentions: ["omnisearch", "Cmd+K"]
+    faq: [
+      { question: "Does omnisearch return results from all entity types in every call?", answer: "Yes. Every call searches documents, field values, sources, schemas, and registry fields simultaneously. Empty categories are returned as empty arrays." },
+      { question: "How is the limit parameter applied?", answer: "The `limit` applies independently to each entity type. Setting `limit=5` returns up to 5 documents, 5 field matches, 5 sources, 5 schemas, and 5 fields." }
+    ],
+    mentions: ["omnisearch", "Cmd+K", "unified search"]
   },
   {
     slug: "omnisearch",
@@ -7586,6 +8207,7 @@ var sections24 = [
     seoTitle: "Saved Filters Endpoints \u2014 Talonic Docs",
     description: "Create, list, and delete saved filter configurations for reuse. Persist filter conditions, search terms, and sort settings.",
     content: [
+      { type: "paragraph", text: "Saved filters persist reusable filter configurations so you can apply the same conditions, search terms, and sort settings across multiple sessions. Create a saved filter from any combination of conditions, then reload it later without re-specifying each parameter." },
       {
         type: "endpoint",
         method: "GET",
@@ -7719,8 +8341,11 @@ var sections24 = [
     related: [
       { label: "Filter Documents", slug: "filter-documents" }
     ],
-    faq: [],
-    mentions: ["saved filters"]
+    faq: [
+      { question: "Can I update a saved filter?", answer: "There is no PUT endpoint for saved filters. To update, delete the existing filter and create a new one with the desired configuration." },
+      { question: "Are saved filters shared across team members?", answer: "Yes. Saved filters are scoped to the organization, so all team members with read access can list and use them." }
+    ],
+    mentions: ["saved filters", "reusable filters"]
   },
   {
     slug: "saved-filters",
@@ -7734,8 +8359,11 @@ var sections24 = [
     related: [
       { label: "Filter Documents", slug: "filter-documents" }
     ],
-    faq: [],
-    mentions: ["document counts", "faceted navigation"]
+    faq: [
+      { question: "Does the document counts endpoint support the same conditions as the filter endpoint?", answer: "Yes. The conditions format is identical to the filter documents endpoint. You can reuse the same condition arrays to get counts before fetching the actual documents." },
+      { question: "Can I use document counts without any conditions?", answer: "Yes. Omitting conditions returns the total document count across all sources, which is useful for dashboard overview widgets." }
+    ],
+    mentions: ["document counts", "faceted navigation", "aggregate counts"]
   },
   {
     slug: "document-counts",
@@ -7744,6 +8372,8 @@ var sections24 = [
     seoTitle: "Materialize Index Endpoint \u2014 Talonic Docs",
     description: "Trigger materialization backfill for filter indexes. Rebuilds the materialized field value index used by filter queries after bulk ingestion.",
     content: [
+      { type: "paragraph", text: "Trigger a rebuild of the materialized field value index. The materialized index pre-computes extracted field values for every document, enabling sub-second filter queries even on large workspaces. Run this endpoint after bulk document ingestion or schema changes to ensure filter results are current." },
+      { type: "callout", variant: "warning", text: "Materialization can be resource-intensive on large workspaces. Avoid calling this endpoint during peak usage. For incremental ingestion, the index updates automatically per document." },
       {
         type: "endpoint",
         method: "POST",
@@ -7780,8 +8410,11 @@ var sections24 = [
     related: [
       { label: "Filter Documents", slug: "filter-documents" }
     ],
-    faq: [],
-    mentions: ["materialized index", "backfill"]
+    faq: [
+      { question: "When do I need to manually trigger materialization?", answer: "Only after bulk ingestion (e.g. uploading hundreds of documents at once). For normal single-document uploads, the index is updated automatically during post-extraction processing." },
+      { question: "Is materialization idempotent?", answer: "Yes. Running it multiple times produces the same result. Existing materialized values are upserted, not duplicated." }
+    ],
+    mentions: ["materialized index", "backfill", "index rebuild"]
   },
   {
     slug: "materialized-index",
@@ -7795,8 +8428,11 @@ var sections24 = [
     related: [
       { label: "Materialize", slug: "document-counts" }
     ],
-    faq: [],
-    mentions: ["materialized index"]
+    faq: [
+      { question: "What happens if the materialized index is out of date?", answer: "Filter queries may return stale results or miss recently ingested documents. Trigger a materialization backfill via `POST /filter/materialize` to rebuild the index." },
+      { question: "Does the materialized index affect omnisearch results?", answer: "Yes. Omnisearch queries field values from the same materialized index. An out-of-date index may cause missing results in both filter and omnisearch endpoints." }
+    ],
+    mentions: ["materialized index", "filter performance"]
   }
 ];
@@ -7972,6 +8608,7 @@ var sections26 = [
     description: "List resolution runs that apply field normalization, lookup cascades, and value transforms to extracted data. Supports cursor-based pagination.",
     content: [
       { type: "paragraph", text: "Resolution runs apply field normalization, lookup cascades, and value transforms to extracted data. Create a resolution from a completed job run to standardise field values against reference data." },
+      { type: "paragraph", text: 'The resolution pipeline maps raw extracted values (e.g. "Deutschland") to canonical forms (e.g. "DE") using a 3-tier lookup cascade: string normalization, token-based fuzzy matching, and an LLM fallback. Each resolution run captures a snapshot of the active resolution policy and dialect at run time for reproducibility.' },
       {
         type: "endpoint",
         method: "GET",
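The 3-tier lookup cascade described above (string normalization, token-based fuzzy matching, LLM fallback) could look roughly like the following. This is a minimal sketch under assumptions: the lookup table, the Jaccard similarity measure, the `0.5` threshold, and the injected `llmFallback` callback are all hypothetical, and the `tier` labels are illustrative rather than the API's `resolution_step` values.

```javascript
// Tier 1 helper: trim, lowercase, and strip diacritics.
function normalize(s) {
  return s.normalize("NFKD").replace(/[\u0300-\u036f]/g, "").trim().toLowerCase();
}

// Tier 2 helpers: token sets and Jaccard similarity (an assumed measure).
function tokens(s) {
  return new Set(normalize(s).split(/[^a-z0-9]+/).filter(Boolean));
}
function jaccard(a, b) {
  let shared = 0;
  for (const t of a) if (b.has(t)) shared++;
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

// `table` maps known aliases to canonical forms; `llmFallback` stands in for
// the LLM tier and returns a canonical value or null.
function resolveValue(raw, table, llmFallback, threshold = 0.5) {
  const entries = Object.entries(table);
  const key = normalize(raw);
  for (const [alias, canonical] of entries) {
    if (normalize(alias) === key) return { resolved: canonical, tier: "normalization" };
  }
  let best = null;
  const rawTokens = tokens(raw);
  for (const [alias, canonical] of entries) {
    const score = jaccard(rawTokens, tokens(alias));
    if (score >= threshold && (best === null || score > best.score)) {
      best = { resolved: canonical, tier: "fuzzy", score };
    }
  }
  if (best) return best;
  const guess = llmFallback(raw);
  return guess == null ? { resolved: null, tier: null } : { resolved: guess, tier: "llm" };
}
```

Ordering the tiers from cheapest to most expensive means the fallback is only consulted for values the deterministic tiers cannot match.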
@@ -8046,8 +8683,12 @@ var sections26 = [
       { label: "Create Resolution", slug: "create-resolution" },
       { label: "Get Resolution Results", slug: "get-resolution-results" }
     ],
-    faq: [],
-    mentions: ["resolution", "normalization", "lookup cascade"]
+    faq: [
+      { question: "What is the difference between a job run and a resolution run?", answer: "A job run extracts raw field values from documents. A resolution run takes those raw values and normalizes them to canonical forms using lookup tables, fuzzy matching, and LLM fallbacks." },
+      { question: "What are policy_snapshot and dialect_snapshot?", answer: "These capture the resolution configuration at run time. The policy snapshot records which normalization strategies were active. The dialect snapshot records output format preferences (e.g. date format, number locale). Both ensure runs are reproducible." },
+      { question: "Can I filter resolutions by the originating job?", answer: "Yes. Use the `source_run_id` query parameter to find all resolution runs derived from a specific job run." }
+    ],
+    mentions: ["resolution", "normalization", "lookup cascade", "canonical forms"]
   },
   {
     slug: "create-resolution",
@@ -8056,6 +8697,8 @@ var sections26 = [
     seoTitle: "Create Resolution Endpoint \u2014 Talonic Docs",
     description: "Create a resolution run from a completed job to standardise extracted field values against reference data using lookup cascades and transforms.",
     content: [
+      { type: "paragraph", text: "Create a new resolution run targeting documents from a completed job run. The resolution enters `pending` status immediately. Call the execute endpoint to start processing; depending on your pipeline configuration, the run may instead be picked up automatically." },
+      { type: "callout", variant: "info", text: "The `source_run_id` must reference a completed job run. Creating a resolution against a pending or failed run returns a `404` error." },
       {
         type: "endpoint",
         method: "POST",
@@ -8130,8 +8773,11 @@ var sections26 = [
       { label: "List Resolutions", slug: "list-resolutions" },
       { label: "Execute Resolution", slug: "execute-resolution" }
     ],
-    faq: [],
-    mentions: ["create resolution"]
+    faq: [
+      { question: "Do I need to call execute after creating a resolution?", answer: "Yes. Creating a resolution only sets it to `pending`. Call `POST /v1/resolutions/{id}/execute` to start the resolution pipeline." },
+      { question: "Can I create multiple resolutions from the same job run?", answer: "Yes. Each resolution run is independent and produces its own set of results. This is useful for testing different resolution policies." }
+    ],
+    mentions: ["create resolution", "source_run_id"]
   },
   {
     slug: "get-resolution",
@@ -8140,6 +8786,7 @@ var sections26 = [
     seoTitle: "Get Resolution Endpoint \u2014 Talonic Docs",
     description: "Retrieve a resolution run by ID with its current status, document count, completion timestamp, and links to results. Requires read scope for the workspace.",
     content: [
+      { type: "paragraph", text: "Retrieve the current status and metadata of a specific resolution run. Use this endpoint to poll for completion after executing a resolution. The response includes links to the results endpoint where you can inspect per-field resolved values." },
       {
         type: "endpoint",
         method: "GET",
@@ -8198,8 +8845,11 @@ var sections26 = [
       { label: "List Resolutions", slug: "list-resolutions" },
       { label: "Get Resolution Results", slug: "get-resolution-results" }
     ],
-    faq: [],
-    mentions: ["resolution status"]
+    faq: [
+      { question: "How long does a resolution run typically take?", answer: "Runs that use only deterministic lookups finish in seconds. Runs that require LLM fallback for ambiguous values take longer, typically 1-5 minutes depending on the number of unresolved fields." },
+      { question: "What does a failed resolution status mean?", answer: "A `failed` status indicates the resolution pipeline encountered an unrecoverable error. Check the `error_message` field for details. You can delete the failed run and create a new one." }
+    ],
+    mentions: ["resolution status", "polling"]
   },
   {
     slug: "get-resolution-results",
@@ -8208,6 +8858,8 @@ var sections26 = [
     seoTitle: "Get Resolution Results \u2014 Talonic Docs",
     description: "Get per-field resolution results showing original values, resolved values, resolution step used, and confidence scores for each document.",
     content: [
+      { type: "paragraph", text: "Retrieve the per-field resolution results for a completed run. Each result shows the original extracted value alongside the resolved canonical value, the resolution strategy that produced the match, and a confidence score. Use this to audit how values were normalized and identify fields that may need manual review." },
+      { type: "callout", variant: "info", text: "The `resolution_step` field indicates which strategy produced the resolved value: `lookup` (direct table match), `transfer` (registry transfer), `compute` (deterministic computation), or `null` if no resolution was applied." },
       {
         type: "endpoint",
         method: "GET",
@@ -8266,8 +8918,11 @@ var sections26 = [
     related: [
       { label: "Get Resolution", slug: "get-resolution" }
     ],
-    faq: [],
-    mentions: ["resolution results", "resolved values"]
+    faq: [
+      { question: "What does it mean when resolved_value is null?", answer: "A null `resolved_value` means no resolution strategy could match the original value to a canonical form. The field retains its raw extracted value. Consider adding the value to a lookup table for future runs." },
+      { question: "What confidence scores should I expect from each resolution step?", answer: "Direct `lookup` matches typically score 0.95-1.0. Fuzzy token matches score 0.7-0.95. LLM fallback matches score 0.5-0.8. Values below 0.5 usually indicate low-confidence guesses." }
+    ],
+    mentions: ["resolution results", "resolved values", "original values", "resolution step"]
   },
   {
     slug: "execute-resolution",
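The audit use case described for resolution results (spotting null `resolved_value` entries and low-confidence matches) can be sketched as a small triage helper. This is an assumption-laden illustration: `resolved_value` appears in the docs text, but the `confidence` property name and the result object shape are hypothetical, and the `0.5` default only mirrors the FAQ's remark that values below 0.5 usually indicate low-confidence guesses.

```javascript
// Hypothetical triage helper for per-field resolution results.
function needsReview(result, minConfidence = 0.5) {
  // A null resolved value means no strategy matched; the raw value was kept.
  if (result.resolved_value === null) return true;
  // Otherwise flag anything below the chosen confidence floor.
  return result.confidence < minConfidence;
}

function reviewQueue(results, minConfidence = 0.5) {
  return results.filter((r) => needsReview(r, minConfidence));
}
```

Raising `minConfidence` widens the review queue, which may be useful when auditing a new resolution policy.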
@@ -8276,6 +8931,8 @@ var sections26 = [
     seoTitle: "Execute Resolution Endpoint \u2014 Talonic Docs",
     description: "Execute the resolution pipeline on all pending fields. Returns immediately \u2014 poll the run for progress. Requires write scope.",
     content: [
+      { type: "paragraph", text: "Start the resolution pipeline on a pending resolution run. The endpoint returns immediately with the updated run status (typically `running`). Poll the get resolution endpoint to track progress and check for completion." },
+      { type: "callout", variant: "warning", text: "Executing a resolution that is already `running` or `completed` has no effect. Only `pending` runs can be executed." },
       {
         type: "endpoint",
         method: "POST",
@@ -8334,8 +8991,11 @@ var sections26 = [
       { label: "Create Resolution", slug: "create-resolution" },
       { label: "Get Resolution Results", slug: "get-resolution-results" }
     ],
-    faq: [],
-    mentions: ["execute resolution"]
+    faq: [
+      { question: "Is the execute call synchronous?", answer: "No. The endpoint returns immediately with status `running`. Poll `GET /v1/resolutions/{id}` to track progress and detect completion." },
+      { question: "What happens if execution fails?", answer: "The run status transitions to `failed` and the `error_message` field is populated. You can delete the failed run and create a new one to retry." }
+    ],
+    mentions: ["execute resolution", "start resolution"]
   },
   {
     slug: "cancel-resolution",
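The run lifecycle described across the create, execute, and get endpoints (`pending`, then `running`, then `completed` or `failed`) can be summarised as a small decision function for a polling client. The status names come from the docs text; the action labels and the function itself are hypothetical.

```javascript
// Hypothetical helper: decide what a client should do next for a run.
function nextAction(run) {
  switch (run.status) {
    case "pending":
      return "execute"; // only pending runs can be executed
    case "running":
      return "poll"; // execute returns immediately, so keep polling
    case "completed":
      return "fetch_results";
    case "failed":
      return "delete_and_recreate"; // see error_message, then retry with a new run
    default:
      throw new Error(`unknown status: ${run.status}`);
  }
}
```

A polling loop would call the get resolution endpoint, pass the response to `nextAction`, and stop once the result is no longer `"poll"`.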
@@ -8344,6 +9004,8 @@ var sections26 = [
     seoTitle: "Delete Resolution Endpoint \u2014 Talonic Docs",
     description: "Delete a resolution run and its results. Requires write scope. This action permanently removes all resolution data.",
     content: [
+      { type: "paragraph", text: "Permanently delete a resolution run and all its associated results. This action is irreversible. Use this to clean up failed runs, remove outdated resolutions, or free up storage. The originating job run and its results are not affected." },
+      { type: "callout", variant: "warning", text: "Deletion is permanent. All per-field resolution results associated with this run are removed. The source job run and its extracted data are unaffected." },
       {
         type: "endpoint",
         method: "DELETE",
@@ -8381,8 +9043,11 @@ var sections26 = [
     related: [
       { label: "List Resolutions", slug: "list-resolutions" }
     ],
-    faq: [],
-    mentions: ["delete resolution"]
+    faq: [
+      { question: "Can I delete a resolution that is currently running?", answer: "Yes. The delete operation cancels the running resolution and removes all data. To avoid race conditions, wait for the run to complete or fail before deleting it." },
+      { question: "Does deleting a resolution affect the source job run?", answer: "No. The source job run, its documents, and its extracted values are completely unaffected by resolution deletion." }
+    ],
+    mentions: ["delete resolution", "permanent deletion"]
   }
 ];
@@ -8396,6 +9061,8 @@ var sections27 = [
     description: "List all discovered link keys with their classification category and frequency. Link keys are field-level entity identifiers used for document linking.",
     content: [
       { type: "paragraph", text: "The linking graph connects documents through shared entity values \u2014 an invoice and a contract sharing the same customer ID are linked. The API exposes the bipartite document-entity graph: link keys (field-level entity identifiers), document-level links, the full graph, document-centric subgraphs, classification (identity, transaction, reference), backfill, and document-to-case mapping." },
+      { type: "paragraph", text: "Link keys are the fields that the linking engine uses to discover connections between documents. Each link key has a **classification category** \u2014 `identity` (e.g. vendor ID, customer number), `transaction` (e.g. invoice number, PO number), or `reference` (e.g. contract reference). Use this endpoint to inspect which fields are currently recognized as link keys and how frequently they appear across your documents." },
+      { type: "callout", variant: "info", text: "Link keys are discovered automatically during extraction. If a field is not yet classified, use the **Classify** endpoint to trigger AI-based classification on ambiguous fields." },
       {
         type: "endpoint",
         method: "GET",
@@ -8457,9 +9124,11 @@ var sections27 = [
       { label: "List Cases", slug: "list-cases" }
     ],
     faq: [
-      { question: "What are link keys in Talonic?", answer: "Link keys are field-level entity identifiers (e.g. customer_id, invoice_number) used to connect documents that share the same value." }
+      { question: "What are link keys in Talonic?", answer: "Link keys are field-level entity identifiers (e.g. customer_id, invoice_number) used to connect documents that share the same value." },
+      { question: "What is the difference between identity, transaction, and reference link keys?", answer: "Identity keys represent stable entity identifiers (e.g. vendor ID, tax number). Transaction keys are document-specific identifiers (e.g. invoice number, PO number). Reference keys are cross-references between documents (e.g. contract reference cited in an invoice)." },
+      { question: "How are link keys discovered?", answer: "Link keys are discovered automatically during document extraction. Fields are classified using rule-based heuristics first, then an LLM call for ambiguous cases. You can also trigger classification manually via the Classify endpoint." }
     ],
-    mentions: ["link keys", "bipartite graph", "entity linking"]
+    mentions: ["link keys", "bipartite graph", "entity linking", "field registry", "link key category"]
   },
   {
     slug: "reclassify-link-key",
@@ -8468,6 +9137,8 @@ var sections27 = [
     seoTitle: "Classify Link Keys \u2014 Talonic Docs",
     description: "Classify link keys into categories (identity, transaction, reference) using AI. Runs asynchronously on ambiguous fields.",
     content: [
+      { type: "paragraph", text: "When new fields are extracted, some may not be automatically classified as link keys. The classify endpoint runs AI-powered classification on ambiguous fields to determine whether they are **identity**, **transaction**, or **reference** link keys. This is useful after onboarding new document types or when the field registry grows." },
+      { type: "callout", variant: "info", text: "Classification uses a two-pass approach: rule-based heuristics handle obvious cases (e.g. fields named `invoice_number`), then an LLM call classifies the remaining ambiguous fields. A backfill is automatically triggered when new link keys are identified." },
       {
         type: "endpoint",
         method: "POST",
@@ -8504,10 +9175,14 @@ var sections27 = [
       }
     ],
     related: [
-      { label: "Link Keys", slug: "list-link-keys" }
+      { label: "Link Keys", slug: "list-link-keys" },
+      { label: "Backfill", slug: "list-cases" }
     ],
-    faq: [],
-    mentions: ["AI classification", "link key categories"]
+    faq: [
+      { question: "Does classification run synchronously?", answer: "The endpoint returns immediately with the count of classified fields. If new link keys are found, a backfill is triggered asynchronously to update entity links across all documents." },
+      { question: "Can I reclassify an already-classified link key?", answer: "The classify endpoint targets unclassified or ambiguous fields. Already-classified link keys are not re-evaluated unless their category is null." }
+    ],
+    mentions: ["AI classification", "link key categories", "rule-based classification", "LLM classification"]
   },
   {
     slug: "list-entities",
@@ -8516,6 +9191,8 @@ var sections27 = [
     seoTitle: "Document Links Endpoint \u2014 Talonic Docs",
     description: "Get all entity links for a specific document showing entity values, types, link keys, and linked document IDs.",
     content: [
+      { type: "paragraph", text: "Retrieve all entity links discovered for a specific document. Each link represents a shared field value \u2014 such as a customer ID or PO number \u2014 that connects this document to others in the workspace. Use this endpoint to understand how a document relates to the rest of your corpus." },
+      { type: "callout", variant: "info", text: "The `document_count` field on each entity indicates how many documents share that value. A high count on an identity entity (e.g. a vendor ID appearing in 50+ documents) is expected, while a high count on a transaction entity may indicate a data quality issue." },
       {
         type: "endpoint",
         method: "GET",
@@ -8572,10 +9249,14 @@ var sections27 = [
     ],
     related: [
       { label: "Full Graph", slug: "list-linked-documents" },
-      { label: "Link Keys", slug: "list-link-keys" }
+      { label: "Link Keys", slug: "list-link-keys" },
+      { label: "Document-Case Map", slug: "refresh-cases" }
     ],
-    faq: [],
-    mentions: ["document links"]
+    faq: [
+      { question: "What does document_count represent?", answer: 'The number of documents in your workspace that share the same entity value for that field. For example, if three invoices reference vendor ID "ACME-001", the document_count is 3.' },
+      { question: "Can a document have zero links?", answer: "Yes. For a document whose extracted field values match no other document, the endpoint returns an empty `data` array. Such documents appear as unlinked in the graph." }
+    ],
+    mentions: ["document links", "entity values", "shared fields"]
   },
   {
     slug: "list-linked-documents",
@@ -8584,6 +9265,15 @@ var sections27 = [
     seoTitle: "Linking Graph Endpoint \u2014 Talonic Docs",
     description: "Get the full document linking graph as nodes and edges, or get the graph neighbourhood for a single document with configurable traversal depth.",
     content: [
+      { type: "paragraph", text: "The linking graph is a **bipartite graph** with two node types: documents and entities. Edges connect documents to the entity values they share. This endpoint returns the complete graph for your workspace, including detected cases (groups of documents linked through transaction or reference entities) and entity groups (documents linked only through identity entities)." },
+      { type: "callout", variant: "warning", text: "The full graph endpoint can return large payloads for workspaces with many documents. For targeted exploration, use the document subgraph endpoint with a configurable `depth` parameter instead." },
+      { type: "list", ordered: false, items: [
+        "**Nodes** represent documents (with filename and detected type) or entities (with value and link key category)",
+        "**Edges** connect a document to an entity through a specific field key",
+        "**Cases** are connected components containing transaction or reference entities (2+ documents)",
+        "**Entity groups** are components linked only through identity entities",
+        "**Excluded entities** are high-frequency or owner entities removed from BFS case detection to avoid merging unrelated cases"
+      ] },
       {
         type: "endpoint",
         method: "GET",
@@ -8654,6 +9344,7 @@ var sections27 = [
   }
 }`
       },
+      { type: "paragraph", text: "To explore the graph from a single document outward, use the document subgraph endpoint below. The `depth` parameter controls how many hops to traverse \u2014 each hop alternates between document and entity nodes." },
       {
         type: "endpoint",
         method: "GET",
@@ -8715,8 +9406,11 @@ var sections27 = [
       { label: "Document Links", slug: "list-entities" },
       { label: "Document-Case Map", slug: "refresh-cases" }
     ],
-    faq: [],
-    mentions: ["linking graph", "bipartite graph"]
+    faq: [
+      { question: "What does the depth parameter control?", answer: "Depth controls BFS traversal hops from the starting document. Depth 1 returns only the document and its direct entities. Depth 2 (default) also includes other documents sharing those entities. Higher depths expand the neighbourhood further." },
+      { question: "Why are some entities excluded from case detection?", answer: "High-frequency entities (e.g. a company name appearing on every document) and owner entities are excluded from BFS to prevent merging unrelated document groups into a single oversized case." }
+    ],
+    mentions: ["linking graph", "bipartite graph", "BFS traversal", "document subgraph"]
   },
   {
     slug: "list-cases",
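The `depth` behaviour described in the FAQ above (each hop alternates between document and entity nodes) can be sketched as a breadth-first traversal over a bipartite edge list. This is a minimal illustration, not the service's code: the `{ documentId, entityId }` edge shape and the function name are assumptions.

```javascript
// Hypothetical BFS over a bipartite document-entity graph.
// Hop 1 reaches the start document's entities, hop 2 the documents sharing them.
function documentSubgraph(edges, startDocId, depth = 2) {
  const docToEntities = new Map();
  const entityToDocs = new Map();
  for (const { documentId, entityId } of edges) {
    if (!docToEntities.has(documentId)) docToEntities.set(documentId, []);
    if (!entityToDocs.has(entityId)) entityToDocs.set(entityId, []);
    docToEntities.get(documentId).push(entityId);
    entityToDocs.get(entityId).push(documentId);
  }
  const documents = new Set([startDocId]);
  const entities = new Set();
  let frontier = [startDocId];
  let onDocuments = true; // the frontier alternates between node types
  for (let hop = 0; hop < depth && frontier.length > 0; hop++) {
    const adjacency = onDocuments ? docToEntities : entityToDocs;
    const seen = onDocuments ? entities : documents;
    const next = [];
    for (const id of frontier) {
      for (const neighbour of adjacency.get(id) ?? []) {
        if (!seen.has(neighbour)) {
          seen.add(neighbour);
          next.push(neighbour);
        }
      }
    }
    frontier = next;
    onDocuments = !onDocuments;
  }
  return { documents: [...documents].sort(), entities: [...entities].sort() };
}
```

For a chain where documents A and B share entity e1 and documents B and C share entity e2, depth 1 from A yields A and e1, depth 2 adds B, and depth 4 is needed to reach C.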
@@ -8725,6 +9419,8 @@ var sections27 = [
     seoTitle: "Backfill Linking \u2014 Talonic Docs",
     description: "Trigger a backfill of the linking graph for all documents. Useful after link key configuration changes. Poll progress via the backfill progress endpoint.",
     content: [
+      { type: "paragraph", text: "After changing link key configurations \u2014 such as classifying new fields as link keys or reclassifying existing ones \u2014 the linking graph needs to be rebuilt. The backfill endpoint scans all documents and reconstructs entity links based on the current link key set." },
+      { type: "callout", variant: "warning", text: "Backfill is serialized per organization. If a backfill is already running, the request is silently skipped. Poll the progress endpoint to monitor in-flight backfills." },
       {
         type: "endpoint",
         method: "POST",
@@ -8750,6 +9446,7 @@ var sections27 = [
   "message": "Backfill queued."
 }`
       },
+      { type: "paragraph", text: "Use the progress endpoint to monitor the backfill. The `running` field indicates whether the operation is still in progress, and `processed` / `total` track document-level progress." },
       {
         type: "endpoint",
         method: "GET",
@@ -8792,10 +9489,14 @@ var sections27 = [
       }
     ],
     related: [
-      { label: "Link Keys", slug: "list-link-keys" }
+      { label: "Link Keys", slug: "list-link-keys" },
+      { label: "Classify", slug: "reclassify-link-key" }
     ],
-    faq: [],
-    mentions: ["backfill", "linking graph"]
+    faq: [
+      { question: "How long does a backfill take?", answer: "Duration depends on the number of documents in your workspace. Poll the progress endpoint to track completion. The backfill processes documents in batches." },
+      { question: "Is classification automatically triggered during backfill?", answer: "No. Backfill only rebuilds entity links using the current link key set. To classify new fields as link keys, call the Classify endpoint first, which will trigger its own backfill if new keys are found." }
+    ],
+    mentions: ["backfill", "linking graph", "progress polling"]
   },
   {
     slug: "get-case",
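The progress fields mentioned above (`running`, `processed`, `total`) lend themselves to a small summary helper for a polling client. The field names come from the docs text; the helper and its return shape are hypothetical.

```javascript
// Hypothetical helper: summarise a backfill progress response.
function backfillProgress({ running, processed, total }) {
  // Guard against division by zero when no documents are queued.
  const percent = total > 0 ? Math.round((processed / total) * 100) : 0;
  return { done: !running, percent };
}
```

Because a backfill request is silently skipped while another is in flight, checking `done` before triggering a new backfill avoids a request that would have no effect.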
@@ -8804,6 +9505,12 @@ var sections27 = [
     seoTitle: "List Cases Endpoint \u2014 Talonic Docs",
     description: "List and retrieve cases \u2014 automatically created groups of 2+ related documents linked through shared field values with narrative summaries.",
     content: [
+      { type: "paragraph", text: "Cases are automatically created groups of two or more documents that are connected through shared **transaction** or **reference** entity values. For example, an invoice, a purchase order, and a delivery note sharing the same PO number form a case. Cases provide a high-level view of document relationships without needing to navigate the full graph." },
+      { type: "list", ordered: false, items: [
+        "Each case has a deterministic **case key** (hex hash of its document IDs)",
+        "Cases are created by the linking pipeline during backfill or real-time processing",
+        "Documents linked only through **identity** entities (e.g. vendor ID) appear as entity groups, not cases"
+      ] },
       {
         type: "endpoint",
         method: "GET",
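The distinction drawn above between cases and entity groups can be sketched as a classification of one connected component. This is an illustration under assumptions: the component shape (`documentIds`, `entities` with a `category`) and the `"unlinked"` label for components with fewer than two documents are hypothetical; the category names come from the docs text.

```javascript
// Hypothetical classifier for one connected component of the linking graph.
function classifyComponent(component) {
  if (component.documentIds.length < 2) return "unlinked";
  // A case needs at least one transaction or reference entity.
  const hasLinkingEntity = component.entities.some(
    (e) => e.category === "transaction" || e.category === "reference"
  );
  // Components held together only by identity entities are entity groups.
  return hasLinkingEntity ? "case" : "entity_group";
}
```

Under this sketch, an invoice, a purchase order, and a delivery note sharing a PO number classify as a case, while two documents sharing only a vendor ID classify as an entity group.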
@@ -8855,10 +9562,15 @@ var sections27 = [
       }
     ],
     related: [
-      { label: "Link Keys", slug: "list-link-keys" }
+      { label: "Link Keys", slug: "list-link-keys" },
+      { label: "Case Graph", slug: "get-case-graph" },
+      { label: "Document-Case Map", slug: "refresh-cases" }
     ],
-    faq: [],
-    mentions: ["cases"]
+    faq: [
+      { question: "How are cases different from entity groups?", answer: "Cases require at least one transaction or reference entity linking the documents (e.g. shared PO number). Entity groups are documents linked only through identity entities (e.g. same vendor ID) and do not form cases." },
+      { question: "Can a document belong to multiple cases?", answer: "No. Each document belongs to at most one case. The case key is a deterministic hash of the sorted document IDs in the group." }
+    ],
+    mentions: ["cases", "document groups", "case key"]
   },
   {
     slug: "get-case-graph",
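The FAQ above describes the case key as a deterministic hash of the sorted document IDs. The actual hash function is not stated, so the sketch below uses 32-bit FNV-1a purely as a stand-in to show why sorting makes the key independent of input order. The function name, the `|` separator, and the choice of FNV-1a are all assumptions.

```javascript
// Hypothetical case-key derivation; FNV-1a is a stand-in, not the real hash.
function caseKey(documentIds) {
  // Sorting a copy first makes the key independent of the input order.
  const joined = [...documentIds].sort().join("|");
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (const byte of new TextEncoder().encode(joined)) {
    hash ^= byte;
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep unsigned
  }
  return hash.toString(16).padStart(8, "0");
}
```

The same set of document IDs always yields the same hex string, so a key computed this way changes only when the membership of the group changes.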
@@ -8867,6 +9579,7 @@ var sections27 = [
     seoTitle: "Case Graph \u2014 Talonic Docs",
     description: "Retrieve the D3-compatible graph visualization for a single case, showing document nodes and entity edges within the case boundary.",
     content: [
+      { type: "paragraph", text: "Retrieve the graph structure for a single case, formatted for **D3.js** or similar graph visualization libraries. The response contains only the nodes and edges within the case boundary, making it suitable for rendering focused relationship diagrams." },
       {
         type: "endpoint",
         method: "GET",
@@ -8920,8 +9633,11 @@ var sections27 = [
       { label: "Cases", slug: "get-case" },
       { label: "Full Graph", slug: "list-linked-documents" }
     ],
-    faq: [],
-    mentions: ["case graph", "D3 visualization"]
+    faq: [
+      { question: "What graph format does the case graph use?", answer: "The response uses a nodes-and-edges structure compatible with D3.js force-directed graphs. Node IDs are stable across requests, so you can maintain layout state between refreshes." },
+      { question: "Does the case graph include excluded entities?", answer: "No. The case graph is scoped to the case boundary and only includes entities that contributed to forming the case. High-frequency entities excluded from BFS are not shown." }
+    ],
+    mentions: ["case graph", "D3 visualization", "graph rendering"]
   },
   {
     slug: "refresh-cases",
@@ -8930,6 +9646,8 @@ var sections27 = [
     seoTitle: "Document-Case Map Endpoint \u2014 Talonic Docs",
     description: "Get the mapping of documents to their resolved cases. Returns a mapping of document IDs to assigned case keys.",
     content: [
+      { type: "paragraph", text: "The document-case map provides a flat lookup from document ID to case assignment. Use it to quickly determine which case a document belongs to, or to identify documents that are not part of any case. Documents in **entity groups** (linked only through identity entities) are included with `is_case: false`." },
+      { type: "callout", variant: "info", text: "Documents with `is_case: false` are linked to other documents only through identity entities (e.g. same vendor). They appear in the map but do not form a case. Documents with no links at all are not included in the map." },
       {
         type: "endpoint",
         method: "GET",
@@ -8983,10 +9701,14 @@ var sections27 = [
       }
     ],
     related: [
-      { label: "Full Graph", slug: "list-linked-documents" }
+      { label: "Full Graph", slug: "list-linked-documents" },
+      { label: "Cases", slug: "get-case" }
     ],
-    faq: [],
-    mentions: ["document-case mapping"]
+    faq: [
+      { question: "What does an empty case_key mean?", answer: "An empty string for case_key indicates the document is in an entity group (linked via identity entities only) but not in a case. The is_case field will be false." },
+      { question: "Are unlinked documents included in the map?", answer: "No. Only documents with at least one entity link appear in the map. Completely unlinked documents are omitted." }
+    ],
+    mentions: ["document-case mapping", "case assignment", "entity groups"]
   }
 ];
@@ -9000,6 +9722,15 @@ var sections28 = [
     description: "Get an aggregate N-Shot summary for a run comparing field-level extraction quality across schema versions.",
     content: [
       { type: "paragraph", text: "N-Shot endpoints provide field-level comparisons between job runs \u2014 useful for evaluating extraction quality across schema versions. Submit judge decisions (human or AI) to record which run produced the better result. All routes are nested under `/v1/jobs/runs/{runId}/nshot/...`." },
+      { type: "paragraph", text: "The summary endpoint returns aggregate statistics for all N-Shot comparisons in a run: total comparisons, agreement breakdown (green/yellow/red), override count, and overall agreement rate. Use this to quickly assess whether a schema change improved or degraded extraction quality." },
+      {
+        type: "list",
+        items: [
+          "**Green** \u2014 all shots produced the same value (high confidence)",
+          "**Yellow** \u2014 partial agreement between shots (majority value exists but not unanimous)",
+          "**Red** \u2014 no agreement between shots (each shot produced a different value)"
+        ]
+      },
       {
         type: "endpoint",
         method: "GET",
@@ -9052,8 +9783,12 @@ var sections28 = [
       { label: "Comparisons", slug: "nshot-list-shots" },
       { label: "Judge Decision", slug: "nshot-judge-decision" }
     ],
-    faq: [],
-    mentions: ["N-Shot", "extraction quality"]
+    faq: [
+      { question: "What is a good agreement_rate?", answer: "An agreement rate above 0.90 indicates stable extraction. Rates between 0.75 and 0.90 suggest the schema needs tuning. A rate below 0.75 typically indicates structural issues with the schema or inconsistent source documents." },
+      { question: "How many shots are typically used?", answer: "Three shots is the default and most common configuration. This provides a reliable majority vote while keeping cost manageable." },
+      { question: "Does the summary update as I submit judge decisions?", answer: "Yes. The `overridden` count increments with each accepted judge decision. The agreement breakdown (green/yellow/red) reflects the original shot outcomes and does not change when overrides are applied." }
+    ],
+    mentions: ["N-Shot", "extraction quality", "agreement rate", "field comparison"]
   },
   {
     slug: "nshot-list-shots",
@@ -9062,6 +9797,7 @@ var sections28 = [
     seoTitle: "N-Shot Comparisons Endpoint \u2014 Talonic Docs",
     description: "List per-document field comparisons for N-Shot evaluation across job runs. Returns all comparisons for a specific job run.",
     content: [
+      { type: "paragraph", text: "Retrieve all per-document field comparisons for a job run. Each comparison shows the values produced by each shot, the agreement status (green/yellow/red), the majority value, and any override or judge decision that has been applied. Use this to drill into specific fields and understand where extraction diverges across shots." },
       {
         type: "endpoint",
         method: "GET",
@@ -9135,8 +9871,11 @@ var sections28 = [
       { label: "Single Comparison", slug: "nshot-compare" },
       { label: "Summary", slug: "nshot-summary" }
     ],
-    faq: [],
-    mentions: ["N-Shot comparisons"]
+    faq: [
+      { question: "What comparison methods are available?", answer: "Currently `exact` (string equality after normalization) and `semantic` (embedding-based similarity). The method is chosen automatically based on the field data type." },
+      { question: "How do I find comparisons that need attention?", answer: 'Filter for `status: "red"` comparisons first (no agreement), then `status: "yellow"` (partial agreement). Green comparisons are unanimous and typically need no review.' }
+    ],
+    mentions: ["N-Shot comparisons", "field agreement", "per-document comparison"]
   },
   {
     slug: "nshot-compare",
@@ -9145,6 +9884,7 @@ var sections28 = [
     seoTitle: "N-Shot Single Comparison \u2014 Talonic Docs",
     description: "Get a specific field comparison filtered by document and field name. Returns a single N-Shot comparison for detailed evaluation.",
     content: [
+      { type: "paragraph", text: "Retrieve a single N-Shot comparison for a specific document and field. Use this endpoint when you need detailed information about one particular comparison, including per-shot values, any existing override, and the LLM judge recommendation." },
       {
         type: "endpoint",
         method: "GET",
@@ -9228,8 +9968,11 @@ var sections28 = [
       { label: "Comparisons", slug: "nshot-list-shots" },
       { label: "Override", slug: "nshot-select" }
     ],
-    faq: [],
-    mentions: ["N-Shot comparison"]
+    faq: [
+      { question: "What does a judgement with accepted: null mean?", answer: "The LLM judge has produced a recommendation but no human or API decision has been submitted yet. Use the judge decision endpoint to accept or decline the recommendation." },
+      { question: "Can a comparison have both an override and a judgement?", answer: "Yes. If a judge decision is accepted, an override is automatically created from the recommended shot. A manual override can also coexist with a pending (not-yet-decided) judgement." }
+    ],
+    mentions: ["N-Shot comparison", "single comparison"]
   },
   {
     slug: "nshot-select",
@@ -9238,6 +9981,8 @@ var sections28 = [
     seoTitle: "N-Shot Override Endpoint \u2014 Talonic Docs",
     description: "Manually override the N-Shot selected value for a specific document-field pair. Requires write scope.",
     content: [
+      { type: "paragraph", text: "Manually override the selected value for a specific document-field comparison by choosing a specific shot number. The override is recorded with an audit trail including the actor, timestamp, original value, and new value. Use this when the majority value is incorrect and you want to select a different shot's extraction." },
+      { type: "callout", variant: "info", text: 'The `selected_shot` must be a valid shot number from the comparison\'s `values` array. The override records `actor_id` as `"api"` for all API-initiated overrides.' },
       {
         type: "endpoint",
         method: "POST",
@@ -9325,8 +10070,11 @@ var sections28 = [
       { label: "Single Comparison", slug: "nshot-compare" },
       { label: "Judge Decision", slug: "nshot-judge-decision" }
     ],
-    faq: [],
-    mentions: ["N-Shot override"]
+    faq: [
+      { question: "Can I override the same comparison multiple times?", answer: "Yes. Each override replaces the previous one. The `from_value` in the latest override reflects the value before the most recent change, not the original majority value." },
+      { question: "Does an override change the agreement status?", answer: "No. The `status` (green/yellow/red) reflects the original shot agreement and does not change when an override is applied. Overrides are tracked separately." }
+    ],
+    mentions: ["N-Shot override", "manual override"]
   },
   {
     slug: "nshot-judge-decision",
@@ -9335,6 +10083,8 @@ var sections28 = [
     seoTitle: "N-Shot Judge Decision \u2014 Talonic Docs",
     description: "Submit a judge decision (human or AI) for an N-Shot comparison to record which candidate produced the correct extraction result.",
     content: [
+      { type: "paragraph", text: "Submit a decision to accept or decline the LLM judge's recommendation for a specific comparison. When `accepted` is `true`, the recommended shot value is automatically applied as an override. When `false`, the recommendation is recorded as declined and no override is applied. Use this to efficiently review LLM suggestions at scale." },
+      { type: "callout", variant: "info", text: 'Accepting a judge decision automatically creates an override with `actor_id: "judge"`. You can still manually override the value afterwards using the override endpoint.' },
       {
         type: "endpoint",
         method: "POST",
@@ -9426,8 +10176,11 @@ var sections28 = [
       { label: "Summary", slug: "nshot-summary" },
       { label: "Override", slug: "nshot-select" }
     ],
-    faq: [],
-    mentions: ["judge decision", "N-Shot evaluation"]
+    faq: [
+      { question: "What happens if there is no LLM judge recommendation to accept?", answer: "If the comparison has no `judgement` object (or `recommended_shot` is null), the endpoint returns a 404. Only comparisons with existing LLM judge recommendations can receive decisions." },
+      { question: "Can I change a judge decision after submitting it?", answer: "Yes. Submit a new judge decision with the opposite `accepted` value. If you previously accepted and the override was applied, declining will not remove the override \u2014 use the override endpoint to change it manually." }
+    ],
+    mentions: ["judge decision", "N-Shot evaluation", "LLM judge"]
   }
 ];
@@ -9441,6 +10194,7 @@ var sections29 = [
     description: "List all schema graph classes in the versioned ontology of document classes discovered across your workspace with field counts and version info.",
     content: [
       { type: "paragraph", text: "The schema graph is a versioned ontology of document classes discovered across your workspace. Each class captures a document type's canonical fields. The API exposes versioned classes, diffs proposed between versions (with approve/reject workflow), inter-class edges, aliases, and a D3-compatible visualization payload." },
+      { type: "paragraph", text: "Use this endpoint to retrieve all schema graph classes for your organization. Classes are created automatically as the platform processes documents and discovers recurring field patterns. Each class tracks its version history and links to the field registry." },
       {
         type: "endpoint",
         method: "GET",
@@ -9503,8 +10257,12 @@ var sections29 = [
       { label: "Get Class", slug: "get-schema-graph-class" },
       { label: "List Versions", slug: "list-class-versions" }
     ],
-    faq: [],
-    mentions: ["schema graph", "document classes", "ontology"]
+    faq: [
+      { question: "Are schema graph classes created automatically?", answer: "Yes. Classes are generated automatically as the platform discovers recurring document types and their field patterns during extraction. You do not need to create them manually." },
+      { question: "What does the current_version_id represent?", answer: "It points to the latest approved version of the class. Each time a diff is approved, a new version is created and `current_version_id` is updated to point to it." },
+      { question: "How do schema graph classes relate to user schemas?", answer: "Schema graph classes represent discovered document types in the ontology. User schemas are manually defined output schemas for extraction jobs. The two are linked through the field registry \u2014 both reference the same canonical field definitions." }
+    ],
+    mentions: ["schema graph", "document classes", "ontology", "versioned classes"]
   },
   {
     slug: "get-schema-graph-class",
@@ -9513,6 +10271,7 @@ var sections29 = [
     seoTitle: "Get Schema Graph Class \u2014 Talonic Docs",
     description: "Retrieve a schema graph class by ID with its current field definitions, version number, document count, and links to version history. Requires read scope.",
     content: [
+      { type: "paragraph", text: "Retrieve a single schema graph class by its UUID. The response includes the class metadata, its current active version, and links to the full version history. Use this to inspect a specific document type's canonical field structure." },
       {
         type: "endpoint",
         method: "GET",
@@ -9571,8 +10330,11 @@ var sections29 = [
       { label: "List Classes", slug: "list-schema-graph-classes" },
       { label: "List Versions", slug: "list-class-versions" }
     ],
-    faq: [],
-    mentions: ["schema graph class"]
+    faq: [
+      { question: "Can a class have no current version?", answer: "Yes. A newly discovered class may have `current_version_id` set to `null` until the first version is published via the diff approval workflow." },
+      { question: "What does the document_type_id field link to?", answer: "It links to a `DocumentType` entity in the extraction system. This association connects the schema graph ontology to the document classification pipeline." }
+    ],
+    mentions: ["schema graph class", "class detail"]
   },
   {
     slug: "list-class-versions",
@@ -9581,6 +10343,7 @@ var sections29 = [
     seoTitle: "Schema Graph Class Versions \u2014 Talonic Docs",
     description: "List all published versions of a schema graph class ordered by version number descending with field definitions for each version.",
     content: [
+      { type: "paragraph", text: "Retrieve the complete version history of a schema graph class. Each version captures a snapshot of the class's JSON Schema definition and its associated field registry IDs at the time the version was published. Versions are created when diffs are approved, and version numbers increment monotonically." },
       {
         type: "endpoint",
         method: "GET",
@@ -9641,8 +10404,11 @@ var sections29 = [
       { label: "Get Version", slug: "get-class-version" },
       { label: "List Diffs", slug: "list-schema-graph-diffs" }
     ],
-    faq: [],
-    mentions: ["class versions"]
+    faq: [
+      { question: "How are new versions created?", answer: "New versions are created when a pending diff is approved via `POST /v1/schema-graph/diffs/{id}/approve`. The diff's field changes are applied and a new version snapshot is published." },
+      { question: "Can I roll back to a previous version?", answer: "There is no direct rollback endpoint. To revert changes, create a new diff that reverses the unwanted field modifications and approve it to produce a new version." }
+    ],
+    mentions: ["class versions", "version history", "JSON Schema"]
   },
   {
     slug: "get-class-version",
@@ -9651,6 +10417,7 @@ var sections29 = [
     seoTitle: "Get Schema Graph Class Version \u2014 Talonic Docs",
     description: "Retrieve a specific version of a schema graph class by class ID and version number. Requires read scope.",
     content: [
+      { type: "paragraph", text: "Retrieve a specific version of a schema graph class by its class ID and version number. Use this to inspect the exact JSON Schema definition and field composition that was active at a particular point in the class's evolution." },
       {
         type: "endpoint",
         method: "GET",
@@ -9704,8 +10471,11 @@ var sections29 = [
     related: [
       { label: "List Versions", slug: "list-class-versions" }
     ],
-    faq: [],
-    mentions: ["class version"]
+    faq: [
+      { question: "What is the version number path parameter?", answer: "It is the integer version number (e.g. `1`, `2`, `3`), not the version UUID. Use the list versions endpoint to discover available version numbers." },
+      { question: "Does the json_schema field contain a valid JSON Schema?", answer: "Yes. It is a standard JSON Schema object with `type`, `properties`, and optionally a `required` array. You can use it directly for validation or code generation." }
+    ],
+    mentions: ["class version", "specific version"]
   },
   {
     slug: "list-schema-graph-diffs",
@@ -9714,6 +10484,8 @@ var sections29 = [
     seoTitle: "Schema Graph Diffs \u2014 Talonic Docs",
     description: "List pending, approved, and rejected diffs between schema graph class versions. Shows proposed changes for the approve/reject workflow.",
     content: [
+      { type: "paragraph", text: "Diffs represent proposed changes between schema graph class versions. When the platform discovers new fields or detects field type changes, it creates a diff that can be reviewed and either approved (promoting the changes to a new version) or rejected (discarding them). This endpoint returns all diffs, optionally filtered by class or review status." },
+      { type: "callout", variant: "info", text: "Diffs are classified as `additive` (new fields only) or `breaking` (field removals or type changes). Breaking diffs may affect downstream extraction jobs that depend on the removed fields." },
       {
         type: "endpoint",
         method: "GET",
@@ -9785,8 +10557,11 @@ var sections29 = [
       { label: "Approve Diff", slug: "approve-diff" },
       { label: "Reject Diff", slug: "reject-diff" }
     ],
-    faq: [],
-    mentions: ["schema diffs", "version comparison"]
+    faq: [
+      { question: "How are diffs generated?", answer: "Diffs are generated automatically when the platform detects field changes during extraction. When new documents introduce fields not present in the current class version, a diff is created and set to `pending` for review." },
+      { question: "What happens to pending diffs if I approve a newer diff first?", answer: "Pending diffs reference specific version numbers. If the class advances past a pending diff's `to_version`, the diff becomes stale. Review and reject stale diffs to keep the queue clean." }
+    ],
+    mentions: ["schema diffs", "version comparison", "additive", "breaking"]
   },
   {
     slug: "approve-diff",
@@ -9795,6 +10570,8 @@ var sections29 = [
     seoTitle: "Approve Schema Graph Diff \u2014 Talonic Docs",
     description: "Approve a pending diff to promote proposed changes to the next live class version. Requires write scope.",
     content: [
+      { type: "paragraph", text: "Approve a pending diff to promote its proposed field changes into a new class version. Approving a diff updates the class's `current_version_id` and publishes a new version snapshot with the diff's added fields, removed fields, and type changes applied." },
+      { type: "callout", variant: "warning", text: "Approving a `breaking` diff (one that removes fields or changes types) may affect downstream extraction jobs. Review the `removed_fields` and `type_changes` arrays before approving." },
       {
         type: "endpoint",
         method: "POST",
@@ -9835,8 +10612,11 @@ var sections29 = [
       { label: "List Diffs", slug: "list-schema-graph-diffs" },
       { label: "Reject Diff", slug: "reject-diff" }
     ],
-    faq: [],
-    mentions: ["approve diff"]
+    faq: [
+      { question: "Can I approve a diff that is not in pending status?", answer: 'No. Only diffs with `review_status: "pending"` can be approved. Already approved or rejected diffs return a 404 or are ignored.' },
+      { question: "Does approving a diff immediately update the class version?", answer: "Yes. Approval is synchronous. The new version is created and `current_version_id` is updated in the same request." }
+    ],
+    mentions: ["approve diff", "version promotion"]
   },
   {
     slug: "reject-diff",
@@ -9845,6 +10625,7 @@ var sections29 = [
     seoTitle: "Reject Schema Graph Diff \u2014 Talonic Docs",
     description: "Reject a pending diff to discard proposed changes to a schema graph class version. Requires write scope.",
     content: [
+      { type: "paragraph", text: "Reject a pending diff to discard its proposed field changes. The class version remains unchanged, and the diff is marked as `rejected` for audit purposes. Rejected diffs are retained in the history and can be reviewed later but cannot be re-approved." },
       {
         type: "endpoint",
         method: "POST",
@@ -9885,8 +10666,11 @@ var sections29 = [
       { label: "List Diffs", slug: "list-schema-graph-diffs" },
       { label: "Approve Diff", slug: "approve-diff" }
     ],
-    faq: [],
-    mentions: ["reject diff"]
+    faq: [
+      { question: "Can I re-approve a rejected diff?", answer: "No. Once rejected, a diff cannot be re-approved. If the same field changes are needed later, a new diff will be generated automatically during the next extraction cycle." },
+      { question: "Does rejecting a diff affect the current class version?", answer: "No. The class version remains unchanged. Rejection only marks the proposed changes as discarded." }
+    ],
+    mentions: ["reject diff", "discard changes"]
   },
   {
     slug: "list-schema-graph-edges",
@@ -9895,6 +10679,8 @@ var sections29 = [
     seoTitle: "Schema Graph Edges \u2014 Talonic Docs",
     description: "List inter-class edges (relationships) between schema graph classes with relationship type and weight information.",
     content: [
+      { type: "paragraph", text: "Edges represent relationships between schema graph classes, computed using cosine similarity between field embeddings. Each edge captures a directional relationship (e.g. an Invoice class `references` a Purchase Order class) with a weight indicating the strength of the relationship. Use edges to understand how document types relate to each other across your workspace." },
+      { type: "callout", variant: "info", text: "Edge weights range from 0 to 1. Higher weights indicate stronger field overlap between classes. Edges below a minimum weight threshold are automatically pruned and will not appear in results." },
       {
         type: "endpoint",
         method: "GET",
@@ -9955,8 +10741,11 @@ var sections29 = [
       { label: "List Classes", slug: "list-schema-graph-classes" },
       { label: "Visualize", slug: "visualize-schema-graph" }
     ],
-    faq: [],
-    mentions: ["schema graph edges", "relationships"]
+    faq: [
+      { question: "How are edge weights computed?", answer: "Edge weights are derived from cosine similarity between field embeddings of the two classes. A weight of 0.87 indicates that the two classes' field embeddings are highly similar; treat it as a similarity score rather than an exact percentage of shared fields." },
+      { question: "Are edges directional?", answer: "Yes. An edge from class A to class B means A references B. The reverse relationship may also exist as a separate edge with a different weight." }
+    ],
+    mentions: ["schema graph edges", "relationships", "cosine similarity", "field overlap"]
   },
   {
     slug: "list-schema-graph-aliases",
@@ -9965,6 +10754,7 @@ var sections29 = [
     seoTitle: "Schema Graph Aliases \u2014 Talonic Docs",
     description: "List all class aliases \u2014 alternative names that map to canonical class IDs in the schema graph ontology.",
     content: [
+      { type: "paragraph", text: 'Aliases map alternative names to canonical schema graph classes. For example, "Bill" might be an alias for the "Invoice" class. The platform uses aliases during document classification to resolve variant names to their canonical class. Use this endpoint to audit or inspect the alias mappings for your organization.' },
       {
         type: "endpoint",
         method: "GET",
@@ -10014,8 +10804,11 @@ var sections29 = [
     related: [
       { label: "List Classes", slug: "list-schema-graph-classes" }
     ],
-    faq: [],
-    mentions: ["class aliases"]
+    faq: [
+      { question: "Are aliases case-sensitive?", answer: 'No. Alias matching during classification is case-insensitive. "bill", "Bill", and "BILL" all resolve to the same canonical class.' },
+      { question: "How are aliases created?", answer: "Aliases are generated automatically when the platform encounters documents with variant type labels that resolve to the same canonical class during extraction." }
+    ],
+    mentions: ["class aliases", "alternative names", "name mapping"]
   },
   {
     slug: "visualize-schema-graph",
@@ -10024,6 +10817,7 @@ var sections29 = [
     seoTitle: "Visualize Schema Graph \u2014 Talonic Docs",
     description: "Get D3-compatible visualization data for the schema graph with nodes and edges formatted for graph rendering.",
     content: [
+      { type: "paragraph", text: "Retrieve the entire schema graph as a D3-compatible payload with nodes (classes) and edges (relationships). The response is structured for direct consumption by graph visualization libraries such as D3.js, Cytoscape, or vis.js. Nodes include full class metadata; edges use `source` and `target` fields referencing node IDs." },
       {
         type: "endpoint",
         method: "GET",
@@ -10094,8 +10888,11 @@ var sections29 = [
       { label: "Edges", slug: "list-schema-graph-edges" },
       { label: "List Classes", slug: "list-schema-graph-classes" }
     ],
-    faq: [],
-    mentions: ["D3 visualization", "graph rendering"]
+    faq: [
+      { question: "Can I use this response directly with D3.js force-directed graphs?", answer: "Yes. The `nodes` and `edges` arrays are structured for direct use with D3 force simulations. Edge `source` and `target` fields reference node `id` values." },
+      { question: "Does the visualization endpoint include archived classes?", answer: "Yes. All classes are returned regardless of status. Filter by `status` on the client side if you want to exclude archived nodes from the visualization." }
+    ],
+    mentions: ["D3 visualization", "graph rendering", "force-directed graph"]
   }
 ];
@@ -10109,6 +10906,16 @@ var sections30 = [
     description: "List validation checks in the structuring pipeline. Checks define rules like field_format, value_range, cross_field, and ai_coherence.",
     content: [
       { type: "paragraph", text: "The structuring pipeline validates extracted data through configurable checks and approval gates. Checks define validation rules; gates aggregate checks and determine whether records require manual approval before delivery. Also exposes per-result check outcomes, the pending-approvals queue, approve/reject actions, and the manual delivery trigger for an approved run." },
+      { type: "paragraph", text: "Use this endpoint to retrieve all validation checks configured for your organization. Checks are evaluated against every structuring result to flag data quality issues before delivery. You can scope checks to a specific schema to apply different validation logic per document type." },
+      {
+        type: "list",
+        items: [
+          "**field_format** \u2014 validates that a field value matches an expected pattern (e.g. date format, currency code)",
+          "**value_range** \u2014 ensures numeric or date values fall within defined bounds",
+          "**cross_field** \u2014 validates relationships between two or more fields (e.g. end_date > start_date)",
+          "**ai_coherence** \u2014 uses an LLM to assess whether extracted values are semantically plausible"
+        ]
+      },
       {
         type: "endpoint",
         method: "GET",
@@ -10183,8 +10990,12 @@ var sections30 = [
       { label: "Create Check", slug: "create-structuring-check" },
       { label: "List Gates", slug: "list-structuring-gates" }
     ],
-    faq: [],
-    mentions: ["structuring checks", "validation"]
+    faq: [
+      { question: "What is the difference between warning and error severity?", answer: "A `warning` check flags an issue but does not block delivery. An `error` check causes the result to fail the gate and require manual approval before it can be delivered." },
+      { question: "Can I create checks that apply to all schemas?", answer: "Yes. Omit the `user_schema_id` field when creating a check and it will apply globally to all structuring results regardless of schema." },
+      { question: "How are checks ordered during evaluation?", answer: "Checks are evaluated in ascending `sort_order`, then by `created_at`. You can control evaluation priority by setting `sort_order` on each check." }
+    ],
+    mentions: ["structuring checks", "validation", "field_format", "value_range", "cross_field", "ai_coherence"]
   },
   {
     slug: "create-structuring-check",
@@ -10193,6 +11004,8 @@ var sections30 = [
     seoTitle: "Create Structuring Check \u2014 Talonic Docs",
     description: "Create a validation check for the structuring pipeline. Supports field_format, value_range, cross_field, and ai_coherence check types.",
     content: [
+      { type: "paragraph", text: "Create a new validation check to enforce data quality rules on structuring results. Each check targets a specific validation type and can be scoped to a single schema or applied globally. Newly created checks are automatically active and will run against all future structuring results." },
+      { type: "callout", variant: "info", text: "The `config` object shape depends on the check `type`. For `value_range`, provide `field`, `min`, and `max`. For `field_format`, provide `field` and `pattern`. For `cross_field`, provide `fields` and a `rule` expression. For `ai_coherence`, no config is required." },
       {
         type: "endpoint",
         method: "POST",
@@ -10281,8 +11094,11 @@ var sections30 = [
       { label: "List Checks", slug: "list-structuring-checks" },
       { label: "Get Check", slug: "get-structuring-check" }
     ],
-    faq: [],
-    mentions: ["validation check", "value_range"]
+    faq: [
+      { question: "Can I create a check without a config object?", answer: "Yes, for `ai_coherence` checks. For `value_range`, `field_format`, and `cross_field` checks, the `config` object defines the validation logic and is effectively required." },
+      { question: "What happens if I use a master-view API key?", answer: "Create operations require a customer-scoped API key. Using a master-view key returns a `400 bad_request` error because the system cannot determine which organization to associate the check with." }
+    ],
+    mentions: ["validation check", "value_range", "create check", "field_format"]
   },
   {
     slug: "get-structuring-check",
@@ -10291,6 +11107,8 @@ var sections30 = [
     seoTitle: "Manage Structuring Check \u2014 Talonic Docs",
     description: "Get, update, or delete a structuring check. Same path supports GET (detail), PUT (update), and DELETE operations with appropriate scopes.",
     content: [
+      { type: "paragraph", text: "Retrieve, update, or remove a validation check by its UUID. Use **GET** to inspect a check's current configuration, **PUT** to modify its name, severity, config, or active status, and **DELETE** to soft-delete it. A soft-deleted check has `is_active` set to `false` and stops running against new results, but its historical check outcomes remain intact." },
+      { type: "callout", variant: "warning", text: "DELETE is a soft-delete. The check is deactivated (`is_active = false`) rather than permanently removed. Existing result check outcomes referencing this check are preserved." },
       {
         type: "endpoint",
         method: "GET",
@@ -10360,8 +11178,11 @@ var sections30 = [
       { label: "List Checks", slug: "list-structuring-checks" },
       { label: "Create Check", slug: "create-structuring-check" }
     ],
-    faq: [],
-    mentions: ["manage check"]
+    faq: [
+      { question: "Can I reactivate a deleted check?", answer: "Yes. Since DELETE is a soft-delete, you can use PUT on the same check ID to set `is_active` back to `true`." },
+      { question: "Does updating a check re-evaluate existing results?", answer: "No. Updates only affect future structuring results. Historical check outcomes are immutable." }
+    ],
+    mentions: ["manage check", "soft-delete", "update check"]
   },
   {
     slug: "list-structuring-gates",
@@ -10370,6 +11191,8 @@ var sections30 = [
     seoTitle: "List Structuring Gates \u2014 Talonic Docs",
     description: "List approval gates that aggregate validation checks and control whether records require manual approval before delivery.",
     content: [
+      { type: "paragraph", text: "Approval gates sit between extraction and delivery. Each gate aggregates one or more rules (e.g. minimum confidence threshold, validation pass rate) and decides whether structuring results are auto-approved for delivery or queued for human review. Gates can be scoped to a specific schema and optionally linked to a delivery destination." },
+      { type: "callout", variant: "info", text: "Gates without any rules will auto-approve all results. Add at least one rule to enforce quality thresholds before delivery." },
       {
         type: "endpoint",
         method: "GET",
@@ -10458,8 +11281,11 @@ var sections30 = [
       { label: "Create Gate", slug: "create-structuring-gate" },
       { label: "Gate Rules", slug: "gate-rules" }
     ],
-    faq: [],
-    mentions: ["approval gates"]
+    faq: [
+      { question: "How does auto_approve_after_hours work?", answer: "When set, results that have been pending for longer than the specified number of hours are automatically approved without manual intervention. This prevents bottlenecks when reviewers are unavailable." },
+      { question: "Can I link a gate to a delivery destination?", answer: "Yes. Set `destination_id` when creating or updating a gate to route approved results directly to a specific delivery destination (webhook, S3, SFTP, etc.)." }
+    ],
+    mentions: ["approval gates", "delivery gate", "auto-approve"]
   },
   {
     slug: "create-structuring-gate",
@@ -10468,6 +11294,8 @@ var sections30 = [
     seoTitle: "Create Structuring Gate \u2014 Talonic Docs",
     description: "Create an approval gate with optional schema scope. Gates aggregate validation checks and control approval workflows.",
     content: [
+      { type: "paragraph", text: "Create an approval gate to control the flow of structuring results to delivery. A gate starts with no rules \u2014 add rules via the gate rules endpoint to define quality thresholds. The `on_approve` and `on_flag` fields control what happens when results pass or fail the gate's rules." },
+      { type: "callout", variant: "warning", text: "A newly created gate has an empty rules array. Results will auto-approve until you add at least one rule via `POST /v1/structuring/gates/{id}/rules`." },
       {
         type: "endpoint",
         method: "POST",
@@ -10545,8 +11373,11 @@ var sections30 = [
       { label: "List Gates", slug: "list-structuring-gates" },
       { label: "Gate Rules", slug: "gate-rules" }
     ],
-    faq: [],
-    mentions: ["create gate"]
+    faq: [
+      { question: "What is the typical workflow after creating a gate?", answer: "Create the gate, then add rules via `POST /v1/structuring/gates/{id}/rules` to define thresholds like minimum confidence or validation pass rate. Results that fail any rule are queued for manual approval." },
+      { question: "Can I create multiple gates for the same schema?", answer: "Yes. Multiple gates can target the same `user_schema_id`. Each gate evaluates independently, and a result must pass all applicable gates to be auto-approved." }
+    ],
+    mentions: ["create gate", "approval workflow"]
   },
   {
     slug: "get-structuring-gate",
@@ -10555,6 +11386,7 @@ var sections30 = [
     seoTitle: "Manage Structuring Gate \u2014 Talonic Docs",
     description: "Get, update, or delete an approval gate. Same path supports GET (detail with rules), PUT (update), and DELETE operations.",
     content: [
+      { type: "paragraph", text: "Retrieve, update, or remove an approval gate by its UUID. **GET** returns the gate with its active rules embedded. **PUT** updates gate properties (same body shape as create). **DELETE** soft-deletes the gate by setting `is_active` to `false`." },
       {
         type: "endpoint",
         method: "GET",
@@ -10633,8 +11465,11 @@ var sections30 = [
       { label: "List Gates", slug: "list-structuring-gates" },
       { label: "Gate Rules", slug: "gate-rules" }
     ],
-    faq: [],
-    mentions: ["manage gate"]
+    faq: [
+      { question: "Does deleting a gate affect pending approvals?", answer: "No. Pending approval items that were queued by this gate remain in the queue. They can still be approved or rejected manually. The gate simply stops evaluating new results." },
+      { question: "Are rules returned on PUT responses?", answer: "No. The `rules` array is only populated on GET responses. After a PUT update, re-fetch with GET to see the current rules." }
+    ],
+    mentions: ["manage gate", "soft-delete gate"]
   },
   {
     slug: "gate-rules",
@@ -10643,6 +11478,15 @@ var sections30 = [
     seoTitle: "Structuring Gate Rules \u2014 Talonic Docs",
     description: "Add or remove rules from an approval gate. Rules define thresholds like min_confidence, validation_pass_rate, and field_coverage.",
     content: [
+      { type: "paragraph", text: "Gate rules define the quality thresholds that structuring results must meet to be auto-approved. Each rule has a type and a configuration that specifies the threshold. Rules are evaluated in `sort_order` \u2014 if any rule fails, the result is flagged and queued for manual approval." },
+      {
+        type: "list",
+        items: [
+          "**min_confidence** \u2014 requires the row-level confidence score to exceed a threshold (e.g. 0.85)",
+          "**validation_pass_rate** \u2014 requires a minimum percentage of validation checks to pass",
+          "**field_coverage** \u2014 requires a minimum percentage of schema fields to have non-null values"
+        ]
+      },
       {
         type: "endpoint",
         method: "POST",
@@ -10746,8 +11590,11 @@ var sections30 = [
       { label: "Create Gate", slug: "create-structuring-gate" },
       { label: "Pending Approvals", slug: "pending-approvals" }
     ],
-    faq: [],
-    mentions: ["gate rules", "min_confidence", "threshold"]
+    faq: [
+      { question: "Can I add multiple rules of the same type to a gate?", answer: "Yes. For example, you could add two `min_confidence` rules with different thresholds for different severity levels, though in practice a single rule per type is typical." },
+      { question: "What happens when I remove all rules from a gate?", answer: "The gate will auto-approve all results since there are no thresholds to fail against. This is equivalent to disabling the gate without deleting it." }
+    ],
+    mentions: ["gate rules", "min_confidence", "threshold", "validation_pass_rate", "field_coverage"]
   },
   {
     slug: "result-checks",
@@ -10756,6 +11603,7 @@ var sections30 = [
     seoTitle: "Structuring Result Checks \u2014 Talonic Docs",
     description: "Get validation check outcomes for a specific structuring result showing check name, pass/fail status, and messages.",
     content: [
+      { type: "paragraph", text: "Retrieve the validation check outcomes for a specific structuring result. Each outcome records whether a configured check passed or failed for that result, along with the check's name and severity. Use this to understand why a result was flagged or to build audit trails for data quality." },
       {
         type: "endpoint",
         method: "GET",
@@ -10816,8 +11664,11 @@ var sections30 = [
       { label: "List Checks", slug: "list-structuring-checks" },
       { label: "Pending Approvals", slug: "pending-approvals" }
     ],
-    faq: [],
-    mentions: ["result checks", "validation outcomes"]
+    faq: [
+      { question: "Are check outcomes generated automatically?", answer: "Yes. Check outcomes are computed automatically when a structuring result is produced. You do not need to trigger evaluation manually." },
+      { question: "What does the details object contain for a failed check?", answer: "The `details` object contains type-specific failure information. For `value_range`, it includes the field name, actual value, and the configured min/max bounds. For `ai_coherence`, it includes the LLM reasoning." }
+    ],
+    mentions: ["result checks", "validation outcomes", "check results"]
   },
   {
     slug: "pending-approvals",
@@ -10826,6 +11677,8 @@ var sections30 = [
     seoTitle: "Pending Approvals \u2014 Talonic Docs",
     description: "List structuring results awaiting manual approval. These are results that did not pass all gate rules automatically.",
     content: [
+      { type: "paragraph", text: "Retrieve the queue of structuring results that failed one or more gate rules and require manual review. Each item in the response represents a failed check outcome, linking a structuring result to the check that flagged it. Use this endpoint to build approval workflows or monitor data quality issues." },
+      { type: "callout", variant: "info", text: "The pending approvals endpoint returns up to 100 items per call. If you have a high volume of flagged results, implement polling or use the `auto_approve_after_hours` gate setting to prevent queue buildup." },
       {
         type: "endpoint",
         method: "GET",
@@ -10882,8 +11735,11 @@ var sections30 = [
       { label: "Approve / Reject", slug: "approve-reject-result" },
       { label: "Gate Rules", slug: "gate-rules" }
     ],
-    faq: [],
-    mentions: ["pending approvals"]
+    faq: [
+      { question: "Can a single result appear multiple times in pending approvals?", answer: "Yes. If a result fails multiple checks, each failed check outcome appears as a separate item in the pending approvals list. Approving the result clears all pending items for that result." },
+      { question: "How do I clear the pending approvals queue?", answer: "Approve or reject each pending result via `POST /v1/structuring/approvals/{id}/approve` or `/reject`. Alternatively, configure `auto_approve_after_hours` on the gate to auto-clear items after a timeout." }
+    ],
+    mentions: ["pending approvals", "approval queue", "manual review"]
   },
   {
     slug: "approve-reject-result",
@@ -10892,6 +11748,8 @@ var sections30 = [
     seoTitle: "Approve or Reject Structuring Result \u2014 Talonic Docs",
     description: "Approve or reject a structuring result. POST /approve approves the result; POST /reject rejects it. Both return the updated status.",
     content: [
+      { type: "paragraph", text: "Submit an approval or rejection decision for a structuring result that is pending manual review. Approving a result triggers the gate's `on_approve` action (typically delivery). Rejecting it removes the result from the approval queue without triggering delivery. Both actions require the `gate_id` to record which gate the decision applies to." },
+      { type: "callout", variant: "warning", text: "The `gate_id` parameter is required. Each decision is recorded against a specific gate, allowing multiple gates to independently control the same result." },
       {
         type: "endpoint",
         method: "POST",
@@ -10944,8 +11802,11 @@ var sections30 = [
       { label: "Pending Approvals", slug: "pending-approvals" },
       { label: "Trigger Delivery", slug: "trigger-delivery" }
     ],
-    faq: [],
-    mentions: ["approve", "reject", "structuring result"]
+    faq: [
+      { question: "What happens after I approve a result?", answer: "The gate's `on_approve` action fires. If set to `export`, a delivery signal is emitted for the result. Use the trigger delivery endpoint to manually control when delivery occurs for an entire run." },
+      { question: "Can I approve a result that was already rejected?", answer: "Yes. Approval and rejection decisions are additive records. A subsequent approval overrides a prior rejection for the same gate." }
+    ],
+    mentions: ["approve", "reject", "structuring result", "approval decision"]
   },
   {
     slug: "trigger-delivery",
@@ -10954,6 +11815,8 @@ var sections30 = [
     seoTitle: "Trigger Delivery \u2014 Talonic Docs",
     description: "Trigger delivery for a structuring run by emitting delivery signals for all approved results. Returns delivered and skipped counts.",
     content: [
+      { type: "paragraph", text: "Manually trigger delivery for an entire structuring run. This emits delivery signals for all approved results in the run, routing them to configured delivery destinations (webhooks, S3, SFTP, etc.). Results that have not been approved are skipped. Use this after batch-approving results or when you want explicit control over when data leaves the platform." },
+      { type: "callout", variant: "warning", text: "This endpoint triggers delivery for all approved results in the run. There is no undo. Ensure all results have been reviewed before calling this endpoint in production workflows." },
       {
         type: "endpoint",
         method: "POST",
@@ -10995,8 +11858,11 @@ var sections30 = [
     related: [
       { label: "Approve / Reject", slug: "approve-reject-result" }
     ],
-    faq: [],
-    mentions: ["trigger delivery", "delivery signals"]
+    faq: [
+      { question: "What happens to unapproved results when I trigger delivery?", answer: "Unapproved results are silently skipped. Only results with an `approved` status are included in the delivery signals." },
+      { question: "Can I trigger delivery multiple times for the same run?", answer: "Yes. Delivery is idempotent per result \u2014 each result generates a deterministic idempotency key, so duplicate signals are deduplicated by the delivery pipeline." }
+    ],
+    mentions: ["trigger delivery", "delivery signals", "export"]
   }
 ];
@@ -11010,6 +11876,16 @@ var sections31 = [
     description: "Get aggregate structuring metrics for a schema across all runs including capture hit rate, synthesize rate, strategy distribution, and tier funnel.",
     content: [
       { type: "paragraph", text: "Telemetry endpoints aggregate structuring metrics (capture hit rate, synthesize rate, strategy distribution, tier funnel) per schema or per run." },
+      { type: "paragraph", text: "The schema summary returns metrics from the **latest run** for a given schema. Use it to understand how effectively the pipeline fills cells using the field registry versus LLM synthesis, and how fields distribute across resolution tiers." },
+      {
+        type: "list",
+        items: [
+          "**capture_hit_rate** \u2014 Fraction of cells filled from the field registry without LLM calls. Higher is more cost-efficient.",
+          "**synthesize_rate** \u2014 Fraction of cells that required LLM synthesis (Phase 2 agent extraction).",
+          "**strategy_distribution** \u2014 Breakdown by strategy: `transfer`, `extract`, `compute`, `skip`.",
+          "**tier_funnel** \u2014 How cells resolved across registry tiers: `tier1` (core), `tier2` (established), `tier3` (emerging), `unresolved`."
+        ]
+      },
       {
         type: "endpoint",
         method: "GET",
@@ -11068,8 +11944,12 @@ var sections31 = [
       { label: "Schema Trend", slug: "schema-telemetry-trend" },
       { label: "Schema Fields", slug: "schema-telemetry-fields" }
     ],
-    faq: [],
-    mentions: ["telemetry", "capture hit rate", "synthesize rate"]
+    faq: [
+      { question: "What is the difference between capture_hit_rate and synthesize_rate?", answer: "Capture hit rate measures cells filled deterministically from the field registry (no LLM cost). Synthesize rate measures cells that required an LLM call to fill. The two rates plus the fraction of skipped cells sum to approximately 1.0." },
+      { question: "Which run does the summary reflect?", answer: "The summary always reflects the most recent completed run for the schema. To see metrics from a specific run, use the Run Summary endpoint instead." },
+      { question: "What do the strategy_distribution values mean?", answer: "`transfer` means the value was copied from the field registry, `extract` means LLM extraction was used, `compute` means a deterministic formula produced the value, and `skip` means the field was intentionally left empty." }
+    ],
+    mentions: ["telemetry", "capture hit rate", "synthesize rate", "strategy distribution", "tier funnel"]
   },
   {
     slug: "schema-telemetry-trend",
@@ -11078,6 +11958,8 @@ var sections31 = [
     seoTitle: "Schema Telemetry Trend \u2014 Talonic Docs",
     description: "Get metric trends over time for a schema. Returns time-series telemetry data across recent runs for tracking quality changes.",
     content: [
+      { type: "paragraph", text: "Track how structuring metrics evolve over successive runs for a schema. This endpoint returns a **time-series** of telemetry snapshots, allowing you to detect quality improvements, regressions, or shifts in strategy distribution as your field registry matures." },
+      { type: "callout", variant: "info", text: "A rising `capture_hit_rate` over time indicates the field registry is learning from extractions and resolving more fields deterministically, reducing LLM costs." },
       {
         type: "endpoint",
         method: "GET",
@@ -11167,8 +12049,11 @@ var sections31 = [
       { label: "Schema Summary", slug: "schema-telemetry-summary" },
       { label: "Run Summary", slug: "run-telemetry-summary" }
     ],
-    faq: [],
-    mentions: ["telemetry trend", "time-series"]
+    faq: [
+      { question: "How many runs does the trend include by default?", answer: "The default window is 10 runs. Use the `window` query parameter to request up to 50 recent runs." },
+      { question: "What does a decreasing synthesize_rate indicate?", answer: "A decreasing synthesize rate means more fields are being resolved from the registry without LLM calls. This is the expected trajectory as the field registry accumulates data from successive extractions." }
+    ],
+    mentions: ["telemetry trend", "time-series", "quality tracking"]
   },
   {
     slug: "schema-telemetry-fields",
@@ -11177,6 +12062,8 @@ var sections31 = [
     seoTitle: "Schema Field Telemetry \u2014 Talonic Docs",
     description: "Get per-field structuring metrics for a schema including field-level state distribution, capture rates, and strategy breakdown.",
     content: [
+      { type: "paragraph", text: "Drill down to **individual field performance** within a schema. This endpoint returns per-field capture rates, synthesis rates, the most common strategy used, and the distribution of cell states (filled, empty, skipped). Use it to identify underperforming fields that may need instruction tuning or manual review." },
+      { type: "callout", variant: "info", text: "Fields with a high `synthesize_rate` and low `capture_rate` are candidates for field registry enrichment or instruction refinement to reduce LLM dependency." },
       {
         type: "endpoint",
         method: "GET",
@@ -11240,10 +12127,14 @@ var sections31 = [
       }
     ],
     related: [
-      { label: "Schema Summary", slug: "schema-telemetry-summary" }
+      { label: "Schema Summary", slug: "schema-telemetry-summary" },
+      { label: "Schema Trend", slug: "schema-telemetry-trend" }
     ],
-    faq: [],
-    mentions: ["field telemetry", "capture rates"]
+    faq: [
+      { question: "What does a high empty count in state_distribution mean?", answer: "A high empty count means the field could not be extracted from many documents. This may indicate the field is not present in those document types, or the extraction instructions need refinement." },
+      { question: "How is the strategy field determined?", answer: "The `strategy` field shows the most frequently used resolution strategy for that field across all documents in the latest run. Possible values are `transfer`, `extract`, `compute`, and `skip`." }
+    ],
+    mentions: ["field telemetry", "capture rates", "per-field metrics", "state distribution"]
   },
   {
     slug: "run-telemetry-summary",
@@ -11252,6 +12143,8 @@ var sections31 = [
     seoTitle: "Run Telemetry Summary \u2014 Talonic Docs",
     description: "Get aggregate structuring metrics for a single job run including strategy distribution, tier funnel, and capture hit rate.",
     content: [
+      { type: "paragraph", text: "Retrieve structuring telemetry for a **specific job run** rather than the latest run for a schema. Use this when you need to inspect the performance of a particular execution, compare two runs side by side, or debug a run that produced unexpected results." },
+      { type: "callout", variant: "info", text: "The response shape is identical to the Schema Summary endpoint. The only difference is that this endpoint targets a specific run by ID instead of returning the latest run for a schema." },
       {
         type: "endpoint",
         method: "GET",
@@ -11310,8 +12203,11 @@ var sections31 = [
       { label: "Schema Summary", slug: "schema-telemetry-summary" },
       { label: "Schema Trend", slug: "schema-telemetry-trend" }
     ],
-    faq: [],
-    mentions: ["run telemetry"]
+    faq: [
+      { question: "How do I compare two runs?", answer: "Call this endpoint twice with different run IDs and compare the `capture_hit_rate`, `synthesize_rate`, and `strategy_distribution` values to identify improvements or regressions." },
+      { question: "Can I get run telemetry for a run that is still in progress?", answer: "No. Telemetry is computed after a run completes. Runs with status `pending` or `running` will return a 404 until they finish." }
+    ],
+    mentions: ["run telemetry", "run comparison"]
   }
 ];
@@ -11325,6 +12221,8 @@ var sections32 = [
     description: "List ground-truth datasets used for measuring extraction accuracy. Each dataset contains manually verified document-field value pairs.",
     content: [
       { type: "paragraph", text: "Validation runs measure extraction accuracy against ground-truth datasets. Manage datasets and runs, and retrieve per-document and per-field accuracy results. Create a ground-truth set, then run validations to compare extracted values against expected values." },
+      { type: "paragraph", text: "A **ground-truth dataset** is a collection of manually verified document-field value pairs that serve as the benchmark for accuracy measurement. Each dataset can be scoped to a specific schema via `user_schema_id`, or left unscoped for cross-schema validation." },
+      { type: "callout", variant: "info", text: "Ground-truth datasets are reusable. Create a dataset once and run validations against it repeatedly as you refine your schemas and extraction pipeline." },
       {
         type: "endpoint",
         method: "GET",
@@ -11381,8 +12279,11 @@ var sections32 = [
       { label: "Get Ground-Truth", slug: "get-ground-truth" },
       { label: "Create Validation Run", slug: "create-validation-run" }
     ],
-    faq: [],
-    mentions: ["ground truth", "validation dataset"]
+    faq: [
+      { question: "Can I create ground-truth datasets without a schema?", answer: "Yes. The `user_schema_id` field is optional. Unscoped datasets can be used for cross-schema validation or when you want to validate specific fields regardless of schema." },
+      { question: "How many entries should a ground-truth dataset have?", answer: "There is no minimum or maximum. For statistically meaningful accuracy results, aim for at least 20-30 document-field pairs covering the fields you care about most." }
+    ],
+    mentions: ["ground truth", "validation dataset", "accuracy benchmark"]
   },
   {
     slug: "get-ground-truth",
@@ -11391,6 +12292,8 @@ var sections32 = [
     seoTitle: "Manage Ground-Truth Dataset \u2014 Talonic Docs",
     description: "Get detail with expected values or delete a ground-truth dataset. Supports GET (read scope) and DELETE (write scope) on the same path.",
     content: [
+      { type: "paragraph", text: "Retrieve the full details of a ground-truth dataset including all expected value entries, or permanently delete the dataset. The GET response includes every document-field pair with the expected value, which you can use to audit the benchmark data before running a validation." },
+      { type: "callout", variant: "warning", text: "Deleting a ground-truth dataset also removes all associated expected value entries. Existing validation runs that used this dataset are retained but can no longer be re-run." },
       {
         type: "endpoint",
         method: "GET",
@@ -11462,10 +12365,14 @@ var sections32 = [
       }
     ],
     related: [
-      { label: "List Ground-Truth", slug: "list-ground-truth" }
+      { label: "List Ground-Truth", slug: "list-ground-truth" },
+      { label: "Create Validation Run", slug: "create-validation-run" }
     ],
-    faq: [],
-    mentions: ["ground truth dataset"]
+    faq: [
+      { question: "Does deleting a ground-truth dataset delete its validation runs?", answer: "No. Validation runs and their results are retained. However, you cannot create new validation runs against a deleted dataset." },
+      { question: "Can I update individual expected values in a dataset?", answer: "Expected values are managed as part of the dataset. To change values, delete the dataset and recreate it with the corrected entries." }
+    ],
+    mentions: ["ground truth dataset", "expected values"]
   },
   {
     slug: "list-validation-runs",
@@ -11474,6 +12381,8 @@ var sections32 = [
     seoTitle: "List Validation Runs \u2014 Talonic Docs",
     description: "List validation runs that compare extraction results against ground-truth datasets. Requires read scope.",
     content: [
+      { type: "paragraph", text: "List all validation runs for your organization, ordered by most recent first. A **validation run** compares the structured output of a job run against a ground-truth dataset to produce per-field and overall accuracy scores." },
+      { type: "paragraph", text: "Each run includes its status (`pending`, `running`, `completed`, `failed`), the overall accuracy score (available once completed), and links to the detailed results. Use this to track validation history and identify accuracy trends." },
       {
         type: "endpoint",
         method: "GET",
@@ -11537,8 +12446,11 @@ var sections32 = [
       { label: "Create Validation Run", slug: "create-validation-run" },
       { label: "Get Validation Run", slug: "get-validation-run" }
     ],
-    faq: [],
-    mentions: ["validation runs"]
+    faq: [
+      { question: "How many validation runs are returned?", answer: "Up to 100 runs are returned, ordered by `created_at` descending. There is no pagination \u2014 the returned runs arrive in a single response." },
+      { question: "What does an accuracy of null mean?", answer: "A null accuracy indicates the run has not completed yet. The score is populated only when the run reaches `completed` status." }
+    ],
+    mentions: ["validation runs", "accuracy tracking"]
   },
   {
     slug: "create-validation-run",
@@ -11547,6 +12459,9 @@ var sections32 = [
     seoTitle: "Create Validation Run \u2014 Talonic Docs",
     description: "Create a validation run comparing a job against a ground-truth dataset. Measures per-document and per-field extraction accuracy.",
     content: [
+      { type: "paragraph", text: "Start a new validation run that compares the output of a **job run** against a **ground-truth dataset**. The validation engine compares each extracted value to the expected value, computing exact match, fuzzy match, and similarity scores. An optional LLM judge provides a semantic verdict for ambiguous cases." },
+      { type: "paragraph", text: "Validation runs start in `pending` status and move to `running` as comparisons are performed. Once complete, the `accuracy` field contains the overall score and per-field results are available via the Results endpoint." },
+      { type: "callout", variant: "info", text: "Both `golden_sample_id` and `dataspace_run_id` must belong to your organization. The API returns 404 if either resource is not found." },
       {
         type: "endpoint",
         method: "POST",
@@ -11625,8 +12540,11 @@ var sections32 = [
       { label: "List Validation Runs", slug: "list-validation-runs" },
       { label: "Get Validation Results", slug: "get-validation-results" }
     ],
-    faq: [],
-    mentions: ["create validation run", "accuracy measurement"]
+    faq: [
+      { question: "How long does a validation run take?", answer: "Most validation runs complete within seconds. The duration depends on the number of document-field pairs in the ground-truth dataset and whether the LLM judge is invoked for ambiguous comparisons." },
+      { question: "Can I run validation against the same dataset multiple times?", answer: "Yes. You can create multiple validation runs against the same ground-truth dataset with different job runs to track accuracy improvements over time." }
+    ],
+    mentions: ["create validation run", "accuracy measurement", "LLM judge"]
   },
   {
     slug: "get-validation-run",
@@ -11635,6 +12553,7 @@ var sections32 = [
     seoTitle: "Manage Validation Run \u2014 Talonic Docs",
     description: "Get validation run detail with accuracy summary or delete a run. Supports GET (read scope) and DELETE (write scope) on the same path.",
     content: [
+      { type: "paragraph", text: "Retrieve the full details of a validation run, including its status, accuracy score, and total comparisons, or permanently delete a run and its associated results. Use GET to poll a run's status until it reaches `completed`, then fetch the detailed results." },
       {
         type: "endpoint",
         method: "GET",
@@ -11701,8 +12620,11 @@ var sections32 = [
       { label: "List Validation Runs", slug: "list-validation-runs" },
       { label: "Get Validation Results", slug: "get-validation-results" }
     ],
-    faq: [],
-    mentions: ["validation run detail"]
+    faq: [
+      { question: "Does deleting a validation run delete the ground-truth dataset?", answer: "No. Deleting a run only removes the run record and its per-field results. The ground-truth dataset and the job run remain intact." },
+      { question: "How do I poll for run completion?", answer: "Call `GET /v1/validation/runs/{id}` and check the `status` field. Poll until it changes from `pending` or `running` to `completed` or `failed`." }
+    ],
+    mentions: ["validation run detail", "run status polling"]
   },
   {
     slug: "get-validation-results",
@@ -11711,6 +12633,16 @@ var sections32 = [
     seoTitle: "Validation Results \u2014 Talonic Docs",
     description: "Get per-field validation results including overall accuracy, per-field accuracy, match type, similarity scores, and judge verdicts.",
     content: [
+      { type: "paragraph", text: "Retrieve the granular, per-field comparison results for a completed validation run. Each result entry shows the **expected value**, **actual extracted value**, **match type** (exact, fuzzy, or no_match), a **similarity score**, and an optional **LLM judge verdict** for ambiguous cases." },
+      { type: "paragraph", text: "Use the `judged_only=true` parameter to filter results to only those where the LLM judge was invoked. This is useful for reviewing cases where simple string comparison was insufficient and semantic judgment was required." },
+      {
+        type: "list",
+        items: [
+          "**exact** \u2014 The extracted value matches the expected value character-for-character.",
+          "**fuzzy** \u2014 The values are similar but not identical (e.g. different formatting, minor typos).",
+          "**no_match** \u2014 The extracted value does not match the expected value."
+        ]
+      },
       {
         type: "endpoint",
         method: "GET",
@@ -11782,8 +12714,12 @@ var sections32 = [
       { label: "Get Validation Run", slug: "get-validation-run" },
       { label: "List Ground-Truth", slug: "list-ground-truth" }
     ],
-    faq: [],
-    mentions: ["validation results", "accuracy", "judge verdict"]
+    faq: [
+      { question: "What is the difference between match_type and judge_verdict?", answer: "The `match_type` is a deterministic string comparison (exact, fuzzy, no_match). The `judge_verdict` is an LLM-based semantic assessment (`correct`, `incorrect`, `partial`) invoked for ambiguous cases where string comparison alone is insufficient." },
+      { question: "When is the LLM judge invoked?", answer: "The judge is invoked for fuzzy matches and edge cases where the similarity score falls in an ambiguous range. Exact matches and clear no-matches do not trigger the judge." },
+      { question: "Can I filter results by document or field?", answer: "The `judged_only` parameter filters by judge involvement. To filter by document or field, retrieve the full results and filter client-side." }
+    ],
+    mentions: ["validation results", "accuracy", "judge verdict", "similarity score", "match type"]
   }
 ];
@@ -11797,6 +12733,16 @@ var sections33 = [
     description: "Get the current credit balance for the authenticated customer including currency and timestamp. Requires read scope.",
     content: [
       { type: "paragraph", text: "Credit endpoints expose the current balance, transaction history, aggregate usage summaries, daily usage, and a per-request usage log with model and token counts. Track credit balance and usage breakdowns by operation type and time period." },
+      { type: "paragraph", text: "The balance endpoint returns an **enriched** view of your credit account, including the EUR-equivalent balance, a 30-day burn rate, projected runway in days, and your current API tier. Use this to monitor consumption and plan capacity." },
+      {
+        type: "list",
+        items: [
+          "**balance_credits** \u2014 Current credit balance as an integer.",
+          "**burn_rate_30d_credits** \u2014 Credits consumed in the last 30 days for trend analysis.",
+          "**projected_runway_days** \u2014 Estimated days remaining at the current burn rate.",
+          "**tier** \u2014 Your current API tier (e.g. `free`, `starter`, `growth`), which determines rate limits and features."
+        ]
+      },
       {
         type: "endpoint",
         method: "GET",
@@ -11844,8 +12790,12 @@ var sections33 = [
       { label: "History", slug: "credits-history" },
       { label: "Usage Summary", slug: "credits-usage" }
     ],
-    faq: [],
-    mentions: ["credit balance"]
+    faq: [
+      { question: "What does projected_runway_days: -1 mean?", answer: "A value of -1 indicates that your burn rate over the last 30 days is zero, so runway cannot be projected. This typically means no API calls were made during the period." },
+      { question: "When does the API tier reset?", answer: "Tiers reset on the 1st of each month at midnight UTC. The exact reset timestamp is returned in the `tier_resets_at` field." },
+      { question: "How is the EUR balance calculated?", answer: "The `balance_eur` is computed by dividing `balance_credits` by a configured credits-per-EUR rate. This rate is fixed and does not fluctuate." }
+    ],
+    mentions: ["credit balance", "burn rate", "API tier", "runway"]
   },
   {
     slug: "credits-history",
@@ -11854,6 +12804,8 @@ var sections33 = [
     seoTitle: "Credits History Endpoint \u2014 Talonic Docs",
     description: "Get credit transaction history including purchases, deductions, and adjustments with page-based pagination.",
     content: [
+      { type: "paragraph", text: "Retrieve a chronological log of every credit transaction on your account. Transactions include **purchases** (positive amounts), **consumption deductions** (negative amounts), **bonuses**, and **manual adjustments**. Use this to audit spending and reconcile usage." },
+      { type: "callout", variant: "info", text: "Transactions are ordered by most recent first. Each entry includes the `operation_type` that triggered it (e.g. `extraction`, `manual`), making it easy to trace costs back to specific pipeline operations." },
       {
         type: "endpoint",
         method: "GET",
@@ -11921,8 +12873,11 @@ var sections33 = [
       { label: "Balance", slug: "credits-balance" },
       { label: "Usage Summary", slug: "credits-usage" }
     ],
-    faq: [],
-    mentions: ["credit history", "transactions"]
+    faq: [
+      { question: "How do I distinguish between a purchase and a deduction?", answer: "The `amount` field is positive for purchases and bonuses, and negative for consumption deductions. The `type` field also explicitly labels each transaction." },
+      { question: "What operation types are tracked?", answer: "Common operation types include `extraction`, `document_ai_ocr`, `matching`, and `manual`. The exact set depends on which pipeline operations your account uses." }
+    ],
+    mentions: ["credit history", "transactions", "consumption", "purchase"]
   },
   {
     slug: "credits-usage",
@@ -11931,6 +12886,8 @@ var sections33 = [
     seoTitle: "Credits Usage Summary \u2014 Talonic Docs",
     description: "Get aggregate credit usage summary broken down by operation type and model for a configurable time period.",
     content: [
+      { type: "paragraph", text: "Get a high-level view of your API usage grouped by **operation type** and **model**. This endpoint aggregates call counts, token consumption, and estimated costs over a configurable lookback period. Use it to understand which operations drive your spending." },
+      { type: "callout", variant: "info", text: "Cost estimates include all token classes: input tokens, output tokens, cache creation tokens, and cache read tokens. Each is priced at the model-specific rate." },
       {
         type: "endpoint",
         method: "GET",
@@ -12004,8 +12961,11 @@ var sections33 = [
       { label: "Daily Usage", slug: "credits-usage-daily" },
       { label: "Usage Log", slug: "credits-usage-log" }
     ],
-    faq: [],
-    mentions: ["usage summary", "credit breakdown"]
+    faq: [
+      { question: "Why does document_ai_ocr show zero tokens?", answer: "Document AI OCR (Mistral) is not token-based. Its cost is calculated from `pages_processed * cost_per_page`. Token fields are 0 because they do not apply to this operation type." },
+      { question: "Can I change the lookback period?", answer: "Yes. Pass the `days` query parameter to set the lookback window. The default is 30 days, and there is no maximum." }
+    ],
+    mentions: ["usage summary", "credit breakdown", "token consumption", "cost estimation"]
   },
   {
     slug: "credits-usage-daily",
@@ -12014,6 +12974,7 @@ var sections33 = [
     seoTitle: "Credits Daily Usage \u2014 Talonic Docs",
     description: "Get per-day credit usage breakdown for the specified period (default last 30 days) with call counts and token totals per day.",
     content: [
+      { type: "paragraph", text: "Get a per-day breakdown of API usage over a configurable period. Each entry includes the total number of API calls, input/output token counts, and estimated cost for that calendar date. Use this for usage trend analysis and daily cost monitoring." },
       {
         type: "endpoint",
         method: "GET",
@@ -12076,8 +13037,11 @@ var sections33 = [
       { label: "Usage Summary", slug: "credits-usage" },
       { label: "Usage Log", slug: "credits-usage-log" }
     ],
-    faq: [],
-    mentions: ["daily usage"]
+    faq: [
+      { question: "Are days with zero usage included in the response?", answer: "Days with no API calls are omitted from the response array. Only dates with at least one recorded call are returned." },
+      { question: "What timezone are the daily dates in?", answer: "Dates are in UTC. A call made at 23:59 UTC on September 14 appears under `2024-09-14`, not the caller's local date." }
+    ],
+    mentions: ["daily usage", "usage trend", "daily cost"]
   },
   {
     slug: "credits-usage-log",
@@ -12086,6 +13050,9 @@ var sections33 = [
     seoTitle: "Credits Usage Log \u2014 Talonic Docs",
     description: "Get a detailed per-request usage log with operation type, model, input/output token counts, and cost estimates for each API call.",
     content: [
+      { type: "paragraph", text: "Retrieve a detailed log of individual API requests with per-request token counts, model information, and cost estimates. This is the most granular usage view available, showing every LLM call and OCR request made by your account." },
+      { type: "paragraph", text: "Each log entry links back to the originating document (when applicable) via the `document_id` field, allowing you to trace costs to specific documents in your pipeline." },
+      { type: "callout", variant: "info", text: "The usage log is ordered by most recent first. Use page-based pagination to browse historical entries." },
       {
         type: "endpoint",
         method: "GET",
@@ -12159,8 +13126,11 @@ var sections33 = [
       { label: "Usage Summary", slug: "credits-usage" },
       { label: "Balance", slug: "credits-balance" }
     ],
-    faq: [],
-    mentions: ["usage log", "token counts", "cost estimation"]
+    faq: [
+      { question: "What are cache_read_tokens?", answer: "Cache read tokens represent prompt cache hits where previously cached input was reused instead of being reprocessed. These are charged at a significantly lower rate than regular input tokens." },
+      { question: "Why is document_id null for some entries?", answer: "Operations that are not tied to a specific document (e.g. schema generation, field resolution) have a null `document_id`. Document-level operations like extraction always include the document reference." }
+    ],
+    mentions: ["usage log", "token counts", "cost estimation", "cache tokens", "per-request"]
   }
 ];
@@ -12174,6 +13144,14 @@ var sections34 = [
     description: "Get a comprehensive workspace overview including document stats, schemas, active runs, field registry summary, and recent activity for AI integrations.",
     content: [
       { type: "paragraph", text: "The Agent API provides programmatic access to the same AI assistant capabilities available in the Talonic platform UI. Use the context endpoint to retrieve a comprehensive workspace snapshot, and the tools endpoint to discover all available agent capabilities." },
+      { type: "paragraph", text: "The workspace context is useful for building external dashboards, feeding context into custom AI integrations, or monitoring workspace health. The response includes document processing stats, schema summaries, active extraction runs, field registry tier distribution, and up to 15 recent activity events." },
+      { type: "list", ordered: false, items: [
+        "**Document stats** \u2014 total count, completed this week/24h, currently processing",
+        "**Schemas** \u2014 user-defined schemas with field counts and versions",
+        "**Active runs** \u2014 in-flight extraction runs with status and document counts",
+        "**Field registry** \u2014 total fields and distribution across Tier 1, 2, and 3",
+        "**Recent activity** \u2014 up to 15 events with type, message, timestamp, and actor"
+      ] },
       {
         type: "endpoint",
         method: "GET",
@@ -12252,9 +13230,11 @@ var sections34 = [
       { label: "List Agent Tools", slug: "agent-tools" }
     ],
     faq: [
-      { question: "What does the agent context endpoint return?", answer: "A comprehensive workspace snapshot including document stats, schemas, active runs, field registry summary, and recent activity." }
+      { question: "What does the agent context endpoint return?", answer: "A comprehensive workspace snapshot including document stats, schemas, active runs, field registry summary, and recent activity." },
+      { question: "How often is the workspace context updated?", answer: "The context is computed on each request from live data. There is no caching, so the response always reflects the current workspace state." },
+      { question: "Can I use the agent context to feed an external LLM?", answer: "Yes. The context endpoint is designed for this use case. Pass the response as system context to any LLM to give it awareness of your workspace state, document types, and schemas." }
     ],
-    mentions: ["agent context", "workspace overview", "AI integration"]
+    mentions: ["agent context", "workspace overview", "AI integration", "dashboard data"]
   },
   {
     slug: "agent-tools",
@@ -12263,6 +13243,8 @@ var sections34 = [
     seoTitle: "Agent Tools Endpoint \u2014 Talonic Docs",
     description: "List all tools available to the embedded agent including their impact level (read/write) and descriptions for discovering agent capabilities.",
     content: [
+      { type: "paragraph", text: "Discover all tools available to the embedded AI agent. Each tool declares its **impact level** \u2014 whether it performs a read-only operation or a mutation \u2014 so you can build permission-aware integrations. Use this endpoint to dynamically generate tool descriptions for external AI agents or to audit available capabilities." },
+      { type: "callout", variant: "info", text: "Impact levels follow a severity scale: `read` (no side effects), `draft_mutation` (creates drafts only), `live_mutation` (modifies live data), and `irreversible` (permanent changes like deletion). Use these to implement confirmation gates in your integration." },
       {
         type: "endpoint",
         method: "GET",
@@ -12310,8 +13292,11 @@ var sections34 = [
     related: [
       { label: "Get Workspace Context", slug: "agent-context" }
     ],
-    faq: [],
-    mentions: ["agent tools", "capabilities"]
+    faq: [
+      { question: "Can I invoke agent tools directly via the API?", answer: "The tools endpoint lists available capabilities. Tool invocation happens through the embedded agent chat interface or via the Claude API proxy, not through direct REST calls to individual tools." },
+      { question: "What impact levels are available?", answer: "Four levels: `read` (safe, no side effects), `draft_mutation` (creates drafts), `live_mutation` (modifies live data), and `irreversible` (permanent changes). Use these to build confirmation gates." }
+    ],
+    mentions: ["agent tools", "capabilities", "impact levels", "tool discovery"]
   }
 ];
@@ -12325,6 +13310,16 @@ var sections35 = [
     description: "List all matching configurations for the current workspace. Each config defines field mappings, strategies, and weights used to match documents against reference data.",
     content: [
       { type: "paragraph", text: "Matching connects incoming documents to reference datasets using configurable field-level strategies. Each configuration defines which fields to compare, how to compare them (exact, fuzzy, date range, numeric range), and the relative weight of each field in the overall confidence score." },
+      { type: "paragraph", text: "A typical matching workflow is: upload reference data (CSV or XLSX), create a config with field mappings, run matching, and review results. For complex datasets, use the **generate strategy** endpoint to let AI recommend optimal field mappings and weights." },
+      {
+        type: "list",
+        items: [
+          "Each config targets a single reference dataset.",
+          "Field mappings define source-to-target column pairs with a comparison strategy.",
+          "Weights should sum to 1.0 across all field mappings; if they do not, the system normalizes them internally.",
+          "The confidence threshold (default 0.85) controls the minimum score for a match."
+        ]
+      },
       {
         type: "endpoint",
         method: "GET",
@@ -12414,7 +13409,8 @@ var sections35 = [
       { label: "Run Matching", slug: "run-matching" }
     ],
     faq: [
-      { question: "What is a matching configuration?", answer: "A matching configuration defines which fields to compare between documents and reference data, what comparison strategy to use per field (exact, fuzzy, date_range, numeric_range), and how much weight each field carries in the overall confidence score." }
+      { question: "What is a matching configuration?", answer: "A matching configuration defines which fields to compare between documents and reference data, what comparison strategy to use per field (exact, fuzzy, date_range, numeric_range), and how much weight each field carries in the overall confidence score." },
+      { question: "Can I have multiple configs for the same reference dataset?", answer: "Yes. You can create multiple configs targeting the same reference dataset with different field mappings, strategies, or thresholds to test different matching approaches." }
     ],
     mentions: ["matching", "reference data", "field mapping"]
   },
@@ -12425,6 +13421,17 @@ var sections35 = [
     seoTitle: "Create Matching Config \u2014 Talonic Docs",
     description: "Create a matching configuration with field mappings, comparison strategies (exact, fuzzy, date_range, numeric_range), and per-field weights that sum to 1.0.",
     content: [
+      { type: "paragraph", text: "Create a matching configuration that defines how documents are compared against a reference dataset. Each field mapping specifies a source field (from extracted documents), a target column (in the reference data), a comparison strategy, and a relative weight." },
+      { type: "callout", variant: "info", text: "Field weights should sum to 1.0. The overall confidence score for a match is the weighted sum of per-field scores. Use the **generate strategy** endpoint to get AI-recommended mappings if you are unsure which fields and weights to use." },
+      {
+        type: "list",
+        items: [
+          "**exact** \u2014 case-insensitive string equality. Best for codes, IDs, and standardized values.",
+          "**fuzzy** \u2014 Levenshtein/token similarity. Handles name variations and minor typos.",
+          "**date_range** \u2014 date proximity within a configurable tolerance window.",
+          "**numeric_range** \u2014 numeric proximity within a configurable tolerance. Handles rounding differences."
+        ]
+      },
       {
         type: "endpoint",
         method: "POST",
@@ -12522,7 +13529,9 @@ var sections35 = [
       { label: "Generate Strategy", slug: "generate-strategy" }
     ],
     faq: [
-      { question: "What strategies are available for field matching?", answer: "Four strategies: exact (case-insensitive equality), fuzzy (Levenshtein/token similarity), date_range (date proximity within tolerance), and numeric_range (numeric proximity within tolerance)." }
+      { question: "What strategies are available for field matching?", answer: "Four strategies: exact (case-insensitive equality), fuzzy (Levenshtein/token similarity), date_range (date proximity within tolerance), and numeric_range (numeric proximity within tolerance)." },
+      { question: "Do field weights need to sum to exactly 1.0?", answer: "Weights should sum to 1.0 for meaningful confidence scores. If they do not sum to 1.0, the system normalizes them internally, but explicitly setting weights to sum to 1.0 gives you predictable confidence values." },
+      { question: "Can I use the same reference dataset column in multiple mappings?", answer: "Yes. A single target column can appear in multiple field mappings with different source fields and strategies, which is useful when multiple document fields might correspond to the same reference column." }
     ],
     mentions: ["matching config", "field strategy", "weights"]
   },
@@ -12533,6 +13542,7 @@ var sections35 = [
     seoTitle: "Manage Matching Config \u2014 Talonic Docs",
     description: "Get matching configuration details, update field mappings and weights, or delete a configuration. Deleting a config does not remove past run results.",
     content: [
+      { type: "paragraph", text: "Retrieve, update, or delete a matching configuration. Updates to field mappings and thresholds take effect on the next run \u2014 they do not retroactively change past results. Deleting a config removes the configuration but preserves all historical run results for audit purposes." },
       {
         type: "endpoint",
         method: "GET",
@@ -12673,7 +13683,10 @@ var sections35 = [
       { label: "List Configs", slug: "list-matching-configs" },
       { label: "Run Matching", slug: "run-matching" }
     ],
-    faq: [],
+    faq: [
+      { question: "Can I change the reference_data_id on an existing config?", answer: "No. The reference dataset is fixed at creation. To match against a different dataset, create a new configuration." },
+      { question: "Does updating a config affect past run results?", answer: "No. Updates only affect future runs. Past results are immutable and retain the field mappings and thresholds that were active when the run executed." }
+    ],
     mentions: ["matching config management"]
   },
   {
@@ -12683,6 +13696,9 @@ var sections35 = [
     seoTitle: "Run Matching \u2014 Talonic Docs",
     description: "Execute a matching run against reference data asynchronously, or use smart-run for AI-assisted matching that auto-tunes strategies. Poll run status via the run detail endpoint.",
     content: [
+      { type: "paragraph", text: "Execute a matching run to compare documents against the reference dataset defined in the config. Runs are **asynchronous** \u2014 the endpoint returns immediately with a `queued` status. Poll the run detail endpoint to track progress and retrieve results." },
+      { type: "paragraph", text: "The **smart-run** variant uses a pre-generated AI strategy that automatically tunes comparison thresholds and field priorities. This can significantly improve accuracy on datasets with heterogeneous formatting, inconsistent naming, or mixed-language content." },
+      { type: "callout", variant: "info", text: "Matching runs are processed via a BullMQ job queue. Large datasets may take several minutes. Use `GET /v1/matching/runs/:id/progress` to poll for live progress updates while the run is active." },
       {
         type: "endpoint",
         method: "POST",
@@ -12791,7 +13807,9 @@ var sections35 = [
       { label: "Results", slug: "get-matching-results" }
     ],
     faq: [
-      { question: "What is the difference between run and smart-run?", answer: "A standard run uses the strategies and weights defined in the config. A smart-run uses a pre-generated strategy with AI-tuned comparison thresholds, which can improve accuracy on heterogeneous datasets." }
+      { question: "What is the difference between run and smart-run?", answer: "A standard run uses the strategies and weights defined in the config. A smart-run uses a pre-generated strategy with AI-tuned comparison thresholds, which can improve accuracy on heterogeneous datasets." },
+      { question: "How long does a matching run take?", answer: "Runtime depends on the number of documents and reference rows. For a typical dataset of a few hundred documents against a few thousand reference rows, expect 1-5 minutes. Large datasets may take longer." },
+      { question: "Can I run matching on a subset of documents?", answer: "Currently, runs match all unmatched documents in the workspace. Document-level filtering is not yet supported \u2014 use the results endpoint to filter outcomes after the run completes." }
     ],
     mentions: ["matching run", "smart matching", "async"]
   },
@@ -12802,6 +13820,8 @@ var sections35 = [
     seoTitle: "Matching Run Detail \u2014 Talonic Docs",
     description: "Get the status, progress, and summary of a matching run. Status progresses from queued to running to completed or failed.",
     content: [
+      { type: "paragraph", text: "Retrieve the current state of a matching run. Poll this endpoint until `status` reaches `completed` or `failed` to track progress. Once the run is `completed`, the response includes the top 50 results by confidence. Use the results endpoint for full paginated access." },
+      { type: "callout", variant: "info", text: "The `ai_resolving` status indicates that the run has finished standard matching and is now running an AI resolution pass on low-confidence rows. This pass uses Claude Haiku to disambiguate borderline matches." },
       {
         type: "endpoint",
         method: "GET",
@@ -12889,7 +13909,10 @@ var sections35 = [
       { label: "Results", slug: "get-matching-results" },
       { label: "Run Matching", slug: "run-matching" }
     ],
-    faq: [],
+    faq: [
+      { question: "Why does the run detail only show 50 results?", answer: "The run detail endpoint includes the top 50 results by confidence for quick inspection. Use `GET /v1/matching/runs/:id/results` with pagination for the full result set." },
+      { question: "What does the ai_resolving status mean?", answer: "The run has completed standard field-level matching and is now running an AI resolution pass (using Claude Haiku) on rows with low confidence scores. This can upgrade borderline matches or confirm non-matches." }
+    ],
     mentions: ["matching run status", "progress"]
   },
   {
@@ -12899,6 +13922,8 @@ var sections35 = [
     seoTitle: "Matching Results \u2014 Talonic Docs",
     description: "Retrieve matching results for a completed run. Returns the top 5 candidates per document with weighted confidence scores and per-field evidence breakdowns.",
     content: [
+      { type: "paragraph", text: "Retrieve the full paginated results for a completed matching run. Each result represents a document matched (or unmatched) against the reference dataset, with a weighted confidence score and per-field evidence breakdown showing how each field contributed to the overall score." },
+      { type: "callout", variant: "info", text: "Results with `status: pending` have not been reviewed. Use `POST /v1/matching/runs/:runId/results/:resultId/review` to approve or reject individual matches. Approved matches can be used downstream for data enrichment and reconciliation workflows." },
       {
         type: "endpoint",
         method: "GET",
@@ -12977,7 +14002,9 @@ var sections35 = [
       { label: "List Configs", slug: "list-matching-configs" }
     ],
     faq: [
-      { question: "How is the confidence score calculated?", answer: "Each field produces a strategy-specific score (0-1). The overall confidence is the weighted sum of per-field scores, using the weights defined in the matching config." }
+      { question: "How is the confidence score calculated?", answer: "Each field produces a strategy-specific score (0-1). The overall confidence is the weighted sum of per-field scores, using the weights defined in the matching config." },
+      { question: "What does null matched_reference_row_id mean?", answer: "A null `matched_reference_row_id` means no reference row scored above the configured threshold for this document. The document is effectively unmatched." },
+      { question: "Can I review results programmatically?", answer: "Yes. Use `POST /v1/matching/runs/:runId/results/:resultId/review` with a status of `approved` or `rejected` to review match results." }
     ],
     mentions: ["matching results", "confidence score", "evidence"]
   },
@@ -12988,6 +14015,9 @@ var sections35 = [
     seoTitle: "Generate Matching Strategy \u2014 Talonic Docs",
     description: "Use AI to auto-suggest field mappings and strategies for a reference dataset. Analyses column names, data types, and sample values to recommend optimal matching configurations.",
     content: [
+      { type: "paragraph", text: "Let AI analyse your reference dataset and recommend field mappings, comparison strategies, and weights. The strategy generator examines column names, data types, and sample values to produce an optimal matching configuration. Use the generated strategy as a starting point or pass it directly to a smart-run." },
+      { type: "paragraph", text: 'Each generated strategy includes a `rationale` per field mapping explaining why the AI chose that strategy and weight. You can optionally provide a `user_prompt` with natural language guidance (e.g. "prioritise date matching" or "vendor name is the most important field") to steer the generation.' },
+      { type: "callout", variant: "info", text: "Strategy generation does not modify any existing configs. The generated strategy is saved independently and can be applied to a config via the smart-run endpoint, or used as a reference when manually creating a config." },
       {
         type: "endpoint",
         method: "POST",
@@ -13060,9 +14090,14 @@ var sections35 = [
       }
     ],
     related: [
-      { label: "Create Config", slug: "create-matching-config" }
+      { label: "Create Config", slug: "create-matching-config" },
+      { label: "Run Smart Matching", slug: "run-matching" }
+    ],
+    faq: [
+      { question: "Does strategy generation cost credits?", answer: "Yes. Strategy generation uses AI (Claude) to analyse columns and sample data. The cost is minimal \u2014 typically a single Haiku call \u2014 and is reflected in your credit usage." },
+      { question: "Can I regenerate a strategy with different guidance?", answer: "Yes. Call the generate endpoint again with a different `user_prompt`. Each call creates a new strategy version. Previous versions are retained." },
+      { question: "How do I use a generated strategy?", answer: "Pass the strategy ID to `POST /v1/matching/configs/:id/smart-run` to execute a run using the AI-generated thresholds and weights. Alternatively, copy the `field_mappings` from the strategy into a new config via `POST /v1/matching/configs`." }
     ],
-    faq: [],
     mentions: ["AI strategy", "auto-suggest", "field mapping"]
   }
 ];
@@ -13077,6 +14112,18 @@ var sections36 = [
     description: "List all configured delivery destinations for the workspace. Destinations define where processed data is sent \u2014 webhook, SFTP, S3, Azure Blob, Google Drive, or OneDrive.",
     content: [
       { type: "paragraph", text: "Delivery routes processed data to external systems through a four-part pipeline: **signals** (domain events) are matched to **bindings** (routing rules), resolved into **deliverables** (payloads), serialized, and sent to **destinations** (connectors). Destinations define the transport protocol and credentials." },
+      { type: "paragraph", text: "Each destination is an instance of one of the six supported connector types. A single destination can serve multiple bindings \u2014 for example, you might have one S3 destination with separate bindings for extraction results and run outcomes. Auth credentials are stored securely and never returned in API responses." },
+      {
+        type: "list",
+        items: [
+          "**webhook** \u2014 HTTP POST with optional HMAC-SHA256 signing and idempotency headers.",
+          "**sftp** \u2014 File upload via SSH with password or private key auth.",
+          "**s3** \u2014 Object upload to AWS S3 with access key auth.",
+          "**azure_blob** \u2014 Blob upload to Azure Storage with connection string or account key.",
+          "**google_drive** \u2014 File upload via OAuth (`drive.file` scope).",
+          "**onedrive** \u2014 File upload via OAuth (`Files.ReadWrite.All` scope)."
+        ]
+      },
       {
         type: "endpoint",
         method: "GET",
@@ -13156,7 +14203,9 @@ var sections36 = [
       { label: "List Bindings", slug: "list-delivery-bindings" }
     ],
     faq: [
-      { question: "What destination types are supported?", answer: "Six connector types: webhook (HTTP POST with HMAC-SHA256 signing), sftp, s3, azure_blob, google_drive (OAuth), and onedrive (OAuth)." }
+      { question: "What destination types are supported?", answer: "Six connector types: webhook (HTTP POST with HMAC-SHA256 signing), sftp, s3, azure_blob, google_drive (OAuth), and onedrive (OAuth)." },
+      { question: "Why is my destination marked as inactive?", answer: "Destinations are automatically deactivated when an auth failure or SSRF block occurs during delivery. Fix the credentials or URL, then re-enable via PUT /v1/delivery/destinations/:id." },
+      { question: "Are auth credentials returned in the response?", answer: "No. Auth credentials are never returned in API responses. The has_auth_config and has_signing_secret boolean fields indicate whether credentials are configured." }
     ],
     mentions: ["delivery", "destinations", "connectors"]
   },
@@ -13167,6 +14216,8 @@ var sections36 = [
     seoTitle: "Create Delivery Destination \u2014 Talonic Docs",
     description: "Create a delivery destination with connector type, transport config, and authentication. Supported types: webhook, sftp, s3, azure_blob, google_drive, onedrive.",
     content: [
+      { type: "paragraph", text: "Create a new delivery destination by specifying the connector type, transport configuration, and optional authentication. The `config` and `auth_config` schemas vary by destination type \u2014 see the catalog endpoint for connector capabilities." },
+      { type: "callout", variant: "info", text: "OAuth-based destinations (google_drive, onedrive) require completing an OAuth flow before creating the destination. Use the OAuth start endpoint to initiate the flow and obtain tokens." },
       {
         type: "endpoint",
         method: "POST",
@@ -13254,8 +14305,11 @@ var sections36 = [
       { label: "List Destinations", slug: "list-delivery-destinations" },
       { label: "Test Destination", slug: "manage-delivery-destination" }
     ],
-    faq: [],
-    mentions: ["destination creation", "webhook", "HMAC"]
+    faq: [
+      { question: "How does webhook signing work?", answer: 'When a signing_secret is configured, every delivery includes an X-Talonic-Signature header with format t=<timestamp>,v1=<hex>. The v1 value is the hex-encoded HMAC-SHA256 of "<timestamp>.<body>", keyed with the signing secret. Recompute it on your server and compare it to the header to confirm authenticity.' },
+      { question: "What is the payload cap?", answer: "The global payload cap is 5 MiB per delivery. You can override this per destination with payload_cap_bytes. Payloads exceeding the cap are rejected with a non-retryable payload_too_large error." }
+    ],
+    mentions: ["destination creation", "webhook", "HMAC", "signing"]
   },
   {
     slug: "manage-delivery-destination",
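Reviewer sketch for the webhook signing FAQ added in this hunk: a minimal receiver-side check, assuming the header format `t=<timestamp>,v1=<hex>` and the signed string `<timestamp>.<body>` described above. The function names are hypothetical, and the diff does not state the timestamp unit or any replay tolerance, so neither is checked here.

```javascript
const crypto = require("node:crypto");

// Parse "t=<timestamp>,v1=<hex>" into its two parts.
function parseSignatureHeader(header) {
  const parts = Object.fromEntries(
    header.split(",").map((kv) => {
      const i = kv.indexOf("=");
      return [kv.slice(0, i).trim(), kv.slice(i + 1).trim()];
    })
  );
  return { timestamp: parts.t, signature: parts.v1 };
}

// Returns true when v1 equals HMAC-SHA256(secret, "<timestamp>.<body>").
// rawBody must be the exact request body as received, before any JSON parsing.
function verifyTalonicSignature(header, rawBody, signingSecret) {
  const { timestamp, signature } = parseSignatureHeader(header);
  if (!timestamp || !signature) return false;

  const expected = crypto
    .createHmac("sha256", signingSecret)
    .update(`${timestamp}.${rawBody}`)
    .digest("hex");

  const a = Buffer.from(expected, "hex");
  const b = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```

The constant-time comparison is a general precaution for signature checks, not something the diff prescribes.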
@@ -13264,6 +14318,8 @@ var sections36 = [
     seoTitle: "Manage Delivery Destination \u2014 Talonic Docs",
     description: "Get destination details, update config, delete a destination, or send a test payload to verify connectivity. Auth credentials are always redacted in responses.",
     content: [
+      { type: "paragraph", text: "Manage a single destination: retrieve its current config, update transport settings or credentials, delete it, or test connectivity. The **test** endpoint probes the destination without delivering real data \u2014 file-drop connectors (S3, SFTP, Azure Blob) verify bucket/container reachability without writing any objects." },
+      { type: "callout", variant: "warning", text: "Deleting a destination cascades to all its bindings, delivery items, and DLQ entries. This is irreversible. Disable the destination (`is_active: false`) instead if you want to preserve history." },
       {
         type: "endpoint",
         method: "GET",
@@ -13453,10 +14509,14 @@ var sections36 = [
       }
     ],
     related: [
-      { label: "List Destinations", slug: "list-delivery-destinations" }
+      { label: "List Destinations", slug: "list-delivery-destinations" },
+      { label: "Create Destination", slug: "create-delivery-destination" }
     ],
-    faq: [],
-    mentions: ["destination management", "test delivery"]
+    faq: [
+      { question: "Does the test endpoint write data to the destination?", answer: "No. File-drop connectors (S3, SFTP, Azure Blob, Google Drive, OneDrive) use lightweight probes (HeadBucket, list, getProperties) that verify reachability without creating any objects. Webhook destinations receive a small synthetic payload." },
+      { question: "Can I re-enable a destination that was auto-disabled?", answer: "Yes. Fix the underlying issue (expired credentials, unreachable URL), then update the destination with the corrected config. The destination will be re-enabled automatically, or you can explicitly set is_active: true." }
+    ],
+    mentions: ["destination management", "test delivery", "cascade delete"]
   },
   {
     slug: "list-delivery-bindings",
@@ -13466,6 +14526,7 @@ var sections36 = [
     description: "List all delivery bindings that route signals to destinations. Each binding maps an event type to a deliverable resolver, serializer format, and destination.",
     content: [
       { type: "paragraph", text: "A binding connects the four parts of the delivery pipeline: a **signal filter** (which events to listen for), a **deliverable type** (what payload to build), a **serializer format** (how to encode it), and a **destination** (where to send it). The compatibility triangle is enforced on creation." },
+      { type: "paragraph", text: "Bindings are the core routing configuration for delivery. When a domain event (e.g. `document.extracted`) fires, the poller matches it against all active bindings. Each matching binding produces a separate delivery attempt \u2014 so a single event can fan out to multiple destinations simultaneously." },
       {
         type: "endpoint",
         method: "GET",
@@ -13533,7 +14594,8 @@ var sections36 = [
       { label: "Catalog", slug: "delivery-catalog" }
     ],
     faq: [
-      { question: "What is the compatibility triangle?", answer: "When creating a binding, the system verifies that the signal event type, deliverable resolver, and serializer format are mutually compatible. For example, a CSV serializer cannot serialize a graph deliverable." }
+      { question: "What is the compatibility triangle?", answer: "When creating a binding, the system verifies that the signal event type, deliverable resolver, and serializer format are mutually compatible. For example, a CSV serializer cannot serialize a graph deliverable." },
+      { question: "Can multiple bindings target the same destination?", answer: "Yes. A single destination can serve many bindings with different signal filters, deliverable types, and serializer formats. Each binding produces independent delivery attempts." }
     ],
     mentions: ["bindings", "signal filter", "compatibility triangle"]
   },
@@ -13544,6 +14606,8 @@ var sections36 = [
     seoTitle: "Create Delivery Binding \u2014 Talonic Docs",
     description: "Create a delivery binding that routes domain signals through a deliverable resolver and serializer to a destination. Includes field mapping and retry policy configuration.",
     content: [
+      { type: "paragraph", text: "Create a binding that wires a domain event to a destination. The **compatibility triangle** is validated on creation: the signal event type must be compatible with the deliverable resolver, the serializer must support the deliverable shape, and the connector must support the serializer format." },
+      { type: "callout", variant: "info", text: "Use the catalog endpoints (`/v1/delivery/catalog/*`) to discover valid combinations before creating a binding. The catalog lists all available signals, deliverables, serializers, and connectors with their compatibility constraints." },
       {
         type: "endpoint",
         method: "POST",
@@ -13631,8 +14695,11 @@ var sections36 = [
       { label: "List Bindings", slug: "list-delivery-bindings" },
       { label: "Catalog", slug: "delivery-catalog" }
     ],
-    faq: [],
-    mentions: ["binding creation", "field map", "delivery policy"]
+    faq: [
+      { question: "What is the default retry policy?", answer: "By default, each delivery is attempted up to 7 times on an exponential backoff schedule: 0s, 30s, 2m, 8m, 30m, 2h, 8h. Override this with the delivery_policy field." },
+      { question: "What is the field_map for?", answer: "The field_map applies a JSONPath projection to the resolved payload before serialization. Use it to rename fields, drop internal fields, or add static values. If omitted, the full payload is delivered as-is." }
+    ],
+    mentions: ["binding creation", "field map", "delivery policy", "retry"]
   },
   {
     slug: "manage-delivery-binding",
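Reviewer sketch for the default retry policy FAQ added in this hunk: the schedule written out in seconds, with a helper that sums it. This assumes each delay is measured from the previous attempt, which the diff does not state; the names are hypothetical.

```javascript
// Default delays from the FAQ: 0s, 30s, 2m, 8m, 30m, 2h, 8h.
const DEFAULT_BACKOFF_SECONDS = [0, 30, 120, 480, 1800, 7200, 28800];

// Delay before the given 1-based attempt, or null once the ladder is exhausted.
function delayBeforeAttempt(attempt, schedule = DEFAULT_BACKOFF_SECONDS) {
  return attempt >= 1 && attempt <= schedule.length ? schedule[attempt - 1] : null;
}

// Total time from the first attempt to the last one, under the assumption above.
function totalWindowSeconds(schedule = DEFAULT_BACKOFF_SECONDS) {
  return schedule.reduce((sum, seconds) => sum + seconds, 0);
}
```

Under that assumption the delays sum to 38,430 seconds, about 10.7 hours, which lines up with the "~10 hours" figure in the DLQ FAQ.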
@@ -13641,6 +14708,7 @@ var sections36 = [
     seoTitle: "Manage Delivery Binding \u2014 Talonic Docs",
     description: "Get binding details, update signal filters or field maps, delete a binding, or preview the resolved payload for a binding without sending it.",
     content: [
+      { type: "paragraph", text: "Manage a single delivery binding: retrieve its configuration, update the signal filter or field map, delete it, or preview the payload it would produce. Updates re-validate the compatibility triangle. Deleting a binding stops future routing but allows in-flight deliveries to complete." },
       {
         type: "endpoint",
         method: "GET",
@@ -13832,9 +14900,13 @@ var sections36 = [
       }
     ],
     related: [
-      { label: "List Bindings", slug: "list-delivery-bindings" }
+      { label: "List Bindings", slug: "list-delivery-bindings" },
+      { label: "Catalog", slug: "delivery-catalog" }
+    ],
+    faq: [
+      { question: "Does deleting a binding affect in-flight deliveries?", answer: "No. In-flight deliveries (already queued in the delivery job queue) will complete normally. Deletion only prevents new events from being routed to this binding." },
+      { question: "Is the preview endpoint live?", answer: "The public API preview endpoint currently returns a stub response. The internal preview endpoint is fully functional and walks the full resolve -> project -> serialize pipeline." }
     ],
-    faq: [],
     mentions: ["binding management", "preview"]
   },
   {
@@ -13844,6 +14916,8 @@ var sections36 = [
     seoTitle: "Delivery History \u2014 Talonic Docs",
     description: "View delivery attempt history with status, HTTP codes, and timing. Get detail for a single item or replay a failed delivery attempt.",
     content: [
+      { type: "paragraph", text: "The delivery history tracks every attempt to deliver a payload to a destination. Each attempt is recorded as a **delivery item** with status, timing, HTTP response code, and optional request/response bodies. Use this endpoint to audit delivery performance and debug failures." },
+      { type: "callout", variant: "info", text: "Request and response bodies are truncated to 10 KB and retained for a configurable period (default 30 days). After the retention period, bodies are nulled but metadata (status, HTTP code, duration, error code) is preserved indefinitely." },
       {
         type: "endpoint",
         method: "GET",
@@ -14015,8 +15089,11 @@ var sections36 = [
       { label: "Dead Letter Queue", slug: "delivery-dlq" },
       { label: "List Bindings", slug: "list-delivery-bindings" }
     ],
-    faq: [],
-    mentions: ["delivery history", "replay", "attempt log"]
+    faq: [
+      { question: "What is the idempotency key?", answer: "The idempotency key is a deterministic SHA-256 hash of the binding ID and event ID. It is sent on the wire (as an HTTP header, object metadata, or filename token depending on the connector) so receivers can deduplicate repeated deliveries." },
+      { question: "How does replay differ from DLQ replay?", answer: "Item replay re-enqueues a specific (binding, event) pair with a new attempt number. DLQ replay deletes the dead-letter row and re-enqueues with attempt=1. Both preserve the same idempotency key so receivers can deduplicate." }
+    ],
+    mentions: ["delivery history", "replay", "attempt log", "idempotency"]
   },
   {
     slug: "delivery-dlq",
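Reviewer sketch for the idempotency key FAQ added in this hunk: a deterministic SHA-256 key derived from a binding ID and an event ID, plus a receiver-side deduplication check. The diff does not say how the two IDs are combined before hashing, so the `:` separator is an assumption, and both function names are hypothetical.

```javascript
const crypto = require("node:crypto");

// ASSUMPTION: the IDs are joined with ":" before hashing; the docs do not specify this.
function idempotencyKey(bindingId, eventId) {
  return crypto.createHash("sha256").update(`${bindingId}:${eventId}`).digest("hex");
}

// Receiver-side dedup: returns true the first time a key is seen, false afterwards.
// An in-memory Set is only for illustration; a real receiver would persist keys.
function createDeduplicator() {
  const seen = new Set();
  return (key) => {
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  };
}
```

Because item replay and DLQ replay keep the same key, a receiver built this way would treat a replayed delivery as a duplicate unless it deliberately clears the stored key.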
@@ -14026,6 +15103,8 @@ var sections36 = [
     description: "Inspect and manage permanently failed deliveries in the dead letter queue. Replay individual items or discard them.",
     content: [
       { type: "paragraph", text: "Deliveries that exhaust all retry attempts are moved to the dead letter queue (DLQ). Items in the DLQ can be inspected, replayed (enqueues a fresh attempt), or deleted." },
+      { type: "paragraph", text: "The DLQ is the terminal state for failed deliveries. Common error codes include `connector_5xx` (destination returned a server error), `auth_failed` (invalid credentials), `ssrf_blocked` (destination URL resolves to a private network), and `payload_too_large` (payload exceeds the cap). Non-retryable errors (`auth_failed`, `ssrf_blocked`, `payload_too_large`) skip the retry ladder and go directly to the DLQ." },
+      { type: "callout", variant: "warning", text: "DLQ replay **deletes** the dead-letter row before enqueuing the new attempt. If the enqueue fails, the DLQ row is lost. This is an intentional trade-off: the system prefers losing a DLQ row over duplicating a delivery." },
       {
         type: "endpoint",
         method: "GET",
@@ -14158,9 +15237,11 @@ var sections36 = [
       { label: "List Destinations", slug: "list-delivery-destinations" }
     ],
     faq: [
-      { question: "When does a delivery end up in the DLQ?", answer: "A delivery enters the dead letter queue after exhausting all retry attempts defined in the binding delivery_policy (default 7 attempts with exponential backoff over ~10 hours)." }
+      { question: "When does a delivery end up in the DLQ?", answer: "A delivery enters the dead letter queue after exhausting all retry attempts defined in the binding delivery_policy (default 7 attempts with exponential backoff over ~10 hours)." },
+      { question: "What error codes indicate non-retryable failures?", answer: "auth_failed (invalid credentials), ssrf_blocked (destination URL resolves to private network), and payload_too_large (exceeds payload cap) skip the retry ladder and go directly to the DLQ after a single attempt." },
+      { question: "Can I set up alerts for DLQ entries?", answer: "Yes. Create a binding with signal_filter event_type: delivery.item.failed targeting your alerting webhook. The delivery system emits meta-signals for failed deliveries, with built-in loop prevention to avoid infinite cascades." }
     ],
-    mentions: ["dead letter queue", "DLQ", "failed delivery"]
+    mentions: ["dead letter queue", "DLQ", "failed delivery", "error codes"]
   },
   {
     slug: "delivery-catalog",
@@ -14170,6 +15251,7 @@ var sections36 = [
     description: "Discover available signals, deliverable types, serializer formats, and connector types from the delivery registry. Use the catalog to build valid bindings.",
     content: [
       { type: "paragraph", text: "The catalog endpoints expose the four delivery registries. Use them to discover which event types, deliverable resolvers, serializer formats, and connectors are available before creating bindings." },
+      { type: "paragraph", text: "Walk the catalog **top-down** to build valid binding configurations: start with signals to pick an event type, then check which deliverables are compatible with that signal, which serializers support the deliverable shape, and which connectors accept the serializer format. This ensures every combination passes the compatibility triangle." },
       {
         type: "endpoint",
         method: "GET",
@@ -14362,9 +15444,11 @@ var sections36 = [
       { label: "List Destinations", slug: "list-delivery-destinations" }
     ],
     faq: [
-      { question: "How do I know which combinations are valid for a binding?", answer: "Use the catalog endpoints to check compatibility. A valid binding requires: the signal event_type exists, the deliverable type lists that signal in compatible_signals, the serializer supports the deliverable shape, and the connector supports the serializer format." }
+      { question: "How do I know which combinations are valid for a binding?", answer: "Use the catalog endpoints to check compatibility. A valid binding requires: the signal event_type exists, the deliverable type lists that signal in compatible_signals, the serializer supports the deliverable shape, and the connector supports the serializer format." },
+      { question: "What is the difference between record and file delivery semantics?", answer: "Record semantics (webhook) deliver one event per HTTP request. File semantics (S3, SFTP, Azure Blob, Google Drive, OneDrive) write each delivery as a separate file/object, using a configurable filename template with tokens like {event_id} and {timestamp_iso}." },
+      { question: "Are all catalog entries available for use?", answer: "Most entries are live. Some deliverable types (case_snapshot, graph_relations) are registered as stubs with empty compatible_signals arrays \u2014 they appear in the catalog but cannot be used in bindings until their resolvers are implemented." }
     ],
-    mentions: ["delivery catalog", "registry discovery"]
+    mentions: ["delivery catalog", "registry discovery", "compatibility"]
   }
 ];
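Reviewer sketch for the catalog walk and compatibility triangle described in the delivery-catalog hunks: a client-side pre-check that follows the same top-down order (signal, deliverable, serializer, connector). Only `event_type`, `compatible_signals`, `document.extracted`, and `case_snapshot` come from the diff; the catalog shape, the other field names (`shape`, `supported_shapes`, `supported_formats`), and the `extraction_result` type are hypothetical.

```javascript
// Illustrative catalog data, not real API output.
const catalog = {
  signals: [{ event_type: "document.extracted" }],
  deliverables: [
    { type: "extraction_result", shape: "record", compatible_signals: ["document.extracted"] },
    { type: "case_snapshot", shape: "graph", compatible_signals: [] }, // stub entry
  ],
  serializers: [
    { format: "json", supported_shapes: ["record", "graph"] },
    { format: "csv", supported_shapes: ["record"] },
  ],
  connectors: [
    { type: "webhook", supported_formats: ["json"] },
    { type: "s3", supported_formats: ["json", "csv"] },
  ],
};

// Returns a list of problems; an empty list means the combination looks valid.
function checkBinding(cat, binding) {
  const signal = cat.signals.find((s) => s.event_type === binding.eventType);
  const deliverable = cat.deliverables.find((d) => d.type === binding.deliverableType);
  const serializer = cat.serializers.find((s) => s.format === binding.serializerFormat);
  const connector = cat.connectors.find((c) => c.type === binding.connectorType);

  const errors = [];
  if (!signal) errors.push("unknown signal");
  if (!deliverable) errors.push("unknown deliverable");
  if (!serializer) errors.push("unknown serializer");
  if (!connector) errors.push("unknown connector");
  if (errors.length > 0) return errors;

  if (!deliverable.compatible_signals.includes(signal.event_type)) {
    errors.push("deliverable does not accept this signal");
  }
  if (!serializer.supported_shapes.includes(deliverable.shape)) {
    errors.push("serializer does not support the deliverable shape");
  }
  if (!connector.supported_formats.includes(serializer.format)) {
    errors.push("connector does not support the serializer format");
  }
  return errors;
}
```

A stub deliverable with an empty `compatible_signals` array fails the second check for every signal, which matches the FAQ note that stubs appear in the catalog but cannot be used in bindings.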
@@ -14378,6 +15462,17 @@ var sections37 = [
     description: "List all extraction batches with status, item counts, and provider information. Batches group documents submitted for deferred extraction at 50% cost.",
     content: [
       { type: "paragraph", text: "Batch inference defers Claude extraction to the provider batch API at **50% cost**. Documents uploaded with `processing_mode=batch` run OCR and classification immediately but queue extraction. Batches accumulate items, submit to the provider on a timer or threshold, and poll for results." },
+      { type: "paragraph", text: "Talonic supports two batch providers: the **Anthropic Message Batches API** (direct) and **AWS Bedrock** (`CreateModelInvocationJob`). The provider is auto-detected from your workspace configuration. Results are typically delivered within 24 hours, with a maximum SLA of 48 hours." },
+      {
+        type: "list",
+        items: [
+          "Documents run Stage 1 (OCR + classification + triage) immediately on upload.",
+          "Stage 2 (Claude extraction) is deferred to the batch API at 50% of standard cost.",
+          "Batches require a minimum of 100 items (Bedrock requirement). Smaller uploads fall back to realtime extraction.",
+          "Results are polled hourly. Use the sync endpoint to check sooner."
+        ]
+      },
+      { type: "callout", variant: "info", text: "Image-only documents (no text content) cannot enter the batch pipeline because the text-only batch payload cannot carry image bytes. These are automatically routed to realtime extraction." },
       {
         type: "endpoint",
         method: "GET",
@@ -14467,7 +15562,8 @@ var sections37 = [
     ],
     faq: [
       { question: "What are the batch statuses?", answer: "Batches progress through: accumulating (collecting items), submitted (sent to provider), completed (all results received), failed (provider error), or cancelled (manually cancelled)." },
-      { question: "How much does batch processing cost?", answer: "Batch processing runs at 50% of the standard extraction cost. Results are delivered within 48 hours." }
+      { question: "How much does batch processing cost?", answer: "Batch processing runs at 50% of the standard extraction cost. Results are delivered within 48 hours." },
+      { question: "What is the minimum batch size?", answer: "Batches require a minimum of 100 items (a Bedrock requirement enforced for both providers). Uploads below this threshold fall back to realtime extraction with a warning." }
     ],
     mentions: ["batch inference", "extraction batches", "cost reduction"]
   },
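Reviewer sketch for the batch routing rules added in this hunk: image-only documents go to realtime extraction, and an upload with fewer than 100 batchable items falls back to realtime entirely. Whether the 100-item minimum applies per upload or per accumulated batch is not fully clear from the diff, so this treats it per upload; `hasText`, `planExtraction`, and the return shape are hypothetical.

```javascript
// Minimum batch size stated in the docs (a Bedrock requirement).
const MIN_BATCH_ITEMS = 100;

// Split an upload into documents for deferred batch extraction and realtime extraction.
function planExtraction(documents) {
  const imageOnly = documents.filter((d) => !d.hasText);
  const batchable = documents.filter((d) => d.hasText);

  if (batchable.length < MIN_BATCH_ITEMS) {
    // Too few items for a batch: everything runs in realtime.
    return { batch: [], realtime: documents.slice(), fellBack: batchable.length > 0 };
  }
  return { batch: batchable, realtime: imageOnly, fellBack: false };
}
```

The `fellBack` flag stands in for the warning the FAQ mentions when an upload is below the threshold.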
@@ -14478,6 +15574,8 @@ var sections37 = [
     seoTitle: "Get Batch Detail \u2014 Talonic Docs",
     description: "Get detailed information for a single extraction batch including item counts, provider, status, and timing. Shows per-item breakdown when the batch is completed.",
     content: [
+      { type: "paragraph", text: "Retrieve the full batch record including per-item status. Poll this endpoint while `status` is `submitted` to track progress. Once `completed`, each item shows its individual outcome and processing timestamp." },
+      { type: "callout", variant: "info", text: "Items that fail extraction in the batch are retried via **realtime** extraction (never re-batched) to preserve the original 48-hour SLA. Check `items[].status` for per-document outcomes." },
       {
         type: "endpoint",
         method: "GET",
@@ -14561,8 +15659,11 @@ var sections37 = [
       { label: "List Batches", slug: "list-batches" },
       { label: "Sync Batch", slug: "sync-batch" }
     ],
-    faq: [],
-    mentions: ["batch detail", "provider metadata"]
+    faq: [
+      { question: "What happens to failed batch items?", answer: "Items that fail during batch processing are retried via realtime extraction (not re-batched). The document status transitions from batch_queued to the standard extraction pipeline." },
+      { question: "How long do batch results take?", answer: "Results are typically delivered within 24 hours. The maximum SLA is 48 hours. Use the sync endpoint to poll the provider for results before the automatic hourly check." }
+    ],
+    mentions: ["batch detail", "provider metadata", "per-item status"]
   },
   {
     slug: "sync-batch",
@@ -14571,6 +15672,7 @@ var sections37 = [
     seoTitle: "Sync Batch with Provider \u2014 Talonic Docs",
     description: "Force a sync with the provider to check for batch results. Useful when you do not want to wait for the hourly automatic poll.",
     content: [
+      { type: "paragraph", text: "Force an immediate check with the batch provider (Anthropic or Bedrock) for results. By default, batches are polled automatically every hour. Use this endpoint when you need results sooner or want to verify the current provider-side status." },
       {
         type: "endpoint",
         method: "POST",
@@ -14635,7 +15737,8 @@ var sections37 = [
       { label: "Cancel Batch", slug: "cancel-batch" }
     ],
     faq: [
-      { question: "How often are batches polled automatically?", answer: "The system polls the provider every hour for results. Use the sync endpoint to trigger an immediate check." }
+      { question: "How often are batches polled automatically?", answer: "The system polls the provider every hour for results. Use the sync endpoint to trigger an immediate check." },
+      { question: "Can I sync a batch that is still accumulating?", answer: "Yes, but it will have no effect since the batch has not been submitted to the provider yet. The status will remain accumulating." }
     ],
     mentions: ["batch sync", "provider poll"]
   },
@@ -14646,6 +15749,7 @@ var sections37 = [
     seoTitle: "Cancel Extraction Batch \u2014 Talonic Docs",
     description: "Cancel an in-progress extraction batch. Only batches in accumulating or submitted status can be cancelled. Completed batches cannot be rolled back.",
     content: [
+      { type: "paragraph", text: "Cancel a batch that is still `accumulating` or `submitted`. Cancellation sends a stop request to the provider if the batch was already submitted. Documents in the cancelled batch revert to `batch_queued` status and can be resubmitted or processed via realtime extraction." },
       {
         type: "endpoint",
         method: "POST",
@@ -14710,7 +15814,10 @@ var sections37 = [
       { label: "List Batches", slug: "list-batches" },
       { label: "Batch Detail", slug: "get-batch" }
     ],
-    faq: [],
+    faq: [
+      { question: "Can I cancel a completed batch?", answer: "No. Only batches in accumulating or submitted status can be cancelled. Completed, failed, and already-cancelled batches return a 400 error." },
+      { question: "What happens to documents after cancellation?", answer: "Documents revert to batch_queued status. You can resubmit them in a new batch or process them via realtime extraction by updating their processing_mode." }
+    ],
     mentions: ["batch cancellation"]
   }
 ];
@@ -14725,6 +15832,15 @@ var sections38 = [
     description: "List cases with optional status and search filters. Supports pagination for large result sets. Cases group related documents via shared entities.",
     content: [
       { type: "paragraph", text: "Cases group documents that share entities discovered through linking. Use filters to narrow results by status or search term." },
+      { type: "paragraph", text: "A **case** is a cluster of documents connected by shared field values in the linking graph. The linking engine discovers entity overlaps \u2014 such as matching vendor IDs, policy numbers, or account references \u2014 and groups the connected documents into a single case for review." },
+      {
+        type: "list",
+        items: [
+          "Filter by `search` to find cases by label or document content.",
+          "Use `min_documents` to exclude trivial single-document cases.",
+          "Cases are created automatically \u2014 there is no manual create endpoint."
+        ]
+      },
       {
         type: "endpoint",
         method: "GET",
@@ -14788,7 +15904,9 @@ var sections38 = [
       { label: "Case Anomalies", slug: "case-anomalies" }
     ],
     faq: [
-      { question: "How are cases created?", answer: "Cases are created automatically when the linking engine discovers shared entities across two or more documents." }
+      { question: "How are cases created?", answer: "Cases are created automatically when the linking engine discovers shared entities across two or more documents." },
+      { question: "Can I create a case manually?", answer: "No. Cases are discovered automatically through the linking graph. You can manually adjust case membership using the merge, split, pin, and remove document endpoints." },
+      { question: "What is a case key?", answer: "A case key is a hex hash (8-64 characters) that uniquely identifies a case within your workspace. It is derived from the linked entity values that form the case." }
     ],
     mentions: ["cases", "linking", "documents"]
   },
@@ -14799,6 +15917,8 @@ var sections38 = [
     seoTitle: "Get Case Endpoint \u2014 Talonic Docs",
     description: "Retrieve a case by its key (e.g. CASE-001) including linked documents, shared entities, AI-generated narration, label, and anomaly count.",
     content: [
+      { type: "paragraph", text: "Retrieve the full detail of a case including its documents, AI-generated narrative summary, and anomaly count. The narrative is generated by Claude and summarizes the relationships between documents in the case." },
+      { type: "callout", variant: "info", text: "The `narrative` field is generated on demand via `POST /v1/cases/:key/narrate`. It will be `null` until narration is triggered for this case." },
       {
         type: "endpoint",
         method: "GET",
@@ -14881,7 +16001,10 @@ var sections38 = [
       { label: "Case Operations", slug: "case-operations" },
       { label: "Case Evidence", slug: "case-evidence" }
     ],
-    faq: [],
+    faq: [
+      { question: "What is the case narrative?", answer: "The narrative is an AI-generated summary produced by Claude that describes the relationships, patterns, and key facts across the documents in the case. It is generated on demand via POST /v1/cases/:key/narrate." },
+      { question: "Why is the narrative field null?", answer: "Narration is generated on demand. Call POST /v1/cases/:key/narrate to generate a narrative, then retrieve it with this endpoint." }
+    ],
     mentions: ["case detail", "narration", "shared entities"]
   },
   {
@@ -14892,6 +16015,17 @@ var sections38 = [
     description: "List anomalies detected within a case with severity filtering and dismissed toggle. Dismiss individual anomalies after review.",
     content: [
       { type: "paragraph", text: "Anomalies are structural issues detected across the documents in a case \u2014 field conflicts, duplicate key divergence, value reuse patterns, and more." },
+      { type: "paragraph", text: "Five **structural detectors** run automatically when cases are materialized. They surface inconsistencies that may indicate data quality issues, duplicate submissions, or missing documents within a case." },
+      {
+        type: "list",
+        items: [
+          "**Validation cluster** \u2014 groups of fields that consistently fail validation together.",
+          "**Field conflict** \u2014 contradictory values for the same field across documents.",
+          "**Duplicate key divergence** \u2014 documents share a key but differ on other fields.",
+          "**Missing document type** \u2014 a case is missing an expected document type based on templates.",
+          "**Value reuse** \u2014 the same value appears in unrelated fields across documents."
+        ]
+      },
       {
         type: "endpoint",
         method: "GET",
@@ -14975,9 +16109,11 @@ var sections38 = [
       { label: "Case Evidence", slug: "case-evidence" }
     ],
     faq: [
-      { question: "What types of anomalies are detected?", answer: "Five structural detectors: validation cluster, field conflict, duplicate key divergence, missing document type, and value reuse." }
+      { question: "What types of anomalies are detected?", answer: "Five structural detectors: validation cluster, field conflict, duplicate key divergence, missing document type, and value reuse." },
+      { question: "Can I un-dismiss an anomaly?", answer: "Dismissal is a one-way operation. If a similar anomaly is detected again during case recomputation, it will appear as a new anomaly entry." },
+      { question: "How are anomaly severities assigned?", answer: "Severity is assigned by each detector based on the scope of the issue. For example, a field conflict across many documents is rated higher than a conflict in just two." }
     ],
-    mentions: ["anomalies", "severity", "dismiss"]
+    mentions: ["anomalies", "severity", "dismiss", "structural detectors"]
   },
   {
     slug: "case-evidence",
@@ -14986,6 +16122,8 @@ var sections38 = [
     seoTitle: "Case Evidence Endpoint \u2014 Talonic Docs",
     description: "List evidence items within a case. Filter by validation status, source document, category, or free-text search across evidence fields.",
     content: [
+      { type: "paragraph", text: "Evidence items are the extracted field values from documents in a case, annotated with validation status and confidence scores. Use evidence to audit the data quality within a case and understand which fields link documents together." },
+      { type: "callout", variant: "info", text: "Evidence is produced by the evidence validation engine, which runs rule-based validators (structural checks, checksum validation, domain packs) against extracted values. Each evidence item records the validation outcome for a specific field on a specific document." },
       {
         type: "endpoint",
         method: "GET",
@@ -15052,8 +16190,11 @@ var sections38 = [
       { label: "Get Case", slug: "get-case" },
       { label: "Case Anomalies", slug: "case-anomalies" }
     ],
-    faq: [],
-    mentions: ["evidence", "validation", "case evidence"]
+    faq: [
+      { question: "What is the difference between evidence and anomalies?", answer: "Evidence items are individual field values with validation status and confidence. Anomalies are higher-level structural issues detected across multiple evidence items, such as field conflicts or duplicate key divergence." },
+      { question: "How is the confidence score computed?", answer: "Confidence is assigned during extraction and reflects how certain the AI is about the extracted value. It ranges from 0 (low confidence) to 1 (high confidence) and is independent of the validation status." }
+    ],
+    mentions: ["evidence", "validation", "case evidence", "confidence"]
   },
   {
     slug: "case-operations",
@@ -15063,6 +16204,8 @@ var sections38 = [
     description: "Update the status of a case. These operations modify case metadata.",
     content: [
       { type: "paragraph", text: "Operations that modify a case: update the case status (lifecycle management)." },
+      { type: "paragraph", text: "Cases follow a lifecycle: `discovered` &rarr; `confirmed` &rarr; `active` &rarr; `resolved`. Transition the status as your team reviews and processes the case. Resolved cases can include optional notes documenting the resolution." },
+      { type: "callout", variant: "info", text: "Status transitions are not strictly enforced \u2014 you can move a case to any valid status. However, setting `resolved` records a `resolved_at` timestamp that cannot be cleared by reverting to an earlier status." },
       {
         type: "endpoint",
         method: "PATCH",
@@ -15116,8 +16259,11 @@ var sections38 = [
       { label: "Get Case", slug: "get-case" },
       { label: "Case Merge & Split", slug: "case-merge-split" }
     ],
-    faq: [],
-    mentions: ["status", "case operations"]
+    faq: [
+      { question: "What are the valid case statuses?", answer: "Four lifecycle statuses: discovered (initial), confirmed (verified as real), active (being worked), and resolved (complete). The typical flow is discovered -> confirmed -> active -> resolved." },
+      { question: "Can I reopen a resolved case?", answer: "Yes, you can set the status back to active or confirmed. However, the resolved_at timestamp will remain set from the original resolution." }
+    ],
+    mentions: ["status", "case operations", "lifecycle"]
   },
   {
     slug: "case-edges",
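The case-operations text above describes four statuses and a `resolved_at` timestamp that survives reopening. A minimal local model of that behaviour is sketched below; `applyStatus` and the record shape are hypothetical, and overwriting `resolved_at` when a case is resolved a second time is an assumption the source does not confirm.

```javascript
// Status names are taken from the FAQ; everything else here is illustrative.
const CASE_STATUSES = ["discovered", "confirmed", "active", "resolved"];

// Models the documented behaviour locally. The real update happens server-side.
function applyStatus(caseRecord, status, now = new Date()) {
  if (!CASE_STATUSES.includes(status)) {
    throw new RangeError(`invalid status: ${status}`);
  }
  const next = { ...caseRecord, status };
  if (status === "resolved") {
    // Assumption: resolving again refreshes the timestamp.
    next.resolved_at = now.toISOString();
  }
  // Moving to any other status leaves an existing resolved_at untouched.
  return next;
}
```

Reopening a case with `applyStatus(resolvedCase, "active")` keeps the earlier `resolved_at`, which mirrors the FAQ answer about reopened cases.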
@@ -15127,6 +16273,8 @@ var sections38 = [
     description: "List, confirm, or reject linking edges within a case. Edges represent shared-value connections between documents in the linking graph.",
     content: [
       { type: "paragraph", text: "Edges are the connections between documents in a case, created by the linking engine when documents share field values. Confirm or reject edges to refine case membership." },
+      { type: "paragraph", text: "Each edge connects two documents through a shared field value \u2014 for example, both documents might contain the same `vendor_id`. The `final_score` reflects how strong the connection is, based on learned field-pair weights. Confirming or rejecting edges feeds back into the weight-learning system to improve future case discovery." },
+      { type: "callout", variant: "warning", text: "Rejecting an edge may split the case if the rejected edge was the only connection between a subset of documents. The affected documents will form a separate case." },
       {
         type: "endpoint",
         method: "GET",
@@ -15225,9 +16373,10 @@ var sections38 = [
       { label: "Linking", slug: "linking-overview" }
     ],
     faq: [
-      { question: "What happens when I reject an edge?", answer: "The connection between the two documents is removed. If this disconnects a document from all others in the case, it may be removed from the case." }
+      { question: "What happens when I reject an edge?", answer: "The connection between the two documents is removed. If this disconnects a document from all others in the case, it may be removed from the case." },
+      { question: "Do edge confirmations and rejections affect future cases?", answer: "Yes. Confirmations and rejections update the learned field-pair weights (FieldBindingWeight), which adjusts edge scores in future case discovery runs. This creates a feedback loop that improves case quality over time." }
     ],
-    mentions: ["edges", "linking", "confirm", "reject"]
+    mentions: ["edges", "linking", "confirm", "reject", "field-pair weights"]
   },
   {
     slug: "case-documents",
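The warning in the case-edges hunk says that rejecting the only connection between two groups of documents splits the case. One way to picture this is as connected components of an undirected graph. The sketch below is illustrative only and is not Talonic's linking engine; `componentsAfterReject` is a hypothetical helper, and it treats an edge as a plain pair of document IDs.

```javascript
// Recompute connected components after one document pair is rejected.
// Simplification: every edge between the rejected pair is dropped.
function componentsAfterReject(documentIds, edges, rejected) {
  const adjacency = new Map(documentIds.map((id) => [id, []]));
  for (const [a, b] of edges) {
    const isRejected =
      (a === rejected[0] && b === rejected[1]) ||
      (a === rejected[1] && b === rejected[0]);
    if (isRejected) continue;
    adjacency.get(a).push(b);
    adjacency.get(b).push(a);
  }

  const seen = new Set();
  const components = [];
  for (const start of documentIds) {
    if (seen.has(start)) continue;
    // Depth-first walk collecting everything reachable from `start`.
    const component = [];
    const stack = [start];
    seen.add(start);
    while (stack.length > 0) {
      const current = stack.pop();
      component.push(current);
      for (const next of adjacency.get(current)) {
        if (!seen.has(next)) {
          seen.add(next);
          stack.push(next);
        }
      }
    }
    components.push(component);
  }
  return components;
}
```

If the function returns more than one component, the rejection would leave some documents with no remaining path to the rest, which is the situation the warning describes.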
@@ -15236,6 +16385,8 @@ var sections38 = [
     seoTitle: "Case Documents Endpoints \u2014 Talonic Docs",
     description: "Pin or remove documents within a case. Pinned documents are highlighted in the case view and preserved during case operations.",
     content: [
+      { type: "paragraph", text: "Manage document membership within a case. **Pin** a document to mark it as important \u2014 pinned documents are highlighted in the UI and preserved during split operations. **Remove** a document to detach it from the case entirely." },
+      { type: "callout", variant: "info", text: "Removing a document from a case does not delete the document itself. The document remains in your workspace and may be re-linked into a case during the next recompute cycle if linking edges still exist." },
       {
         type: "endpoint",
         method: "POST",
@@ -15292,7 +16443,10 @@ var sections38 = [
       { label: "Get Case", slug: "get-case" },
       { label: "Case Edges", slug: "case-edges" }
     ],
-    faq: [],
+    faq: [
+      { question: "Does removing a document delete it?", answer: "No. Removing a document only detaches it from the case. The document remains in your workspace and can still appear in other cases or be re-linked in a future recompute." },
+      { question: "What does pinning a document do?", answer: "Pinning highlights the document in the case view and ensures it is preserved in the original partition during split operations. It does not affect linking or anomaly detection." }
+    ],
     mentions: ["pin", "documents", "case documents"]
   },
   {
@@ -15303,6 +16457,8 @@ var sections38 = [
     description: "Split a case into two cases or merge two cases into one. Restructure case boundaries when automatic grouping needs adjustment.",
     content: [
       { type: "paragraph", text: "When the automatic case grouping does not match your needs, split a case into two partitions or merge two cases together." },
+      { type: "paragraph", text: "These operations let you manually adjust case boundaries. **Splitting** divides a case by assigning each document to one of two partitions. **Merging** combines all documents from two cases into a single case. Both operations preserve linking edges and re-evaluate anomalies on the resulting cases." },
+      { type: "callout", variant: "warning", text: "When merging, `case_key_b` is resolved (set to `resolved` status) and its documents are moved into `case_key_a`. The merged case key is derived from case A." },
       {
         type: "endpoint",
         method: "POST",
@@ -15386,7 +16542,9 @@ var sections38 = [
       { label: "Case Operations", slug: "case-operations" }
     ],
     faq: [
-      { question: "What happens to anomalies when cases are merged?", answer: "Anomalies from all source cases are carried over to the merged case and re-evaluated." }
+      { question: "What happens to anomalies when cases are merged?", answer: "Anomalies from all source cases are carried over to the merged case and re-evaluated." },
+      { question: "Do I need to include every document when splitting?", answer: "Yes. Every document in the case must appear in exactly one of partition_a or partition_b. Omitting a document or including it in both partitions will return a 400 error." },
+      { question: "Can I merge more than two cases at once?", answer: "No. The merge endpoint accepts exactly two case keys. To merge multiple cases, chain merge calls \u2014 merge A and B first, then merge the result with C." }
     ],
     mentions: ["merge", "split", "case restructuring"]
   }
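The split FAQ above states that every document must appear in exactly one of `partition_a` or `partition_b`, otherwise the request returns a 400 error. A client-side pre-check can catch this before the call is made. This is a minimal sketch; `validatePartitions` is a hypothetical helper name, and the server remains the source of truth.

```javascript
// Returns a list of human-readable problems; an empty list means the split
// satisfies the "exactly one partition" rule from the FAQ.
function validatePartitions(caseDocumentIds, partitionA, partitionB) {
  const a = new Set(partitionA);
  const b = new Set(partitionB);
  const errors = [];
  for (const id of caseDocumentIds) {
    const inA = a.has(id);
    const inB = b.has(id);
    if (inA && inB) {
      errors.push(`${id} appears in both partitions`);
    } else if (!inA && !inB) {
      errors.push(`${id} is missing from both partitions`);
    }
  }
  return errors;
}
```

Calling it with the full list of document IDs in the case and the two proposed partitions gives an early, local version of the check the FAQ describes.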
@@ -15402,6 +16560,11 @@ var sections39 = [
     description: "List pending review items in the review queue with pagination. Review items are validation records awaiting human approval, rejection, or flagging.",
     content: [
       { type: "paragraph", text: "The review queue surfaces validation records that require human judgment before delivery. Items appear when extraction confidence is below the auto-approval threshold or when anomalies are detected." },
+      { type: "list", ordered: false, items: [
+        "Filter by `status` to see only `pending`, `approved`, or `rejected` records",
+        "Use cursor-based pagination to iterate through large queues",
+        "Sort by `created_at` in ascending or descending order"
+      ] },
       {
         type: "endpoint",
         method: "GET",
@@ -15488,9 +16651,11 @@ var sections39 = [
       { label: "Get Review Item", slug: "get-review-item" }
     ],
     faq: [
-      { question: "When do items appear in the review queue?", answer: "Items are queued when extraction confidence falls below the auto-approval threshold or when anomalies require human verification." }
+      { question: "When do items appear in the review queue?", answer: "Items are queued when extraction confidence falls below the auto-approval threshold or when anomalies require human verification." },
+      { question: "How do I paginate through all review items?", answer: "Pass the `next_cursor` value from the response as the `cursor` query parameter in your next request. Continue until `has_more` is false." },
+      { question: "Can I filter review items by document or schema?", answer: "The list endpoint supports filtering by `status`. To find review items for a specific document, retrieve all pending items and filter client-side by `document_id`." }
     ],
-    mentions: ["review", "queue", "validation"]
+    mentions: ["review", "queue", "validation", "pagination"]
   },
   {
     slug: "review-stats",
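The pagination FAQ in the hunk above describes passing `next_cursor` back as `cursor` until `has_more` is false, and filtering by `document_id` on the client. A hedged sketch of that loop follows. `fetchPage` is a hypothetical function you supply that performs the actual HTTP request, and the `data` property on each page is an assumed field name.

```javascript
// `fetchPage(cursor)` is caller-supplied: it receives undefined for the first
// page and resolves to { data, next_cursor, has_more }.
async function listAllReviewItems(fetchPage) {
  const items = [];
  let cursor;
  let hasMore = true;
  while (hasMore) {
    const page = await fetchPage(cursor);
    items.push(...page.data);
    hasMore = page.has_more === true;
    cursor = page.next_cursor;
  }
  return items;
}

// Client-side filter, as the FAQ suggests for per-document lookups.
function filterByDocument(items, documentId) {
  return items.filter((item) => item.document_id === documentId);
}
```

Keeping the HTTP call behind `fetchPage` means the loop makes no assumptions about URLs, headers, or authentication.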
@@ -15499,6 +16664,7 @@ var sections39 = [
     seoTitle: "Review Queue Statistics \u2014 Talonic Docs",
     description: "Get statistics for the review queue including total pending items, items by status, and average time in queue. Useful for monitoring review backlog.",
     content: [
+      { type: "paragraph", text: "Get a summary of the review queue broken down by status. Use this endpoint to monitor backlog size, track review throughput, and trigger alerts when pending items exceed a threshold." },
       {
         type: "endpoint",
         method: "GET",
@@ -15541,8 +16707,11 @@ var sections39 = [
       { label: "List Review Items", slug: "list-review-items" },
       { label: "Review Batch", slug: "review-batch" }
     ],
-    faq: [],
-    mentions: ["review statistics", "queue metrics"]
+    faq: [
+      { question: "Does the stats endpoint count all-time or only active items?", answer: "It counts all review records across all statuses, including already-approved and rejected items. Use the `by_status.pending` value to see only the active backlog." },
+      { question: "How often should I poll review stats?", answer: "Stats are computed on each request. For dashboard polling, an interval of 30-60 seconds is reasonable. For high-throughput pipelines, consider using webhooks for real-time notifications." }
+    ],
+    mentions: ["review statistics", "queue metrics", "backlog monitoring"]
   },
   {
     slug: "get-review-item",
@@ -15551,6 +16720,8 @@ var sections39 = [
     seoTitle: "Get Review Item Endpoint \u2014 Talonic Docs",
     description: "Retrieve a single review item by ID with full detail including extracted data, confidence scores, and validation flags for human review.",
     content: [
+      { type: "paragraph", text: "Retrieve full details for a single review item, including per-field review decisions, low-confidence fields that triggered the review, and any reviewer comments. This endpoint provides the data needed to build custom review interfaces." },
+      { type: "callout", variant: "info", text: "The `low_confidence_fields` array lists the specific field keys that fell below the confidence threshold and caused this item to be queued for review. Use this to highlight problematic fields in your review UI." },
       {
         type: "endpoint",
         method: "GET",
@@ -15617,8 +16788,11 @@ var sections39 = [
       { label: "Review Action", slug: "review-action" },
       { label: "Review Assign", slug: "review-assign" }
     ],
-    faq: [],
-    mentions: ["review detail", "confidence", "flagged fields"]
+    faq: [
+      { question: "What are low_confidence_fields?", answer: "An array of field keys where the extraction confidence fell below the auto-approval threshold. These are the fields that caused the record to be queued for human review." },
+      { question: "Can I see the extracted data for a review item?", answer: "The review item includes the document_id and run_id. Use these to retrieve the full extraction output from the extraction or structuring endpoints." }
+    ],
+    mentions: ["review detail", "confidence", "flagged fields", "field decisions"]
   },
   {
     slug: "review-action",
@@ -15627,6 +16801,8 @@ var sections39 = [
     seoTitle: "Review Action Endpoint \u2014 Talonic Docs",
     description: "Take an action on a review item: approve, reject, or flag. Optionally include a comment explaining the decision for audit purposes.",
     content: [
+      { type: "paragraph", text: "Approve or reject a review item to advance it through the delivery pipeline. Approved items proceed to delivery. Rejected items are returned for re-extraction or manual correction. Include an optional `reason` for audit trail purposes." },
+      { type: "callout", variant: "warning", text: "Review actions are final. Once a record is approved or rejected, it cannot be reverted to pending status through the API." },
       {
         type: "endpoint",
         method: "POST",
@@ -15697,9 +16873,10 @@ var sections39 = [
       { label: "Review Batch", slug: "review-batch" }
     ],
     faq: [
-      { question: "What happens after approval?", answer: "Approved items proceed to delivery. Rejected items are returned for re-extraction or manual correction." }
+      { question: "What happens after approval?", answer: "Approved items proceed to delivery. Rejected items are returned for re-extraction or manual correction." },
+      { question: "Is the reason field stored for audit purposes?", answer: "Yes. The reason is stored as the review_comment on the record and is visible when retrieving the review item detail." }
     ],
-    mentions: ["approve", "reject", "flag", "review action"]
+    mentions: ["approve", "reject", "flag", "review action", "audit trail"]
   },
   {
     slug: "review-batch",
@@ -15708,6 +16885,8 @@ var sections39 = [
     seoTitle: "Batch Review Endpoint \u2014 Talonic Docs",
     description: "Batch approve or reject multiple review items in a single request. Useful for clearing the review queue when items share similar characteristics.",
     content: [
+      { type: "paragraph", text: "Process multiple review items in a single API call. This is useful for clearing backlogs when you have high-confidence items that can be bulk-approved, or when rejecting a batch of items from a failed extraction run." },
+      { type: "callout", variant: "info", text: "The batch endpoint processes items independently. If some items fail (e.g. not found), the remaining items are still processed. Check the `results` array for per-item outcomes." },
       {
         type: "endpoint",
         method: "POST",
@@ -15763,8 +16942,11 @@ var sections39 = [
       { label: "Review Action", slug: "review-action" },
       { label: "Review Stats", slug: "review-stats" }
     ],
-    faq: [],
-    mentions: ["batch", "bulk review"]
+    faq: [
+      { question: "Is there a limit on how many items I can batch?", answer: "There is no hard limit on array size, but very large batches may take longer to process. For best performance, batch in groups of 50-100 items." },
+      { question: "What happens if some items in the batch are already approved?", answer: "Already-actioned items are skipped and reported with an error status in the results array. The remaining items are still processed." }
+    ],
+    mentions: ["batch", "bulk review", "batch processing"]
   },
   {
     slug: "review-assign",
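The batch FAQ above recommends groups of 50-100 items and notes that skipped or failed items are reported with an error status in `results`. The sketch below shows one way to prepare groups and pick out failures. `chunk` and `failedItems` are hypothetical helpers, and the `status: "error"` shape of a result entry is an assumption.

```javascript
// Split a list of review item IDs into groups of at most `size`.
function chunk(ids, size = 50) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const groups = [];
  for (let i = 0; i < ids.length; i += size) {
    groups.push(ids.slice(i, i + size));
  }
  return groups;
}

// Pick the per-item outcomes that did not succeed (assumed result shape).
function failedItems(results) {
  return results.filter((result) => result.status === "error");
}
```

Each group returned by `chunk` can then be sent as one batch request, with `failedItems` applied to the `results` array of every response.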
@@ -15773,6 +16955,7 @@ var sections39 = [
     seoTitle: "Assign Review Item Endpoint \u2014 Talonic Docs",
     description: "Assign a review item to a team member for review. Assigned items appear in the assignee's personal review queue.",
     content: [
+      { type: "paragraph", text: "Assign a pending review item to a specific team member. Assignments help distribute review workload and track who is responsible for each item. Pass `null` as the `user_id` to unassign an item." },
       {
         type: "endpoint",
         method: "POST",
@@ -15841,8 +17024,11 @@ var sections39 = [
       { label: "Get Review Item", slug: "get-review-item" },
       { label: "Review Action", slug: "review-action" }
     ],
-    faq: [],
-    mentions: ["assign", "team", "review assignment"]
+    faq: [
+      { question: "Can I assign an already-reviewed item?", answer: "Assignment is only meaningful for pending items. You can technically assign a reviewed item, but it has no effect on the review workflow since the item has already been actioned." },
+      { question: "How do I unassign a review item?", answer: "Pass `null` as the `user_id` in the request body. The `assigned_to` field will be set to null." }
+    ],
+    mentions: ["assign", "team", "review assignment", "workload distribution"]
   }
 ];
@@ -15856,6 +17042,11 @@ var sections40 = [
     description: "List all ground truth datasets used for benchmarking extraction accuracy. Each dataset contains manually verified entries that serve as the gold standard.",
     content: [
       { type: "paragraph", text: "Ground truth datasets contain manually verified data entries that serve as the gold standard for measuring extraction accuracy. Create datasets, add entries, then run benchmarks against extraction results." },
+      { type: "list", ordered: false, items: [
+        "Each dataset contains verified entries mapping documents to expected field values",
+        "Datasets can be scoped to a specific user schema via `user_schema_id`",
+        "Use datasets as inputs to benchmark runs for per-field accuracy measurement"
+      ] },
       {
         type: "endpoint",
         method: "GET",
@@ -15931,8 +17122,11 @@ var sections40 = [
       { label: "Create Dataset", slug: "create-quality-dataset" },
       { label: "List Benchmarks", slug: "list-benchmarks" }
     ],
-    faq: [],
-    mentions: ["ground truth", "quality", "benchmarking"]
+    faq: [
+      { question: "How many ground truth datasets can I create?", answer: "There is no hard limit on the number of datasets. Create separate datasets for different document types or schema versions to track accuracy independently." },
+      { question: "What is the recommended number of entries per dataset?", answer: "For statistically meaningful accuracy scores, aim for at least 30-50 entries per dataset. Smaller datasets may produce volatile accuracy metrics." }
+    ],
+    mentions: ["ground truth", "quality", "benchmarking", "datasets"]
   },
   {
     slug: "create-quality-dataset",
@@ -15941,6 +17135,7 @@ var sections40 = [
     seoTitle: "Create Ground Truth Dataset \u2014 Talonic Docs",
     description: "Create a new ground truth dataset linked to a schema. The dataset defines the expected extraction output used for accuracy benchmarking.",
     content: [
+      { type: "paragraph", text: "Create an empty ground truth dataset that you can populate with verified entries. Datasets serve as the baseline for benchmark runs that measure extraction accuracy. After creating a dataset, add entries individually or import them in bulk via CSV." },
       {
         type: "endpoint",
         method: "POST",
@@ -16001,7 +17196,10 @@ var sections40 = [
       { label: "List Datasets", slug: "list-quality-datasets" },
       { label: "Quality Entries", slug: "quality-entries" }
     ],
-    faq: [],
+    faq: [
+      { question: "Do I need to link a dataset to a schema?", answer: "No. The user_schema_id is optional. However, linking to a schema ensures that your ground truth entries use the correct field names and makes benchmark results more meaningful." },
+      { question: "Can I rename a dataset after creation?", answer: "Dataset metadata (name, description) is set at creation time. To change it, delete the dataset and create a new one with the desired name." }
+    ],
     mentions: ["create ground truth", "dataset"]
   },
   {
@@ -16011,6 +17209,8 @@ var sections40 = [
     seoTitle: "Get or Delete Ground Truth Dataset \u2014 Talonic Docs",
     description: "Retrieve a ground truth dataset by ID with metadata and entry count, or delete it permanently. Deleting a dataset does not remove associated benchmark results.",
     content: [
+      { type: "paragraph", text: "Retrieve a dataset with its metadata and sample entries, or delete it permanently. The GET response includes a `samples` array with the actual ground truth entries, allowing you to inspect the expected values for each document." },
+      { type: "callout", variant: "warning", text: "Deleting a dataset is permanent. However, benchmark results that used this dataset are retained for historical reference. The benchmark will still show the `dataset_id`, but the dataset itself will no longer be retrievable." },
       {
         type: "endpoint",
         method: "GET",
@@ -16096,8 +17296,11 @@ var sections40 = [
       { label: "List Datasets", slug: "list-quality-datasets" },
       { label: "Quality Entries", slug: "quality-entries" }
     ],
-    faq: [],
-    mentions: ["ground truth detail", "delete dataset"]
+    faq: [
+      { question: "Are benchmark results deleted when I delete a dataset?", answer: "No. Benchmark results are retained for historical reference even after the source dataset is deleted." },
+      { question: "Does the GET response include all entries?", answer: "Yes. The `samples` array contains all ground truth entries in the dataset. For very large datasets, this response may be sizable." }
+    ],
+    mentions: ["ground truth detail", "delete dataset", "samples"]
   },
   {
     slug: "quality-entries",
@@ -16107,6 +17310,7 @@ var sections40 = [
     description: "List, add, import, or delete entries in a ground truth dataset. Entries represent individual verified data points used for benchmarking extraction accuracy.",
     content: [
       { type: "paragraph", text: "Entries are the individual verified data points within a ground truth dataset. Add them manually, or import in bulk via CSV." },
+      { type: "callout", variant: "info", text: "Each entry maps a `document_id` to an `expected_data` object containing the verified field values. Field keys in `expected_data` should match the field names used in your extraction schema for accurate benchmark comparisons." },
       {
         type: "endpoint",
         method: "GET",
@@ -16190,6 +17394,7 @@ var sections40 = [
   "created_at": "2024-09-05T12:00:00.000Z"
 }`
       },
+      { type: "paragraph", text: "For bulk entry creation, use the CSV import endpoint. The CSV must have `document_id` as the first column, with remaining columns matching your schema field names." },
       {
         type: "endpoint",
         method: "POST",
@@ -16256,9 +17461,11 @@ var sections40 = [
       { label: "Create Benchmark", slug: "create-benchmark" }
     ],
     faq: [
-      { question: "What CSV format is expected for import?", answer: "The first column must be document_id. Remaining columns should match the schema field names. Header row is required." }
+      { question: "What CSV format is expected for import?", answer: "The first column must be document_id. Remaining columns should match the schema field names. Header row is required." },
+      { question: "Can I add multiple entries for the same document?", answer: "Each document should have one entry per dataset. Duplicate document_id rows in CSV imports are skipped and counted in the `skipped` total." },
+      { question: "What happens if expected_data field names do not match the schema?", answer: "Unmatched fields are stored but ignored during benchmark comparison. Only fields present in both the ground truth entry and the extraction output are compared." }
     ],
-    mentions: ["entries", "ground truth values", "CSV import"]
+    mentions: ["entries", "ground truth values", "CSV import", "bulk import"]
   },
   {
     slug: "list-benchmarks",
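The CSV FAQ above requires a header row with `document_id` as the first column, followed by columns named after schema fields. A minimal sketch of building such a file from in-memory entries is shown below. `buildImportCsv` and `csvCell` are hypothetical helpers, and the quoting rules follow common CSV practice (RFC 4180 style), which the import endpoint is assumed to accept.

```javascript
// Quote a cell only when it contains a comma, quote, or line break.
function csvCell(value) {
  const text = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// `entries` is an array of { document_id, expected_data } objects.
function buildImportCsv(entries, fieldNames) {
  const header = ["document_id", ...fieldNames].map(csvCell).join(",");
  const rows = entries.map((entry) =>
    [entry.document_id, ...fieldNames.map((field) => entry.expected_data[field])]
      .map(csvCell)
      .join(",")
  );
  return [header, ...rows].join("\n");
}
```

Because duplicate `document_id` rows are skipped on import, it is worth de-duplicating `entries` before building the file so the `skipped` total stays meaningful.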
@@ -16267,6 +17474,7 @@ var sections40 = [
     seoTitle: "List Benchmark Runs \u2014 Talonic Docs",
     description: "List benchmark runs that compare extraction results against ground truth datasets. Each run produces per-field accuracy metrics.",
     content: [
+      { type: "paragraph", text: "Benchmark runs compare your extraction output against ground truth datasets to produce per-field accuracy scores. Each run evaluates every document in the dataset and produces an `accuracy_overall` score along with per-field breakdowns. Use benchmarks to track extraction quality over time and measure the impact of schema or pipeline changes." },
       {
         type: "endpoint",
         method: "GET",
@@ -16362,8 +17570,11 @@ var sections40 = [
       { label: "Create Benchmark", slug: "create-benchmark" },
       { label: "Benchmark Results", slug: "get-benchmark-results" }
     ],
-    faq: [],
-    mentions: ["benchmarks", "accuracy", "quality runs"]
+    faq: [
+      { question: "What benchmark statuses are possible?", answer: "A benchmark starts as `queued` (waiting to start), moves to `running` (evaluating documents), and ends as either `completed` (results available) or `failed` (an error occurred during evaluation)." },
+      { question: "Why is accuracy_overall null?", answer: "Accuracy scores are only computed after the benchmark run completes. While the status is `queued` or `running`, accuracy fields are null." }
+    ],
+    mentions: ["benchmarks", "accuracy", "quality runs", "per-field accuracy"]
   },
   {
     slug: "create-benchmark",
@@ -16372,6 +17583,8 @@ var sections40 = [
     seoTitle: "Create Benchmark Run \u2014 Talonic Docs",
     description: "Start a benchmark run that compares a job run output against a ground truth dataset. Produces per-field accuracy scores and overall metrics.",
     content: [
+      { type: "paragraph", text: "Start a new benchmark run that evaluates your current extraction output against a ground truth dataset. The benchmark compares each document in the dataset entry-by-entry and field-by-field, producing an overall accuracy score and per-field breakdowns." },
+      { type: "callout", variant: "info", text: "Benchmark runs are asynchronous. The endpoint returns immediately with status `queued`. Poll the benchmark detail endpoint or list benchmarks to check when the run completes." },
       {
         type: "endpoint",
         method: "POST",
@@ -16448,8 +17661,11 @@ var sections40 = [
       { label: "List Benchmarks", slug: "list-benchmarks" },
       { label: "Benchmark Results", slug: "get-benchmark-results" }
     ],
-    faq: [],
-    mentions: ["create benchmark", "run comparison"]
+    faq: [
+      { question: "Can I run multiple benchmarks simultaneously?", answer: "Yes. Benchmark runs are independent and can execute in parallel against different datasets or the same dataset." },
+      { question: "How long does a benchmark take to complete?", answer: "Duration depends on the number of entries in the dataset. A 50-entry dataset typically completes in a few seconds. Poll the benchmark detail to track progress." }
+    ],
+    mentions: ["create benchmark", "run comparison", "accuracy evaluation"]
   },
   {
     slug: "get-benchmark-results",
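The create-benchmark callout above says runs are asynchronous and should be polled until they finish. A hedged polling sketch follows. `getBenchmark` is a hypothetical caller-supplied function that fetches the benchmark detail, and the interval and attempt budget are illustrative defaults, not documented values.

```javascript
// Poll until the run reaches a terminal status (`completed` or `failed`).
async function waitForBenchmark(getBenchmark, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    const benchmark = await getBenchmark();
    if (benchmark.status === "completed" || benchmark.status === "failed") {
      return benchmark;
    }
    // Still `queued` or `running`: wait before asking again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("benchmark did not finish within the polling budget");
}
```

Accuracy fields stay null while the run is `queued` or `running`, so reading `accuracy_overall` only makes sense on the object this function returns with status `completed`.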
@@ -16458,6 +17674,7 @@ var sections40 = [
     seoTitle: "Benchmark Results & Comparison \u2014 Talonic Docs",
     description: "Get per-field accuracy results for a benchmark run, or compare two benchmark runs side by side to track extraction quality improvements over time.",
     content: [
+      { type: "paragraph", text: "Retrieve per-document accuracy results for a completed benchmark run, showing which fields matched and which diverged from the ground truth. Each result includes the extracted value, expected value, and whether they matched. Use the compare endpoint to track accuracy improvements across runs." },
       {
         type: "endpoint",
         method: "GET",
@@ -16499,6 +17716,7 @@ var sections40 = [
   ]
 }`
       },
+      { type: "paragraph", text: "To track accuracy trends over time, compare two benchmark runs side by side. The `accuracy_delta` shows the difference in overall accuracy between the two runs." },
       {
         type: "endpoint",
         method: "GET",
@@ -16565,9 +17783,11 @@ var sections40 = [
       { label: "List Datasets", slug: "list-quality-datasets" }
     ],
     faq: [
-      { question: "How is field accuracy calculated?", answer: "Each extracted value is compared to the ground truth entry for the same document and field. Accuracy is the ratio of correct matches to total entries." }
+      { question: "How is field accuracy calculated?", answer: "Each extracted value is compared to the ground truth entry for the same document and field. Accuracy is the ratio of correct matches to total entries." },
+      { question: "What does a negative accuracy_delta mean?", answer: "A negative delta means run_a has lower accuracy than run_b. For example, -0.03 means run_a is 3 percentage points less accurate. Pass the newer run as run_a and the older run as run_b to see improvement as a positive delta." },
+      { question: "Can I compare runs from different datasets?", answer: "Yes, but the comparison only shows overall accuracy differences. Per-field comparisons are most meaningful when both runs use the same ground truth dataset." }
     ],
-    mentions: ["accuracy", "benchmark results", "comparison", "field-level metrics"]
+    mentions: ["accuracy", "benchmark results", "comparison", "field-level metrics", "accuracy delta"]
   }
 ];
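The benchmark FAQ above defines accuracy as the ratio of correct matches to total entries and describes `accuracy_delta` as negative when run_a is less accurate than run_b. The arithmetic can be sketched as follows. The `matched` flag on each result and both helper names are assumptions made for illustration.

```javascript
// Ratio of correct matches to total entries, as described in the FAQ.
function fieldAccuracy(results) {
  if (results.length === 0) return null; // nothing to score
  const correct = results.filter((result) => result.matched).length;
  return correct / results.length;
}

// Sign convention from the FAQ: negative when run_a is less accurate than run_b.
function accuracyDelta(accuracyA, accuracyB) {
  return accuracyA - accuracyB;
}
```

For example, three matches out of four entries gives an accuracy of 0.75, and comparing a 0.5 run as run_a against a 0.75 run as run_b gives a delta of -0.25.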
@@ -16581,6 +17801,16 @@ var sections41 = [
     description: "List all routing rules ordered by priority. Routing rules match incoming documents by conditions and apply actions like schema assignment, job triggering, or team routing.",
     content: [
       { type: "paragraph", text: "Routing rules automate document processing by matching incoming documents against conditions and applying actions. Rules are evaluated in priority order \u2014 the first matching rule wins." },
+      { type: "paragraph", text: "When a document is classified (after OCR and type detection), the routing engine evaluates each active rule in priority order. The first rule whose conditions match is applied, and subsequent rules are skipped. This lets you build a priority chain: specific rules at the top, catch-all rules at the bottom." },
+      {
+        type: "list",
+        items: [
+          "Rules trigger on `document_classified` events after OCR and type detection.",
+          "Conditions can match `document_type`, `source`, `language`, `sensitivity`, and other metadata.",
+          "Actions include schema assignment, job triggering, and team routing.",
+          "Lower priority numbers are evaluated first."
+        ]
+      },
       {
         type: "endpoint",
         method: "GET",
@@ -16669,7 +17899,9 @@ var sections41 = [
       { label: "Reorder Rules", slug: "reorder-routing-rules" }
     ],
     faq: [
-      { question: "How are routing rules evaluated?", answer: "Rules are evaluated in priority order (lowest number first). The first rule whose conditions match the incoming document is applied." }
+      { question: "How are routing rules evaluated?", answer: "Rules are evaluated in priority order (lowest number first). The first rule whose conditions match the incoming document is applied." },
+      { question: "What happens if no rule matches a document?", answer: "If no routing rule matches, the document proceeds through the default pipeline without any schema assignment or special routing. You can add a low-priority catch-all rule to handle unmatched documents." },
+      { question: "Can I scope a rule to a specific source connection?", answer: "Yes. Set `source_connection_id` to limit the rule to documents ingested from a specific source (e.g. a particular Google Drive folder or S3 bucket). Documents from other sources will skip the rule." }
     ],
     mentions: ["routing rules", "priority", "document routing"]
   },
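The first-match-wins evaluation described in this hunk can be sketched as a small in-memory model. This is only an illustration, not the routing engine itself: the field names `priority`, `is_active`, and `source_connection_id` come from the docs above, while the flat `conditions` object, strict-equality matching, and the `matchRule` helper are assumptions made for the example.

```javascript
// Hypothetical model of first-match-wins routing (not the real engine).
// Assumption: `conditions` is a flat object compared by strict equality.
function matchRule(rules, doc) {
  const candidates = rules
    .filter((rule) => rule.is_active) // inactive rules are skipped
    .sort((a, b) => a.priority - b.priority); // lower number is evaluated first

  for (const rule of candidates) {
    // A rule scoped to a source connection only sees documents from that source.
    const inScope =
      rule.source_connection_id == null ||
      rule.source_connection_id === doc.source_connection_id;
    const matches = Object.entries(rule.conditions).every(
      ([field, expected]) => doc[field] === expected
    );
    if (inScope && matches) return rule; // first match wins, later rules are skipped
  }
  return null; // no match: the document continues through the default pipeline
}

const rules = [
  { id: "catch-all", priority: 100, is_active: true, conditions: {} },
  { id: "invoices", priority: 1, is_active: true, conditions: { document_type: "invoice" } },
  { id: "drive-only", priority: 2, is_active: true, source_connection_id: "drive-1", conditions: {} },
  { id: "disabled", priority: 0, is_active: false, conditions: {} },
];
```

Under these assumptions, an invoice matches `invoices`, a contract from connection `drive-1` matches `drive-only`, and any other document falls through to the low-priority `catch-all` rule.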
@@ -16680,6 +17912,8 @@ var sections41 = [
     seoTitle: "Create Routing Rule Endpoint \u2014 Talonic Docs",
     description: "Create a new routing rule with conditions on document properties and actions to apply when matched. Conditions can match document type, source, and other metadata.",
     content: [
+      { type: "paragraph", text: 'Create a rule that automatically applies actions to incoming documents based on their metadata. Conditions define what to match (e.g. document type equals "invoice"), and actions define what to do (e.g. assign the finance schema). Rules are evaluated on every `document_classified` event.' },
+      { type: "callout", variant: "info", text: "New rules are created with `is_active: true` by default. If you want to test a rule before activating it, create it, then immediately disable it via `PATCH /v1/routing-rules/:id` with `is_active: false`." },
       {
         type: "endpoint",
         method: "POST",
@@ -16768,7 +18002,8 @@ var sections41 = [
     ],
     faq: [
       { question: "What condition fields are available?", answer: "Conditions can match on `document_type`, `source`, `language`, `sensitivity`, and other document metadata fields." },
-      { question: "Can a rule have multiple actions?", answer: "Yes. Actions are executed in order. Common combinations include assigning a schema and triggering a job." }
+      { question: "Can a rule have multiple actions?", answer: "Yes. Actions are executed in order. Common combinations include assigning a schema and triggering a job." },
+      { question: "What is the default priority?", answer: "If you omit the priority field, it defaults to 100. Use the reorder endpoint to adjust priorities after creation." }
     ],
     mentions: ["create rule", "conditions", "actions", "assign_schema", "trigger_job", "route_to"]
   },
@@ -16779,6 +18014,7 @@ var sections41 = [
     seoTitle: "Get, Update, Delete Routing Rule \u2014 Talonic Docs",
     description: "Retrieve, update, or delete a routing rule by ID. Update conditions, actions, priority, or enabled state. Deleting a rule does not affect previously routed documents.",
     content: [
+      { type: "paragraph", text: "Retrieve, update, or delete a single routing rule. Updates take effect immediately \u2014 the next `document_classified` event will use the updated rule. Deleting a rule does not retroactively affect documents that were already routed by it." },
       {
         type: "endpoint",
         method: "GET",
@@ -16946,7 +18182,10 @@ var sections41 = [
       { label: "List Routing Rules", slug: "list-routing-rules" },
       { label: "Reorder Rules", slug: "reorder-routing-rules" }
     ],
-    faq: [],
+    faq: [
+      { question: "Do updates affect already-routed documents?", answer: "No. Rule changes only affect future `document_classified` events. Documents that were already routed by the previous version of the rule are not retroactively updated." },
+      { question: "Can I temporarily disable a rule without deleting it?", answer: "Yes. Use `PATCH` with `is_active: false` to disable the rule. It will be skipped during evaluation but retained for future re-activation." }
+    ],
     mentions: ["update rule", "delete rule", "manage routing"]
   },
   {
@@ -16956,6 +18195,8 @@ var sections41 = [
     seoTitle: "Reorder Routing Rules Endpoint \u2014 Talonic Docs",
     description: "Reorder routing rules by providing an ordered array of rule IDs. Priority values are reassigned sequentially based on the new order.",
     content: [
+      { type: "paragraph", text: "Reassign priority values for all routing rules at once. Pass an ordered array of rule IDs \u2014 the first ID receives priority 1, the second receives priority 2, and so on. This is the recommended way to change evaluation order after initial creation." },
+      { type: "callout", variant: "warning", text: "All active rule IDs must be included in the `rule_ids` array; omitting an active rule returns a validation error. Inactive rules should also be included to maintain a consistent priority sequence." },
       {
         type: "endpoint",
         method: "POST",
@@ -17007,7 +18248,8 @@ var sections41 = [
       { label: "Manage Routing Rule", slug: "manage-routing-rule" }
     ],
     faq: [
-      { question: "Do I need to include all rule IDs?", answer: "Yes. All active rule IDs must be included in the array. Omitting a rule ID will result in an error." }
+      { question: "Do I need to include all rule IDs?", answer: "Yes. All active rule IDs must be included in the array. Omitting a rule ID will result in an error." },
+      { question: "Does reordering affect currently processing documents?", answer: "No. Reordering only affects future document_classified events. Documents currently being processed continue with their already-matched rule." }
     ],
     mentions: ["reorder", "priority", "rule ordering"]
   }
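The sequential priority assignment described in this hunk can be modeled in a few lines. This is a sketch under stated assumptions rather than the server's implementation: `rule_ids` ordering and the "all active IDs required" check come from the docs above, while the `reorder` helper, the error message, and the choice to leave an omitted inactive rule at its old priority are hypothetical.

```javascript
// Hypothetical model of the reorder behaviour (not the real endpoint).
function reorder(rules, ruleIds) {
  // Every active rule must appear in the ordered ID list.
  const missing = rules
    .filter((rule) => rule.is_active && !ruleIds.includes(rule.id))
    .map((rule) => rule.id);
  if (missing.length > 0) {
    throw new Error(`validation error: missing active rule IDs: ${missing.join(", ")}`);
  }

  // The first ID receives priority 1, the second priority 2, and so on.
  const position = new Map(ruleIds.map((id, index) => [id, index + 1]));

  // Assumption: an inactive rule left out of the list keeps its old priority.
  return rules.map((rule) => ({
    ...rule,
    priority: position.get(rule.id) ?? rule.priority,
  }));
}

const before = [
  { id: "a", priority: 1, is_active: true },
  { id: "b", priority: 2, is_active: true },
  { id: "c", priority: 3, is_active: false },
];
const after = reorder(before, ["b", "c", "a"]);
```

With this input, `b` moves to priority 1, `c` to 2, and `a` to 3, while the original `before` array is left unchanged.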
@@ -17026,6 +18268,18 @@ var sections42 = [
         type: "paragraph",
         text: "Billing settings control whether AI agents can autonomously top up credits. A human must enable auto top-up and configure the threshold and amount before agents can call the topup endpoint."
       },
+      {
+        type: "paragraph",
+        text: "The auto top-up system is designed with a **human-in-the-loop** safety model. An organization admin enables it once with a threshold and amount, and from that point agents can autonomously maintain the credit balance without further human intervention."
+      },
+      {
+        type: "list",
+        items: [
+          "Read current settings with `GET /v1/billing/settings`.",
+          "Enable auto top-up and set threshold/amount with `PATCH /v1/billing/settings`.",
+          "Only users with `write` scope can modify settings \u2014 agents cannot enable auto top-up themselves."
+        ]
+      },
       {
         type: "endpoint",
         method: "GET",
@@ -17118,6 +18372,14 @@ var sections42 = [
       {
         question: "Who can enable auto top-up?",
         answer: "Only a human with write access can enable auto top-up via PATCH /v1/billing/settings. Agents cannot enable it themselves."
+      },
+      {
+        question: "What are the limits for auto_topup_amount?",
+        answer: "The minimum is 1,000 credits and the maximum is 500,000 credits per top-up. Values outside this range return a 400 error."
+      },
+      {
+        question: "Does disabling auto top-up affect the current balance?",
+        answer: "No. Disabling auto top-up only prevents future autonomous top-ups. The current credit balance is unchanged."
       }
     ],
     mentions: ["billing settings", "auto top-up", "threshold", "credits"]
@@ -17133,6 +18395,11 @@ var sections42 = [
         type: "paragraph",
         text: "AI agents call this endpoint to autonomously add credits when the balance falls below the configured threshold. **A human must first enable auto top-up** via `PATCH /v1/billing/settings`."
       },
+      {
+        type: "paragraph",
+        text: "This endpoint is a no-op when the balance is already above the threshold \u2014 it returns `topped_up: false` without adding credits. Agents can safely call it on every extraction cycle without risk of over-provisioning."
+      },
+      { type: "callout", variant: "info", text: "Combine this endpoint with the `X-Talonic-Balance-Credits` response header from `POST /v1/extract` to build an autonomous credit management loop. Check the balance header after each extraction and call top-up when it drops below your threshold." },
       {
         type: "endpoint",
         method: "POST",
@@ -17213,6 +18480,10 @@ var sections42 = [
       {
         question: "What scope does the API key need?",
         answer: "The billing scope. This must be explicitly granted when creating the API key \u2014 existing keys do not have it by default."
+      },
+      {
+        question: "Is it safe to call top-up on every request?",
+        answer: "Yes. When the balance is above the threshold, the endpoint returns topped_up: false without adding credits. There is no cost or side effect for a no-op call."
       }
     ],
     mentions: ["auto top-up", "agent", "billing scope", "credits", "autonomous"]
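The credit management loop described in this hunk reduces to one decision per extraction: read the balance header and decide whether to call top-up. The sketch below shows only that decision. The header name `X-Talonic-Balance-Credits` and the endpoint path come from the docs above; the `shouldTopUp` helper and the agent-side threshold value are hypothetical, and the header is assumed to carry a plain number.

```javascript
// Hypothetical helper: decide whether an agent should call POST /v1/billing/topup.
// `balanceHeader` is the raw value of X-Talonic-Balance-Credits, or null when the
// header is absent (for example on async 202 or batch responses).
function shouldTopUp(balanceHeader, threshold) {
  if (balanceHeader == null || String(balanceHeader).trim() === "") {
    return false; // no balance information on this response
  }
  const balance = Number(balanceHeader);
  if (!Number.isFinite(balance)) {
    return false; // unparseable header: do nothing rather than guess
  }
  return balance < threshold;
}

// Typical call site, assuming `response` is a fetch() Response:
//   const raw = response.headers.get("X-Talonic-Balance-Credits");
//   if (shouldTopUp(raw, 5000)) { /* call POST /v1/billing/topup */ }
```

Because the top-up endpoint returns `topped_up: false` when the balance is above the configured threshold, an agent could also skip this check and call it after every extraction; the local check only saves the extra request.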
@@ -17228,6 +18499,10 @@ var sections42 = [
         type: "paragraph",
         text: "Every successful `POST /v1/extract` response includes cost headers so AI agents can track spending without a separate API call:"
       },
+      {
+        type: "paragraph",
+        text: "Cost tracking is built into the extraction response to support autonomous agent workflows. Instead of polling a separate balance endpoint, agents read the cost headers inline and decide whether to trigger a top-up. The `Cells-Resolved-Registry` vs `Cells-Resolved-AI` breakdown shows how many fields were resolved from cached registry data (free) versus AI extraction (metered)."
+      },
       {
         type: "param-table",
         title: "Response headers",
@@ -17253,7 +18528,8 @@ X-Talonic-Cells-Resolved-AI: 1`
       {
         type: "paragraph",
         text: "Agents can read these headers after every extraction to decide whether to call `POST /v1/billing/topup` to replenish credits."
-      }
+      },
+      { type: "callout", variant: "info", text: "Registry-resolved cells are free because the value was already known from a previous extraction. Over time, as your field registry grows, more cells resolve from the registry and fewer require paid AI extraction." }
     ],
     related: [
       { label: "POST /v1/extract", slug: "post-extract" },
@@ -17264,6 +18540,14 @@ X-Talonic-Cells-Resolved-AI: 1`
       {
         question: "Are cost headers included on async (202) responses?",
         answer: "No. Cost headers are only included on synchronous 200 responses where extraction completes immediately. Async responses return a poll URL instead."
+      },
+      {
+        question: "Are cost headers included on batch extraction responses?",
+        answer: "No. Batch extraction (processing_mode=batch) defers extraction to the provider batch API. Cost is calculated when the batch completes, not at upload time."
+      },
+      {
+        question: "What is the credit-to-EUR conversion rate?",
+        answer: "The rate is configured per organization and visible in the billing settings. One credit typically equals EUR 0.001, but this may vary by plan tier."
       }
     ],
     mentions: ["cost headers", "X-Talonic-Cost", "credits", "balance", "cells resolved"]