@deepcitation/deepcitation-js 1.0.2 → 1.0.4

Files changed (59)
  1. package/README.md +71 -1197
  2. package/lib/client/DeepCitation.d.ts +204 -0
  3. package/lib/client/DeepCitation.js +473 -0
  4. package/lib/client/index.d.ts +2 -0
  5. package/lib/client/index.js +1 -0
  6. package/lib/client/types.d.ts +157 -0
  7. package/lib/client/types.js +1 -0
  8. package/lib/index.d.ts +25 -0
  9. package/lib/index.js +22 -0
  10. package/lib/parsing/normalizeCitation.d.ts +5 -0
  11. package/lib/parsing/normalizeCitation.js +182 -0
  12. package/lib/parsing/parseCitation.d.ts +79 -0
  13. package/lib/parsing/parseCitation.js +371 -0
  14. package/lib/parsing/parseWorkAround.d.ts +2 -0
  15. package/lib/parsing/parseWorkAround.js +73 -0
  16. package/lib/prompts/citationPrompts.d.ts +133 -0
  17. package/lib/prompts/citationPrompts.js +152 -0
  18. package/lib/prompts/index.d.ts +3 -0
  19. package/lib/prompts/index.js +3 -0
  20. package/lib/prompts/promptCompression.d.ts +14 -0
  21. package/lib/prompts/promptCompression.js +109 -0
  22. package/lib/prompts/types.d.ts +4 -0
  23. package/lib/prompts/types.js +1 -0
  24. package/lib/react/CitationComponent.d.ts +134 -0
  25. package/lib/react/CitationComponent.js +376 -0
  26. package/lib/react/CitationVariants.d.ts +135 -0
  27. package/lib/react/CitationVariants.js +283 -0
  28. package/lib/react/DiffDisplay.d.ts +10 -0
  29. package/lib/react/DiffDisplay.js +33 -0
  30. package/lib/react/UrlCitationComponent.d.ts +83 -0
  31. package/lib/react/UrlCitationComponent.js +224 -0
  32. package/lib/react/VerificationTabs.d.ts +10 -0
  33. package/lib/react/VerificationTabs.js +36 -0
  34. package/lib/react/icons.d.ts +8 -0
  35. package/lib/react/icons.js +9 -0
  36. package/lib/react/index.d.ts +16 -0
  37. package/lib/react/index.js +18 -0
  38. package/lib/react/primitives.d.ts +104 -0
  39. package/lib/react/primitives.js +190 -0
  40. package/lib/react/types.d.ts +192 -0
  41. package/lib/react/types.js +1 -0
  42. package/lib/react/useSmartDiff.d.ts +16 -0
  43. package/lib/react/useSmartDiff.js +64 -0
  44. package/lib/react/utils.d.ts +34 -0
  45. package/lib/react/utils.js +59 -0
  46. package/lib/types/boxes.d.ts +11 -0
  47. package/lib/types/boxes.js +1 -0
  48. package/lib/types/citation.d.ts +44 -0
  49. package/lib/types/citation.js +2 -0
  50. package/lib/types/foundHighlight.d.ts +23 -0
  51. package/lib/types/foundHighlight.js +22 -0
  52. package/lib/types/index.d.ts +11 -0
  53. package/lib/types/index.js +7 -0
  54. package/lib/types/search.d.ts +30 -0
  55. package/lib/types/search.js +1 -0
  56. package/lib/utils/sha.d.ts +10 -0
  57. package/lib/utils/sha.js +108 -0
  58. package/package.json +6 -5
  59. /package/{src → lib}/react/styles.css +0 -0
package/README.md CHANGED
@@ -7,7 +7,6 @@
  [![npm version](https://img.shields.io/npm/v/@deepcitation/deepcitation-js.svg)](https://www.npmjs.com/package/@deepcitation/deepcitation-js)
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
  [![TypeScript](https://img.shields.io/badge/TypeScript-5.0+-blue.svg)](https://www.typescriptlang.org/)
- [![Node.js](https://img.shields.io/badge/Node.js-18+-green.svg)](https://nodejs.org/)
 
  [Documentation](https://deepcitation.com/docs) · [Get Free API Key](https://deepcitation.com/signup) · [Examples](./examples) · [Discord](https://discord.gg/deepcitation)
 
@@ -26,27 +25,11 @@ After: "Revenue grew 45% [1]" → ✅ Verified on page 3, line 12 (with scree
 
  ## Features
 
- | Feature | Description |
- |---------|-------------|
- | **Deterministic Matching** | Every citation is traced to its exact location. No fuzzy matching, no guessing—just proof. |
- | **Visual Proof Generation** | Automated screenshots with highlighted text show exactly where each citation comes from. |
- | **Any LLM Provider** | Works with OpenAI, Anthropic, Google, Azure, or your own models. |
- | **React Components** | 7 pre-built components + composable primitives for building citation UIs. |
- | **TypeScript Native** | Full type safety with comprehensive type definitions. |
-
- ## Table of Contents
-
- - [Installation](#installation)
- - [Quick Start](#quick-start)
- - [Examples](#examples)
- - [API Reference](#api-reference)
- - [API Response Types](#api-response-types)
- - [Integration Patterns](#integration-patterns)
- - [Error Handling](#error-handling)
- - [React Components](#react-components)
- - [Types](#types)
- - [Supported File Types](#supported-file-types)
- - [Contributing](#contributing)
+ - **Deterministic Matching** Every citation traced to its exact location. No fuzzy matching, no guessing.
+ - **Visual Proof** – Automated screenshots with highlighted text show exactly where citations come from.
+ - **Any LLM Provider** Works with OpenAI, Anthropic, Google, Azure, or your own models.
+ - **React Components** Pre-built components + composable primitives for citation UIs.
+ - **TypeScript Native** Full type safety with comprehensive type definitions.
 
  ## Installation
 
@@ -54,16 +37,6 @@ After: "Revenue grew 45% [1]" → ✅ Verified on page 3, line 12 (with scree
  npm install @deepcitation/deepcitation-js
  ```
 
- ```bash
- yarn add @deepcitation/deepcitation-js
- ```
-
- ```bash
- bun add @deepcitation/deepcitation-js
- ```
-
- ## Get Your API Key
-
  Get a free API key at [deepcitation.com](https://deepcitation.com/signup) — no credit card required.
 
  ```bash
@@ -79,29 +52,26 @@ DeepCitation works in three steps: **Pre-Prompt**, **Post-Prompt**, and **Displa
 
  ### Step 1: Pre-Prompt
 
- Before calling your LLM, upload source documents and enhance your prompt with citation instructions.
+ Upload source documents and enhance your prompt with citation instructions.
 
  ```typescript
  import { DeepCitation, wrapCitationPrompt } from "@deepcitation/deepcitation-js";
 
- const deepcitation = new DeepCitation({ apiKey: process.env.DEEPCITATION_API_KEY });
+ const dc = new DeepCitation({ apiKey: process.env.DEEPCITATION_API_KEY });
 
- // Upload source files and get structured content
- const { fileDataParts, fileDeepTexts } = await deepcitation.prepareFiles([
+ // Upload source files
+ const { fileDataParts, fileDeepTexts } = await dc.prepareFiles([
    { file: pdfBuffer, filename: "report.pdf" },
-   { file: invoiceBuffer, filename: "invoice.pdf" },
  ]);
 
- // Wrap your prompts with citation instructions (fileDeepText handles file content)
- const systemPrompt = `You are a helpful financial analyst...`;
-
+ // Wrap prompts with citation instructions
  const { enhancedSystemPrompt, enhancedUserPrompt } = wrapCitationPrompt({
-   systemPrompt,
-   userPrompt: userMessage,
-   fileDeepText: fileDeepTexts, // Pass single string or array for multi-file
+   systemPrompt: "You are a helpful assistant...",
+   userPrompt: "Analyze this document",
+   fileDeepText: fileDeepTexts,
  });
 
- // Call your LLM as usual
+ // Call your LLM
  const response = await llm.chat({
    messages: [
      { role: "system", content: enhancedSystemPrompt },
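Note on the Step 1 hunk above: the removed inline comment said `fileDeepText` accepts a single string or an array, and the removed API reference further down in this diff states that the text is wrapped in `<file_text>` tags, with each file tagged with `file_index` when there are several. The sketch below illustrates that described behavior with a hypothetical local helper, `wrapFileDeepText`. It is not the library's implementation; the attribute syntax and the 0-based index are assumptions.

```typescript
// Hypothetical illustration only: the tag layout, the attribute syntax, and the
// 0-based file_index are assumptions, not the library's confirmed output.
function wrapFileDeepText(fileDeepText: string | string[]): string {
  // Normalize the "single string or array" input into an array.
  const texts = Array.isArray(fileDeepText) ? fileDeepText : [fileDeepText];

  // Single file: a plain <file_text> wrapper.
  if (texts.length === 1) {
    return `<file_text>\n${texts[0]}\n</file_text>`;
  }

  // Multiple files: tag each block with its position in the array.
  return texts
    .map((text, index) => `<file_text file_index="${index}">\n${text}\n</file_text>`)
    .join("\n\n");
}
```

Because `wrapCitationPrompt` is described as doing this wrapping itself, callers would normally pass `fileDeepTexts` straight through, as the new README snippet does.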
@@ -112,17 +82,15 @@ const response = await llm.chat({
 
  ### Step 2: Post-Prompt
 
- After receiving the LLM response, verify citations against the source documents.
+ Verify citations against the source documents.
 
  ```typescript
- // Verify all citations from LLM output in one call
- const result = await deepcitation.verifyCitations({
+ const result = await dc.verifyCitations({
    llmOutput: response.content,
-   fileDataParts, // For Zero Data Retention or after storage expires (30 days)
+   fileDataParts,
  });
 
  // result.citations contains verification status + visual proof
- // { "1": { status: "found", imageBase64: "...", matchSnippet: "..." }, ... }
  ```
 
  ### Step 3: Display
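Note on the Step 2 hunk above: the trimmed README no longer documents the shape of `result.citations`, but the removed "SearchStatus Reference" table later in this diff lists how each status maps to verified, partial, and pending flags. A minimal sketch of that mapping follows. `classifyStatus` and `verificationRate` are hypothetical local helpers that mirror the table; they are not the library's `getCitationStatus`, whose actual behavior may differ.

```typescript
// Status values copied from the removed "SearchStatus Reference" table.
type SearchStatus =
  | "found"
  | "found_value_only"
  | "found_phrase_missed_value"
  | "partial_text_found"
  | "found_on_other_page"
  | "found_on_other_line"
  | "first_word_found"
  | "not_found"
  | "pending"
  | "loading"
  | "timestamp_wip";

type Outcome = "verified" | "partial" | "miss" | "pending";

// One row per status; the Record type makes the mapping exhaustive.
const OUTCOME: Record<SearchStatus, Outcome> = {
  found: "verified",
  found_value_only: "verified",
  found_phrase_missed_value: "verified",
  partial_text_found: "partial",
  found_on_other_page: "partial",
  found_on_other_line: "partial",
  first_word_found: "partial",
  not_found: "miss",
  pending: "pending",
  loading: "pending",
  timestamp_wip: "pending",
};

interface StatusFlags {
  isVerified: boolean;
  isMiss: boolean;
  isPartialMatch: boolean;
  isPending: boolean;
}

// Hypothetical helper: partial matches count as verified, as in the table.
function classifyStatus(status: SearchStatus): StatusFlags {
  const outcome = OUTCOME[status];
  return {
    isVerified: outcome === "verified" || outcome === "partial",
    isMiss: outcome === "miss",
    isPartialMatch: outcome === "partial",
    isPending: outcome === "pending",
  };
}

// Share of statuses that count as verified; an empty list yields 0 here
// (the removed batch-processing pattern used 1 for that case instead).
function verificationRate(statuses: SearchStatus[]): number {
  if (statuses.length === 0) return 0;
  const verified = statuses.filter((s) => classifyStatus(s).isVerified).length;
  return verified / statuses.length;
}
```

Keeping the mapping in a single table-shaped constant makes it easy to compare against the README table if the status list changes between versions.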
@@ -131,8 +99,9 @@ Render verified citations with React components.
 
  ```tsx
  import { CitationComponent } from "@deepcitation/deepcitation-js/react";
+ import "@deepcitation/deepcitation-js/react/styles.css";
 
- function Response({ text, citations, verifications }) {
+ function Response({ citations, verifications }) {
    return (
      <p>
        Revenue grew by
@@ -148,1209 +117,114 @@ function Response({ text, citations, verifications }) {
148
117
 
149
118
  ---
150
119
 
151
- ## Examples
152
-
153
- Check out the [examples directory](./examples) for complete, runnable examples:
154
-
155
- | Example | Description | Use Case |
156
- |---------|-------------|----------|
157
- | [**basic-verification**](./examples/basic-verification) | Core 3-step workflow with OpenAI or Anthropic | Learning the basics, simple integrations |
158
- | [**support-bot**](./examples/support-bot) | Customer support bot with invisible citations | Production apps, customer-facing AI |
159
- | [**nextjs-ai-sdk**](./examples/nextjs-ai-sdk) | Next.js chat app with Vercel AI SDK | Full-stack apps, streaming UI |
160
-
161
- ```bash
162
- # Run the basic example
163
- cd examples/basic-verification
164
- npm install
165
- cp .env.example .env # Add your API keys
166
- npm run start:openai
167
-
168
- # Or run the Next.js example
169
- cd examples/nextjs-ai-sdk
170
- npm install
171
- npm run dev
172
- ```
173
-
174
- ---
175
-
176
- ## API Reference
120
+ ## Core API
177
121
 
178
122
  ### DeepCitation Client
179
123
 
180
124
  ```typescript
181
- import { DeepCitation } from "@deepcitation/deepcitation-js";
182
-
183
125
  const dc = new DeepCitation({
184
- apiKey: string; // Required: Your API key (dc_live_* or dc_test_*)
185
- apiUrl?: string; // Optional: Custom API URL (default: https://api.deepcitation.com)
186
- });
187
- ```
188
-
189
- #### `prepareFiles(files)`
190
-
191
- Upload source documents and get structured text for LLM prompts.
192
-
193
- ```typescript
194
- const { fileDataParts, fileDeepTexts } = await dc.prepareFiles([
195
- { file: pdfBuffer, filename: "report.pdf" },
196
- { file: imageBlob, filename: "chart.png" },
197
- ]);
198
-
199
- // fileDataParts: Array of file references (use with verifyCitations)
200
- // fileDeepTexts: Array of formatted text strings with page markers and line IDs
201
- ```
202
-
203
- **Supported file types:**
204
- - PDF documents (native and scanned)
205
- - Images (PNG, JPEG, TIFF, WebP, AVIF, HEIC)
206
-
207
- #### `convertToPdf(input)`
208
-
209
- Convert a URL or Office document to PDF for citation verification.
210
-
211
- ```typescript
212
- // Convert a URL (shorthand)
213
- const result = await dc.convertToPdf("https://example.com/article");
214
-
215
- // Convert a URL with options
216
- const result = await dc.convertToPdf({
217
- url: "https://example.com/article",
218
- singlePage: true, // Render as single long page instead of paginated
219
- });
220
-
221
- // Convert an Office document
222
- const result = await dc.convertToPdf({
223
- file: docxBuffer, // File, Blob, or Buffer
224
- filename: "report.docx",
225
- fileId: "my-custom-id", // Optional custom file ID
226
- });
227
-
228
- // result: { fileId, metadata, status }
229
- ```
230
-
231
- **Supported formats:**
232
- - URLs: Any publicly accessible web page or direct PDF link
233
- - Office: `.doc`, `.docx`, `.xls`, `.xlsx`, `.ppt`, `.pptx`, `.odt`, `.ods`, `.odp`, `.rtf`, `.csv`
234
-
235
- #### `prepareConvertedFile(options)`
236
-
237
- Process a previously converted file for citation verification. Use this after `convertToPdf()`.
238
-
239
- ```typescript
240
- // Convert first
241
- const converted = await dc.convertToPdf("https://example.com/article");
242
-
243
- // Then prepare for verification
244
- const { fileDeepText, fileId } = await dc.prepareConvertedFile({
245
- fileId: converted.fileId,
126
+ apiKey: string, // Your API key (dc_live_* or dc_test_*)
127
+ apiUrl?: string, // Optional: Custom API URL
246
128
  });
247
129
 
248
- // fileDeepText is ready for LLM prompts (pass to wrapCitationPrompt)
249
- // fileId can be used for verifyCitations()
250
- ```
251
-
252
- #### `verifyCitations(options)`
130
+ // Upload and prepare source files
131
+ await dc.prepareFiles(files: FileInput[])
253
132
 
254
- Verify citations from LLM output against source documents.
133
+ // Convert URLs/Office docs to PDF
134
+ await dc.convertToPdf(urlOrOptions: string | ConvertOptions)
255
135
 
256
- ```typescript
257
- const result = await dc.verifyCitations({
258
- llmOutput: string; // The LLM response containing citations
259
- fileDataParts?: FileData[]; // Optional: File references for verification
260
- outputImageFormat?: "jpeg" | "png" | "avif"; // Optional: Image format (default: "avif")
261
- });
262
-
263
- // Returns: { citations: Record<string, VerificationResult> }
136
+ // Verify LLM citations
137
+ await dc.verifyCitations({ llmOutput, fileDataParts?, outputImageFormat? })
264
138
  ```
265
139
 
266
140
  ### Prompt Utilities
267
141
 
268
- #### `wrapCitationPrompt(options)`
269
-
270
- Wrap both system and user prompts with citation instructions.
271
-
272
- ```typescript
273
- import { wrapCitationPrompt } from "@deepcitation/deepcitation-js";
274
-
275
- const { enhancedSystemPrompt, enhancedUserPrompt } = wrapCitationPrompt({
276
- systemPrompt: "You are a helpful assistant...",
277
- userPrompt: "Analyze this document",
278
- fileDeepText: fileDeepTexts, // Single string or array for multi-file
279
- });
280
-
281
- // fileDeepText is automatically wrapped in <file_text> tags
282
- // For multiple files, each is tagged with file_index
283
- ```
284
-
285
- #### `wrapSystemCitationPrompt(options)`
286
-
287
- Wrap only the system prompt with citation instructions (for more control).
288
-
289
- ```typescript
290
- import { wrapSystemCitationPrompt } from "@deepcitation/deepcitation-js";
291
-
292
- const enhanced = wrapSystemCitationPrompt({
293
- systemPrompt: "You are a helpful assistant...",
294
- isAudioVideo?: boolean, // Use timestamp-based citations
295
- prependCitationInstructions?: boolean, // Add instructions before prompt
296
- });
297
- ```
298
-
299
- #### `getAllCitationsFromLlmOutput(llmOutput)`
300
-
301
- Extract citations from LLM response (supports both XML `<cite>` tags and JSON formats).
302
-
303
- ```typescript
304
- import { getAllCitationsFromLlmOutput } from "@deepcitation/deepcitation-js";
305
-
306
- const citations = getAllCitationsFromLlmOutput(llmResponse);
307
- // { "1": { pageNumber: 1, lineId: "L1", fullPhrase: "..." }, ... }
308
- ```
309
-
310
- #### Multi-File Utilities
311
-
312
- ```typescript
313
- import { groupCitationsByFileId } from "@deepcitation/deepcitation-js";
314
-
315
- // Group citations by fileId for multi-file verification
316
- const citationsByFile = groupCitationsByFileId(citations);
317
- // Returns: Map<string, { [key: string]: Citation }>
318
-
319
- // Object version for easier serialization
320
- const citationsByFileObj = groupCitationsByFileIdObject(citations);
321
- // Returns: { [fileId: string]: { [key: string]: Citation } }
322
- ```
323
-
324
- #### Cleanup Utilities
325
-
326
142
  ```typescript
327
143
  import {
328
- removeCitations,
329
- removePageNumberMetadata,
330
- removeLineIdMetadata,
144
+ wrapCitationPrompt, // Wrap system + user prompts
145
+ wrapSystemCitationPrompt, // Wrap system prompt only
146
+ getAllCitationsFromLlmOutput, // Extract citations from response
147
+ CITATION_JSON_OUTPUT_FORMAT, // JSON schema for structured output
331
148
  } from "@deepcitation/deepcitation-js";
332
-
333
- // Remove citation tags (optionally keep values)
334
- const clean = removeCitations(text, leaveValue?: boolean);
335
-
336
- // Remove page number metadata tags
337
- const noPages = removePageNumberMetadata(text);
338
-
339
- // Remove line ID metadata tags
340
- const noLines = removeLineIdMetadata(text);
341
149
  ```
342
150
 
343
- ### JSON Output Format
344
-
345
- For LLMs with structured output support:
346
-
347
- ```typescript
348
- import { CITATION_JSON_OUTPUT_FORMAT } from "@deepcitation/deepcitation-js";
349
-
350
- const response = await openai.chat.completions.create({
351
- model: "gpt-4o",
352
- messages: [...],
353
- response_format: {
354
- type: "json_schema",
355
- json_schema: {
356
- name: "analysis",
357
- schema: {
358
- type: "object",
359
- properties: {
360
- findings: {
361
- type: "array",
362
- items: {
363
- type: "object",
364
- properties: {
365
- text: { type: "string" },
366
- citation: CITATION_JSON_OUTPUT_FORMAT,
367
- },
368
- },
369
- },
370
- },
371
- },
372
- },
373
- },
374
- });
375
- ```
376
-
377
- ---
378
-
379
- ## API Response Types
380
-
381
- ### Verification Response Structure
382
-
383
- When you call `verifyCitations()`, you receive a structured response containing verification results for each citation:
384
-
385
- ```typescript
386
- interface VerifyCitationsResponse {
387
- citations: Record<string, FoundHighlightLocation>;
388
- }
389
- ```
390
-
391
- Each citation key maps to a `FoundHighlightLocation` object:
392
-
393
- ```typescript
394
- interface FoundHighlightLocation {
395
- // Verification result
396
- searchState: SearchState; // Contains status and location details
397
-
398
- // Location information
399
- pageNumber?: number | null; // Page where citation was found
400
- matchSnippet?: string | null; // Text snippet of the matched content
401
- lowerCaseSearchTerm?: string | null; // The search term used for matching
402
-
403
- // Visual proof
404
- verificationImageBase64?: string | null; // Base64-encoded screenshot
405
- verificationImageUrl?: string | null; // URL to hosted image (if configured)
406
-
407
- // Metadata
408
- verifiedAt?: Date; // Timestamp of verification
409
- source?: string | null; // Verification engine version
410
-
411
- // Original citation data
412
- citation?: Citation; // The original citation object
413
- }
414
- ```
415
-
416
- ### SearchState and Status Values
417
-
418
- The `SearchState` object contains the verification result:
419
-
420
- ```typescript
421
- interface SearchState {
422
- status: SearchStatus;
423
-
424
- // Page location comparison
425
- expectedPage?: number | null; // Page claimed by LLM
426
- actualPage?: number | null; // Page where text was actually found
427
-
428
- // Line location comparison
429
- expectedLineIds?: number[] | null; // Line IDs claimed by LLM
430
- actualLineIds?: number[] | null; // Actual line IDs where text was found
431
-
432
- // For audio/video citations
433
- expectedTimestamps?: { startTime?: string; endTime?: string };
434
- actualTimestamps?: { startTime?: string; endTime?: string };
435
- }
436
- ```
437
-
438
- ### SearchStatus Reference
439
-
440
- | Status | Description | `isVerified` | `isPartialMatch` | Recommended Action |
441
- |--------|-------------|--------------|------------------|-------------------|
442
- | `"found"` | Exact match at expected location | ✅ `true` | `false` | Display with full confidence |
443
- | `"found_value_only"` | Citation value found, but not exact phrase | ✅ `true` | `false` | Display, phrase may be paraphrased |
444
- | `"found_phrase_missed_value"` | Exact phrase found, value mismatch | ✅ `true` | `false` | Display, minor discrepancy |
445
- | `"partial_text_found"` | Part of the citation text was found | ✅ `true` | ✅ `true` | Display with partial indicator |
446
- | `"found_on_other_page"` | Text found on different page | ✅ `true` | ✅ `true` | Display, page number was wrong |
447
- | `"found_on_other_line"` | Text found on different line | ✅ `true` | ✅ `true` | Display, line reference was off |
448
- | `"first_word_found"` | Only first word matched | ✅ `true` | ✅ `true` | Display with caution indicator |
449
- | `"not_found"` | Citation not found in document | ❌ `false` | `false` | Hide, retry, or flag for review |
450
- | `"pending"` | Page still being processed | pending | `false` | Show loading state, poll for update |
451
- | `"loading"` | Verification in progress | pending | `false` | Show loading state |
452
- | `"timestamp_wip"` | Audio/video timestamp verification in progress | pending | `false` | Show loading state |
453
-
454
- ### CitationStatus Helper
455
-
456
- Use `getCitationStatus()` to get a simplified status object:
457
-
458
- ```typescript
459
- import { getCitationStatus } from "@deepcitation/deepcitation-js";
460
-
461
- const status = getCitationStatus(foundHighlight);
462
-
463
- // status: CitationStatus
464
- {
465
- isVerified: boolean; // true if citation is trustworthy (found, partial, or value-only)
466
- isMiss: boolean; // true only if status === "not_found"
467
- isPartialMatch: boolean; // true for partial matches (other page/line, partial text)
468
- isPending: boolean; // true if still loading or pending
469
- }
470
- ```
471
-
472
- ---
473
-
474
- ## Integration Patterns
475
-
476
- ### Pattern 1: Retry Logic for Unverified Citations
477
-
478
- When citations fail verification, you may want to retry with the LLM or flag for human review:
479
-
480
- ```typescript
481
- import { DeepCitation, getCitationStatus } from "@deepcitation/deepcitation-js";
482
-
483
- async function verifyWithRetry(
484
- dc: DeepCitation,
485
- llmOutput: string,
486
- fileDataParts: FileDataPart[],
487
- maxRetries = 2
488
- ) {
489
- const result = await dc.verifyCitations({ llmOutput, fileDataParts });
490
-
491
- const unverifiedCitations: string[] = [];
492
- const verifiedCitations: Record<string, FoundHighlightLocation> = {};
493
-
494
- for (const [key, highlight] of Object.entries(result.citations)) {
495
- const status = getCitationStatus(highlight);
496
-
497
- if (status.isMiss) {
498
- unverifiedCitations.push(key);
499
- } else {
500
- verifiedCitations[key] = highlight;
501
- }
502
- }
503
-
504
- // Option 1: Flag unverified citations for human review
505
- if (unverifiedCitations.length > 0) {
506
- await flagForReview(unverifiedCitations, llmOutput);
507
- }
508
-
509
- // Option 2: Ask LLM to regenerate with stricter instructions
510
- if (unverifiedCitations.length > 0 && maxRetries > 0) {
511
- const regenerated = await regenerateWithStricterPrompt(
512
- unverifiedCitations,
513
- fileDataParts
514
- );
515
- return verifyWithRetry(dc, regenerated, fileDataParts, maxRetries - 1);
516
- }
517
-
518
- return { verifiedCitations, unverifiedCitations };
519
- }
520
- ```
521
-
522
- ### Pattern 2: Invisible Citations (Support Bots & Customer-Facing Apps)
523
-
524
- For customer support bots where you want verified information without showing citation markers:
151
+ ### React Components
525
152
 
526
153
  ```typescript
527
154
  import {
528
- DeepCitation,
529
- wrapCitationPrompt,
530
- removeCitations,
531
- getCitationStatus
532
- } from "@deepcitation/deepcitation-js";
533
-
534
- async function generateVerifiedResponse(userQuestion: string, documents: File[]) {
535
- const dc = new DeepCitation({ apiKey: process.env.DEEPCITATION_API_KEY });
536
-
537
- // Step 1: Prepare files and prompt (citations are generated internally)
538
- const { fileDataParts, fileDeepTexts } = await dc.prepareFiles(
539
- documents.map(f => ({ file: f, filename: f.name }))
540
- );
541
-
542
- const { enhancedSystemPrompt, enhancedUserPrompt } = wrapCitationPrompt({
543
- systemPrompt: "You are a helpful support agent...",
544
- userPrompt: userQuestion,
545
- fileDeepText: fileDeepTexts, // Automatically wrapped in <file_text> tags
546
- });
547
-
548
- // Step 2: Get LLM response with citations
549
- const llmResponse = await callYourLLM(enhancedSystemPrompt, enhancedUserPrompt);
550
-
551
- // Step 3: Verify citations
552
- const result = await dc.verifyCitations({
553
- llmOutput: llmResponse,
554
- fileDataParts
555
- });
556
-
557
- // Step 4: Calculate confidence score
558
- const citations = Object.values(result.citations);
559
- const verifiedCount = citations.filter(c => getCitationStatus(c).isVerified).length;
560
- const confidenceScore = citations.length > 0
561
- ? verifiedCount / citations.length
562
- : 0;
563
-
564
- // Step 5: Return clean response (no visible citations) with metadata
565
- return {
566
- // Remove citation tags for clean customer-facing text
567
- response: removeCitations(llmResponse, false),
568
-
569
- // Include confidence for internal monitoring/logging
570
- confidence: confidenceScore,
571
- totalCitations: citations.length,
572
- verifiedCitations: verifiedCount,
573
-
574
- // Store verification details for audit/debugging
575
- verificationDetails: result.citations,
576
- };
577
- }
578
- ```
579
-
580
- ### Pattern 3: Conditional Rendering Based on Verification Status
581
-
582
- Show different UI treatments based on citation quality:
583
-
584
- ```typescript
585
- import { getCitationStatus, type FoundHighlightLocation } from "@deepcitation/deepcitation-js";
586
-
587
- function CitationWithQualityIndicator({
588
- citation,
589
- foundCitation
590
- }: {
591
- citation: Citation;
592
- foundCitation: FoundHighlightLocation | null;
593
- }) {
594
- const status = getCitationStatus(foundCitation);
595
-
596
- // Fully verified - show with confidence
597
- if (status.isVerified && !status.isPartialMatch) {
598
- return (
599
- <span className="citation verified">
600
- [{citation.citationNumber}] <VerifiedBadge />
601
- </span>
602
- );
603
- }
604
-
605
- // Partial match - show with warning
606
- if (status.isPartialMatch) {
607
- return (
608
- <span className="citation partial" title="Citation found with minor discrepancies">
609
- [{citation.citationNumber}] <PartialMatchIcon />
610
- </span>
611
- );
612
- }
613
-
614
- // Still loading
615
- if (status.isPending) {
616
- return (
617
- <span className="citation loading">
618
- [{citation.citationNumber}] <Spinner size="xs" />
619
- </span>
620
- );
621
- }
622
-
623
- // Not found - hide or show warning
624
- if (status.isMiss) {
625
- // Option A: Hide unverified citations entirely
626
- return null;
627
-
628
- // Option B: Show with strong warning
629
- return (
630
- <span className="citation unverified" title="Could not verify this citation">
631
- [{citation.citationNumber}] <WarningIcon />
632
- </span>
633
- );
634
- }
635
-
636
- return <span>[{citation.citationNumber}]</span>;
637
- }
638
- ```
639
-
640
- ### Pattern 4: Streaming Response with Progressive Verification
641
-
642
- Verify citations as they stream in from the LLM:
643
-
644
- ```typescript
645
- import {
646
- DeepCitation,
647
- getAllCitationsFromLlmOutput,
648
- getCitationStatus
649
- } from "@deepcitation/deepcitation-js";
650
-
651
- async function* streamWithVerification(
652
- dc: DeepCitation,
653
- fileDataParts: FileDataPart[],
654
- llmStream: AsyncIterable<string>
655
- ) {
656
- let fullOutput = "";
657
- let lastVerifiedCount = 0;
658
- const verifiedCitations: Record<string, FoundHighlightLocation> = {};
659
-
660
- for await (const chunk of llmStream) {
661
- fullOutput += chunk;
662
- yield { type: "content", content: chunk };
663
-
664
- // Extract citations found so far
665
- const citations = getAllCitationsFromLlmOutput(fullOutput);
666
- const citationCount = Object.keys(citations).length;
667
-
668
- // New citations detected - verify them
669
- if (citationCount > lastVerifiedCount) {
670
- const result = await dc.verifyCitations({
671
- llmOutput: fullOutput,
672
- fileDataParts,
673
- });
674
-
675
- // Yield verification updates
676
- for (const [key, highlight] of Object.entries(result.citations)) {
677
- if (!verifiedCitations[key]) {
678
- verifiedCitations[key] = highlight;
679
- yield {
680
- type: "verification",
681
- citationKey: key,
682
- status: getCitationStatus(highlight),
683
- highlight
684
- };
685
- }
686
- }
687
-
688
- lastVerifiedCount = citationCount;
689
- }
690
- }
691
-
692
- yield { type: "complete", verifiedCitations };
693
- }
694
-
695
- // Usage
696
- const stream = streamWithVerification(dc, fileDataParts, llmStream);
697
- for await (const event of stream) {
698
- if (event.type === "content") {
699
- appendToUI(event.content);
700
- } else if (event.type === "verification") {
701
- updateCitationStatus(event.citationKey, event.status);
702
- }
703
- }
704
- ```
705
-
706
- ### Pattern 5: Batch Processing with Quality Gates
707
-
708
- Process multiple documents with minimum verification thresholds:
709
-
710
- ```typescript
711
- interface ProcessingResult {
712
- documentId: string;
713
- response: string;
714
- verificationRate: number;
715
- passed: boolean;
716
- citations: Record<string, FoundHighlightLocation>;
717
- }
718
-
719
- async function batchProcessWithQualityGate(
720
- documents: { id: string; file: File; question: string }[],
721
- minimumVerificationRate = 0.8 // 80% of citations must verify
722
- ): Promise<ProcessingResult[]> {
723
- const dc = new DeepCitation({ apiKey: process.env.DEEPCITATION_API_KEY });
724
- const results: ProcessingResult[] = [];
725
-
726
- for (const doc of documents) {
727
- const { fileDataParts, fileDeepTexts } = await dc.prepareFiles([
728
- { file: doc.file, filename: doc.file.name }
729
- ]);
730
-
731
- const llmResponse = await generateResponse(doc.question, fileDeepTexts);
732
-
733
- const verification = await dc.verifyCitations({
734
- llmOutput: llmResponse,
735
- fileDataParts,
736
- });
737
-
738
- const citations = Object.values(verification.citations);
739
- const verifiedCount = citations.filter(
740
- c => getCitationStatus(c).isVerified
741
- ).length;
742
- const verificationRate = citations.length > 0
743
- ? verifiedCount / citations.length
744
- : 1;
745
-
746
- results.push({
747
- documentId: doc.id,
748
- response: llmResponse,
749
- verificationRate,
750
- passed: verificationRate >= minimumVerificationRate,
751
- citations: verification.citations,
752
- });
753
- }
754
-
755
- // Log quality metrics
756
- const passRate = results.filter(r => r.passed).length / results.length;
757
- console.log(`Batch quality: ${(passRate * 100).toFixed(1)}% passed threshold`);
758
-
759
- return results;
760
- }
761
- ```
762
-
763
- ### Pattern 6: Handling Page/Line Discrepancies
764
-
765
- When citations are found but on different pages or lines:
766
-
767
- ```typescript
768
- function analyzeLocationDiscrepancy(highlight: FoundHighlightLocation) {
769
- const { searchState } = highlight;
770
-
771
- if (searchState?.status === "found_on_other_page") {
772
- return {
773
- type: "page_mismatch",
774
- message: `Citation found on page ${searchState.actualPage} (expected page ${searchState.expectedPage})`,
775
- severity: "low", // Content is verified, just wrong page reference
776
- suggestion: "LLM may have miscounted pages or document has different pagination",
777
- };
778
- }
779
-
780
- if (searchState?.status === "found_on_other_line") {
781
- return {
782
- type: "line_mismatch",
783
- message: `Citation found on lines ${searchState.actualLineIds?.join(", ")} (expected ${searchState.expectedLineIds?.join(", ")})`,
784
- severity: "low",
785
- suggestion: "Line numbers shifted, but content is verified",
786
- };
787
- }
788
-
789
- if (searchState?.status === "partial_text_found") {
790
- return {
791
- type: "partial_match",
792
- message: `Only part of the cited text was found: "${highlight.matchSnippet}"`,
793
- severity: "medium",
794
- suggestion: "LLM may have paraphrased or combined text from multiple locations",
795
- };
796
- }
797
-
798
- return null;
799
- }
800
- ```
801
-
802
- ---
803
-
804
- ## Error Handling
805
-
806
- ### API Errors
807
-
808
- ```typescript
809
- try {
810
- const result = await dc.verifyCitations({ llmOutput, fileDataParts });
811
- } catch (error) {
812
- if (error.code === "unauthenticated") {
813
- // Invalid or expired API key
814
- console.error("Invalid API key. Get a new one at deepcitation.com");
815
- } else if (error.code === "payment-required") {
816
- // Usage limits exceeded
817
- console.error("Free tier exhausted. Add payment method to continue.");
818
- } else if (error.code === "invalid-argument") {
819
- // Bad request (malformed citations, invalid file references)
820
- console.error("Invalid request:", error.message);
821
- } else if (error.code === "not-found") {
822
- // File not found (expired or never uploaded)
823
- console.error("File not found. Files expire after 30 days.");
824
- } else {
825
- // Network or internal error - safe to retry
826
- console.error("Temporary error, retrying...", error);
827
- await new Promise((resolve) => setTimeout(resolve, 1000)); // wait 1s before the single retry
828
- return dc.verifyCitations({ llmOutput, fileDataParts });
829
- }
830
- }
831
- ```
832
-
833
- ### Graceful Degradation
834
-
835
- When verification fails, fall back to showing unverified citations:
836
-
837
- ```typescript
838
- async function verifyWithFallback(
839
- dc: DeepCitation,
840
- llmOutput: string,
841
- fileDataParts: FileDataPart[]
842
- ) {
843
- try {
844
- return await dc.verifyCitations({ llmOutput, fileDataParts });
845
- } catch (error) {
846
- console.warn("Verification unavailable, showing unverified citations");
847
-
848
- // Parse citations without verification
849
- const citations = getAllCitationsFromLlmOutput(llmOutput);
850
-
851
- // Return mock "pending" status for all citations
852
- return {
853
- citations: Object.fromEntries(
854
- Object.entries(citations).map(([key, citation]) => [
855
- key,
856
- {
857
- citation,
858
- searchState: { status: "pending" as const },
859
- verificationImageBase64: null,
860
- }
861
- ])
862
- )
863
- };
864
- }
865
- }
866
- ```
867
-
868
- ---
869
-
870
- ## React Components
871
-
872
- ### Pre-built Components
873
-
874
- ```tsx
875
- import {
876
- CitationComponent,
877
- ChipCitation,
878
- SuperscriptCitation,
879
- FootnoteCitation,
880
- InlineCitation,
881
- MinimalCitation,
882
- UrlCitationComponent,
155
+ CitationComponent, // Primary citation display component
156
+ CitationVariants, // Alternative citation styles
157
+ UrlCitationComponent, // For URL-based citations
883
158
  } from "@deepcitation/deepcitation-js/react";
884
-
885
- // Classic bracket style [1]
886
- <CitationComponent
887
- citation={citation}
888
- foundCitation={verificationResult}
889
- displayCitationValue={true}
890
- eventHandlers={{
891
- onClick: (citation, key, e) => scrollToSource(citation),
892
- onMouseEnter: (citation, key) => showPreview(citation),
893
- }}
894
- />
895
-
896
- // Chip/badge style
897
- <ChipCitation citation={citation} foundCitation={result} size="md" />
898
-
899
- // Superscript style (academic)
900
- <SuperscriptCitation citation={citation} foundCitation={result} />
901
-
902
- // Footnote marker style
903
- <FootnoteCitation citation={citation} symbolStyle="asterisk" />
904
-
905
- // Inline with subtle underline
906
- <InlineCitation citation={citation} underlineStyle="dotted" />
907
-
908
- // Minimal - just the number
909
- <MinimalCitation citation={citation} showStatusIndicator={true} />
910
159
  ```
911
160
 
912
- ### Composable Primitives
913
-
914
- Build custom citation components using composable primitives:
915
-
916
- ```tsx
917
- import { Citation, useCitationContext } from "@deepcitation/deepcitation-js/react";
918
-
919
- function CustomCitation({ citation, foundCitation, onClick }) {
920
- return (
921
- <Citation.Root citation={citation} foundCitation={foundCitation}>
922
- <Citation.Trigger onCitationClick={onClick}>
923
- <Citation.Bracket>
924
- <Citation.Number />
925
- <Citation.Indicator />
926
- </Citation.Bracket>
927
- </Citation.Trigger>
928
- </Citation.Root>
929
- );
930
- }
931
-
932
- // Custom chip-style
933
- function ChipCitation({ citation, foundCitation }) {
934
- return (
935
- <Citation.Root citation={citation} foundCitation={foundCitation}>
936
- <Citation.Trigger className="px-2 py-0.5 rounded-full bg-blue-100">
937
- <Citation.Value separator="" />
938
- <Citation.Bracket open="[" close="]">
939
- <Citation.Number />
940
- </Citation.Bracket>
941
- <Citation.Indicator
942
- verifiedIndicator={<CheckIcon />}
943
- partialIndicator={<AlertIcon />}
944
- />
945
- </Citation.Trigger>
946
- </Citation.Root>
947
- );
948
- }
949
-
950
- // Status-aware rendering
951
- function StatusCitation({ citation, foundCitation }) {
952
- return (
953
- <Citation.Root citation={citation} foundCitation={foundCitation}>
954
- <Citation.Status>
955
- {(status) => (
956
- <span className={status.isVerified ? "text-green-600" : "text-gray-500"}>
957
- <Citation.Trigger>
958
- <Citation.Phrase maxLength={50} />
959
- {status.isVerified && " ✓"}
960
- </Citation.Trigger>
961
- </span>
962
- )}
963
- </Citation.Status>
964
- </Citation.Root>
965
- );
966
- }
967
- ```
968
-
969
- ### Available Primitives
970
-
971
- | Primitive | Description |
972
- |-----------|-------------|
973
- | `Citation.Root` | Context provider, wraps all other primitives |
974
- | `Citation.Trigger` | Interactive element with event handlers |
975
- | `Citation.Bracket` | Renders brackets around content |
976
- | `Citation.Number` | Displays citation number |
977
- | `Citation.Value` | Displays citation value/summary |
978
- | `Citation.Indicator` | Status indicator (✓, *, etc.) |
979
- | `Citation.Status` | Render prop for accessing status |
980
- | `Citation.Phrase` | Displays full phrase with truncation |
981
- | `Citation.Page` | Displays page number |
982
-
983
- ### Hooks
161
+ ### Types
984
162
 
985
163
  ```typescript
986
- import { useCitationContext, useCitationContextSafe } from "@deepcitation/deepcitation-js/react";
987
-
988
- // Must be inside Citation.Root
989
- const { citation, status, citationKey } = useCitationContext();
990
-
991
- // Safe version (returns null if not in context)
992
- const context = useCitationContextSafe();
164
+ import type {
165
+ Citation,
166
+ FoundHighlightLocation,
167
+ SearchState,
168
+ SearchStatus,
169
+ } from "@deepcitation/deepcitation-js";
993
170
  ```
994
171
 
995
172
  ---
996
173
 
997
- ## Styling
998
-
999
- ### CSS Import
1000
-
1001
- ```css
1002
- @import "@deepcitation/deepcitation-js/react/styles.css";
1003
- ```
1004
-
1005
- ### CSS Custom Properties
174
+ ## Examples
1006
175
 
1007
- ```css
1008
- :root {
1009
- --citation-color-verified: #22c55e;
1010
- --citation-color-partial: #eab308;
1011
- --citation-color-miss: #ef4444;
1012
- --citation-color-pending: #9ca3af;
1013
- --citation-color-primary: #3b82f6;
176
+ Check out the [examples directory](./examples) for complete, runnable examples:
1014
177
 
1015
- --citation-bg-verified: rgba(34, 197, 94, 0.1);
1016
- --citation-bg-partial: rgba(234, 179, 8, 0.1);
1017
- --citation-bg-miss: rgba(239, 68, 68, 0.1);
178
+ - [**basic-verification**](./examples/basic-verification): Core 3-step workflow
179
+ - [**support-bot**](./examples/support-bot): Customer support bot with invisible citations
180
+ - [**nextjs-ai-sdk**](./examples/nextjs-ai-sdk): Full-stack Next.js chat app
1018
181
 
1019
- --citation-font-size-sm: 0.75em;
1020
- --citation-font-size-md: 0.875em;
1021
- --citation-border-radius: 9999px;
1022
- }
182
+ ```bash
183
+ cd examples/basic-verification
184
+ npm install
185
+ cp .env.example .env # Add your API keys
186
+ npm run start:openai
1023
187
  ```
1024
188
 
1025
189
  ---
1026
190
 
1027
- ## Types
1028
-
1029
- ### Core Types
1030
-
1031
- ```typescript
1032
- // Citation extracted from LLM output
1033
- interface Citation {
1034
- fileId?: string; // Document identifier
1035
- startPageKey?: string | null; // Page key format: "page_number_PAGE_index_INDEX"
1036
- pageNumber?: number | null; // Page number (1-indexed)
1037
- fullPhrase?: string | null; // Exact verbatim text from source
1038
- value?: string | null; // Citation value/summary
1039
- lineIds?: number[] | null; // Line numbers in document
1040
- reasoning?: string | null; // LLM's reasoning for citation
1041
- citationNumber?: number; // Sequential citation number
1042
- beforeCite?: string; // Text before <cite> tag
1043
- selection?: ScreenBox | null; // Bounding box coordinates
1044
- formFieldName?: string | null; // Form field name (for form documents)
1045
- formFieldValue?: string | null; // Form field value
1046
- timestamps?: { // For audio/video citations
1047
- startTime?: string; // Format: "HH:MM:SS.SSS"
1048
- endTime?: string;
1049
- };
1050
- }
1051
-
1052
- // Verification result for a citation
1053
- interface FoundHighlightLocation {
1054
- searchState: SearchState; // Verification status and details
1055
- pageNumber?: number | null; // Page where citation was found
1056
- matchSnippet?: string | null; // Matched text snippet
1057
- lowerCaseSearchTerm?: string | null; // Search term used
1058
- verificationImageBase64?: string | null; // Base64 screenshot proof
1059
- verificationImageUrl?: string | null; // Hosted image URL
1060
- verifiedAt?: Date; // Verification timestamp
1061
- source?: string | null; // Engine version
1062
- citation?: Citation; // Original citation data
1063
- label?: string | null; // e.g., "Invoice", "Contract"
1064
- hitIndexWithinPage?: number | null; // Match index on page
1065
- pdfSpaceItem?: PdfSpaceItem; // PDF coordinate data
1066
- }
1067
-
1068
- // Verification status with location comparison
1069
- interface SearchState {
1070
- status: SearchStatus; // Verification result
1071
- expectedPage?: number | null; // Page claimed by LLM
1072
- actualPage?: number | null; // Page where found
1073
- expectedLineIds?: number[] | null; // Lines claimed by LLM
1074
- actualLineIds?: number[] | null; // Actual lines
1075
- expectedTimestamps?: { startTime?: string; endTime?: string };
1076
- actualTimestamps?: { startTime?: string; endTime?: string };
1077
- }
1078
-
1079
- // All possible verification statuses
1080
- type SearchStatus =
1081
- | "found" // ✓ Exact match at expected location
1082
- | "found_value_only" // Found value, phrase differs
1083
- | "found_phrase_missed_value" // Found phrase, value differs
1084
- | "partial_text_found" // Partial text match
1085
- | "found_on_other_page" // Found on different page
1086
- | "found_on_other_line" // Found on different line
1087
- | "first_word_found" // Only first word matched
1088
- | "not_found" // ✗ Citation not found
1089
- | "pending" // Page processing, will retry
1090
- | "loading" // Verification in progress
1091
- | "timestamp_wip"; // Audio/video timestamp processing
1092
-
1093
- // Simplified status for UI logic
1094
- interface CitationStatus {
1095
- isVerified: boolean; // Trustworthy citation
1096
- isMiss: boolean; // Not found in document
1097
- isPartialMatch: boolean; // Found with discrepancies
1098
- isPending: boolean; // Still processing
1099
- }
1100
- ```
1101
-
1102
- ### File Upload Types
1103
-
1104
- ```typescript
1105
- // File upload response
1106
- interface UploadFileResponse {
1107
- fileId: string; // Unique file identifier (custom or auto-generated)
1108
- fileDeepText: string; // Formatted text for LLM with page markers and line IDs
1109
- formFields?: Array<{ // Extracted form fields (for PDF forms)
1110
- name: string;
1111
- value?: string;
1112
- pageIndex?: number;
1113
- type?: string;
1114
- }>;
1115
- metadata: {
1116
- filename: string;
1117
- mimeType: string;
1118
- pageCount: number;
1119
- textByteSize: number;
1120
- };
1121
- status: "ready" | "error";
1122
- processingTimeMs?: number;
1123
- error?: string;
1124
- }
1125
-
1126
- // File reference for verification (opaque - pass directly to verifyCitations)
1127
- interface FileDataPart {
1128
- fileId: string;
1129
- }
1130
-
1131
- // Result from prepareFiles()
1132
- interface PrepareFilesResult {
1133
- fileDataParts: FileDataPart[]; // Pass to verifyCitations()
1134
- fileDeepTexts: string[]; // Pass to wrapCitationPrompt({ fileDeepText })
1135
- }
1136
- ```
191
+ ## Documentation
1137
192
 
1138
- ### Conversion Types
193
+ For comprehensive documentation including:
194
+ - Full API reference
195
+ - Integration patterns
196
+ - Error handling
197
+ - Advanced React components
198
+ - TypeScript types
1139
199
 
1140
- ```typescript
1141
- // Input for convertToPdf()
1142
- interface ConvertFileInput {
1143
- url?: string; // URL to convert to PDF
1144
- file?: File | Blob | Buffer; // Office file to convert
1145
- filename?: string; // Custom filename
1146
- fileId?: string; // Custom file ID
1147
- singlePage?: boolean; // For URLs: single long page
1148
- }
1149
-
1150
- // Response from convertToPdf()
1151
- interface ConvertFileResponse {
1152
- fileId: string; // Use with prepareConvertedFile()
1153
- metadata: {
1154
- originalFilename: string; // Original filename
1155
- originalMimeType: string; // Original MIME type
1156
- convertedMimeType: string; // Always "application/pdf"
1157
- conversionTimeMs: number; // Conversion duration
1158
- };
1159
- status: "converted" | "error";
1160
- error?: string;
1161
- }
1162
-
1163
- // Options for prepareConvertedFile()
1164
- interface PrepareConvertedFileOptions {
1165
- fileId: string; // From convertToPdf()
1166
- }
1167
- ```
1168
-
1169
- ### Geometry Types
1170
-
1171
- ```typescript
1172
- interface ScreenBox {
1173
- x: number;
1174
- y: number;
1175
- width: number;
1176
- height: number;
1177
- }
1178
-
1179
- interface PdfSpaceItem extends ScreenBox {
1180
- text?: string;
1181
- }
1182
- ```
200
+ Visit **[deepcitation.com/docs](https://deepcitation.com/docs)**
1183
201
 
1184
202
  ---
1185
203
 
1186
204
  ## Supported File Types
1187
205
 
1188
- ### Direct Upload (via `prepareFiles()` or `uploadFile()`)
1189
-
1190
- - PDF documents (native and scanned with OCR)
1191
- - Images (PNG, JPEG, TIFF, WebP, AVIF, HEIC)
1192
-
1193
- ### URL Conversion (via `convertToPdf()`)
1194
-
1195
- Convert any publicly accessible URL to PDF for citation verification:
1196
-
1197
- - Web pages (HTML rendered to PDF)
1198
- - Direct PDF links (downloaded and processed)
1199
-
1200
- ### Office Documents (via `convertToPdf()`)
1201
-
1202
- Convert Microsoft Office and OpenDocument formats to PDF:
1203
-
1204
- | Format | Extensions |
1205
- |--------|------------|
1206
- | Microsoft Word | `.doc`, `.docx` |
1207
- | Microsoft Excel | `.xls`, `.xlsx` |
1208
- | Microsoft PowerPoint | `.ppt`, `.pptx` |
1209
- | OpenDocument | `.odt`, `.ods`, `.odp` |
1210
- | Rich Text Format | `.rtf` |
1211
- | CSV | `.csv` |
1212
-
1213
- #### Office/URL Conversion Example
1214
-
1215
- ```typescript
1216
- import { DeepCitation, wrapCitationPrompt } from "@deepcitation/deepcitation-js";
1217
-
1218
- const dc = new DeepCitation({ apiKey: process.env.DEEPCITATION_API_KEY });
1219
-
1220
- // Convert a URL to PDF
1221
- const urlResult = await dc.convertToPdf("https://example.com/article");
1222
-
1223
- // Or convert an Office document
1224
- const docResult = await dc.convertToPdf({
1225
- file: docxBuffer,
1226
- filename: "report.docx",
1227
- });
1228
-
1229
- // Then prepare the converted file for citation verification
1230
- const { fileDeepText, fileId } = await dc.prepareConvertedFile({
1231
- fileId: urlResult.fileId,
1232
- });
1233
-
1234
- // Use fileDeepText in your LLM prompt via wrapCitationPrompt
1235
- const { enhancedSystemPrompt, enhancedUserPrompt } = wrapCitationPrompt({
1236
- systemPrompt: "You are a helpful assistant...",
1237
- userPrompt: userMessage,
1238
- fileDeepText, // Automatically wrapped in <file_text> tags
1239
- });
1240
-
1241
- // After LLM response, verify citations
1242
- const verified = await dc.verifyCitations(fileId, citations);
1243
- ```
206
+ **Documents:** PDF (native and scanned), URLs, Office formats (`.docx`, `.xlsx`, `.pptx`, etc.)
207
+ **Images:** PNG, JPEG, TIFF, WebP, AVIF, HEIC
208
+ **Media:** Audio and video (with timestamp-based citations)
1244
209
 
1245
210
  ---
1246
211
 
1247
212
  ## Contributing
1248
213
 
1249
- We welcome contributions! Here's how to get started:
1250
-
1251
- ### Setting Up Development Environment
1252
-
1253
- ```bash
1254
- # Fork and clone the repository
1255
- git clone https://github.com/YOUR_USERNAME/deepcitation-js.git
1256
- cd deepcitation-js
1257
-
1258
- # Install dependencies
1259
- bun install
1260
-
1261
- # Build the package
1262
- bun run build
1263
-
1264
- # Run tests
1265
- bun run test
1266
- ```
1267
-
1268
- ### Making Changes
1269
-
1270
- 1. Create a feature branch from `main`:
1271
- ```bash
1272
- git checkout -b feature/your-feature-name
1273
- ```
1274
-
1275
- 2. Make your changes and ensure tests pass:
1276
- ```bash
1277
- bun run test
1278
- bun run lint
1279
- ```
1280
-
1281
- 3. Commit with a descriptive message:
1282
- ```bash
1283
- git commit -m "feat: add new citation format support"
1284
- ```
1285
-
1286
- 4. Push and open a Pull Request against `main`.
1287
-
1288
- ### Pull Request Guidelines
1289
-
1290
- - Keep PRs focused on a single change
1291
- - Include tests for new functionality
1292
- - Update documentation as needed
1293
- - Follow the existing code style
1294
- - Ensure all CI checks pass
1295
-
1296
- ### Commit Message Convention
1297
-
1298
- We follow [Conventional Commits](https://www.conventionalcommits.org/):
1299
-
1300
- - `feat:` - New features
1301
- - `fix:` - Bug fixes
1302
- - `docs:` - Documentation changes
1303
- - `refactor:` - Code refactoring
1304
- - `test:` - Test additions or fixes
1305
- - `chore:` - Maintenance tasks
214
+ Contributions are welcome! Please see [CONTRIBUTING.md](./CONTRIBUTING.md) for guidelines.
1306
215
 
1307
216
  ---
1308
217
 
1309
- ## Reporting Issues
1310
-
1311
- Found a bug or have a feature request? [Open an issue on GitHub](https://github.com/deepcitation/deepcitation-js/issues/new).
1312
-
1313
- ### Bug Reports
1314
-
1315
- When reporting bugs, please include:
1316
-
1317
- - **Description**: Clear explanation of the issue
1318
- - **Reproduction steps**: Minimal code to reproduce the problem
1319
- - **Expected behavior**: What you expected to happen
1320
- - **Actual behavior**: What actually happened
1321
- - **Environment**: Node.js version, package version, OS
1322
-
1323
- Example:
1324
- ```markdown
1325
- **Description**
1326
- Citations with special characters are not parsed correctly.
1327
-
1328
- **Reproduction**
1329
- const result = getAllCitationsFromLlmOutput('<cite page="1">Test & Co.</cite>');
1330
- // Returns empty object instead of parsed citation
1331
-
1332
- **Expected**: Citation with value "Test & Co." parsed
1333
- **Actual**: Empty object returned
1334
-
1335
- **Environment**: Node.js 20.x, @deepcitation/deepcitation-js v1.0.0, macOS 14
1336
- ```
1337
-
1338
- ### Feature Requests
1339
-
1340
- For feature requests, please describe:
218
+ ## License
1341
219
 
1342
- - **Use case**: What problem are you trying to solve?
1343
- - **Proposed solution**: How would you like it to work?
1344
- - **Alternatives considered**: Other approaches you've thought of
220
+ MIT License - see [LICENSE](./LICENSE) for details.
1345
221
 
1346
222
  ---
1347
223
 
1348
- ## License
1349
-
1350
- MIT
1351
-
1352
224
  ## Links
1353
225
 
1354
226
  - [Documentation](https://deepcitation.com/docs)
1355
- - [Get Free API Key](https://deepcitation.com/signup)
1356
- - [GitHub](https://github.com/deepcitation/deepcitation)
227
+ - [Get API Key](https://deepcitation.com/signup)
228
+ - [Discord Community](https://discord.gg/deepcitation)
229
+ - [GitHub Issues](https://github.com/deepcitation/deepcitation-js/issues)
230
+ - [Examples](./examples)