npm - @pdfvector/client - Versions diffs - 0.0.29 → 0.0.31 - Mend

@pdfvector/client 0.0.29 → 0.0.31

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3)

package/CHANGELOG.md CHANGED

@@ -1,5 +1,25 @@
 # @pdfvector/client
+## 0.0.31
+### Patch Changes
+- [#244](https://github.com/phuctm97/pdfvector/pull/244) [`d751cdd`](https://github.com/phuctm97/pdfvector/commit/d751cdde1c208c3298d1a0c2c34406e724e53264) Thanks [@khanhduyvt0101](https://github.com/khanhduyvt0101)! - Improve PDF Vector SDK error handling.
+- Updated dependencies [[`d751cdd`](https://github.com/phuctm97/pdfvector/commit/d751cdde1c208c3298d1a0c2c34406e724e53264)]:
+  - @pdfvector/instance-client@0.0.51
+## 0.0.30
+### Patch Changes
+- [#240](https://github.com/phuctm97/pdfvector/pull/240) [`2c8691c`](https://github.com/phuctm97/pdfvector/commit/2c8691c9bbd251ff7b7a153fd4254d9360c11c08) Thanks [@khanhduyvt0101](https://github.com/khanhduyvt0101)! - Add academic.parse to resolve academic paper IDs or provider URLs to public PDFs and parse them to markdown.
+- Updated dependencies []:
+  - @pdfvector/instance-client@0.0.50
 ## 0.0.29
 ### Patch Changes

package/README.md CHANGED

@@ -1,6 +1,6 @@
 # PDF Vector TypeScript/JavaScript SDK
-The official TypeScript/JavaScript SDK for the [PDF Vector](https://www.pdfvector.com) API: Parse PDF, Word, Image, and Excel documents to clean, structured markdown format, ask questions about documents using AI, extract structured data from documents with JSON Schema, search across multiple academic databases with a unified API, fetch specific publications by DOI, PubMed ID, ArXiv ID, and more, find relevant academic citations for paragraphs of text, explore paper citation graphs, find similar papers, and search for research grants across US, EU, and UK funding databases.
+The official TypeScript/JavaScript SDK for the [PDF Vector](https://www.pdfvector.com) API: Parse PDF, Word, Image, and Excel documents to clean, structured markdown format, ask questions about documents using AI, extract structured data from documents with JSON Schema, search across multiple academic databases with a unified API, fetch specific publications by DOI, PubMed ID, ArXiv ID, and more, convert academic paper IDs or provider URLs to markdown, find relevant academic citations for paragraphs of text, explore paper citation graphs, find similar papers, and search for research grants across US, EU, and UK funding databases.
 ## Installation
@@ -380,6 +380,36 @@ result.errors?.forEach((error) => {
 **Supported ID types:** DOI, PubMed ID, ArXiv ID, Semantic Scholar ID, ERIC ID, Europe PMC ID, OpenAlex ID.
+### Parse Academic Paper to Markdown
+Resolve a paper ID or provider URL to its public PDF and parse it into markdown. Uses the same per-page model pricing as Document Parse.
+```typescript
+const result = await client.academic.parse({
+  id: "1706.03762", // DOI, PubMed ID, ArXiv ID, Semantic Scholar ID, or provider URL
+  model: "auto",    // "auto" | "nano" | "mini" | "pro" | "max"
+});
+console.log(`Title: ${result.title}`);
+console.log(`Provider: ${result.detectedProvider}`);
+console.log(`PDF: ${result.pdfURL}`);
+console.log(result.markdown);
+console.log(`Pages: ${result.pageCount}, Credits: ${result.credits}`);
+```
+You can pass a provider URL instead of an ID:
+```typescript
+const result = await client.academic.parse({
+  url: "https://arxiv.org/abs/1706.03762",
+  model: "nano",
+});
+console.log(result.markdown);
+```
+Provide exactly one of `id` or `url`. If the paper cannot be found, has no public PDF, or the resolved PDF cannot be fetched, the API returns a typed `PDFVectorError` with a clear message and no parse credits are charged.
 ### Find Citations for a Paragraph
 Find relevant academic citations for each sentence in a paragraph using semantic similarity. Costs 2 credits per sentence analyzed.
@@ -573,6 +603,7 @@ console.log(resultB.documentId); // "doc-b"
 | Bank Statement Extract | 6 | 10 | 14 | 18 | /page |
 | Academic Search | 2 | 2 | 2 | 2 | /request |
 | Academic Fetch | 2 | 2 | 2 | 2 | /request |
+| Academic Parse | 1 | 2 | 4 | 8 | /page |
 | Academic Find Citations | 2 | 2 | 2 | 2 | /sentence |
 | Academic Paper Graph | 2+ | 2+ | 2+ | 2+ | /request |
 | Academic Similar Papers | 3 | 3 | 3 | 3 | /request |
@@ -580,10 +611,14 @@ console.log(resultB.documentId); // "doc-b"
 ## Error Handling
-All API errors are thrown as `PDFVectorError` instances. The SDK transparently maps every server error into the most specific subclass it can, so you can branch on the type using `instanceof` and read typed metadata fields directly.
+All API errors are thrown as `PDFVectorError` instances. The SDK maps server errors into specific subclasses and adds user/agent-friendly fields such as `title`, `suggestion`, `userError`, retry flags, and `requestId`.
 ```typescript
-import { createClient, PDFVectorError } from "@pdfvector/client";
+import {
+  PDFVectorError,
+  createClient,
+  isPDFVectorUserError,
+} from "@pdfvector/client";
 const client = createClient({ apiKey: "your-api-key" });
@@ -593,35 +628,59 @@ try {
   });
   console.log(result.markdown);
 } catch (error) {
+  if (isPDFVectorUserError(error)) {
+    console.error(error.title);
+    console.error(error.suggestion);
+    return;
+  }
   if (error instanceof PDFVectorError) {
-    console.error(`API Error [${error.code}]: ${error.message}`);
-    console.error(`HTTP Status: ${error.status}`);
-    console.error(`Request ID: ${error.requestId}`);   // server-assigned, useful for support
-    console.error(`Document ID: ${error.documentId}`); // echoed back if you set one
-    console.error(`User error: ${error.userError}`);   // true if caused by your input
-  } else {
-    // Network errors (DNS, connection refused, timeout) bubble up as TypeError.
-    console.error("Unexpected Error:", error);
+    console.error(error.supportMessage);
+    console.error(error.toAgentError());
+    return;
+  }
+  // Network errors (DNS, connection refused, timeout) bubble up as TypeError.
+  console.error("Unexpected Error:", error);
+}
+```
+### User errors
+Use `isPDFVectorUserError(error)` or `error.userError` for caller-fixable failures that should usually be shown to the user instead of reported as system failures. For example, URL input failures such as `URL did not return a supported document` are `URLFetchError` instances with `userError: true`.
+```typescript
+import { isPDFVectorUserError, isPDFVectorError } from "@pdfvector/client";
+try {
+  await client.document.parse({ url: "https://example.com/page.html" });
+} catch (error) {
+  if (isPDFVectorUserError(error)) {
+    console.error(error.suggestion);
+  }
+  if (isPDFVectorError(error) && error.retryableWithHigherModel) {
+    console.error("Retry with a stronger model or a smaller document.");
   }
 }
 ```
 ### Branching on specific error types
-Every error class extends `PDFVectorError`, so you can use `instanceof` to handle specific cases. Specialized subclasses expose typed fields pulled from the error's `data` payload:
+Every error class extends `PDFVectorError`, so you can use `instanceof` to handle specific cases. Specialized subclasses expose typed fields pulled from the error payload:
 ```typescript
 import {
-  createClient,
+  EmptyDocumentError,
+  ExtractionFailedError,
   FileTooLargeError,
+  InvalidSchemaError,
+  NoPublicPDFError,
   PageLimitExceededError,
   PasswordProtectedError,
-  URLFetchError,
-  UnauthorizedError,
   TooManyRequestsError,
-  EmptyDocumentError,
-  ExtractionFailedError,
-  PDFVectorError,
+  UnauthorizedError,
+  URLFetchError,
 } from "@pdfvector/client";
 try {
@@ -633,14 +692,18 @@ try {
     );
   } else if (error instanceof PageLimitExceededError) {
     console.error(
-      `Document has ${error.pageCount} pages — ${error.model} only supports up to ${error.pageLimit}`,
+      `Document has ${error.pageCount} pages; ${error.model} supports up to ${error.pageLimit}`,
     );
   } else if (error instanceof PasswordProtectedError) {
     console.error("Remove the password from the file and try again");
   } else if (error instanceof URLFetchError) {
-    console.error(`Could not fetch ${error.url}: ${error.statusCode} ${error.statusText}`);
+    console.error(error.suggestion);
+  } else if (error instanceof InvalidSchemaError) {
+    console.error(error.reason);
+  } else if (error instanceof NoPublicPDFError) {
+    console.error("Provide a direct PDF URL or upload the paper file directly");
   } else if (error instanceof UnauthorizedError) {
-    console.error("Invalid API key — check your dashboard");
+    console.error("Invalid API key; check your dashboard");
   } else if (error instanceof TooManyRequestsError) {
     console.error(`Rate limit ${error.limit} exceeded; resets at ${error.resetAt}`);
   } else if (error instanceof EmptyDocumentError) {
@@ -648,34 +711,6 @@ try {
   } else if (error instanceof ExtractionFailedError) {
     console.error(`Extraction failed. Hint: ${error.hint}`);
     if (error.rawText) console.error(`Model output sample: ${error.rawText}`);
-  } else if (error instanceof PDFVectorError) {
-    // Catch-all for any error code not specifically handled
-    console.error(`API Error [${error.code}]: ${error.message}`);
-  }
-}
-```
-You can also branch on the error code if you prefer:
-```typescript
-try {
-  await client.document.parse({ url: "..." });
-} catch (error) {
-  if (error instanceof PDFVectorError) {
-    switch (error.code) {
-      case "UNAUTHORIZED":
-        console.error("Invalid API key");
-        break;
-      case "BAD_REQUEST":
-        console.error("Validation error:", error.message);
-        break;
-      case "UNPROCESSABLE_CONTENT":
-        console.error("Could not process document:", error.message);
-        break;
-      case "INTERNAL_SERVER_ERROR":
-        console.error(`Server error (requestId: ${error.requestId}):`, error.message);
-        break;
-    }
   }
 }
 ```
@@ -690,13 +725,17 @@ PDFVectorError
 │   ├── PasswordProtectedError
 │   ├── UnsupportedFormatError            — format, supportedFormats
 │   ├── URLFetchError                     — url, statusCode, statusText
+│   ├── InvalidDocumentURLError
+│   ├── InvalidBase64Error
 │   ├── TierNotSupportedError             — documentType, model, allowedTypes
 │   ├── InvalidSchemaError                — reason
 │   └── NoInputProvidedError
 ├── UnauthorizedError               (401)
 ├── NotFoundError                   (404)
+│   ├── AcademicPaperNotFoundError        — input, paperErrorCode
+│   └── NoPublicPDFError                  — input, paperTitle, doi, providerURL
 ├── ConflictError                   (409)
-├── TooManyRequestsError            (429) — limit, resetAt
+├── TooManyRequestsError            (429) — limit, resetAt, retryAfterSeconds
 ├── UnprocessableContentError       (422)
 │   ├── EmptyDocumentError
 │   ├── NoTextDetectedError
@@ -709,42 +748,36 @@ PDFVectorError
 | Field | Type | Description |
 |-------|------|-------------|
-| `code` | `string` | The ORPC error code (`BAD_REQUEST`, `UNAUTHORIZED`, etc.) |
-| `status` | `number` | HTTP status code (400, 401, 404, 409, 422, 429, 500, 501) |
-| `message` | `string` | Human-readable error message |
-| `data` | `Record<string, unknown>` | Raw error payload from the server |
-| `requestId` | `number \| undefined` | Server-assigned request ID — include in support tickets |
+| `code` | `string` | API error code (`BAD_REQUEST`, `UNAUTHORIZED`, etc.) |
+| `status` | `number` | HTTP-style status code |
+| `title` | `string` | Short readable summary |
+| `message` | `string` | Server-provided error message |
+| `suggestion` | `string` | Recommended next action |
+| `category` | `string` | `authentication`, `validation`, `document_input`, `document_processing`, `rate_limit`, `not_found`, `conflict`, `unsupported`, or `server` |
+| `origin` | `"user" \| "system"` | Whether the failure is caller-fixable or likely server/provider-side |
+| `userError` | `boolean` | `true` for expected caller-fixable failures |
+| `retryable` | `boolean` | `true` when retrying may help |
+| `retryableWithHigherModel` | `boolean` | `true` when retrying with a stronger model or smaller document may help |
+| `requestId` | `number \| undefined` | Server-assigned request ID; include in support tickets |
 | `documentId` | `string \| undefined` | Echoed back if you passed `context.documentId` |
-| `userError` | `boolean` | `true` if the failure was caused by your input (vs. a server-side issue) |
-| `cause` | `unknown` | Original error (the underlying `ORPCError` from the wire) |
-### Type guard
-If you'd rather not import `PDFVectorError` just to do an `instanceof` check, use the `isPDFVectorError` guard:
-```typescript
-import { isPDFVectorError } from "@pdfvector/client";
+| `reasonCode` | `string \| undefined` | More specific server reason when available, such as `NO_PUBLIC_PDF` |
+| `supportMessage` | `string` | Compact support/logging message |
+| `data` | `Record<string, unknown>` | Raw error payload from the server |
+| `cause` | `unknown` | Original underlying error |
-try {
-  await client.document.parse({ url: "..." });
-} catch (error) {
-  if (isPDFVectorError(error)) {
-    console.error(error.code, error.message, error.requestId);
-  }
-}
-```
+Use `error.toAgentError()` or `JSON.stringify(error)` when you need a serializable error object for logs, workflows, retry planners, or agent tool responses.
 ### Error Codes
 | Code | Status | Description |
 |------|--------|-------------|
-| `BAD_REQUEST` | 400 | Input validation failed (e.g., missing fields, invalid URL, file too large, page limit exceeded, invalid JSON Schema) |
+| `BAD_REQUEST` | 400 | Input validation failed, including invalid URLs, unsupported formats, file size limits, page limits, invalid base64, and invalid JSON Schema |
 | `UNAUTHORIZED` | 401 | Missing or invalid API key |
-| `NOT_FOUND` | 404 | Resource not found (e.g., academic paper ID, version) |
+| `NOT_FOUND` | 404 | Resource not found, including academic paper IDs and papers without public PDFs |
 | `CONFLICT` | 409 | Operation conflicts with the current state |
-| `UNPROCESSABLE_CONTENT` | 422 | Document could not be processed (empty, no readable text, extraction failed) |
+| `UNPROCESSABLE_CONTENT` | 422 | Document could not be processed, including empty documents, no readable text, and extraction failures |
 | `TOO_MANY_REQUESTS` | 429 | Rate limit exceeded |
-| `INTERNAL_SERVER_ERROR` | 500 | Server-side failure — capture the `requestId` for support |
+| `INTERNAL_SERVER_ERROR` | 500 | Server-side failure; capture the `requestId` for support |
 | `NOT_IMPLEMENTED` | 501 | Endpoint not available on this instance |
 ## TypeScript Support
@@ -755,6 +788,7 @@ The SDK is written in TypeScript and includes full type definitions:
 import {
   createClient,
   isPDFVectorError,
+  isPDFVectorUserError,
   // Base error class — all errors inherit from this
   PDFVectorError,
   // HTTP-aligned error categories
@@ -772,12 +806,16 @@ import {
   PasswordProtectedError,
   UnsupportedFormatError,
   URLFetchError,
+  InvalidDocumentURLError,
+  InvalidBase64Error,
   TierNotSupportedError,
   InvalidSchemaError,
   NoInputProvidedError,
   EmptyDocumentError,
   NoTextDetectedError,
   ExtractionFailedError,
+  AcademicPaperNotFoundError,
+  NoPublicPDFError,
   // Underlying ORPC error — re-exported for advanced use cases
   ORPCError,
 } from "@pdfvector/client";
@@ -789,7 +827,10 @@ import type {
   ContractInputs,
   ContractOutputs,
   PDFVectorModel,
+  PDFVectorAgentError,
+  PDFVectorErrorCategory,
   PDFVectorErrorCode,
+  PDFVectorErrorOrigin,
 } from "@pdfvector/client";
 ```
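
The README changes above add `userError`, `retryable`, `retryableWithHigherModel`, `suggestion`, and (on `TooManyRequestsError`) `retryAfterSeconds`. As a rough illustration of how a caller might combine those flags, here is a minimal sketch of a retry planner. It does not import the SDK: `ErrorLike`, `NextStep`, `planNextStep`, and the 1000 ms default delay are all hypothetical names and values invented for this example, and the order in which the flags are checked is an assumption, not something the package documents.

```typescript
// Hypothetical structural type: it mirrors a few fields named in the README
// field table so the sketch runs without @pdfvector/client installed.
interface ErrorLike {
  userError: boolean;
  retryable: boolean;
  retryableWithHigherModel: boolean;
  suggestion: string;
  // Listed in the README only for TooManyRequestsError; assumed to be seconds.
  retryAfterSeconds?: number;
}

// Hypothetical result type describing what the caller should do next.
type NextStep =
  | { action: "show-to-user"; message: string }
  | { action: "retry-with-higher-model" }
  | { action: "retry"; delayMs: number }
  | { action: "report" };

// Assumed precedence: caller-fixable errors first, then model upgrades,
// then plain retries; anything else is reported as a system failure.
function planNextStep(error: ErrorLike, defaultDelayMs = 1000): NextStep {
  if (error.userError) {
    return { action: "show-to-user", message: error.suggestion };
  }
  if (error.retryableWithHigherModel) {
    return { action: "retry-with-higher-model" };
  }
  if (error.retryable) {
    const delayMs =
      error.retryAfterSeconds !== undefined
        ? error.retryAfterSeconds * 1000
        : defaultDelayMs;
    return { action: "retry", delayMs };
  }
  return { action: "report" };
}
```

In real code the argument would presumably be a caught `PDFVectorError`, narrowed with `isPDFVectorError` or `instanceof` before reading these fields; whether a given error sets more than one flag at once is not stated in the diff, so the precedence may need adjusting.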

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@pdfvector/client",
-  "version": "0.0.29",
+  "version": "0.0.31",
   "type": "module",
   "description": "Official TypeScript/JavaScript SDK for PDF Vector API",
   "license": "MIT",
@@ -23,7 +23,7 @@
   },
   "main": ".tsc/lib/index.js",
   "dependencies": {
-    "@pdfvector/instance-client": "^0.0.49"
+    "@pdfvector/instance-client": "^0.0.51"
   },
   "files": [
     ".tsc",