@noedgeai-org/doc2x-mcp 0.1.3-dev.6.1 → 0.1.4-dev.8.1

package/README.md CHANGED
@@ -1,29 +1,47 @@
1
1
  # Doc2x MCP Server
2
2
 
3
+ [![CI](https://github.com/NoEdgeAI/doc2x-mcp/actions/workflows/ci.yml/badge.svg)](https://github.com/NoEdgeAI/doc2x-mcp/actions/workflows/ci.yml)
4
+ [![Publish](https://github.com/NoEdgeAI/doc2x-mcp/actions/workflows/publish.yml/badge.svg)](https://github.com/NoEdgeAI/doc2x-mcp/actions/workflows/publish.yml)
5
+ [![npm version](https://img.shields.io/npm/v/%40noedgeai-org%2Fdoc2x-mcp)](https://www.npmjs.com/package/@noedgeai-org/doc2x-mcp)
6
+
3
7
  简体中文 | [English](./README_EN.md)
4
8
 
5
- 本项目提供一个基于 stdio 的 MCP Server,把 Doc2x v2 的 PDF/图片接口封装成语义化 tools。
9
+ Doc2x v2 PDF/图片能力封装为基于 stdio 的 MCP Server,提供稳定、可组合的语义化 tools。
10
+
11
+ ## 目录
12
+
13
+ - [项目定位](#项目定位)
14
+ - [版本与环境](#版本与环境)
15
+ - [快速开始](#快速开始)
16
+ - [配置参考](#配置参考)
17
+ - [Tool API 总览](#tool-api-总览)
18
+ - [常见工作流](#常见工作流)
19
+ - [本地开发](#本地开发)
20
+ - [CI / 发布流水线](#ci--发布流水线)
21
+ - [如何发布](#如何发布)
22
+ - [安装本仓库 Skill(可选)](#安装本仓库-skill可选)
23
+ - [安全与排错](#安全与排错)
24
+ - [问题反馈](#问题反馈)
25
+ - [License](#license)
6
26
 
7
- ## 1) 运行环境
27
+ ## 项目定位
8
28
 
9
- - Node.js >= 18
29
+ - 面向 MCP 客户端(Codex CLI / Claude Code / 自定义 Agent)提供 Doc2x 能力。
30
+ - 以 submit/status/wait 统一异步任务模型,便于自动化编排。
31
+ - 提供可控超时、轮询、下载白名单等运行时安全边界。
10
32
 
11
- ## 2) 配置
33
+ ## 版本与环境
12
34
 
13
- 通过环境变量配置:
35
+ - 本地运行:Node.js `>=18` 即可。
36
+ - CI 校验:Node.js `18`、`20` 都会跑构建。
37
+ - 发布环境:GitHub Actions 中发布任务使用 Node.js `24`。
38
+ - 包管理器:统一用 pnpm(锁文件 `pnpm-lock.yaml`)。
14
39
 
15
- - `DOC2X_API_KEY`:必填(形如 `sk-xxx`)
16
- - `DOC2X_BASE_URL`:可选,默认 `https://v2.doc2x.noedgeai.com`
17
- - `DOC2X_HTTP_TIMEOUT_MS`:可选,默认 `60000`
18
- - `DOC2X_POLL_INTERVAL_MS`:可选,默认 `2000`
19
- - `DOC2X_MAX_WAIT_MS`:可选,默认 `600000`
20
- - `DOC2X_PARSE_PDF_MAX_OUTPUT_CHARS`:可选,默认 `5000`;限制 `doc2x_parse_pdf_wait_text` 返回文本的最大字符数,避免大模型上下文超限(设为 `0` 表示不限制)
21
- - `DOC2X_PARSE_PDF_MAX_OUTPUT_PAGES`:可选,默认 `10`;限制 `doc2x_parse_pdf_wait_text` 合并的最大页数(设为 `0` 表示不限制)
22
- - `DOC2X_DOWNLOAD_URL_ALLOWLIST`:可选,默认 `".amazonaws.com.cn,.aliyuncs.com,.noedgeai.com"`;设为 `*` 可允许任意 host(不推荐)
40
+ ## 快速开始
23
41
 
24
- ## 3) 启动
42
+ ### 方式 A:通过 npx(推荐)
25
43
 
26
- ### 方式 A:通过 npx
44
+ MCP client 配置中添加:
27
45
 
28
46
  ```json
29
47
  {
@@ -40,11 +58,13 @@
40
58
 
41
59
  ```bash
42
60
  cd doc2x-mcp
43
- npm install
44
- npm run build
45
- DOC2X_API_KEY=sk-xxx npm start
61
+ pnpm install --frozen-lockfile
62
+ pnpm run build
63
+ DOC2X_API_KEY=sk-xxx pnpm start
46
64
  ```
47
65
 
66
+ MCP client 指向本地构建产物:
67
+
48
68
  ```json
49
69
  {
50
70
  "command": "node",
@@ -56,27 +76,34 @@ DOC2X_API_KEY=sk-xxx npm start
56
76
  }
57
77
  ```
58
78
 
59
- ## 4) Tools
60
-
61
- - `doc2x_parse_pdf_submit`
62
- - `doc2x_parse_pdf_status`
63
- - `doc2x_parse_pdf_wait_text`
64
- - `doc2x_convert_export_submit`
65
- - `doc2x_convert_export_result`
66
- - `doc2x_convert_export_wait`
67
- - `doc2x_download_url_to_file`
68
- - `doc2x_parse_image_layout_sync`
69
- - `doc2x_parse_image_layout_submit`
70
- - `doc2x_parse_image_layout_status`
71
- - `doc2x_parse_image_layout_wait_text`
72
- - `doc2x_materialize_convert_zip`
73
- - `doc2x_debug_config`
79
+ ## 配置参考
80
+
81
+ | 环境变量 | 必填 | 默认值 | 说明 |
82
+ | --- | --- | --- | --- |
83
+ | `DOC2X_API_KEY` | 是 | - | Doc2x API Key(`sk-xxx`) |
84
+ | `DOC2X_BASE_URL` | 否 | `https://v2.doc2x.noedgeai.com` | Doc2x API 基础地址 |
85
+ | `DOC2X_HTTP_TIMEOUT_MS` | 否 | `60000` | 单次 HTTP 超时(毫秒) |
86
+ | `DOC2X_POLL_INTERVAL_MS` | 否 | `2000` | 轮询间隔(毫秒) |
87
+ | `DOC2X_MAX_WAIT_MS` | 否 | `600000` | wait 类工具最大等待时长(毫秒) |
88
+ | `DOC2X_PARSE_PDF_MAX_OUTPUT_CHARS` | 否 | `5000` | `doc2x_parse_pdf_wait_text` 最大返回字符数;`0`=不限制 |
89
+ | `DOC2X_PARSE_PDF_MAX_OUTPUT_PAGES` | 否 | `10` | `doc2x_parse_pdf_wait_text` 最大合并页数;`0`=不限制 |
90
+ | `DOC2X_DOWNLOAD_URL_ALLOWLIST` | 否 | `.amazonaws.com.cn,.aliyuncs.com,.noedgeai.com` | 下载 URL 白名单;`*` 允许任意 host(不推荐) |
91
+
92
+ ## Tool API 总览
93
+
94
+ | 阶段 | Tools | 说明 |
95
+ | --- | --- | --- |
96
+ | PDF 解析 | `doc2x_parse_pdf_submit` / `doc2x_parse_pdf_status` / `doc2x_parse_pdf_wait_text` / `doc2x_materialize_pdf_layout_json` | 提交任务、查询状态、等待并取文本,或将 v3 layout 结果落盘为本地 JSON |
97
+ | 结果导出 | `doc2x_convert_export_submit` / `doc2x_convert_export_result` / `doc2x_convert_export_wait` | 发起导出、查结果、等待导出完成 |
98
+ | 下载落盘 | `doc2x_download_url_to_file` / `doc2x_materialize_convert_zip` | 下载 URL 到本地、解包 convert zip |
99
+ | 图片版面解析 | `doc2x_parse_image_layout_sync` / `doc2x_parse_image_layout_submit` / `doc2x_parse_image_layout_status` / `doc2x_parse_image_layout_wait_text` | 同步/异步图片 OCR 与版面解析 |
100
+ | 诊断 | `doc2x_debug_config` | 返回配置解析与 API key 来源,便于排错 |
74
101
 
75
102
  ### PDF 解析模型(`doc2x_parse_pdf_submit` / `doc2x_parse_pdf_wait_text`)
76
103
 
77
104
  - 可选参数:`model`
78
- - 可选值:仅 `v3-2026`(最新模型)
79
- - 说明:不传 `model` 时默认使用 `v2`;若想体验最新模型,传:
105
+ - 可选值:`v2`(默认) / `v3-2026`(最新模型)
106
+ - 不传时默认 `v2`
80
107
 
81
108
  ```json
82
109
  {
@@ -84,22 +111,110 @@ DOC2X_API_KEY=sk-xxx npm start
84
111
  }
85
112
  ```
86
113
 
114
+ ### PDF Layout JSON 落盘(`doc2x_materialize_pdf_layout_json`)
115
+
116
+ - 必选参数:`output_path`
117
+ - `uid` 与 `pdf_path` 二选一
118
+ - `v2` 不支持 `layout`;需要 `pages[].layout` 时请使用 `v3-2026`
119
+ - 若传 `pdf_path` 但不传 `model`,该工具默认使用 `v3-2026`
120
+ - 成功时将原始 `result` JSON 写到本地
121
+
122
+ `layout` 是页面块结构和坐标信息,适合 figure/table 裁剪、区域高亮、结构化抽取和版面分析;如果只想看正文内容,优先使用 Markdown / DOCX 导出。
123
+
124
+ ```json
125
+ {
126
+ "pdf_path": "/absolute/path/to/input.pdf",
127
+ "output_path": "/absolute/path/to/input_v3.layout.json"
128
+ }
129
+ ```
130
+
87
131
  ### 导出公式参数(`doc2x_convert_export_submit` / `doc2x_convert_export_wait`)
88
132
 
89
133
  - 必选参数:`formula_mode`(`normal` / `dollar`)
90
- - 可选参数:`formula_level`(`int32`,仅源解析任务为 `model=v3-2026` 时生效,`v2` 下无效)
134
+ - 可选参数:`formula_level`(仅源解析任务为 `model=v3-2026` 时生效)
91
135
  - 取值说明:
92
- - `0`:不退化公式(保留原始 Markdown)
93
- - `1`:行内公式变为普通文本(退化 `\\(...\\)` 和 `$...$`)
94
- - `2`:全部公式变为普通文本(退化 `\\(...\\)`、`$...$`、`\\[...\\]`、`$$...$$`)
136
+ - `0`:保留公式
137
+ - `1`:仅退化行内公式(`\\(...\\)`、`$...$`)
138
+ - `2`:退化全部公式(`\\(...\\)`、`$...$`、`\\[...\\]`、`$$...$$`)
95
139
 
96
- ## 5) 协议
140
+ ## 常见工作流
97
141
 
98
- MIT License,详见 `LICENSE`。
142
+ ### 工作流 1:PDF -> Markdown 本地文件
143
+
144
+ 1. `doc2x_parse_pdf_submit` 提交 PDF 解析。
145
+ 2. `doc2x_convert_export_wait` 等待导出(`to=md`,并指定 `formula_mode`)。
146
+ 3. `doc2x_convert_export_result` 获取下载 URL。
147
+ 4. `doc2x_download_url_to_file` 下载到目标路径。
148
+
149
+ ### 工作流 2:图片版面 OCR 快速结果
150
+
151
+ 1. `doc2x_parse_image_layout_sync` 直接同步解析。
152
+ 2. 若需要稳态轮询,改用 submit/status/wait 组合。
153
+
154
+ ### 工作流 3:PDF -> v3 layout JSON 本地文件
155
+
156
+ 1. 调用 `doc2x_materialize_pdf_layout_json`,传入 `pdf_path` 和 `output_path`。
157
+ 2. 工具会等待 parse 成功,并将原始 `result` JSON 落到本地。
158
+ 3. 该 JSON 可直接提供给后续 figure/table 裁剪脚本使用。
159
+
160
+ ## 本地开发
161
+
162
+ ### 环境要求
163
+
164
+ - Node.js `>=18`
165
+ - pnpm(与仓库锁文件一致)
166
+
167
+ ### 常用命令
168
+
169
+ ```bash
170
+ pnpm install --frozen-lockfile
171
+ pnpm run build
172
+ pnpm run test
173
+ pnpm run format:check
174
+ ```
175
+
176
+ 运行服务:
177
+
178
+ ```bash
179
+ DOC2X_API_KEY=sk-xxx pnpm start
180
+ ```
99
181
 
100
- ## 6) 安装本仓库 Skill(可选)
182
+ ## CI / 发布流水线
101
183
 
102
- 用于给 Codex CLI / Claude Code 增加一个“教大模型如何使用 doc2x-mcp tools 的 Skill”(便于按固定工作流调用 tools、导出与下载、以及排错)。
184
+ 仓库使用 GitHub Actions:
185
+
186
+ - CI:`.github/workflows/ci.yml`
187
+ - 触发:`push` 到 `main`、`pull_request`、`workflow_dispatch`、每周一 UTC `03:17`
188
+ - 文档-only 变更(`**/*.md`、`LICENSE`)自动跳过
189
+ - 任务:
190
+ - `dependency-review`(仅 PR)
191
+ - `build`(Node.js `18/20` 矩阵)
192
+ - `package-smoke`(`npm pack --dry-run`)
193
+ - `security-audit`(仅手动/定时)
194
+
195
+ - Publish:`.github/workflows/publish.yml`
196
+ - `dev` 分支 push:发布 npm `dev` tag
197
+ - `v*.*.*` tag push:发布 npm `latest`
198
+ - 发布前校验 tag 版本与 `package.json` 版本一致
199
+ - 发布命令:`npm publish --provenance`
200
+
201
+ 建议提交前本地对齐:
202
+
203
+ ```bash
204
+ pnpm install --frozen-lockfile
205
+ pnpm run build
206
+ npm pack --dry-run
207
+ pnpm audit --prod --audit-level high
208
+ ```
209
+
210
+ ## 如何发布
211
+
212
+ - 开发预发布(`dev`):push 到 `dev` 分支后自动发布到 npm `dev` tag。版本会自动改成 `x.y.z-dev.<run>.<attempt>`。
213
+ - 正式发布(`latest`):push `v*.*.*` tag 后发布到 npm `latest`。tag 版本必须和 `package.json` 版本一致。
214
+
215
+ ## 安装本仓库 Skill(可选)
216
+
217
+ 用于给 Codex CLI / Claude Code 增加一个“教大模型如何使用 doc2x-mcp tools 的 Skill”。
103
218
 
104
219
  不需要 clone 仓库的一键安装(推荐):
105
220
 
@@ -107,28 +222,35 @@ MIT License,详见 `LICENSE`。
107
222
  curl -fsSL https://raw.githubusercontent.com/NoEdgeAI/doc2x-mcp/main/scripts/install-skill.sh | sh
108
223
  ```
109
224
 
110
- 重复执行同一条命令即可覆盖安装(默认会覆盖已存在目录)。
111
-
112
225
  在本仓库源码目录安装:
113
226
 
114
227
  ```bash
115
- npm run skill:install
228
+ pnpm run skill:install
116
229
  ```
117
230
 
118
- 默认安装到:
119
-
120
- 脚本默认安装到:
231
+ 默认安装目录:
121
232
 
122
- - Codex CLI:`~/.codex/skills/public/doc2x-mcp`(用 `CODEX_HOME` 覆盖)
123
- - Claude Code:`~/.claude/skills/doc2x-mcp`(用 `CLAUDE_HOME` 覆盖)
233
+ - Codex CLI:`~/.codex/skills/public/doc2x-mcp`(可用 `CODEX_HOME` 覆盖)
234
+ - Claude Code:`~/.claude/skills/doc2x-mcp`(可用 `CLAUDE_HOME` 覆盖)
124
235
 
125
236
  说明:
126
237
 
127
- - `--target auto`(默认)会同时安装到 Codex + Claude;如只想装其中一个,用 `--target codex|claude`。
128
- - PowerShell 7+ 一键安装:`irm https://raw.githubusercontent.com/NoEdgeAI/doc2x-mcp/main/scripts/install-skill.ps1 | iex`
129
- - Windows PowerShell 5.1 一键安装:`irm https://raw.githubusercontent.com/NoEdgeAI/doc2x-mcp/main/scripts/install-skill-winps.ps1 | iex`
238
+ - `--target auto`(默认)会同时安装到 Codex + Claude
239
+ - PowerShell 7+:`irm https://raw.githubusercontent.com/NoEdgeAI/doc2x-mcp/main/scripts/install-skill.ps1 | iex`
240
+ - Windows PowerShell 5.1:`irm https://raw.githubusercontent.com/NoEdgeAI/doc2x-mcp/main/scripts/install-skill-winps.ps1 | iex`
130
241
 
131
- 覆盖安装目录示例:
242
+ ## 安全与排错
132
243
 
133
- - mac/linux:`CODEX_HOME=/custom/.codex curl -fsSL https://raw.githubusercontent.com/NoEdgeAI/doc2x-mcp/main/scripts/install-skill.sh | sh -s -- --target codex`
134
- - Windows:`$env:CODEX_HOME="C:\\path\\.codex"; irm https://raw.githubusercontent.com/NoEdgeAI/doc2x-mcp/main/scripts/install-skill.ps1 | iex`
244
+ - 不要在仓库提交真实 `DOC2X_API_KEY`。
245
+ - 白名单默认限制下载域名;如需放开,评估风险后再使用 `DOC2X_DOWNLOAD_URL_ALLOWLIST=*`。
246
+ - 配置异常时优先调用 `doc2x_debug_config` 定位环境变量来源与解析结果。
247
+
248
+ ## 问题反馈
249
+
250
+ - 使用问题或缺陷反馈:GitHub Issues
251
+ [https://github.com/NoEdgeAI/doc2x-mcp/issues](https://github.com/NoEdgeAI/doc2x-mcp/issues)
252
+ - 建议在 issue 中附上最小复现输入、触发的 tool 名称、以及 `doc2x_debug_config` 结果(可脱敏)。
253
+
254
+ ## License
255
+
256
+ MIT License,详见 `LICENSE`。
package/README_EN.md CHANGED
@@ -1,34 +1,52 @@
1
1
  # Doc2x MCP Server
2
2
 
3
+ [![CI](https://github.com/NoEdgeAI/doc2x-mcp/actions/workflows/ci.yml/badge.svg)](https://github.com/NoEdgeAI/doc2x-mcp/actions/workflows/ci.yml)
4
+ [![Publish](https://github.com/NoEdgeAI/doc2x-mcp/actions/workflows/publish.yml/badge.svg)](https://github.com/NoEdgeAI/doc2x-mcp/actions/workflows/publish.yml)
5
+ [![npm version](https://img.shields.io/npm/v/%40noedgeai-org%2Fdoc2x-mcp)](https://www.npmjs.com/package/@noedgeai-org/doc2x-mcp)
6
+
3
7
  English | [简体中文](./README.md)
4
8
 
5
- This project provides a stdio-based MCP Server that wraps Doc2x v2 PDF/image APIs into semantic tools.
9
+ A stdio-based MCP Server that wraps Doc2x v2 PDF/image capabilities into stable, composable semantic tools.
10
+
11
+ ## Table of Contents
12
+
13
+ - [Project Scope](#project-scope)
14
+ - [Runtime Quick Facts](#runtime-quick-facts)
15
+ - [Quick Start](#quick-start)
16
+ - [Configuration Reference](#configuration-reference)
17
+ - [Tool API Overview](#tool-api-overview)
18
+ - [Common Workflows](#common-workflows)
19
+ - [Local Development](#local-development)
20
+ - [CI / Release Pipelines](#ci--release-pipelines)
21
+ - [Publishing Flow](#publishing-flow)
22
+ - [Install Repo Skill (Optional)](#install-repo-skill-optional)
23
+ - [Security and Troubleshooting](#security-and-troubleshooting)
24
+ - [Getting Help](#getting-help)
25
+ - [License](#license)
6
26
 
7
- ## 1) Requirements
27
+ ## Project Scope
8
28
 
9
- - Node.js >= 18
29
+ - Exposes Doc2x capabilities to MCP clients (Codex CLI / Claude Code / custom agents).
30
+ - Uses a unified async contract (`submit/status/wait`) for predictable automation.
31
+ - Provides runtime safety boundaries (timeouts, polling controls, download allowlist).
10
32
 
11
- ## 2) Configuration
33
+ ## Runtime Quick Facts
12
34
 
13
- Configure via environment variables:
35
+ - Local run: Node.js `>=18` is enough.
36
+ - CI checks: builds run on Node.js `18` and `20`.
37
+ - Release environment: publish jobs run on Node.js `24` in GitHub Actions.
38
+ - Package manager: pnpm with lockfile `pnpm-lock.yaml`.
14
39
 
15
- - `DOC2X_API_KEY`: required (e.g. `sk-xxx`)
16
- - `DOC2X_BASE_URL`: optional, default `https://v2.doc2x.noedgeai.com`
17
- - `DOC2X_HTTP_TIMEOUT_MS`: optional, default `60000`
18
- - `DOC2X_POLL_INTERVAL_MS`: optional, default `2000`
19
- - `DOC2X_MAX_WAIT_MS`: optional, default `600000`
20
- - `DOC2X_PARSE_PDF_MAX_OUTPUT_CHARS`: optional, default `5000`; limit the returned text size of `doc2x_parse_pdf_wait_text` (set `0` for unlimited)
21
- - `DOC2X_PARSE_PDF_MAX_OUTPUT_PAGES`: optional, default `10`; limit merged pages of `doc2x_parse_pdf_wait_text` (set `0` for unlimited)
22
- - `DOC2X_DOWNLOAD_URL_ALLOWLIST`: optional, default `".amazonaws.com.cn,.aliyuncs.com,.noedgeai.com"`; set to `*` to allow any host (not recommended)
40
+ ## Quick Start
23
41
 
24
- ## 3) Run
42
+ ### Option A: via npx (recommended)
25
43
 
26
- ### Option A: via npx
44
+ Add this in your MCP client config:
27
45
 
28
46
  ```json
29
47
  {
30
48
  "command": "npx",
31
- "args": ["-y", "@noedgeai-org/doc2x-mcp"],
49
+ "args": ["-y", "@noedgeai-org/doc2x-mcp@latest"],
32
50
  "env": {
33
51
  "DOC2X_API_KEY": "sk-xxx",
34
52
  "DOC2X_BASE_URL": "https://v2.doc2x.noedgeai.com"
@@ -40,11 +58,13 @@ Configure via environment variables:
40
58
 
41
59
  ```bash
42
60
  cd doc2x-mcp
43
- npm install
44
- npm run build
45
- DOC2X_API_KEY=sk-xxx npm start
61
+ pnpm install --frozen-lockfile
62
+ pnpm run build
63
+ DOC2X_API_KEY=sk-xxx pnpm start
46
64
  ```
47
65
 
66
+ Point your MCP client at the local build output:
67
+
48
68
  ```json
49
69
  {
50
70
  "command": "node",
@@ -56,27 +76,34 @@ DOC2X_API_KEY=sk-xxx npm start
56
76
  }
57
77
  ```
58
78
 
59
- ## 4) Tools
60
-
61
- - `doc2x_parse_pdf_submit`
62
- - `doc2x_parse_pdf_status`
63
- - `doc2x_parse_pdf_wait_text`
64
- - `doc2x_convert_export_submit`
65
- - `doc2x_convert_export_result`
66
- - `doc2x_convert_export_wait`
67
- - `doc2x_download_url_to_file`
68
- - `doc2x_parse_image_layout_sync`
69
- - `doc2x_parse_image_layout_submit`
70
- - `doc2x_parse_image_layout_status`
71
- - `doc2x_parse_image_layout_wait_text`
72
- - `doc2x_materialize_convert_zip`
73
- - `doc2x_debug_config`
79
+ ## Configuration Reference
80
+
81
+ | Environment Variable | Required | Default | Description |
82
+ | --- | --- | --- | --- |
83
+ | `DOC2X_API_KEY` | Yes | - | Doc2x API key (`sk-xxx`) |
84
+ | `DOC2X_BASE_URL` | No | `https://v2.doc2x.noedgeai.com` | Doc2x API base URL |
85
+ | `DOC2X_HTTP_TIMEOUT_MS` | No | `60000` | Per-request HTTP timeout in ms |
86
+ | `DOC2X_POLL_INTERVAL_MS` | No | `2000` | Polling interval in ms |
87
+ | `DOC2X_MAX_WAIT_MS` | No | `600000` | Max wait duration for wait tools in ms |
88
+ | `DOC2X_PARSE_PDF_MAX_OUTPUT_CHARS` | No | `5000` | Max returned chars for `doc2x_parse_pdf_wait_text`; `0` = unlimited |
89
+ | `DOC2X_PARSE_PDF_MAX_OUTPUT_PAGES` | No | `10` | Max merged pages for `doc2x_parse_pdf_wait_text`; `0` = unlimited |
90
+ | `DOC2X_DOWNLOAD_URL_ALLOWLIST` | No | `.amazonaws.com.cn,.aliyuncs.com,.noedgeai.com` | URL host allowlist for downloads; `*` allows any host (not recommended) |
91
+
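The defaults in the table above can be read as a small resolution step at startup. The following is a minimal sketch, not the package's actual config loader: `resolveConfig`, `intFromEnv`, and the `Doc2xConfig` shape are hypothetical names introduced for illustration, and only the variable names and default values come from the table.

```typescript
// Hypothetical config shape; field names are illustrative, not the package's own.
type Doc2xConfig = {
  apiKey: string;
  baseUrl: string;
  httpTimeoutMs: number;
  pollIntervalMs: number;
  maxWaitMs: number;
  downloadUrlAllowlist: string[];
};

// Parse a non-negative integer; fall back when the variable is unset,
// blank, not an integer, or negative.
function intFromEnv(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  return Number.isInteger(n) && n >= 0 ? n : fallback;
}

function resolveConfig(env: Record<string, string | undefined>): Doc2xConfig {
  const apiKey = (env["DOC2X_API_KEY"] ?? "").trim();
  if (!apiKey) throw new Error("DOC2X_API_KEY is required");

  // Defaults below are copied from the configuration table.
  const allowlist =
    env["DOC2X_DOWNLOAD_URL_ALLOWLIST"] ?? ".amazonaws.com.cn,.aliyuncs.com,.noedgeai.com";

  return {
    apiKey,
    baseUrl: env["DOC2X_BASE_URL"] ?? "https://v2.doc2x.noedgeai.com",
    httpTimeoutMs: intFromEnv(env["DOC2X_HTTP_TIMEOUT_MS"], 60000),
    pollIntervalMs: intFromEnv(env["DOC2X_POLL_INTERVAL_MS"], 2000),
    maxWaitMs: intFromEnv(env["DOC2X_MAX_WAIT_MS"], 600000),
    downloadUrlAllowlist: allowlist
      .split(",")
      .map((s) => s.trim())
      .filter((s) => s.length > 0),
  };
}
```

In a Node.js process this would typically be called as `resolveConfig(process.env)`. When the values look wrong at runtime, `doc2x_debug_config` remains the authoritative way to see what the server actually resolved.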
92
+ ## Tool API Overview
93
+
94
+ | Stage | Tools | Purpose |
95
+ | --- | --- | --- |
96
+ | PDF parse | `doc2x_parse_pdf_submit` / `doc2x_parse_pdf_status` / `doc2x_parse_pdf_wait_text` / `doc2x_materialize_pdf_layout_json` | Submit parse tasks, check status, wait and fetch text, or materialize v3 layout JSON locally |
97
+ | Export | `doc2x_convert_export_submit` / `doc2x_convert_export_result` / `doc2x_convert_export_wait` | Start export, read export result, wait for completion |
98
+ | Download | `doc2x_download_url_to_file` / `doc2x_materialize_convert_zip` | Download export URL to local path, materialize convert zip |
99
+ | Image layout parse | `doc2x_parse_image_layout_sync` / `doc2x_parse_image_layout_submit` / `doc2x_parse_image_layout_status` / `doc2x_parse_image_layout_wait_text` | Sync/async OCR and layout parse for images |
100
+ | Diagnostics | `doc2x_debug_config` | Show resolved config and API key source |
74
101
 
75
102
  ### PDF Parse Model (`doc2x_parse_pdf_submit` / `doc2x_parse_pdf_wait_text`)
76
103
 
77
104
  - Optional parameter: `model`
78
- - Supported values: only `v3-2026` (latest model)
79
- - Notes: omit `model` to use default `v2`; to try the latest model, pass:
105
+ - Supported values: `v2` (default) / `v3-2026` (latest model)
106
+ - Default (when omitted): `v2`
80
107
 
81
108
  ```json
82
109
  {
@@ -84,22 +111,110 @@ DOC2X_API_KEY=sk-xxx npm start
84
111
  }
85
112
  ```
86
113
 
114
+ ### PDF Layout JSON Materialization (`doc2x_materialize_pdf_layout_json`)
115
+
116
+ - Required: `output_path`
117
+ - Provide either `uid` or `pdf_path`
118
+ - `v2` does not support `layout`; use `v3-2026` when `pages[].layout` is required
119
+ - When `pdf_path` is used and `model` is omitted, this tool defaults to `v3-2026`
120
+ - On success it writes the raw parse `result` JSON locally
121
+
122
+ `layout` contains page block structure and coordinates, which is useful for figure/table crops, region highlighting, structured extraction, and layout analysis. If the goal is readable full text, prefer Markdown / DOCX export.
123
+
124
+ ```json
125
+ {
126
+ "pdf_path": "/absolute/path/to/input.pdf",
127
+ "output_path": "/absolute/path/to/input_v3.layout.json"
128
+ }
129
+ ```
130
+
87
131
  ### Export Formula Parameters (`doc2x_convert_export_submit` / `doc2x_convert_export_wait`)
88
132
 
89
- - Required parameter: `formula_mode` (`normal` / `dollar`)
90
- - Optional parameter: `formula_level` (`int32`, effective only when the source parse task uses `model=v3-2026`; ignored by `v2`)
133
+ - Required: `formula_mode` (`normal` / `dollar`)
134
+ - Optional: `formula_level` (effective only when source parse used `model=v3-2026`)
91
135
  - Value mapping:
92
- - `0`: keep formulas as-is (preserve original Markdown)
93
- - `1`: degrade inline formulas to plain text (`\\(...\\)` and `$...$`)
94
- - `2`: degrade all formulas to plain text (`\\(...\\)`, `$...$`, `\\[...\\]`, `$$...$$`)
136
+ - `0`: keep formulas
137
+ - `1`: degrade inline formulas (`\\(...\\)`, `$...$`)
138
+ - `2`: degrade all formulas (`\\(...\\)`, `$...$`, `\\[...\\]`, `$$...$$`)
95
139
 
96
- ## 5) License
140
+ ## Common Workflows
97
141
 
98
- MIT License. See `LICENSE`.
142
+ ### Workflow 1: PDF -> Markdown local file
143
+
144
+ 1. Submit the PDF for parsing via `doc2x_parse_pdf_submit`.
145
+ 2. Wait for the export via `doc2x_convert_export_wait` (pass `to=md` and specify `formula_mode`).
146
+ 3. Read the download URL via `doc2x_convert_export_result`.
147
+ 4. Download the file to the target path via `doc2x_download_url_to_file`.
148
+
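The wait-style tools in this workflow follow a poll-until-done pattern bounded by `DOC2X_POLL_INTERVAL_MS` and `DOC2X_MAX_WAIT_MS`. The sketch below shows that pattern in isolation. It is an illustration under assumptions, not the server's implementation: `waitForSuccess` and `TaskStatus` are hypothetical names, and the status strings `"processing"`, `"success"`, and `"failed"` are assumed values.

```typescript
// Assumed status shape; the real API's status strings may differ.
type TaskStatus = {
  status: "processing" | "success" | "failed";
  detail?: string;
};

// Poll fetchStatus until it reports success, fails, or the deadline passes.
async function waitForSuccess<T extends TaskStatus>(
  fetchStatus: () => Promise<T>,
  opts: { pollIntervalMs: number; maxWaitMs: number },
): Promise<T> {
  const deadline = Date.now() + opts.maxWaitMs;
  for (;;) {
    const st = await fetchStatus();
    if (st.status === "success") return st;
    if (st.status === "failed") {
      throw new Error(`task failed: ${st.detail ?? "unknown"}`);
    }
    if (Date.now() >= deadline) {
      throw new Error("timed out waiting for task");
    }
    // Sleep before the next status check.
    await new Promise<void>((resolve) => setTimeout(resolve, opts.pollIntervalMs));
  }
}
```

A failed task is surfaced immediately rather than retried, which keeps the loop's only retry condition "still processing".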
149
+ ### Workflow 2: Fast image OCR/layout result
150
+
151
+ 1. Use `doc2x_parse_image_layout_sync` for a direct, synchronous parse.
152
+ 2. For robust polling behavior, switch to the submit/status/wait flow.
153
+
154
+ ### Workflow 3: PDF -> local v3 layout JSON
155
+
156
+ 1. Call `doc2x_materialize_pdf_layout_json` with `pdf_path` and `output_path`.
157
+ 2. The tool waits for parse success and writes the raw `result` JSON locally.
158
+ 3. The saved JSON can be consumed directly by downstream figure/table crop scripts.
159
+
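A downstream script may want to re-check the saved file before cropping figures or tables. The sketch below applies the same structural checks this release's `validatePdfLayoutResult` performs (a JSON object, a non-empty `pages` array, and an object-valued `layout` on every page). `countLayoutPages` is a hypothetical name, and plain `Error` stands in for the package's `ToolError`.

```typescript
// True for plain JSON objects; arrays and null are rejected.
function isRecord(value: unknown): value is Record<string, unknown> {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Hypothetical consumer-side check: returns the page count when every page
// carries a layout object, otherwise throws with the offending index.
function countLayoutPages(result: unknown): number {
  if (!isRecord(result)) {
    throw new Error("parse result must be a JSON object");
  }
  const pages = result["pages"];
  if (!Array.isArray(pages) || pages.length === 0) {
    throw new Error("parse result must contain a non-empty pages array");
  }
  pages.forEach((page: unknown, i: number) => {
    if (!isRecord(page)) {
      throw new Error(`pages[${i}] must be an object`);
    }
    if (!isRecord(page["layout"])) {
      throw new Error(`pages[${i}].layout must be an object`);
    }
  });
  return pages.length;
}
```

In practice the input would come from `JSON.parse` over the contents of the file written to `output_path`. A `v2` result fails the `layout` check, which matches the note that `v2` does not support `layout`.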
160
+ ## Local Development
161
+
162
+ ### Requirements
163
+
164
+ - Node.js `>=18`
165
+ - pnpm (aligned with lockfile)
166
+
167
+ ### Common commands
168
+
169
+ ```bash
170
+ pnpm install --frozen-lockfile
171
+ pnpm run build
172
+ pnpm run test
173
+ pnpm run format:check
174
+ ```
175
+
176
+ Run server:
177
+
178
+ ```bash
179
+ DOC2X_API_KEY=sk-xxx pnpm start
180
+ ```
99
181
 
100
- ## 6) Install Repo Skill (Optional)
182
+ ## CI / Release Pipelines
101
183
 
102
- Installs a tool-use skill for Codex CLI / Claude Code (teaches the LLM how to use doc2x-mcp tools with a standard workflow: submit/status/wait/export/download).
184
+ This repository uses GitHub Actions:
185
+
186
+ - CI: `.github/workflows/ci.yml`
187
+ - Triggers: `push` to `main`, `pull_request`, `workflow_dispatch`, weekly on Monday at UTC `03:17`
188
+ - Doc-only changes (`**/*.md`, `LICENSE`) are skipped
189
+ - Jobs:
190
+ - `dependency-review` (PR only)
191
+ - `build` (Node.js `18/20` matrix)
192
+ - `package-smoke` (`npm pack --dry-run`)
193
+ - `security-audit` (manual/scheduled only)
194
+
195
+ - Publish: `.github/workflows/publish.yml`
196
+ - Push to `dev`: publish npm package with `dev` tag
197
+ - Push tag `v*.*.*`: publish npm package as `latest`
198
+ - Verifies tag version matches `package.json` version before publish
199
+ - Publish command: `npm publish --provenance`
200
+
201
+ Recommended local parity checks before pushing:
202
+
203
+ ```bash
204
+ pnpm install --frozen-lockfile
205
+ pnpm run build
206
+ npm pack --dry-run
207
+ pnpm audit --prod --audit-level high
208
+ ```
209
+
210
+ ## Publishing Flow
211
+
212
+ - Dev pre-release (`dev`): pushing to the `dev` branch publishes to the npm `dev` tag. The version is automatically rewritten to `x.y.z-dev.<run>.<attempt>`.
213
+ - Production release (`latest`): pushing a `v*.*.*` tag publishes to npm `latest`. The tag version must match the version in `package.json`.
214
+
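The two rules above can be sketched as pure functions. This is an illustration only: `devVersion` and `tagMatchesPackage` are hypothetical names, and it is an assumption that `<run>` and `<attempt>` correspond to the workflow's run number and run attempt. The actual logic lives in `.github/workflows/publish.yml`, which is not shown here.

```typescript
// Build a dev pre-release version of the form x.y.z-dev.<run>.<attempt>.
// Any existing pre-release suffix on the base version is dropped.
function devVersion(baseVersion: string, run: number, attempt: number): string {
  const match = /^(\d+\.\d+\.\d+)/.exec(baseVersion);
  if (!match) {
    throw new Error(`not a semver-like version: ${baseVersion}`);
  }
  return `${match[1]}-dev.${run}.${attempt}`;
}

// A release tag is accepted only when it equals "v" + the package.json version.
function tagMatchesPackage(tag: string, packageVersion: string): boolean {
  return tag === `v${packageVersion}`;
}
```

For example, `devVersion("0.1.4", 8, 1)` yields `0.1.4-dev.8.1`, the same shape as the dev versions compared in this diff.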
215
+ ## Install Repo Skill (Optional)
216
+
217
+ Installs a reusable skill for Codex CLI / Claude Code to guide tool usage with the standard `submit/status/wait/export/download` workflow.
103
218
 
104
219
  One-command install without cloning (recommended):
105
220
 
@@ -107,28 +222,35 @@ One-command install without cloning (recommended):
107
222
  curl -fsSL https://raw.githubusercontent.com/NoEdgeAI/doc2x-mcp/main/scripts/install-skill.sh | sh
108
223
  ```
109
224
 
110
- Re-run the same command to overwrite (default behavior overwrites an existing destination directory).
111
-
112
225
  Install from this repo source directory:
113
226
 
114
227
  ```bash
115
- npm run skill:install
228
+ pnpm run skill:install
116
229
  ```
117
230
 
118
- Default destination:
119
-
120
- The script installs to:
231
+ Default destinations:
121
232
 
122
- - Codex CLI: `~/.codex/skills/public/doc2x-mcp` (override via `CODEX_HOME`)
123
- - Claude Code: `~/.claude/skills/doc2x-mcp` (override via `CLAUDE_HOME`)
233
+ - Codex CLI: `~/.codex/skills/public/doc2x-mcp` (override with `CODEX_HOME`)
234
+ - Claude Code: `~/.claude/skills/doc2x-mcp` (override with `CLAUDE_HOME`)
124
235
 
125
236
  Notes:
126
237
 
127
- - `--target auto` (default) installs to both Codex + Claude; use `--target codex|claude` to install only one.
128
- - PowerShell 7+ one-command install: `irm https://raw.githubusercontent.com/NoEdgeAI/doc2x-mcp/main/scripts/install-skill.ps1 | iex`
129
- - Windows PowerShell 5.1 one-command install: `irm https://raw.githubusercontent.com/NoEdgeAI/doc2x-mcp/main/scripts/install-skill-winps.ps1 | iex`
238
+ - `--target auto` (default) installs to both Codex + Claude.
239
+ - PowerShell 7+: `irm https://raw.githubusercontent.com/NoEdgeAI/doc2x-mcp/main/scripts/install-skill.ps1 | iex`
240
+ - Windows PowerShell 5.1: `irm https://raw.githubusercontent.com/NoEdgeAI/doc2x-mcp/main/scripts/install-skill-winps.ps1 | iex`
130
241
 
131
- Override install dir examples:
242
+ ## Security and Troubleshooting
132
243
 
133
- - mac/linux: `CODEX_HOME=/custom/.codex curl -fsSL https://raw.githubusercontent.com/NoEdgeAI/doc2x-mcp/main/scripts/install-skill.sh | sh -s -- --target codex`
134
- - Windows: `$env:CODEX_HOME="C:\\path\\.codex"; irm https://raw.githubusercontent.com/NoEdgeAI/doc2x-mcp/main/scripts/install-skill.ps1 | iex`
244
+ - Never commit real `DOC2X_API_KEY` to the repository.
245
+ - The download allowlist is restrictive by default; evaluate risk before using `DOC2X_DOWNLOAD_URL_ALLOWLIST=*`.
246
+ - Use `doc2x_debug_config` first when diagnosing config/environment issues.
247
+
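To make the allowlist risk concrete, here is one plausible way a host allowlist of this shape could be evaluated. The matching rule is an assumption, not the server's confirmed behavior: entries starting with `.` are treated as domain suffixes, other entries as exact hosts, and `*` as allow-all. `isHostAllowed` is a hypothetical name.

```typescript
// Hypothetical allowlist check over a download URL's hostname.
// Throws (via the URL constructor) when urlString is not a valid URL.
function isHostAllowed(urlString: string, allowlist: string[]): boolean {
  if (allowlist.includes("*")) return true; // allow-all: use with care
  const host = new URL(urlString).hostname.toLowerCase();
  return allowlist.some((entry) => {
    const e = entry.trim().toLowerCase();
    if (e.length === 0) return false;
    if (e.startsWith(".")) {
      // ".example.com" matches "a.example.com" and, by assumption, "example.com".
      return host.endsWith(e) || host === e.slice(1);
    }
    return host === e;
  });
}
```

Keeping the leading dot in the comparison prevents a look-alike host such as `evilaliyuncs.com` from matching `.aliyuncs.com`. With `*`, every host passes, which is why that setting is discouraged.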
248
+ ## Getting Help
249
+
250
+ - Usage questions or bug reports: GitHub Issues
251
+ [https://github.com/NoEdgeAI/doc2x-mcp/issues](https://github.com/NoEdgeAI/doc2x-mcp/issues)
252
+ - Include a minimal reproduction input, the affected tool name, and sanitized `doc2x_debug_config` output when possible.
253
+
254
+ ## License
255
+
256
+ MIT License. See `LICENSE`.
@@ -1,3 +1,8 @@
1
+ export declare function validatePdfLayoutResult(result: unknown, uid?: string): {
2
+ result: Record<string, unknown>;
3
+ pageCount: number;
4
+ hasLayout: true;
5
+ };
1
6
  export declare function materializeConvertZip(args: {
2
7
  convert_zip_base64: string;
3
8
  output_dir: string;
@@ -6,3 +11,13 @@ export declare function materializeConvertZip(args: {
6
11
  zip_path: string;
7
12
  extracted: boolean;
8
13
  }>;
14
+ export declare function materializePdfLayoutJson(args: {
15
+ result: unknown;
16
+ output_path: string;
17
+ uid?: string;
18
+ }): Promise<{
19
+ uid: string;
20
+ output_path: string;
21
+ page_count: number;
22
+ has_layout: true;
23
+ }>;
@@ -1,6 +1,8 @@
1
1
  import fsp from 'node:fs/promises';
2
2
  import path from 'node:path';
3
3
  import { spawn } from 'node:child_process';
4
+ import { ToolError } from '#errors';
5
+ import { TOOL_ERROR_CODE_INVALID_JSON } from '#errorCodes';
4
6
  function spawnUnzip(zipPath, outputDir) {
5
7
  return new Promise((resolve) => {
6
8
  const child = spawn('unzip', ['-o', zipPath, '-d', outputDir], { stdio: 'ignore' });
@@ -8,6 +10,44 @@ function spawnUnzip(zipPath, outputDir) {
8
10
  child.on('exit', (code) => resolve(code === 0));
9
11
  });
10
12
  }
13
+ function isRecord(value) {
14
+ return value !== null && typeof value === 'object' && !Array.isArray(value);
15
+ }
16
+ export function validatePdfLayoutResult(result, uid) {
17
+ if (!isRecord(result))
18
+ throw new ToolError({
19
+ code: TOOL_ERROR_CODE_INVALID_JSON,
20
+ message: 'parse result must be a JSON object',
21
+ retryable: false,
22
+ uid,
23
+ });
24
+ const pages = result.pages;
25
+ if (!Array.isArray(pages) || pages.length === 0)
26
+ throw new ToolError({
27
+ code: TOOL_ERROR_CODE_INVALID_JSON,
28
+ message: 'parse result must contain a non-empty pages array',
29
+ retryable: false,
30
+ uid,
31
+ });
32
+ for (let i = 0; i < pages.length; i++) {
33
+ const page = pages[i];
34
+ if (!isRecord(page))
35
+ throw new ToolError({
36
+ code: TOOL_ERROR_CODE_INVALID_JSON,
37
+ message: `pages[${i}] must be an object`,
38
+ retryable: false,
39
+ uid,
40
+ });
41
+ if (!isRecord(page.layout))
42
+ throw new ToolError({
43
+ code: TOOL_ERROR_CODE_INVALID_JSON,
44
+ message: `pages[${i}].layout must be an object`,
45
+ retryable: false,
46
+ uid,
47
+ });
48
+ }
49
+ return { result, pageCount: pages.length, hasLayout: true };
50
+ }
11
51
  export async function materializeConvertZip(args) {
12
52
  const outDir = path.resolve(args.output_dir);
13
53
  await fsp.mkdir(outDir, { recursive: true });
@@ -17,3 +57,15 @@ export async function materializeConvertZip(args) {
17
57
  const extracted = await spawnUnzip(zipPath, outDir);
18
58
  return { output_dir: outDir, zip_path: zipPath, extracted };
19
59
  }
60
+ export async function materializePdfLayoutJson(args) {
61
+ const validated = validatePdfLayoutResult(args.result, args.uid);
62
+ const outputPath = path.resolve(args.output_path);
63
+ await fsp.mkdir(path.dirname(outputPath), { recursive: true });
64
+ await fsp.writeFile(outputPath, `${JSON.stringify(validated.result, null, 2)}\n`, 'utf8');
65
+ return {
66
+ uid: args.uid ?? '',
67
+ output_path: outputPath,
68
+ page_count: validated.pageCount,
69
+ has_layout: validated.hasLayout,
70
+ };
71
+ }
@@ -1,4 +1,6 @@
1
- export declare const PARSE_PDF_MODELS: readonly ["v3-2026"];
1
+ export declare const PARSE_PDF_MODEL_V2: "v2";
2
+ export declare const PARSE_PDF_MODEL_V3: "v3-2026";
3
+ export declare const PARSE_PDF_MODELS: readonly ["v2", "v3-2026"];
2
4
  export type ParsePdfModel = (typeof PARSE_PDF_MODELS)[number];
3
5
  export declare function parsePdfSubmit(pdfPath: string, opts?: {
4
6
  model?: ParsePdfModel;
@@ -12,6 +14,15 @@ export declare function parsePdfStatus(uid: string): Promise<{
12
14
  detail: string;
13
15
  result: {} | null;
14
16
  }>;
17
+ export declare function parsePdfWaitResultByUid(args: {
18
+ uid: string;
19
+ poll_interval_ms?: number;
20
+ max_wait_ms?: number;
21
+ }): Promise<{
22
+ uid: string;
23
+ status: "success";
24
+ result: {} | null;
25
+ }>;
15
26
  export declare function parsePdfWaitTextByUid(args: {
16
27
  uid: string;
17
28
  poll_interval_ms?: number;
package/dist/doc2x/pdf.js CHANGED
@@ -9,7 +9,9 @@ import { doc2xRequestJson, putToSignedUrl } from '#doc2x/client';
9
9
  import { DOC2X_TASK_STATUS_FAILED, DOC2X_TASK_STATUS_SUCCESS } from '#doc2x/constants';
10
10
  import { HTTP_METHOD_GET, HTTP_METHOD_POST } from '#doc2x/http';
11
11
  import { v2 } from '#doc2x/paths';
12
- export const PARSE_PDF_MODELS = ['v3-2026'];
12
+ export const PARSE_PDF_MODEL_V2 = 'v2';
13
+ export const PARSE_PDF_MODEL_V3 = 'v3-2026';
14
+ export const PARSE_PDF_MODELS = [PARSE_PDF_MODEL_V2, PARSE_PDF_MODEL_V3];
13
15
  function mergePagesToTextWithLimit(result, joinWith, limits) {
14
16
  const parsed = result ?? null;
15
17
  const sourcePages = _.isArray(parsed?.pages) ? parsed.pages : [];
@@ -112,10 +114,9 @@ export async function parsePdfStatus(uid) {
112
114
  result: data.result ?? null,
113
115
  };
114
116
  }
115
- export async function parsePdfWaitTextByUid(args) {
117
+ async function waitForParsePdfSuccessByUid(args) {
116
118
  const pollInterval = args.poll_interval_ms ?? CONFIG.pollIntervalMs;
117
119
  const maxWait = args.max_wait_ms ?? CONFIG.maxWaitMs;
118
- const joinWith = args.join_with ?? '\n\n---\n\n';
119
120
  const uid = String(args.uid || '').trim();
120
121
  if (!uid)
121
122
  throw new ToolError({
@@ -145,13 +146,8 @@ export async function parsePdfWaitTextByUid(args) {
145
146
  }
146
147
  throw e;
147
148
  }
148
- if (st.status === DOC2X_TASK_STATUS_SUCCESS) {
149
- const merged = mergePagesToTextWithLimit(st.result, joinWith, {
150
- maxOutputChars: args.max_output_chars,
151
- maxOutputPages: args.max_output_pages,
152
- });
153
- return { uid, status: DOC2X_TASK_STATUS_SUCCESS, ...merged };
154
- }
149
+ if (st.status === DOC2X_TASK_STATUS_SUCCESS)
150
+ return st;
155
151
  if (st.status === DOC2X_TASK_STATUS_FAILED)
156
152
  throw new ToolError({
157
153
  code: TOOL_ERROR_CODE_PARSE_FAILED,
@@ -162,3 +158,16 @@ export async function parsePdfWaitTextByUid(args) {
162
158
  await sleep(pollInterval);
163
159
  }
164
160
  }
161
+ export async function parsePdfWaitResultByUid(args) {
162
+ const st = await waitForParsePdfSuccessByUid(args);
163
+ return { uid: st.uid, status: DOC2X_TASK_STATUS_SUCCESS, result: st.result };
164
+ }
165
+ export async function parsePdfWaitTextByUid(args) {
166
+ const joinWith = args.join_with ?? '\n\n---\n\n';
167
+ const st = await waitForParsePdfSuccessByUid(args);
168
+ const merged = mergePagesToTextWithLimit(st.result, joinWith, {
169
+ maxOutputChars: args.max_output_chars,
170
+ maxOutputPages: args.max_output_pages,
171
+ });
172
+ return { uid: st.uid, status: DOC2X_TASK_STATUS_SUCCESS, ...merged };
173
+ }
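The hunk above separates the polling loop from result shaping: `waitForParsePdfSuccessByUid` resolves with the status object, and the two exported wrappers decide what to return. The sketch below illustrates that shape only. `waitForSuccess`, `fetchStatus`, `makeFakeStatus`, the status strings, and the per-page `md` field are hypothetical stand-ins, and the output limits handled by `mergePagesToTextWithLimit` are omitted.

```javascript
// Hypothetical sketch: one polling loop shared by two thin wrappers.
// `fetchStatus` stands in for parsePdfStatus; status strings are assumed.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function waitForSuccess(fetchStatus, uid, { pollIntervalMs = 10, maxWaitMs = 1000 } = {}) {
  const deadline = Date.now() + maxWaitMs;
  for (;;) {
    const st = await fetchStatus(uid);
    if (st.status === 'success') return st;
    if (st.status === 'failed') throw new Error(`parse failed: ${st.detail ?? 'unknown'}`);
    if (Date.now() >= deadline) throw new Error(`timed out waiting for ${uid}`);
    await sleep(pollIntervalMs);
  }
}

// Wrapper 1: return the raw result untouched.
async function waitResult(fetchStatus, uid, opts) {
  const st = await waitForSuccess(fetchStatus, uid, opts);
  return { uid: st.uid, status: 'success', result: st.result };
}

// Wrapper 2: join page text with a separator (the `md` field is an assumption).
async function waitText(fetchStatus, uid, opts, joinWith = '\n\n---\n\n') {
  const st = await waitForSuccess(fetchStatus, uid, opts);
  const pages = Array.isArray(st.result?.pages) ? st.result.pages : [];
  return { uid: st.uid, status: 'success', text: pages.map((p) => p.md ?? '').join(joinWith) };
}

// Fake backend for illustration: reports "processing" twice, then success.
function makeFakeStatus() {
  let calls = 0;
  return async (uid) => {
    calls += 1;
    if (calls < 3) return { uid, status: 'processing', result: null };
    return { uid, status: 'success', result: { pages: [{ md: 'page one' }, { md: 'page two' }] } };
  };
}
```

Keeping the loop private and exporting only the wrappers means a new output shape (such as the raw layout JSON) needs no second copy of the timeout and failure handling.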
@@ -1,14 +1,15 @@
1
1
  import { CONFIG } from '#config';
2
2
  import { isRetryableError } from '#errors';
3
- import { parsePdfStatus, parsePdfSubmit, parsePdfWaitTextByUid, } from '#doc2x/pdf';
3
+ import { PARSE_PDF_MODEL_V2, PARSE_PDF_MODEL_V3, parsePdfStatus, parsePdfSubmit, parsePdfWaitResultByUid, parsePdfWaitTextByUid, } from '#doc2x/pdf';
4
+ import { materializePdfLayoutJson } from '#doc2x/materialize';
4
5
  import { asJsonResult, asTextResult } from '#mcp/results';
5
- import { deleteUidCache, fileSig, getSubmittedUidFromCache, joinWithSchema, makePdfUidCacheKey, missingEitherFieldError, nonNegativeIntSchema, parsePdfModelSchema, parsePdfUidSchema, pdfPathForWaitSchema, pdfPathSchema, positiveIntMsSchema, setFailedUidCache, setSubmittedUidCache, withToolErrorHandling, } from '#mcp/registerToolsShared';
6
+ import { deleteUidCache, fileSig, getSubmittedUidFromCache, jsonOutputPathSchema, joinWithSchema, makePdfUidCacheKey, missingEitherFieldError, nonNegativeIntSchema, parsePdfModelSchema, parsePdfUidSchema, pdfPathForWaitSchema, pdfPathSchema, positiveIntMsSchema, setFailedUidCache, setSubmittedUidCache, withToolErrorHandling, } from '#mcp/registerToolsShared';
6
7
  export function registerPdfTools(server, ctx) {
7
8
  server.registerTool('doc2x_parse_pdf_submit', {
8
9
  description: 'Create a Doc2x PDF parse task for a local file and return {uid}. Prefer calling doc2x_parse_pdf_status to monitor progress/result; only call doc2x_parse_pdf_wait_text if the user explicitly asks to wait/return merged text.',
9
10
  inputSchema: {
10
11
  pdf_path: pdfPathSchema,
11
- model: parsePdfModelSchema.describe("Optional parse model. Use 'v3-2026' to try the latest model. Omit this field to use default v2."),
12
+ model: parsePdfModelSchema.describe(`Optional parse model. Supported values: '${PARSE_PDF_MODEL_V2}' and '${PARSE_PDF_MODEL_V3}'. Omit this field to use default ${PARSE_PDF_MODEL_V2}.`),
12
13
  },
13
14
  }, withToolErrorHandling(async ({ pdf_path, model }) => {
14
15
  const sig = await fileSig(pdf_path);
@@ -45,7 +46,7 @@ export function registerPdfTools(server, ctx) {
45
46
  .optional()
46
47
  .describe('Max pages to merge into returned text (0 = unlimited). Default can be set via env DOC2X_PARSE_PDF_MAX_OUTPUT_PAGES.'),
47
48
  model: parsePdfModelSchema
48
- .describe("Optional parse model used only when submitting from pdf_path. Use 'v3-2026' to try latest model. Omit this field to use default v2."),
49
+ .describe(`Optional parse model used only when submitting from pdf_path. Supported values: '${PARSE_PDF_MODEL_V2}' and '${PARSE_PDF_MODEL_V3}'. Omit this field to use default ${PARSE_PDF_MODEL_V2}.`),
49
50
  },
50
51
  }, withToolErrorHandling(async (args) => {
51
52
  const maxOutputChars = args.max_output_chars ?? CONFIG.parsePdfMaxOutputChars;
@@ -120,4 +121,65 @@ export function registerPdfTools(server, ctx) {
120
121
  }
121
122
  }
122
123
  }));
124
+ server.registerTool('doc2x_materialize_pdf_layout_json', {
125
+ description: `Wait for a PDF parse task and write the raw Doc2x result JSON (with page layout) to output_path. Prefer passing uid. If only pdf_path is provided, this tool reuses a cached uid or submits a new parse with model='${PARSE_PDF_MODEL_V3}' by default.`,
126
+ inputSchema: {
127
+ uid: parsePdfUidSchema.optional(),
128
+ pdf_path: pdfPathForWaitSchema.optional(),
129
+ output_path: jsonOutputPathSchema,
130
+ poll_interval_ms: positiveIntMsSchema.optional(),
131
+ max_wait_ms: positiveIntMsSchema.optional(),
132
+ model: parsePdfModelSchema
133
+ .describe(`Optional parse model used only when submitting from pdf_path. Supported values: '${PARSE_PDF_MODEL_V2}' and '${PARSE_PDF_MODEL_V3}'. Defaults to '${PARSE_PDF_MODEL_V3}' for this tool because ${PARSE_PDF_MODEL_V2} does not return layout.`),
134
+ },
135
+ }, withToolErrorHandling(async (args) => {
136
+ const materializeByUid = async (uid) => {
137
+ const out = await parsePdfWaitResultByUid({
138
+ uid,
139
+ poll_interval_ms: args.poll_interval_ms,
140
+ max_wait_ms: args.max_wait_ms,
141
+ });
142
+ return await materializePdfLayoutJson({
143
+ uid: out.uid,
144
+ result: out.result,
145
+ output_path: args.output_path,
146
+ });
147
+ };
148
+ const uid = String(args.uid || '').trim();
149
+ if (uid)
150
+ return asJsonResult(await materializeByUid(uid));
151
+ const pdfPath = String(args.pdf_path || '').trim();
152
+ if (!pdfPath)
153
+ throw missingEitherFieldError('uid', 'pdf_path');
154
+ const sig = await fileSig(pdfPath);
155
+ const model = args.model ?? PARSE_PDF_MODEL_V3;
156
+ const cacheKey = makePdfUidCacheKey(sig.absPath, model);
157
+ const resolvedUid = getSubmittedUidFromCache(ctx, { kind: 'pdf', key: cacheKey, sig });
158
+ const finalUid = resolvedUid || (await parsePdfSubmit(pdfPath, { model })).uid;
159
+ setSubmittedUidCache(ctx, { kind: 'pdf', key: cacheKey, sig, uid: finalUid });
160
+ const markFailed = (failedUid) => setFailedUidCache(ctx, { kind: 'pdf', key: cacheKey, sig, uid: failedUid });
161
+ try {
162
+ return asJsonResult(await materializeByUid(finalUid));
163
+ }
164
+ catch (e) {
165
+ if (!resolvedUid) {
166
+ markFailed(finalUid);
167
+ throw e;
168
+ }
169
+ deleteUidCache(ctx, { kind: 'pdf', key: cacheKey });
170
+ if (!isRetryableError(e)) {
171
+ markFailed(finalUid);
172
+ throw e;
173
+ }
174
+ const retryUid = (await parsePdfSubmit(pdfPath, { model })).uid;
175
+ setSubmittedUidCache(ctx, { kind: 'pdf', key: cacheKey, sig, uid: retryUid });
176
+ try {
177
+ return asJsonResult(await materializeByUid(retryUid));
178
+ }
179
+ catch (retryErr) {
180
+ markFailed(retryUid);
181
+ throw retryErr;
182
+ }
183
+ }
184
+ }));
123
185
  }
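The `pdf_path` branch of the new tool reuses a cached uid when one exists, and resubmits once only if that cached uid fails with a retryable error. A minimal sketch of that control flow follows, assuming a plain `Map` as the cache. `materializeWithRetry` and its parameters are hypothetical names, and the failed-uid bookkeeping done by `setFailedUidCache` is left out.

```javascript
// Hypothetical sketch: reuse a cached uid, else submit; retry once when a
// cached (possibly stale) uid fails with a retryable error.
async function materializeWithRetry({ cache, key, submit, run, isRetryable }) {
  const cachedUid = cache.get(key);
  const firstUid = cachedUid ?? (await submit());
  cache.set(key, firstUid);
  try {
    return await run(firstUid);
  } catch (err) {
    // A freshly submitted uid failed: there is no stale entry to blame.
    if (!cachedUid) throw err;
    // The cached uid may be stale: drop it, then retry only if retryable.
    cache.delete(key);
    if (!isRetryable(err)) throw err;
    const retryUid = await submit();
    cache.set(key, retryUid);
    return await run(retryUid);
  }
}
```

The retry is limited to the cached-uid case on purpose: a failure on a uid that was just submitted would most likely repeat, so resubmitting again would only spend another parse.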
@@ -82,6 +82,7 @@ export declare const convertToSchema: z.ZodEnum<{
82
82
  docx: "docx";
83
83
  }>;
84
84
  export declare const parsePdfModelSchema: z.ZodOptional<z.ZodEnum<{
85
+ v2: "v2";
85
86
  "v3-2026": "v3-2026";
86
87
  }>>;
87
88
  export declare const convertFormulaModeSchema: z.ZodEnum<{
@@ -98,6 +99,7 @@ export declare const imagePathSchema: z.ZodString;
98
99
  export declare const imagePathForWaitSchema: z.ZodString;
99
100
  export declare const pdfPathForWaitSchema: z.ZodString;
100
101
  export declare const outputPathSchema: z.ZodString;
102
+ export declare const jsonOutputPathSchema: z.ZodString;
101
103
  export declare const doc2xDownloadUrlSchema: z.ZodPipe<z.ZodString, z.ZodURL>;
102
104
  export declare const convertZipBase64Schema: z.ZodString;
103
105
  export declare const outputDirSchema: z.ZodString;
@@ -5,7 +5,7 @@ import path from 'node:path';
5
5
  import { LRUCache } from 'lru-cache';
6
6
  import { z } from 'zod';
7
7
  import { CONVERT_FORMULA_LEVELS } from '#doc2x/convert';
8
- import { PARSE_PDF_MODELS } from '#doc2x/pdf';
8
+ import { PARSE_PDF_MODEL_V2, PARSE_PDF_MODELS } from '#doc2x/pdf';
9
9
  import { ToolError } from '#errors';
10
10
  import { TOOL_ERROR_CODE_INVALID_ARGUMENT } from '#errorCodes';
11
11
  import { asErrorResult } from '#mcp/results';
@@ -101,7 +101,7 @@ export function sameSig(a, b) {
101
101
  return a.md5 === b.md5;
102
102
  }
103
103
  function normalizeParsePdfModel(model) {
104
- return model ?? 'v2';
104
+ return model ?? PARSE_PDF_MODEL_V2;
105
105
  }
106
106
  export function makePdfUidCacheKey(absPath, model) {
107
107
  return JSON.stringify([absPath, normalizeParsePdfModel(model)]);
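Because the default model is folded into the cache key, a call that omits `model` and a call that passes the default explicitly share one cache entry, while a `v3-2026` parse of the same file gets its own. The restatement below is a sketch with hypothetical names (`DEFAULT_MODEL`, `makeKey`), not the package's code.

```javascript
// Hypothetical restatement of the cache-key idea from the hunk above.
const DEFAULT_MODEL = 'v2';

function makeKey(absPath, model) {
  // A missing model is normalized to the default before building the key.
  return JSON.stringify([absPath, model ?? DEFAULT_MODEL]);
}

console.log(makeKey('/data/a.pdf') === makeKey('/data/a.pdf', 'v2')); // true
console.log(makeKey('/data/a.pdf') === makeKey('/data/a.pdf', 'v3-2026')); // false
```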
@@ -174,6 +174,11 @@ export const imagePathSchema = imagePathBaseSchema.describe("Absolute path to a
174
174
  export const imagePathForWaitSchema = imagePathBaseSchema.describe('Absolute path to a local image file (png/jpg). Used to reuse cached uid or submit a new async task.');
175
175
  export const pdfPathForWaitSchema = pdfPathSchema.describe('Absolute path to a local PDF file. If uid is not provided, this tool will reuse cached uid (if any) or submit a new task.');
176
176
  export const outputPathSchema = absolutePathSchema.describe('Absolute path for the output file. The file will be overwritten if it exists.');
177
+ export const jsonOutputPathSchema = absolutePathSchema
178
+ .refine((v) => v.toLowerCase().endsWith('.json'), {
179
+ message: "Path must end with '.json'.",
180
+ })
181
+ .describe('Absolute path for the output JSON file. The file will be overwritten if it exists.');
177
182
  export const doc2xDownloadUrlSchema = z
178
183
  .string()
179
184
  .trim()
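The new `jsonOutputPathSchema` layers a case-insensitive `.json` suffix check on top of the absolute-path schema. The plain JavaScript sketch below shows the same two checks without zod. `checkJsonOutputPath` is a hypothetical helper, and its absolute-path test is a simplified assumption, since the rules inside `absolutePathSchema` are not shown in this diff.

```javascript
// Hypothetical plain-JS equivalent of the refine() check; the real schema uses zod.
function checkJsonOutputPath(value) {
  const v = String(value ?? '');
  // Simplified absolute-path test (POSIX root or Windows drive letter); an assumption.
  const isAbsolute = v.startsWith('/') || /^[A-Za-z]:[\\/]/.test(v);
  if (!isAbsolute) return { ok: false, message: 'Path must be absolute.' };
  // Same suffix rule as the hunk above: compare case-insensitively.
  if (!v.toLowerCase().endsWith('.json')) {
    return { ok: false, message: "Path must end with '.json'." };
  }
  return { ok: true, value: v };
}
```

Lowercasing before the suffix test means `/tmp/out.JSON` passes while `/tmp/out.txt` is rejected with the message from the schema.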
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@noedgeai-org/doc2x-mcp",
3
- "version": "0.1.3-dev.6.1",
3
+ "version": "0.1.4-dev.8.1",
4
4
  "description": "Doc2x MCP server (stdio, MCP SDK).",
5
5
  "license": "MIT",
6
6
  "engines": {
@@ -31,7 +31,7 @@
31
31
  "skill:install:ps": "pwsh -NoProfile -ExecutionPolicy Bypass -File scripts/install-skill.ps1",
32
32
  "skill:install:winps": "powershell -NoProfile -ExecutionPolicy Bypass -File scripts/install-skill-winps.ps1",
33
33
  "start": "node dist/index.js",
34
- "test:unit": "npm run build && node --test test/unit/registerToolsShared.test.js",
34
+ "test:unit": "npm run build && node --test test/unit/registerToolsShared.test.js test/unit/materialize.test.js",
35
35
  "test:e2e": "npm run build && node --test test/e2e/mcpServer.e2e.test.js",
36
36
  "test": "npm run test:unit && npm run test:e2e",
37
37
  "prepublishOnly": "pnpm run build"
@@ -1,173 +1,116 @@
1
1
  ---
2
2
  name: doc2x-mcp
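- description: Use the Doc2x MCP tools for document parsing and conversion, covering OCR and layout parsing of PDFs, scans, and images, text and table extraction, export to Markdown/LaTeX (TeX)/DOCX, and download to disk (submit/status/wait/export/download). Use when the user mentions PDF/pdfs, scanned PDF, OCR, image-to-text, extract text/tables, table extraction, document conversion/convert, export, Markdown, LaTeX/TeX, DOCX, doc2x, doc2x-mcp, or MCP.
+ description: Use the Doc2x MCP tools to process PDFs, scans, and images, covering parse submission, status checks, waiting for text, export to Markdown/LaTeX/DOCX, download to disk, and writing PDF v3 layout results to a local JSON file. Use when the user mentions PDF, OCR, scan/scanned PDF, image-to-text, extract text/tables, table extraction, layout, Markdown, LaTeX/TeX, DOCX, doc2x, doc2x-mcp, MCP, figure/table crop, or v3 JSON.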
4
4
  ---
5
5
 
6
- # Doc2x MCP Tool-Use Skill (for LLM)
6
+ # Doc2x MCP
7
7
 
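- ## What you need to do
+ ## Purpose

- You are an assistant that calls MCP tools. Anything involving "parse/extract/export/download" for PDFs or images must be carried out as a real operation through the `doc2x-mcp` tools:
+ Any request to parse PDFs or images, extract text or tables, export documents, download results, or obtain v3 layout JSON should be carried out as a real operation through the `doc2x-mcp` tools. Do not fabricate a `uid`, `url`, file content, or export result.

- - Do not guess or fabricate a `uid`, `url`, file content, or export result
- - Do not skip tool steps and output content that merely "looks plausible"
+ ## Mandatory rules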
14
13
 
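- ## Global constraints (mandatory)
+ 1. Use absolute paths for every file path: `pdf_path`, `image_path`, `output_path`, `output_dir`.
+ 2. Do not fabricate download URLs; use only the `url` returned by `doc2x_convert_export_*`.
+ 3. Do not submit the same set of export parameters for the same `uid` concurrently.
+ 4. When comparing several export settings for the same `uid`, follow the order "export succeeds -> download immediately -> export the next setting" so that results are not overwritten.
+ 5. Do not echo `DOC2X_API_KEY`; for troubleshooting, use only the summary fields from `doc2x_debug_config`.
+ 6. `model` applies only to PDF parse submission; `formula_level` applies only to export, and takes effect only when the source parse used `v3-2026`.
+ 7. `doc2x_parse_pdf_wait_text` is suitable only for previews or summaries; when the complete result is needed, prefer exporting a file.
+ 8. When PDF v3 block/layout coordinates are needed, do not infer them from text output; use `doc2x_materialize_pdf_layout_json` directly.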
16
22
 
- 1. Paths must be absolute
-    `pdf_path` / `image_path` / `output_path` / `output_dir` should all be absolute paths; the server may resolve a relative path against an unexpected cwd and fail.
+ ## Parameter boundaries

- 2. Extension constraints
-    `doc2x_parse_pdf_submit.pdf_path` must end with `.pdf`; image parsing uses `png/jpg`.
+ - PDF parsing: `doc2x_parse_pdf_submit` and `doc2x_parse_pdf_wait_text` (the `pdf_path` branch) accept `model: "v2" | "v3-2026"`; the default when omitted is `v2`.
+ - PDF layout JSON: in the `pdf_path` branch, `doc2x_materialize_pdf_layout_json` defaults to `v3-2026` and requires the returned result to contain `pages[].layout`.
+ - Export: always passing `formula_mode` explicitly is recommended.
+ - `formula_level` must be a number, `0 | 1 | 2`; do not pass a string.
+ - Image parse paths accept only `png/jpg/jpeg`; PDF paths must end with `.pdf`; layout JSON output paths should end with `.json`.
22
30
 
- 3. Do not submit duplicate exports concurrently
-    For the same `uid`, do not submit the same export configuration (`to + formula_mode + formula_level (+ filename + filename_mode + merge_cross_page_forms...)`) in parallel.
-    Note: the export result for the same `uid + to` may be overwritten by a later export; when comparing several settings (such as `formula_level=0/1/2`), follow the order **export succeeds → download to disk immediately → export the next setting**.
+ ## Choosing a tool by goal

- 4. Do not leak secrets
-    Never echo or log `DOC2X_API_KEY`. For troubleshooting, use only `apiKeyLen/apiKeyPrefix/apiKeySource` from `doc2x_debug_config`.
+ - Submit a PDF parse: `doc2x_parse_pdf_submit`
+ - Check PDF status: `doc2x_parse_pdf_status`
+ - Get a PDF text preview: `doc2x_parse_pdf_wait_text`
+ - Export a PDF to `md/tex/docx`: `doc2x_convert_export_wait`
+ - Download an exported file: `doc2x_download_url_to_file`
+ - Write PDF v3 layout JSON to disk: `doc2x_materialize_pdf_layout_json`
+ - Raw image layout parse result: `doc2x_parse_image_layout_sync`
+ - Image layout parse, then wait for first-screen Markdown: `doc2x_parse_image_layout_submit` -> `doc2x_parse_image_layout_wait_text`
+ - Write `convert_zip` to disk: `doc2x_materialize_convert_zip`
+ - Configuration troubleshooting: `doc2x_debug_config`
29
43
 
- 5. Do not fabricate download URLs
-    Downloads must use the `url` returned by `doc2x_convert_export_*`; do not construct one yourself.
+ ## Standard workflows

- 6. Where parameters take effect
-    `model` applies only to PDF parse submission (default `v2`, optionally `v3-2026`); `formula_level` applies only to export (`doc2x_convert_export_*`) and takes effect only when the source parse task used `v3-2026` (it has no effect under `v2`).
+ ### 1. PDF -> complete file

- ## Key parameter semantics (avoid misuse)
+ Prefer this workflow when the user wants complete Markdown / TeX / DOCX:

- - `doc2x_parse_pdf_submit` / `doc2x_parse_pdf_wait_text` (the `pdf_path` submit branch)
-   - Optional `model: "v3-2026"`; the default when omitted is `v2`.
- - `doc2x_convert_export_submit` / `doc2x_convert_export_wait`
-   - `formula_mode`: `"normal"` or `"dollar"` (a key parameter; always passing it explicitly is recommended).
-   - `formula_level`: `0 | 1 | 2` (optional, **numeric type**; do not pass the strings `"0"|"1"|"2"`)
-     - `0`: do not degrade formulas (keep the original Markdown)
-     - `1`: degrade inline formulas to plain text (`\(...\)`, `$...$`)
-     - `2`: degrade both inline and block formulas to plain text (`\(...\)`, `$...$`, `\[...\]`, `$$...$$`)
+ 1. `doc2x_parse_pdf_submit({ pdf_path, model? })`
+ 2. Poll `doc2x_parse_pdf_status({ uid })` until it succeeds
+ 3. `doc2x_convert_export_wait({ uid, to, formula_mode, formula_level?, filename?, filename_mode? })`
+ 4. `doc2x_download_url_to_file({ url, output_path })`
46
54
 
- ## Tool selection (by user goal)
+ Notes:

- - **PDF parse task**: `doc2x_parse_pdf_submit` → `doc2x_parse_pdf_status`
- - **Short preview/summary**: `doc2x_parse_pdf_wait_text` (may be truncated; export a file for the complete content)
- - **Export a file (md/tex/docx)**: `doc2x_convert_export_submit` → `doc2x_convert_export_wait` (or call `doc2x_convert_export_wait` directly to export in one step through compatibility mode)
- - **Download to disk**: `doc2x_download_url_to_file`
- - **Image layout parsing**: `doc2x_parse_image_layout_sync`, or `doc2x_parse_image_layout_submit` → `doc2x_parse_image_layout_wait_text`
- - **Unpack the resource zip**: `doc2x_materialize_convert_zip`
- - **Configuration troubleshooting**: `doc2x_debug_config`
+ - `md/docx` commonly use `formula_mode: "normal"`
+ - `tex` commonly uses `formula_mode: "dollar"`
+ - When the complete content is needed, do not substitute `doc2x_parse_pdf_wait_text` for an export
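The four steps of the "PDF -> complete file" workflow can be sketched as one function. This is an illustration under stated assumptions: `callTool(name, args)` is a hypothetical wrapper that invokes an MCP tool and resolves with its parsed JSON payload, `pdfToFile` and `maxPolls` are invented for the example, and the result shapes (`{ uid }`, `status`, `{ url }`, `{ output_path, bytes_written }`) follow the tool descriptions quoted in this skill file.

```javascript
// Hypothetical sketch of submit -> poll status -> export -> download.
// `callTool` is an assumed MCP client wrapper, not part of doc2x-mcp.
async function pdfToFile(callTool, { pdfPath, outputPath, to = 'md', formulaMode = 'normal', maxPolls = 100 }) {
  const { uid } = await callTool('doc2x_parse_pdf_submit', { pdf_path: pdfPath });

  let parsed = false;
  for (let i = 0; i < maxPolls && !parsed; i += 1) {
    const st = await callTool('doc2x_parse_pdf_status', { uid });
    if (st.status === 'failed') throw new Error(`parse failed for ${uid}`);
    parsed = st.status === 'success';
    // Short pause between polls; a real client would use a longer interval.
    if (!parsed) await new Promise((resolve) => setTimeout(resolve, 5));
  }
  if (!parsed) throw new Error(`gave up waiting for ${uid}`);

  // Export, then download with the returned url; never a hand-built one.
  const { url } = await callTool('doc2x_convert_export_wait', { uid, to, formula_mode: formulaMode });
  return callTool('doc2x_download_url_to_file', { url, output_path: outputPath });
}
```

Passing `formula_mode` explicitly on every export call matches the recommendation in the notes, and the download step uses only the `url` that the export call returned.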
56
60
 
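- ## Standard workflows (follow them)
+ ### 2. PDF -> text preview

- ### Workflow A: batch PDF → exported files (MD/TEX/DOCX, efficient parallel version)
+ Use only when the user wants a quick preview, a summary, or a small amount of text:

- For batch-exporting multiple PDFs to disk (.md / .tex / .docx). Core principles:
+ - `doc2x_parse_pdf_wait_text({ pdf_path | uid, max_output_chars?, max_output_pages?, model? })`

- - `doc2x_parse_pdf_submit` can run in parallel (batch submission)
- - `doc2x_parse_pdf_status` can run in parallel (batch polling)
- - **Pipelined parallelism**: as soon as a `uid` finishes parsing, start export and download for that `uid` (no need to wait for every PDF to finish parsing)
- - Exports and downloads for different `uid`s can run in parallel
- - **Do not submit the same export configuration (`to + formula_mode + formula_level (+ filename + filename_mode + merge_cross_page_forms...)`) for the same `uid` in parallel**
- - To export several formats for the same `uid` (for example md + docx + tex), **run the formats serially**; different `uid`s can still run in parallel
+ If a truncation notice appears, switch back to the "PDF -> complete file" workflow.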
69
68
 
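- **Submit parse tasks in batch (parallel)**
+ ### 3. PDF -> v3 layout JSON

- - For each `pdf_path`, call `doc2x_parse_pdf_submit({ pdf_path, model? })` → `{ uid }`
+ Use when the user wants figure/table coordinates, block bboxes, layout blocks, or input for a later cropping script:

- **Wait for parsing to finish (parallel)**
+ - Preferred: `doc2x_materialize_pdf_layout_json({ uid | pdf_path, output_path, model? })`

- - For each `uid`, poll `doc2x_parse_pdf_status({ uid })` until `status="success"`
- - If `status="failed"`: report `detail` and stop the remaining steps for that file
+ Explain to the user what `layout` is for:

- **Export the target format (parallel, per uid)**
+ - `Markdown/text` suits reading the body text; `layout` suits programs that go on to process the page structure
+ - `layout.blocks[].bbox` can be used for figure/table cropping, region screenshots, box highlighting, and visual debugging
+ - `layout.blocks[].type` can be used to distinguish block kinds such as headings, body text, tables, and images for structured extraction
+ - `layout` works well as input to later scripts, for example figure/table crop, block alignment, and layout analysis
+ - If the user only wants to "see the content", prefer Markdown / DOCX; if the user needs to "know where the content sits on the page", use `layout`

- Using `doc2x_convert_export_wait` in "one-step compatibility mode" is recommended (when you supply `formula_mode` and this process has not yet submitted that export, it submits once automatically and then waits), so you do not have to split the call into submit + wait by hand:
+ Behavior requirements:

- - DOCX: `doc2x_convert_export_wait({ uid, to: "docx", formula_mode: "normal", formula_level? })` → `{ status: "success", url }`
- - Markdown: `doc2x_convert_export_wait({ uid, to: "md", formula_mode: "normal", formula_level?, filename?, filename_mode? })` → `{ status: "success", url }`
- - LaTeX: `doc2x_convert_export_wait({ uid, to: "tex", formula_mode: "dollar", formula_level? })` → `{ status: "success", url }`
+ - In the `pdf_path` branch, `v3-2026` is used by default
+ - The output is the raw parse `result` JSON, not condensed text
+ - If the returned result lacks `pages[].layout`, treat it as a failure rather than degrading silently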
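A script that consumes the materialized JSON typically walks `pages[].layout.blocks[]` and keeps the blocks it cares about. The sketch below is illustrative only: `collectBlocks` is a hypothetical helper, the type names (`'table'`, `'figure'`) and the four-number `bbox` shape in the sample are assumptions, and a page without `layout` is treated as an error in line with the behavior requirements above.

```javascript
// Hypothetical sketch: pick blocks of given types from a layout result.
// Type names and the bbox shape are assumptions, not documented values.
function collectBlocks(result, wantedTypes) {
  const wanted = new Set(wantedTypes);
  const out = [];
  const pages = Array.isArray(result?.pages) ? result.pages : [];
  pages.forEach((page, pageIndex) => {
    // Missing layout is a failure, not something to skip silently.
    if (!page.layout) throw new Error(`page ${pageIndex} has no layout`);
    for (const block of page.layout.blocks ?? []) {
      if (wanted.has(block.type)) out.push({ pageIndex, type: block.type, bbox: block.bbox });
    }
  });
  return out;
}

// Made-up sample shaped like the fields named in this section.
const sample = {
  pages: [
    { layout: { blocks: [{ type: 'text', bbox: [0, 0, 10, 10] }, { type: 'table', bbox: [0, 20, 50, 60] }] } },
    { layout: { blocks: [{ type: 'figure', bbox: [5, 5, 40, 40] }] } },
  ],
};
console.log(collectBlocks(sample, ['table', 'figure']).length); // 2
```

In a real script the `result` object would come from parsing the file written to `output_path`, and each returned `bbox` would feed the cropping or highlighting step.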
86
88
 
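- (Or use two explicit steps: `doc2x_convert_export_submit(...)` → `doc2x_convert_export_wait({ uid, to })`)
+ ### 4. Image -> layout result

- **Additional advice**
+ - Get the raw result directly: `doc2x_parse_image_layout_sync({ image_path })`
+ - Wait and get first-screen Markdown: `doc2x_parse_image_layout_submit({ image_path })` -> `doc2x_parse_image_layout_wait_text({ uid })`
+ - When the result contains `convert_zip` and the user wants the resources written to disk: `doc2x_materialize_convert_zip({ convert_zip_base64, output_dir })`

- - `formula_mode` is a key parameter: always passing it explicitly is recommended (`"normal"` / `"dollar"`, chosen by user preference; commonly `"normal"` for `md/docx` and `"dollar"` for `tex`)
- - Pass `formula_level` (`0/1/2`) explicitly when formula degradation is needed; if none is needed, passing `0` explicitly is recommended to avoid ambiguity about the caller's default
- - `filename`/`filename_mode` are mainly for `md/tex`: pass a basename without an extension together with `filename_mode: "auto"` (this avoids `name.md.md` / `name.tex.tex`)
- - When exporting several formats for the same `uid`, decide the order first (for example md, then docx) and finish each format before starting the next
- - When comparing several parameter settings (such as `formula_level`) for the same format of the same `uid`, download each result before moving to the next setting, so that an overwrite does not lead to a wrong conclusion
+ ### 5. Batch PDF

- **Batch download (parallel)**
+ Use a pipeline for batch jobs rather than running everything serially:

- - `doc2x_download_url_to_file({ url, output_path })` → `{ output_path, bytes_written }`
- - `output_path` must be an absolute path and unique per file (the original file name plus the matching extension is suggested: `.md` / `.tex` / `.docx`)
+ 1. `doc2x_parse_pdf_submit` can run in parallel for multiple `pdf_path` values
+ 2. `doc2x_parse_pdf_status` can run in parallel for multiple `uid` values
+ 3. As soon as a `uid` parses successfully, start its own export and download
+ 4. Different `uid` values can export in parallel
+ 5. Do not run the same export configuration for the same `uid` concurrently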
101
104
 
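- **Concurrency advice**
+ ## Reporting to the user

- - Up to about 10 PDFs can usually run in parallel directly; for more files, batch or rate-limit the work (to avoid triggering timeouts or rate limits)
+ - On success, report: input file, `uid`, output path, and `bytes_written` when relevant
+ - On failure, report: error code, error message, and the related `uid`, and state which files were not affected
+ - When the user's goal is a local file, report the on-disk result first rather than pasting long text

- **Report to the user (summarized per file)**
+ ## Common error handling

- - Success: list the `output_path` and `bytes_written` for each input file
- - Failure: list the failed files and the reason (including `uid` and `detail`/error code), and state that the remaining files are not affected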
-
- ### Workflow B: PDF → Markdown file (recommended)
-
- When the user's goal is "get the complete Markdown / write it to disk", the main path should be export and download; do not rely on `doc2x_parse_pdf_wait_text`.
-
- **Submit the parse task**
-
- - `doc2x_parse_pdf_submit({ pdf_path, model? })` → `{ uid }`
-
- **Wait for parsing to finish**
-
- - Poll `doc2x_parse_pdf_status({ uid })` until `status="success"` (on failure, report with `detail`)
-
- **Export Markdown**
-
- - `doc2x_convert_export_wait({ uid, to: "md", formula_mode: "normal", formula_level?, filename?, filename_mode? })` → `{ status: "success", url }`
-
- **Download to disk**
-
- - `doc2x_download_url_to_file({ url, output_path })` → `{ output_path, bytes_written }`
-
- **Report to the user**
-
- - Tell the user: save path, file size, `uid` (and `url` when needed)
-
- ### Workflow C: PDF → text preview (controlled length)
-
- Use only when the user needs just a summary or a short preview:
-
- - `doc2x_parse_pdf_wait_text({ pdf_path | uid, max_output_chars?, max_output_pages? })`
-
- If the response contains a truncation notice (`[doc2x-mcp] Output truncated ...`), switch to "Workflow B" and export md to get the complete content.
-
- ### Workflow D: PDF export formats (MD / TEX / DOCX)
-
- - Markdown: `to="md"` (for a complete Markdown export, refer to "Workflow B" first)
- - LaTeX: `to="tex"`
- - Word: `to="docx"`
- - The call chain is the same as "Workflow A / B" (parse → export → download); adjust `to` for the target format (and set `formula_mode/formula_level/filename` as needed)
- - Note: `doc2x_convert_export_submit.formula_mode` is required (`"normal"` or `"dollar"`); `formula_level` is optional (`0/1/2`)
- - To compare different `formula_level` values, run them in order and download right after each export succeeds before moving to the next, so that a later result does not overwrite an earlier one.
-
- ### Workflow E: image → Markdown (layout parsing)
-
- - Result only (synchronous): `doc2x_parse_image_layout_sync({ image_path })` (returns raw JSON, which may contain `convert_zip`)
- - First-screen markdown (asynchronous): `doc2x_parse_image_layout_submit({ image_path })` → `doc2x_parse_image_layout_wait_text({ uid })`
-
- If the result contains `convert_zip` (base64) and the user wants the resource files written to disk:
-
- - `doc2x_materialize_convert_zip({ convert_zip_base64, output_dir })` → `{ output_dir, zip_path, extracted }`
-
- ## Failures and troubleshooting (how you should handle them)
-
- 1. Authentication/configuration problems
-    First call `doc2x_debug_config()` and confirm that `apiKeyLen > 0` and that `baseUrl/httpTimeoutMs/pollIntervalMs/maxWaitMs` are reasonable.
-
- 2. Wait timeout
-    Suggest that the user raise `DOC2X_MAX_WAIT_MS` or adjust `DOC2X_POLL_INTERVAL_MS` as needed (do not poll too frequently).
-
- 3. Download blocked (security policy)
-    `doc2x_download_url_to_file` allows only `https` and requires the host to be in `DOC2X_DOWNLOAD_URL_ALLOWLIST`; when a download is blocked, explain why and let the user choose between "add to the allowlist" and "keep the default security policy".
-
- 4. The user gave a relative or uncertain path
-    Ask the user for an absolute path; do not guess.
+ 1. Missing parameters or invalid paths: ask the user for an absolute path; do not guess at relative paths.
+ 2. Wait timeout: explain that `DOC2X_MAX_WAIT_MS` can be raised or the polling interval adjusted moderately.
+ 3. Download blocked by policy: explain that the `DOC2X_DOWNLOAD_URL_ALLOWLIST` restriction applies; do not bypass it.
+ 4. Authentication or configuration problems: call `doc2x_debug_config` and report only summary fields such as `apiKeySource/apiKeyPrefix/apiKeyLen`.