@laiye-adp/agentic-doc-parse-and-extract-cli 1.10.0-beta.1 → 1.10.1
- package/README.md +25 -18
- package/README.zh.md +29 -18
- package/bin/adp +40 -0
- package/package.json +7 -7
package/README.md
CHANGED
@@ -1,23 +1,22 @@
-
+## 🚀 About Laiye ADP
 
-
+ADP is Laiye's **intelligent agent document processing product (Agentic Document Processing, referred to as ADP)** , based on the general understanding ability of large models, without relying on rules and annotations, with the general understanding ability of multi-language, MultiModal Machine Learning, and multi-scene; autonomous planning and execution of intelligent agents, able to understand task goals, autonomous planning steps, invoke tools, and complete complex tasks; end-to-end business automation, from document input to business decision-making to human-machine collaboration, forming a complete closed loop.
 
-
+**agentic-doc-parse-and-extract** is the official open-source CLI tool of ADP, supporting both manual terminal invocation and automatic invocation via AI Skill. With a single command, it can accomplish: structured document parsing + intelligent extraction of key fields, covering all scenarios including invoices, orders, certificates, bills, and general documents, outputting standard JSON, and seamlessly integrating with automation and AI workflows.
 
-
+---
 
-
-- **Invoice/Receipt/Purchase Order Extraction** — Extract key information and line items from invoices, receipts, purchase orders and more
-- **Custom Document Extraction** — Extract custom fields from any type of document
-- **Batch Processing** — Concurrently process multiple documents in a folder or from URLs, with per-file result output
-- **Sync/Async** — Support both sync and async processing modes
-- **Two-Phase Async** — `--async --no-wait` submits tasks and outputs task-id list; `query --file` resumes from where you left off
-- **Reliability** — Automatic retry with exponential backoff (`--retry`), fine-grained exit codes
-- **Cross Platform** — Windows / Linux / macOS, static binaries with no dependencies
+## 💡 Core Features
 
-
+agentic-doc-parse-and-extract focuses on intelligent processing of the entire document workflow, taking into account both manual terminal calls and automatic calls by AI Agents. Its core functions cover all scenarios of parsing, extraction, and batch processing, requiring no complex configuration, and operations can be completed with a single command:
 
-
+| Function Name | Function Description | Optimal Scenario |
+|---------|------------------|----------|
+| **Document Parsing** | Automatically recognize multi-format documents such as PDFs and images, convert messy unstructured content (e.g., scanned documents, handwritten text, complex layout documents) into standardized Structured Data, while preserving the original document hierarchy and key relationships | Convert unstructured documents into Structured Data for LLM reading and subsequent extraction |
+| **Out Of The Box Document Extraction** | Based on the native AI capabilities of the ADP large model, it comes with built-in standardized extraction models for invoices, receipts, orders, commonly used certificates in China, etc. No need to configure rules or manual annotation, one-click extraction of key fields from various types of general documentation, outputting standard JSON | Account Payable automation, expense management, procurement automation, quick entry of card and certificate information into the system |
+| **Custom Document Extraction** | Supports independent creation, editing, and management of personalized extraction applications, allowing configuration of exclusive extraction fields and recognition logic for enterprise-specific documentation and industry-customized forms | Private extraction requirements for enterprise-specific documentation, industry-customized forms, and non-standardized documents |
+| **Task Query** | Supports asynchronous task submission and status query, enabling quick viewing of task execution progress, success/failure status, and final task processing results | Batch task processing, asynchronous document processing, problem troubleshooting, and processing record tracing |
+| **Application Management** | Provides comprehensive application management capabilities, allowing users to view all available extraction applications (system-built + custom), query application details, and manage application tags | Multi-scenario business switching, full lifecycle management of applications, and custom application management |
 
 ## Agent Integration
 
@@ -215,10 +214,18 @@ adp_results_20250417_153020/
 - App cache: `~/.adp/app_cache.json`
 - Version check cache: `~/.adp/version_check.json` (refreshed every 24h)
 
-## License
+## 📜 License
 
-
+We adopt a combined model of open-source tools + paid services: the CLI tool is completely free and open-source, making it easy for everyone to quickly integrate; while the core ADP intelligent parsing capability is a Public Cloud commercial service, billed based on actual usage, aiming to provide users with a highly accurate and stable document processing experience.
 
-
+- **CLI Tool**: Open source under the MIT License, freely available for use, modification, and distribution
+- **ADP Service**: AI document processing service based on Public Cloud, billed by usage, [Billing Rules](#credit)
 
-
+Free Quota: New users can receive **100 free credits** per month after registration, allowing them to experience full functionality
+
+## 📞 Support and Contact
+- **CLI Documentation**: [ADP CLI User Guide](https://laiye-tech.feishu.cn/wiki/YIaawiK2DimisZk5KfDc8a8cnLh)
+- **API Documentation**: [OpenAPI User Guide](https://laiye-tech.feishu.cn/wiki/S1t2wYR04ivndKkMDxxcp2SFnKd?from=from_copylink)
+- **User Guide**: [Public Cloud Operation Manual](https://laiye-tech.feishu.cn/wiki/OfexwgVUQiOpEek4kO7c7NEJnAe)
+- **Problem Feedback**: [GitHub Issues](https://github.com/laiye-ai/adp-cli/issues) | global_product@laiye.com
+- **Official Website**: [Laiye Technology](https://laiye.com/en/)
package/README.zh.md
CHANGED
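@@ -1,23 +1,22 @@
-
+## 🚀 About Laiye ADP
 
-
+ADP is Laiye Technology's **agentic document processing product (Agentic Document Processing, ADP for short)**. It builds on the general understanding capability of large models, does not rely on rules or annotation, and offers general understanding across languages, modalities, and scenarios; its agents plan and execute autonomously, understanding task goals, planning steps, calling tools, and completing complex tasks; and it provides end-to-end business automation, from document input through business decisions to human-machine collaboration, forming a complete closed loop.
 
-
+**agentic-doc-parse-and-extract** is ADP's official open-source CLI tool, supporting both manual terminal invocation and automatic invocation via AI Skill. A single command performs structured document parsing plus intelligent extraction of key fields, covering invoices, orders, certificates, bills, and general documents, outputs standard JSON, and integrates seamlessly with automation and AI workflows.
 
-
+---
 
-
-- **Invoice/Order Extraction** — Extract key fields and tables from invoices, receipts, purchase orders, and other documents
-- **Custom Document Extraction** — Extract custom fields or tables from any type of document
-- **Batch Processing** — Concurrently process multiple documents from a folder or URL list, with a separate result output per file
-- **Sync/Async** — Supports both sync and async modes
-- **Two-Phase Async** — `--async --no-wait` only submits tasks and outputs a task-id list; `query --file` resumes querying from where it was interrupted
-- **Reliability** — Automatic retry with exponential backoff (`--retry`), fine-grained exit codes
-- **Cross Platform** — Windows / Linux / macOS, static binaries with no dependencies
+## 💡 Core Features
 
-
+agentic-doc-parse-and-extract focuses on intelligent processing across the whole document workflow, serving both manual terminal invocation and automatic invocation by AI Agents. Its core features cover parsing, extraction, and batch processing, require no complex configuration, and complete an operation with a single command:
 
-
+| Feature | Description | Best-fit scenario |
+|---------|------------------|----------|
+| **Document Parsing** | Automatically recognizes multi-format documents such as PDFs and images and converts messy unstructured content (e.g., scans, handwriting, complex layouts) into standardized structured data, preserving the original document hierarchy and key relationships | Converting unstructured documents into structured data for LLM reading and subsequent extraction |
+| **Out-of-the-Box Document Extraction** | Built on the native AI capabilities of the ADP large model, with built-in standardized extraction models for invoices, receipts, orders, certificates commonly used in China, and more; no rule configuration or manual annotation is needed, key fields of common documents are extracted in one step, and the output is standard JSON | Accounts payable automation, expense management, procurement automation, quick entry of card and certificate information into systems |
+| **Custom Document Extraction** | Supports creating, editing, and managing personalized extraction applications, with dedicated extraction fields and recognition logic configurable for enterprise-specific documents and industry-customized forms | Private extraction needs for enterprise-specific documents, industry-customized forms, and non-standardized documents |
+| **Task Query** | Supports asynchronous task submission and status query, for quickly viewing task progress, success/failure status, and final processing results | Batch task processing, asynchronous document processing, troubleshooting, and tracing of processing records |
+| **Application Management** | Provides complete application management: view all available extraction applications (built-in + custom), query application details, and application tags | Switching between business scenarios, full lifecycle management of applications, and custom application management |
 
 ## Agent Integration
 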
@@ -215,10 +214,22 @@ adp_results_20250417_153020/
 - App cache: `~/.adp/app_cache.json`
 - Version check cache: `~/.adp/version_check.json` (refreshed every 24 hours)
 
-##
+## 📜 License
 
-
+We use a combined open-source tool + paid service model: the CLI tool is completely free and open source so that everyone can integrate quickly, while the core ADP intelligent parsing capability is a public cloud commercial service billed by actual usage, aiming to give users highly accurate, highly stable document processing.
 
-
+- **CLI tool**: Open source under the MIT License; free to use, modify, and distribute
+- **ADP service**: Public cloud AI document processing service, billed by usage, [Billing Rules](#credit)
+
+Free quota: after registering, new users receive **100 free credits** per month and can try the full functionality
+
+
+## 📞 Support and Contact
+- **CLI user guide:** [ADP CLI User Guide](https://laiye-tech.feishu.cn/wiki/Hz3Vw1IQki3YQtk33gLcSdwSndc)
+- **API documentation:** [Open API User Guide](https://laiye-tech.feishu.cn/wiki/PO9Jw4cH3iV2ThkMPW2c539pnkc)
+- **ADP product operation manual:** [Public Cloud Operation Manual](https://laiye-tech.feishu.cn/wiki/UDYIwG42pisBbFkJI39ctpeKnWh)
+
+- **Issue feedback:** [GitHub Issues](https://github.com/laiye-ai-repos/adp-skill/issues)
+- **Email:** global_product@laiye.com
+- **Official website:** [Laiye Technology](https://laiye.com)
 
-Cross-platform build: `make build-all VERSION=v1.0.0`. Run E2E tests: `bash tests/test.sh`.
package/bin/adp
ADDED
@@ -0,0 +1,40 @@
+#!/usr/bin/env node
+
+'use strict';
+
+const { spawnSync } = require('child_process');
+const path = require('path');
+const fs = require('fs');
+
+const platform = process.platform;
+const arch = process.arch;
+const pkgName = `@laiye-adp/adp-cli-${platform}-${arch}`;
+const binName = platform === 'win32' ? 'adp.exe' : 'adp';
+
+function findBinPath() {
+  // Method 1: require.resolve (works for local/project installs)
+  try {
+    const pkgDir = path.dirname(require.resolve(`${pkgName}/package.json`));
+    const p = path.join(pkgDir, 'bin', binName);
+    if (fs.existsSync(p)) return p;
+  } catch {}
+
+  // Method 2: look in sibling node_modules (works for global installs)
+  // Global layout: <prefix>/node_modules/@laiye-adp/agentic-doc-parse-and-extract-cli/bin/adp
+  //                <prefix>/node_modules/@laiye-adp/adp-cli-win32-x64/bin/adp.exe
+  const scopeDir = path.resolve(__dirname, '..', '..');
+  const p = path.join(scopeDir, `adp-cli-${platform}-${arch}`, 'bin', binName);
+  if (fs.existsSync(p)) return p;
+
+  return null;
+}
+
+const binPath = findBinPath();
+if (!binPath) {
+  console.error(`[adp-cli] Platform package "${pkgName}" is not installed.`);
+  console.error(`[adp-cli] Supported platforms: linux-x64, linux-arm64, darwin-x64, darwin-arm64, win32-x64, win32-arm64`);
+  process.exit(1);
+}
+
+const result = spawnSync(binPath, process.argv.slice(2), { stdio: 'inherit' });
+process.exit(result.status ?? 1);
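The name derivation and the global-install fallback in the launcher above can be sketched in isolation, as a rough illustration rather than the package's own code. The helper names `platformTarget` and `siblingBinPath` are hypothetical, `launcherDir` stands in for `__dirname`, and `path.posix` is used only so the example prints the same path on every host (the launcher itself uses plain `path`).

```javascript
'use strict';

const path = require('path');

// Hypothetical helper: derives the platform package and binary names the
// same way the launcher does from process.platform and process.arch.
function platformTarget(platform, arch) {
  return {
    pkgName: `@laiye-adp/adp-cli-${platform}-${arch}`,
    binName: platform === 'win32' ? 'adp.exe' : 'adp',
  };
}

// Hypothetical helper: the "sibling package" fallback for global installs.
// Two levels up from the launcher's bin directory is the @laiye-adp scope
// directory, where the platform package is expected to sit side by side.
function siblingBinPath(launcherDir, platform, arch) {
  const { binName } = platformTarget(platform, arch);
  const scopeDir = path.posix.resolve(launcherDir, '..', '..');
  return path.posix.join(scopeDir, `adp-cli-${platform}-${arch}`, 'bin', binName);
}

const launcherDir =
  '/prefix/node_modules/@laiye-adp/agentic-doc-parse-and-extract-cli/bin';

console.log(platformTarget('win32', 'x64'));
console.log(siblingBinPath(launcherDir, 'linux', 'arm64'));
```

Keeping the derivation in pure functions like these makes the lookup testable without touching the filesystem; the real launcher additionally checks `fs.existsSync` before returning a path.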
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@laiye-adp/agentic-doc-parse-and-extract-cli",
-  "version": "1.10.
+  "version": "1.10.1",
   "description": "Official CLI for Laiye ADP (Agentic Document Processing) - document parsing and intelligent extraction",
   "keywords": [
     "adp",
@@ -39,11 +39,11 @@
     "arm64"
   ],
   "optionalDependencies": {
-    "@laiye-adp/adp-cli-linux-x64": "1.10.
-    "@laiye-adp/adp-cli-linux-arm64": "1.10.
-    "@laiye-adp/adp-cli-darwin-x64": "1.10.
-    "@laiye-adp/adp-cli-darwin-arm64": "1.10.
-    "@laiye-adp/adp-cli-win32-x64": "1.10.
-    "@laiye-adp/adp-cli-win32-arm64": "1.10.
+    "@laiye-adp/adp-cli-linux-x64": "1.10.1",
+    "@laiye-adp/adp-cli-linux-arm64": "1.10.1",
+    "@laiye-adp/adp-cli-darwin-x64": "1.10.1",
+    "@laiye-adp/adp-cli-darwin-arm64": "1.10.1",
+    "@laiye-adp/adp-cli-win32-x64": "1.10.1",
+    "@laiye-adp/adp-cli-win32-arm64": "1.10.1"
   }
 }
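In this release every platform package under `optionalDependencies` is pinned to the same version as the wrapper package. One way a release script might verify that, as a sketch only: the function name `mismatchedPlatformPins` is hypothetical and the sample manifest is abbreviated from the 1.10.1 `package.json` above; nothing here is part of the published package.

```javascript
'use strict';

// Hypothetical check: returns the names of optionalDependencies whose pinned
// version differs from the top-level package version.
function mismatchedPlatformPins(pkg) {
  const deps = pkg.optionalDependencies || {};
  return Object.entries(deps)
    .filter(([, version]) => version !== pkg.version)
    .map(([name]) => name);
}

// Sample manifest, abbreviated from the 1.10.1 package.json in the diff.
const manifest = {
  name: '@laiye-adp/agentic-doc-parse-and-extract-cli',
  version: '1.10.1',
  optionalDependencies: {
    '@laiye-adp/adp-cli-linux-x64': '1.10.1',
    '@laiye-adp/adp-cli-win32-arm64': '1.10.1',
  },
};

// An empty array means every platform pin matches the wrapper version.
console.log(mismatchedPlatformPins(manifest));
```

Exact pins matter here because the launcher only locates the platform binary and executes it, so a wrapper paired with a platform package from a different release would go unnoticed at install time.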