@helloxiaohu/plugin-mineru 0.0.20 → 0.1.2

package/README.md CHANGED
@@ -1,101 +1,101 @@
1
- # Xpert Plugin: MinerU
2
-
3
- `@xpert-ai/plugin-mineru` is a MinerU document converter plugin for the [Xpert AI](https://github.com/xpert-ai/xpert) platform, providing extraction capabilities from PDF to Markdown and structured JSON. The plugin includes built-in MinerU integration strategies, document conversion strategies, and result parsing services, enabling secure access to the MinerU API in automated workflows, polling task status, and writing parsed content and attachment resources to the platform file system.
4
-
5
- ## Installation
6
-
7
- ```bash
8
- pnpm add @xpert-ai/plugin-mineru
9
- # or
10
- npm install @xpert-ai/plugin-mineru
11
- ```
12
-
13
- > **Note**: This plugin depends on `@xpert-ai/plugin-sdk`, `@nestjs/common@^11`, `@nestjs/config@^4`, `@metad/contracts`, `axios@1`, `chalk@4`, `@langchain/core@^0.3.72`, and `uuid@8` as peerDependencies. Please ensure these packages are installed in your host project.
14
-
15
- ## Quick Start
16
-
17
- 1. **Prepare MinerU Credentials**
18
- Obtain a valid API Key from the MinerU dashboard and confirm the service address (default: `https://mineru.net/api/v4`).
19
-
20
- 2. **Configure Integration in Xpert**
21
- - Via Xpert Console: Create a MinerU integration and fill in the following fields.
22
- - Or set environment variables in your deployment environment:
23
- - `MINERU_API_BASE_URL`: Optional, defaults to `https://mineru.net/api/v4`.
24
- - `MINERU_API_TOKEN`: Required, used as a fallback credential if no integration is configured.
25
-
26
- Example integration configuration (JSON):
27
-
28
- ```json
29
- {
30
- "provider": "mineru",
31
- "options": {
32
- "apiUrl": "https://mineru.net/api/v4",
33
- "apiKey": "your-mineru-api-key"
34
- }
35
- }
36
- ```
37
-
38
- 3. **Register the Plugin**
39
- Configure the plugin in your host service's plugin registration process:
40
-
41
- ```sh .env
42
- PLUGINS=@xpert-ai/plugin-mineru
43
- ```
44
-
45
- The plugin returns the NestJS module `MinerUPlugin` in the `register` hook and logs messages during the `onStart`/`onStop` lifecycle.
46
-
47
- ## MinerU Integration Options
48
-
49
- | Field | Type | Description | Required | Default |
50
- | -------- | ------ | ------------------------------------- | -------- | ---------------------------- |
51
- | apiUrl | string | MinerU API base URL | No | `https://mineru.net/api/v4` |
52
- | apiKey | string | MinerU service API Key (keep secret) | Yes | — |
53
-
54
- > If both integration configuration and environment variables are provided, options from the integration configuration take precedence.
55
-
56
- ## Document Conversion Parameters
57
-
58
- `MinerUTransformerStrategy` supports the following configuration options (passed to the MinerU API when starting a workflow):
59
-
60
- | Field | Type | Default | Description |
61
- | ---------------- | ------- | ------------ | --------------------------------------------------- |
62
- | `isOcr` | boolean | `true` | Enable OCR for image-based PDFs. |
63
- | `enableFormula` | boolean | `true` | Recognize mathematical formulas and output tags. |
64
- | `enableTable` | boolean | `true` | Recognize tables and output structured tags. |
65
- | `language` | string | `"ch"` | Main document language, per MinerU API (`en`/`ch`). |
66
- | `modelVersion` | string | `"pipeline"` | MinerU model version (`pipeline`, `vlm`, etc.). |
67
-
68
- By default, the plugin creates MinerU tasks for each file to be processed, polls until `full_zip_url` is returned, then downloads and parses the zip package in memory.
69
-
70
- ## Permissions
71
-
72
- - **Integration**: Access MinerU integration configuration to read API address and credentials.
73
- - **File System**: Perform `read/write/list` on `XpFileSystem` to store image resources from MinerU results.
74
-
75
- Ensure the plugin is granted these permissions in your authorization policy, or it will not be able to retrieve results or write attachments.
76
-
77
- ## Output Content
78
-
79
- The parser generates:
80
-
81
- - Full Markdown: Resource links are automatically replaced to point to actual URLs written via `XpFileSystem`.
82
- - Structured metadata: Includes MinerU task ID, layout JSON (`layout.json`), content list (`content_list.json`), original PDF filename, etc.
83
- - Attachment asset list: Records written image resources for easy association by callers.
84
-
85
- The returned `Document<ChunkMetadata>` array currently defaults to a single chunk containing the full Markdown; you can split it as needed.
86
-
87
- ## Development & Debugging
88
-
89
- Run the following commands in the repository root to build and test locally:
90
-
91
- ```bash
92
- npm install
93
- npx nx build @xpert-ai/plugin-mineru
94
- npx nx test @xpert-ai/plugin-mineru
95
- ```
96
-
97
- TypeScript build artifacts are output to `packages/mineru/dist`. Before publishing, ensure `package.json`, type declarations, and runtime files are in sync.
98
-
99
- ## License
100
-
101
- This project follows the [AGPL-3.0 License](../../../LICENSE) in the repository root.
1
+ # Xpert Plugin: MinerU
2
+
3
+ `@xpert-ai/plugin-mineru` is a MinerU document converter plugin for the [Xpert AI](https://github.com/xpert-ai/xpert) platform, providing extraction capabilities from PDF to Markdown and structured JSON. The plugin includes built-in MinerU integration strategies, document conversion strategies, and result parsing services, enabling secure access to the MinerU API in automated workflows, polling task status, and writing parsed content and attachment resources to the platform file system.
4
+
5
+ ## Installation
6
+
7
+ ```bash
8
+ pnpm add @xpert-ai/plugin-mineru
9
+ # or
10
+ npm install @xpert-ai/plugin-mineru
11
+ ```
12
+
13
+ > **Note**: This plugin depends on `@xpert-ai/plugin-sdk`, `@nestjs/common@^11`, `@nestjs/config@^4`, `@metad/contracts`, `axios@1`, `chalk@4`, `@langchain/core@^0.3.72`, and `uuid@8` as peerDependencies. Please ensure these packages are installed in your host project.
14
+
15
+ ## Quick Start
16
+
17
+ 1. **Prepare MinerU Credentials**
18
+ Obtain a valid API Key from the MinerU dashboard and confirm the service address (default: `https://mineru.net/api/v4`).
19
+
20
+ 2. **Configure Integration in Xpert**
21
+ - Via Xpert Console: Create a MinerU integration and fill in the following fields.
22
+ - Or set environment variables in your deployment environment:
23
+ - `MINERU_API_BASE_URL`: Optional, defaults to `https://mineru.net/api/v4`.
24
+ - `MINERU_API_TOKEN`: Required, used as a fallback credential if no integration is configured.
25
+
26
+ Example integration configuration (JSON):
27
+
28
+ ```json
29
+ {
30
+ "provider": "mineru",
31
+ "options": {
32
+ "apiUrl": "https://mineru.net/api/v4",
33
+ "apiKey": "your-mineru-api-key"
34
+ }
35
+ }
36
+ ```
37
+
38
+ 3. **Register the Plugin**
39
+ Configure the plugin in your host service's plugin registration process:
40
+
41
+ ```sh .env
42
+ PLUGINS=@xpert-ai/plugin-mineru
43
+ ```
44
+
45
+ The plugin returns the NestJS module `MinerUPlugin` in the `register` hook and logs messages during the `onStart`/`onStop` lifecycle.
46
+
47
+ ## MinerU Integration Options
48
+
49
+ | Field | Type | Description | Required | Default |
50
+ | -------- | ------ | ------------------------------------- | -------- | ---------------------------- |
51
+ | apiUrl | string | MinerU API base URL | No | `https://mineru.net/api/v4` |
52
+ | apiKey | string | MinerU service API Key (keep secret) | Yes | — |
53
+
54
+ > If both integration configuration and environment variables are provided, options from the integration configuration take precedence.
55
+
56
+ ## Document Conversion Parameters
57
+
58
+ `MinerUTransformerStrategy` supports the following configuration options (passed to the MinerU API when starting a workflow):
59
+
60
+ | Field | Type | Default | Description |
61
+ | ---------------- | ------- | ------------ | --------------------------------------------------- |
62
+ | `isOcr` | boolean | `true` | Enable OCR for image-based PDFs. |
63
+ | `enableFormula` | boolean | `true` | Recognize mathematical formulas and output tags. |
64
+ | `enableTable` | boolean | `true` | Recognize tables and output structured tags. |
65
+ | `language` | string | `"ch"` | Main document language, per MinerU API (`en`/`ch`). |
66
+ | `modelVersion` | string | `"pipeline"` | MinerU model version (`pipeline`, `vlm`, etc.). |
67
+
68
+ By default, the plugin creates MinerU tasks for each file to be processed, polls until `full_zip_url` is returned, then downloads and parses the zip package in memory.
69
+
70
+ ## Permissions
71
+
72
+ - **Integration**: Access MinerU integration configuration to read API address and credentials.
73
+ - **File System**: Perform `read/write/list` on `XpFileSystem` to store image resources from MinerU results.
74
+
75
+ Ensure the plugin is granted these permissions in your authorization policy, or it will not be able to retrieve results or write attachments.
76
+
77
+ ## Output Content
78
+
79
+ The parser generates:
80
+
81
+ - Full Markdown: Resource links are automatically replaced to point to actual URLs written via `XpFileSystem`.
82
+ - Structured metadata: Includes MinerU task ID, layout JSON (`layout.json`), content list (`content_list.json`), original PDF filename, etc.
83
+ - Attachment asset list: Records written image resources for easy association by callers.
84
+
85
+ The returned `Document<ChunkMetadata>` array currently defaults to a single chunk containing the full Markdown; you can split it as needed.
86
+
87
+ ## Development & Debugging
88
+
89
+ Run the following commands in the repository root to build and test locally:
90
+
91
+ ```bash
92
+ npm install
93
+ npx nx build @xpert-ai/plugin-mineru
94
+ npx nx test @xpert-ai/plugin-mineru
95
+ ```
96
+
97
+ TypeScript build artifacts are output to `packages/mineru/dist`. Before publishing, ensure `package.json`, type declarations, and runtime files are in sync.
98
+
99
+ ## License
100
+
101
+ This project follows the [AGPL-3.0 License](../../../LICENSE) in the repository root.
@@ -1 +1 @@
1
- {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AACxB,OAAO,KAAK,EAAE,WAAW,EAAE,MAAM,sBAAsB,CAAC;AAexD,QAAA,MAAM,YAAY,gDAChB,CAAC;AAEH,QAAA,MAAM,MAAM,EAAE,WAAW,CAAC,CAAC,CAAC,KAAK,CAAC,OAAO,YAAY,CAAC,CA4BrD,CAAC;AAEF,eAAe,MAAM,CAAC"}
1
+ {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AACxB,OAAO,KAAK,EAAE,WAAW,EAAE,MAAM,sBAAsB,CAAC;AAcxD,QAAA,MAAM,YAAY,gDAChB,CAAC;AAEH,QAAA,MAAM,MAAM,EAAE,WAAW,CAAC,CAAC,CAAC,KAAK,CAAC,OAAO,YAAY,CAAC,CA4BrD,CAAC;AAEF,eAAe,MAAM,CAAC"}
package/dist/index.js CHANGED
@@ -1,25 +1,24 @@
  import { z } from 'zod';
  import { readFileSync } from 'fs';
- import { fileURLToPath } from 'url';
- import { dirname, join } from 'path';
+ import { join } from 'path';
  import { MinerUPlugin } from './lib/mineru.plugin.js';
  import { icon } from './lib/types.js';
- const __filename = fileURLToPath(import.meta.url);
- const dir_name = dirname(__filename);
- const packageJson = JSON.parse(readFileSync(join(dir_name, '../package.json'), 'utf8'));
+ import { getModuleMeta } from './lib/path-meta.js';
+ const { __filename, __dirname } = getModuleMeta(import.meta);
+ const packageJson = JSON.parse(readFileSync(join(__dirname, '../package.json'), 'utf8'));
  const ConfigSchema = z.object({});
  const plugin = {
      meta: {
          name: packageJson.name,
          version: packageJson.version,
-         category: 'tools',
+         category: 'set',
          icon: {
              type: 'svg',
              value: icon
          },
          displayName: 'MinerU Transformer',
-         description: 'Provide PDF to Markdown and JSON transformation functionality',
-         keywords: ['integration', 'pdf', 'markdown', 'json', 'transformer'],
+         description: 'Provide document to Markdown and JSON transformation functionality',
+         keywords: ['integration', 'document', 'pdf', 'docx', 'ppt', 'image', 'markdown', 'json', 'transformer'],
          author: 'XpertAI Team',
          homepage: 'https://www.npmjs.com/package/@xpert-ai/plugin-mineru',
      },
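The hunk above replaces the inline `fileURLToPath`/`dirname` calls with a `getModuleMeta` helper from `./lib/path-meta.js`, whose source is not part of this diff. A minimal sketch of what such a helper might look like, assuming it only derives a file path and directory from an `import.meta`-like object that carries a `file://` URL (the body below is an illustration, not the package's actual implementation):

```javascript
const { fileURLToPath, pathToFileURL } = require('url');
const { dirname, join } = require('path');
const os = require('os');

// Hypothetical stand-in for the helper imported from './lib/path-meta.js'.
// `meta` is assumed to look like `import.meta`, i.e. to expose a file:// `url`.
function getModuleMeta(meta) {
  const filename = fileURLToPath(meta.url);
  return { __filename: filename, __dirname: dirname(filename) };
}

// Usage with a fake `import.meta` object pointing at an illustrative path.
const fakeModulePath = join(os.tmpdir(), 'pkg', 'dist', 'index.js');
const moduleMeta = getModuleMeta({ url: pathToFileURL(fakeModulePath).href });
console.log(moduleMeta.__dirname);
```

Returning both values from one call keeps the ESM entry point free of repeated path boilerplate, which matches how the new `index.js` destructures the result.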
@@ -1 +1 @@
1
- {"version":3,"file":"integration.strategy.d.ts","sourceRoot":"","sources":["../../src/lib/integration.strategy.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,KAAK,YAAY,EAAE,oBAAoB,EAAE,MAAM,kBAAkB,CAAC;AAO3E,OAAO,EACL,mBAAmB,EAGnB,0BAA0B,EAC3B,MAAM,sBAAsB,CAAC;AAE9B,OAAO,EAA2B,wBAAwB,EAAE,MAAM,YAAY,CAAC;AAE/E,qBAEa,yBACX,YAAW,mBAAmB,CAAC,wBAAwB,CAAC;IAExD,QAAQ,CAAC,IAAI,EAAE,oBAAoB,CAsEjC;IAGF,OAAO,CAAC,QAAQ,CAAC,aAAa,CAAgB;IAExC,OAAO,CACX,WAAW,EAAE,YAAY,CAAC,wBAAwB,CAAC,EACnD,OAAO,EAAE,0BAA0B,GAClC,OAAO,CAAC,GAAG,CAAC;IAIT,cAAc,CAAC,MAAM,EAAE,wBAAwB,GAAG,OAAO,CAAC,IAAI,CAAC;CA2BtE"}
1
+ {"version":3,"file":"integration.strategy.d.ts","sourceRoot":"","sources":["../../src/lib/integration.strategy.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,KAAK,YAAY,EAAE,oBAAoB,EAAE,MAAM,kBAAkB,CAAC;AAO3E,OAAO,EACL,mBAAmB,EAGnB,0BAA0B,EAC3B,MAAM,sBAAsB,CAAC;AAE9B,OAAO,EAAgB,wBAAwB,EAAE,MAAM,YAAY,CAAC;AAEpE,qBAEa,yBACX,YAAW,mBAAmB,CAAC,wBAAwB,CAAC;IAExD,QAAQ,CAAC,IAAI,EAAE,oBAAoB,CAuFjC;IAGF,OAAO,CAAC,QAAQ,CAAC,aAAa,CAAgB;IAExC,OAAO,CACX,WAAW,EAAE,YAAY,CAAC,wBAAwB,CAAC,EACnD,OAAO,EAAE,0BAA0B,GAClC,OAAO,CAAC,GAAG,CAAC;IAIT,cAAc,CAAC,MAAM,EAAE,wBAAwB,GAAG,OAAO,CAAC,IAAI,CAAC;CA2BtE"}
@@ -3,11 +3,11 @@ import { forwardRef, Inject, Injectable, } from '@nestjs/common';
  import { ConfigService } from '@nestjs/config';
  import { IntegrationStrategyKey, } from '@xpert-ai/plugin-sdk';
  import { MinerUClient } from './mineru.client.js';
- import { icon, MinerUIntegration } from './types.js';
+ import { icon, MinerU } from './types.js';
  let MinerUIntegrationStrategy = class MinerUIntegrationStrategy {
      constructor() {
          this.meta = {
-             name: MinerUIntegration,
+             name: MinerU,
              label: {
                  en_US: 'MinerU',
              },
@@ -68,6 +68,21 @@ let MinerUIntegrationStrategy = class MinerUIntegrationStrategy {
                          enum: ['official', 'self-hosted'],
                          default: 'official',
                      },
+                     extraFormats: {
+                         type: 'array',
+                         title: {
+                             en_US: 'Extra Formats',
+                             zh_Hans: '额外输出格式',
+                         },
+                         description: {
+                             en_US: 'Optional extra output formats (docx, html, latex). Markdown and JSON are always included.',
+                             zh_Hans: '可选额外输出格式(docx、html、latex)。Markdown 和 JSON 默认包含。',
+                         },
+                         items: {
+                             type: 'string',
+                             enum: ['docx', 'html', 'latex'],
+                         },
+                     },
                  },
              },
              features: [],
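The new `extraFormats` option is declared as an array of strings restricted to `docx`, `html`, and `latex`. How the plugin validates the value at runtime is not visible in this diff, so the following is only a sketch of one way a caller could enforce the same constraint; `normalizeExtraFormats` is a hypothetical name, not a function exported by the package:

```javascript
// Allowed values copied from the `enum` in the schema hunk above.
const ALLOWED_EXTRA_FORMATS = ['docx', 'html', 'latex'];

// Hypothetical helper: returns a de-duplicated list of extra formats,
// or throws when the value does not match the declared schema.
function normalizeExtraFormats(value) {
  if (value === undefined || value === null) {
    return []; // Option omitted: only the default Markdown and JSON outputs apply.
  }
  if (!Array.isArray(value)) {
    throw new TypeError('extraFormats must be an array');
  }
  const unknown = value.filter((format) => !ALLOWED_EXTRA_FORMATS.includes(format));
  if (unknown.length > 0) {
    throw new RangeError(`Unsupported extra formats: ${unknown.join(', ')}`);
  }
  return [...new Set(value)];
}

console.log(normalizeExtraFormats(['docx', 'docx', 'html']));
```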
@@ -80,7 +95,7 @@ let MinerUIntegrationStrategy = class MinerUIntegrationStrategy {
      async validateConfig(config) {
          const mineruClient = new MinerUClient(this.configService, {
              integration: {
-                 provider: MinerUIntegration,
+                 provider: MinerU,
                  options: config,
              },
          });
@@ -113,6 +128,6 @@ __decorate([
  ], MinerUIntegrationStrategy.prototype, "configService", void 0);
  MinerUIntegrationStrategy = __decorate([
      Injectable(),
-     IntegrationStrategyKey(MinerUIntegration)
+     IntegrationStrategyKey(MinerU)
  ], MinerUIntegrationStrategy);
  export { MinerUIntegrationStrategy };
@@ -1 +1 @@
1
- {"version":3,"file":"mineru.client.d.ts","sourceRoot":"","sources":["../../src/lib/mineru.client.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,YAAY,EAAE,MAAM,kBAAkB,CAAC;AAEhD,OAAO,EAAE,aAAa,EAAE,MAAM,gBAAgB,CAAC;AAC/C,OAAO,EAAmB,YAAY,EAAE,MAAM,sBAAsB,CAAC;AACrE,OAAc,EAAE,aAAa,EAAE,MAAM,OAAO,CAAC;AAK7C,OAAO,EAIL,wBAAwB,EAExB,0BAA0B,EAC1B,gBAAgB,EACjB,MAAM,YAAY,CAAC;AAIpB,UAAU,iBAAiB;IACzB,GAAG,CAAC,EAAE,MAAM,CAAC;IACb,QAAQ,CAAC,EAAE,MAAM,CAAC;IAClB,QAAQ,CAAC,EAAE,MAAM,CAAC;IAClB,KAAK,CAAC,EAAE,OAAO,CAAC;IAChB,aAAa,CAAC,EAAE,OAAO,CAAC;IACxB,WAAW,CAAC,EAAE,OAAO,CAAC;IACtB,QAAQ,CAAC,EAAE,MAAM,CAAC;IAClB,YAAY,CAAC,EAAE,MAAM,CAAC;IACtB,MAAM,CAAC,EAAE,MAAM,CAAC;IAChB,UAAU,CAAC,EAAE,MAAM,CAAC;IACpB,YAAY,CAAC,EAAE,MAAM,EAAE,CAAC;IACxB,WAAW,CAAC,EAAE,MAAM,CAAC;IACrB,IAAI,CAAC,EAAE,MAAM,CAAC;IACd,mEAAmE;IACnE,WAAW,CAAC,EAAE,MAAM,CAAC;IACrB,yEAAyE;IACzE,OAAO,CAAC,EAAE,MAAM,CAAC;IACjB,2EAA2E;IAC3E,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,4EAA4E;IAC5E,gBAAgB,CAAC,EAAE,OAAO,CAAC;CAC5B;AAED,UAAU,mBAAmB;IAC3B,GAAG,EAAE,MAAM,CAAC;IACZ,KAAK,CAAC,EAAE,OAAO,CAAC;IAChB,MAAM,CAAC,EAAE,MAAM,CAAC;IAChB,UAAU,CAAC,EAAE,MAAM,CAAC;CACrB;AAED,UAAU,sBAAsB;IAC9B,KAAK,EAAE,mBAAmB,EAAE,CAAC;IAC7B,aAAa,CAAC,EAAE,OAAO,CAAC;IACxB,WAAW,CAAC,EAAE,OAAO,CAAC;IACtB,QAAQ,CAAC,EAAE,MAAM,CAAC;IAClB,YAAY,CAAC,EAAE,MAAM,CAAC;IACtB,YAAY,CAAC,EAAE,MAAM,EAAE,CAAC;IACxB,WAAW,CAAC,EAAE,MAAM,CAAC;IACrB,IAAI,CAAC,EAAE,MAAM,CAAC;CACf;AAED,UAAU,iBAAiB;IACzB,aAAa,CAAC,EAAE,OAAO,CAAC;IACxB,WAAW,CAAC,EAAE,OAAO,CAAC;IACtB,QAAQ,CAAC,EAAE,MAAM,CAAC;CACnB;AASD,qBAAa,YAAY;IAWrB,OAAO,CAAC,QAAQ,CAAC,aAAa;IAC9B,OAAO,CAAC,QAAQ,CAAC,WAAW,CAAC;IAX/B,OAAO,CAAC,QAAQ,CAAC,MAAM,CAAiC;IACxD,OAAO,CAAC,QAAQ,CAAC,OAAO,CAAS;IACjC,OAAO,CAAC,QAAQ,CAAC,KAAK,CAAC,CAAS;IAChC,SAAgB,UAAU,EAAE,gBAAgB,CAAC;IAC7C,OAAO,CAAC,QAAQ,CAAC,UAAU,CAAiD;IAE5E,IAAI,UAAU,IAAI,YAAY,GAAG,SAAS,CAEzC;gBAEkB,aAAa,EAAE,aAAa,EAC5B,WAAW,CAAC,EAAE;QACvB,UAAU,CAAC,EAAE,YAAY,CAAC;QAC1B,WAAW,CAAC,EAAE,OAAO,CAAC,YAAY,CAAC,wBAAwB,CAAC,CAAC,CAAC;KACjE;IAkBP;;;OAGG;IACG,UAAU,CAAC,OAAO,EAAE,
iBAAiB,GAAG,OAAO,CAAC;QAAE,MAAM,EAAE,MAAM,CAAA;KAAE,CAAC;IAYzE;;OAEG;IACG,eAAe,CAAC,OAAO,EAAE,sBAAsB,GAAG,OAAO,CAAC;QAAE,OAAO,EAAE,MAAM,CAAC;QAAC,QAAQ,CAAC,EAAE,MAAM,EAAE,CAAA;KAAE,CAAC;IAmCzG,iBAAiB,CAAC,MAAM,EAAE,MAAM,GAAG,0BAA0B,GAAG,SAAS;IAOzE;;OAEG;IACG,aAAa,CAAC,MAAM,EAAE,MAAM,EAAE,OAAO,CAAC,EAAE,iBAAiB,GAAG,OAAO,CAAC;QACxE,YAAY,CAAC,EAAE,MAAM,CAAC;QACtB,QAAQ,CAAC,EAAE,MAAM,CAAC;QAClB,OAAO,CAAC,EAAE,MAAM,CAAC;QACjB,MAAM,CAAC,EAAE,MAAM,CAAC;KACjB,CAAC;IAoBF;;OAEG;IACG,cAAc,CAAC,OAAO,EAAE,MAAM,GAAG,OAAO,CAAC,GAAG,CAAC;IAiBnD;;OAEG;IACG,WAAW,CAAC,MAAM,EAAE,MAAM,EAAE,SAAS,SAAgB,EAAE,UAAU,SAAO,GAAG,OAAO,CAAC,GAAG,CAAC;IAsB7F,OAAO,CAAC,cAAc;IAMtB,OAAO,CAAC,iBAAiB;IAczB,OAAO,CAAC,kBAAkB;IAyB1B,OAAO,CAAC,sBAAsB;IAI9B,OAAO,CAAC,gBAAgB;IAIxB,OAAO,CAAC,WAAW;IAQnB,OAAO,CAAC,kBAAkB;IAO1B,OAAO,CAAC,oBAAoB;YAYd,kBAAkB;YA4BlB,oBAAoB;YASpB,qBAAqB;YA0DrB,uBAAuB;IA+CrC,OAAO,CAAC,iBAAiB;IAgBzB,OAAO,CAAC,2BAA2B;IAenC,OAAO,CAAC,6BAA6B;IAcrC,OAAO,CAAC,iBAAiB;IAQzB,OAAO,CAAC,aAAa;IAcrB,OAAO,CAAC,iBAAiB;IAQzB,OAAO,CAAC,eAAe;YAIT,YAAY;IAkB1B,OAAO,CAAC,eAAe;IA0BvB,wBAAwB,IAAI,OAAO,CAAC,aAAa,CAAC,GAAG,EAAE,GAAG,CAAC,CAAC;IAKtD,wBAAwB;CAU/B"}
1
+ {"version":3,"file":"mineru.client.d.ts","sourceRoot":"","sources":["../../src/lib/mineru.client.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,YAAY,EAAE,MAAM,kBAAkB,CAAC;AAEhD,OAAO,EAAE,aAAa,EAAE,MAAM,gBAAgB,CAAC;AAC/C,OAAO,EAAmB,YAAY,EAAE,MAAM,sBAAsB,CAAC;AACrE,OAAc,EAAE,aAAa,EAAE,MAAM,OAAO,CAAC;AAK7C,OAAO,EAIL,wBAAwB,EAExB,0BAA0B,EAC1B,gBAAgB,EACjB,MAAM,YAAY,CAAC;AAIpB,UAAU,iBAAiB;IACzB,GAAG,CAAC,EAAE,MAAM,CAAC;IACb,QAAQ,CAAC,EAAE,MAAM,CAAC;IAClB,QAAQ,CAAC,EAAE,MAAM,CAAC;IAClB,KAAK,CAAC,EAAE,OAAO,CAAC;IAChB,aAAa,CAAC,EAAE,OAAO,CAAC;IACxB,WAAW,CAAC,EAAE,OAAO,CAAC;IACtB,QAAQ,CAAC,EAAE,MAAM,CAAC;IAClB,YAAY,CAAC,EAAE,MAAM,CAAC;IACtB,MAAM,CAAC,EAAE,MAAM,CAAC;IAChB,UAAU,CAAC,EAAE,MAAM,CAAC;IACpB,YAAY,CAAC,EAAE,MAAM,EAAE,CAAC;IACxB,WAAW,CAAC,EAAE,MAAM,CAAC;IACrB,IAAI,CAAC,EAAE,MAAM,CAAC;IACd,mEAAmE;IACnE,WAAW,CAAC,EAAE,MAAM,CAAC;IACrB,yEAAyE;IACzE,OAAO,CAAC,EAAE,MAAM,CAAC;IACjB,2EAA2E;IAC3E,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,4EAA4E;IAC5E,gBAAgB,CAAC,EAAE,OAAO,CAAC;CAC5B;AAED,UAAU,mBAAmB;IAC3B,GAAG,EAAE,MAAM,CAAC;IACZ,KAAK,CAAC,EAAE,OAAO,CAAC;IAChB,MAAM,CAAC,EAAE,MAAM,CAAC;IAChB,UAAU,CAAC,EAAE,MAAM,CAAC;CACrB;AAED,UAAU,sBAAsB;IAC9B,KAAK,EAAE,mBAAmB,EAAE,CAAC;IAC7B,aAAa,CAAC,EAAE,OAAO,CAAC;IACxB,WAAW,CAAC,EAAE,OAAO,CAAC;IACtB,QAAQ,CAAC,EAAE,MAAM,CAAC;IAClB,YAAY,CAAC,EAAE,MAAM,CAAC;IACtB,YAAY,CAAC,EAAE,MAAM,EAAE,CAAC;IACxB,WAAW,CAAC,EAAE,MAAM,CAAC;IACrB,IAAI,CAAC,EAAE,MAAM,CAAC;CACf;AAED,UAAU,iBAAiB;IACzB,aAAa,CAAC,EAAE,OAAO,CAAC;IACxB,WAAW,CAAC,EAAE,OAAO,CAAC;IACtB,QAAQ,CAAC,EAAE,MAAM,CAAC;CACnB;AASD,qBAAa,YAAY;IAWrB,OAAO,CAAC,QAAQ,CAAC,aAAa;IAC9B,OAAO,CAAC,QAAQ,CAAC,WAAW,CAAC;IAX/B,OAAO,CAAC,QAAQ,CAAC,MAAM,CAAiC;IACxD,OAAO,CAAC,QAAQ,CAAC,OAAO,CAAS;IACjC,OAAO,CAAC,QAAQ,CAAC,KAAK,CAAC,CAAS;IAChC,SAAgB,UAAU,EAAE,gBAAgB,CAAC;IAC7C,OAAO,CAAC,QAAQ,CAAC,UAAU,CAAiD;IAE5E,IAAI,UAAU,IAAI,YAAY,GAAG,SAAS,CAEzC;gBAEkB,aAAa,EAAE,aAAa,EAC5B,WAAW,CAAC,EAAE;QACvB,UAAU,CAAC,EAAE,YAAY,CAAC;QAC1B,WAAW,CAAC,EAAE,OAAO,CAAC,YAAY,CAAC,wBAAwB,CAAC,CAAC,CAAC;KACjE;IAkBP;;;OAGG;IACG,UAAU,CAAC,OAAO,EAAE,
iBAAiB,GAAG,OAAO,CAAC;QAAE,MAAM,EAAE,MAAM,CAAA;KAAE,CAAC;IAYzE;;OAEG;IACG,eAAe,CAAC,OAAO,EAAE,sBAAsB,GAAG,OAAO,CAAC;QAAE,OAAO,EAAE,MAAM,CAAC;QAAC,QAAQ,CAAC,EAAE,MAAM,EAAE,CAAA;KAAE,CAAC;IAmCzG,iBAAiB,CAAC,MAAM,EAAE,MAAM,GAAG,0BAA0B,GAAG,SAAS;IAOzE;;OAEG;IACG,aAAa,CAAC,MAAM,EAAE,MAAM,EAAE,OAAO,CAAC,EAAE,iBAAiB,GAAG,OAAO,CAAC;QACxE,YAAY,CAAC,EAAE,MAAM,CAAC;QACtB,QAAQ,CAAC,EAAE,MAAM,CAAC;QAClB,OAAO,CAAC,EAAE,MAAM,CAAC;QACjB,MAAM,CAAC,EAAE,MAAM,CAAC;KACjB,CAAC;IAoBF;;OAEG;IACG,cAAc,CAAC,OAAO,EAAE,MAAM,GAAG,OAAO,CAAC,GAAG,CAAC;IAiBnD;;OAEG;IACG,WAAW,CAAC,MAAM,EAAE,MAAM,EAAE,SAAS,SAAgB,EAAE,UAAU,SAAO,GAAG,OAAO,CAAC,GAAG,CAAC;IAsB7F,OAAO,CAAC,cAAc;IAMtB,OAAO,CAAC,iBAAiB;IAczB,OAAO,CAAC,kBAAkB;IAiC1B,OAAO,CAAC,sBAAsB;IAI9B,OAAO,CAAC,gBAAgB;IAIxB,OAAO,CAAC,WAAW;IAQnB,OAAO,CAAC,kBAAkB;IAO1B,OAAO,CAAC,oBAAoB;YAYd,kBAAkB;YA4BlB,oBAAoB;YAwIpB,qBAAqB;YAyFrB,uBAAuB;IAsDrC,OAAO,CAAC,iBAAiB;IAgBzB,OAAO,CAAC,2BAA2B;IAenC,OAAO,CAAC,6BAA6B;IAcrC,OAAO,CAAC,iBAAiB;IAQzB,OAAO,CAAC,aAAa;IAcrB,OAAO,CAAC,iBAAiB;IAQzB,OAAO,CAAC,eAAe;YAIT,YAAY;IAkB1B,OAAO,CAAC,eAAe;IA0BvB,wBAAwB,IAAI,OAAO,CAAC,aAAa,CAAC,GAAG,EAAE,GAAG,CAAC,CAAC;IAKtD,wBAAwB;CAU/B"}
@@ -3,7 +3,7 @@ import { getErrorMessage } from '@xpert-ai/plugin-sdk';
  import axios from 'axios';
  import FormData from 'form-data';
  import { randomUUID } from 'crypto';
- import { basename } from 'path';
+ import { basename, isAbsolute, join as pathJoin } from 'path';
  import fs from 'fs';
  import { ENV_MINERU_API_BASE_URL, ENV_MINERU_API_TOKEN, ENV_MINERU_SERVER_TYPE, } from './types.js';
  const DEFAULT_OFFICIAL_BASE_URL = 'https://mineru.net/api/v4';
@@ -182,8 +182,13 @@ export class MinerUClient {
          const tokenFromEnv = this.configService.get(tokenEnvKey);
          const baseUrl = baseUrlFromIntegration ||
              baseUrlFromEnv ||
-             (this.serverType === 'official' ? DEFAULT_OFFICIAL_BASE_URL : null);
+             (this.serverType === 'official' ? DEFAULT_OFFICIAL_BASE_URL : undefined);
          const token = tokenFromIntegration || tokenFromEnv;
+         // Validate baseUrl is provided for self-hosted mode
+         if (this.serverType === 'self-hosted' && !baseUrl) {
+             throw new Error('MinerU self-hosted mode requires apiUrl to be configured in integration options or ' +
+                 `${ENV_MINERU_API_BASE_URL} environment variable`);
+         }
          return { baseUrl, token };
      }
      readIntegrationOptions(integration) {
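The hunk above resolves the base URL with a fixed precedence (integration options, then environment, then the official default) and now fails fast when self-hosted mode has no URL. A standalone sketch of that precedence follows. The function name and argument shape are assumptions made for illustration; the option names `apiUrl`/`apiKey` and the variables `MINERU_API_BASE_URL`/`MINERU_API_TOKEN` are taken from the README in this diff.

```javascript
const DEFAULT_OFFICIAL_BASE_URL = 'https://mineru.net/api/v4';

// Hypothetical, simplified version of the credential resolution shown above.
// `integration` mimics the integration options, `env` mimics process.env.
function resolveCredentials({ serverType, integration = {}, env = {} }) {
  const baseUrl =
    integration.apiUrl ||
    env.MINERU_API_BASE_URL ||
    (serverType === 'official' ? DEFAULT_OFFICIAL_BASE_URL : undefined);
  const token = integration.apiKey || env.MINERU_API_TOKEN;

  // Self-hosted deployments have no sensible default URL, so fail early.
  if (serverType === 'self-hosted' && !baseUrl) {
    throw new Error(
      'MinerU self-hosted mode requires apiUrl to be configured in integration options or ' +
        'MINERU_API_BASE_URL environment variable'
    );
  }
  return { baseUrl, token };
}

console.log(resolveCredentials({ serverType: 'official', env: { MINERU_API_TOKEN: 'token-from-env' } }));
```

Failing at resolution time surfaces a misconfigured self-hosted setup before any request is built, rather than producing a request against an undefined URL.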
@@ -251,18 +256,141 @@ export class MinerUClient {
          }
      }
      async createSelfHostedTask(options) {
-         const filePath = this.fileSystem.fullPath(options.filePath);
+         // Validate fileSystem is available for self-hosted mode
+         if (!this.fileSystem) {
+             throw new Error('MinerU self-hosted mode requires fileSystem permission');
+         }
+         // Validate filePath is provided
+         if (!options.filePath) {
+             throw new Error('MinerU self-hosted mode requires filePath to be provided');
+         }
+         // Resolve absolute file path
+         // Log original filePath for debugging
+         const basePath = this.fileSystem ? this.fileSystem.basePath : 'N/A';
+         this.logger.debug(`Resolving file path. Original filePath: ${options.filePath}, basePath: ${basePath}`);
+         // Check if filePath is already an absolute path
+         const isAbsolutePath = isAbsolute(options.filePath);
+         // Also check if it looks like a full path even without leading slash
+         const looksLikeFullPath = !isAbsolutePath && (options.filePath.startsWith('Users/') ||
+             options.filePath.startsWith('home/'));
+         let filePath;
+         if (isAbsolutePath) {
+             // Use absolute path directly
+             filePath = options.filePath;
+             this.logger.debug(`Using absolute path directly: ${filePath}`);
+         }
+         else if (looksLikeFullPath) {
+             // If it looks like a full path but doesn't start with /, add it
+             filePath = options.filePath.startsWith('/') ? options.filePath : '/' + options.filePath;
+             this.logger.debug(`Detected full path pattern, normalized to: ${filePath}`);
+         }
+         else {
+             // Use xpFileSystem.fullPath() to resolve relative path to absolute path
+             filePath = this.fileSystem.fullPath(options.filePath);
+             this.logger.debug(`Resolved relative path using basePath: ${filePath}`);
+         }
+         // Validate file exists and is readable before attempting to parse
+         try {
+             await fs.promises.access(filePath, fs.constants.F_OK | fs.constants.R_OK);
+             const stats = await fs.promises.stat(filePath);
+             this.logger.debug(`Processing file: ${filePath}, size: ${stats.size} bytes`);
+             if (stats.size === 0) {
+                 throw new Error(`File is empty: ${filePath}`);
+             }
+         }
+         catch (error) {
+             // If file not found in the resolved path, try to find it in common alternative locations
+             // This handles two scenarios:
+             // 1. StorageFile: files/{tenantId}/filename -> apps/api/public/files/{tenantId}/filename (already tried above)
+             // 2. VolumeClient: folder/filename or filename -> ~/data/folder/filename or ~/data/filename
+             if (error instanceof Error && error.code === 'ENOENT') {
+                 const homeDir = process.env.HOME || process.env.USERPROFILE;
+                 const originalFilePath = options.filePath;
+                 const fileName = basename(originalFilePath);
+                 // Build alternative paths for VolumeClient storage
+                 const alternativePaths = [];
+                 // If original path contains directory separators, try both full path and just filename
+                 if (originalFilePath.includes('/') || originalFilePath.includes('\\')) {
+                     // Try full path in ~/data/
+                     alternativePaths.push(pathJoin(homeDir || '', 'data', originalFilePath));
+                     // Try just filename in ~/data/ (for VolumeClient files stored directly in root)
+                     alternativePaths.push(pathJoin(homeDir || '', 'data', fileName));
+                 }
+                 else {
+                     // If original path is just a filename, try in ~/data/ root
+                     alternativePaths.push(pathJoin(homeDir || '', 'data', originalFilePath));
+                 }
+                 // Also try in knowledge base specific paths if we can determine knowledgebaseId
+                 // Note: We don't have direct access to knowledgebaseId here, but files might be in knowledges subdirectory
+                 const resolvedPath = this.fileSystem.fullPath(originalFilePath);
+                 if (resolvedPath.includes('apps/api/public')) {
+                     // This looks like a StorageFile path, but file not found
+                     // Try VolumeClient paths as fallback
+                     this.logger.debug(`File not found in StorageFile path, trying VolumeClient paths...`);
+                 }
+                 let foundPath = null;
+                 for (const altPath of alternativePaths) {
+                     try {
+                         await fs.promises.access(altPath, fs.constants.F_OK | fs.constants.R_OK);
+                         const stats = await fs.promises.stat(altPath);
+                         this.logger.debug(`Found file in alternative location: ${altPath}, size: ${stats.size} bytes`);
+                         foundPath = altPath;
+                         if (stats.size === 0) {
+                             throw new Error(`File is empty: ${foundPath}`);
+                         }
+                         break; // File found, exit loop
+                     }
+                     catch (altError) {
+                         // Continue to next alternative path
+                         continue;
+                     }
+                 }
+                 // If file found in alternative location, use it
+                 if (foundPath) {
+                     filePath = foundPath;
+                 }
+                 else {
+                     // If still not found after trying alternatives, throw original error
+                     const basePath = this.fileSystem ? this.fileSystem.basePath : 'N/A';
+                     this.logger.error(`File not found or not readable. ` +
+                         `Original path: ${originalFilePath}, ` +
+                         `Resolved path: ${filePath}, ` +
+                         `Base path: ${basePath}, ` +
+                         `Tried alternative paths: ${alternativePaths.join(', ')}`, error instanceof Error ? error.stack : error);
+                     throw new Error(`File not found or not readable: ${filePath}. ` +
+                         `Original path: ${originalFilePath}, ` +
+                         `Base path: ${basePath}. ` +
+                         `Tried alternative locations: ${alternativePaths.join(', ')}`);
+                 }
+             }
+             else if (error instanceof Error && error.message.includes('empty')) {
+                 this.logger.error(`File is empty: ${filePath}`);
+                 throw error;
+             }
+             else {
+                 // Re-throw other errors
+                 throw error;
+             }
+         }
          const taskId = randomUUID();
-         const result = await this.invokeSelfHostedParse(filePath, options.fileName, options);
+         const result = await this.invokeSelfHostedParse(filePath, options.fileName || basename(filePath), options);
          this.localTasks.set(taskId, { ...result, sourceUrl: options.url });
          return { taskId };
      }
      async invokeSelfHostedParse(filePath, fileName, options) {
          const parseUrl = this.buildApiUrl('file_parse');
+         this.logger.debug(`Sending parse request to: ${parseUrl}, file: ${fileName}`);
          const form = new FormData();
-         form.append('files', fs.createReadStream(filePath), {
-             filename: fileName,
-         });
+         // Create file read stream (file existence is already validated in createSelfHostedTask)
+         try {
+             form.append('files', fs.createReadStream(filePath), {
+                 filename: fileName,
+             });
+         }
+         catch (error) {
+             this.logger.error(`Failed to create read stream for file: ${filePath}`, error instanceof Error ? error.stack : error);
+             throw new Error(`Failed to read file: ${filePath}. ${error instanceof Error ? error.message : String(error)}`);
+         }
          // form.append('files', fileBuffer, { filename: fileName, contentType: contentType || 'application/pdf' });
          form.append('parse_method', options.parseMethod ?? 'auto');
          form.append('return_md', 'true');
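The fallback lookup added to `createSelfHostedTask` above builds a list of candidate paths under `~/data` and picks the first one that exists. The sketch below isolates that idea into two small functions so it can be read and tested on its own. Both function names are hypothetical, and synchronous `fs` calls are used only to keep the example short. One deliberate difference: as written in the hunk, `foundPath` appears to be assigned before the size check and the resulting error is swallowed by the inner `catch`, so an empty file could still be selected; the sketch checks the size before accepting a candidate.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper mirroring how the hunk derives candidates under <home>/data.
function buildAlternativePaths(originalFilePath, homeDir) {
  const dataDir = path.join(homeDir || '', 'data');
  const candidates = [path.join(dataDir, originalFilePath)];
  if (originalFilePath.includes('/') || originalFilePath.includes('\\')) {
    // Also try the bare file name directly in the data root.
    candidates.push(path.join(dataDir, path.basename(originalFilePath)));
  }
  return candidates;
}

// Hypothetical helper: returns the first readable, non-empty candidate, or null.
function findFirstUsableFile(candidates) {
  for (const candidate of candidates) {
    try {
      fs.accessSync(candidate, fs.constants.R_OK);
      if (fs.statSync(candidate).size > 0) {
        return candidate; // Accept only after the size check passes.
      }
    } catch {
      // Missing or unreadable: fall through to the next candidate.
    }
  }
  return null;
}

// Usage against a throwaway directory standing in for the home directory.
const fakeHome = fs.mkdtempSync(path.join(os.tmpdir(), 'mineru-sketch-'));
fs.mkdirSync(path.join(fakeHome, 'data', 'docs'), { recursive: true });
fs.writeFileSync(path.join(fakeHome, 'data', 'docs', 'a.pdf'), ''); // empty file
fs.writeFileSync(path.join(fakeHome, 'data', 'a.pdf'), 'content');
const candidates = buildAlternativePaths('docs/a.pdf', fakeHome);
const chosen = findFirstUsableFile(candidates);
console.log(chosen);
```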
@@ -290,11 +418,27 @@ export class MinerUClient {
              return this.invokeSelfHostedParseV1(filePath, fileName, options);
          }
          if (response.status === 400) {
-             throw new BadRequestException(`MinerU self-hosted parse failed: ${response.status} ${getErrorMessage(response.data)}`);
+             const errorMessage = getErrorMessage(response.data);
+             this.logger.error(`MinerU self-hosted parse failed with 400: ${errorMessage}`, JSON.stringify(response.data));
+             throw new BadRequestException(`MinerU self-hosted parse failed: ${response.status} ${errorMessage}`);
          }
          if (response.status !== 200) {
-             console.error(response.data);
-             throw new Error(`MinerU self-hosted parse failed: ${response.status} ${response.statusText}`);
+             const errorMessage = getErrorMessage(response.data) || response.statusText;
+             const errorDetails = typeof response.data === 'object' ? JSON.stringify(response.data) : String(response.data);
+             this.logger.error(`MinerU self-hosted parse failed with ${response.status}: ${errorMessage}`, `Request URL: ${parseUrl}, File: ${fileName}, Details: ${errorDetails}`);
+             // Provide more helpful error message for common issues
+             let userFriendlyMessage = `MinerU self-hosted parse failed: ${response.status} ${response.statusText}`;
+             if (errorMessage) {
+                 userFriendlyMessage += `. ${errorMessage}`;
+             }
+             // Check for specific error patterns
+             if (errorMessage && errorMessage.includes('0 active models')) {
+                 userFriendlyMessage += ' Please ensure MinerU service has active models configured.';
+             }
+             else if (errorMessage && errorMessage.includes('NoneType')) {
+                 userFriendlyMessage += ' This may indicate a configuration issue with the MinerU service.';
+             }
+             throw new Error(userFriendlyMessage);
          }
          return this.normalizeSelfHostedResponse(response.data);
      }
  }
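Reviewer note on the hunk above: the new non-200 branch builds a user-facing message from the status line, the body-derived error text, and a hint for two known failure patterns. A minimal sketch of that logic as a standalone function follows. `buildParseErrorMessage` and `ParseFailure` are hypothetical names used only for illustration and are not part of the package. The sketch also deviates from the diff in one respect: it skips the detail when it merely repeats `statusText`, since the diff's `getErrorMessage(response.data) || response.statusText` fallback would otherwise print the status text twice.

```typescript
// Hypothetical helper mirroring the non-200 branch in the hunk above.
// Names and shape are assumptions for illustration, not the package's API.
interface ParseFailure {
  status: number;
  statusText: string;
  // Text extracted from the response body, if any.
  // (The package derives this with its own getErrorMessage helper.)
  bodyMessage?: string;
}

function buildParseErrorMessage(failure: ParseFailure): string {
  let message = `MinerU self-hosted parse failed: ${failure.status} ${failure.statusText}`;
  const detail = failure.bodyMessage;

  // Assumption: append the detail only when it adds information beyond statusText.
  if (detail && detail !== failure.statusText) {
    message += `. ${detail}`;
  }

  // Hints for the two error patterns the diff checks for.
  if (detail?.includes('0 active models')) {
    message += ' Please ensure MinerU service has active models configured.';
  } else if (detail?.includes('NoneType')) {
    message += ' This may indicate a configuration issue with the MinerU service.';
  }
  return message;
}
```

Keeping the message construction in a pure function like this would make the hint logic testable without mocking the HTTP client; whether the package should adopt that split is a design choice left to its maintainers.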
@@ -323,7 +467,9 @@ export class MinerUClient {
323
467
  validateStatus: () => true,
324
468
  });
325
469
  if (response.status !== 200) {
326
- throw new Error(`MinerU self-hosted legacy parse failed: ${response.status} ${response.statusText}`);
470
+ const errorMessage = getErrorMessage(response.data) || response.statusText;
471
+ this.logger.error(`MinerU self-hosted legacy parse failed with ${response.status}: ${errorMessage}`, JSON.stringify(response.data));
472
+ throw new Error(`MinerU self-hosted legacy parse failed: ${response.status} ${response.statusText}. ${errorMessage}`);
327
473
  }
328
474
  return this.normalizeSelfHostedResponse(response.data);
329
475
  }
@@ -1 +1 @@
1
- {"version":3,"file":"mineru.plugin.d.ts","sourceRoot":"","sources":["../../src/lib/mineru.plugin.ts"],"names":[],"mappings":"AACA,OAAO,EAAqB,kBAAkB,EAAE,gBAAgB,EAAE,MAAM,sBAAsB,CAAC;AAQ/F,qBAkBa,YAAa,YAAW,kBAAkB,EAAE,gBAAgB;IAExE,OAAO,CAAC,UAAU,CAAQ;IAE1B;;OAEG;IACH,iBAAiB,IAAI,IAAI,GAAG,OAAO,CAAC,IAAI,CAAC;IAMzC;;OAEG;IACH,eAAe,IAAI,IAAI,GAAG,OAAO,CAAC,IAAI,CAAC;CAKvC"}
1
+ {"version":3,"file":"mineru.plugin.d.ts","sourceRoot":"","sources":["../../src/lib/mineru.plugin.ts"],"names":[],"mappings":"AACA,OAAO,EAAqB,kBAAkB,EAAE,gBAAgB,EAAE,MAAM,sBAAsB,CAAC;AAO/F,qBAiBa,YAAa,YAAW,kBAAkB,EAAE,gBAAgB;IAExE,OAAO,CAAC,UAAU,CAAQ;IAE1B;;OAEG;IACH,iBAAiB,IAAI,IAAI,GAAG,OAAO,CAAC,IAAI,CAAC;IAMzC;;OAEG;IACH,eAAe,IAAI,IAAI,GAAG,OAAO,CAAC,IAAI,CAAC;CAKvC"}
@@ -7,7 +7,6 @@ import { MinerUTransformerStrategy } from './transformer-mineru.strategy.js';
7
7
  import { MinerUResultParserService } from './result-parser.service.js';
8
8
  import { MinerUIntegrationStrategy } from './integration.strategy.js';
9
9
  import { MinerUController } from './mineru.controller.js';
10
- import { MinerUToolsetStrategy } from './mineru-toolset.strategy.js';
11
10
  let MinerUPlugin = MinerUPlugin_1 = class MinerUPlugin {
12
11
  constructor() {
13
12
  // We disable by default additional logging for each event to avoid cluttering the logs
@@ -42,7 +41,6 @@ MinerUPlugin = MinerUPlugin_1 = __decorate([
42
41
  MinerUIntegrationStrategy,
43
42
  MinerUTransformerStrategy,
44
43
  MinerUResultParserService,
45
- MinerUToolsetStrategy,
46
44
  ],
47
45
  controllers: [
48
46
  MinerUController
@@ -1 +1 @@
1
- {"version":3,"file":"result-parser.service.d.ts","sourceRoot":"","sources":["../../src/lib/result-parser.service.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,QAAQ,EAAE,MAAM,2BAA2B,CAAC;AACrD,OAAO,EAAE,kBAAkB,EAAE,MAAM,kBAAkB,CAAC;AAEtD,OAAO,EACL,aAAa,EAEb,YAAY,EACb,MAAM,sBAAsB,CAAC;AAK9B,OAAO,EAEL,sBAAsB,EACtB,0BAA0B,EAC3B,MAAM,YAAY,CAAC;AAEpB,qBACa,yBAAyB;IACpC,OAAO,CAAC,QAAQ,CAAC,MAAM,CAA8C;IAE/D,YAAY,CAChB,UAAU,EAAE,MAAM,EAClB,MAAM,EAAE,MAAM,EACd,QAAQ,EAAE,OAAO,CAAC,kBAAkB,CAAC,EACrC,UAAU,EAAE,YAAY,GACvB,OAAO,CAAC;QACT,EAAE,CAAC,EAAE,MAAM,CAAC;QACZ,MAAM,EAAE,QAAQ,CAAC,aAAa,CAAC,EAAE,CAAC;QAClC,QAAQ,EAAE,sBAAsB,CAAC;KAClC,CAAC;IAqFI,cAAc,CAClB,MAAM,EAAE,0BAA0B,EAClC,MAAM,EAAE,MAAM,EACd,QAAQ,EAAE,OAAO,CAAC,kBAAkB,CAAC,EACrC,UAAU,EAAE,YAAY,GACvB,OAAO,CAAC;QACT,EAAE,CAAC,EAAE,MAAM,CAAC;QACZ,MAAM,EAAE,QAAQ,CAAC,aAAa,CAAC,EAAE,CAAC;QAClC,QAAQ,EAAE,sBAAsB,CAAC;KAClC,CAAC;CAkDH"}
1
+ {"version":3,"file":"result-parser.service.d.ts","sourceRoot":"","sources":["../../src/lib/result-parser.service.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,QAAQ,EAAE,MAAM,2BAA2B,CAAC;AACrD,OAAO,EAAE,kBAAkB,EAAE,MAAM,kBAAkB,CAAC;AAEtD,OAAO,EACL,aAAa,EAEb,YAAY,EACb,MAAM,sBAAsB,CAAC;AAK9B,OAAO,EAEL,sBAAsB,EACtB,0BAA0B,EAC3B,MAAM,YAAY,CAAC;AAEpB,qBACa,yBAAyB;IACpC,OAAO,CAAC,QAAQ,CAAC,MAAM,CAA8C;IAE/D,YAAY,CAChB,UAAU,EAAE,MAAM,EAClB,MAAM,EAAE,MAAM,EACd,QAAQ,EAAE,OAAO,CAAC,kBAAkB,CAAC,EACrC,UAAU,EAAE,YAAY,GACvB,OAAO,CAAC;QACT,EAAE,CAAC,EAAE,MAAM,CAAC;QACZ,MAAM,EAAE,QAAQ,CAAC,aAAa,CAAC,EAAE,CAAC;QAClC,QAAQ,EAAE,sBAAsB,CAAC;KAClC,CAAC;IAsFI,cAAc,CAClB,MAAM,EAAE,0BAA0B,EAClC,MAAM,EAAE,MAAM,EACd,QAAQ,EAAE,OAAO,CAAC,kBAAkB,CAAC,EACrC,UAAU,EAAE,YAAY,GACvB,OAAO,CAAC;QACT,EAAE,CAAC,EAAE,MAAM,CAAC;QACZ,MAAM,EAAE,QAAQ,CAAC,aAAa,CAAC,EAAE,CAAC;QAClC,QAAQ,EAAE,sBAAsB,CAAC;KAClC,CAAC;CAkDH"}
@@ -21,6 +21,7 @@ let MinerUResultParserService = MinerUResultParserService_1 = class MinerUResult
21
21
  const metadata = {
22
22
  parser: MinerU,
23
23
  taskId,
24
+ fullZipUrl,
24
25
  };
25
26
  // 2. Unzip the file
26
27
  const zipEntries = [];
@@ -85,6 +85,17 @@ export declare class MinerUTransformerStrategy implements IDocumentTransformerSt
85
85
  enum: string[];
86
86
  default: string;
87
87
  };
88
+ pageRanges: {
89
+ type: string;
90
+ title: {
91
+ en_US: string;
92
+ zh_Hans: string;
93
+ };
94
+ description: {
95
+ en_US: string;
96
+ zh_Hans: string;
97
+ };
98
+ };
88
99
  };
89
100
  required: any[];
90
101
  };
@@ -1 +1 @@
1
- {"version":3,"file":"transformer-mineru.strategy.d.ts","sourceRoot":"","sources":["../../src/lib/transformer-mineru.strategy.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,QAAQ,EAAE,kBAAkB,EAAE,MAAM,kBAAkB,CAAA;AAG/D,OAAO,EACL,aAAa,EAEb,oBAAoB,EACpB,4BAA4B,EAC5B,qBAAqB,EACtB,MAAM,sBAAsB,CAAA;AAI7B,OAAO,EAA8C,wBAAwB,EAAE,MAAM,YAAY,CAAA;AAEjG,qBAEa,yBAA0B,YAAW,4BAA4B,CAAC,wBAAwB,CAAC;IAEtG,OAAO,CAAC,QAAQ,CAAC,YAAY,CAA2B;IAGxD,OAAO,CAAC,QAAQ,CAAC,aAAa,CAAe;IAE7C,QAAQ,CAAC,WAAW,mDAWnB;IAED,QAAQ,CAAC,IAAI;;;;;;;;;;;kBAWM,QAAQ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;MAwE1B;IAED,cAAc,CAAC,MAAM,EAAE,GAAG,GAAG,OAAO,CAAC,IAAI,CAAC;IAIpC,kBAAkB,CACtB,SAAS,EAAE,OAAO,CAAC,kBAAkB,CAAC,EAAE,EACxC,MAAM,EAAE,wBAAwB,GAC/B,OAAO,CAAC,OAAO,CAAC,kBAAkB,CAAC,aAAa,CAAC,CAAC,EAAE,CAAC;CAsDzD"}
1
+ {"version":3,"file":"transformer-mineru.strategy.d.ts","sourceRoot":"","sources":["../../src/lib/transformer-mineru.strategy.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,QAAQ,EAAE,kBAAkB,EAAE,MAAM,kBAAkB,CAAA;AAG/D,OAAO,EACL,aAAa,EAEb,oBAAoB,EACpB,4BAA4B,EAC5B,qBAAqB,EACtB,MAAM,sBAAsB,CAAA;AAI7B,OAAO,EAA0C,wBAAwB,EAAE,MAAM,YAAY,CAAA;AAE7F,qBAEa,yBAA0B,YAAW,4BAA4B,CAAC,wBAAwB,CAAC;IAEtG,OAAO,CAAC,QAAQ,CAAC,YAAY,CAA2B;IAGxD,OAAO,CAAC,QAAQ,CAAC,aAAa,CAAe;IAE7C,QAAQ,CAAC,WAAW,mDAWnB;IAED,QAAQ,CAAC,IAAI;;;;;;;;;;;kBAWM,QAAQ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;MAmF1B;IAED,cAAc,CAAC,MAAM,EAAE,GAAG,GAAG,OAAO,CAAC,IAAI,CAAC;IAIpC,kBAAkB,CACtB,SAAS,EAAE,OAAO,CAAC,kBAAkB,CAAC,EAAE,EACxC,MAAM,EAAE,wBAAwB,GAC/B,OAAO,CAAC,OAAO,CAAC,kBAAkB,CAAC,aAAa,CAAC,CAAC,EAAE,CAAC;CAiEzD"}