paddleocr-skills 1.0.0 → 1.1.0

Files changed (29)
  1. package/README.md +220 -220
  2. package/bin/paddleocr-skills.js +33 -20
  3. package/lib/copy.js +39 -39
  4. package/lib/installer.js +76 -70
  5. package/lib/prompts.js +67 -67
  6. package/lib/python.js +75 -75
  7. package/lib/verify.js +121 -121
  8. package/package.json +42 -42
  9. package/templates/.env.example +12 -12
  10. package/templates/{paddleocr-vl/references/paddleocr-vl → paddleocr-vl-1.5/references/paddleocr-vl-1.5}/layout_schema.md +64 -64
  11. package/templates/{paddleocr-vl/references/paddleocr-vl → paddleocr-vl-1.5/references/paddleocr-vl-1.5}/output_format.md +154 -154
  12. package/templates/{paddleocr-vl/references/paddleocr-vl → paddleocr-vl-1.5/references/paddleocr-vl-1.5}/vl_model_spec.md +157 -157
  13. package/templates/{paddleocr-vl/scripts/paddleocr-vl → paddleocr-vl-1.5/scripts/paddleocr-vl-1.5}/_lib.py +780 -780
  14. package/templates/{paddleocr-vl/scripts/paddleocr-vl → paddleocr-vl-1.5/scripts/paddleocr-vl-1.5}/configure.py +270 -270
  15. package/templates/{paddleocr-vl/scripts/paddleocr-vl → paddleocr-vl-1.5/scripts/paddleocr-vl-1.5}/optimize_file.py +226 -226
  16. package/templates/{paddleocr-vl/scripts/paddleocr-vl → paddleocr-vl-1.5/scripts/paddleocr-vl-1.5}/requirements-optimize.txt +8 -8
  17. package/templates/{paddleocr-vl/scripts/paddleocr-vl → paddleocr-vl-1.5/scripts/paddleocr-vl-1.5}/requirements.txt +7 -7
  18. package/templates/{paddleocr-vl/scripts/paddleocr-vl → paddleocr-vl-1.5/scripts/paddleocr-vl-1.5}/smoke_test.py +199 -199
  19. package/templates/{paddleocr-vl/scripts/paddleocr-vl → paddleocr-vl-1.5/scripts/paddleocr-vl-1.5}/vl_caller.py +232 -232
  20. package/templates/{paddleocr-vl/skills/paddleocr-vl → paddleocr-vl-1.5/skills/paddleocr-vl-1.5}/SKILL.md +481 -481
  21. package/templates/ppocrv5/references/ppocrv5/agent_policy.md +258 -258
  22. package/templates/ppocrv5/references/ppocrv5/normalized_schema.md +257 -257
  23. package/templates/ppocrv5/references/ppocrv5/provider_api.md +140 -140
  24. package/templates/ppocrv5/scripts/ppocrv5/_lib.py +635 -635
  25. package/templates/ppocrv5/scripts/ppocrv5/configure.py +346 -346
  26. package/templates/ppocrv5/scripts/ppocrv5/ocr_caller.py +684 -684
  27. package/templates/ppocrv5/scripts/ppocrv5/requirements.txt +4 -4
  28. package/templates/ppocrv5/scripts/ppocrv5/smoke_test.py +139 -139
  29. package/templates/ppocrv5/skills/ppocrv5/SKILL.md +272 -272
@@ -1,140 +1,140 @@
1
- # Provider API Reference: Paddle AI Studio PP-OCRv5
2
-
3
- This document describes the external provider API contract that this skill depends on.
4
-
5
- ## Endpoint
6
-
7
- **POST** `https://<AISTUDIO_HOST>/ocr`
8
-
9
- Where `<AISTUDIO_HOST>` is provided by the user (e.g.: `your-subdomain.aistudio-app.com`).
10
-
11
- ## Authentication
12
-
13
- **Header:**
14
- ```
15
- Authorization: token <ACCESS_TOKEN>
16
- ```
17
-
18
- Where `<ACCESS_TOKEN>` is the API token obtained by the user from Paddle AI Studio.
19
-
20
- ## Request Body
21
-
22
- ```jsonc
23
- {
24
- "file": "https://example.com/image.png", // URL or base64 (without data: prefix)
25
- "fileType": 1, // 0=PDF, 1=Image
26
- "visualize": false, // Default false (avoid large responses)
27
-
28
- // Text detection options
29
- "textDetLimitSideLen": 736, // Maximum side length for detection
30
- "textDetLimitType": "max", // "min" or "max"
31
- "textDetThresh": 0.3, // Detection threshold
32
- "textDetBoxThresh": 0.6, // Box threshold
33
- "textDetUnclipRatio": 1.5, // Unclip ratio
34
-
35
- // Text recognition options
36
- "textRecScoreThresh": 0.0, // Recognition score threshold
37
-
38
- // Document preprocessing options
39
- "useDocOrientationClassify": false, // Enable orientation correction
40
- "useDocUnwarping": false, // Enable unwarping/skew correction
41
- "useTextlineOrientation": false // Enable textline orientation
42
- }
43
- ```
44
-
45
- ### Key Parameters
46
-
47
- - **file**: URL or base64 string of image/PDF (without `data:` URI prefix)
48
- - **fileType**:
49
- - `0` = PDF
50
- - `1` = Image
51
- - **visualize**: If `true`, returns visualization image (increases response size)
52
- - **useDocOrientationClassify**: Correct page orientation (0°/90°/180°/270°)
53
- - **useDocUnwarping**: Correct perspective distortion and skew
54
- - **useTextlineOrientation**: Correct individual text line angles
55
-
56
- ## Response Structure
57
-
58
- ### Success Response (errorCode == 0)
59
-
60
- ```jsonc
61
- {
62
- "errorCode": 0,
63
- "errorMsg": "",
64
- "logId": "unique-log-id",
65
- "result": {
66
- "ocrResults": [
67
- {
68
- "prunedResult": {
69
- "rec_texts": ["Invoice", "Amount", "123.45"], // Recognized text
70
- "rec_scores": [0.98, 0.95, 0.92], // Confidence scores (may be missing)
71
- "rec_boxes": [ // Bounding boxes (may be missing)
72
- [10, 20, 100, 50],
73
- [10, 60, 150, 90],
74
- [200, 60, 300, 90]
75
- ],
76
- "rec_polys": [...] // Polygons (alternative to boxes)
77
- }
78
- }
79
- ]
80
- }
81
- }
82
- ```
83
-
84
- ### Error Response (errorCode != 0)
85
-
86
- ```jsonc
87
- {
88
- "errorCode": 500,
89
- "errorMsg": "Invalid parameter",
90
- "logId": "unique-log-id"
91
- }
92
- ```
93
-
94
- ## Error Codes
95
-
96
- | HTTP Status | errorCode | Meaning | Mapped Error Code |
97
- |-------------|-----------|---------|-------------------|
98
- | 403 | N/A | Authentication failed | `PROVIDER_AUTH_ERROR` |
99
- | 429 | N/A | Quota/rate limit exceeded | `PROVIDER_QUOTA_EXCEEDED` |
100
- | 500 | 500 | Invalid parameters | `PROVIDER_BAD_REQUEST` |
101
- | 503 | N/A | Service overloaded | `PROVIDER_OVERLOADED` |
102
- | 504 | N/A | Gateway timeout | `PROVIDER_TIMEOUT` |
103
- | Other | Other | Unknown error | `PROVIDER_ERROR` |
104
-
105
- ## Field Compatibility Notes
106
-
107
- - **rec_scores**: May be missing or empty. Default to 0.5 if needed.
108
- - **rec_boxes**: May be missing. Use `rec_polys` as fallback.
109
- - **rec_polys**: May be missing. Bounding box information may not be available.
110
- - **visualize result**: Only returned when `visualize: true` (not recommended for auto mode).
111
-
112
- ## Best Practices
113
-
114
- 1. **Always set visualize to false** unless explicitly requested by user (reduces response size and latency)
115
- 2. **Handle missing fields gracefully** (rec_scores, rec_boxes, rec_polys may not exist)
116
- 3. **Retry on 503/504** with exponential backoff (up to 2 retries)
117
- 4. **Never log or print tokens** in any output or logs
118
- 5. **Normalize host input** to handle user errors (https://, trailing /ocr, etc.)
119
-
120
- ## Request Example
121
-
122
- ```bash
123
- curl -X POST https://your-subdomain.aistudio-app.com/ocr \
124
- -H "Authorization: token YOUR_ACCESS_TOKEN" \
125
- -H "Content-Type: application/json" \
126
- -d '{
127
- "file": "https://example.com/test.png",
128
- "fileType": 1,
129
- "visualize": false,
130
- "useDocOrientationClassify": true,
131
- "useDocUnwarping": false,
132
- "useTextlineOrientation": false,
133
- "textDetLimitSideLen": 736,
134
- "textDetLimitType": "max",
135
- "textDetThresh": 0.3,
136
- "textDetBoxThresh": 0.6,
137
- "textDetUnclipRatio": 1.5,
138
- "textRecScoreThresh": 0.0
139
- }'
140
- ```
1
+ # Provider API Reference: Paddle AI Studio PP-OCRv5
2
+
3
+ This document describes the external provider API contract that this skill depends on.
4
+
5
+ ## Endpoint
6
+
7
+ **POST** `https://<AISTUDIO_HOST>/ocr`
8
+
9
+ Where `<AISTUDIO_HOST>` is provided by the user (e.g.: `your-subdomain.aistudio-app.com`).
10
+
11
+ ## Authentication
12
+
13
+ **Header:**
14
+ ```
15
+ Authorization: token <ACCESS_TOKEN>
16
+ ```
17
+
18
+ Where `<ACCESS_TOKEN>` is the API token obtained by the user from Paddle AI Studio.
19
+
20
+ ## Request Body
21
+
22
+ ```jsonc
23
+ {
24
+ "file": "https://example.com/image.png", // URL or base64 (without data: prefix)
25
+ "fileType": 1, // 0=PDF, 1=Image
26
+ "visualize": false, // Default false (avoid large responses)
27
+
28
+ // Text detection options
29
+ "textDetLimitSideLen": 736, // Maximum side length for detection
30
+ "textDetLimitType": "max", // "min" or "max"
31
+ "textDetThresh": 0.3, // Detection threshold
32
+ "textDetBoxThresh": 0.6, // Box threshold
33
+ "textDetUnclipRatio": 1.5, // Unclip ratio
34
+
35
+ // Text recognition options
36
+ "textRecScoreThresh": 0.0, // Recognition score threshold
37
+
38
+ // Document preprocessing options
39
+ "useDocOrientationClassify": false, // Enable orientation correction
40
+ "useDocUnwarping": false, // Enable unwarping/skew correction
41
+ "useTextlineOrientation": false // Enable textline orientation
42
+ }
43
+ ```
44
+
45
+ ### Key Parameters
46
+
47
+ - **file**: URL or base64 string of image/PDF (without `data:` URI prefix)
48
+ - **fileType**:
49
+ - `0` = PDF
50
+ - `1` = Image
51
+ - **visualize**: If `true`, returns visualization image (increases response size)
52
+ - **useDocOrientationClassify**: Correct page orientation (0°/90°/180°/270°)
53
+ - **useDocUnwarping**: Correct perspective distortion and skew
54
+ - **useTextlineOrientation**: Correct individual text line angles
55
+
56
+ ## Response Structure
57
+
58
+ ### Success Response (errorCode == 0)
59
+
60
+ ```jsonc
61
+ {
62
+ "errorCode": 0,
63
+ "errorMsg": "",
64
+ "logId": "unique-log-id",
65
+ "result": {
66
+ "ocrResults": [
67
+ {
68
+ "prunedResult": {
69
+ "rec_texts": ["Invoice", "Amount", "123.45"], // Recognized text
70
+ "rec_scores": [0.98, 0.95, 0.92], // Confidence scores (may be missing)
71
+ "rec_boxes": [ // Bounding boxes (may be missing)
72
+ [10, 20, 100, 50],
73
+ [10, 60, 150, 90],
74
+ [200, 60, 300, 90]
75
+ ],
76
+ "rec_polys": [...] // Polygons (alternative to boxes)
77
+ }
78
+ }
79
+ ]
80
+ }
81
+ }
82
+ ```
83
+
84
+ ### Error Response (errorCode != 0)
85
+
86
+ ```jsonc
87
+ {
88
+ "errorCode": 500,
89
+ "errorMsg": "Invalid parameter",
90
+ "logId": "unique-log-id"
91
+ }
92
+ ```
93
+
94
+ ## Error Codes
95
+
96
+ | HTTP Status | errorCode | Meaning | Mapped Error Code |
97
+ |-------------|-----------|---------|-------------------|
98
+ | 403 | N/A | Authentication failed | `PROVIDER_AUTH_ERROR` |
99
+ | 429 | N/A | Quota/rate limit exceeded | `PROVIDER_QUOTA_EXCEEDED` |
100
+ | 500 | 500 | Invalid parameters | `PROVIDER_BAD_REQUEST` |
101
+ | 503 | N/A | Service overloaded | `PROVIDER_OVERLOADED` |
102
+ | 504 | N/A | Gateway timeout | `PROVIDER_TIMEOUT` |
103
+ | Other | Other | Unknown error | `PROVIDER_ERROR` |
104
+
105
+ ## Field Compatibility Notes
106
+
107
+ - **rec_scores**: May be missing or empty. Default to 0.5 if needed.
108
+ - **rec_boxes**: May be missing. Use `rec_polys` as fallback.
109
+ - **rec_polys**: May be missing. Bounding box information may not be available.
110
+ - **visualize result**: Only returned when `visualize: true` (not recommended for auto mode).
111
+
112
+ ## Best Practices
113
+
114
+ 1. **Always set visualize to false** unless explicitly requested by user (reduces response size and latency)
115
+ 2. **Handle missing fields gracefully** (rec_scores, rec_boxes, rec_polys may not exist)
116
+ 3. **Retry on 503/504** with exponential backoff (up to 2 retries)
117
+ 4. **Never log or print tokens** in any output or logs
118
+ 5. **Normalize host input** to handle user errors (https://, trailing /ocr, etc.)
119
+
120
+ ## Request Example
121
+
122
+ ```bash
123
+ curl -X POST https://your-subdomain.aistudio-app.com/ocr \
124
+ -H "Authorization: token YOUR_ACCESS_TOKEN" \
125
+ -H "Content-Type: application/json" \
126
+ -d '{
127
+ "file": "https://example.com/test.png",
128
+ "fileType": 1,
129
+ "visualize": false,
130
+ "useDocOrientationClassify": true,
131
+ "useDocUnwarping": false,
132
+ "useTextlineOrientation": false,
133
+ "textDetLimitSideLen": 736,
134
+ "textDetLimitType": "max",
135
+ "textDetThresh": 0.3,
136
+ "textDetBoxThresh": 0.6,
137
+ "textDetUnclipRatio": 1.5,
138
+ "textRecScoreThresh": 0.0
139
+ }'
140
+ ```
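
The contract in the file above (error-code table, field compatibility notes, and the host-normalization best practice) can be sketched in a few lines of Python. This is a minimal illustration only, not the code shipped in `ocr_caller.py` or `_lib.py`: the function names `normalize_host`, `map_error`, and `extract_lines` are hypothetical, and mapping on HTTP status alone is a simplifying assumption (the table also lists a body-level `errorCode`).

```python
import re
from typing import Any, Dict, List

# Mapping copied from the "Error Codes" table; anything else falls back to PROVIDER_ERROR.
HTTP_STATUS_TO_ERROR = {
    403: "PROVIDER_AUTH_ERROR",
    429: "PROVIDER_QUOTA_EXCEEDED",
    500: "PROVIDER_BAD_REQUEST",
    503: "PROVIDER_OVERLOADED",
    504: "PROVIDER_TIMEOUT",
}
DEFAULT_SCORE = 0.5  # default suggested in "Field Compatibility Notes"


def normalize_host(raw: str) -> str:
    """Hypothetical helper: drop a scheme and a trailing /ocr so only the host remains."""
    host = re.sub(r"^https?://", "", raw.strip(), flags=re.IGNORECASE)
    host = host.rstrip("/")
    if host.lower().endswith("/ocr"):
        host = host[: -len("/ocr")]
    return host.rstrip("/")


def map_error(http_status: int) -> str:
    """Hypothetical helper: map an HTTP status to the skill's error code (status only)."""
    return HTTP_STATUS_TO_ERROR.get(http_status, "PROVIDER_ERROR")


def extract_lines(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Hypothetical helper: flatten a success response, tolerating missing optional fields."""
    lines: List[Dict[str, Any]] = []
    for page in payload.get("result", {}).get("ocrResults", []):
        pruned = page.get("prunedResult", {})
        texts = pruned.get("rec_texts") or []
        scores = pruned.get("rec_scores") or []
        # Prefer rec_boxes; fall back to rec_polys; otherwise no region is available.
        regions = pruned.get("rec_boxes") or pruned.get("rec_polys") or []
        for i, text in enumerate(texts):
            lines.append({
                "text": text,
                "score": scores[i] if i < len(scores) else DEFAULT_SCORE,
                "region": regions[i] if i < len(regions) else None,
            })
    return lines
```

A caller would presumably run `normalize_host` on the user-supplied value before building `https://<AISTUDIO_HOST>/ocr`, and pass only responses with `errorCode == 0` to `extract_lines`.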