paddleocr-skills 1.0.0 → 1.1.0
- package/README.md +220 -220
- package/bin/paddleocr-skills.js +33 -20
- package/lib/copy.js +39 -39
- package/lib/installer.js +76 -70
- package/lib/prompts.js +67 -67
- package/lib/python.js +75 -75
- package/lib/verify.js +121 -121
- package/package.json +42 -42
- package/templates/.env.example +12 -12
- package/templates/{paddleocr-vl/references/paddleocr-vl → paddleocr-vl-1.5/references/paddleocr-vl-1.5}/layout_schema.md +64 -64
- package/templates/{paddleocr-vl/references/paddleocr-vl → paddleocr-vl-1.5/references/paddleocr-vl-1.5}/output_format.md +154 -154
- package/templates/{paddleocr-vl/references/paddleocr-vl → paddleocr-vl-1.5/references/paddleocr-vl-1.5}/vl_model_spec.md +157 -157
- package/templates/{paddleocr-vl/scripts/paddleocr-vl → paddleocr-vl-1.5/scripts/paddleocr-vl-1.5}/_lib.py +780 -780
- package/templates/{paddleocr-vl/scripts/paddleocr-vl → paddleocr-vl-1.5/scripts/paddleocr-vl-1.5}/configure.py +270 -270
- package/templates/{paddleocr-vl/scripts/paddleocr-vl → paddleocr-vl-1.5/scripts/paddleocr-vl-1.5}/optimize_file.py +226 -226
- package/templates/{paddleocr-vl/scripts/paddleocr-vl → paddleocr-vl-1.5/scripts/paddleocr-vl-1.5}/requirements-optimize.txt +8 -8
- package/templates/{paddleocr-vl/scripts/paddleocr-vl → paddleocr-vl-1.5/scripts/paddleocr-vl-1.5}/requirements.txt +7 -7
- package/templates/{paddleocr-vl/scripts/paddleocr-vl → paddleocr-vl-1.5/scripts/paddleocr-vl-1.5}/smoke_test.py +199 -199
- package/templates/{paddleocr-vl/scripts/paddleocr-vl → paddleocr-vl-1.5/scripts/paddleocr-vl-1.5}/vl_caller.py +232 -232
- package/templates/{paddleocr-vl/skills/paddleocr-vl → paddleocr-vl-1.5/skills/paddleocr-vl-1.5}/SKILL.md +481 -481
- package/templates/ppocrv5/references/ppocrv5/agent_policy.md +258 -258
- package/templates/ppocrv5/references/ppocrv5/normalized_schema.md +257 -257
- package/templates/ppocrv5/references/ppocrv5/provider_api.md +140 -140
- package/templates/ppocrv5/scripts/ppocrv5/_lib.py +635 -635
- package/templates/ppocrv5/scripts/ppocrv5/configure.py +346 -346
- package/templates/ppocrv5/scripts/ppocrv5/ocr_caller.py +684 -684
- package/templates/ppocrv5/scripts/ppocrv5/requirements.txt +4 -4
- package/templates/ppocrv5/scripts/ppocrv5/smoke_test.py +139 -139
- package/templates/ppocrv5/skills/ppocrv5/SKILL.md +272 -272
@@ -1,140 +1,140 @@
-# Provider API Reference: Paddle AI Studio PP-OCRv5
-
-This document describes the external provider API contract that this skill depends on.
-
-## Endpoint
-
-**POST** `https://<AISTUDIO_HOST>/ocr`
-
-Where `<AISTUDIO_HOST>` is provided by the user (e.g.: `your-subdomain.aistudio-app.com`).
-
-## Authentication
-
-**Header:**
-```
-Authorization: token <ACCESS_TOKEN>
-```
-
-Where `<ACCESS_TOKEN>` is the API token obtained by the user from Paddle AI Studio.
-
-## Request Body
-
-```jsonc
-{
-  "file": "https://example.com/image.png", // URL or base64 (without data: prefix)
-  "fileType": 1, // 0=PDF, 1=Image
-  "visualize": false, // Default false (avoid large responses)
-
-  // Text detection options
-  "textDetLimitSideLen": 736, // Maximum side length for detection
-  "textDetLimitType": "max", // "min" or "max"
-  "textDetThresh": 0.3, // Detection threshold
-  "textDetBoxThresh": 0.6, // Box threshold
-  "textDetUnclipRatio": 1.5, // Unclip ratio
-
-  // Text recognition options
-  "textRecScoreThresh": 0.0, // Recognition score threshold
-
-  // Document preprocessing options
-  "useDocOrientationClassify": false, // Enable orientation correction
-  "useDocUnwarping": false, // Enable unwarping/skew correction
-  "useTextlineOrientation": false // Enable textline orientation
-}
-```
-
-### Key Parameters
-
-- **file**: URL or base64 string of image/PDF (without `data:` URI prefix)
-- **fileType**:
-  - `0` = PDF
-  - `1` = Image
-- **visualize**: If `true`, returns visualization image (increases response size)
-- **useDocOrientationClassify**: Correct page orientation (0°/90°/180°/270°)
-- **useDocUnwarping**: Correct perspective distortion and skew
-- **useTextlineOrientation**: Correct individual text line angles
-
-## Response Structure
-
-### Success Response (errorCode == 0)
-
-```jsonc
-{
-  "errorCode": 0,
-  "errorMsg": "",
-  "logId": "unique-log-id",
-  "result": {
-    "ocrResults": [
-      {
-        "prunedResult": {
-          "rec_texts": ["Invoice", "Amount", "123.45"], // Recognized text
-          "rec_scores": [0.98, 0.95, 0.92], // Confidence scores (may be missing)
-          "rec_boxes": [ // Bounding boxes (may be missing)
-            [10, 20, 100, 50],
-            [10, 60, 150, 90],
-            [200, 60, 300, 90]
-          ],
-          "rec_polys": [...] // Polygons (alternative to boxes)
-        }
-      }
-    ]
-  }
-}
-```
-
-### Error Response (errorCode != 0)
-
-```jsonc
-{
-  "errorCode": 500,
-  "errorMsg": "Invalid parameter",
-  "logId": "unique-log-id"
-}
-```
-
-## Error Codes
-
-| HTTP Status | errorCode | Meaning | Mapped Error Code |
-|-------------|-----------|---------|-------------------|
-| 403 | N/A | Authentication failed | `PROVIDER_AUTH_ERROR` |
-| 429 | N/A | Quota/rate limit exceeded | `PROVIDER_QUOTA_EXCEEDED` |
-| 500 | 500 | Invalid parameters | `PROVIDER_BAD_REQUEST` |
-| 503 | N/A | Service overloaded | `PROVIDER_OVERLOADED` |
-| 504 | N/A | Gateway timeout | `PROVIDER_TIMEOUT` |
-| Other | Other | Unknown error | `PROVIDER_ERROR` |
-
-## Field Compatibility Notes
-
-- **rec_scores**: May be missing or empty. Default to 0.5 if needed.
-- **rec_boxes**: May be missing. Use `rec_polys` as fallback.
-- **rec_polys**: May be missing. Bounding box information may not be available.
-- **visualize result**: Only returned when `visualize: true` (not recommended for auto mode).
-
-## Best Practices
-
-1. **Always set visualize to false** unless explicitly requested by user (reduces response size and latency)
-2. **Handle missing fields gracefully** (rec_scores, rec_boxes, rec_polys may not exist)
-3. **Retry on 503/504** with exponential backoff (up to 2 retries)
-4. **Never log or print tokens** in any output or logs
-5. **Normalize host input** to handle user errors (https://, trailing /ocr, etc.)
-
-## Request Example
-
-```bash
-curl -X POST https://your-subdomain.aistudio-app.com/ocr \
-  -H "Authorization: token YOUR_ACCESS_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "file": "https://example.com/test.png",
-    "fileType": 1,
-    "visualize": false,
-    "useDocOrientationClassify": true,
-    "useDocUnwarping": false,
-    "useTextlineOrientation": false,
-    "textDetLimitSideLen": 736,
-    "textDetLimitType": "max",
-    "textDetThresh": 0.3,
-    "textDetBoxThresh": 0.6,
-    "textDetUnclipRatio": 1.5,
-    "textRecScoreThresh": 0.0
-  }'
-```
+# Provider API Reference: Paddle AI Studio PP-OCRv5
+
+This document describes the external provider API contract that this skill depends on.
+
+## Endpoint
+
+**POST** `https://<AISTUDIO_HOST>/ocr`
+
+Where `<AISTUDIO_HOST>` is provided by the user (e.g.: `your-subdomain.aistudio-app.com`).
+
+## Authentication
+
+**Header:**
+```
+Authorization: token <ACCESS_TOKEN>
+```
+
+Where `<ACCESS_TOKEN>` is the API token obtained by the user from Paddle AI Studio.
+
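
The endpoint and header conventions described in this part of the file can be sketched in a few lines of Python. This is only an illustration: the helper names `normalize_host`, `build_endpoint`, and `build_headers` are hypothetical and do not come from the package, and the set of user mistakes handled (a pasted scheme, a trailing `/ocr`) is an assumption about typical input.

```python
def normalize_host(raw_host: str) -> str:
    """Reduce input such as 'https://host/ocr/' to a bare host name (hypothetical helper)."""
    host = raw_host.strip()
    # Drop an accidentally pasted scheme prefix.
    for scheme in ("https://", "http://"):
        if host.lower().startswith(scheme):
            host = host[len(scheme):]
            break
    # Drop trailing slashes, then an accidentally pasted '/ocr' path suffix.
    host = host.rstrip("/")
    if host.lower().endswith("/ocr"):
        host = host[: -len("/ocr")]
    return host.rstrip("/")


def build_endpoint(raw_host: str) -> str:
    """Return the POST target in the documented form https://<AISTUDIO_HOST>/ocr."""
    return f"https://{normalize_host(raw_host)}/ocr"


def build_headers(access_token: str) -> dict:
    """Return the documented Authorization header plus a JSON content type."""
    return {
        "Authorization": f"token {access_token}",
        "Content-Type": "application/json",
    }
```

Keeping the token inside `build_headers` and out of any formatted log message is one simple way to avoid printing it by accident.
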
+## Request Body
+
+```jsonc
+{
+  "file": "https://example.com/image.png", // URL or base64 (without data: prefix)
+  "fileType": 1, // 0=PDF, 1=Image
+  "visualize": false, // Default false (avoid large responses)
+
+  // Text detection options
+  "textDetLimitSideLen": 736, // Maximum side length for detection
+  "textDetLimitType": "max", // "min" or "max"
+  "textDetThresh": 0.3, // Detection threshold
+  "textDetBoxThresh": 0.6, // Box threshold
+  "textDetUnclipRatio": 1.5, // Unclip ratio
+
+  // Text recognition options
+  "textRecScoreThresh": 0.0, // Recognition score threshold
+
+  // Document preprocessing options
+  "useDocOrientationClassify": false, // Enable orientation correction
+  "useDocUnwarping": false, // Enable unwarping/skew correction
+  "useTextlineOrientation": false // Enable textline orientation
+}
+```
+
+### Key Parameters
+
+- **file**: URL or base64 string of image/PDF (without `data:` URI prefix)
+- **fileType**:
+  - `0` = PDF
+  - `1` = Image
+- **visualize**: If `true`, returns visualization image (increases response size)
+- **useDocOrientationClassify**: Correct page orientation (0°/90°/180°/270°)
+- **useDocUnwarping**: Correct perspective distortion and skew
+- **useTextlineOrientation**: Correct individual text line angles
+
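
A request body following the parameters listed here could be assembled roughly as follows. The function names (`encode_file`, `strip_data_uri`, `build_payload`) are hypothetical, and the `is_pdf` flag is just one assumed way of choosing `fileType`; only the JSON keys themselves are taken from the documented request body.

```python
import base64


def encode_file(data: bytes) -> str:
    """Base64-encode raw file bytes, with no 'data:' URI prefix."""
    return base64.b64encode(data).decode("ascii")


def strip_data_uri(value: str) -> str:
    """Remove a 'data:...;base64,' prefix if the caller passed a data URI."""
    if value.startswith("data:") and "," in value:
        return value.split(",", 1)[1]
    return value


def build_payload(file_ref: str, is_pdf: bool = False, **overrides) -> dict:
    """Build the JSON body; file_ref is a URL or a base64 string."""
    payload = {
        "file": strip_data_uri(file_ref),
        "fileType": 0 if is_pdf else 1,  # 0 = PDF, 1 = Image
        "visualize": False,              # keep responses small by default
    }
    # Optional tuning keys such as textDetThresh or useDocUnwarping.
    payload.update(overrides)
    return payload
```

For example, `build_payload("https://example.com/test.png", useDocOrientationClassify=True)` yields a body with the same three core keys as the curl example plus the one override.
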
+## Response Structure
+
+### Success Response (errorCode == 0)
+
+```jsonc
+{
+  "errorCode": 0,
+  "errorMsg": "",
+  "logId": "unique-log-id",
+  "result": {
+    "ocrResults": [
+      {
+        "prunedResult": {
+          "rec_texts": ["Invoice", "Amount", "123.45"], // Recognized text
+          "rec_scores": [0.98, 0.95, 0.92], // Confidence scores (may be missing)
+          "rec_boxes": [ // Bounding boxes (may be missing)
+            [10, 20, 100, 50],
+            [10, 60, 150, 90],
+            [200, 60, 300, 90]
+          ],
+          "rec_polys": [...] // Polygons (alternative to boxes)
+        }
+      }
+    ]
+  }
+}
+```
+
+### Error Response (errorCode != 0)
+
+```jsonc
+{
+  "errorCode": 500,
+  "errorMsg": "Invalid parameter",
+  "logId": "unique-log-id"
+}
+```
+
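
To make the two response shapes concrete, here is a minimal sketch of a parser that branches on `errorCode`. The `ProviderError` class and the `extract_texts` function are hypothetical names introduced for illustration, not part of the package; the field names are the ones shown in the JSON samples.

```python
class ProviderError(Exception):
    """Raised when the provider reports a non-zero errorCode (hypothetical class)."""

    def __init__(self, code, message, log_id=None):
        super().__init__(f"provider error {code}: {message}")
        self.code = code
        self.message = message
        self.log_id = log_id


def extract_texts(response: dict) -> list:
    """Return all recognized strings from a decoded JSON response."""
    if response.get("errorCode") != 0:
        raise ProviderError(
            response.get("errorCode"),
            response.get("errorMsg", ""),
            response.get("logId"),
        )
    texts = []
    # Each ocrResults entry carries its own prunedResult object.
    for entry in response.get("result", {}).get("ocrResults", []):
        texts.extend(entry.get("prunedResult", {}).get("rec_texts", []))
    return texts
```

Carrying `logId` on the exception keeps it available for support requests without having to log the full response.
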
+## Error Codes
+
+| HTTP Status | errorCode | Meaning | Mapped Error Code |
+|-------------|-----------|---------|-------------------|
+| 403 | N/A | Authentication failed | `PROVIDER_AUTH_ERROR` |
+| 429 | N/A | Quota/rate limit exceeded | `PROVIDER_QUOTA_EXCEEDED` |
+| 500 | 500 | Invalid parameters | `PROVIDER_BAD_REQUEST` |
+| 503 | N/A | Service overloaded | `PROVIDER_OVERLOADED` |
+| 504 | N/A | Gateway timeout | `PROVIDER_TIMEOUT` |
+| Other | Other | Unknown error | `PROVIDER_ERROR` |
+
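
The table above translates directly into a lookup. The sketch below keys only on the HTTP status, which is an assumption: the table also lists an `errorCode` column, and the package's own scripts may consult it as well. The names `HTTP_STATUS_TO_ERROR` and `map_error` are hypothetical.

```python
# Mapping copied from the Error Codes table; anything else falls back
# to the generic PROVIDER_ERROR code.
HTTP_STATUS_TO_ERROR = {
    403: "PROVIDER_AUTH_ERROR",
    429: "PROVIDER_QUOTA_EXCEEDED",
    500: "PROVIDER_BAD_REQUEST",
    503: "PROVIDER_OVERLOADED",
    504: "PROVIDER_TIMEOUT",
}


def map_error(http_status: int) -> str:
    """Translate an HTTP status into the mapped error code from the table."""
    return HTTP_STATUS_TO_ERROR.get(http_status, "PROVIDER_ERROR")
```
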
+## Field Compatibility Notes
+
+- **rec_scores**: May be missing or empty. Default to 0.5 if needed.
+- **rec_boxes**: May be missing. Use `rec_polys` as fallback.
+- **rec_polys**: May be missing. Bounding box information may not be available.
+- **visualize result**: Only returned when `visualize: true` (not recommended for auto mode).
+
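
The fallback rules in these notes could be applied along the following lines. Two things here are assumptions rather than documented facts: that each `rec_boxes` entry is `[x_min, y_min, x_max, y_max]` (the sample values are consistent with this, but the file does not say so), and that each `rec_polys` entry is a list of `[x, y]` points (the sample elides it). The function names are hypothetical.

```python
DEFAULT_SCORE = 0.5  # default suggested in the notes when rec_scores is absent


def poly_to_box(poly):
    """Axis-aligned box around a polygon, assuming a list of [x, y] points."""
    xs = [point[0] for point in poly]
    ys = [point[1] for point in poly]
    return [min(xs), min(ys), max(xs), max(ys)]


def normalize_entry(pruned: dict) -> list:
    """Turn one prunedResult into a list of {text, score, box} records."""
    texts = pruned.get("rec_texts") or []
    scores = pruned.get("rec_scores") or []
    boxes = pruned.get("rec_boxes") or []
    polys = pruned.get("rec_polys") or []
    lines = []
    for i, text in enumerate(texts):
        score = scores[i] if i < len(scores) else DEFAULT_SCORE
        if i < len(boxes):
            box = list(boxes[i])
        elif i < len(polys):
            box = poly_to_box(polys[i])  # fallback when rec_boxes is missing
        else:
            box = None  # no geometry available at all
        lines.append({"text": text, "score": score, "box": box})
    return lines
```
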
+## Best Practices
+
+1. **Always set visualize to false** unless explicitly requested by user (reduces response size and latency)
+2. **Handle missing fields gracefully** (rec_scores, rec_boxes, rec_polys may not exist)
+3. **Retry on 503/504** with exponential backoff (up to 2 retries)
+4. **Never log or print tokens** in any output or logs
+5. **Normalize host input** to handle user errors (https://, trailing /ocr, etc.)
+
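
Practices 3 and 4 lend themselves to a short sketch. The retry loop below is transport-agnostic: `send` is a hypothetical callable returning `(http_status, body)`, and the one-second base delay is an assumed value, since the file only specifies exponential backoff with up to 2 retries. `redact` is likewise a hypothetical helper for scrubbing a token from text before it is logged.

```python
import time

RETRYABLE_STATUSES = {503, 504}
MAX_RETRIES = 2  # "up to 2 retries" per the list above


def call_with_retry(send, max_retries=MAX_RETRIES, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry on 503/504, doubling the wait each time."""
    attempt = 0
    while True:
        status, body = send()
        if status not in RETRYABLE_STATUSES or attempt >= max_retries:
            return status, body
        sleep(base_delay * (2 ** attempt))  # 1x, then 2x the base delay
        attempt += 1


def redact(text: str, token: str) -> str:
    """Replace any occurrence of the token before text reaches a log."""
    return text.replace(token, "***") if token else text
```

Passing `sleep` as a parameter keeps the function easy to exercise without real waiting; callers would normally leave it at the default.
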
+## Request Example
+
+```bash
+curl -X POST https://your-subdomain.aistudio-app.com/ocr \
+  -H "Authorization: token YOUR_ACCESS_TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "file": "https://example.com/test.png",
+    "fileType": 1,
+    "visualize": false,
+    "useDocOrientationClassify": true,
+    "useDocUnwarping": false,
+    "useTextlineOrientation": false,
+    "textDetLimitSideLen": 736,
+    "textDetLimitType": "max",
+    "textDetThresh": 0.3,
+    "textDetBoxThresh": 0.6,
+    "textDetUnclipRatio": 1.5,
+    "textRecScoreThresh": 0.0
+  }'
+```