paddleocr-skills 1.0.0 → 1.1.0

Files changed (29)

package/templates/ppocrv5/references/ppocrv5/normalized_schema.md CHANGED

@@ -1,257 +1,257 @@
-# Normalized Output Schema
-This document defines the unified output format returned by `ocr_caller.py`, and the format that downstream agents/tools should expect.
-## Schema Version
-**v0.1** (stable)
-## Output Structure
-All responses follow this top-level structure:
-```typescript
-{
-  ok: boolean,              // true indicates OCR success
-  request_id: string,       // Unique request ID (e.g.: "req_abc123")
-  provider: ProviderInfo,   // Provider API metadata
-  result: Result | null,    // OCR results (null on error)
-  quality: Quality | null,  // Quality metrics (null on error)
-  agent_trace: AgentTrace,  // Execution trace for transparency
-  raw_provider: any | null, // Raw provider response (if --return-raw-provider is used)
-  error: Error | null       // Error details (null on success)
-}
-```
-## ProviderInfo
-```typescript
-{
-  api_url: string,          // Full API endpoint used
-  status_code: number,      // HTTP status code
-  log_id: string | null     // Provider's log ID (if available)
-}
-```
-## Result (success only)
-```typescript
-{
-  pages: Page[],            // Array of pages (one per image/PDF page)
-  full_text: string         // All text joined by "\n\n"
-}
-```
-### Page
-```typescript
-{
-  page_index: number,       // 0-based page number
-  text: string,             // Page text (items joined by "\n")
-  avg_confidence: number,   // Average confidence for this page (0.0-1.0)
-  items: TextItem[]         // Individual text blocks/lines
-}
-```
-### TextItem
-```typescript
-{
-  text: string,             // Recognized text
-  score?: number,           // Confidence score (0.0-1.0), may be missing
-  box?: number[] | number[][]  // Bounding box or polygon, may be missing
-}
-```
-**Box format:**
-- `[xmin, ymin, xmax, ymax]` - Bounding box (4 numbers)
-- Or polygon points (array of arrays)
-## Quality (success only)
-```typescript
-{
-  quality_score: number,    // Overall quality (0.0-1.0)
-  avg_rec_score: number,    // Average recognition confidence (0.0-1.0)
-  text_items: number,       // Total text items detected
-  warnings: string[]        // Warnings (e.g.: "rec_scores missing")
-}
-```
-### Quality Score Formula
-```
-quality_score = 0  if text_items == 0
-              = 0.6 * norm(text_items) + 0.4 * avg_rec_score  otherwise
-norm(n) = min(1, log(1+n) / log(1+50))
-```
-## AgentTrace
-```typescript
-{
-  mode: "fast" | "quality" | "auto",  // Execution mode
-  selected_attempt: number,           // Attempt used (1-indexed)
-  attempts: Attempt[]                 // Details of all attempts
-}
-```
-### Attempt
-```typescript
-{
-  attempt: number,                    // Attempt number (1-indexed)
-  provider_time_ms: number,           // Provider call latency
-  quality_score: number,              // Quality score for this attempt
-  avg_rec_score: number,              // Average recognition score
-  text_items: number,                 // Number of text items
-  warnings: string[],                 // Warnings for this attempt
-  options_effective: {                // OCR options used
-    use_doc_orientation_classify: boolean,
-    use_doc_unwarping: boolean,
-    use_textline_orientation: boolean
-  }
-}
-```
-## Error (error only)
-```typescript
-{
-  code: ErrorCode,          // Unified error code
-  message: string,          // Human-readable error message
-  details: object           // Additional error context
-}
-```
-### ErrorCode Enum
-- `PROVIDER_AUTH_ERROR` - Authentication failed (403)
-- `PROVIDER_QUOTA_EXCEEDED` - Quota/rate limit exceeded (429)
-- `PROVIDER_BAD_REQUEST` - Invalid parameters (500)
-- `PROVIDER_OVERLOADED` - Service overloaded (503)
-- `PROVIDER_TIMEOUT` - Gateway timeout (504)
-- `PROVIDER_ERROR` - Other provider errors
-## Example: Success Response
-```json
-{
-  "ok": true,
-  "request_id": "req_abc123",
-  "provider": {
-    "api_url": "https://example.aistudio-app.com/ocr",
-    "status_code": 200,
-    "log_id": "log_xyz"
-  },
-  "result": {
-    "pages": [
-      {
-        "page_index": 0,
-        "text": "Invoice\nAmount: $123.45",
-        "avg_confidence": 0.95,
-        "items": [
-          {"text": "Invoice", "score": 0.98, "box": [10, 20, 100, 50]},
-          {"text": "Amount: $123.45", "score": 0.92, "box": [10, 60, 200, 90]}
-        ]
-      }
-    ],
-    "full_text": "Invoice\nAmount: $123.45"
-  },
-  "quality": {
-    "quality_score": 0.79,
-    "avg_rec_score": 0.95,
-    "text_items": 2,
-    "warnings": []
-  },
-  "agent_trace": {
-    "mode": "auto",
-    "selected_attempt": 1,
-    "attempts": [
-      {
-        "attempt": 1,
-        "provider_time_ms": 1200,
-        "quality_score": 0.79,
-        "avg_rec_score": 0.95,
-        "text_items": 2,
-        "warnings": [],
-        "options_effective": {
-          "use_doc_orientation_classify": false,
-          "use_doc_unwarping": false,
-          "use_textline_orientation": false
-        }
-      }
-    ]
-  },
-  "raw_provider": null,
-  "error": null
-}
-```
-## Example: Error Response
-```json
-{
-  "ok": false,
-  "request_id": "req_def456",
-  "provider": {
-    "api_url": "https://example.aistudio-app.com/ocr",
-    "status_code": 403,
-    "log_id": null
-  },
-  "result": null,
-  "quality": null,
-  "agent_trace": {
-    "mode": "auto",
-    "selected_attempt": 1,
-    "attempts": [
-      {
-        "attempt": 1,
-        "provider_time_ms": 150,
-        "quality_score": 0.0,
-        "avg_rec_score": 0.0,
-        "text_items": 0,
-        "warnings": ["Provider error: Authentication failed"],
-        "options_effective": {
-          "use_doc_orientation_classify": false,
-          "use_doc_unwarping": false,
-          "use_textline_orientation": false
-        }
-      }
-    ]
-  },
-  "raw_provider": null,
-  "error": {
-    "code": "PROVIDER_AUTH_ERROR",
-    "message": "Authentication failed",
-    "details": {
-      "error_code": 403,
-      "status_code": 403
-    }
-  }
-}
-```
-## Usage Guide
-### For Agents/Scripts
-1. **Check `ok` first**: `if response.ok:`
-2. **Extract text**: `response.result.full_text`
-3. **Extract structured data**: `response.result.pages[].items[]`
-4. **Check quality**: `response.quality.quality_score` (0.72+ is usually good)
-5. **Handle errors**: `response.error.code` and `response.error.message`
-### For Debugging
-1. **Check trace**: `response.agent_trace.attempts` shows all attempts and their quality
-2. **Selected attempt**: `response.agent_trace.selected_attempt` indicates which one was selected
-3. **Raw provider**: Use `--return-raw-provider` to see raw API response
-## Compatibility Notes
-- **Missing scores**: `items[].score` may not exist if provider didn't return scores
-- **Missing boxes**: `items[].box` may not exist if provider didn't return geometry
-- **Empty results**: `text_items == 0` means no text detected (not necessarily an error)
-- **Warnings**: Check `quality.warnings` for non-fatal issues (e.g.: missing fields)
+# Normalized Output Schema
+This document defines the unified output format returned by `ocr_caller.py`, and the format that downstream agents/tools should expect.
+## Schema Version
+**v0.1** (stable)
+## Output Structure
+All responses follow this top-level structure:
+```typescript
+{
+  ok: boolean,              // true indicates OCR success
+  request_id: string,       // Unique request ID (e.g.: "req_abc123")
+  provider: ProviderInfo,   // Provider API metadata
+  result: Result | null,    // OCR results (null on error)
+  quality: Quality | null,  // Quality metrics (null on error)
+  agent_trace: AgentTrace,  // Execution trace for transparency
+  raw_provider: any | null, // Raw provider response (if --return-raw-provider is used)
+  error: Error | null       // Error details (null on success)
+}
+```
+## ProviderInfo
+```typescript
+{
+  api_url: string,          // Full API endpoint used
+  status_code: number,      // HTTP status code
+  log_id: string | null     // Provider's log ID (if available)
+}
+```
+## Result (success only)
+```typescript
+{
+  pages: Page[],            // Array of pages (one per image/PDF page)
+  full_text: string         // All text joined by "\n\n"
+}
+```
+### Page
+```typescript
+{
+  page_index: number,       // 0-based page number
+  text: string,             // Page text (items joined by "\n")
+  avg_confidence: number,   // Average confidence for this page (0.0-1.0)
+  items: TextItem[]         // Individual text blocks/lines
+}
+```
+### TextItem
+```typescript
+{
+  text: string,             // Recognized text
+  score?: number,           // Confidence score (0.0-1.0), may be missing
+  box?: number[] | number[][]  // Bounding box or polygon, may be missing
+}
+```
+**Box format:**
+- `[xmin, ymin, xmax, ymax]` - Bounding box (4 numbers)
+- Or polygon points (array of arrays)
+## Quality (success only)
+```typescript
+{
+  quality_score: number,    // Overall quality (0.0-1.0)
+  avg_rec_score: number,    // Average recognition confidence (0.0-1.0)
+  text_items: number,       // Total text items detected
+  warnings: string[]        // Warnings (e.g.: "rec_scores missing")
+}
+```
+### Quality Score Formula
+```
+quality_score = 0  if text_items == 0
+              = 0.6 * norm(text_items) + 0.4 * avg_rec_score  otherwise
+norm(n) = min(1, log(1+n) / log(1+50))
+```
+## AgentTrace
+```typescript
+{
+  mode: "fast" | "quality" | "auto",  // Execution mode
+  selected_attempt: number,           // Attempt used (1-indexed)
+  attempts: Attempt[]                 // Details of all attempts
+}
+```
+### Attempt
+```typescript
+{
+  attempt: number,                    // Attempt number (1-indexed)
+  provider_time_ms: number,           // Provider call latency
+  quality_score: number,              // Quality score for this attempt
+  avg_rec_score: number,              // Average recognition score
+  text_items: number,                 // Number of text items
+  warnings: string[],                 // Warnings for this attempt
+  options_effective: {                // OCR options used
+    use_doc_orientation_classify: boolean,
+    use_doc_unwarping: boolean,
+    use_textline_orientation: boolean
+  }
+}
+```
+## Error (error only)
+```typescript
+{
+  code: ErrorCode,          // Unified error code
+  message: string,          // Human-readable error message
+  details: object           // Additional error context
+}
+```
+### ErrorCode Enum
+- `PROVIDER_AUTH_ERROR` - Authentication failed (403)
+- `PROVIDER_QUOTA_EXCEEDED` - Quota/rate limit exceeded (429)
+- `PROVIDER_BAD_REQUEST` - Invalid parameters (500)
+- `PROVIDER_OVERLOADED` - Service overloaded (503)
+- `PROVIDER_TIMEOUT` - Gateway timeout (504)
+- `PROVIDER_ERROR` - Other provider errors
+## Example: Success Response
+```json
+{
+  "ok": true,
+  "request_id": "req_abc123",
+  "provider": {
+    "api_url": "https://example.aistudio-app.com/ocr",
+    "status_code": 200,
+    "log_id": "log_xyz"
+  },
+  "result": {
+    "pages": [
+      {
+        "page_index": 0,
+        "text": "Invoice\nAmount: $123.45",
+        "avg_confidence": 0.95,
+        "items": [
+          {"text": "Invoice", "score": 0.98, "box": [10, 20, 100, 50]},
+          {"text": "Amount: $123.45", "score": 0.92, "box": [10, 60, 200, 90]}
+        ]
+      }
+    ],
+    "full_text": "Invoice\nAmount: $123.45"
+  },
+  "quality": {
+    "quality_score": 0.79,
+    "avg_rec_score": 0.95,
+    "text_items": 2,
+    "warnings": []
+  },
+  "agent_trace": {
+    "mode": "auto",
+    "selected_attempt": 1,
+    "attempts": [
+      {
+        "attempt": 1,
+        "provider_time_ms": 1200,
+        "quality_score": 0.79,
+        "avg_rec_score": 0.95,
+        "text_items": 2,
+        "warnings": [],
+        "options_effective": {
+          "use_doc_orientation_classify": false,
+          "use_doc_unwarping": false,
+          "use_textline_orientation": false
+        }
+      }
+    ]
+  },
+  "raw_provider": null,
+  "error": null
+}
+```
+## Example: Error Response
+```json
+{
+  "ok": false,
+  "request_id": "req_def456",
+  "provider": {
+    "api_url": "https://example.aistudio-app.com/ocr",
+    "status_code": 403,
+    "log_id": null
+  },
+  "result": null,
+  "quality": null,
+  "agent_trace": {
+    "mode": "auto",
+    "selected_attempt": 1,
+    "attempts": [
+      {
+        "attempt": 1,
+        "provider_time_ms": 150,
+        "quality_score": 0.0,
+        "avg_rec_score": 0.0,
+        "text_items": 0,
+        "warnings": ["Provider error: Authentication failed"],
+        "options_effective": {
+          "use_doc_orientation_classify": false,
+          "use_doc_unwarping": false,
+          "use_textline_orientation": false
+        }
+      }
+    ]
+  },
+  "raw_provider": null,
+  "error": {
+    "code": "PROVIDER_AUTH_ERROR",
+    "message": "Authentication failed",
+    "details": {
+      "error_code": 403,
+      "status_code": 403
+    }
+  }
+}
+```
+## Usage Guide
+### For Agents/Scripts
+1. **Check `ok` first**: `if response.ok:`
+2. **Extract text**: `response.result.full_text`
+3. **Extract structured data**: `response.result.pages[].items[]`
+4. **Check quality**: `response.quality.quality_score` (0.72+ is usually good)
+5. **Handle errors**: `response.error.code` and `response.error.message`
+### For Debugging
+1. **Check trace**: `response.agent_trace.attempts` shows all attempts and their quality
+2. **Selected attempt**: `response.agent_trace.selected_attempt` indicates which one was selected
+3. **Raw provider**: Use `--return-raw-provider` to see raw API response
+## Compatibility Notes
+- **Missing scores**: `items[].score` may not exist if provider didn't return scores
+- **Missing boxes**: `items[].box` may not exist if provider didn't return geometry
+- **Empty results**: `text_items == 0` means no text detected (not necessarily an error)
+- **Warnings**: Check `quality.warnings` for non-fatal issues (e.g.: missing fields)
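
The quality formula and the usage steps in the schema above can be sketched in Python as follows. This is a minimal illustration, not code from the package: the function names `quality_score` and `summarize` and the constant `GOOD_QUALITY_THRESHOLD` are hypothetical, and the response is assumed to be an already-parsed dict that follows the documented structure. The 0.72 threshold is taken from the usage guide's "0.72+ is usually good" remark.

```python
import math
from typing import Any, Dict

# Assumption: threshold taken from the usage guide ("0.72+ is usually good").
GOOD_QUALITY_THRESHOLD = 0.72


def quality_score(text_items: int, avg_rec_score: float) -> float:
    """Evaluate the documented formula:
    0 if text_items == 0, else 0.6 * norm(text_items) + 0.4 * avg_rec_score,
    with norm(n) = min(1, log(1+n) / log(1+50)).
    """
    if text_items == 0:
        return 0.0
    norm = min(1.0, math.log(1 + text_items) / math.log(1 + 50))
    return 0.6 * norm + 0.4 * avg_rec_score


def summarize(response: Dict[str, Any]) -> str:
    """Follow the usage guide: check `ok` first, then read result/quality
    on success or error.code/error.message on failure."""
    if not response["ok"]:
        err = response["error"]
        return f"error {err['code']}: {err['message']}"

    quality = response["quality"]
    label = "good" if quality["quality_score"] >= GOOD_QUALITY_THRESHOLD else "low"

    # Flatten items across pages; `score` and `box` are optional per the schema,
    # so count only the items that actually carry a score.
    items = [item for page in response["result"]["pages"] for item in page["items"]]
    scored = [item["score"] for item in items if "score" in item]
    return f"{len(items)} items, {len(scored)} scored, quality {label}"
```

Evaluated this way, the inputs from the success example (`text_items` = 2, `avg_rec_score` = 0.95) give roughly 0.548, so the 0.79 shown in that example appears to be illustrative rather than computed from the formula. The optional-field handling in `summarize` mirrors the compatibility notes: a missing `score` is skipped instead of being treated as an error.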