@kreuzberg/wasm 4.0.0-rc.15 → 4.0.0-rc.16
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +28 -0
- package/package.json +1 -1
package/README.md
CHANGED
@@ -946,6 +946,34 @@ Tesseract training data (`.traineddata` files) are loaded from jsDelivr CDN on f
 
 Cloudflare Workers has a 10MB bundle size limit (compressed). The WASM binary is ~2MB compressed, leaving room for your application code.
 
+### HTML File Size Limits
+
+**WASM builds have a 2MB limit for HTML files** due to limited stack space. HTML files larger than 2MB will be rejected with a validation error to prevent stack overflow.
+
+```typescript
+// Files > 2MB will throw an error in WASM builds
+const largeHtml = new Uint8Array(3 * 1024 * 1024); // 3MB
+await extractBytes(largeHtml, 'text/html');
+// ❌ Throws: "HTML file size exceeds WASM limit of 2MB"
+```
+
+For large HTML files, use the native [@kreuzberg/node](https://www.npmjs.com/package/@kreuzberg/node) binding which has no size limits.
+
+### PDF Extraction in Non-Browser Environments
+
+PDF extraction requires PDFium, which is only available in browser environments. In Deno, Node.js, and Cloudflare Workers, PDF extraction will fail with an error.
+
+```typescript
+// ❌ Won't work in Deno/Node.js/Workers
+await extractBytes(pdfBytes, 'application/pdf');
+// Throws: "PDF extraction requires proper WASM module initialization"
+```
+
+**Solutions:**
+- **Browser**: PDF extraction works out of the box
+- **Deno/Node.js**: Use [@kreuzberg/node](https://www.npmjs.com/package/@kreuzberg/node) with native PDFium bindings
+- **Cloudflare Workers**: PDF extraction is not currently supported
+
 ## Troubleshooting
 
 ### "WASM module failed to initialize"
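
Reader's note on the HTML size limit added in this README change: a caller could check the payload size before handing it to the WASM build, rather than relying on the validation error. The sketch below is a minimal illustration, not part of `@kreuzberg/wasm`. The helper name `exceedsWasmHtmlLimit` and the constant `ASSUMED_WASM_HTML_LIMIT_BYTES` are hypothetical, and the threshold assumes "2MB" means 2 * 1024 * 1024 bytes, which the README does not state exactly.

```typescript
// Hypothetical pre-check, not an API of @kreuzberg/wasm.
// Assumption: the documented "2MB" limit equals 2 * 1024 * 1024 bytes.
const ASSUMED_WASM_HTML_LIMIT_BYTES = 2 * 1024 * 1024;

// Returns true when an HTML payload is larger than the assumed WASM limit.
// Other MIME types are never flagged, since the README only documents the
// limit for HTML files.
function exceedsWasmHtmlLimit(data: Uint8Array, mimeType: string): boolean {
  return mimeType === 'text/html' && data.byteLength > ASSUMED_WASM_HTML_LIMIT_BYTES;
}

// Same 3MB buffer as the README example; the check flags it before extraction.
const largeHtml = new Uint8Array(3 * 1024 * 1024);
const shouldUseNativeBinding = exceedsWasmHtmlLimit(largeHtml, 'text/html');
```

When the check returns true, the README's own recommendation applies: route the file to the native `@kreuzberg/node` binding instead of calling `extractBytes` in the WASM build.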