@kreuzberg/wasm 4.0.0-rc.15 → 4.0.0-rc.16

Files changed (2)
  1. package/README.md +28 -0
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -946,6 +946,34 @@ Tesseract training data (`.traineddata` files) are loaded from jsDelivr CDN on f
 
  Cloudflare Workers has a 10MB bundle size limit (compressed). The WASM binary is ~2MB compressed, leaving room for your application code.
 
+ ### HTML File Size Limits
+
+ **WASM builds have a 2MB limit for HTML files** due to limited stack space. HTML files larger than 2MB will be rejected with a validation error to prevent stack overflow.
+
+ ```typescript
+ // Files > 2MB will throw an error in WASM builds
+ const largeHtml = new Uint8Array(3 * 1024 * 1024); // 3MB
+ await extractBytes(largeHtml, 'text/html');
+ // ❌ Throws: "HTML file size exceeds WASM limit of 2MB"
+ ```
+
+ For large HTML files, use the native [@kreuzberg/node](https://www.npmjs.com/package/@kreuzberg/node) binding which has no size limits.
+
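The size limit added above can be checked on the caller's side before any bytes reach the WASM build. The following is a minimal sketch, not part of the package: `chooseHtmlBackend` and `ASSUMED_WASM_HTML_LIMIT_BYTES` are hypothetical names, and the threshold assumes "2MB" means 2 * 1024 * 1024 bytes, following the MB arithmetic in the README example. The library's exact cutoff may differ.

```typescript
// Assumption: "2MB" is 2 * 1024 * 1024 bytes, mirroring the README's
// `3 * 1024 * 1024` for 3MB. The library's real threshold may differ.
const ASSUMED_WASM_HTML_LIMIT_BYTES = 2 * 1024 * 1024;

type HtmlBackend = "wasm" | "native";

/**
 * Hypothetical helper (not exported by @kreuzberg/wasm): decide which
 * binding should receive an HTML payload, based only on its byte length.
 */
function chooseHtmlBackend(
  html: Uint8Array,
  limitBytes: number = ASSUMED_WASM_HTML_LIMIT_BYTES,
): HtmlBackend {
  // The README says files larger than the limit are rejected, so a payload
  // at or under the limit is routed to WASM and anything bigger to native.
  return html.byteLength <= limitBytes ? "wasm" : "native";
}

const smallHtml = new Uint8Array(512 * 1024); // 512KB
const largeHtml = new Uint8Array(3 * 1024 * 1024); // 3MB, as in the README

console.log(chooseHtmlBackend(smallHtml));
console.log(chooseHtmlBackend(largeHtml));
```

A caller could use the returned value to pick between the WASM `extractBytes` call and the `@kreuzberg/node` binding, so that oversized files never trigger the validation error.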
+ ### PDF Extraction in Non-Browser Environments
+
+ PDF extraction requires PDFium, which is only available in browser environments. In Deno, Node.js, and Cloudflare Workers, PDF extraction will fail with an error.
+
+ ```typescript
+ // ❌ Won't work in Deno/Node.js/Workers
+ await extractBytes(pdfBytes, 'application/pdf');
+ // Throws: "PDF extraction requires proper WASM module initialization"
+ ```
+
+ **Solutions:**
+ - **Browser**: PDF extraction works out of the box
+ - **Deno/Node.js**: Use [@kreuzberg/node](https://www.npmjs.com/package/@kreuzberg/node) with native PDFium bindings
+ - **Cloudflare Workers**: PDF extraction is not currently supported
+
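The per-environment guidance in the added section can be expressed as a small lookup, so an application fails early instead of hitting the initialization error. This is a sketch under stated assumptions: `detectRuntime` and `pdfStrategyFor` are hypothetical helpers that the package does not provide, the detection checks are common heuristics rather than anything the README specifies, and treating an unrecognized runtime as unsupported is a conservative choice made here.

```typescript
type Runtime = "browser" | "deno" | "node" | "workers" | "unknown";
type PdfStrategy = "wasm" | "native-binding" | "unsupported";

/**
 * Hypothetical heuristic: guess the runtime from well-known globals.
 * The globals object is a parameter so the function can be exercised
 * with fakes; by default it inspects the real global scope.
 */
function detectRuntime(
  g: Record<string, unknown> = globalThis as unknown as Record<string, unknown>,
): Runtime {
  const nav = g["navigator"] as { userAgent?: string } | undefined;
  // Assumption: Workers report this user agent string.
  if (nav?.userAgent === "Cloudflare-Workers") return "workers";
  // Deno is checked before Node.js and browser globals, which it may also expose.
  if ("Deno" in g) return "deno";
  const proc = g["process"] as { versions?: { node?: string } } | undefined;
  if (proc?.versions?.node) return "node";
  if ("window" in g && "document" in g) return "browser";
  return "unknown";
}

/** Map a runtime to the option listed under "Solutions" in the README. */
function pdfStrategyFor(runtime: Runtime): PdfStrategy {
  switch (runtime) {
    case "browser":
      return "wasm"; // works out of the box
    case "deno":
    case "node":
      return "native-binding"; // use @kreuzberg/node
    default:
      return "unsupported"; // Workers, plus unknown runtimes (assumption)
  }
}

console.log(pdfStrategyFor(detectRuntime()));
```

Keeping detection separate from the mapping means the mapping stays a direct restatement of the README, while the detection heuristics can be swapped out if they prove unreliable in a given deployment.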
  ## Troubleshooting
 
  ### "WASM module failed to initialize"
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@kreuzberg/wasm",
- "version": "4.0.0-rc.15",
+ "version": "4.0.0-rc.16",
  "type": "module",
  "packageManager": "pnpm@10.17.0",
  "description": "Kreuzberg document intelligence - WebAssembly bindings",