npm - @jax-js/jax - Versions diffs - 0.0.4 → 0.0.5 - Mend

@jax-js/jax 0.0.4 → 0.0.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10)

package/README.md +67 -24
package/dist/{backend-EBRGmEYw.js → backend-CdcTZEOF.js} +35 -6
package/dist/{backend-Ss1Mev_-.cjs → backend-yEU0L_ig.cjs} +40 -5
package/dist/index.cjs +324 -225
package/dist/index.d.cts +71 -26
package/dist/index.d.ts +71 -26
package/dist/index.js +314 -215
package/dist/{webgpu-ow0Pn_6q.js → webgpu-CM-xNYzW.js} +1 -1
package/dist/{webgpu-BVdMaO9T.cjs → webgpu-CNOpiO5T.cjs} +1 -1
package/package.json +1 -1

package/README.md CHANGED

@@ -2,8 +2,9 @@
 [Website](https://www.ekzhang.com/jax-js/) | [API Reference](https://www.ekzhang.com/jax-js/docs/)
-This is a machine learning framework for the browser. It aims to bring JAX-style, high-performance
-CPU and GPU kernels to JavaScript, so you can run numerical applications on the web.
+**jax-js** is a machine learning framework for the browser. It aims to bring JAX-style,
+high-performance CPU and GPU kernels to JavaScript, so you can run numerical applications on the
+web.
 ```bash
 npm i @jax-js/jax
@@ -12,6 +13,10 @@ npm i @jax-js/jax
 Under the hood, it translates array operations into a compiler representation, then synthesizes
 kernels in WebAssembly and WebGPU.
+The library is written from scratch, with zero external dependencies. It maintains close API
+compatibility with NumPy/JAX. Since everything runs client-side, jax-js is likely the most portable
+GPU ML framework, since it runs anywhere a browser can run.
 ## Quickstart
 You can use `jax-js` as an array API, just like NumPy.
@@ -24,7 +29,7 @@ const x = np.array([1, 2, 3]);
 const y = x.mul(4); // [4, 8, 12]
 ```
-It also lets you take derivatives like in JAX.
+It also lets you take derivatives with `grad` like in JAX (as well as `vmap`, `jit`).
 ```js
 import { grad, numpy as np } from "@jax-js/jax";
@@ -37,11 +42,14 @@ const xnorm = norm(x.ref); // 1^2 + 2^2 + 3^2 = 14
 const xgrad = grad(norm)(x); // [2, 4, 6]
 ```
-The default backend runs on CPU, but on [supported browsers](https://caniuse.com/webgpu), you can
-switch to GPU for better performance.
+The default backend runs on CPU, but on [supported browsers](https://caniuse.com/webgpu) including
+Chrome and iOS Safari, you can switch to GPU for better performance.
 ```js
-import { defaultDevice, numpy as np } from "@jax-js/jax";
+import { defaultDevice, init, numpy as np } from "@jax-js/jax";
+// Initialize the GPU backend.
+await init("webgpu");
 // Change the default backend to GPU.
 defaultDevice("webgpu");
@@ -53,8 +61,43 @@ const y = np.dot(x.ref, x); // JIT-compiled into a matrix multiplication kernel
 Most common JAX APIs are supported. See the [compatibility table](./FEATURES.md) for a full
 breakdown of what features are available.
+### Web usage (CDN)
+If you want to use `jax-js` in vanilla JavaScript (without a bundler), just import from a module
+script tag. This is the easiest way to get started on a blank HTML page.
+```html
+<script type="module">
+  import { numpy as np } from "https://esm.sh/@jax-js/jax";
+</script>
+```
+### Performance
+We haven't spent a ton of time optimizing yet, but performance is generally pretty good. `jit` is
+very helpful for fusing operations together, and it's a feature only available on the web in jax-js.
+The default kernel-tuning heuristics get about 3000 GFLOP/s for matrix multiplication on an M4 Pro
+chip ([try it](https://www.ekzhang.com/jax-js/bench/matmul)).
+For that example, it's around the same GFLOP/s as
+[TensorFlow.js](https://github.com/tensorflow/tfjs) and
+[ONNX Runtime Web](https://www.npmjs.com/package/onnxruntime-web), which both use handwritten
+libraries of custom kernels (versus jax-js, which generates kernels with an ML compiler).
+## Examples
+If you make something cool with jax-js, don't be a stranger! We can feature it here.
+- [In-browser REPL](https://www.ekzhang.com/jax-js/repl)
+- [Interactive MNIST training](https://www.ekzhang.com/jax-js/mnist)
+- [Matmul benchmark](https://www.ekzhang.com/jax-js/bench/matmul)
+- [Conv2d benchmark](https://www.ekzhang.com/jax-js/bench/conv2d)
+- [Mandelbrot set](https://www.ekzhang.com/jax-js/mandelbrot)
 ## Development
+_The following technical details are for contributing to jax-js and modifying its internals._
 This repository is managed by [`pnpm`](https://pnpm.io/). You can compile and build all packages in
 watch mode with:
@@ -70,8 +113,8 @@ pnpm exec playwright install
 pnpm test
 ```
-_We are currently on an older version of Playwright that supports using WebGPU in headless mode;
-newer versions seem to skip the WebGPU tests._
+We are currently on an older version of Playwright that supports using WebGPU in headless mode;
+newer versions skip the WebGPU tests.
 To start a Vite dev server running the website, demos and REPL:
@@ -79,15 +122,26 @@ To start a Vite dev server running the website, demos and REPL:
 pnpm -C website dev
 ```
+## Future work / help wanted
+Contributions are welcomed in the following areas:
+- Adding support for more JAX functions and operations, see [compatibility table](./FEATURES.md).
+- Improving performance of the WebGPU and Wasm runtimes, generating better kernels, and using SIMD
+  and multithreading.
+- Helping the JIT compiler to fuse operations in more cases.
+- Adding WebGL runtime for older browsers that don't support WebGPU.
+- Making a fast transformer inference engine, comparing against onnxruntime-web.
+- Ergonomics and API improvements.
 ## Next on Eric's mind
 - Finish CLIP inference demo and associated features (depthwise convolution, vmap of gather, etc.)
 - Performance
-  - Improve perf of MNIST neural network
-    - Optimize conv2d further (maybe blocks -> local dims?)
+  - Improve perf of MobileCLIP neural network
     - Add fused epilogue to JIT
+    - Fix fusion of activation functions with branches like tanh
     - Reduce kernel overhead of constants / inline expressions
-  - Investigate why jax-js Matmul is 2x slower on Safari TP than unroll kernel
   - How many threads to create per workgroup, depends on hardware
 ## Milestones
@@ -120,20 +174,9 @@ pnpm -C website dev
   - [ ] SIMD support for Wasm backend
   - [ ] Async / multithreading Wasm support
 - [ ] Full support of weak types and committed devices
-  - [ ] High-level ops have automatic type promotion
-  - [ ] Weak types - [ref](https://docs.jax.dev/en/latest/type_promotion.html#weak-types)
+  - [x] High-level ops have automatic type promotion
+  - [x] Weak types - [ref](https://docs.jax.dev/en/latest/type_promotion.html#weak-types)
   - [ ] Committed devices -
         [ref](https://docs.jax.dev/en/latest/sharded-computation.html#sharded-data-placement)
   - [ ] Device switching with `device_put()` between webgpu/cpu/wasm
 - [x] numpy/jax API compatibility table
-## Future work / help wanted
-Contributions are welcomed in the following areas:
-- Adding support for more JAX functions and operations, see [compatibility table](./FEATURES.md).
-- Improving performance of the WebGPU and Wasm runtimes, generating better kernels, using SIMD and
-  multithreading.
-- Adding WebGL runtime for older browsers that don't support WebGPU.
-- Making a fast transformer inference engine, comparing against onnxruntime-web.
-- Ergonomics and API improvements.

package/dist/{backend-EBRGmEYw.js → backend-CdcTZEOF.js} RENAMED

@@ -206,6 +206,35 @@ function findPow2(hint, max) {
 	while (ret < hint && 2 * ret <= max) ret *= 2;
 	return ret;
 }
+/**
+* Implements a NumPy-style generalized broadcast rule on two array shapes.
+*
+* "When operating on two arrays, NumPy compares their shapes element-wise. It
+* starts with the trailing (i.e. rightmost) dimension and works its way left.
+* Two dimensions are compatible when:
+*   1. they are equal, or
+*   2. one of them is 1."
+*
+* Throws a TypeError if the broadcast is not possible.
+*
+* <https://numpy.org/doc/stable/user/basics.broadcasting.html#general-broadcasting-rules>
+*/
+function generalBroadcast(a, b) {
+	const out = [];
+	let i = a.length - 1;
+	let j = b.length - 1;
+	for (; i >= 0 && j >= 0; i--, j--) {
+		const x = a[i];
+		const y = b[j];
+		if (x === y) out.push(x);
+		else if (x === 1) out.push(y);
+		else if (y === 1) out.push(x);
+		else throw new TypeError(`Incompatible array broadcast shapes: ${a} vs ${b}`);
+	}
+	for (; i >= 0; i--) out.push(a[i]);
+	for (; j >= 0; j--) out.push(b[j]);
+	return out.reverse();
+}
 function recursiveFlatten(ar) {
 	if (!Array.isArray(ar)) return [ar];
 	return ar.flat(Infinity);
@@ -294,12 +323,12 @@ const isFloatDtype = (dtype) => dtype === DType.Float32 || dtype === DType.Float
 * **Type lattice:**
 * ```text
 * bool -> uint32 -> int32 -> float16 -> float32
-*  weak f* --^
+*  weakType --^
 * ```
 *
-* The asterisk f* is a weak type used for JS number constants. When creating
-* arrays, JS numbers default to float32 but "weak" so they cast to the dtype of
-* any array they are first combined with.
+* `weakType` represents weakly typed arrays. These are created for JS numbers,
+* which default to float32 but "weak" so they cast to the dtype of any array
+* they are first combined with, except `bool`.
 *
 * **Examples:**
 * - `promoteTypes(bool, int32) → int32`
@@ -3760,7 +3789,7 @@ async function createBackend(device) {
 		if (!navigator.gpu) return null;
 		const adapter = await navigator.gpu.requestAdapter({ powerPreference: "high-performance" });
 		if (!adapter) return null;
-		const { WebGPUBackend } = await import("./webgpu-ow0Pn_6q.js");
+		const { WebGPUBackend } = await import("./webgpu-CM-xNYzW.js");
 		const importantLimits = [
 			"maxBufferSize",
 			"maxComputeInvocationsPerWorkgroup",
@@ -3813,4 +3842,4 @@ var UnsupportedOpError = class extends Error {
 };
 //#endregion
-export { AluExp, AluGroup, AluOp, AluVar, DEBUG, DType, Executable, FpHash, Kernel, PPrint, Reduction, ShapeTracker, SlotError, UnsupportedOpError, accessorAluExp, accessorGlobal, byteWidth, checkAxis, deepEqual, defaultDevice, devices, dtypedArray, dtypedJsArray, findPow2, getBackend, init, invertPermutation, isFloatDtype, isNumberPair, isPermutation, normalizeAxis, partitionList, prod, promoteTypes, range, recursiveFlatten, rep, runWithCache, setDebug, strip1, toposort, tuneWebgpu, union, unravelAlu, unzip2, zip, zipn };
+export { AluExp, AluGroup, AluOp, AluVar, DEBUG, DType, Executable, FpHash, Kernel, PPrint, Reduction, ShapeTracker, SlotError, UnsupportedOpError, accessorAluExp, accessorGlobal, byteWidth, checkAxis, deepEqual, defaultDevice, devices, dtypedArray, dtypedJsArray, findPow2, generalBroadcast, getBackend, init, invertPermutation, isFloatDtype, isNumberPair, isPermutation, normalizeAxis, partitionList, prod, promoteTypes, range, recursiveFlatten, rep, runWithCache, setDebug, strip1, toposort, tuneWebgpu, union, unravelAlu, unzip2, zip, zipn };
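
Note on the new `generalBroadcast` helper: the sketch below copies the function body from the added hunk into a standalone script so its behavior can be checked without importing the hashed backend chunk. Whether the helper is re-exported from the package's public entry point is not visible in this diff, so the copy is an assumption made for illustration. `tryBroadcast` is a hypothetical wrapper and not part of the package.

```javascript
// Standalone copy of `generalBroadcast` as it appears in the added hunk.
function generalBroadcast(a, b) {
  const out = [];
  let i = a.length - 1;
  let j = b.length - 1;
  // Walk both shapes from the trailing dimension toward the front.
  for (; i >= 0 && j >= 0; i--, j--) {
    const x = a[i];
    const y = b[j];
    if (x === y) out.push(x);
    else if (x === 1) out.push(y);
    else if (y === 1) out.push(x);
    else throw new TypeError(`Incompatible array broadcast shapes: ${a} vs ${b}`);
  }
  // Leading dimensions of the longer shape carry over unchanged.
  for (; i >= 0; i--) out.push(a[i]);
  for (; j >= 0; j--) out.push(b[j]);
  return out.reverse();
}

// Hypothetical wrapper (not in the package): returns null instead of throwing.
function tryBroadcast(a, b) {
  try {
    return generalBroadcast(a, b);
  } catch (e) {
    if (e instanceof TypeError) return null;
    throw e;
  }
}

console.log(generalBroadcast([3, 1], [1, 4])); // [ 3, 4 ]
console.log(generalBroadcast([5, 3, 1], [4])); // [ 5, 3, 4 ]
console.log(generalBroadcast([], [2, 2]));     // [ 2, 2 ]
console.log(tryBroadcast([2, 3], [4]));        // null (3 vs 4 is incompatible)
```

The result is built in reverse order and flipped once at the end, which lets shapes of different rank be aligned from the right without padding the shorter one with leading 1s.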

package/dist/{backend-Ss1Mev_-.cjs → backend-yEU0L_ig.cjs} RENAMED

@@ -207,6 +207,35 @@ function findPow2(hint, max) {
 	while (ret < hint && 2 * ret <= max) ret *= 2;
 	return ret;
 }
+/**
+* Implements a NumPy-style generalized broadcast rule on two array shapes.
+*
+* "When operating on two arrays, NumPy compares their shapes element-wise. It
+* starts with the trailing (i.e. rightmost) dimension and works its way left.
+* Two dimensions are compatible when:
+*   1. they are equal, or
+*   2. one of them is 1."
+*
+* Throws a TypeError if the broadcast is not possible.
+*
+* <https://numpy.org/doc/stable/user/basics.broadcasting.html#general-broadcasting-rules>
+*/
+function generalBroadcast(a, b) {
+	const out = [];
+	let i = a.length - 1;
+	let j = b.length - 1;
+	for (; i >= 0 && j >= 0; i--, j--) {
+		const x = a[i];
+		const y = b[j];
+		if (x === y) out.push(x);
+		else if (x === 1) out.push(y);
+		else if (y === 1) out.push(x);
+		else throw new TypeError(`Incompatible array broadcast shapes: ${a} vs ${b}`);
+	}
+	for (; i >= 0; i--) out.push(a[i]);
+	for (; j >= 0; j--) out.push(b[j]);
+	return out.reverse();
+}
 function recursiveFlatten(ar) {
 	if (!Array.isArray(ar)) return [ar];
 	return ar.flat(Infinity);
@@ -295,12 +324,12 @@ const isFloatDtype = (dtype) => dtype === DType.Float32 || dtype === DType.Float
 * **Type lattice:**
 * ```text
 * bool -> uint32 -> int32 -> float16 -> float32
-*  weak f* --^
+*  weakType --^
 * ```
 *
-* The asterisk f* is a weak type used for JS number constants. When creating
-* arrays, JS numbers default to float32 but "weak" so they cast to the dtype of
-* any array they are first combined with.
+* `weakType` represents weakly typed arrays. These are created for JS numbers,
+* which default to float32 but "weak" so they cast to the dtype of any array
+* they are first combined with, except `bool`.
 *
 * **Examples:**
 * - `promoteTypes(bool, int32) → int32`
@@ -3761,7 +3790,7 @@ async function createBackend(device) {
 		if (!navigator.gpu) return null;
 		const adapter = await navigator.gpu.requestAdapter({ powerPreference: "high-performance" });
 		if (!adapter) return null;
-		const { WebGPUBackend } = await Promise.resolve().then(() => require("./webgpu-BVdMaO9T.cjs"));
+		const { WebGPUBackend } = await Promise.resolve().then(() => require("./webgpu-CNOpiO5T.cjs"));
 		const importantLimits = [
 			"maxBufferSize",
 			"maxComputeInvocationsPerWorkgroup",
@@ -3958,6 +3987,12 @@ Object.defineProperty(exports, 'findPow2', {
     return findPow2;
   }
 });
+Object.defineProperty(exports, 'generalBroadcast', {
+  enumerable: true,
+  get: function () {
+    return generalBroadcast;
+  }
+});
 Object.defineProperty(exports, 'getBackend', {
   enumerable: true,
   get: function () {