umap-gpu 0.1.0 → 0.2.8
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +60 -43
- package/dist/fallback/cpu-sgd.d.ts +18 -1
- package/dist/fuzzy-set.d.ts +13 -0
- package/dist/gpu/sgd.d.ts +1 -1
- package/dist/hnsw-knn.d.ts +15 -0
- package/dist/index.d.ts +3 -3
- package/dist/index.js +377 -168
- package/dist/umap.d.ts +71 -1
- package/package.json +12 -2
package/README.md
CHANGED
|
@@ -1,87 +1,104 @@
|
|
|
1
1
|
# umap-gpu
|
|
2
2
|
|
|
3
|
-
UMAP dimensionality reduction with
|
|
3
|
+
UMAP dimensionality reduction with WebGPU-accelerated SGD and HNSW approximate nearest neighbors.
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
Embed millions of high-dimensional vectors into 2D in seconds — not minutes.
|
|
6
6
|
|
|
7
|
-
|
|
7
|
+
## Why GPU?
|
|
8
8
|
|
|
9
|
-
The
|
|
9
|
+
The bottleneck in UMAP is the SGD optimization loop: thousands of epochs, millions of edge updates per epoch. On CPU this is sequential. On GPU, all edges run in parallel across thousands of shader cores — expect a significant speedup on large datasets, scaling with both the number of points and the number of epochs.
|
|
10
10
|
|
|
11
|
-
|
|
12
|
-
|
|
13
|
-
3. **SGD** — optimizes the embedding using attraction/repulsion forces:
|
|
14
|
-
- **WebGPU** compute shader when available (Chrome 113+, Edge 113+)
|
|
15
|
-
- **CPU** fallback otherwise — identical output, just slower
|
|
11
|
+
The k-NN stage uses [hnswlib-wasm](https://github.com/yoshoku/hnswlib-wasm) (O(n log n)) so it stays fast regardless.
|
|
12
|
+
A transparent CPU fallback guarantees identical output everywhere WebGPU isn't available.
|
|
16
13
|
|
|
17
14
|
## Install
|
|
18
15
|
|
|
19
16
|
```bash
|
|
17
|
+
# npm
|
|
20
18
|
npm install umap-gpu
|
|
21
|
-
```
|
|
22
19
|
|
|
23
|
-
|
|
20
|
+
# Bun
|
|
21
|
+
bun add umap-gpu
|
|
22
|
+
|
|
23
|
+
# pnpm
|
|
24
|
+
pnpm add umap-gpu
|
|
25
|
+
```
|
|
24
26
|
|
|
25
|
-
##
|
|
27
|
+
## Quick start
|
|
26
28
|
|
|
27
29
|
```ts
|
|
28
30
|
import { fit } from 'umap-gpu';
|
|
29
31
|
|
|
30
32
|
const vectors = [
|
|
31
|
-
[1
|
|
32
|
-
[0.
|
|
33
|
-
[0.0, 1.0, 0.8],
|
|
33
|
+
[0.1, 0.4, 0.9, ...], // high-dimensional points
|
|
34
|
+
[0.2, 0.3, 0.8, ...],
|
|
34
35
|
// ...
|
|
35
36
|
];
|
|
36
37
|
|
|
37
38
|
const embedding = await fit(vectors);
|
|
38
|
-
// Float32Array
|
|
39
|
-
|
|
39
|
+
// Float32Array — embedding[i*2], embedding[i*2+1] are the 2D coords of point i
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
## Train once, project many times
|
|
43
|
+
|
|
44
|
+
Use the `UMAP` class to embed a training set and later project new points into the same space without retraining.
|
|
45
|
+
|
|
46
|
+
```ts
|
|
47
|
+
import { UMAP } from 'umap-gpu';
|
|
48
|
+
|
|
49
|
+
const umap = new UMAP({ nNeighbors: 15, minDist: 0.1 });
|
|
50
|
+
|
|
51
|
+
// Train
|
|
52
|
+
await umap.fit(trainVectors);
|
|
53
|
+
console.log(umap.embedding); // Float32Array [nTrain × 2]
|
|
54
|
+
|
|
55
|
+
// Project new points (training embedding stays fixed)
|
|
56
|
+
const projected = await umap.transform(newVectors);
|
|
57
|
+
// Float32Array [nNew × 2]
|
|
40
58
|
```
|
|
41
59
|
|
|
42
|
-
|
|
60
|
+
## Options
|
|
43
61
|
|
|
44
62
|
```ts
|
|
45
|
-
const
|
|
46
|
-
nComponents: 2, // output dimensions
|
|
47
|
-
nNeighbors: 15, // k-NN graph degree
|
|
48
|
-
nEpochs: 500, // SGD iterations
|
|
49
|
-
minDist: 0.1, //
|
|
50
|
-
spread: 1.0, // scale of the embedding
|
|
63
|
+
const umap = new UMAP({
|
|
64
|
+
nComponents: 2, // output dimensions (default: 2)
|
|
65
|
+
nNeighbors: 15, // k-NN graph degree (default: 15)
|
|
66
|
+
nEpochs: 500, // SGD iterations (default: auto — 500 for <10k points, 200 otherwise)
|
|
67
|
+
minDist: 0.1, // min distance in embedding (default: 0.1)
|
|
68
|
+
spread: 1.0, // scale of the embedding (default: 1.0)
|
|
51
69
|
hnsw: {
|
|
52
|
-
M: 16, //
|
|
53
|
-
efConstruction: 200, // build-time search width
|
|
54
|
-
efSearch: 50, // query-time search width
|
|
70
|
+
M: 16, // graph connectivity (default: 16)
|
|
71
|
+
efConstruction: 200, // build-time search width (default: 200)
|
|
72
|
+
efSearch: 50, // query-time search width (default: 50)
|
|
55
73
|
},
|
|
56
74
|
});
|
|
75
|
+
|
|
76
|
+
// Same options work with the functional API
|
|
77
|
+
const embedding = await fit(vectors, { nNeighbors: 15, minDist: 0.05 });
|
|
57
78
|
```
|
|
58
79
|
|
|
59
|
-
|
|
80
|
+
## Check GPU availability
|
|
60
81
|
|
|
61
82
|
```ts
|
|
62
83
|
import { isWebGPUAvailable } from 'umap-gpu';
|
|
63
84
|
|
|
64
|
-
|
|
65
|
-
console.log('Will use WebGPU-accelerated SGD');
|
|
66
|
-
} else {
|
|
67
|
-
console.log('Will fall back to CPU SGD');
|
|
68
|
-
}
|
|
69
|
-
```
|
|
70
|
-
|
|
71
|
-
## Build
|
|
72
|
-
|
|
73
|
-
```bash
|
|
74
|
-
npm run build # compiles TypeScript to dist/
|
|
75
|
-
npm test # runs the unit test suite (Vitest)
|
|
85
|
+
console.log(isWebGPUAvailable()); // true → GPU path, false → CPU fallback
|
|
76
86
|
```
|
|
77
87
|
|
|
78
88
|
## Browser support
|
|
79
89
|
|
|
80
|
-
| Feature |
|
|
90
|
+
| Feature | Supported in |
|
|
81
91
|
|---------|-------------|
|
|
82
92
|
| WebGPU SGD | Chrome 113+, Edge 113+, Safari 18+ |
|
|
83
|
-
| CPU fallback | Any modern browser / Node.js |
|
|
84
|
-
| HNSW (WASM) | Any environment with WebAssembly
|
|
93
|
+
| CPU fallback | Any modern browser / Node.js / Bun |
|
|
94
|
+
| HNSW (WASM) | Any environment with WebAssembly |
|
|
95
|
+
|
|
96
|
+
## Development
|
|
97
|
+
|
|
98
|
+
```bash
|
|
99
|
+
npm test # Vitest unit tests
|
|
100
|
+
npm run build # TypeScript → dist/
|
|
101
|
+
```
|
|
85
102
|
|
|
86
103
|
## License
|
|
87
104
|
|
|
@@ -9,4 +9,21 @@ export interface CPUSgdParams {
|
|
|
9
9
|
* CPU fallback SGD optimizer for environments without WebGPU.
|
|
10
10
|
* Mirrors the GPU shader logic: per-edge attraction + negative-sample repulsion.
|
|
11
11
|
*/
|
|
12
|
-
export declare function cpuSgd(embedding: Float32Array, graph: FuzzyGraph, epochsPerSample: Float32Array, nVertices: number, nComponents: number, nEpochs: number, params: CPUSgdParams): Float32Array;
|
|
12
|
+
export declare function cpuSgd(embedding: Float32Array, graph: FuzzyGraph, epochsPerSample: Float32Array, nVertices: number, nComponents: number, nEpochs: number, params: CPUSgdParams, onProgress?: (epoch: number, nEpochs: number) => void): Float32Array;
|
|
13
|
+
/**
|
|
14
|
+
* CPU SGD for UMAP.transform(): optimizes only the new-point embeddings.
|
|
15
|
+
* The training embedding is read-only; attraction pulls new points toward
|
|
16
|
+
* their training neighbors, and repulsion pushes them away from random
|
|
17
|
+
* training points.
|
|
18
|
+
*
|
|
19
|
+
* @param embeddingNew - New-point embeddings to optimize [nNew × nComponents]
|
|
20
|
+
* @param embeddingTrain - Fixed training embeddings [nTrain × nComponents]
|
|
21
|
+
* @param graph - Bipartite graph: rows=new-point indices, cols=training-point indices
|
|
22
|
+
* @param epochsPerSample - Per-edge epoch sampling schedule
|
|
23
|
+
* @param nNew - Number of new points
|
|
24
|
+
* @param nTrain - Number of training points
|
|
25
|
+
* @param nComponents - Embedding dimensionality
|
|
26
|
+
* @param nEpochs - Number of optimization epochs
|
|
27
|
+
* @param params - UMAP curve parameters
|
|
28
|
+
*/
|
|
29
|
+
export declare function cpuSgdTransform(embeddingNew: Float32Array, embeddingTrain: Float32Array, graph: FuzzyGraph, epochsPerSample: Float32Array, nNew: number, nTrain: number, nComponents: number, nEpochs: number, params: CPUSgdParams, onProgress?: (epoch: number, nEpochs: number) => void): Float32Array;
|
package/dist/fuzzy-set.d.ts
CHANGED
|
@@ -10,3 +10,16 @@ export interface FuzzyGraph {
|
|
|
10
10
|
* (sigmas, rhos) and symmetrizes with the fuzzy set union operation.
|
|
11
11
|
*/
|
|
12
12
|
export declare function computeFuzzySimplicialSet(knnIndices: number[][], knnDistances: number[][], nNeighbors: number, setOpMixRatio?: number): FuzzyGraph;
|
|
13
|
+
/**
|
|
14
|
+
* Compute the fuzzy weight graph between new (query) points and training points.
|
|
15
|
+
* Used by UMAP.transform() to project unseen data into an existing embedding.
|
|
16
|
+
*
|
|
17
|
+
* Unlike computeFuzzySimplicialSet, this produces a bipartite graph
|
|
18
|
+
* (new points → training points) with no symmetrization.
|
|
19
|
+
*
|
|
20
|
+
* @param knnIndices - For each new point, the indices of its training neighbors
|
|
21
|
+
* @param knnDistances - For each new point, the distances to those neighbors
|
|
22
|
+
* @param nNeighbors - Number of neighbors used
|
|
23
|
+
* @returns FuzzyGraph where rows are new-point indices, cols are training-point indices
|
|
24
|
+
*/
|
|
25
|
+
export declare function computeTransformFuzzyWeights(knnIndices: number[][], knnDistances: number[][], nNeighbors: number): FuzzyGraph;
|
package/dist/gpu/sgd.d.ts
CHANGED
|
@@ -25,6 +25,6 @@ export declare class GPUSgd {
|
|
|
25
25
|
* @param params - UMAP curve parameters and repulsion settings
|
|
26
26
|
* @returns Optimized embedding as Float32Array
|
|
27
27
|
*/
|
|
28
|
-
optimize(embedding: Float32Array, head: Uint32Array, tail: Uint32Array, epochsPerSample: Float32Array, nVertices: number, nComponents: number, nEpochs: number, params: SGDParams): Promise<Float32Array>;
|
|
28
|
+
optimize(embedding: Float32Array, head: Uint32Array, tail: Uint32Array, epochsPerSample: Float32Array, nVertices: number, nComponents: number, nEpochs: number, params: SGDParams, onProgress?: (epoch: number, nEpochs: number) => void): Promise<Float32Array>;
|
|
29
29
|
private makeBuffer;
|
|
30
30
|
}
|
package/dist/hnsw-knn.d.ts
CHANGED
|
@@ -7,9 +7,24 @@ export interface HNSWOptions {
|
|
|
7
7
|
efConstruction?: number;
|
|
8
8
|
efSearch?: number;
|
|
9
9
|
}
|
|
10
|
+
/**
|
|
11
|
+
* A built HNSW index that can be queried to find nearest neighbors in the
|
|
12
|
+
* training data for new (unseen) points — used by UMAP.transform().
|
|
13
|
+
*/
|
|
14
|
+
export interface HNSWSearchableIndex {
|
|
15
|
+
searchKnn(queryVectors: number[][], nNeighbors: number): KNNResult;
|
|
16
|
+
}
|
|
10
17
|
/**
|
|
11
18
|
* Compute k-nearest neighbors using HNSW (Hierarchical Navigable Small World)
|
|
12
19
|
* via hnswlib-wasm, replacing the O(n^2) brute-force search in umap-js with
|
|
13
20
|
* an O(n log n) approximate nearest neighbor search.
|
|
14
21
|
*/
|
|
15
22
|
export declare function computeKNN(vectors: number[][], nNeighbors: number, opts?: HNSWOptions): Promise<KNNResult>;
|
|
23
|
+
/**
|
|
24
|
+
* Like computeKNN, but also returns the built HNSW index so it can be reused
|
|
25
|
+
* later to project new points (used by UMAP.transform()).
|
|
26
|
+
*/
|
|
27
|
+
export declare function computeKNNWithIndex(vectors: number[][], nNeighbors: number, opts?: HNSWOptions): Promise<{
|
|
28
|
+
knn: KNNResult;
|
|
29
|
+
index: HNSWSearchableIndex;
|
|
30
|
+
}>;
|
package/dist/index.d.ts
CHANGED
|
@@ -1,5 +1,5 @@
|
|
|
1
|
-
export { fit } from './umap';
|
|
2
|
-
export type { UMAPOptions } from './umap';
|
|
3
|
-
export type { KNNResult, HNSWOptions } from './hnsw-knn';
|
|
1
|
+
export { fit, UMAP } from './umap';
|
|
2
|
+
export type { UMAPOptions, ProgressCallback } from './umap';
|
|
3
|
+
export type { KNNResult, HNSWOptions, HNSWSearchableIndex } from './hnsw-knn';
|
|
4
4
|
export type { FuzzyGraph } from './fuzzy-set';
|
|
5
5
|
export { isWebGPUAvailable } from './gpu/device';
|
package/dist/index.js
CHANGED
|
@@ -1,65 +1,98 @@
|
|
|
1
|
-
var
|
|
2
|
-
var
|
|
3
|
-
var
|
|
4
|
-
import { loadHnswlib as
|
|
5
|
-
async function
|
|
6
|
-
const { M:
|
|
7
|
-
o.initIndex(c,
|
|
8
|
-
const
|
|
9
|
-
for (let
|
|
10
|
-
const
|
|
11
|
-
|
|
1
|
+
var H = Object.defineProperty;
|
|
2
|
+
var V = (t, e, i) => e in t ? H(t, e, { enumerable: !0, configurable: !0, writable: !0, value: i }) : t[e] = i;
|
|
3
|
+
var E = (t, e, i) => V(t, typeof e != "symbol" ? e + "" : e, i);
|
|
4
|
+
import { loadHnswlib as C } from "hnswlib-wasm";
|
|
5
|
+
async function Y(t, e, i = {}) {
|
|
6
|
+
const { M: a = 16, efConstruction: s = 200, efSearch: p = 50 } = i, f = await C(), h = t[0].length, c = t.length, o = new f.HierarchicalNSW("l2", h, "");
|
|
7
|
+
o.initIndex(c, a, s, 200), o.setEfSearch(Math.max(p, e)), o.addItems(t, !1);
|
|
8
|
+
const n = [], r = [];
|
|
9
|
+
for (let l = 0; l < c; l++) {
|
|
10
|
+
const d = o.searchKnn(t[l], e + 1, void 0), u = d.neighbors.map((g, _) => ({ idx: g, dist: d.distances[_] })).filter(({ idx: g }) => g !== l).slice(0, e);
|
|
11
|
+
n.push(u.map(({ idx: g }) => g)), r.push(u.map(({ dist: g }) => g));
|
|
12
12
|
}
|
|
13
|
-
return { indices:
|
|
13
|
+
return { indices: n, distances: r };
|
|
14
14
|
}
|
|
15
|
-
function
|
|
16
|
-
const
|
|
17
|
-
|
|
18
|
-
|
|
19
|
-
|
|
20
|
-
|
|
15
|
+
async function J(t, e, i = {}) {
|
|
16
|
+
const { M: a = 16, efConstruction: s = 200, efSearch: p = 50 } = i, f = await C(), h = t[0].length, c = t.length, o = new f.HierarchicalNSW("l2", h, "");
|
|
17
|
+
o.initIndex(c, a, s, 200), o.setEfSearch(Math.max(p, e)), o.addItems(t, !1);
|
|
18
|
+
const n = [], r = [];
|
|
19
|
+
for (let d = 0; d < c; d++) {
|
|
20
|
+
const u = o.searchKnn(t[d], e + 1, void 0), g = u.neighbors.map((_, w) => ({ idx: _, dist: u.distances[w] })).filter(({ idx: _ }) => _ !== d).slice(0, e);
|
|
21
|
+
n.push(g.map(({ idx: _ }) => _)), r.push(g.map(({ dist: _ }) => _));
|
|
22
|
+
}
|
|
23
|
+
return { knn: { indices: n, distances: r }, index: {
|
|
24
|
+
searchKnn(d, u) {
|
|
25
|
+
const g = [], _ = [];
|
|
26
|
+
for (const w of d) {
|
|
27
|
+
const y = o.searchKnn(w, u, void 0), b = y.neighbors.map((M, x) => ({ idx: M, dist: y.distances[x] })).sort((M, x) => M.dist - x.dist).slice(0, u);
|
|
28
|
+
g.push(b.map(({ idx: M }) => M)), _.push(b.map(({ dist: M }) => M));
|
|
29
|
+
}
|
|
30
|
+
return { indices: g, distances: _ };
|
|
31
|
+
}
|
|
32
|
+
} };
|
|
33
|
+
}
|
|
34
|
+
function D(t, e, i, a = 1) {
|
|
35
|
+
const s = t.length, { sigmas: p, rhos: f } = L(e, i), h = [], c = [], o = [];
|
|
36
|
+
for (let r = 0; r < s; r++)
|
|
37
|
+
for (let l = 0; l < t[r].length; l++) {
|
|
38
|
+
const d = e[r][l], u = d <= f[r] ? 1 : Math.exp(-((d - f[r]) / p[r]));
|
|
39
|
+
h.push(r), c.push(t[r][l]), o.push(u);
|
|
40
|
+
}
|
|
41
|
+
return { ...X(h, c, o, s, a), nVertices: s };
|
|
42
|
+
}
|
|
43
|
+
function Q(t, e, i) {
|
|
44
|
+
const a = t.length, { sigmas: s, rhos: p } = L(e, i), f = [], h = [], c = [];
|
|
45
|
+
for (let o = 0; o < a; o++)
|
|
46
|
+
for (let n = 0; n < t[o].length; n++) {
|
|
47
|
+
const r = e[o][n], l = r <= p[o] ? 1 : Math.exp(-((r - p[o]) / s[o]));
|
|
48
|
+
f.push(o), h.push(t[o][n]), c.push(l);
|
|
21
49
|
}
|
|
22
|
-
return {
|
|
50
|
+
return {
|
|
51
|
+
rows: new Float32Array(f),
|
|
52
|
+
cols: new Float32Array(h),
|
|
53
|
+
vals: new Float32Array(c),
|
|
54
|
+
nVertices: a
|
|
55
|
+
};
|
|
23
56
|
}
|
|
24
|
-
function
|
|
25
|
-
const
|
|
26
|
-
for (let
|
|
27
|
-
const
|
|
28
|
-
|
|
29
|
-
let c = 0, o = 1 / 0,
|
|
30
|
-
const
|
|
31
|
-
for (let
|
|
32
|
-
let
|
|
33
|
-
for (let
|
|
34
|
-
|
|
35
|
-
if (Math.abs(
|
|
36
|
-
|
|
57
|
+
function L(t, e) {
|
|
58
|
+
const a = t.length, s = new Float32Array(a), p = new Float32Array(a);
|
|
59
|
+
for (let f = 0; f < a; f++) {
|
|
60
|
+
const h = t[f];
|
|
61
|
+
p[f] = h.find((l) => l > 0) ?? 0;
|
|
62
|
+
let c = 0, o = 1 / 0, n = 1;
|
|
63
|
+
const r = Math.log2(e);
|
|
64
|
+
for (let l = 0; l < 64; l++) {
|
|
65
|
+
let d = 0;
|
|
66
|
+
for (let u = 1; u < h.length; u++)
|
|
67
|
+
d += Math.exp(-Math.max(0, h[u] - p[f]) / n);
|
|
68
|
+
if (Math.abs(d - r) < 1e-5) break;
|
|
69
|
+
d > r ? (o = n, n = (c + o) / 2) : (c = n, n = o === 1 / 0 ? n * 2 : (c + o) / 2);
|
|
37
70
|
}
|
|
38
|
-
|
|
71
|
+
s[f] = n;
|
|
39
72
|
}
|
|
40
|
-
return { sigmas:
|
|
73
|
+
return { sigmas: s, rhos: p };
|
|
41
74
|
}
|
|
42
|
-
function
|
|
43
|
-
const
|
|
44
|
-
const
|
|
45
|
-
|
|
75
|
+
function X(t, e, i, a, s) {
|
|
76
|
+
const p = /* @__PURE__ */ new Map(), f = (n, r, l) => {
|
|
77
|
+
const d = n * a + r;
|
|
78
|
+
p.set(d, (p.get(d) ?? 0) + l);
|
|
46
79
|
};
|
|
47
|
-
for (let
|
|
48
|
-
|
|
49
|
-
const
|
|
50
|
-
for (const [
|
|
51
|
-
const
|
|
52
|
-
|
|
53
|
-
|
|
80
|
+
for (let n = 0; n < t.length; n++)
|
|
81
|
+
f(t[n], e[n], i[n]), f(e[n], t[n], i[n]);
|
|
82
|
+
const h = [], c = [], o = [];
|
|
83
|
+
for (const [n, r] of p.entries()) {
|
|
84
|
+
const l = Math.floor(n / a), d = n % a;
|
|
85
|
+
h.push(l), c.push(d), o.push(
|
|
86
|
+
r > 1 ? s * (2 - r) + (1 - s) * (r - 1) : r
|
|
54
87
|
);
|
|
55
88
|
}
|
|
56
89
|
return {
|
|
57
|
-
rows: new Float32Array(
|
|
90
|
+
rows: new Float32Array(h),
|
|
58
91
|
cols: new Float32Array(c),
|
|
59
92
|
vals: new Float32Array(o)
|
|
60
93
|
};
|
|
61
94
|
}
|
|
62
|
-
const
|
|
95
|
+
const Z = `// UMAP SGD compute shader — processes one graph edge per GPU thread.
|
|
63
96
|
// Applies attraction forces between connected nodes and repulsion forces
|
|
64
97
|
// against negative samples.
|
|
65
98
|
|
|
@@ -162,18 +195,18 @@ fn main(@builtin(global_invocation_id) gid: vec3<u32>) {
|
|
|
162
195
|
epochs_per_sample[edge_idx] / f32(params.negative_sample_rate);
|
|
163
196
|
}
|
|
164
197
|
`;
|
|
165
|
-
class
|
|
198
|
+
class W {
|
|
166
199
|
constructor() {
|
|
167
|
-
|
|
168
|
-
|
|
200
|
+
E(this, "device");
|
|
201
|
+
E(this, "pipeline");
|
|
169
202
|
}
|
|
170
203
|
async init() {
|
|
171
|
-
const
|
|
172
|
-
if (!
|
|
173
|
-
this.device = await
|
|
204
|
+
const e = await navigator.gpu.requestAdapter();
|
|
205
|
+
if (!e) throw new Error("WebGPU not supported");
|
|
206
|
+
this.device = await e.requestDevice(), this.pipeline = this.device.createComputePipeline({
|
|
174
207
|
layout: "auto",
|
|
175
208
|
compute: {
|
|
176
|
-
module: this.device.createShaderModule({ code:
|
|
209
|
+
module: this.device.createShaderModule({ code: Z }),
|
|
177
210
|
entryPoint: "main"
|
|
178
211
|
}
|
|
179
212
|
});
|
|
@@ -191,171 +224,347 @@ class j {
|
|
|
191
224
|
* @param params - UMAP curve parameters and repulsion settings
|
|
192
225
|
* @returns Optimized embedding as Float32Array
|
|
193
226
|
*/
|
|
194
|
-
async optimize(
|
|
195
|
-
const { device:
|
|
196
|
-
|
|
227
|
+
async optimize(e, i, a, s, p, f, h, c, o) {
|
|
228
|
+
const { device: n } = this, r = i.length, l = this.makeBuffer(
|
|
229
|
+
e,
|
|
197
230
|
GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC
|
|
198
|
-
),
|
|
199
|
-
for (let
|
|
200
|
-
|
|
201
|
-
const
|
|
202
|
-
for (let
|
|
203
|
-
|
|
204
|
-
const
|
|
231
|
+
), d = this.makeBuffer(i, GPUBufferUsage.STORAGE), u = this.makeBuffer(a, GPUBufferUsage.STORAGE), g = this.makeBuffer(s, GPUBufferUsage.STORAGE), _ = new Float32Array(r).fill(0), w = this.makeBuffer(_, GPUBufferUsage.STORAGE), y = new Float32Array(r);
|
|
232
|
+
for (let m = 0; m < r; m++)
|
|
233
|
+
y[m] = s[m] / c.negativeSampleRate;
|
|
234
|
+
const b = this.makeBuffer(y, GPUBufferUsage.STORAGE), M = new Uint32Array(r);
|
|
235
|
+
for (let m = 0; m < r; m++)
|
|
236
|
+
M[m] = Math.random() * 4294967295 | 0;
|
|
237
|
+
const x = this.makeBuffer(M, GPUBufferUsage.STORAGE), B = n.createBuffer({
|
|
205
238
|
size: 40,
|
|
206
239
|
usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST
|
|
207
240
|
});
|
|
208
|
-
for (let
|
|
209
|
-
const
|
|
210
|
-
|
|
211
|
-
const
|
|
241
|
+
for (let m = 0; m < h; m++) {
|
|
242
|
+
const F = 1 - m / h, A = new ArrayBuffer(40), v = new Uint32Array(A), N = new Float32Array(A);
|
|
243
|
+
v[0] = r, v[1] = p, v[2] = f, v[3] = m, v[4] = h, N[5] = F, N[6] = c.a, N[7] = c.b, N[8] = c.gamma, v[9] = c.negativeSampleRate, n.queue.writeBuffer(B, 0, A);
|
|
244
|
+
const S = n.createBindGroup({
|
|
212
245
|
layout: this.pipeline.getBindGroupLayout(0),
|
|
213
246
|
entries: [
|
|
214
|
-
{ binding: 0, resource: { buffer:
|
|
215
|
-
{ binding: 1, resource: { buffer:
|
|
216
|
-
{ binding: 2, resource: { buffer:
|
|
217
|
-
{ binding: 3, resource: { buffer:
|
|
247
|
+
{ binding: 0, resource: { buffer: g } },
|
|
248
|
+
{ binding: 1, resource: { buffer: d } },
|
|
249
|
+
{ binding: 2, resource: { buffer: u } },
|
|
250
|
+
{ binding: 3, resource: { buffer: l } },
|
|
218
251
|
{ binding: 4, resource: { buffer: w } },
|
|
219
|
-
{ binding: 5, resource: { buffer:
|
|
220
|
-
{ binding: 6, resource: { buffer:
|
|
221
|
-
{ binding: 7, resource: { buffer:
|
|
252
|
+
{ binding: 5, resource: { buffer: b } },
|
|
253
|
+
{ binding: 6, resource: { buffer: B } },
|
|
254
|
+
{ binding: 7, resource: { buffer: x } }
|
|
222
255
|
]
|
|
223
|
-
}),
|
|
224
|
-
|
|
256
|
+
}), k = n.createCommandEncoder(), U = k.beginComputePass();
|
|
257
|
+
U.setPipeline(this.pipeline), U.setBindGroup(0, S), U.dispatchWorkgroups(Math.ceil(r / 256)), U.end(), n.queue.submit([k.finish()]), m % 10 === 0 && (await n.queue.onSubmittedWorkDone(), o == null || o(m, h));
|
|
225
258
|
}
|
|
226
|
-
const
|
|
227
|
-
size:
|
|
259
|
+
const G = n.createBuffer({
|
|
260
|
+
size: e.byteLength,
|
|
228
261
|
usage: GPUBufferUsage.COPY_DST | GPUBufferUsage.MAP_READ
|
|
229
|
-
}),
|
|
230
|
-
|
|
231
|
-
const
|
|
232
|
-
return
|
|
262
|
+
}), R = n.createCommandEncoder();
|
|
263
|
+
R.copyBufferToBuffer(l, 0, G, 0, e.byteLength), n.queue.submit([R.finish()]), await G.mapAsync(GPUMapMode.READ);
|
|
264
|
+
const O = new Float32Array(G.getMappedRange().slice(0));
|
|
265
|
+
return G.unmap(), l.destroy(), d.destroy(), u.destroy(), g.destroy(), w.destroy(), b.destroy(), x.destroy(), B.destroy(), G.destroy(), O;
|
|
233
266
|
}
|
|
234
|
-
makeBuffer(
|
|
235
|
-
const
|
|
236
|
-
size:
|
|
237
|
-
usage:
|
|
267
|
+
makeBuffer(e, i) {
|
|
268
|
+
const a = this.device.createBuffer({
|
|
269
|
+
size: e.byteLength,
|
|
270
|
+
usage: i,
|
|
238
271
|
mappedAtCreation: !0
|
|
239
272
|
});
|
|
240
|
-
return
|
|
273
|
+
return e instanceof Float32Array ? new Float32Array(a.getMappedRange()).set(e) : new Uint32Array(a.getMappedRange()).set(e), a.unmap(), a;
|
|
241
274
|
}
|
|
242
275
|
}
|
|
243
|
-
function
|
|
244
|
-
|
|
245
|
-
|
|
246
|
-
|
|
247
|
-
|
|
248
|
-
|
|
276
|
+
function P(t) {
|
|
277
|
+
return Math.max(-4, Math.min(4, t));
|
|
278
|
+
}
|
|
279
|
+
function q(t, e, i, a, s, p, f, h) {
|
|
280
|
+
const { a: c, b: o, gamma: n = 1, negativeSampleRate: r = 5 } = f, l = e.rows.length, d = new Uint32Array(e.rows), u = new Uint32Array(e.cols), g = new Float32Array(l).fill(0), _ = new Float32Array(l);
|
|
281
|
+
for (let w = 0; w < l; w++)
|
|
282
|
+
_[w] = i[w] / r;
|
|
283
|
+
for (let w = 0; w < p; w++) {
|
|
284
|
+
h == null || h(w, p);
|
|
285
|
+
const y = 1 - w / p;
|
|
286
|
+
for (let b = 0; b < l; b++) {
|
|
287
|
+
if (g[b] > w) continue;
|
|
288
|
+
const M = d[b], x = u[b];
|
|
289
|
+
let B = 0;
|
|
290
|
+
for (let m = 0; m < s; m++) {
|
|
291
|
+
const F = t[M * s + m] - t[x * s + m];
|
|
292
|
+
B += F * F;
|
|
293
|
+
}
|
|
294
|
+
const G = Math.pow(B, o), R = -2 * c * o * (B > 0 ? G / B : 0) / (c * G + 1);
|
|
295
|
+
for (let m = 0; m < s; m++) {
|
|
296
|
+
const F = t[M * s + m] - t[x * s + m], A = P(R * F);
|
|
297
|
+
t[M * s + m] += y * A;
|
|
298
|
+
}
|
|
299
|
+
g[b] += i[b];
|
|
300
|
+
const O = _[b] > 0 ? Math.floor(i[b] / _[b]) : 0;
|
|
301
|
+
for (let m = 0; m < O; m++) {
|
|
302
|
+
const F = Math.floor(Math.random() * a);
|
|
303
|
+
if (F === M) continue;
|
|
304
|
+
let A = 0;
|
|
305
|
+
for (let S = 0; S < s; S++) {
|
|
306
|
+
const k = t[M * s + S] - t[F * s + S];
|
|
307
|
+
A += k * k;
|
|
308
|
+
}
|
|
309
|
+
const v = Math.pow(A, o), N = 2 * n * o / ((1e-3 + A) * (c * v + 1));
|
|
310
|
+
for (let S = 0; S < s; S++) {
|
|
311
|
+
const k = t[M * s + S] - t[F * s + S], U = P(N * k);
|
|
312
|
+
t[M * s + S] += y * U;
|
|
313
|
+
}
|
|
314
|
+
}
|
|
315
|
+
_[b] += i[b] / r;
|
|
316
|
+
}
|
|
249
317
|
}
|
|
250
|
-
|
|
251
|
-
|
|
252
|
-
|
|
253
|
-
|
|
254
|
-
|
|
255
|
-
|
|
256
|
-
|
|
257
|
-
|
|
258
|
-
|
|
318
|
+
return t;
|
|
319
|
+
}
|
|
320
|
+
function $(t, e, i, a, s, p, f, h, c, o) {
|
|
321
|
+
const { a: n, b: r, gamma: l = 1, negativeSampleRate: d = 5 } = c, u = i.rows.length, g = new Uint32Array(i.rows), _ = new Uint32Array(i.cols), w = new Float32Array(u).fill(0), y = new Float32Array(u);
|
|
322
|
+
for (let b = 0; b < u; b++)
|
|
323
|
+
y[b] = a[b] / d;
|
|
324
|
+
for (let b = 0; b < h; b++) {
|
|
325
|
+
const M = 1 - b / h;
|
|
326
|
+
for (let x = 0; x < u; x++) {
|
|
327
|
+
if (w[x] > b) continue;
|
|
328
|
+
const B = g[x], G = _[x];
|
|
329
|
+
let R = 0;
|
|
330
|
+
for (let A = 0; A < f; A++) {
|
|
331
|
+
const v = t[B * f + A] - e[G * f + A];
|
|
332
|
+
R += v * v;
|
|
259
333
|
}
|
|
260
|
-
const
|
|
261
|
-
for (let
|
|
262
|
-
const
|
|
263
|
-
|
|
334
|
+
const O = Math.pow(R, r), m = -2 * n * r * (R > 0 ? O / R : 0) / (n * O + 1);
|
|
335
|
+
for (let A = 0; A < f; A++) {
|
|
336
|
+
const v = t[B * f + A] - e[G * f + A];
|
|
337
|
+
t[B * f + A] += M * P(m * v);
|
|
264
338
|
}
|
|
265
|
-
|
|
266
|
-
const
|
|
267
|
-
for (let
|
|
268
|
-
const
|
|
269
|
-
if (
|
|
270
|
-
let
|
|
271
|
-
for (let
|
|
272
|
-
const
|
|
273
|
-
|
|
339
|
+
w[x] += a[x];
|
|
340
|
+
const F = y[x] > 0 ? Math.floor(a[x] / y[x]) : 0;
|
|
341
|
+
for (let A = 0; A < F; A++) {
|
|
342
|
+
const v = Math.floor(Math.random() * p);
|
|
343
|
+
if (v === G) continue;
|
|
344
|
+
let N = 0;
|
|
345
|
+
for (let U = 0; U < f; U++) {
|
|
346
|
+
const z = t[B * f + U] - e[v * f + U];
|
|
347
|
+
N += z * z;
|
|
274
348
|
}
|
|
275
|
-
const
|
|
276
|
-
for (let
|
|
277
|
-
const
|
|
278
|
-
|
|
349
|
+
const S = Math.pow(N, r), k = 2 * l * r / ((1e-3 + N) * (n * S + 1));
|
|
350
|
+
for (let U = 0; U < f; U++) {
|
|
351
|
+
const z = t[B * f + U] - e[v * f + U];
|
|
352
|
+
t[B * f + U] += M * P(k * z);
|
|
279
353
|
}
|
|
280
354
|
}
|
|
281
|
-
|
|
355
|
+
y[x] += a[x] / d;
|
|
282
356
|
}
|
|
283
357
|
}
|
|
284
|
-
return
|
|
358
|
+
return t;
|
|
285
359
|
}
|
|
286
|
-
function
|
|
360
|
+
function K() {
|
|
287
361
|
return typeof navigator < "u" && !!navigator.gpu;
|
|
288
362
|
}
|
|
289
|
-
async function
|
|
363
|
+
async function ae(t, e = {}, i) {
|
|
290
364
|
const {
|
|
291
365
|
nComponents: a = 2,
|
|
292
|
-
nNeighbors:
|
|
293
|
-
minDist:
|
|
294
|
-
spread:
|
|
295
|
-
hnsw:
|
|
296
|
-
} =
|
|
366
|
+
nNeighbors: s = 15,
|
|
367
|
+
minDist: p = 0.1,
|
|
368
|
+
spread: f = 1,
|
|
369
|
+
hnsw: h = {}
|
|
370
|
+
} = e, c = e.nEpochs ?? (t.length > 1e4 ? 200 : 500);
|
|
297
371
|
console.time("knn");
|
|
298
|
-
const { indices:
|
|
299
|
-
M:
|
|
300
|
-
efConstruction:
|
|
301
|
-
efSearch:
|
|
372
|
+
const { indices: o, distances: n } = await Y(t, s, {
|
|
373
|
+
M: h.M ?? 16,
|
|
374
|
+
efConstruction: h.efConstruction ?? 200,
|
|
375
|
+
efSearch: h.efSearch ?? 50
|
|
302
376
|
});
|
|
303
377
|
console.timeEnd("knn"), console.time("fuzzy-set");
|
|
304
|
-
const r =
|
|
378
|
+
const r = D(o, n, s);
|
|
305
379
|
console.timeEnd("fuzzy-set");
|
|
306
|
-
const { a:
|
|
307
|
-
for (let
|
|
308
|
-
_[
|
|
380
|
+
const { a: l, b: d } = j(p, f), u = I(r.vals, c), g = t.length, _ = new Float32Array(g * a);
|
|
381
|
+
for (let y = 0; y < _.length; y++)
|
|
382
|
+
_[y] = Math.random() * 20 - 10;
|
|
309
383
|
console.time("sgd");
|
|
310
384
|
let w;
|
|
311
|
-
if (
|
|
385
|
+
if (K())
|
|
312
386
|
try {
|
|
313
|
-
const
|
|
314
|
-
await
|
|
387
|
+
const y = new W();
|
|
388
|
+
await y.init(), w = await y.optimize(
|
|
315
389
|
_,
|
|
316
390
|
new Uint32Array(r.rows),
|
|
317
391
|
new Uint32Array(r.cols),
|
|
318
|
-
l,
|
|
319
|
-
m,
|
|
320
|
-
a,
|
|
321
392
|
u,
|
|
322
|
-
|
|
393
|
+
g,
|
|
394
|
+
a,
|
|
395
|
+
c,
|
|
396
|
+
{ a: l, b: d, gamma: 1, negativeSampleRate: 5 },
|
|
397
|
+
i
|
|
323
398
|
);
|
|
324
|
-
} catch (
|
|
325
|
-
console.warn("WebGPU SGD failed, falling back to CPU:",
|
|
399
|
+
} catch (y) {
|
|
400
|
+
console.warn("WebGPU SGD failed, falling back to CPU:", y), w = q(_, r, u, g, a, c, { a: l, b: d }, i);
|
|
326
401
|
}
|
|
327
402
|
else
|
|
328
|
-
w =
|
|
403
|
+
w = q(_, r, u, g, a, c, { a: l, b: d }, i);
|
|
329
404
|
return console.timeEnd("sgd"), w;
|
|
330
405
|
}
|
|
331
|
-
function
|
|
332
|
-
if (Math.abs(
|
|
406
|
+
function j(t, e) {
|
|
407
|
+
if (Math.abs(e - 1) < 1e-6 && Math.abs(t - 0.1) < 1e-6)
|
|
333
408
|
return { a: 1.9292, b: 0.7915 };
|
|
334
|
-
if (Math.abs(
|
|
409
|
+
if (Math.abs(e - 1) < 1e-6 && Math.abs(t - 0) < 1e-6)
|
|
335
410
|
return { a: 1.8956, b: 0.8006 };
|
|
336
|
-
if (Math.abs(
|
|
411
|
+
if (Math.abs(e - 1) < 1e-6 && Math.abs(t - 0.5) < 1e-6)
|
|
337
412
|
return { a: 1.5769, b: 0.8951 };
|
|
338
|
-
const
|
|
339
|
-
return { a:
|
|
413
|
+
const i = ee(t, e);
|
|
414
|
+
return { a: te(t, e, i), b: i };
|
|
340
415
|
}
|
|
341
|
-
function
|
|
342
|
-
return 1 / (
|
|
416
|
+
function ee(t, e) {
|
|
417
|
+
return 1 / (e * 1.2);
|
|
343
418
|
}
|
|
344
|
-
function
|
|
345
|
-
return
|
|
419
|
+
function te(t, e, i) {
|
|
420
|
+
return t < 1e-6 ? 1.8956 : (1 / (1 + 1e-3) - 1) / -Math.pow(t, 2 * i);
|
|
421
|
+
}
|
|
422
|
+
class re {
|
|
423
|
+
constructor(e = {}) {
|
|
424
|
+
E(this, "_nComponents");
|
|
425
|
+
E(this, "_nNeighbors");
|
|
426
|
+
E(this, "_minDist");
|
|
427
|
+
E(this, "_spread");
|
|
428
|
+
E(this, "_nEpochs");
|
|
429
|
+
E(this, "_hnswOpts");
|
|
430
|
+
E(this, "_a");
|
|
431
|
+
E(this, "_b");
|
|
432
|
+
/** The low-dimensional embedding produced by the last fit() call. */
|
|
433
|
+
E(this, "embedding", null);
|
|
434
|
+
E(this, "_hnswIndex", null);
|
|
435
|
+
E(this, "_nTrain", 0);
|
|
436
|
+
this._nComponents = e.nComponents ?? 2, this._nNeighbors = e.nNeighbors ?? 15, this._minDist = e.minDist ?? 0.1, this._spread = e.spread ?? 1, this._nEpochs = e.nEpochs, this._hnswOpts = e.hnsw ?? {};
|
|
437
|
+
const { a: i, b: a } = j(this._minDist, this._spread);
|
|
438
|
+
this._a = i, this._b = a;
|
|
439
|
+
}
|
|
440
|
+
/**
|
|
441
|
+
* Train UMAP on `vectors`.
|
|
442
|
+
* Stores the resulting embedding in `this.embedding` and retains the HNSW
|
|
443
|
+
* index so that transform() can project new points later.
|
|
444
|
+
* Returns `this` for chaining.
|
|
445
|
+
*/
|
|
446
|
+
async fit(e, i) {
|
|
447
|
+
const a = e.length, s = this._nEpochs ?? (a > 1e4 ? 200 : 500), { M: p = 16, efConstruction: f = 200, efSearch: h = 50 } = this._hnswOpts;
|
|
448
|
+
console.time("knn");
|
|
449
|
+
const { knn: c, index: o } = await J(e, this._nNeighbors, {
|
|
450
|
+
M: p,
|
|
451
|
+
efConstruction: f,
|
|
452
|
+
efSearch: h
|
|
453
|
+
});
|
|
454
|
+
this._hnswIndex = o, this._nTrain = a, console.timeEnd("knn"), console.time("fuzzy-set");
|
|
455
|
+
const n = D(c.indices, c.distances, this._nNeighbors);
|
|
456
|
+
console.timeEnd("fuzzy-set");
|
|
457
|
+
const r = I(n.vals, s), l = new Float32Array(a * this._nComponents);
|
|
458
|
+
for (let d = 0; d < l.length; d++)
|
|
459
|
+
l[d] = Math.random() * 20 - 10;
|
|
460
|
+
if (console.time("sgd"), K())
|
|
461
|
+
try {
|
|
462
|
+
const d = new W();
|
|
463
|
+
await d.init(), this.embedding = await d.optimize(
|
|
464
|
+
l,
|
|
465
|
+
new Uint32Array(n.rows),
|
|
466
|
+
new Uint32Array(n.cols),
|
|
467
|
+
r,
|
|
468
|
+
a,
|
|
469
|
+
this._nComponents,
|
|
470
|
+
s,
|
|
471
|
+
{ a: this._a, b: this._b, gamma: 1, negativeSampleRate: 5 },
|
|
472
|
+
i
|
|
473
|
+
);
|
|
474
|
+
} catch (d) {
|
|
475
|
+
console.warn("WebGPU SGD failed, falling back to CPU:", d), this.embedding = q(l, n, r, a, this._nComponents, s, {
|
|
476
|
+
a: this._a,
|
|
477
|
+
b: this._b
|
|
478
|
+
}, i);
|
|
479
|
+
}
|
|
480
|
+
else
|
|
481
|
+
this.embedding = q(l, n, r, a, this._nComponents, s, {
|
|
482
|
+
a: this._a,
|
|
483
|
+
b: this._b
|
|
484
|
+
}, i);
|
|
485
|
+
return console.timeEnd("sgd"), this;
|
|
486
|
+
}
|
|
487
|
+
/**
|
|
488
|
+
* Project new (unseen) `vectors` into the embedding space learned by fit().
|
|
489
|
+
* Must be called after fit().
|
|
490
|
+
*
|
|
491
|
+
* The training embedding is kept fixed; only the new-point positions are
|
|
492
|
+
* optimised. Returns a Float32Array of shape [vectors.length × nComponents].
|
|
493
|
+
*
|
|
494
|
+
* @param normalize - When `true`, min-max normalise each dimension of the
|
|
495
|
+
* returned embedding to [0, 1]. The stored training embedding is never
|
|
496
|
+
* mutated. Defaults to `false`.
|
|
497
|
+
*/
|
|
498
|
+
async transform(e, i = !1) {
|
|
499
|
+
if (!this._hnswIndex || !this.embedding)
|
|
500
|
+
throw new Error("UMAP.transform() must be called after fit()");
|
|
501
|
+
const a = e.length, s = this._nEpochs ?? (this._nTrain > 1e4 ? 200 : 500), p = Math.max(100, Math.floor(s / 4)), f = this._hnswIndex.searchKnn(e, this._nNeighbors), h = Q(f.indices, f.distances, this._nNeighbors), c = new Uint32Array(h.rows), o = new Uint32Array(h.cols), n = new Float32Array(a), r = new Float32Array(a * this._nComponents);
|
|
502
|
+
for (let u = 0; u < c.length; u++) {
|
|
503
|
+
const g = c[u], _ = o[u], w = h.vals[u];
|
|
504
|
+
n[g] += w;
|
|
505
|
+
for (let y = 0; y < this._nComponents; y++)
|
|
506
|
+
r[g * this._nComponents + y] += w * this.embedding[_ * this._nComponents + y];
|
|
507
|
+
}
|
|
508
|
+
for (let u = 0; u < a; u++)
|
|
509
|
+
if (n[u] > 0)
|
|
510
|
+
for (let g = 0; g < this._nComponents; g++)
|
|
511
|
+
r[u * this._nComponents + g] /= n[u];
|
|
512
|
+
else
|
|
513
|
+
for (let g = 0; g < this._nComponents; g++)
|
|
514
|
+
r[u * this._nComponents + g] = Math.random() * 20 - 10;
|
|
515
|
+
const l = I(h.vals, p), d = $(
|
|
516
|
+
r,
|
|
517
|
+
this.embedding,
|
|
518
|
+
h,
|
|
519
|
+
l,
|
|
520
|
+
a,
|
|
521
|
+
this._nTrain,
|
|
522
|
+
this._nComponents,
|
|
523
|
+
p,
|
|
524
|
+
{ a: this._a, b: this._b }
|
|
525
|
+
);
|
|
526
|
+
return i ? T(d, a, this._nComponents) : d;
|
|
527
|
+
}
|
|
528
|
+
/**
|
|
529
|
+
* Convenience method equivalent to `fit(vectors)` followed by
|
|
530
|
+
* `transform(vectors)` — but more efficient because the training embedding
|
|
531
|
+
* is returned directly without a second optimization pass.
|
|
532
|
+
*
|
|
533
|
+
* @param normalize - When `true`, min-max normalise each dimension of the
|
|
534
|
+
* returned embedding to [0, 1]. `this.embedding` is never mutated.
|
|
535
|
+
* Defaults to `false`.
|
|
536
|
+
*/
|
|
537
|
+
async fit_transform(e, i, a = !1) {
|
|
538
|
+
return await this.fit(e, i), a ? T(this.embedding, e.length, this._nComponents) : this.embedding;
|
|
539
|
+
}
|
|
540
|
+
}
|
|
541
|
+
function T(t, e, i) {
  // Min-max normalise each of the `i` dimensions of the row-major [e × i]
  // embedding `t` to [0, 1]. A constant dimension (zero range) maps to 0.
  // Returns a fresh Float32Array; the input is never mutated.
  const out = new Float32Array(t.length);
  for (let dim = 0; dim < i; dim++) {
    // Pass 1: find this dimension's extremes across all e points.
    let lo = Infinity;
    let hi = -Infinity;
    for (let row = 0; row < e; row++) {
      const v = t[row * i + dim];
      if (v < lo) lo = v;
      if (v > hi) hi = v;
    }
    // Pass 2: rescale into [0, 1] (or 0 when the dimension is degenerate).
    const span = hi - lo;
    for (let row = 0; row < e; row++) {
      out[row * i + dim] = span > 0 ? (t[row * i + dim] - lo) / span : 0;
    }
  }
  return out;
}
|
|
347
|
-
function
|
|
348
|
-
let
|
|
349
|
-
for (let
|
|
350
|
-
|
|
351
|
-
const
|
|
352
|
-
for (let
|
|
353
|
-
const
|
|
354
|
-
|
|
555
|
+
function I(t, e) {
  // Convert edge weights `t` into per-edge sampling periods: the heaviest
  // edge fires every `e`-th tick at period `e`, lighter edges proportionally
  // less often; non-positive relative weights get -1 (never sampled).
  let maxWeight = -Infinity;
  for (let k = 0; k < t.length; k++) {
    if (t[k] > maxWeight) maxWeight = t[k];
  }
  const periods = new Float32Array(t.length);
  for (let k = 0; k < t.length; k++) {
    const rel = t[k] / maxWeight;
    periods[k] = rel > 0 ? e / rel : -1;
  }
  return periods;
}
|
|
358
566
|
export {
|
|
359
|
-
|
|
360
|
-
|
|
567
|
+
re as UMAP,
|
|
568
|
+
ae as fit,
|
|
569
|
+
K as isWebGPUAvailable
|
|
361
570
|
};
|
package/dist/umap.d.ts
CHANGED
|
@@ -16,6 +16,14 @@ export interface UMAPOptions {
|
|
|
16
16
|
efSearch?: number;
|
|
17
17
|
};
|
|
18
18
|
}
|
|
19
|
+
/**
|
|
20
|
+
* Called after each completed SGD epoch (or every 10 epochs on the GPU path,
|
|
21
|
+
* piggybacking on the existing GPU synchronisation point to avoid extra stalls).
|
|
22
|
+
*
|
|
23
|
+
* @param epoch - Zero-based index of the epoch that just finished.
|
|
24
|
+
* @param nEpochs - Total number of epochs.
|
|
25
|
+
*/
|
|
26
|
+
export type ProgressCallback = (epoch: number, nEpochs: number) => void;
|
|
19
27
|
/**
|
|
20
28
|
* Fit UMAP to the given high-dimensional vectors and return a low-dimensional embedding.
|
|
21
29
|
*
|
|
@@ -24,7 +32,7 @@ export interface UMAPOptions {
|
|
|
24
32
|
* 2. Fuzzy simplicial set construction (graph weights)
|
|
25
33
|
* 3. SGD optimization (WebGPU accelerated, with CPU fallback)
|
|
26
34
|
*/
|
|
27
|
-
export declare function fit(vectors: number[][], opts?: UMAPOptions): Promise<Float32Array>;
|
|
35
|
+
export declare function fit(vectors: number[][], opts?: UMAPOptions, onProgress?: ProgressCallback): Promise<Float32Array>;
|
|
28
36
|
/**
|
|
29
37
|
* Compute the a, b parameters for the UMAP curve 1/(1 + a*d^(2b)).
|
|
30
38
|
*
|
|
@@ -36,6 +44,68 @@ export declare function findAB(minDist: number, spread: number): {
|
|
|
36
44
|
a: number;
|
|
37
45
|
b: number;
|
|
38
46
|
};
|
|
47
|
+
/**
|
|
48
|
+
* Stateful UMAP model that supports separate fit / transform / fit_transform.
|
|
49
|
+
*
|
|
50
|
+
* Usage:
|
|
51
|
+
* ```ts
|
|
52
|
+
* const umap = new UMAP({ nNeighbors: 15, nComponents: 2 });
|
|
53
|
+
*
|
|
54
|
+
* // Train on high-dimensional data:
|
|
55
|
+
* await umap.fit(trainVectors);
|
|
56
|
+
* console.log(umap.embedding); // Float32Array [nTrain * nComponents]
|
|
57
|
+
*
|
|
58
|
+
* // Project new points into the same space:
|
|
59
|
+
* const newEmbedding = await umap.transform(testVectors);
|
|
60
|
+
*
|
|
61
|
+
* // Or do both at once:
|
|
62
|
+
* const embedding = await umap.fit_transform(vectors);
|
|
63
|
+
* ```
|
|
64
|
+
*/
|
|
65
|
+
export declare class UMAP {
|
|
66
|
+
private readonly _nComponents;
|
|
67
|
+
private readonly _nNeighbors;
|
|
68
|
+
private readonly _minDist;
|
|
69
|
+
private readonly _spread;
|
|
70
|
+
private readonly _nEpochs;
|
|
71
|
+
private readonly _hnswOpts;
|
|
72
|
+
private readonly _a;
|
|
73
|
+
private readonly _b;
|
|
74
|
+
/** The low-dimensional embedding produced by the last fit() call. */
|
|
75
|
+
embedding: Float32Array | null;
|
|
76
|
+
private _hnswIndex;
|
|
77
|
+
private _nTrain;
|
|
78
|
+
constructor(opts?: UMAPOptions);
|
|
79
|
+
/**
|
|
80
|
+
* Train UMAP on `vectors`.
|
|
81
|
+
* Stores the resulting embedding in `this.embedding` and retains the HNSW
|
|
82
|
+
* index so that transform() can project new points later.
|
|
83
|
+
* Returns `this` for chaining.
|
|
84
|
+
*/
|
|
85
|
+
fit(vectors: number[][], onProgress?: ProgressCallback): Promise<this>;
|
|
86
|
+
/**
|
|
87
|
+
* Project new (unseen) `vectors` into the embedding space learned by fit().
|
|
88
|
+
* Must be called after fit().
|
|
89
|
+
*
|
|
90
|
+
* The training embedding is kept fixed; only the new-point positions are
|
|
91
|
+
* optimised. Returns a Float32Array of shape [vectors.length × nComponents].
|
|
92
|
+
*
|
|
93
|
+
* @param normalize - When `true`, min-max normalise each dimension of the
|
|
94
|
+
* returned embedding to [0, 1]. The stored training embedding is never
|
|
95
|
+
* mutated. Defaults to `false`.
|
|
96
|
+
*/
|
|
97
|
+
transform(vectors: number[][], normalize?: boolean): Promise<Float32Array>;
|
|
98
|
+
/**
|
|
99
|
+
* Convenience method equivalent to `fit(vectors)` followed by
|
|
100
|
+
* `transform(vectors)` — but more efficient because the training embedding
|
|
101
|
+
* is returned directly without a second optimization pass.
|
|
102
|
+
*
|
|
103
|
+
* @param normalize - When `true`, min-max normalise each dimension of the
|
|
104
|
+
* returned embedding to [0, 1]. `this.embedding` is never mutated.
|
|
105
|
+
* Defaults to `false`.
|
|
106
|
+
*/
|
|
107
|
+
fit_transform(vectors: number[][], onProgress?: ProgressCallback, normalize?: boolean): Promise<Float32Array>;
|
|
108
|
+
}
|
|
39
109
|
/**
|
|
40
110
|
* Compute per-edge epoch sampling periods based on edge weights.
|
|
41
111
|
* Higher-weight edges are sampled more frequently.
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "umap-gpu",
|
|
3
|
-
"version": "0.
|
|
3
|
+
"version": "0.2.8",
|
|
4
4
|
"description": "UMAP with HNSW kNN and WebGPU-accelerated SGD",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"main": "dist/index.js",
|
|
@@ -8,19 +8,29 @@
|
|
|
8
8
|
"files": [
|
|
9
9
|
"dist"
|
|
10
10
|
],
|
|
11
|
+
"repository": {
|
|
12
|
+
"type": "git",
|
|
13
|
+
"url": "https://github.com/Achuttarsing/umap-gpu"
|
|
14
|
+
},
|
|
11
15
|
"scripts": {
|
|
12
16
|
"build": "vite build && tsc",
|
|
13
17
|
"dev": "vite",
|
|
14
18
|
"test": "vitest run",
|
|
15
|
-
"prepublishOnly": "
|
|
19
|
+
"prepublishOnly": "bun test && bun run build",
|
|
20
|
+
"docs:dev": "vitepress dev docs",
|
|
21
|
+
"docs:build": "vitepress build docs",
|
|
22
|
+
"docs:generate": "bun run build && bunx api-extractor run && bun run docs:build"
|
|
16
23
|
},
|
|
17
24
|
"dependencies": {
|
|
18
25
|
"hnswlib-wasm": "^0.8.2"
|
|
19
26
|
},
|
|
20
27
|
"devDependencies": {
|
|
28
|
+
"@microsoft/api-extractor": "^7.57.6",
|
|
21
29
|
"@webgpu/types": "^0.1.40",
|
|
22
30
|
"typescript": "^5.4.0",
|
|
23
31
|
"vite": "^5.0.0",
|
|
32
|
+
"vitepress": "^1.6.4",
|
|
33
|
+
"vitepress-plugin-llms": "^1.11.0",
|
|
24
34
|
"vitest": "^4.0.18"
|
|
25
35
|
},
|
|
26
36
|
"license": "MIT"
|