mini-jstorch 2.0.0 → 2.0.2

package/Docs/API.md ADDED
@@ -0,0 +1,277 @@
1
+ # API Reference
2
+
3
+ ## Model Container
4
+
5
+ ### Sequential
6
+
7
+ ```js
8
+ new Sequential(layers: Layer[])
9
+ ```
10
+
11
+ Container that chains layers sequentially.
12
+
13
+ **Methods:**
14
+ - forward(x) — Pass input through all layers
15
+ - backward(grad) — Backpropagate gradient through all layers
16
+ - parameters() — Returns [{param, grad}, ...] for all trainable parameters
17
+ - zeroGrad() — Zero all parameter gradients
18
+ - train() — Set all layers to training mode
19
+ - eval() — Set all layers to evaluation mode
20
+ - stateDict() — Get {layer_0.weight, layer_0.bias, ...}
21
+ - loadStateDict(dict) — Load weights from state dict object
22
+ - step(lr) — Apply SGD step to all layers directly
23
+
24
+ ---
25
+
26
+ ## Layers
27
+
28
+ ### Linear
29
+
30
+ ```js
31
+ new Linear(inFeatures: number, outFeatures: number)
32
+ ```
33
+
34
+ Fully connected layer. Weight shape: [inFeatures, outFeatures]. Bias shape: [1, outFeatures].
35
+
36
+ ### Conv2D (experimental)
37
+
38
+ ```js
39
+ new Conv2D(inChannels: number, outChannels: number, kernelSize: number, stride?: number, padding?: number)
40
+ ```
41
+
42
+ 2D convolution layer. Input shape: [batch, channels, height, width].
43
+
44
+ ### Flatten
45
+
46
+ ```js
47
+ new Flatten()
48
+ ```
49
+
50
+ Flattens multi-dimensional input per sample. Preserves batch dimension.
51
+
52
+ ---
53
+
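Reviewer note on the Linear shapes above: the sketch below shows how the documented shapes fit together, assuming the layer computes `y = x · W + b` with `x` of shape [batch, inFeatures]. `linearForward` is a hypothetical helper written for this note, not a function exported by mini-jstorch, and the real implementation may differ.

```js
// Hypothetical helper, not part of mini-jstorch: y = x · W + b with the documented shapes.
function linearForward(x, W, b) {
  const inFeatures = W.length;
  const outFeatures = W[0].length;
  return x.map((row) => {
    if (row.length !== inFeatures) {
      throw new Error(`expected ${inFeatures} input features, got ${row.length}`);
    }
    const out = new Array(outFeatures).fill(0);
    for (let j = 0; j < outFeatures; j++) {
      let sum = b[0][j]; // bias is a single 1 x outFeatures row shared by every sample
      for (let i = 0; i < inFeatures; i++) sum += row[i] * W[i][j];
      out[j] = sum;
    }
    return out;
  });
}

const x = [[1, 2], [3, 4]];        // [batch = 2, inFeatures = 2]
const W = [[1, 0, 2], [0, 1, 3]];  // [inFeatures = 2, outFeatures = 3]
const b = [[0.5, 0.5, 0.5]];       // [1, outFeatures = 3]
const y = linearForward(x, W, b);  // [batch = 2, outFeatures = 3]
console.log(y);
```

Because the bias is stored as a one-row matrix rather than a flat array, a single bias value is read as `b[0][j]`.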
54
+ ## Activations
55
+
56
+ All activations follow the same interface:
57
+ - forward(x) — x: [batch, features], returns: [batch, features]
58
+ - backward(grad) — grad: [batch, features], returns: [batch, features]
59
+
60
+ | Class | Formula |
61
+ |-------|---------|
62
+ | ReLU() | max(0, x) |
63
+ | Sigmoid() | 1 / (1 + exp(-x)) |
64
+ | Tanh() | tanh(x) |
65
+ | LeakyReLU(alpha?) | x > 0 ? x : alpha * x |
66
+ | GELU() | 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3))) |
67
+ | ELU(alpha?) | x > 0 ? x : alpha * (exp(x) - 1) |
68
+ | Mish() | x * tanh(ln(1 + exp(x))) |
69
+ | SiLU() | x * sigmoid(x) |
70
+ | Softmax(dim?) | exp(x - max) / sum(exp(x - max)) |
71
+
72
+ ---
73
+
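Reviewer note on the activation table above: two of the rows are easy to get subtly wrong, so here is a minimal standalone sketch of them. The function names are hypothetical and `alpha = 1.0` is an assumed default (the table only marks `alpha` as optional). The Softmax row subtracts the row maximum before exponentiating, and the ELU derivative is what a backward pass would multiply into the incoming gradient.

```js
// Hypothetical standalone helpers; names are not taken from mini-jstorch.

// Softmax over one row, shifted by the row maximum so exp() cannot overflow.
function stableSoftmax(row) {
  const max = Math.max(...row);
  const exps = row.map((v) => Math.exp(v - max));
  const sum = exps.reduce((acc, e) => acc + e, 0);
  return exps.map((e) => e / sum);
}

// ELU forward, as in the table: x > 0 ? x : alpha * (exp(x) - 1).
// alpha = 1.0 is an assumed default.
function elu(x, alpha = 1.0) {
  return x > 0 ? x : alpha * (Math.exp(x) - 1);
}

// ELU derivative: 1 for x > 0, alpha * exp(x) otherwise.
function eluGrad(x, alpha = 1.0) {
  return x > 0 ? 1 : alpha * Math.exp(x);
}

// Large logits: a naive exp(1000) would overflow to Infinity.
const probs = stableSoftmax([1000, 1001, 1002]);
console.log(probs, probs.reduce((acc, p) => acc + p, 0));
console.log(elu(-1), eluGrad(-1));
```

Shifting by the maximum does not change the result mathematically, since the common factor cancels between numerator and denominator.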
74
+ ## Loss Functions
75
+
76
+ ### MSELoss
77
+
78
+ ```js
79
+ new MSELoss()
80
+ ```
81
+
82
+ loss = mean((pred - target)^2)
83
+ gradient = 2 * (pred - target) / batchSize
84
+
85
+ ### SoftmaxCrossEntropyLoss (recommended for classification)
86
+
87
+ ```js
88
+ new SoftmaxCrossEntropyLoss()
89
+ ```
90
+
91
+ Combines softmax + cross-entropy in a numerically stable way. Input: logits (not probabilities). Do NOT combine with a Softmax layer.
92
+
93
+ ### BCEWithLogitsLoss (recommended for binary classification)
94
+
95
+ ```js
96
+ new BCEWithLogitsLoss()
97
+ ```
98
+
99
+ Combines sigmoid + binary cross-entropy. Numerically stable. Input: logits (not probabilities). Do NOT combine with a Sigmoid layer.
100
+
101
+ ### CrossEntropyLoss (deprecated)
102
+
103
+ ```js
104
+ new CrossEntropyLoss()
105
+ ```
106
+
107
+ Use SoftmaxCrossEntropyLoss instead. Exists for backward compatibility.
108
+
109
+ ---
110
+
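Reviewer note on the loss formulas above: the following sketch spells out the MSELoss formulas and one common numerically stable form of binary cross-entropy on logits. Both helpers are hypothetical and only illustrate the math; whether the library uses exactly this formulation is an assumption. Note that the documented gradient `2 * (pred - target) / batchSize` equals the derivative of the mean only when each sample has a single output column, which is the case sketched here.

```js
// Hypothetical helpers for illustration; not the library's implementation.

// MSE over a [batch, 1] prediction matrix, using the gradient formula from the docs.
function mseLoss(pred, target) {
  const batchSize = pred.length;
  let total = 0;
  const grad = pred.map((row, i) =>
    row.map((p, j) => {
      const diff = p - target[i][j];
      total += diff * diff;
      return (2 * diff) / batchSize;
    })
  );
  const count = batchSize * pred[0].length;
  return { loss: total / count, grad };
}

// Binary cross-entropy on a raw logit z with label y in {0, 1}:
// max(z, 0) - z * y + log(1 + exp(-|z|)) never exponentiates a large positive number.
function bceWithLogits(z, y) {
  return Math.max(z, 0) - z * y + Math.log1p(Math.exp(-Math.abs(z)));
}

const { loss, grad } = mseLoss([[1], [3]], [[0], [1]]);
console.log(loss, grad);
console.log(bceWithLogits(0, 1), bceWithLogits(1000, 1), bceWithLogits(-1000, 1));
```

The logit form is also why these losses should not be stacked on top of a Sigmoid or Softmax layer: the squashing would then be applied twice.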
111
+ ## Optimizers
112
+
113
+ ### Adam (recommended)
114
+
115
+ ```js
116
+ new Adam(parameters, options)
117
+ // or
118
+ new Adam(parameters, lr, beta1, beta2, eps, maxGradNorm)
119
+ ```
120
+
121
+ Options object: { lr: 0.001, b1: 0.9, b2: 0.999, eps: 1e-8, max_grad_norm: 1.0 }
122
+
123
+ ### AdamW
124
+
125
+ ```js
126
+ new AdamW(parameters, options)
127
+ ```
128
+
129
+ Adam with decoupled weight decay. Options include weight_decay (default: 0.01).
130
+
131
+ ### SGD
132
+
133
+ ```js
134
+ new SGD(parameters, lr?, maxGradNorm?)
135
+ ```
136
+
137
+ ### Lion
138
+
139
+ ```js
140
+ new LION(parameters, options)
141
+ ```
142
+
143
+ Memory-efficient optimizer.
144
+
145
+ **All optimizers have:**
146
+ - step() — Update parameters using accumulated gradients
147
+ - zeroGrad() — Zero all parameter gradients
148
+
149
+ ---
150
+
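Reviewer note on the Adam options above: the option names `lr`, `b1`, `b2` and `eps` map onto the standard Adam update, sketched below on a flat parameter array. `adamStep` and the `state` object are hypothetical and written for this note; the sketch leaves out `max_grad_norm` clipping and says nothing about how the library stores its moments internally.

```js
// Hypothetical single-step Adam on a flat parameter array; a sketch, not the library code.
function adamStep(params, grads, state, { lr = 0.001, b1 = 0.9, b2 = 0.999, eps = 1e-8 } = {}) {
  state.t += 1;
  for (let i = 0; i < params.length; i++) {
    // Exponential moving averages of the gradient and of its square.
    state.m[i] = b1 * state.m[i] + (1 - b1) * grads[i];
    state.v[i] = b2 * state.v[i] + (1 - b2) * grads[i] * grads[i];
    // Bias correction compensates for the zero-initialised averages.
    const mHat = state.m[i] / (1 - Math.pow(b1, state.t));
    const vHat = state.v[i] / (1 - Math.pow(b2, state.t));
    params[i] -= (lr * mHat) / (Math.sqrt(vHat) + eps);
  }
  return params;
}

const params = [1.0, -2.0];
const state = { t: 0, m: [0, 0], v: [0, 0] };
adamStep(params, [1.0, -0.5], state);
console.log(params);
```

On the very first step the bias-corrected ratio is close to the sign of the gradient, so each parameter moves by roughly `lr` regardless of the gradient's magnitude.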
151
+ ## Learning Rate Schedulers
152
+
153
+ ### StepLR
154
+
155
+ ```js
156
+ new StepLR(optimizer, stepSize, gamma)
157
+ ```
158
+
159
+ Multiplies LR by gamma every stepSize steps.
160
+
161
+ ### LambdaLR
162
+
163
+ ```js
164
+ new LambdaLR(optimizer, fn)
165
+ ```
166
+
167
+ Sets LR = baseLr * fn(epoch) on each step.
168
+
169
+ ### ReduceLROnPlateau
170
+
171
+ ```js
172
+ new ReduceLROnPlateau(optimizer, options)
173
+ ```
174
+
175
+ Reduces LR when loss stops improving. Options: patience, factor, min_lr, threshold, cooldown, verbose.
176
+
177
+ ---
178
+
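Reviewer note on the scheduler descriptions above: the StepLR and LambdaLR rules can be written in closed form, as sketched below. The helper names are hypothetical, and the exact step at which the first decay happens in the library (here: once `step` reaches `stepSize`) is an assumption.

```js
// Hypothetical closed-form helpers; the library's schedulers are stateful objects instead.

// StepLR: multiply the base LR by gamma once per completed block of stepSize steps.
function stepLrAt(baseLr, stepSize, gamma, step) {
  return baseLr * Math.pow(gamma, Math.floor(step / stepSize));
}

// LambdaLR: LR = baseLr * fn(epoch).
function lambdaLrAt(baseLr, fn, epoch) {
  return baseLr * fn(epoch);
}

for (const step of [0, 9, 10, 25]) {
  console.log(step, stepLrAt(0.1, 10, 0.5, step));
}
console.log(lambdaLrAt(0.1, (epoch) => 1 / (1 + epoch), 4));
```

With `stepSize = 10` and `gamma = 0.5`, the rate stays at the base value for steps 0 to 9, halves at step 10, and has halved twice by step 25.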
179
+ ## Regularization
180
+
181
+ ### Dropout
182
+
183
+ ```js
184
+ new Dropout(p?)
185
+ ```
186
+
187
+ Randomly zeros p fraction of neurons during training. Default p = 0.5. Call layer.train() / layer.eval() to toggle.
188
+
189
+ ### BatchNorm2d (experimental)
190
+
191
+ ```js
192
+ new BatchNorm2d(numFeatures, eps?, momentum?, affine?)
193
+ ```
194
+
195
+ 2D batch normalization. Input shape: [batch, channels, height, width].
196
+
197
+ ---
198
+
199
+ ## Tokenizer
200
+
201
+ ```js
202
+ new Tokenizer(vocabSize?)
203
+ ```
204
+
205
+ | Method | Description |
206
+ |--------|-------------|
207
+ | fit(texts) | Build vocabulary from text array |
208
+ | transform(texts, maxLength?, padToMax?) | Convert texts to token indices |
209
+ | fitTransform(texts, maxLength?, padToMax?) | Fit + transform in one call |
210
+ | inverseTransform(tokens, skipPad?) | Convert token indices back to text |
211
+ | getVocabulary() | Get vocabulary as string array |
212
+ | getVocabSize() | Get vocabulary size |
213
+ | getWordCounts() | Get word frequency map |
214
+ | getMostCommon(n?) | Get top N most frequent words |
215
+
216
+ ---
217
+
218
+ ## Utilities
219
+
220
+ zeros(rows, cols) — Matrix filled with 0
221
+ ones(rows, cols) — Matrix filled with 1
222
+ randomMatrix(rows, cols, scale?) — Random matrix (Xavier init if scale omitted)
223
+ transpose(matrix) — Matrix transpose
224
+ dot(a, b) — Matrix multiplication
225
+ addMatrices(a, b) — Element-wise addition
226
+ softmax(vector) — Softmax on 1D array
227
+ crossEntropy(pred, target) — Cross-entropy loss (scalar)
228
+ reshape(tensor, rows, cols) — Reshape to new dimensions
229
+ flattenBatch(batch) — Flatten batch to 2D
230
+ concat(a, b, axis) — Concatenate along axis 0 or 1
231
+ stack(tensors) — Stack tensors
232
+
233
+ ---
234
+
235
+ ## Model Persistence
236
+
237
+ saveModel(model) — Serialize model to JSON string
238
+ loadModel(model, json) — Load weights from JSON into model
239
+
240
+ Supports all layer types. Validates layer types and shapes. Logs warnings for mismatches.
241
+
242
+ ---
243
+
244
+ ## Tensor (Advanced)
245
+
246
+ ```js
247
+ new Tensor(data, requiresGrad?)
248
+ ```
249
+
250
+ | Method | Description |
251
+ |--------|-------------|
252
+ | add(tensor) | Element-wise addition |
253
+ | mul(tensor) | Element-wise multiplication |
254
+ | matmul(tensor) | Matrix multiplication |
255
+ | transpose() | Transpose tensor |
256
+ | flatten() | Flatten to 1D array |
257
+ | shape() | Returns [rows, cols] |
258
+
259
+ Static: Tensor.zeros(r,c), Tensor.ones(r,c), Tensor.random(r,c,scale?)
260
+
261
+ ---
262
+
263
+ ## User-Friendly Utilities (fu_*)
264
+
265
+ fu_tensor(data, requiresGrad?) — Create tensor from 2D array
266
+ fu_add(a, b) — Element-wise add
267
+ fu_mul(a, b) — Element-wise multiply
268
+ fu_matmul(a, b) — Matrix multiply
269
+ fu_sum(tensor) — Sum all elements
270
+ fu_mean(tensor) — Mean of all elements
271
+ fu_relu(tensor) — ReLU activation
272
+ fu_sigmoid(tensor) — Sigmoid activation
273
+ fu_tanh(tensor) — Tanh activation
274
+ fu_softmax(tensor) — Softmax activation
275
+ fu_flatten(tensor) — Flatten to 1D
276
+ fu_reshape(tensor, rows, cols) — Reshape tensor
277
+ fu_stack(tensors) — Stack tensors
package/README.md CHANGED
@@ -1,4 +1,4 @@
1
- ## Mini-JSTorch (MAJOR UPDATE)
1
+ ## Mini-JSTorch (v2.0.2)
2
2
 
3
3
  ---
4
4
 
@@ -7,15 +7,14 @@ It runs in Node.js and modern browsers, with a simple API inspired by PyTorch-st
7
7
 
8
8
  This project prioritizes `clarity`, `numerical correctness`, and `accessibility` over performance or large-scale production use.
9
9
 
10
- In this version `2.0.0`, we introduce:
11
- - **Fixed Linear layer cache** (critical bug fix for training)
12
- - **Fixed GELU gradient calculation**
13
- - **Fixed MSELoss gradient scaling**
14
- - **Optimized Softmax gradient** (O(n²) → O(n))
15
- - **Improved Tokenizer** with proper PAD/UNK separation
16
- - **Added Sequential.zeroGrad(), train(), eval(), stateDict() methods**
10
+ ### Changelog
17
11
 
18
- ---
12
+ **v2.0.2:**
13
+ - **Fixed critical training bug:** Optimizers (Adam, SGD, AdamW, Lion) now correctly update Linear and Conv2D layer weights
14
+ - **Fixed BatchNorm2d:** Inference mode no longer produces NaN for multi-channel inputs
15
+ - **Fixed ELU activation:** Backward pass now uses correct derivative formula
16
+ - **Fixed saveModel/loadModel:** Now correctly saves and restores all layer types including Conv2D and BatchNorm2d
17
+ - **Fixed BatchNorm2d gradient zeroing:** gradWeight/gradBias now correctly reset between batches
19
18
 
20
19
  **⚠️ BREAKING CHANGES in v2.0.0:**
21
20
  - Tokenizer API: `tokenizeBatch()` → `transform()`, `detokenizeBatch()` → `inverseTransform()`
@@ -80,7 +79,6 @@ In Browser/Website:
80
79
  async function train() {
81
80
  const statusEl = document.getElementById('status');
82
81
  const logEl = document.getElementById('log');
83
-
84
82
  try {
85
83
  const model = new Sequential([
86
84
  new Linear(2, 16), new Tanh(),
@@ -107,13 +105,13 @@ In Browser/Website:
107
105
  }
108
106
  }
109
107
 
110
- statusEl.textContent = 'Done';
108
+ statusEl.textContent = 'Done';
111
109
  const preds = model.forward(X);
112
110
  document.getElementById('res').innerHTML = `<h4>Results:</h4>` +
113
111
  X.map((input, i) => `[${input}] -> <b>${preds[i][0].toFixed(4)}</b> (Target: ${y[i][0]})`).join('<br>');
114
112
 
115
113
  } catch (e) {
116
- statusEl.textContent = 'Error: ' + e.message;
114
+ statusEl.textContent = 'Error: ' + e.message;
117
115
  }
118
116
  }
119
117
  train();
@@ -202,7 +200,7 @@ git clone https://github.com/Rizal-HID11/mini-jstorch-github
202
200
 
203
201
  # Quick Start (Recommended Loss)
204
202
 
205
- # Multi-class Classification (SoftmaxCrossEntropy)
203
+ ## Multi-class Classification (SoftmaxCrossEntropy)
206
204
 
207
205
  ```javascript
208
206
  import {
@@ -247,7 +245,7 @@ for (let epoch = 1; epoch <= 300; epoch++) {
247
245
  ```
248
246
  `Important:` Do not combine `SoftmaxCrossEntropyLoss` with a `Softmax` layer.
249
247
 
250
- # Binary Classification (BCEWithLogitsLoss)
248
+ ## Binary Classification (BCEWithLogitsLoss)
251
249
 
252
250
  ```javascript
253
251
  import {
@@ -312,18 +310,23 @@ finalProbs.forEach((prob, i) => {
312
310
  });
313
311
  console.log(`\nAccuracy: ${(correct / X.length * 100).toFixed(2)}%`);
314
312
  ```
315
- Do not combine `BCEWithLogitsLoss` with a `Sigmoid` layer.
313
+ `Important:` Do not combine `BCEWithLogitsLoss` with a `Sigmoid` layer.
316
314
 
317
315
  ---
318
316
 
319
317
  # Save & Load Models
320
318
 
321
319
  ```javascript
322
- // WARN: Error/Bug may be expected for this time!
323
- import { saveModel, loadModel, Sequential } from "./src/jstorch.js";
320
+ import { saveModel, loadModel, Sequential } from "./src/jstorch";
324
321
 
322
+ // Save trained model
325
323
  const json = saveModel(model);
326
- const model2 = new Sequential([...]); // same architecture
324
+
325
+ // Create fresh model with same architecture and load weights
326
+ const model2 = new Sequential([
327
+ new Linear(2, 16), new ReLU(),
328
+ new Linear(16, 1)
329
+ ]);
327
330
  loadModel(model2, json);
328
331
  ```
329
332
 
@@ -360,7 +363,7 @@ node demo/<fileNameInDemo>.js
360
363
 
361
364
  MIT License
362
365
 
363
- Copyright (c) 2024
366
+ Copyright (c) 2024-2025
364
367
  rizal-editors
365
368
 
366
369
  ---
@@ -39,4 +39,4 @@ finalPred.forEach((p, i) => {
39
39
  });
40
40
  console.log(`\nAverage Error: ${(totalError / X.length).toFixed(2)}`);
41
41
  console.log(`Weight (slope): ${model.layers[0].W[0][0].toFixed(4)} (expected: 2.0)`);
42
- console.log(`Bias: ${model.layers[0].b[0].toFixed(4)} (expected: 0.0)`);
42
+ console.log(`Bias: ${model.layers[0].b[0][0].toFixed(4)} (expected: 0.0)`);
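Reviewer note on the `b[0]` to `b[0][0]` change above: it is consistent with the bias shape [1, outFeatures] given in Docs/API.md, where `b[0]` is the whole bias row rather than a number. The sketch below uses plain arrays with made-up values as stand-ins for a layer's parameters; it does not touch the library.

```js
// Plain arrays standing in for a Linear(1, 1) layer's parameters; the values are made up.
const W = [[1.9987]]; // [inFeatures, outFeatures]
const b = [[0.0042]]; // [1, outFeatures]

console.log(Array.isArray(b[0]));  // true: b[0] is the bias row, not a number
console.log(W[0][0].toFixed(4));   // "1.9987"
console.log(b[0][0].toFixed(4));   // "0.0042"

let threw = false;
try {
  b[0].toFixed(4);                 // arrays have no toFixed method
} catch (e) {
  threw = e instanceof TypeError;
}
console.log(threw);                // true
```

With the old indexing, the demo's final log line would therefore throw a TypeError if the bias is stored as documented.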
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "mini-jstorch",
3
- "version": "2.0.0",
3
+ "version": "2.0.2",
4
4
  "type": "module",
5
5
  "description": "A lightweight JavaScript neural network library for learning AI concepts and rapid Frontend experimentation. PyTorch-inspired, zero dependencies, perfect for educational use.",
6
6
  "main": "index.js",
@@ -15,8 +15,7 @@
15
15
  "tiny-ml",
16
16
  "mini-neural-network",
17
17
  "mini-ml-library",
18
- "mini-js-ml",
19
- "educational-ml"
18
+ "mini-js-ml"
20
19
  ],
21
20
  "author": "Rizal",
22
21
  "license": "MIT"