mini-jstorch 2.0.0 → 2.0.2
- package/Docs/API.md +277 -0
- package/README.md +22 -19
- package/demo/linear_regression.js +1 -1
- package/package.json +2 -3
- package/src/jstorch.js +380 -411
package/Docs/API.md
ADDED
|
@@ -0,0 +1,277 @@
|
|
|
1
|
+
# API Reference
|
|
2
|
+
|
|
3
|
+
## Model Container
|
|
4
|
+
|
|
5
|
+
### Sequential
|
|
6
|
+
|
|
7
|
+
```js
|
|
8
|
+
new Sequential(layers: Layer[])
|
|
9
|
+
```
|
|
10
|
+
|
|
11
|
+
Container that chains layers sequentially.
|
|
12
|
+
|
|
13
|
+
**Methods:**
|
|
14
|
+
- forward(x) — Pass input through all layers
|
|
15
|
+
- backward(grad) — Backpropagate gradient through all layers
|
|
16
|
+
- parameters() — Returns [{param, grad}, ...] for all trainable parameters
|
|
17
|
+
- zeroGrad() — Zero all parameter gradients
|
|
18
|
+
- train() — Set all layers to training mode
|
|
19
|
+
- eval() — Set all layers to evaluation mode
|
|
20
|
+
- stateDict() — Get {layer_0.weight, layer_0.bias, ...}
|
|
21
|
+
- loadStateDict(dict) — Load weights from state dict object
|
|
22
|
+
- step(lr) — Apply SGD step to all layers directly
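To illustrate how a container of this kind can chain `forward` and `backward`, here is a minimal sketch. `TinySequential` and the toy `Scale` layer are hypothetical names invented for this example; they are not the library's classes, and the real `Sequential` also tracks parameters, modes, and state.

```js
// Hypothetical toy layer: multiplies every element by a constant factor.
class Scale {
  constructor(factor) { this.factor = factor; }
  forward(x) { return x.map(row => row.map(v => v * this.factor)); }
  // d(factor * x)/dx = factor, so the incoming gradient is scaled the same way.
  backward(grad) { return grad.map(row => row.map(g => g * this.factor)); }
}

// Hypothetical minimal container: forward runs the layers in order,
// backward runs them in reverse order.
class TinySequential {
  constructor(layers) { this.layers = layers; }
  forward(x) {
    return this.layers.reduce((out, layer) => layer.forward(out), x);
  }
  backward(grad) {
    return this.layers.reduceRight((g, layer) => layer.backward(g), grad);
  }
}

const tinyModel = new TinySequential([new Scale(2), new Scale(3)]);
const tinyOut = tinyModel.forward([[1, 2]]);   // [[6, 12]]
const tinyGrad = tinyModel.backward([[1, 1]]); // [[6, 6]]
```

The reverse traversal in `backward` is the chain rule applied layer by layer, which is why each layer only needs to know its own local derivative.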
|
|
23
|
+
|
|
24
|
+
---
|
|
25
|
+
|
|
26
|
+
## Layers
|
|
27
|
+
|
|
28
|
+
### Linear
|
|
29
|
+
|
|
30
|
+
```js
|
|
31
|
+
new Linear(inFeatures: number, outFeatures: number)
|
|
32
|
+
```
|
|
33
|
+
|
|
34
|
+
Fully connected layer. Weight shape: [inFeatures, outFeatures]. Bias shape: [1, outFeatures].
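Given the documented shapes, the forward pass can be read as `y = x · W + b`, with the single bias row added to every sample. The sketch below is a plain-array illustration under that assumption; `linearForward` is a hypothetical helper, not a library function.

```js
// x: [batch, inFeatures], W: [inFeatures, outFeatures], b: [1, outFeatures]
// Returns y: [batch, outFeatures] with y[n][j] = sum_i x[n][i] * W[i][j] + b[0][j].
function linearForward(x, W, b) {
  return x.map(row =>
    W[0].map((_, j) =>
      row.reduce((sum, xi, i) => sum + xi * W[i][j], b[0][j])));
}

const W = [[1, 2, 3], [4, 5, 6]]; // inFeatures = 2, outFeatures = 3
const b = [[0.5, 0.5, 0.5]];      // one row, indexed as b[0][j]
const y = linearForward([[1, 1]], W, b); // [[5.5, 7.5, 9.5]]
```

Because the bias is a 2D array with one row, a single bias value is read as `b[0][j]` rather than `b[j]`.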
|
|
35
|
+
|
|
36
|
+
### Conv2D (experimental)
|
|
37
|
+
|
|
38
|
+
```js
|
|
39
|
+
new Conv2D(inChannels: number, outChannels: number, kernelSize: number, stride?: number, padding?: number)
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
2D convolution layer. Input shape: [batch, channels, height, width].
|
|
43
|
+
|
|
44
|
+
### Flatten
|
|
45
|
+
|
|
46
|
+
```js
|
|
47
|
+
new Flatten()
|
|
48
|
+
```
|
|
49
|
+
|
|
50
|
+
Flattens multi-dimensional input per sample. Preserves batch dimension.
|
|
51
|
+
|
|
52
|
+
---
|
|
53
|
+
|
|
54
|
+
## Activations
|
|
55
|
+
|
|
56
|
+
All activations follow the same interface:
|
|
57
|
+
- forward(x) — x: [batch, features], returns: [batch, features]
|
|
58
|
+
- backward(grad) — grad: [batch, features], returns: [batch, features]
|
|
59
|
+
|
|
60
|
+
| Class | Formula |
|
|
61
|
+
|-------|---------|
|
|
62
|
+
| ReLU() | max(0, x) |
|
|
63
|
+
| Sigmoid() | 1 / (1 + exp(-x)) |
|
|
64
|
+
| Tanh() | tanh(x) |
|
|
65
|
+
| LeakyReLU(alpha?) | x > 0 ? x : alpha * x |
|
|
66
|
+
| GELU() | 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3))) |
|
|
67
|
+
| ELU(alpha?) | x > 0 ? x : alpha * (exp(x) - 1) |
|
|
68
|
+
| Mish() | x * tanh(ln(1 + exp(x))) |
|
|
69
|
+
| SiLU() | x * sigmoid(x) |
|
|
70
|
+
| Softmax(dim?) | exp(x - max) / sum(exp(x - max)) |
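Two of the formulas in the table are worth seeing in code: the max-subtraction in Softmax, which keeps `exp` from overflowing, and the tanh approximation used for GELU. The helpers below are hypothetical standalone functions written from the table's formulas, not the library's classes.

```js
// Softmax over one row: subtracting the row maximum leaves the result
// unchanged mathematically but keeps Math.exp from overflowing.
function softmaxRow(row) {
  const max = Math.max(...row);
  const exps = row.map(v => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map(e => e / sum);
}

// GELU, tanh approximation, exactly as written in the table.
function geluTanh(x) {
  return 0.5 * x * (1 + Math.tanh(Math.sqrt(2 / Math.PI) * (x + 0.044715 * x ** 3)));
}

// Without the max subtraction, Math.exp(1000) would be Infinity.
const probs = softmaxRow([1000, 1001, 1002]);
const geluAtOne = geluTanh(1);
```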
|
|
71
|
+
|
|
72
|
+
---
|
|
73
|
+
|
|
74
|
+
## Loss Functions
|
|
75
|
+
|
|
76
|
+
### MSELoss
|
|
77
|
+
|
|
78
|
+
```js
|
|
79
|
+
new MSELoss()
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
loss = mean((pred - target)^2)
|
|
83
|
+
gradient = 2 * (pred - target) / batchSize
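As a worked example of the two formulas above, the sketch below computes the loss and gradient for a batch of two predictions. `mseLoss` is a hypothetical helper, and the `[batch, 1]` shape is an assumption chosen so that the mean over all elements and the division by `batchSize` coincide.

```js
// pred, target: [batch, 1] arrays (assumed shape for this sketch).
function mseLoss(pred, target) {
  const batchSize = pred.length;
  let sumSquares = 0;
  const grad = pred.map((row, i) => row.map((p, j) => {
    const diff = p - target[i][j];
    sumSquares += diff * diff;
    return (2 * diff) / batchSize; // gradient = 2 * (pred - target) / batchSize
  }));
  const count = batchSize * pred[0].length;
  return { loss: sumSquares / count, grad }; // loss = mean((pred - target)^2)
}

// Errors are 2 and 0, so loss = (4 + 0) / 2 = 2 and grad = [[2], [0]].
const { loss, grad } = mseLoss([[3], [5]], [[1], [5]]);
```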
|
|
84
|
+
|
|
85
|
+
### SoftmaxCrossEntropyLoss (recommended for classification)
|
|
86
|
+
|
|
87
|
+
```js
|
|
88
|
+
new SoftmaxCrossEntropyLoss()
|
|
89
|
+
```
|
|
90
|
+
|
|
91
|
+
Combines softmax + cross-entropy in a numerically stable way. Input: logits (not probabilities). Do NOT combine with a Softmax layer.
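One common way to fuse softmax and cross-entropy stably is the log-sum-exp trick, sketched below for a single sample. This illustrates the general technique and is not taken from the library's source; `softmaxCrossEntropy` is a hypothetical helper, and passing the target as a class index is an assumption of this example.

```js
// logits: array of raw scores for one sample; classIndex: index of the true class.
function softmaxCrossEntropy(logits, classIndex) {
  const max = Math.max(...logits);
  // log(sum(exp(logits))) computed without overflow.
  const logSumExp = max + Math.log(logits.reduce((s, v) => s + Math.exp(v - max), 0));
  const loss = logSumExp - logits[classIndex]; // equals -log(softmax[classIndex])
  // Gradient w.r.t. the logits: softmax(logits) - oneHot(classIndex).
  const grad = logits.map((v, i) => Math.exp(v - logSumExp) - (i === classIndex ? 1 : 0));
  return { loss, grad };
}

const uniform = softmaxCrossEntropy([0, 0, 0, 0], 2); // loss = ln(4)
const huge = softmaxCrossEntropy([1000, 0], 0);       // stays finite
```

Feeding probabilities from a separate Softmax layer into a loss like this would apply softmax twice, which is why the two should not be combined.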
|
|
92
|
+
|
|
93
|
+
### BCEWithLogitsLoss (recommended for binary classification)
|
|
94
|
+
|
|
95
|
+
```js
|
|
96
|
+
new BCEWithLogitsLoss()
|
|
97
|
+
```
|
|
98
|
+
|
|
99
|
+
Combines sigmoid + binary cross-entropy. Numerically stable. Input: logits (not probabilities). Do NOT combine with a Sigmoid layer.
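A widely used stable form of this loss for a logit `x` and a label `z` in {0, 1} is `max(x, 0) - x*z + log(1 + exp(-|x|))`. The sketch below shows that form for a single value; it illustrates the technique and is not confirmed to be the library's exact code. `bceWithLogits` is a hypothetical helper.

```js
// Stable binary cross-entropy on a raw logit x with label z (0 or 1).
// Algebraically equal to -[z*log(s) + (1-z)*log(1-s)] with s = sigmoid(x),
// but never evaluates exp on a large positive argument.
function bceWithLogits(x, z) {
  return Math.max(x, 0) - x * z + Math.log1p(Math.exp(-Math.abs(x)));
}

const atZero = bceWithLogits(0, 1);              // ln(2)
const confidentRight = bceWithLogits(1000, 1);   // 0
const confidentWrong = bceWithLogits(-1000, 1);  // 1000
```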
|
|
100
|
+
|
|
101
|
+
### CrossEntropyLoss (deprecated)
|
|
102
|
+
|
|
103
|
+
```js
|
|
104
|
+
new CrossEntropyLoss()
|
|
105
|
+
```
|
|
106
|
+
|
|
107
|
+
Use SoftmaxCrossEntropyLoss instead. Exists for backward compatibility.
|
|
108
|
+
|
|
109
|
+
---
|
|
110
|
+
|
|
111
|
+
## Optimizers
|
|
112
|
+
|
|
113
|
+
### Adam (recommended)
|
|
114
|
+
|
|
115
|
+
```js
|
|
116
|
+
new Adam(parameters, options)
|
|
117
|
+
// or
|
|
118
|
+
new Adam(parameters, lr, beta1, beta2, eps, maxGradNorm)
|
|
119
|
+
```
|
|
120
|
+
|
|
121
|
+
Options object: { lr: 0.001, b1: 0.9, b2: 0.999, eps: 1e-8, max_grad_norm: 1.0 }
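For orientation, the sketch below applies one standard Adam update to a single scalar parameter, using the option names and defaults listed above. It follows the textbook algorithm and leaves out gradient clipping (`max_grad_norm`); `adamStep` and the `state` object are hypothetical and do not reflect the library's internals.

```js
// state: { param, m, v, t } for one scalar parameter (hypothetical layout).
function adamStep(state, grad, { lr = 0.001, b1 = 0.9, b2 = 0.999, eps = 1e-8 } = {}) {
  state.t += 1;
  state.m = b1 * state.m + (1 - b1) * grad;        // first-moment estimate
  state.v = b2 * state.v + (1 - b2) * grad * grad; // second-moment estimate
  const mHat = state.m / (1 - b1 ** state.t);      // bias correction
  const vHat = state.v / (1 - b2 ** state.t);
  state.param -= lr * mHat / (Math.sqrt(vHat) + eps);
  return state;
}

const adamState = { param: 1.0, m: 0, v: 0, t: 0 };
adamStep(adamState, 0.5);
// On the first step mHat is about grad and sqrt(vHat) is about |grad|,
// so the parameter moves by roughly lr regardless of the gradient's scale.
```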
|
|
122
|
+
|
|
123
|
+
### AdamW
|
|
124
|
+
|
|
125
|
+
```js
|
|
126
|
+
new AdamW(parameters, options)
|
|
127
|
+
```
|
|
128
|
+
|
|
129
|
+
Adam with decoupled weight decay. Options include weight_decay (default: 0.01).
|
|
130
|
+
|
|
131
|
+
### SGD
|
|
132
|
+
|
|
133
|
+
```js
|
|
134
|
+
new SGD(parameters, lr?, maxGradNorm?)
|
|
135
|
+
```
|
|
136
|
+
|
|
137
|
+
### Lion
|
|
138
|
+
|
|
139
|
+
```js
|
|
140
|
+
new LION(parameters, options)
|
|
141
|
+
```
|
|
142
|
+
|
|
143
|
+
Memory-efficient optimizer.
|
|
144
|
+
|
|
145
|
+
**All optimizers have:**
|
|
146
|
+
- step() — Update parameters using accumulated gradients
|
|
147
|
+
- zeroGrad() — Zero all parameter gradients
|
|
148
|
+
|
|
149
|
+
---
|
|
150
|
+
|
|
151
|
+
## Learning Rate Schedulers
|
|
152
|
+
|
|
153
|
+
### StepLR
|
|
154
|
+
|
|
155
|
+
```js
|
|
156
|
+
new StepLR(optimizer, stepSize, gamma)
|
|
157
|
+
```
|
|
158
|
+
|
|
159
|
+
Multiplies LR by gamma every stepSize steps.
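The schedule can be written in closed form, as in the sketch below. `stepLr` is a hypothetical helper, and counting steps from 0 is an assumption about where the decay boundaries fall.

```js
// Learning rate after `step` scheduler steps, decayed by gamma every stepSize steps.
function stepLr(baseLr, stepSize, gamma, step) {
  return baseLr * gamma ** Math.floor(step / stepSize);
}

const lrAtStart = stepLr(0.1, 10, 0.5, 0);   // 0.1
const lrAfterTen = stepLr(0.1, 10, 0.5, 10); // 0.05
const lrAfter25 = stepLr(0.1, 10, 0.5, 25);  // 0.025
```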
|
|
160
|
+
|
|
161
|
+
### LambdaLR
|
|
162
|
+
|
|
163
|
+
```js
|
|
164
|
+
new LambdaLR(optimizer, fn)
|
|
165
|
+
```
|
|
166
|
+
|
|
167
|
+
Sets LR = baseLr * fn(epoch) on each step.
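As an example of what `fn` might look like, the sketch below defines a linear warmup over the first five epochs. The names `baseLr`, `warmup`, and `lambdaLr` are hypothetical, and the sketch only evaluates the formula; it does not touch an optimizer.

```js
const baseLr = 0.01;

// Multiplier ramps from 0.2 to 1 over the first five epochs, then stays at 1.
const warmup = epoch => Math.min(1, (epoch + 1) / 5);

// LR = baseLr * fn(epoch), as described above.
const lambdaLr = epoch => baseLr * warmup(epoch);

const schedule = [0, 1, 2, 3, 4, 10].map(lambdaLr);
// roughly [0.002, 0.004, 0.006, 0.008, 0.01, 0.01]
```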
|
|
168
|
+
|
|
169
|
+
### ReduceLROnPlateau
|
|
170
|
+
|
|
171
|
+
```js
|
|
172
|
+
new ReduceLROnPlateau(optimizer, options)
|
|
173
|
+
```
|
|
174
|
+
|
|
175
|
+
Reduces LR when loss stops improving. Options: patience, factor, min_lr, threshold, cooldown, verbose.
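The core idea can be sketched with just `patience`, `factor`, and `min_lr`. `PlateauSketch` is a hypothetical class, the defaults shown are invented for this example, and `threshold`, `cooldown`, and `verbose` are left out, so the library's behavior may differ in detail.

```js
class PlateauSketch {
  constructor(lr, { patience = 2, factor = 0.5, min_lr = 1e-6 } = {}) {
    this.lr = lr;
    this.patience = patience;
    this.factor = factor;
    this.minLr = min_lr;
    this.best = Infinity;
    this.badEpochs = 0;
  }

  // Call once per epoch with the monitored loss; returns the current LR.
  step(loss) {
    if (loss < this.best) {
      this.best = loss;
      this.badEpochs = 0;
    } else if (++this.badEpochs > this.patience) {
      // More than `patience` epochs without improvement: decay, but not below minLr.
      this.lr = Math.max(this.lr * this.factor, this.minLr);
      this.badEpochs = 0;
    }
    return this.lr;
  }
}

const plateau = new PlateauSketch(0.1, { patience: 2, factor: 0.5 });
const lrHistory = [1.0, 0.9, 0.9, 0.9, 0.9].map(loss => plateau.step(loss));
// The loss stalls at 0.9 for three epochs, so the last entry drops to 0.05.
```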
|
|
176
|
+
|
|
177
|
+
---
|
|
178
|
+
|
|
179
|
+
## Regularization
|
|
180
|
+
|
|
181
|
+
### Dropout
|
|
182
|
+
|
|
183
|
+
```js
|
|
184
|
+
new Dropout(p?)
|
|
185
|
+
```
|
|
186
|
+
|
|
187
|
+
Randomly zeros p fraction of neurons during training. Default p = 0.5. Call layer.train() / layer.eval() to toggle.
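The difference between training and evaluation mode can be sketched as follows. This uses the common "inverted dropout" convention, where kept values are scaled by `1 / (1 - p)` so the expected activation is unchanged; whether the library scales this way is an assumption. `dropoutForward` and its injectable `rng` parameter are hypothetical.

```js
// x: [batch, features]; p: drop probability (0 <= p < 1); rng: function returning [0, 1).
function dropoutForward(x, p, training, rng = Math.random) {
  if (!training || p === 0) return x; // evaluation mode: identity
  const keep = 1 - p;
  return x.map(row => row.map(v => (rng() < keep ? v / keep : 0)));
}

const input = [[1, 2]];
const evalOut = dropoutForward(input, 0.5, false);               // unchanged
const allKept = dropoutForward(input, 0.5, true, () => 0);       // [[2, 4]]
const allDropped = dropoutForward(input, 0.5, true, () => 0.9);  // [[0, 0]]
```

Passing a fixed `rng` makes the example deterministic; in real training the mask is random on every forward pass.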
|
|
188
|
+
|
|
189
|
+
### BatchNorm2d (experimental)
|
|
190
|
+
|
|
191
|
+
```js
|
|
192
|
+
new BatchNorm2d(numFeatures, eps?, momentum?, affine?)
|
|
193
|
+
```
|
|
194
|
+
|
|
195
|
+
2D batch normalization. Input shape: [batch, channels, height, width].
|
|
196
|
+
|
|
197
|
+
---
|
|
198
|
+
|
|
199
|
+
## Tokenizer
|
|
200
|
+
|
|
201
|
+
```js
|
|
202
|
+
new Tokenizer(vocabSize?)
|
|
203
|
+
```
|
|
204
|
+
|
|
205
|
+
| Method | Description |
|
|
206
|
+
|--------|-------------|
|
|
207
|
+
| fit(texts) | Build vocabulary from text array |
|
|
208
|
+
| transform(texts, maxLength?, padToMax?) | Convert texts to token indices |
|
|
209
|
+
| fitTransform(texts, maxLength?, padToMax?) | Fit + transform in one call |
|
|
210
|
+
| inverseTransform(tokens, skipPad?) | Convert token indices back to text |
|
|
211
|
+
| getVocabulary() | Get vocabulary as string array |
|
|
212
|
+
| getVocabSize() | Get vocabulary size |
|
|
213
|
+
| getWordCounts() | Get word frequency map |
|
|
214
|
+
| getMostCommon(n?) | Get top N most frequent words |
|
|
215
|
+
|
|
216
|
+
---
|
|
217
|
+
|
|
218
|
+
## Utilities
|
|
219
|
+
|
|
220
|
+
zeros(rows, cols) — Matrix filled with 0
|
|
221
|
+
ones(rows, cols) — Matrix filled with 1
|
|
222
|
+
randomMatrix(rows, cols, scale?) — Random matrix (Xavier init if scale omitted)
|
|
223
|
+
transpose(matrix) — Matrix transpose
|
|
224
|
+
dot(a, b) — Matrix multiplication
|
|
225
|
+
addMatrices(a, b) — Element-wise addition
|
|
226
|
+
softmax(vector) — Softmax on 1D array
|
|
227
|
+
crossEntropy(pred, target) — Cross-entropy loss (scalar)
|
|
228
|
+
reshape(tensor, rows, cols) — Reshape to new dimensions
|
|
229
|
+
flattenBatch(batch) — Flatten batch to 2D
|
|
230
|
+
concat(a, b, axis) — Concatenate along axis 0 or 1
|
|
231
|
+
stack(tensors) — Stack tensors
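For readers unfamiliar with the matrix helpers, the sketch below shows what a transpose and a matrix product compute on plain 2D arrays. `transposeSketch` and `dotSketch` are hypothetical stand-ins written for illustration; they are not the library's `transpose` and `dot`.

```js
// Rows become columns: result[j][i] = m[i][j].
const transposeSketch = m => m[0].map((_, j) => m.map(row => row[j]));

// a: [n, k], b: [k, m] -> [n, m], with result[i][j] = sum_k a[i][k] * b[k][j].
const dotSketch = (a, b) =>
  a.map(row => b[0].map((_, j) => row.reduce((s, v, k) => s + v * b[k][j], 0)));

const product = dotSketch([[1, 2], [3, 4]], [[5, 6], [7, 8]]); // [[19, 22], [43, 50]]
const column = transposeSketch([[1, 2, 3]]);                   // [[1], [2], [3]]
```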
|
|
232
|
+
|
|
233
|
+
---
|
|
234
|
+
|
|
235
|
+
## Model Persistence
|
|
236
|
+
|
|
237
|
+
saveModel(model) — Serialize model to JSON string
|
|
238
|
+
loadModel(model, json) — Load weights from JSON into model
|
|
239
|
+
|
|
240
|
+
Supports all layer types. Validates layer types and shapes. Logs warnings for mismatches.
|
|
241
|
+
|
|
242
|
+
---
|
|
243
|
+
|
|
244
|
+
## Tensor (Advanced)
|
|
245
|
+
|
|
246
|
+
```js
|
|
247
|
+
new Tensor(data, requiresGrad?)
|
|
248
|
+
```
|
|
249
|
+
|
|
250
|
+
| Method | Description |
|
|
251
|
+
|--------|-------------|
|
|
252
|
+
| add(tensor) | Element-wise addition |
|
|
253
|
+
| mul(tensor) | Element-wise multiplication |
|
|
254
|
+
| matmul(tensor) | Matrix multiplication |
|
|
255
|
+
| transpose() | Transpose tensor |
|
|
256
|
+
| flatten() | Flatten to 1D array |
|
|
257
|
+
| shape() | Returns [rows, cols] |
|
|
258
|
+
|
|
259
|
+
Static: Tensor.zeros(r,c), Tensor.ones(r,c), Tensor.random(r,c,scale?)
|
|
260
|
+
|
|
261
|
+
---
|
|
262
|
+
|
|
263
|
+
## User-Friendly Utilities (fu_*)
|
|
264
|
+
|
|
265
|
+
fu_tensor(data, requiresGrad?) — Create tensor from 2D array
|
|
266
|
+
fu_add(a, b) — Element-wise add
|
|
267
|
+
fu_mul(a, b) — Element-wise multiply
|
|
268
|
+
fu_matmul(a, b) — Matrix multiply
|
|
269
|
+
fu_sum(tensor) — Sum all elements
|
|
270
|
+
fu_mean(tensor) — Mean of all elements
|
|
271
|
+
fu_relu(tensor) — ReLU activation
|
|
272
|
+
fu_sigmoid(tensor) — Sigmoid activation
|
|
273
|
+
fu_tanh(tensor) — Tanh activation
|
|
274
|
+
fu_softmax(tensor) — Softmax activation
|
|
275
|
+
fu_flatten(tensor) — Flatten to 1D
|
|
276
|
+
fu_reshape(tensor, rows, cols) — Reshape tensor
|
|
277
|
+
fu_stack(tensors) — Stack tensors
|
package/README.md
CHANGED
|
@@ -1,4 +1,4 @@
|
|
|
1
|
-
## Mini-JSTorch (
|
|
1
|
+
## Mini-JSTorch (v2.0.2)
|
|
2
2
|
|
|
3
3
|
---
|
|
4
4
|
|
|
@@ -7,15 +7,14 @@ It runs in Node.js and modern browsers, with a simple API inspired by PyTorch-st
|
|
|
7
7
|
|
|
8
8
|
This project prioritizes `clarity`, `numerical correctness`, and `accessibility` over performance or large-scale production use.
|
|
9
9
|
|
|
10
|
-
|
|
11
|
-
- **Fixed Linear layer cache** (critical bug fix for training)
|
|
12
|
-
- **Fixed GELU gradient calculation**
|
|
13
|
-
- **Fixed MSELoss gradient scaling**
|
|
14
|
-
- **Optimized Softmax gradient** (O(n²) → O(n))
|
|
15
|
-
- **Improved Tokenizer** with proper PAD/UNK separation
|
|
16
|
-
- **Added Sequential.zeroGrad(), train(), eval(), stateDict() methods**
|
|
10
|
+
### Changelog
|
|
17
11
|
|
|
18
|
-
|
|
12
|
+
**v2.0.2:**
|
|
13
|
+
- **Fixed critical training bug:** Optimizers (Adam, SGD, AdamW, Lion) now correctly update Linear and Conv2D layer weights
|
|
14
|
+
- **Fixed BatchNorm2d:** Inference mode no longer produces NaN for multi-channel inputs
|
|
15
|
+
- **Fixed ELU activation:** Backward pass now uses correct derivative formula
|
|
16
|
+
- **Fixed saveModel/loadModel:** Now correctly saves and restores all layer types including Conv2D and BatchNorm2d
|
|
17
|
+
- **Fixed BatchNorm2d gradient zeroing:** gradWeight/gradBias now correctly reset between batches
|
|
19
18
|
|
|
20
19
|
**⚠️ BREAKING CHANGES in v2.0.0:**
|
|
21
20
|
- Tokenizer API: `tokenizeBatch()` → `transform()`, `detokenizeBatch()` → `inverseTransform()`
|
|
@@ -80,7 +79,6 @@ In Browser/Website:
|
|
|
80
79
|
async function train() {
|
|
81
80
|
const statusEl = document.getElementById('status');
|
|
82
81
|
const logEl = document.getElementById('log');
|
|
83
|
-
|
|
84
82
|
try {
|
|
85
83
|
const model = new Sequential([
|
|
86
84
|
new Linear(2, 16), new Tanh(),
|
|
@@ -107,13 +105,13 @@ In Browser/Website:
|
|
|
107
105
|
}
|
|
108
106
|
}
|
|
109
107
|
|
|
110
|
-
statusEl.textContent = '
|
|
108
|
+
statusEl.textContent = 'Done';
|
|
111
109
|
const preds = model.forward(X);
|
|
112
110
|
document.getElementById('res').innerHTML = `<h4>Results:</h4>` +
|
|
113
111
|
X.map((input, i) => `[${input}] -> <b>${preds[i][0].toFixed(4)}</b> (Target: ${y[i][0]})`).join('<br>');
|
|
114
112
|
|
|
115
113
|
} catch (e) {
|
|
116
|
-
statusEl.textContent = '
|
|
114
|
+
statusEl.textContent = 'Error: ' + e.message;
|
|
117
115
|
}
|
|
118
116
|
}
|
|
119
117
|
train();
|
|
@@ -202,7 +200,7 @@ git clone https://github.com/Rizal-HID11/mini-jstorch-github
|
|
|
202
200
|
|
|
203
201
|
# Quick Start (Recommended Loss)
|
|
204
202
|
|
|
205
|
-
|
|
203
|
+
## Multi-class Classification (SoftmaxCrossEntropy)
|
|
206
204
|
|
|
207
205
|
```javascript
|
|
208
206
|
import {
|
|
@@ -247,7 +245,7 @@ for (let epoch = 1; epoch <= 300; epoch++) {
|
|
|
247
245
|
```
|
|
248
246
|
`Important:` Do not combine `SoftmaxCrossEntropyLoss` with a `Softmax` layer.
|
|
249
247
|
|
|
250
|
-
|
|
248
|
+
## Binary Classification (BCEWithLogitsLoss)
|
|
251
249
|
|
|
252
250
|
```javascript
|
|
253
251
|
import {
|
|
@@ -312,18 +310,23 @@ finalProbs.forEach((prob, i) => {
|
|
|
312
310
|
});
|
|
313
311
|
console.log(`\nAccuracy: ${(correct / X.length * 100).toFixed(2)}%`);
|
|
314
312
|
```
|
|
315
|
-
Do not combine `BCEWithLogitsLoss` with a `Sigmoid` layer.
|
|
313
|
+
`Important:` Do not combine `BCEWithLogitsLoss` with a `Sigmoid` layer.
|
|
316
314
|
|
|
317
315
|
---
|
|
318
316
|
|
|
319
317
|
# Save & Load Models
|
|
320
318
|
|
|
321
319
|
```javascript
|
|
322
|
-
|
|
323
|
-
import { saveModel, loadModel, Sequential } from "./src/jstorch.js";
|
|
320
|
+
import { saveModel, loadModel, Sequential } from "./src/jstorch";
|
|
324
321
|
|
|
322
|
+
// Save trained model
|
|
325
323
|
const json = saveModel(model);
|
|
326
|
-
|
|
324
|
+
|
|
325
|
+
// Create fresh model with same architecture and load weights
|
|
326
|
+
const model2 = new Sequential([
|
|
327
|
+
new Linear(2, 16), new ReLU(),
|
|
328
|
+
new Linear(16, 1)
|
|
329
|
+
]);
|
|
327
330
|
loadModel(model2, json);
|
|
328
331
|
```
|
|
329
332
|
|
|
@@ -360,7 +363,7 @@ node demo/<fileNameInDemo>.js
|
|
|
360
363
|
|
|
361
364
|
MIT License
|
|
362
365
|
|
|
363
|
-
Copyright (c) 2024
|
|
366
|
+
Copyright (c) 2024-2025
|
|
364
367
|
rizal-editors
|
|
365
368
|
|
|
366
369
|
---
|
|
package/demo/linear_regression.js
CHANGED
|
@@ -39,4 +39,4 @@ finalPred.forEach((p, i) => {
|
|
|
39
39
|
});
|
|
40
40
|
console.log(`\nAverage Error: ${(totalError / X.length).toFixed(2)}`);
|
|
41
41
|
console.log(`Weight (slope): ${model.layers[0].W[0][0].toFixed(4)} (expected: 2.0)`);
|
|
42
|
-
console.log(`Bias: ${model.layers[0].b[0].toFixed(4)} (expected: 0.0)`);
|
|
42
|
+
console.log(`Bias: ${model.layers[0].b[0][0].toFixed(4)} (expected: 0.0)`);
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "mini-jstorch",
|
|
3
|
-
"version": "2.0.
|
|
3
|
+
"version": "2.0.2",
|
|
4
4
|
"type": "module",
|
|
5
5
|
"description": "A lightweight JavaScript neural network library for learning AI concepts and rapid Frontend experimentation. PyTorch-inspired, zero dependencies, perfect for educational use.",
|
|
6
6
|
"main": "index.js",
|
|
@@ -15,8 +15,7 @@
|
|
|
15
15
|
"tiny-ml",
|
|
16
16
|
"mini-neural-network",
|
|
17
17
|
"mini-ml-library",
|
|
18
|
-
"mini-js-ml"
|
|
19
|
-
"educational-ml"
|
|
18
|
+
"mini-js-ml"
|
|
20
19
|
],
|
|
21
20
|
"author": "Rizal",
|
|
22
21
|
"license": "MIT"
|