@dniskav/neuron 0.2.7 → 0.3.0
- package/README.md +456 -195
- package/dist/index.d.mts +470 -1
- package/dist/index.d.ts +470 -1
- package/dist/index.js +3023 -2
- package/dist/index.mjs +2985 -2
- package/package.json +2 -2
package/README.md
CHANGED
|
@@ -3,7 +3,7 @@
|
|
|
3
3
|
|
|
4
4
|
A minimal, dependency-free neural network library built from scratch in TypeScript. Designed for learning and experimentation — every line of math is readable.
|
|
5
5
|
|
|
6
|
-
Each class is a building block for the next: from a single neuron to a full Transformer with causal attention.
|
|
6
|
+
Each class is a building block for the next: from a single neuron to a full Transformer with causal attention. v0.3.0 adds classical ML, unsupervised learning, generative models, autograd, and training utilities — all in pure TypeScript, zero dependencies.
|
|
7
7
|
|
|
8
8
|
```mermaid
|
|
9
9
|
graph TD
|
|
@@ -20,13 +20,43 @@ graph TD
|
|
|
20
20
|
K["NetworkTransformer\nembeddings → blocks → per-token logits"]
|
|
21
21
|
L["NetworkTransformerRL\ncontinuous projection → causal attention → Q-values"]
|
|
22
22
|
|
|
23
|
+
subgraph Classical ML
|
|
24
|
+
P["Perceptron\nstep function · Rosenblatt rule"]
|
|
25
|
+
LR["LinearRegression\nnormal equation · gradient descent"]
|
|
26
|
+
LOG["LogisticRegression\nsigmoid · BCE · SoftmaxRegression"]
|
|
27
|
+
NB["GaussianNaiveBayes\nlog-probabilities · Gaussian P(x|c)"]
|
|
28
|
+
DT["DecisionTree\nCART · Gini · MSE split"]
|
|
29
|
+
end
|
|
30
|
+
|
|
31
|
+
subgraph Unsupervised
|
|
32
|
+
KM["KMeans\nK-Means++ · inertia · elbow"]
|
|
33
|
+
PCA["PCA\npower iteration · projection · reconstruction"]
|
|
34
|
+
SOM["SOM\nKohonen · BMU · Gaussian neighborhood"]
|
|
35
|
+
HN["HopfieldNetwork\nHebbian · energy · associative memory"]
|
|
36
|
+
AE["Autoencoder\nencoder · bottleneck · decoder"]
|
|
37
|
+
end
|
|
38
|
+
|
|
39
|
+
subgraph Generative
|
|
40
|
+
GAN["GAN\ngenerator · discriminator · min-max"]
|
|
41
|
+
VAE["VAE\nreparametrization trick · ELBO · KL"]
|
|
42
|
+
end
|
|
43
|
+
|
|
44
|
+
subgraph Autograd
|
|
45
|
+
TAP["Value / Tape\nreverse-mode · computational graph · backward"]
|
|
46
|
+
end
|
|
47
|
+
|
|
23
48
|
A --> B --> C --> D --> E
|
|
24
49
|
E --> F --> G
|
|
25
50
|
E --> H --> I --> J --> K --> L
|
|
51
|
+
E --> AE
|
|
52
|
+
E --> GAN
|
|
53
|
+
E --> VAE
|
|
26
54
|
```
|
|
27
55
|
|
|
28
56
|
## What's inside
|
|
29
57
|
|
|
58
|
+
### Neural network building blocks
|
|
59
|
+
|
|
30
60
|
| Export | Description |
|
|
31
61
|
|--------|-------------|
|
|
32
62
|
| `Neuron` | Single-input neuron. The simplest possible unit: one weight, one bias. |
|
|
@@ -36,20 +66,115 @@ graph TD
|
|
|
36
66
|
| `NetworkN` | Deep network of arbitrary depth. Define your architecture as `[inputs, ...hidden, outputs]`. |
|
|
37
67
|
| `LSTMLayer` | Recurrent layer with persistent hidden and cell state. Learns sequences via BPTT. |
|
|
38
68
|
| `NetworkLSTM` | Wraps an `LSTMLayer` + dense layers. Maintains memory across steps within an episode. |
|
|
69
|
+
| `GRULayer` | Gated Recurrent Unit — lighter alternative to LSTM, two gates instead of three. |
|
|
39
70
|
| `NetworkTransformer` | Full token-classification Transformer: embeddings → N blocks → per-token logits. |
|
|
40
|
-
| `NetworkTransformerRL` | Transformer for RL agents: continuous input projection → causal attention → Q-values.
|
|
71
|
+
| `NetworkTransformerRL` | Transformer for RL agents: continuous input projection → causal attention → Q-values. |
|
|
41
72
|
| `TransformerBlock` | One Transformer block: multi-head attention + FFN + LayerNorm × 2 with residuals. |
|
|
42
73
|
| `MultiHeadAttention` | N parallel attention heads concatenated and projected to `d_model`. |
|
|
43
74
|
| `AttentionHead` | Single scaled dot-product self-attention head (Q / K / V projections + backprop). |
|
|
75
|
+
|
|
76
|
+
### Layers & components
|
|
77
|
+
|
|
78
|
+
| Export | Description |
|
|
79
|
+
|--------|-------------|
|
|
80
|
+
| `Conv1D` | 1D convolution over sequences. Multi-channel, configurable stride and padding. |
|
|
81
|
+
| `Conv2D` | 2D convolution for images. Kernels `[filters][kH][kW][C]`, full forward + backward. |
|
|
82
|
+
| `MaxPool2D` | Max pooling 2D. Stores position mask for exact gradient routing in backprop. |
|
|
83
|
+
| `Flatten` | Converts `[H][W][C]` tensors to flat vectors. Bridges Conv layers to dense layers. |
|
|
84
|
+
| `RNN` | Vanilla RNN with BPTT. Explicitly shows where and why gradients vanish. |
|
|
85
|
+
| `Seq2Seq` | Encoder + Decoder LSTMs with context vector transfer. Teacher forcing in training. |
|
|
86
|
+
| `CausalConv1D` | Causal dilated 1D convolution. One building block of a TCN. |
|
|
87
|
+
| `TCN` | Temporal Convolutional Network. Stacks causal dilated convolutions for sequences without recurrence. |
|
|
44
88
|
| `LayerNorm` | Layer normalization with learnable γ / β per feature. |
|
|
45
|
-
| `
|
|
46
|
-
| `
|
|
89
|
+
| `BatchNorm` | Batch normalization with running mean/variance for inference. |
|
|
90
|
+
| `Dropout` | Inverted dropout for regularization. Active only during training. |
|
|
91
|
+
| `WeightMatrix` | 2D weight matrix with per-scalar Adam optimizers and optional gradient clipping. |
|
|
92
|
+
| `BiasVector` | 1D bias vector with per-scalar Adam optimizers. |
|
|
47
93
|
| `EmbeddingMatrix` | Lookup-table embedding matrix with SGD updates. |
|
|
48
|
-
|
|
49
|
-
|
|
50
|
-
|
|
51
|
-
|
|
|
52
|
-
|
|
94
|
+
|
|
95
|
+
### Classical ML
|
|
96
|
+
|
|
97
|
+
| Export | Description |
|
|
98
|
+
|--------|-------------|
|
|
99
|
+
| `Perceptron` | The historical Rosenblatt perceptron (1957). Step function, linear rule. Shows why XOR is impossible. |
|
|
100
|
+
| `LinearRegression` | Closed-form normal equation `(XᵀX)⁻¹Xᵀy` + gradient descent mode. Pure array arithmetic. |
|
|
101
|
+
| `LogisticRegression` | Sigmoid + binary cross-entropy, no hidden layers. The boundary between classical ML and neural nets. |
|
|
102
|
+
| `SoftmaxRegression` | Multinomial logistic regression. Log-sum-exp trick for numerical stability. |
|
|
103
|
+
| `GaussianNaiveBayes` | `P(c|x) ∝ P(c)·∏P(xᵢ|c)` in log-space. Zero gradient descent — pure Bayes. |
|
|
104
|
+
| `DecisionTree` | CART with Gini impurity (classification) or variance (regression). Fully recursive. |
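
The `DecisionTree` row refers to CART splitting on Gini impurity. As a rough sketch of that criterion (not the package's code; `gini` and `splitImpurity` are names invented for this example), impurity is 1 minus the sum of squared class proportions, and a candidate split is scored by the size-weighted impurity of its two sides:

```ts
// Gini impurity of a set of class labels: 1 - sum over classes of p_k^2.
function gini(labels: number[]): number {
  if (labels.length === 0) return 0;
  const counts = new Map<number, number>();
  for (const label of labels) {
    counts.set(label, (counts.get(label) ?? 0) + 1);
  }
  let sumSquares = 0;
  counts.forEach((count) => {
    const p = count / labels.length;
    sumSquares += p * p;
  });
  return 1 - sumSquares;
}

// Size-weighted impurity of a candidate split; lower is better.
function splitImpurity(left: number[], right: number[]): number {
  const total = left.length + right.length;
  return (left.length / total) * gini(left) + (right.length / total) * gini(right);
}

const mixedNode = gini([0, 0, 1, 1]);            // 0.5: two classes, evenly mixed
const cleanSplit = splitImpurity([0, 0], [1, 1]); // 0: both sides are pure
```

A tree builder would evaluate `splitImpurity` for each candidate threshold and keep the lowest-scoring one.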
|
|
105
|
+
|
|
106
|
+
### Unsupervised learning
|
|
107
|
+
|
|
108
|
+
| Export | Description |
|
|
109
|
+
|--------|-------------|
|
|
110
|
+
| `KMeans` | K-Means++ initialization + Lloyd's algorithm. `inertia()` for the elbow method. |
|
|
111
|
+
| `PCA` | Principal Component Analysis via power iteration + Hotelling deflation. Projects, reconstructs, explains variance. |
|
|
112
|
+
| `SOM` | Self-Organizing Map (Kohonen). BMU search, Gaussian neighborhood, topology preservation. |
|
|
113
|
+
| `HopfieldNetwork` | Associative memory. Hebbian storage, energy function, async recall. Capacity ~0.138·N. |
|
|
114
|
+
| `Autoencoder` | Encoder + bottleneck + decoder using two `NetworkN` instances. Learns compressed representations. |
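
The `PCA` row names power iteration. A minimal sketch of that idea, assuming a small symmetric matrix such as a covariance matrix; `dominantEigen` is a hypothetical helper, not an export of the package:

```ts
// Power iteration: repeatedly apply the matrix and renormalise.
// The vector converges to the eigenvector with the largest |eigenvalue|.
function dominantEigen(
  matrix: number[][],
  iterations = 200,
): { value: number; vector: number[] } {
  const n = matrix.length;
  const multiply = (v: number[]): number[] =>
    matrix.map((row) => row.reduce((sum, a, j) => sum + a * v[j], 0));

  // Start from a uniform unit vector (must not be orthogonal to the answer).
  let vector: number[] = new Array<number>(n).fill(1 / Math.sqrt(n));
  for (let i = 0; i < iterations; i++) {
    const next = multiply(vector);
    const norm = Math.sqrt(next.reduce((s, x) => s + x * x, 0));
    vector = next.map((x) => x / norm);
  }

  // Rayleigh quotient v·(Av) gives the eigenvalue for a unit vector v.
  const value = multiply(vector).reduce((s, x, i) => s + x * vector[i], 0);
  return { value, vector };
}

const toyCovariance = [
  [3, 1],
  [1, 2],
];
const principal = dominantEigen(toyCovariance);
```

Hotelling deflation would then subtract `value · v vᵀ` from the matrix and run the same loop again to obtain the next component.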
|
|
115
|
+
|
|
116
|
+
### Generative models
|
|
117
|
+
|
|
118
|
+
| Export | Description |
|
|
119
|
+
|--------|-------------|
|
|
120
|
+
| `GAN` | Generator vs Discriminator min-max game. Documents Nash equilibrium and mode collapse. |
|
|
121
|
+
| `VAE` | Variational Autoencoder. Reparametrization trick, ELBO = reconstruction + KL divergence. |
|
|
122
|
+
|
|
123
|
+
### Automatic differentiation
|
|
124
|
+
|
|
125
|
+
| Export | Description |
|
|
126
|
+
|--------|-------------|
|
|
127
|
+
| `Value` | Scalar autograd node. Builds a computational graph and propagates gradients with `.backward()`. Inspired by micrograd. |
|
|
128
|
+
|
|
129
|
+
### Activations & math
|
|
130
|
+
|
|
131
|
+
| Export | Description |
|
|
132
|
+
|--------|-------------|
|
|
133
|
+
| `sigmoid` `relu` `tanh` `linear` `leakyRelu` `elu` | Built-in activation functions with `fn` and `dfn` (derivative from output). |
|
|
134
|
+
| `makeLeakyRelu(α)` `makeElu(α)` | Parametric variants. |
|
|
135
|
+
| `matMul` `transpose` `softmax` `softmaxBackward` | Matrix math utilities. |
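
"Derivative from output" in the activations row means that `dfn` takes the already-computed activation value rather than the pre-activation input. A sketch of that convention, assuming a shape with just `fn` and `dfn` (the package's actual type may carry more fields, and the names below are invented for the example):

```ts
// Assumed shape: fn maps pre-activation to output, dfn maps OUTPUT to derivative.
interface ActivationSketch {
  fn: (x: number) => number;
  dfn: (out: number) => number;
}

const sigmoidSketch: ActivationSketch = {
  fn: (x) => 1 / (1 + Math.exp(-x)),
  dfn: (out) => out * (1 - out), // sigmoid'(x) = s(x) * (1 - s(x))
};

const tanhSketch: ActivationSketch = {
  fn: (x) => Math.tanh(x),
  dfn: (out) => 1 - out * out, // tanh'(x) = 1 - tanh(x)^2
};

// Backprop can reuse the forward output instead of recomputing exp().
const activated = sigmoidSketch.fn(0.3);
const slope = sigmoidSketch.dfn(activated);
```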
|
|
136
|
+
|
|
137
|
+
### Optimizers
|
|
138
|
+
|
|
139
|
+
| Export | Description |
|
|
140
|
+
|--------|-------------|
|
|
141
|
+
| `SGD` | Vanilla stochastic gradient descent. Stateless. |
|
|
142
|
+
| `Momentum` | Accumulates velocity in the gradient direction. |
|
|
143
|
+
| `Adam` | Adaptive moment estimation. Per-parameter first and second moments with bias correction. |
|
|
144
|
+
| `ClipOptimizer` | Wraps any optimizer with gradient clipping. |
|
|
145
|
+
| `ClippedOptimizerFactory` | Factory wrapper that clips all created optimizers. |
|
|
146
|
+
| `defaultOptimizer` | Default factory (`() => new SGD()`). Shared fallback across all classes. |
|
|
147
|
+
|
|
148
|
+
### Loss functions
|
|
149
|
+
|
|
150
|
+
| Export | Description |
|
|
151
|
+
|--------|-------------|
|
|
152
|
+
| `mse` `crossEntropy` | Scalar loss functions for evaluation and logging. |
|
|
153
|
+
| `mseDelta` `crossEntropyDelta` `crossEntropyDeltaRaw` | Output-layer delta functions for `trainWithDeltas`. |
|
|
154
|
+
|
|
155
|
+
### Metrics & evaluation
|
|
156
|
+
|
|
157
|
+
| Export | Description |
|
|
158
|
+
|--------|-------------|
|
|
159
|
+
| `confusionMatrix` | Returns `number[][]` confusion matrix. |
|
|
160
|
+
| `accuracy` `precision` `recall` `f1Score` | Standard classification metrics. |
|
|
161
|
+
| `rocCurve` `auc` | ROC curve points and area under the curve (trapezoidal rule). |
|
|
162
|
+
| `mae` `rmse` `r2Score` | Regression metrics. |
|
|
163
|
+
| `perplexity` | `exp(mean cross-entropy)` — natural metric for language models. |
|
|
164
|
+
| `printConfusionMatrix` `classificationReport` | Console-formatted output tables. |
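
The `rocCurve` / `auc` row mentions the trapezoidal rule. The sketch below shows one common way to compute it, assuming binary labels in `{0, 1}` with both classes present; `rocPoints` and `trapezoidAuc` are hypothetical names and need not match the package's signatures:

```ts
interface RocPoint {
  fpr: number;
  tpr: number;
}

// Sweep the decision threshold from high to low and record (FPR, TPR)
// after each distinct score. Assumes at least one positive and one negative.
function rocPoints(yTrue: number[], scores: number[]): RocPoint[] {
  const positives = yTrue.filter((y) => y === 1).length;
  const negatives = yTrue.length - positives;
  const ranked = scores
    .map((score, i) => ({ score, label: yTrue[i] }))
    .sort((a, b) => b.score - a.score);

  const points: RocPoint[] = [{ fpr: 0, tpr: 0 }];
  let tp = 0;
  let fp = 0;
  ranked.forEach((item, i) => {
    if (item.label === 1) tp++;
    else fp++;
    const lastOfTie = i === ranked.length - 1 || ranked[i + 1].score !== item.score;
    if (lastOfTie) points.push({ fpr: fp / negatives, tpr: tp / positives });
  });
  return points;
}

// Area under the piecewise-linear ROC curve (trapezoidal rule).
function trapezoidAuc(points: RocPoint[]): number {
  let area = 0;
  for (let i = 1; i < points.length; i++) {
    const width = points[i].fpr - points[i - 1].fpr;
    area += (width * (points[i].tpr + points[i - 1].tpr)) / 2;
  }
  return area;
}

// One of the four positive/negative pairs is ranked the wrong way round.
const aucValue = trapezoidAuc(rocPoints([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])); // 0.75
```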
|
|
165
|
+
|
|
166
|
+
### Training utilities
|
|
167
|
+
|
|
168
|
+
| Export | Description |
|
|
169
|
+
|--------|-------------|
|
|
170
|
+
| `Trainer` | Training loop with epochs, batches, metrics, and callbacks. |
|
|
171
|
+
| `DataLoader` | Dataset wrapper with shuffling and validation split. |
|
|
172
|
+
| `LRScheduler` | Learning rate schedules (step, exponential, cosine). |
|
|
173
|
+
| `EarlyStopping` | Stops training when a metric stalls. Configurable patience, mode, and best-weight restore. |
|
|
174
|
+
| `LossPlotter` | Renders a loss curve as ASCII art in the terminal. |
|
|
175
|
+
| `WeightInspector` | Per-layer weight statistics (mean, std, dead weights). Detects dead ReLUs. |
|
|
176
|
+
| `DataAugmentation` | Noise, jitter, normalization, z-score, shuffle, train/val/test split. |
|
|
177
|
+
| `ModelSaver` | Universal serialization via flat `getWeights()` / `setWeights()`. |
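
The `LRScheduler` row lists step, exponential, and cosine schedules. The formulas below are the common textbook forms written as standalone functions; the function names and parameters are assumptions and may not match the class's options:

```ts
// Step decay: multiply by `gamma` once every `stepSize` epochs.
function stepLr(lr0: number, epoch: number, stepSize: number, gamma: number): number {
  return lr0 * Math.pow(gamma, Math.floor(epoch / stepSize));
}

// Exponential decay: multiply by `gamma` every epoch.
function exponentialLr(lr0: number, epoch: number, gamma: number): number {
  return lr0 * Math.pow(gamma, epoch);
}

// Cosine annealing from `lr0` down to `lrMin` over `totalEpochs`.
function cosineLr(lr0: number, epoch: number, totalEpochs: number, lrMin = 0): number {
  const progress = Math.min(epoch, totalEpochs) / totalEpochs;
  return lrMin + 0.5 * (lr0 - lrMin) * (1 + Math.cos(Math.PI * progress));
}

// Learning rate at a few checkpoints of a 100-epoch cosine schedule.
const schedulePreview = [0, 25, 50, 75, 100].map((e) => cosineLr(0.1, e, 100));
```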
|
|
53
178
|
|
|
54
179
|
## Install
|
|
55
180
|
|
|
@@ -66,303 +191,439 @@ import { Neuron } from "@dniskav/neuron";
|
|
|
66
191
|
|
|
67
192
|
const neuron = new Neuron();
|
|
68
193
|
|
|
69
|
-
// Train: output 1 if input >= 18, else 0
|
|
70
194
|
for (let epoch = 0; epoch < 1000; epoch++) {
|
|
71
195
|
neuron.train(20, 1, 0.1); // adult
|
|
72
196
|
neuron.train(15, 0, 0.1); // minor
|
|
73
197
|
}
|
|
74
198
|
|
|
75
|
-
console.log(neuron.predict(17)); // ~0.1
|
|
76
|
-
console.log(neuron.predict(25)); // ~0.9
|
|
199
|
+
console.log(neuron.predict(17)); // ~0.1
|
|
200
|
+
console.log(neuron.predict(25)); // ~0.9
|
|
77
201
|
```
|
|
78
202
|
|
|
79
|
-
###
|
|
203
|
+
### NetworkN — deep network with custom architecture
|
|
80
204
|
|
|
81
205
|
```ts
|
|
82
|
-
import {
|
|
83
|
-
|
|
84
|
-
const neuron = new NeuronN(3); // 3 inputs: R, G, B
|
|
206
|
+
import { NetworkN, relu, sigmoid, Adam } from "@dniskav/neuron";
|
|
85
207
|
|
|
86
|
-
|
|
87
|
-
|
|
88
|
-
|
|
208
|
+
const net = new NetworkN([3, 64, 32, 1], {
|
|
209
|
+
activations: [relu, relu, sigmoid],
|
|
210
|
+
optimizer: () => new Adam(),
|
|
211
|
+
});
|
|
89
212
|
|
|
90
|
-
|
|
213
|
+
net.train([0.5, 0.3, 0.8], [1], 0.001);
|
|
214
|
+
const [out] = net.predict([0.5, 0.3, 0.8]);
|
|
91
215
|
```
|
|
92
216
|
|
|
93
|
-
###
|
|
217
|
+
### Historical Perceptron — step function, no hidden layers
|
|
94
218
|
|
|
95
219
|
```ts
|
|
96
|
-
import {
|
|
97
|
-
|
|
98
|
-
// 2 inputs → 8 hidden neurons → 1 output
|
|
99
|
-
const net = new Network(2, 8, 1);
|
|
220
|
+
import { Perceptron } from "@dniskav/neuron";
|
|
100
221
|
|
|
101
|
-
|
|
102
|
-
const data = [[0,0,0], [0,1,1], [1,0,1], [1,1,0]];
|
|
222
|
+
const p = new Perceptron(2);
|
|
103
223
|
|
|
104
|
-
|
|
105
|
-
|
|
106
|
-
|
|
107
|
-
|
|
108
|
-
}
|
|
224
|
+
// Learns AND gate (linearly separable)
|
|
225
|
+
const data = [[0,0,0],[0,1,0],[1,0,0],[1,1,1]];
|
|
226
|
+
for (let e = 0; e < 100; e++)
|
|
227
|
+
for (const [a, b, t] of data) p.train([a, b], t, 0.1);
|
|
109
228
|
|
|
110
|
-
console.log(
|
|
111
|
-
console.log(
|
|
229
|
+
console.log(p.predict([1, 1])); // 1
|
|
230
|
+
console.log(p.predict([0, 1])); // 0
|
|
231
|
+
// XOR cannot be learned — not linearly separable
|
|
112
232
|
```
|
|
113
233
|
|
|
114
|
-
###
|
|
234
|
+
### Linear Regression — normal equation
|
|
115
235
|
|
|
116
236
|
```ts
|
|
117
|
-
import {
|
|
237
|
+
import { LinearRegression } from "@dniskav/neuron";
|
|
118
238
|
|
|
119
|
-
|
|
120
|
-
const net = new NetworkN([3, 24, 16, 2]);
|
|
239
|
+
const model = new LinearRegression();
|
|
121
240
|
|
|
122
|
-
//
|
|
123
|
-
|
|
241
|
+
// Exact closed-form solution in one call
|
|
242
|
+
model.fitNormal(
|
|
243
|
+
[[1], [2], [3], [4]], // X
|
|
244
|
+
[2, 4, 6, 8] // y = 2x
|
|
245
|
+
);
|
|
124
246
|
|
|
125
|
-
//
|
|
126
|
-
|
|
247
|
+
console.log(model.predict([5])); // ~10
|
|
248
|
+
console.log(model.getCoefficients()); // { weights: [2], bias: ~0 }
|
|
127
249
|
```
|
|
128
250
|
|
|
129
|
-
###
|
|
251
|
+
### Logistic Regression — sigmoid + BCE
|
|
252
|
+
|
|
253
|
+
```ts
|
|
254
|
+
import { LogisticRegression } from "@dniskav/neuron";
|
|
255
|
+
|
|
256
|
+
const clf = new LogisticRegression(2);
|
|
257
|
+
const lossHistory = clf.train(
|
|
258
|
+
[[0,0],[1,1],[1,0],[0,1]],
|
|
259
|
+
[0, 1, 1, 0],
|
|
260
|
+
0.1, 500
|
|
261
|
+
);
|
|
262
|
+
|
|
263
|
+
console.log(clf.classify([0.9, 0.9])); // 1
|
|
264
|
+
console.log(clf.classify([0.1, 0.1])); // 0
|
|
265
|
+
```
|
|
130
266
|
|
|
131
|
-
|
|
267
|
+
### Gaussian Naive Bayes — zero gradient descent
|
|
132
268
|
|
|
133
269
|
```ts
|
|
134
|
-
import {
|
|
270
|
+
import { GaussianNaiveBayes } from "@dniskav/neuron";
|
|
135
271
|
|
|
136
|
-
const
|
|
137
|
-
|
|
138
|
-
|
|
272
|
+
const nb = new GaussianNaiveBayes();
|
|
273
|
+
nb.fit(
|
|
274
|
+
[[1.2, 0.5], [1.4, 0.7], [5.0, 4.5], [5.2, 4.8]],
|
|
275
|
+
[0, 0, 1, 1]
|
|
276
|
+
);
|
|
277
|
+
|
|
278
|
+
console.log(nb.predict([1.3, 0.6])); // 0
|
|
279
|
+
console.log(nb.predict([5.1, 4.6])); // 1
|
|
139
280
|
```
|
|
140
281
|
|
|
141
|
-
|
|
282
|
+
### Decision Tree — Gini split
|
|
142
283
|
|
|
143
|
-
|
|
284
|
+
```ts
|
|
285
|
+
import { DecisionTree } from "@dniskav/neuron";
|
|
286
|
+
|
|
287
|
+
const tree = new DecisionTree({ maxDepth: 4, task: 'classification' });
|
|
288
|
+
tree.fit(X_train, y_train);
|
|
289
|
+
const predictions = tree.predictBatch(X_test);
|
|
290
|
+
```
|
|
144
291
|
|
|
145
|
-
|
|
292
|
+
### K-Means — unsupervised clustering
|
|
146
293
|
|
|
147
294
|
```ts
|
|
148
|
-
import {
|
|
295
|
+
import { KMeans } from "@dniskav/neuron";
|
|
149
296
|
|
|
150
|
-
const
|
|
151
|
-
|
|
152
|
-
optimizer: () => new Adam(), // default: beta1=0.9, beta2=0.999
|
|
153
|
-
});
|
|
297
|
+
const km = new KMeans(3); // 3 clusters
|
|
298
|
+
km.fit(points);
|
|
154
299
|
|
|
155
|
-
//
|
|
156
|
-
|
|
157
|
-
const net2 = new NetworkN([2, 32, 1], {
|
|
158
|
-
optimizer: () => new Momentum(0.9),
|
|
159
|
-
});
|
|
300
|
+
const cluster = km.predict([1.2, 0.5]); // index 0, 1 or 2
|
|
301
|
+
console.log(km.inertia(points)); // lower = better fit
|
|
160
302
|
```
|
|
161
303
|
|
|
162
|
-
|
|
304
|
+
### PCA — dimensionality reduction
|
|
163
305
|
|
|
164
306
|
```ts
|
|
165
|
-
import {
|
|
307
|
+
import { PCA } from "@dniskav/neuron";
|
|
166
308
|
|
|
167
|
-
const
|
|
168
|
-
|
|
169
|
-
|
|
170
|
-
|
|
309
|
+
const pca = new PCA(2); // keep top 2 components
|
|
310
|
+
pca.fit(X); // 100 samples × 10 features
|
|
311
|
+
|
|
312
|
+
const Z = pca.transform(X); // 100 × 2
|
|
313
|
+
const X2 = pca.inverseTransform(Z); // reconstructed 100 × 10
|
|
314
|
+
|
|
315
|
+
console.log(pca.explainedVarianceRatio()); // [0.72, 0.15, ...]
|
|
171
316
|
```
|
|
172
317
|
|
|
173
|
-
###
|
|
318
|
+
### Self-Organizing Map
|
|
174
319
|
|
|
175
320
|
```ts
|
|
176
|
-
import {
|
|
321
|
+
import { SOM } from "@dniskav/neuron";
|
|
177
322
|
|
|
178
|
-
const
|
|
179
|
-
|
|
180
|
-
console.log(crossEntropy(predicted, [1, 0]));
|
|
181
|
-
```
|
|
323
|
+
const som = new SOM(10, 10, 3); // 10×10 grid, 3-dimensional inputs (RGB)
|
|
324
|
+
som.train(colors, 500);
|
|
182
325
|
|
|
183
|
-
|
|
326
|
+
const [row, col] = som.getBMU([255, 0, 0]); // find best matching unit for red
|
|
327
|
+
console.log(som.quantizationError(colors));
|
|
328
|
+
```
|
|
184
329
|
|
|
185
|
-
|
|
330
|
+
### Hopfield Network — associative memory
|
|
186
331
|
|
|
187
332
|
```ts
|
|
188
|
-
import {
|
|
333
|
+
import { HopfieldNetwork } from "@dniskav/neuron";
|
|
189
334
|
|
|
190
|
-
const net = new
|
|
191
|
-
const pred = net.predict(inputs);
|
|
335
|
+
const net = new HopfieldNetwork(64); // 64 binary neurons
|
|
192
336
|
|
|
193
|
-
//
|
|
194
|
-
|
|
195
|
-
net.
|
|
196
|
-
```
|
|
337
|
+
// Store two 64-bit patterns
|
|
338
|
+
net.store(HopfieldNetwork.binarize(pattern1)); // converts 0/1 → -1/+1
|
|
339
|
+
net.store(HopfieldNetwork.binarize(pattern2));
|
|
197
340
|
|
|
198
|
-
|
|
341
|
+
// Recall from noisy input
|
|
342
|
+
const recovered = net.recall(HopfieldNetwork.binarize(noisyPattern1));
|
|
343
|
+
console.log(net.energy(recovered)); // local minimum = stored memory
|
|
344
|
+
```
|
|
199
345
|
|
|
200
|
-
|
|
346
|
+
### Autoencoder — learn compressed representations
|
|
201
347
|
|
|
202
348
|
```ts
|
|
203
|
-
import {
|
|
349
|
+
import { Autoencoder } from "@dniskav/neuron";
|
|
204
350
|
|
|
205
|
-
//
|
|
206
|
-
const
|
|
351
|
+
// 784 → [128, 64] → 16 (latent) → [64, 128] → 784
|
|
352
|
+
const ae = new Autoencoder(784, [128, 64], 16, [64, 128]);
|
|
207
353
|
|
|
208
|
-
|
|
209
|
-
|
|
354
|
+
for (let e = 0; e < 1000; e++)
|
|
355
|
+
for (const x of images)
|
|
356
|
+
ae.train(x, 0.001);
|
|
210
357
|
|
|
211
|
-
|
|
212
|
-
|
|
358
|
+
const latent = ae.encode(image); // compressed: 16 values
|
|
359
|
+
const reconstructed = ae.reconstruct(image); // decoded back: 784 values
|
|
360
|
+
```
|
|
213
361
|
|
|
214
|
-
|
|
215
|
-
for (let step = 0; step < 6; step++) {
|
|
216
|
-
net.predict([1]); // same input every step
|
|
217
|
-
targets.push([step >= 3 ? 1 : 0]);
|
|
218
|
-
}
|
|
362
|
+
### GAN — generative adversarial training
|
|
219
363
|
|
|
220
|
-
|
|
364
|
+
```ts
|
|
365
|
+
import { GAN } from "@dniskav/neuron";
|
|
366
|
+
|
|
367
|
+
const gan = new GAN(
|
|
368
|
+
16, // latentDim
|
|
369
|
+
[32, 64], // generator hidden layers
|
|
370
|
+
8, // outputDim (size of generated samples)
|
|
371
|
+
[64, 32], // discriminator hidden layers
|
|
372
|
+
);
|
|
373
|
+
|
|
374
|
+
for (let step = 0; step < 10000; step++) {
|
|
375
|
+
const { dLoss, gLoss } = gan.trainStep(realBatch, 0.0002);
|
|
376
|
+
if (step % 500 === 0) console.log(`D: ${dLoss.toFixed(3)} G: ${gLoss.toFixed(3)}`);
|
|
221
377
|
}
|
|
222
378
|
|
|
223
|
-
|
|
224
|
-
|
|
225
|
-
|
|
226
|
-
|
|
227
|
-
|
|
379
|
+
const fake = gan.generate(); // new synthetic sample
|
|
380
|
+
```
|
|
381
|
+
|
|
382
|
+
### VAE — variational autoencoder
|
|
383
|
+
|
|
384
|
+
```ts
|
|
385
|
+
import { VAE } from "@dniskav/neuron";
|
|
386
|
+
|
|
387
|
+
const vae = new VAE(784, [256, 128], 32, [128, 256]);
|
|
388
|
+
|
|
389
|
+
for (const x of dataset) {
|
|
390
|
+
const { totalLoss, reconLoss, klLoss } = vae.train(x, 0.001);
|
|
228
391
|
}
|
|
229
|
-
|
|
230
|
-
//
|
|
231
|
-
|
|
232
|
-
|
|
233
|
-
|
|
234
|
-
// step 5: 0.93 (expected: 1)
|
|
392
|
+
|
|
393
|
+
// Sample from latent space
|
|
394
|
+
const generated = vae.generate(); // random sample
|
|
395
|
+
const { mu, logVar } = vae.encode(image); // encode → distribution params
|
|
396
|
+
const z = vae.reparametrize(mu, logVar); // sample z ~ N(μ, σ²)
|
|
235
397
|
```
|
|
236
398
|
|
|
237
|
-
|
|
399
|
+
### Value / Tape — automatic differentiation
|
|
238
400
|
|
|
239
|
-
|
|
401
|
+
```ts
|
|
402
|
+
import { Value } from "@dniskav/neuron";
|
|
240
403
|
|
|
241
|
-
|
|
404
|
+
// Build a computation graph
|
|
405
|
+
const x = new Value(2.0);
|
|
406
|
+
const w = new Value(-3.0);
|
|
407
|
+
const b = new Value(6.7);
|
|
408
|
+
const n = x.mul(w).add(b); // n = x*w + b
|
|
409
|
+
const o = n.tanh(); // o = tanh(n)
|
|
242
410
|
|
|
411
|
+
// Backward pass — fills .grad for every node
|
|
412
|
+
o.backward();
|
|
413
|
+
|
|
414
|
+
console.log(x.grad); // ∂o/∂x
|
|
415
|
+
console.log(w.grad); // ∂o/∂w
|
|
416
|
+
console.log(b.grad); // ∂o/∂b
|
|
243
417
|
```
|
|
244
|
-
|
|
245
|
-
|
|
418
|
+
|
|
419
|
+
### Conv2D + MaxPool2D + Flatten — CNN pipeline
|
|
420
|
+
|
|
421
|
+
```ts
|
|
422
|
+
import { Conv2D, MaxPool2D, Flatten, NetworkN, relu, sigmoid } from "@dniskav/neuron";
|
|
423
|
+
|
|
424
|
+
const conv = new Conv2D(28, 28, 1, 3, 8); // 28×28×1 → 26×26×8
|
|
425
|
+
const pool = new MaxPool2D(2); // 26×26×8 → 13×13×8
|
|
426
|
+
const flatten = new Flatten();
|
|
427
|
+
const dense = new NetworkN([13*13*8, 64, 10]);
|
|
428
|
+
|
|
429
|
+
// Forward
|
|
430
|
+
const featureMaps = conv.forward(image); // [H][W][C]
|
|
431
|
+
const pooled = pool.forward(featureMaps);
|
|
432
|
+
const flat = flatten.forward(pooled); // 1352 values
|
|
433
|
+
const logits = dense.predict(flat);
|
|
246
434
|
```
|
|
247
435
|
|
|
248
|
-
|
|
436
|
+
### RNN — vanilla recurrent network
|
|
249
437
|
|
|
250
|
-
|
|
438
|
+
```ts
|
|
439
|
+
import { RNN } from "@dniskav/neuron";
|
|
251
440
|
|
|
252
|
-
|
|
441
|
+
// 1 input → 16 hidden → 1 output, over a sequence
|
|
442
|
+
const rnn = new RNN(1, 16, 1);
|
|
253
443
|
|
|
254
|
-
|
|
444
|
+
const sequence = [[0.1], [0.3], [0.7], [0.9]]; // 4 timesteps
|
|
445
|
+
const { outputs, hiddens } = rnn.forward(sequence);
|
|
255
446
|
|
|
256
|
-
|
|
257
|
-
|
|
258
|
-
|
|
447
|
+
// BPTT backward — returns MSE loss
|
|
448
|
+
const targets = [[0.2], [0.5], [0.8], [1.0]];
|
|
449
|
+
const loss = rnn.backward(sequence, targets, 0.01);
|
|
259
450
|
```
|
|
260
451
|
|
|
261
|
-
|
|
452
|
+
### TCN — Temporal Convolutional Network
|
|
262
453
|
|
|
263
|
-
|
|
454
|
+
```ts
|
|
455
|
+
import { TCN } from "@dniskav/neuron";
|
|
456
|
+
|
|
457
|
+
// 3 input channels → 32 channels × 4 levels → 1 output
|
|
458
|
+
// Receptive field = (3-1)·(2⁴-1)+1 = 31 timesteps
|
|
459
|
+
const tcn = new TCN(3, 32, 3, 4, 1);
|
|
460
|
+
|
|
461
|
+
const sequence = Array.from({ length: 50 }, () => [Math.random(), Math.random(), Math.random()]);
|
|
462
|
+
const outputs = tcn.forward(sequence); // [50][1]
|
|
463
|
+
```
|
|
264
464
|
|
|
265
|
-
###
|
|
465
|
+
### NetworkLSTM — recurrent memory
|
|
266
466
|
|
|
267
467
|
```ts
|
|
268
|
-
import {
|
|
269
|
-
|
|
270
|
-
|
|
271
|
-
|
|
272
|
-
|
|
273
|
-
|
|
274
|
-
|
|
275
|
-
|
|
276
|
-
|
|
277
|
-
|
|
278
|
-
|
|
468
|
+
import { NetworkLSTM } from "@dniskav/neuron";
|
|
469
|
+
|
|
470
|
+
const net = new NetworkLSTM(1, 8, [4, 1]);
|
|
471
|
+
|
|
472
|
+
for (let epoch = 0; epoch < 300; epoch++) {
|
|
473
|
+
net.resetState();
|
|
474
|
+
for (let step = 0; step < 6; step++) net.predict([1]);
|
|
475
|
+
net.train([[0],[0],[0],[1],[1],[1]], 0.05);
|
|
476
|
+
}
|
|
477
|
+
```
|
|
478
|
+
|
|
479
|
+
### Metrics — evaluate your model
|
|
480
|
+
|
|
481
|
+
```ts
|
|
482
|
+
import { accuracy, f1Score, confusionMatrix, printConfusionMatrix, auc, classificationReport } from "@dniskav/neuron";
|
|
483
|
+
|
|
484
|
+
const yTrue = [0, 1, 1, 0, 1];
|
|
485
|
+
const yPred = [0, 1, 0, 0, 1];
|
|
486
|
+
|
|
487
|
+
console.log(accuracy(yTrue, yPred)); // 0.8
|
|
488
|
+
console.log(f1Score(yTrue, yPred)); // 0.8
|
|
279
489
|
|
|
280
|
-
|
|
281
|
-
|
|
282
|
-
const targets = [...]; // 81*9 one-hot values
|
|
283
|
-
const mask = puzzle.map(v => v === 0); // only train on empty cells
|
|
490
|
+
const cm = confusionMatrix(yTrue, yPred);
|
|
491
|
+
printConfusionMatrix(cm, ['neg', 'pos']);
|
|
284
492
|
|
|
285
|
-
|
|
286
|
-
|
|
287
|
-
|
|
493
|
+
// AUC-ROC
|
|
494
|
+
const scores = [0.1, 0.9, 0.4, 0.2, 0.8];
|
|
495
|
+
console.log(auc(yTrue, scores));       // 1 (every positive scores above every negative)
|
|
288
496
|
|
|
289
|
-
|
|
290
|
-
const weights = net.getAttentionWeights();
|
|
291
|
-
// weights[blockIdx][headIdx] → seqLen × seqLen matrix
|
|
497
|
+
classificationReport(yTrue, yPred, ['neg', 'pos']);
|
|
292
498
|
```
|
|
293
499
|
|
|
294
|
-
|
|
295
|
-
3×3 box). The network figures this out by itself through training.
|
|
500
|
+
### EarlyStopping
|
|
296
501
|
|
|
297
|
-
|
|
502
|
+
```ts
|
|
503
|
+
import { EarlyStopping } from "@dniskav/neuron";
|
|
298
504
|
|
|
299
|
-
|
|
505
|
+
const stopper = new EarlyStopping({ patience: 10, minDelta: 1e-4, mode: 'min' });
|
|
506
|
+
|
|
507
|
+
for (let epoch = 0; epoch < 1000; epoch++) {
|
|
508
|
+
const valLoss = trainEpoch();
|
|
509
|
+
if (stopper.update(valLoss, epoch)) {
|
|
510
|
+
console.log(`Stopped at epoch ${epoch}`);
|
|
511
|
+
break;
|
|
512
|
+
}
|
|
513
|
+
}
|
|
514
|
+
```
|
|
515
|
+
|
|
516
|
+
### LossPlotter — ASCII loss curve
|
|
300
517
|
|
|
301
518
|
```ts
|
|
302
|
-
import {
|
|
303
|
-
|
|
304
|
-
|
|
305
|
-
|
|
306
|
-
|
|
307
|
-
|
|
308
|
-
|
|
309
|
-
|
|
310
|
-
|
|
311
|
-
|
|
519
|
+
import { LossPlotter } from "@dniskav/neuron";
|
|
520
|
+
|
|
521
|
+
const plotter = new LossPlotter({ width: 60, height: 12, title: 'Training Loss' });
|
|
522
|
+
|
|
523
|
+
for (let e = 0; e < 500; e++) {
|
|
524
|
+
const loss = trainStep();
|
|
525
|
+
plotter.add(loss, e);
|
|
526
|
+
}
|
|
527
|
+
|
|
528
|
+
plotter.print();
|
|
529
|
+
// Training Loss
|
|
530
|
+
// ┌────────────────────────────────────────────────────────────┐
|
|
531
|
+
// │ 2.31 ·
|
|
532
|
+
// │ · ·
|
|
533
|
+
// │ · · ·
|
|
534
|
+
// │ · · · · · · ·
|
|
535
|
+
// │ 0.02 · · · · · · · · · · · · · · ·
|
|
536
|
+
// └────────────────────────────────────────────────────────────┘
|
|
537
|
+
// 0 250 499
|
|
538
|
+
```
|
|
539
|
+
|
|
540
|
+
### DataAugmentation
|
|
312
541
|
|
|
313
|
-
|
|
314
|
-
|
|
315
|
-
const qValues = net.predict(sequence); // number[4]
|
|
542
|
+
```ts
|
|
543
|
+
import { DataAugmentation } from "@dniskav/neuron";
|
|
316
544
|
|
|
317
|
-
//
|
|
318
|
-
const
|
|
319
|
-
const reward = env.step(action);
|
|
320
|
-
const targets = qValues.slice();
|
|
321
|
-
targets[action] = reward + 0.99 * Math.max(...net.predict(nextSequence));
|
|
545
|
+
// Split dataset
|
|
546
|
+
const { trainX, trainY, valX, valY } = DataAugmentation.split(X, y, 0.8, 0.1);
|
|
322
547
|
|
|
323
|
-
|
|
548
|
+
// Normalize (fit on train, apply to all)
|
|
549
|
+
const { normalized: normTrain, min, max } = DataAugmentation.normalize(trainX);
|
|
550
|
+
const normVal = valX.map(x => DataAugmentation.normalizePoint(x, min, max));
|
|
551
|
+
|
|
552
|
+
// Augment training set (×3 copies with Gaussian noise)
|
|
553
|
+
const { X: augX, y: augY } = DataAugmentation.augmentBatch(normTrain, trainY, 3, 0.02);
|
|
324
554
|
```
|
|
325
555
|
|
|
326
|
-
|
|
556
|
+
### WeightInspector — diagnose your network
|
|
327
557
|
|
|
328
558
|
```ts
|
|
329
|
-
|
|
330
|
-
|
|
331
|
-
|
|
559
|
+
import { NetworkN, WeightInspector, relu } from "@dniskav/neuron";
|
|
560
|
+
|
|
561
|
+
const net = new NetworkN([784, 256, 128, 10], { activations: [relu, relu, relu] });
|
|
562
|
+
// ... train ...
|
|
563
|
+
|
|
564
|
+
WeightInspector.print(net);
|
|
565
|
+
// Layer 0: mean=0.001 std=0.056 min=-0.21 max=0.19 dead=0 params=200960
|
|
566
|
+
// Layer 1: mean=0.000 std=0.079 min=-0.31 max=0.28 dead=3 params=32896
|
|
567
|
+
// Layer 2: mean=-0.001 std=0.091 min=-0.28 max=0.32 dead=0 params=1290
|
|
568
|
+
```
|
|
569
|
+
|
|
570
|
+
## How it works
|
|
571
|
+
|
|
572
|
+
Each neuron-based class applies an **activation function** to the weighted sum of its inputs and uses **gradient descent** to update weights:
|
|
573
|
+
|
|
574
|
+
```
|
|
575
|
+
weight += lr × delta × input
|
|
576
|
+
bias += lr × delta
|
|
577
|
+
```
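
As a rough illustration of the update rule above, the sketch below trains a one-input sigmoid neuron. It is not the library's `Neuron` source: `TinyNeuron`, `trainStep`, and the choice of `delta = (target - output) * output * (1 - output)` (squared error with a sigmoid) are assumptions made for the example.

```ts
// Hypothetical stand-in for a one-input neuron; not the library's Neuron class.
class TinyNeuron {
  weight = 0;
  bias = 0;

  predict(input: number): number {
    return 1 / (1 + Math.exp(-(this.weight * input + this.bias)));
  }

  // One gradient-descent step on squared error.
  trainStep(input: number, target: number, lr: number): void {
    const output = this.predict(input);
    // delta = error times the sigmoid derivative, expressed from the output
    const delta = (target - output) * output * (1 - output);
    this.weight += lr * delta * input;
    this.bias += lr * delta;
  }
}

const tiny = new TinyNeuron();
for (let epoch = 0; epoch < 2000; epoch++) {
  tiny.trainStep(1, 1, 0.5);  // input  1 should map to 1
  tiny.trainStep(-1, 0, 0.5); // input -1 should map to 0
}
```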
|
|
578
|
+
|
|
579
|
+
`NetworkN` implements full **backpropagation** across all layers, propagating deltas from the output back to the first layer using the chain rule. `NeuronN` uses **Xavier initialization** — weights start in `[-√(1/n), +√(1/n)]`.
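
The initialization range quoted above can be sketched as follows. This assumes `n` is the number of inputs to the neuron and that the draw is uniform; `xavierWeights` is a hypothetical helper, not part of the package.

```ts
// Draw `n` weights uniformly from [-sqrt(1/n), +sqrt(1/n)].
// `rand` is injectable so the draw can be made deterministic in tests.
function xavierWeights(n: number, rand: () => number = Math.random): number[] {
  const limit = Math.sqrt(1 / n);
  return Array.from({ length: n }, () => (rand() * 2 - 1) * limit);
}

const initial = xavierWeights(16); // limit = sqrt(1/16) = 0.25
```

Scaling the range by the fan-in keeps the weighted sum at a similar magnitude regardless of how many inputs a neuron has.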
|
|
580
|
+
|
|
581
|
+
When an **optimizer** is used (e.g., Adam), the raw gradient is passed to the optimizer instead of being applied directly. Each weight maintains its own optimizer state.
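
To make "each weight maintains its own optimizer state" concrete, here is a sketch of Adam for a single scalar. The class name `ScalarAdam`, its `step` method, and the hyperparameter defaults are assumptions for illustration; the package's `Adam` class may expose a different interface.

```ts
// Hypothetical per-scalar Adam state; one instance would belong to one weight.
class ScalarAdam {
  private m = 0; // first moment: running mean of gradients
  private v = 0; // second moment: running mean of squared gradients
  private t = 0; // step counter used for bias correction
  private readonly beta1 = 0.9;
  private readonly beta2 = 0.999;
  private readonly eps = 1e-8;

  // Returns the amount to subtract from the weight for this gradient.
  step(grad: number, lr: number): number {
    this.t += 1;
    this.m = this.beta1 * this.m + (1 - this.beta1) * grad;
    this.v = this.beta2 * this.v + (1 - this.beta2) * grad * grad;
    const mHat = this.m / (1 - Math.pow(this.beta1, this.t));
    const vHat = this.v / (1 - Math.pow(this.beta2, this.t));
    return (lr * mHat) / (Math.sqrt(vHat) + this.eps);
  }
}

// Minimise f(weight) = (weight - 3)^2 with a single Adam-driven scalar.
let weight = 0;
const adamState = new ScalarAdam();
for (let i = 0; i < 2000; i++) {
  const grad = 2 * (weight - 3);
  weight -= adamState.step(grad, 0.05);
}
```

Here the gradient is of a loss being minimised, so the step is subtracted; the `+=` form in the update rule above folds the sign into `delta`.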
|
|
582
|
+
|
|
583
|
+
The `Value` class implements **reverse-mode automatic differentiation**: every operation records its inputs and a backward function. Calling `.backward()` on the output node performs a topological sort and propagates `∂L/∂w` through the entire graph.
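
The mechanism described above fits in a few dozen lines. The sketch below is a micrograd-style illustration under stated assumptions, not the package's `Value` source: the class is named `Scalar` to avoid implying it is the real export, and it supports only `add`, `mul`, and `tanh`. It reuses the numbers from the `x.mul(w).add(b).tanh()` example.

```ts
// Minimal scalar autograd node. Illustrative only; the library's Value may differ.
class Scalar {
  data: number;
  grad = 0;
  parents: Scalar[];
  // Pushes this node's gradient into its parents (chain rule for one operation).
  backwardFn: () => void = () => {};

  constructor(data: number, parents: Scalar[] = []) {
    this.data = data;
    this.parents = parents;
  }

  add(other: Scalar): Scalar {
    const out = new Scalar(this.data + other.data, [this, other]);
    out.backwardFn = () => {
      this.grad += out.grad;
      other.grad += out.grad;
    };
    return out;
  }

  mul(other: Scalar): Scalar {
    const out = new Scalar(this.data * other.data, [this, other]);
    out.backwardFn = () => {
      this.grad += other.data * out.grad;
      other.grad += this.data * out.grad;
    };
    return out;
  }

  tanh(): Scalar {
    const t = Math.tanh(this.data);
    const out = new Scalar(t, [this]);
    out.backwardFn = () => {
      this.grad += (1 - t * t) * out.grad;
    };
    return out;
  }

  backward(): void {
    // Topological order: every node is listed after all of its parents.
    const order: Scalar[] = [];
    const seen = new Set<Scalar>();
    const visit = (node: Scalar): void => {
      if (seen.has(node)) return;
      seen.add(node);
      node.parents.forEach((parent) => visit(parent));
      order.push(node);
    };
    visit(this);

    this.grad = 1; // derivative of the output with respect to itself
    // Walk from the output back to the leaves.
    for (const node of order.reverse()) node.backwardFn();
  }
}

const xNode = new Scalar(2.0);
const wNode = new Scalar(-3.0);
const bNode = new Scalar(6.7);
const oNode = xNode.mul(wNode).add(bNode).tanh(); // tanh(x*w + b)
oNode.backward();
```

Gradients accumulate with `+=` so that a node used more than once in the graph receives the sum of all its contributions.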
|
|
584
|
+
|
|
585
|
+
## Build
|
|
586
|
+
|
|
587
|
+
```bash
|
|
588
|
+
npm run build # outputs CJS + ESM + type declarations to dist/
|
|
589
|
+
npm run dev # watch mode
|
|
590
|
+
npm test # run test suite
|
|
332
591
|
```
|
|
333
592
|
|
|
593
|
+
## For AI agents
|
|
594
|
+
|
|
595
|
+
If you are an AI agent or LLM working with this codebase, read [AGENTS.md](AGENTS.md) first. It contains the full class hierarchy, design constraints, and what this library does not do.
|
|
596
|
+
|
|
334
597
|
## Changelog
|
|
335
598
|
|
|
599
|
+
### v0.3.0
|
|
600
|
+
- **New — Classical ML:** `Perceptron`, `LinearRegression` (normal equation + GD), `LogisticRegression`, `SoftmaxRegression`, `GaussianNaiveBayes`, `DecisionTree` (CART, Gini/MSE)
|
|
601
|
+
- **New — Unsupervised:** `KMeans` (K-Means++ init), `PCA` (power iteration + Hotelling deflation), `SOM` (Kohonen map), `HopfieldNetwork` (Hebbian storage + energy), `Autoencoder`
|
|
602
|
+
- **New — Deep Learning:** `Conv2D` (full forward/backward), `MaxPool2D` (position mask for exact backprop), `Flatten`, `RNN` (BPTT, documents vanishing gradient), `Seq2Seq` (encoder-decoder LSTM), `CausalConv1D`, `TCN` (dilated temporal convolutions)
|
|
603
|
+
- **New — Generative:** `GAN` (min-max game, Box-Muller sampling), `VAE` (reparametrization trick, ELBO = MSE + KL)
|
|
604
|
+
- **New — Autograd:** `Value` / `Tape` — scalar reverse-mode AD with topological backprop (micrograd-style)
|
|
605
|
+
- **New — Metrics:** `confusionMatrix`, `accuracy`, `precision`, `recall`, `f1Score`, `rocCurve`, `auc`, `mae`, `rmse`, `r2Score`, `perplexity`, `printConfusionMatrix`, `classificationReport`
|
|
606
|
+
- **New — Utilities:** `EarlyStopping` (patience + best-weight restore), `LossPlotter` (ASCII terminal curve), `WeightInspector` (per-layer stats, dead ReLU detection), `DataAugmentation` (noise, normalize, z-score, shuffle, split)
|
|
607
|
+
|
|
336
608
|
### v0.2.7
|
|
337
|
-
- **Docs:** Added architecture diagram to README
|
|
609
|
+
- **Docs:** Added architecture diagram to README
|
|
338
610
|
|
|
339
611
|
### v0.2.6
|
|
340
612
|
- **Fix:** `Network.predict` now returns `number[]` (consistent with all other network classes)
|
|
341
|
-
- **Fix:** `Network.train` now uses the configured optimizer and `activation.dfn()`
|
|
342
|
-
- **Fix:** `LayerNorm.backwardOne`
|
|
343
|
-
- **Fix:** LSTM and GRU gate initialization corrected
|
|
344
|
-
- **New:** `BiasVector` — 1D counterpart to `WeightMatrix`
|
|
345
|
-
- **New:** `defaultOptimizer`
|
|
346
|
-
- **Refactor:** `NetworkN
|
|
347
|
-
- **Refactor:** `Transformer` backward methods now throw descriptive errors instead of crashing with a cryptic `TypeError` when called before `predict()`
|
|
348
|
-
- **Refactor:** `NetworkTransformer.setWeights()` and `NetworkTransformerRL.setWeightsFlat()` use each component's own `setWeights()` instead of direct `.W` mutation
|
|
613
|
+
- **Fix:** `Network.train` now uses the configured optimizer and `activation.dfn()`
|
|
614
|
+
- **Fix:** `LayerNorm.backwardOne` correctly uses pre-update γ
|
|
615
|
+
- **Fix:** LSTM and GRU gate initialization corrected to Xavier fan-in+out
|
|
616
|
+
- **New:** `BiasVector` — 1D counterpart to `WeightMatrix`
|
|
617
|
+
- **New:** `defaultOptimizer` — shared default factory
|
|
618
|
+
- **Refactor:** `NetworkN` extracts `_forwardAll()` and `_backpropLayers()`
|
|
349
619
|
|
|
350
620
|
### v0.2.5
|
|
351
|
-
- Unified optimizer factories for `LSTMLayer`, `GRULayer`, `Conv1D`
|
|
352
|
-
- `NetworkN`: residual connections
|
|
353
|
-
- `Conv1D`: multi-channel input
|
|
354
|
-
- `
|
|
355
|
-
- `
|
|
356
|
-
- `
|
|
357
|
-
- `ModelSaver`: universal serialization via flat `getWeights()`/`setWeights()` for all classes
|
|
358
|
-
- Gradient check test suite (`tests/GradientCheck.test.ts`)
|
|
359
|
-
|
|
360
|
-
## Possible improvements
|
|
361
|
-
|
|
362
|
-
1. **Support for batches** in training to improve efficiency and gradient stability.
|
|
363
|
-
2. **Global gradient norm clipping** — `WeightMatrix.update` supports per-element clipping; a utility to clip across all matrices by total norm would be more principled.
|
|
364
|
-
3. **Learning rate warmup** — standard practice for Transformers; ramp LR from 0 to target over the first N steps.
|
|
365
|
-
4. **Pre-norm architecture** — LayerNorm before the residual add (instead of after) is more stable for deep stacks.
|
|
621
|
+
- Unified optimizer factories for `LSTMLayer`, `GRULayer`, `Conv1D`
|
|
622
|
+
- `NetworkN`: residual connections and dropout
|
|
623
|
+
- `Conv1D`: multi-channel input
|
|
624
|
+
- `Trainer`: weight decay, early stopping, classification metrics
|
|
625
|
+
- `DataLoader`: validation split
|
|
626
|
+
- `ModelSaver`: universal serialization
|
|
366
627
|
|
|
367
628
|
## License
|
|
368
629
|
|