@dniskav/neuron 0.2.6 → 0.3.0
- package/README.md +479 -193
- package/dist/index.d.mts +470 -1
- package/dist/index.d.ts +470 -1
- package/dist/index.js +3023 -2
- package/dist/index.mjs +2985 -2
- package/package.json +2 -2
package/README.md
CHANGED
|
@@ -3,8 +3,60 @@
|
|
|
3
3
|
|
|
4
4
|
A minimal, dependency-free neural network library built from scratch in TypeScript. Designed for learning and experimentation — every line of math is readable.
|
|
5
5
|
|
|
6
|
+
Each class is a building block for the next: from a single neuron to a full Transformer with causal attention. v0.3.0 adds classical ML, unsupervised learning, generative models, autograd, and training utilities — all in pure TypeScript, zero dependencies.
|
|
7
|
+
|
|
8
|
+
```mermaid
|
|
9
|
+
graph TD
|
|
10
|
+
A["Neuron\n1 input · 1 weight · 1 bias"]
|
|
11
|
+
B["NeuronN\nN inputs · Xavier init · configurable activation"]
|
|
12
|
+
C["Layer\ngroup of NeuronN sharing the same inputs"]
|
|
13
|
+
D["Network\nhidden + output · backprop"]
|
|
14
|
+
E["NetworkN\narbitrary depth · define as [inputs, ...hidden, outputs]"]
|
|
15
|
+
F["LSTMLayer\nrecurrent · hidden + cell state · BPTT"]
|
|
16
|
+
G["NetworkLSTM\nLSTM + dense layers · sequence memory"]
|
|
17
|
+
H["AttentionHead\nQ · K · V · scaled dot-product"]
|
|
18
|
+
I["MultiHeadAttention\nN heads in parallel"]
|
|
19
|
+
J["TransformerBlock\nattention + FFN + LayerNorm × 2 + residuals"]
|
|
20
|
+
K["NetworkTransformer\nembeddings → blocks → per-token logits"]
|
|
21
|
+
L["NetworkTransformerRL\ncontinuous projection → causal attention → Q-values"]
|
|
22
|
+
|
|
23
|
+
subgraph Classical ML
|
|
24
|
+
P["Perceptron\nstep function · Rosenblatt rule"]
|
|
25
|
+
LR["LinearRegression\nnormal equation · gradient descent"]
|
|
26
|
+
LOG["LogisticRegression\nsigmoid · BCE · SoftmaxRegression"]
|
|
27
|
+
NB["GaussianNaiveBayes\nlog-probabilities · Gaussian P(x|c)"]
|
|
28
|
+
DT["DecisionTree\nCART · Gini · MSE split"]
|
|
29
|
+
end
|
|
30
|
+
|
|
31
|
+
subgraph Unsupervised
|
|
32
|
+
KM["KMeans\nK-Means++ · inertia · elbow"]
|
|
33
|
+
PCA["PCA\npower iteration · projection · reconstruction"]
|
|
34
|
+
SOM["SOM\nKohonen · BMU · Gaussian neighborhood"]
|
|
35
|
+
HN["HopfieldNetwork\nHebbian · energy · associative memory"]
|
|
36
|
+
AE["Autoencoder\nencoder · bottleneck · decoder"]
|
|
37
|
+
end
|
|
38
|
+
|
|
39
|
+
subgraph Generative
|
|
40
|
+
GAN["GAN\ngenerator · discriminator · min-max"]
|
|
41
|
+
VAE["VAE\nreparametrization trick · ELBO · KL"]
|
|
42
|
+
end
|
|
43
|
+
|
|
44
|
+
subgraph Autograd
|
|
45
|
+
TAP["Value / Tape\nreverse-mode · computational graph · backward"]
|
|
46
|
+
end
|
|
47
|
+
|
|
48
|
+
A --> B --> C --> D --> E
|
|
49
|
+
E --> F --> G
|
|
50
|
+
E --> H --> I --> J --> K --> L
|
|
51
|
+
E --> AE
|
|
52
|
+
E --> GAN
|
|
53
|
+
E --> VAE
|
|
54
|
+
```
|
|
55
|
+
|
|
6
56
|
## What's inside
|
|
7
57
|
|
|
58
|
+
### Neural network building blocks
|
|
59
|
+
|
|
8
60
|
| Export | Description |
|
|
9
61
|
|--------|-------------|
|
|
10
62
|
| `Neuron` | Single-input neuron. The simplest possible unit: one weight, one bias. |
|
|
@@ -14,20 +66,115 @@ A minimal, dependency-free neural network library built from scratch in TypeScri
|
|
|
14
66
|
| `NetworkN` | Deep network of arbitrary depth. Define your architecture as `[inputs, ...hidden, outputs]`. |
|
|
15
67
|
| `LSTMLayer` | Recurrent layer with persistent hidden and cell state. Learns sequences via BPTT. |
|
|
16
68
|
| `NetworkLSTM` | Wraps an `LSTMLayer` + dense layers. Maintains memory across steps within an episode. |
|
|
69
|
+
| `GRULayer` | Gated Recurrent Unit — lighter alternative to LSTM, two gates instead of three. |
|
|
17
70
|
| `NetworkTransformer` | Full token-classification Transformer: embeddings → N blocks → per-token logits. |
|
|
18
|
-
| `NetworkTransformerRL` | Transformer for RL agents: continuous input projection → causal attention → Q-values.
|
|
71
|
+
| `NetworkTransformerRL` | Transformer for RL agents: continuous input projection → causal attention → Q-values. |
|
|
19
72
|
| `TransformerBlock` | One Transformer block: multi-head attention + FFN + LayerNorm × 2 with residuals. |
|
|
20
73
|
| `MultiHeadAttention` | N parallel attention heads concatenated and projected to `d_model`. |
|
|
21
74
|
| `AttentionHead` | Single scaled dot-product self-attention head (Q / K / V projections + backprop). |
|
|
75
|
+
|
|
76
|
+
### Layers & components
|
|
77
|
+
|
|
78
|
+
| Export | Description |
|
|
79
|
+
|--------|-------------|
|
|
80
|
+
| `Conv1D` | 1D convolution over sequences. Multi-channel, configurable stride and padding. |
|
|
81
|
+
| `Conv2D` | 2D convolution for images. Kernels `[filters][kH][kW][C]`, full forward + backward. |
|
|
82
|
+
| `MaxPool2D` | Max pooling 2D. Stores position mask for exact gradient routing in backprop. |
|
|
83
|
+
| `Flatten` | Converts `[H][W][C]` tensors to flat vectors. Bridges Conv layers to dense layers. |
|
|
84
|
+
| `RNN` | Vanilla RNN with BPTT. Explicitly shows where and why gradients vanish. |
|
|
85
|
+
| `Seq2Seq` | Encoder + Decoder LSTMs with context vector transfer. Teacher forcing in training. |
|
|
86
|
+
| `CausalConv1D` | Causal dilated 1D convolution. One building block of a TCN. |
|
|
87
|
+
| `TCN` | Temporal Convolutional Network. Stacks causal dilated convolutions for sequences without recurrence. |
|
|
22
88
|
| `LayerNorm` | Layer normalization with learnable γ / β per feature. |
|
|
23
|
-
| `
|
|
24
|
-
| `
|
|
89
|
+
| `BatchNorm` | Batch normalization with running mean/variance for inference. |
|
|
90
|
+
| `Dropout` | Inverted dropout for regularization. Active only during training. |
|
|
91
|
+
| `WeightMatrix` | 2D weight matrix with per-scalar Adam optimizers and optional gradient clipping. |
|
|
92
|
+
| `BiasVector` | 1D bias vector with per-scalar Adam optimizers. |
|
|
25
93
|
| `EmbeddingMatrix` | Lookup-table embedding matrix with SGD updates. |
|
|
26
|
-
|
|
27
|
-
|
|
28
|
-
|
|
29
|
-
|
|
|
30
|
-
|
|
94
|
+
|
|
95
|
+
### Classical ML
|
|
96
|
+
|
|
97
|
+
| Export | Description |
|
|
98
|
+
|--------|-------------|
|
|
99
|
+
| `Perceptron` | The historical Rosenblatt perceptron (1957). Step function, linear rule. Shows why XOR is impossible. |
|
|
100
|
+
| `LinearRegression` | Closed-form normal equation `(XᵀX)⁻¹Xᵀy` + gradient descent mode. Pure array arithmetic. |
|
|
101
|
+
| `LogisticRegression` | Sigmoid + binary cross-entropy, no hidden layers. The boundary between classical ML and neural nets. |
|
|
102
|
+
| `SoftmaxRegression` | Multinomial logistic regression. Log-sum-exp trick for numerical stability. |
|
|
103
|
+
| `GaussianNaiveBayes` | `P(c|x) ∝ P(c)·∏P(xᵢ|c)` in log-space. Zero gradient descent — pure Bayes. |
|
|
104
|
+
| `DecisionTree` | CART with Gini impurity (classification) or variance (regression). Fully recursive. |
|
|
105
|
+
|
|
106
|
+
### Unsupervised learning
|
|
107
|
+
|
|
108
|
+
| Export | Description |
|
|
109
|
+
|--------|-------------|
|
|
110
|
+
| `KMeans` | K-Means++ initialization + Lloyd's algorithm. `inertia()` for the elbow method. |
|
|
111
|
+
| `PCA` | Principal Component Analysis via power iteration + Hotelling deflation. Projects, reconstructs, explains variance. |
|
|
112
|
+
| `SOM` | Self-Organizing Map (Kohonen). BMU search, Gaussian neighborhood, topology preservation. |
|
|
113
|
+
| `HopfieldNetwork` | Associative memory. Hebbian storage, energy function, async recall. Capacity ~0.138·N. |
|
|
114
|
+
| `Autoencoder` | Encoder + bottleneck + decoder using two `NetworkN` instances. Learns compressed representations. |
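The Hebbian storage, energy function, and asynchronous recall listed for `HopfieldNetwork` in the table above can be sketched in a few lines. This is a minimal standalone illustration, not the library's implementation: `storePattern`, `energyOf`, and `recallPattern` are hypothetical names, and patterns are assumed to use entries in {-1, +1}.

```ts
// Hypothetical standalone sketch; function names are illustrative only.

// Hebbian rule: W_ij += p_i * p_j / N, with no self-connections (zero diagonal).
function storePattern(weights: number[][], pattern: number[]): void {
  const n = pattern.length;
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < n; j++) {
      if (i !== j) weights[i][j] += (pattern[i] * pattern[j]) / n;
    }
  }
}

// Energy E(s) = -1/2 * sum over i,j of W_ij * s_i * s_j.
function energyOf(weights: number[][], state: number[]): number {
  let total = 0;
  for (let i = 0; i < state.length; i++) {
    for (let j = 0; j < state.length; j++) {
      total += weights[i][j] * state[i] * state[j];
    }
  }
  return -0.5 * total;
}

// Asynchronous recall: update one unit at a time until a full sweep changes nothing.
function recallPattern(weights: number[][], start: number[], maxSweeps = 20): number[] {
  const state = start.slice();
  for (let sweep = 0; sweep < maxSweeps; sweep++) {
    let changed = false;
    for (let i = 0; i < state.length; i++) {
      const field = weights[i].reduce((acc, w, j) => acc + w * state[j], 0);
      const next = field >= 0 ? 1 : -1;
      if (next !== state[i]) {
        state[i] = next;
        changed = true;
      }
    }
    if (!changed) break;
  }
  return state;
}

const memory = [1, 1, 1, 1, -1, -1, -1, -1];
const hopfieldWeights = memory.map(() => memory.map(() => 0));
storePattern(hopfieldWeights, memory);

const noisy = memory.slice();
noisy[0] = -noisy[0]; // flip one bit

const recovered = recallPattern(hopfieldWeights, noisy);
console.log(recovered.join(" ") === memory.join(" ")); // true
console.log(energyOf(hopfieldWeights, recovered) < energyOf(hopfieldWeights, noisy)); // true
```

With a single stored pattern the flipped bit is corrected in one sweep, and the recovered state has lower energy than the noisy one, which is what "local minimum = stored memory" means in practice.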
|
|
115
|
+
|
|
116
|
+
### Generative models
|
|
117
|
+
|
|
118
|
+
| Export | Description |
|
|
119
|
+
|--------|-------------|
|
|
120
|
+
| `GAN` | Generator vs Discriminator min-max game. Documents Nash equilibrium and mode collapse. |
|
|
121
|
+
| `VAE` | Variational Autoencoder. Reparametrization trick, ELBO = reconstruction + KL divergence. |
|
|
122
|
+
|
|
123
|
+
### Automatic differentiation
|
|
124
|
+
|
|
125
|
+
| Export | Description |
|
|
126
|
+
|--------|-------------|
|
|
127
|
+
| `Value` | Scalar autograd node. Builds a computational graph and propagates gradients with `.backward()`. Inspired by micrograd. |
|
|
128
|
+
|
|
129
|
+
### Activations & math
|
|
130
|
+
|
|
131
|
+
| Export | Description |
|
|
132
|
+
|--------|-------------|
|
|
133
|
+
| `sigmoid` `relu` `tanh` `linear` `leakyRelu` `elu` | Built-in activation functions with `fn` and `dfn` (derivative from output). |
|
|
134
|
+
| `makeLeakyRelu(α)` `makeElu(α)` | Parametric variants. |
|
|
135
|
+
| `matMul` `transpose` `softmax` `softmaxBackward` | Matrix math utilities. |
|
|
136
|
+
|
|
137
|
+
### Optimizers
|
|
138
|
+
|
|
139
|
+
| Export | Description |
|
|
140
|
+
|--------|-------------|
|
|
141
|
+
| `SGD` | Vanilla stochastic gradient descent. Stateless. |
|
|
142
|
+
| `Momentum` | Accumulates velocity in the gradient direction. |
|
|
143
|
+
| `Adam` | Adaptive moment estimation. Per-parameter first and second moments with bias correction. |
|
|
144
|
+
| `ClipOptimizer` | Wraps any optimizer with gradient clipping. |
|
|
145
|
+
| `ClippedOptimizerFactory` | Factory wrapper that clips all created optimizers. |
|
|
146
|
+
| `defaultOptimizer` | Default factory (`() => new SGD()`). Shared fallback across all classes. |
|
|
147
|
+
|
|
148
|
+
### Loss functions
|
|
149
|
+
|
|
150
|
+
| Export | Description |
|
|
151
|
+
|--------|-------------|
|
|
152
|
+
| `mse` `crossEntropy` | Scalar loss functions for evaluation and logging. |
|
|
153
|
+
| `mseDelta` `crossEntropyDelta` `crossEntropyDeltaRaw` | Output-layer delta functions for `trainWithDeltas`. |
|
|
154
|
+
|
|
155
|
+
### Metrics & evaluation
|
|
156
|
+
|
|
157
|
+
| Export | Description |
|
|
158
|
+
|--------|-------------|
|
|
159
|
+
| `confusionMatrix` | Returns `number[][]` confusion matrix. |
|
|
160
|
+
| `accuracy` `precision` `recall` `f1Score` | Standard classification metrics. |
|
|
161
|
+
| `rocCurve` `auc` | ROC curve points and area under the curve (trapezoidal rule). |
|
|
162
|
+
| `mae` `rmse` `r2Score` | Regression metrics. |
|
|
163
|
+
| `perplexity` | `exp(mean cross-entropy)` — natural metric for language models. |
|
|
164
|
+
| `printConfusionMatrix` `classificationReport` | Console-formatted output tables. |
|
|
165
|
+
|
|
166
|
+
### Training utilities
|
|
167
|
+
|
|
168
|
+
| Export | Description |
|
|
169
|
+
|--------|-------------|
|
|
170
|
+
| `Trainer` | Training loop with epochs, batches, metrics, and callbacks. |
|
|
171
|
+
| `DataLoader` | Dataset wrapper with shuffling and validation split. |
|
|
172
|
+
| `LRScheduler` | Learning rate schedules (step, exponential, cosine). |
|
|
173
|
+
| `EarlyStopping` | Stops training when a metric stalls. Configurable patience, mode, and best-weight restore. |
|
|
174
|
+
| `LossPlotter` | Renders a loss curve as ASCII art in the terminal. |
|
|
175
|
+
| `WeightInspector` | Per-layer weight statistics (mean, std, dead weights). Detects dead ReLUs. |
|
|
176
|
+
| `DataAugmentation` | Noise, jitter, normalization, z-score, shuffle, train/val/test split. |
|
|
177
|
+
| `ModelSaver` | Universal serialization via flat `getWeights()` / `setWeights()`. |
|
|
31
178
|
|
|
32
179
|
## Install
|
|
33
180
|
|
|
@@ -44,300 +191,439 @@ import { Neuron } from "@dniskav/neuron";
|
|
|
44
191
|
|
|
45
192
|
const neuron = new Neuron();
|
|
46
193
|
|
|
47
|
-
// Train: output 1 if input >= 18, else 0
|
|
48
194
|
for (let epoch = 0; epoch < 1000; epoch++) {
|
|
49
195
|
neuron.train(20, 1, 0.1); // adult
|
|
50
196
|
neuron.train(15, 0, 0.1); // minor
|
|
51
197
|
}
|
|
52
198
|
|
|
53
|
-
console.log(neuron.predict(17)); // ~0.1
|
|
54
|
-
console.log(neuron.predict(25)); // ~0.9
|
|
199
|
+
console.log(neuron.predict(17)); // ~0.1
|
|
200
|
+
console.log(neuron.predict(25)); // ~0.9
|
|
55
201
|
```
|
|
56
202
|
|
|
57
|
-
###
|
|
203
|
+
### NetworkN — deep network with custom architecture
|
|
58
204
|
|
|
59
205
|
```ts
|
|
60
|
-
import {
|
|
61
|
-
|
|
62
|
-
const neuron = new NeuronN(3); // 3 inputs: R, G, B
|
|
206
|
+
import { NetworkN, relu, sigmoid, Adam } from "@dniskav/neuron";
|
|
63
207
|
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
|
|
208
|
+
const net = new NetworkN([3, 64, 32, 1], {
|
|
209
|
+
activations: [relu, relu, sigmoid],
|
|
210
|
+
optimizer: () => new Adam(),
|
|
211
|
+
});
|
|
67
212
|
|
|
68
|
-
|
|
213
|
+
net.train([0.5, 0.3, 0.8], [1], 0.001);
|
|
214
|
+
const [out] = net.predict([0.5, 0.3, 0.8]);
|
|
69
215
|
```
|
|
70
216
|
|
|
71
|
-
###
|
|
217
|
+
### Historical Perceptron — step function, no hidden layers
|
|
72
218
|
|
|
73
219
|
```ts
|
|
74
|
-
import {
|
|
75
|
-
|
|
76
|
-
// 2 inputs → 8 hidden neurons → 1 output
|
|
77
|
-
const net = new Network(2, 8, 1);
|
|
220
|
+
import { Perceptron } from "@dniskav/neuron";
|
|
78
221
|
|
|
79
|
-
|
|
80
|
-
const data = [[0,0,0], [0,1,1], [1,0,1], [1,1,0]];
|
|
222
|
+
const p = new Perceptron(2);
|
|
81
223
|
|
|
82
|
-
|
|
83
|
-
|
|
84
|
-
|
|
85
|
-
|
|
86
|
-
}
|
|
224
|
+
// Learns AND gate (linearly separable)
|
|
225
|
+
const data = [[0,0,0],[0,1,0],[1,0,0],[1,1,1]];
|
|
226
|
+
for (let e = 0; e < 100; e++)
|
|
227
|
+
for (const [a, b, t] of data) p.train([a, b], t, 0.1);
|
|
87
228
|
|
|
88
|
-
console.log(
|
|
89
|
-
console.log(
|
|
229
|
+
console.log(p.predict([1, 1])); // 1
|
|
230
|
+
console.log(p.predict([0, 1])); // 0
|
|
231
|
+
// XOR cannot be learned — not linearly separable
|
|
90
232
|
```
|
|
91
233
|
|
|
92
|
-
###
|
|
234
|
+
### Linear Regression — normal equation
|
|
93
235
|
|
|
94
236
|
```ts
|
|
95
|
-
import {
|
|
237
|
+
import { LinearRegression } from "@dniskav/neuron";
|
|
96
238
|
|
|
97
|
-
|
|
98
|
-
const net = new NetworkN([3, 24, 16, 2]);
|
|
239
|
+
const model = new LinearRegression();
|
|
99
240
|
|
|
100
|
-
//
|
|
101
|
-
|
|
241
|
+
// Exact closed-form solution in one call
|
|
242
|
+
model.fitNormal(
|
|
243
|
+
[[1], [2], [3], [4]], // X
|
|
244
|
+
[2, 4, 6, 8] // y = 2x
|
|
245
|
+
);
|
|
102
246
|
|
|
103
|
-
//
|
|
104
|
-
|
|
247
|
+
console.log(model.predict([5])); // ~10
|
|
248
|
+
console.log(model.getCoefficients()); // { weights: [2], bias: ~0 }
|
|
105
249
|
```
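For a single feature plus an intercept, the normal equation `(XᵀX)⁻¹Xᵀy` reduces to two sums, which makes the closed form easy to check by hand. The sketch below is a standalone illustration of that special case; `fitLine` is a hypothetical helper, not part of the package.

```ts
// Hypothetical helper: closed-form least squares for y ≈ weight * x + bias.
// This is the one-feature case of the normal equation with an intercept column.
function fitLine(xs: number[], ys: number[]): { weight: number; bias: number } {
  const n = xs.length;
  const meanX = xs.reduce((acc, v) => acc + v, 0) / n;
  const meanY = ys.reduce((acc, v) => acc + v, 0) / n;

  let sxy = 0; // sum of (x - meanX) * (y - meanY)
  let sxx = 0; // sum of (x - meanX)^2
  for (let i = 0; i < n; i++) {
    sxy += (xs[i] - meanX) * (ys[i] - meanY);
    sxx += (xs[i] - meanX) * (xs[i] - meanX);
  }

  const weight = sxy / sxx; // assumes the xs are not all identical
  return { weight, bias: meanY - weight * meanX };
}

const line = fitLine([1, 2, 3, 4], [2, 4, 6, 8]);
console.log(line); // weight 2, bias 0 for y = 2x
```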
|
|
106
250
|
|
|
107
|
-
###
|
|
251
|
+
### Logistic Regression — sigmoid + BCE
|
|
252
|
+
|
|
253
|
+
```ts
|
|
254
|
+
import { LogisticRegression } from "@dniskav/neuron";
|
|
255
|
+
|
|
256
|
+
const clf = new LogisticRegression(2);
|
|
257
|
+
const lossHistory = clf.train(
|
|
258
|
+
[[0,0],[1,1],[1,0],[0,1]],
|
|
259
|
+
[0, 1, 1, 0],
|
|
260
|
+
0.1, 500
|
|
261
|
+
);
|
|
262
|
+
|
|
263
|
+
console.log(clf.classify([0.9, 0.9])); // 1
|
|
264
|
+
console.log(clf.classify([0.1, 0.1])); // 0
|
|
265
|
+
```
|
|
108
266
|
|
|
109
|
-
|
|
267
|
+
### Gaussian Naive Bayes — zero gradient descent
|
|
110
268
|
|
|
111
269
|
```ts
|
|
112
|
-
import {
|
|
270
|
+
import { GaussianNaiveBayes } from "@dniskav/neuron";
|
|
113
271
|
|
|
114
|
-
const
|
|
115
|
-
|
|
116
|
-
|
|
272
|
+
const nb = new GaussianNaiveBayes();
|
|
273
|
+
nb.fit(
|
|
274
|
+
[[1.2, 0.5], [1.4, 0.7], [5.0, 4.5], [5.2, 4.8]],
|
|
275
|
+
[0, 0, 1, 1]
|
|
276
|
+
);
|
|
277
|
+
|
|
278
|
+
console.log(nb.predict([1.3, 0.6])); // 0
|
|
279
|
+
console.log(nb.predict([5.1, 4.6])); // 1
|
|
117
280
|
```
|
|
118
281
|
|
|
119
|
-
|
|
282
|
+
### Decision Tree — Gini split
|
|
120
283
|
|
|
121
|
-
|
|
284
|
+
```ts
|
|
285
|
+
import { DecisionTree } from "@dniskav/neuron";
|
|
286
|
+
|
|
287
|
+
const tree = new DecisionTree({ maxDepth: 4, task: 'classification' });
|
|
288
|
+
tree.fit(X_train, y_train);
|
|
289
|
+
const predictions = tree.predictBatch(X_test);
|
|
290
|
+
```
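The quantity a CART classifier minimizes at each node can be shown without the tree itself. The following is a rough sketch of Gini impurity and the weighted impurity of one candidate split; `giniImpurity` and `splitImpurity` are hypothetical names chosen for this example, and the library's internals may differ.

```ts
// Hypothetical helpers illustrating the Gini criterion; not the library's API.

// Gini impurity: 1 - sum over classes of p_c^2.
function giniImpurity(labels: number[]): number {
  const counts: Record<number, number> = {};
  for (let i = 0; i < labels.length; i++) {
    counts[labels[i]] = (counts[labels[i]] || 0) + 1;
  }
  let sumSquares = 0;
  Object.keys(counts).forEach((key) => {
    const p = counts[Number(key)] / labels.length;
    sumSquares += p * p;
  });
  return 1 - sumSquares;
}

// Size-weighted impurity of splitting rows on `row[feature] <= threshold`.
function splitImpurity(X: number[][], y: number[], feature: number, threshold: number): number {
  const left: number[] = [];
  const right: number[] = [];
  X.forEach((row, i) => {
    if (row[feature] <= threshold) left.push(y[i]);
    else right.push(y[i]);
  });
  // A split that leaves one side empty separates nothing.
  if (left.length === 0 || right.length === 0) return giniImpurity(y);
  return (left.length * giniImpurity(left) + right.length * giniImpurity(right)) / y.length;
}

console.log(giniImpurity([0, 0, 1, 1])); // 0.5
console.log(splitImpurity([[1], [2], [8], [9]], [0, 0, 1, 1], 0, 5)); // 0
```

A tree builder would evaluate `splitImpurity` for every candidate feature and threshold, keep the lowest value, and recurse on the two halves.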
|
|
122
291
|
|
|
123
|
-
|
|
292
|
+
### K-Means — unsupervised clustering
|
|
124
293
|
|
|
125
294
|
```ts
|
|
126
|
-
import {
|
|
295
|
+
import { KMeans } from "@dniskav/neuron";
|
|
127
296
|
|
|
128
|
-
const
|
|
129
|
-
|
|
130
|
-
optimizer: () => new Adam(), // default: beta1=0.9, beta2=0.999
|
|
131
|
-
});
|
|
297
|
+
const km = new KMeans(3); // 3 clusters
|
|
298
|
+
km.fit(points);
|
|
132
299
|
|
|
133
|
-
//
|
|
134
|
-
|
|
135
|
-
const net2 = new NetworkN([2, 32, 1], {
|
|
136
|
-
optimizer: () => new Momentum(0.9),
|
|
137
|
-
});
|
|
300
|
+
const cluster = km.predict([1.2, 0.5]); // index 0, 1 or 2
|
|
301
|
+
console.log(km.inertia(points)); // lower = better fit
|
|
138
302
|
```
|
|
139
303
|
|
|
140
|
-
|
|
304
|
+
### PCA — dimensionality reduction
|
|
141
305
|
|
|
142
306
|
```ts
|
|
143
|
-
import {
|
|
307
|
+
import { PCA } from "@dniskav/neuron";
|
|
144
308
|
|
|
145
|
-
const
|
|
146
|
-
|
|
147
|
-
|
|
148
|
-
|
|
309
|
+
const pca = new PCA(2); // keep top 2 components
|
|
310
|
+
pca.fit(X); // 100 samples × 10 features
|
|
311
|
+
|
|
312
|
+
const Z = pca.transform(X); // 100 × 2
|
|
313
|
+
const X2 = pca.inverseTransform(Z); // reconstructed 100 × 10
|
|
314
|
+
|
|
315
|
+
console.log(pca.explainedVarianceRatio()); // [0.72, 0.15, ...]
|
|
149
316
|
```
|
|
150
317
|
|
|
151
|
-
###
|
|
318
|
+
### Self-Organizing Map
|
|
152
319
|
|
|
153
320
|
```ts
|
|
154
|
-
import {
|
|
321
|
+
import { SOM } from "@dniskav/neuron";
|
|
155
322
|
|
|
156
|
-
const
|
|
157
|
-
|
|
158
|
-
console.log(crossEntropy(predicted, [1, 0]));
|
|
159
|
-
```
|
|
323
|
+
const som = new SOM(10, 10, 3); // 10×10 grid, 3-dimensional inputs (RGB)
|
|
324
|
+
som.train(colors, 500);
|
|
160
325
|
|
|
161
|
-
|
|
326
|
+
const [row, col] = som.getBMU([255, 0, 0]); // find best matching unit for red
|
|
327
|
+
console.log(som.quantizationError(colors));
|
|
328
|
+
```
|
|
162
329
|
|
|
163
|
-
|
|
330
|
+
### Hopfield Network — associative memory
|
|
164
331
|
|
|
165
332
|
```ts
|
|
166
|
-
import {
|
|
333
|
+
import { HopfieldNetwork } from "@dniskav/neuron";
|
|
167
334
|
|
|
168
|
-
const net = new
|
|
169
|
-
const pred = net.predict(inputs);
|
|
335
|
+
const net = new HopfieldNetwork(64); // 64 binary neurons
|
|
170
336
|
|
|
171
|
-
//
|
|
172
|
-
|
|
173
|
-
net.
|
|
174
|
-
```
|
|
337
|
+
// Store two 64-bit patterns
|
|
338
|
+
net.store(HopfieldNetwork.binarize(pattern1)); // converts 0/1 → -1/+1
|
|
339
|
+
net.store(HopfieldNetwork.binarize(pattern2));
|
|
175
340
|
|
|
176
|
-
|
|
341
|
+
// Recall from noisy input
|
|
342
|
+
const recovered = net.recall(HopfieldNetwork.binarize(noisyPattern1));
|
|
343
|
+
console.log(net.energy(recovered)); // local minimum = stored memory
|
|
344
|
+
```
|
|
177
345
|
|
|
178
|
-
|
|
346
|
+
### Autoencoder — learn compressed representations
|
|
179
347
|
|
|
180
348
|
```ts
|
|
181
|
-
import {
|
|
349
|
+
import { Autoencoder } from "@dniskav/neuron";
|
|
182
350
|
|
|
183
|
-
//
|
|
184
|
-
const
|
|
351
|
+
// 784 → [128, 64] → 16 (latent) → [64, 128] → 784
|
|
352
|
+
const ae = new Autoencoder(784, [128, 64], 16, [64, 128]);
|
|
185
353
|
|
|
186
|
-
|
|
187
|
-
|
|
354
|
+
for (let e = 0; e < 1000; e++)
|
|
355
|
+
for (const x of images)
|
|
356
|
+
ae.train(x, 0.001);
|
|
188
357
|
|
|
189
|
-
|
|
190
|
-
|
|
358
|
+
const latent = ae.encode(image); // compressed: 16 values
|
|
359
|
+
const reconstructed = ae.reconstruct(image); // decoded back: 784 values
|
|
360
|
+
```
|
|
191
361
|
|
|
192
|
-
|
|
193
|
-
for (let step = 0; step < 6; step++) {
|
|
194
|
-
net.predict([1]); // same input every step
|
|
195
|
-
targets.push([step >= 3 ? 1 : 0]);
|
|
196
|
-
}
|
|
362
|
+
### GAN — generative adversarial training
|
|
197
363
|
|
|
198
|
-
|
|
364
|
+
```ts
|
|
365
|
+
import { GAN } from "@dniskav/neuron";
|
|
366
|
+
|
|
367
|
+
const gan = new GAN(
|
|
368
|
+
16, // latentDim
|
|
369
|
+
[32, 64], // generator hidden layers
|
|
370
|
+
8, // outputDim (size of generated samples)
|
|
371
|
+
[64, 32], // discriminator hidden layers
|
|
372
|
+
);
|
|
373
|
+
|
|
374
|
+
for (let step = 0; step < 10000; step++) {
|
|
375
|
+
const { dLoss, gLoss } = gan.trainStep(realBatch, 0.0002);
|
|
376
|
+
if (step % 500 === 0) console.log(`D: ${dLoss.toFixed(3)} G: ${gLoss.toFixed(3)}`);
|
|
199
377
|
}
|
|
200
378
|
|
|
201
|
-
|
|
202
|
-
|
|
203
|
-
|
|
204
|
-
|
|
205
|
-
|
|
379
|
+
const fake = gan.generate(); // new synthetic sample
|
|
380
|
+
```
|
|
381
|
+
|
|
382
|
+
### VAE — variational autoencoder
|
|
383
|
+
|
|
384
|
+
```ts
|
|
385
|
+
import { VAE } from "@dniskav/neuron";
|
|
386
|
+
|
|
387
|
+
const vae = new VAE(784, [256, 128], 32, [128, 256]);
|
|
388
|
+
|
|
389
|
+
for (const x of dataset) {
|
|
390
|
+
const { totalLoss, reconLoss, klLoss } = vae.train(x, 0.001);
|
|
206
391
|
}
|
|
207
|
-
|
|
208
|
-
//
|
|
209
|
-
|
|
210
|
-
|
|
211
|
-
|
|
212
|
-
// step 5: 0.93 (expected: 1)
|
|
392
|
+
|
|
393
|
+
// Sample from latent space
|
|
394
|
+
const generated = vae.generate(); // random sample
|
|
395
|
+
const { mu, logVar } = vae.encode(image); // encode → distribution params
|
|
396
|
+
const z = vae.reparametrize(mu, logVar); // sample z ~ N(μ, σ²)
|
|
213
397
|
```
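The reparametrization step used above can be written out directly: sample `eps ~ N(0, 1)` and compute `z = mu + exp(logVar / 2) · eps`. The sketch below is a standalone illustration under that standard formulation; the function names and the optional `eps` argument (added so the result can be checked deterministically) are assumptions, not the package's signatures.

```ts
// Hypothetical standalone sketch of the reparametrization trick and the KL term.

// Standard normal sample via the Box-Muller transform.
function sampleStandardNormal(rand: () => number = Math.random): number {
  const u1 = 1 - rand(); // in (0, 1], avoids log(0)
  const u2 = rand();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

// z = mu + sigma * eps, with sigma = exp(logVar / 2).
// All randomness lives in eps, so z is differentiable in mu and logVar.
function reparametrizeSketch(mu: number[], logVar: number[], eps?: number[]): number[] {
  return mu.map((m, i) => {
    const noise = eps ? eps[i] : sampleStandardNormal();
    return m + Math.exp(0.5 * logVar[i]) * noise;
  });
}

// KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions.
function klToStandardNormal(mu: number[], logVar: number[]): number {
  return mu.reduce(
    (acc, m, i) => acc - 0.5 * (1 + logVar[i] - m * m - Math.exp(logVar[i])),
    0,
  );
}

console.log(reparametrizeSketch([1, 2], [0, 0], [0.5, -1])); // [1.5, 1]
console.log(klToStandardNormal([1], [0])); // 0.5
```

The KL term is zero exactly when `mu = 0` and `logVar = 0`, which is why it pulls the encoder's output toward the standard normal prior.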
|
|
214
398
|
|
|
215
|
-
|
|
399
|
+
### Value / Tape — automatic differentiation
|
|
216
400
|
|
|
217
|
-
|
|
401
|
+
```ts
|
|
402
|
+
import { Value } from "@dniskav/neuron";
|
|
218
403
|
|
|
219
|
-
|
|
404
|
+
// Build a computation graph
|
|
405
|
+
const x = new Value(2.0);
|
|
406
|
+
const w = new Value(-3.0);
|
|
407
|
+
const b = new Value(6.7);
|
|
408
|
+
const n = x.mul(w).add(b); // n = x*w + b
|
|
409
|
+
const o = n.tanh(); // o = tanh(n)
|
|
220
410
|
|
|
411
|
+
// Backward pass — fills .grad for every node
|
|
412
|
+
o.backward();
|
|
413
|
+
|
|
414
|
+
console.log(x.grad); // ∂o/∂x
|
|
415
|
+
console.log(w.grad); // ∂o/∂w
|
|
416
|
+
console.log(b.grad); // ∂o/∂b
|
|
221
417
|
```
|
|
222
|
-
|
|
223
|
-
|
|
418
|
+
|
|
419
|
+
### Conv2D + MaxPool2D + Flatten — CNN pipeline
|
|
420
|
+
|
|
421
|
+
```ts
|
|
422
|
+
import { Conv2D, MaxPool2D, Flatten, NetworkN, relu, sigmoid } from "@dniskav/neuron";
|
|
423
|
+
|
|
424
|
+
const conv = new Conv2D(28, 28, 1, 3, 8); // 28×28×1 → 26×26×8
|
|
425
|
+
const pool = new MaxPool2D(2); // 26×26×8 → 13×13×8
|
|
426
|
+
const flatten = new Flatten();
|
|
427
|
+
const dense = new NetworkN([13*13*8, 64, 10]);
|
|
428
|
+
|
|
429
|
+
// Forward
|
|
430
|
+
const featureMaps = conv.forward(image); // [H][W][C]
|
|
431
|
+
const pooled = pool.forward(featureMaps);
|
|
432
|
+
const flat = flatten.forward(pooled); // 1352 values
|
|
433
|
+
const logits = dense.predict(flat);
|
|
224
434
|
```
|
|
225
435
|
|
|
226
|
-
|
|
436
|
+
### RNN — vanilla recurrent network
|
|
227
437
|
|
|
228
|
-
|
|
438
|
+
```ts
|
|
439
|
+
import { RNN } from "@dniskav/neuron";
|
|
229
440
|
|
|
230
|
-
|
|
441
|
+
// 1 input → 16 hidden → 1 output, over a sequence
|
|
442
|
+
const rnn = new RNN(1, 16, 1);
|
|
231
443
|
|
|
232
|
-
|
|
444
|
+
const sequence = [[0.1], [0.3], [0.7], [0.9]]; // 4 timesteps
|
|
445
|
+
const { outputs, hiddens } = rnn.forward(sequence);
|
|
233
446
|
|
|
234
|
-
|
|
235
|
-
|
|
236
|
-
|
|
447
|
+
// BPTT backward — returns MSE loss
|
|
448
|
+
const targets = [[0.2], [0.5], [0.8], [1.0]];
|
|
449
|
+
const loss = rnn.backward(sequence, targets, 0.01);
|
|
237
450
|
```
|
|
238
451
|
|
|
239
|
-
|
|
452
|
+
### TCN — Temporal Convolutional Network
|
|
240
453
|
|
|
241
|
-
|
|
454
|
+
```ts
|
|
455
|
+
import { TCN } from "@dniskav/neuron";
|
|
456
|
+
|
|
457
|
+
// 3 input channels → 32 channels × 4 levels → 1 output
|
|
458
|
+
// Receptive field = (3-1)·(2⁴-1)+1 = 31 timesteps
|
|
459
|
+
const tcn = new TCN(3, 32, 3, 4, 1);
|
|
460
|
+
|
|
461
|
+
const sequence = Array.from({ length: 50 }, () => [Math.random(), Math.random(), Math.random()]);
|
|
462
|
+
const outputs = tcn.forward(sequence); // [50][1]
|
|
463
|
+
```
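The receptive-field formula quoted in the comment above follows from adding up what each dilated level contributes. The sketch below is a standalone check, assuming one causal convolution per level with dilations 1, 2, 4, ..., 2^(levels-1); `receptiveField` is a hypothetical helper, and a TCN with several convolutions per level would see further.

```ts
// Hypothetical helper: receptive field of stacked causal dilated convolutions.
// Assumes one convolution per level and dilation doubling at each level.
function receptiveField(kernelSize: number, levels: number): number {
  let field = 1; // a single timestep sees itself
  for (let level = 0; level < levels; level++) {
    const dilation = Math.pow(2, level);
    field += (kernelSize - 1) * dilation; // each level widens the view by (k-1) * d
  }
  return field;
}

// Same value as the closed form (k-1) * (2^levels - 1) + 1.
console.log(receptiveField(3, 4)); // 31
```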
|
|
242
464
|
|
|
243
|
-
###
|
|
465
|
+
### NetworkLSTM — recurrent memory
|
|
244
466
|
|
|
245
467
|
```ts
|
|
246
|
-
import {
|
|
247
|
-
|
|
248
|
-
|
|
249
|
-
|
|
250
|
-
|
|
251
|
-
|
|
252
|
-
|
|
253
|
-
|
|
254
|
-
|
|
255
|
-
|
|
256
|
-
|
|
468
|
+
import { NetworkLSTM } from "@dniskav/neuron";
|
|
469
|
+
|
|
470
|
+
const net = new NetworkLSTM(1, 8, [4, 1]);
|
|
471
|
+
|
|
472
|
+
for (let epoch = 0; epoch < 300; epoch++) {
|
|
473
|
+
net.resetState();
|
|
474
|
+
for (let step = 0; step < 6; step++) net.predict([1]);
|
|
475
|
+
net.train([[0],[0],[0],[1],[1],[1]], 0.05);
|
|
476
|
+
}
|
|
477
|
+
```
|
|
478
|
+
|
|
479
|
+
### Metrics — evaluate your model
|
|
480
|
+
|
|
481
|
+
```ts
|
|
482
|
+
import { accuracy, f1Score, confusionMatrix, printConfusionMatrix, auc, classificationReport } from "@dniskav/neuron";
|
|
483
|
+
|
|
484
|
+
const yTrue = [0, 1, 1, 0, 1];
|
|
485
|
+
const yPred = [0, 1, 0, 0, 1];
|
|
486
|
+
|
|
487
|
+
console.log(accuracy(yTrue, yPred)); // 0.8
|
|
488
|
+
console.log(f1Score(yTrue, yPred)); // 0.8
|
|
489
|
+
|
|
490
|
+
const cm = confusionMatrix(yTrue, yPred);
|
|
491
|
+
printConfusionMatrix(cm, ['neg', 'pos']);
|
|
492
|
+
|
|
493
|
+
// AUC-ROC
|
|
494
|
+
const scores = [0.1, 0.9, 0.4, 0.2, 0.8];
|
|
495
|
+
console.log(auc(yTrue, scores)); // 1.0 (every positive is scored above every negative)
|
|
257
496
|
|
|
258
|
-
|
|
259
|
-
|
|
260
|
-
|
|
261
|
-
|
|
497
|
+
classificationReport(yTrue, yPred, ['neg', 'pos']);
|
|
498
|
+
```
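To see where an AUC value comes from, the ROC curve and the trapezoidal rule can be computed by hand. This is a minimal standalone sketch, not the library's `rocCurve` / `auc` code: `rocPoints` and `aucTrapezoid` are hypothetical names, labels are assumed to be 0 or 1, and both classes are assumed to be present.

```ts
// Hypothetical standalone sketch of ROC points and trapezoidal AUC.
interface RocPoint {
  fpr: number; // false positive rate
  tpr: number; // true positive rate
}

// Lower the decision threshold one distinct score at a time and record (FPR, TPR).
function rocPoints(yTrue: number[], scores: number[]): RocPoint[] {
  const positives = yTrue.filter((y) => y === 1).length;
  const negatives = yTrue.length - positives;
  const ranked = scores
    .map((score, i) => ({ score, label: yTrue[i] }))
    .sort((a, b) => b.score - a.score);

  const points: RocPoint[] = [{ fpr: 0, tpr: 0 }];
  let tp = 0;
  let fp = 0;
  for (let i = 0; i < ranked.length; i++) {
    if (ranked[i].label === 1) tp++;
    else fp++;
    // Emit a point only after the last sample of a group of tied scores.
    if (i === ranked.length - 1 || ranked[i + 1].score !== ranked[i].score) {
      points.push({ fpr: fp / negatives, tpr: tp / positives });
    }
  }
  return points;
}

// Area under the piecewise-linear curve.
function aucTrapezoid(points: RocPoint[]): number {
  let area = 0;
  for (let i = 1; i < points.length; i++) {
    const width = points[i].fpr - points[i - 1].fpr;
    area += (width * (points[i].tpr + points[i - 1].tpr)) / 2;
  }
  return area;
}

const labels = [0, 1, 1, 0, 1];
const ranking = [0.1, 0.9, 0.4, 0.2, 0.8];
console.log(aucTrapezoid(rocPoints(labels, ranking))); // 1
```

For this data every positive score (0.9, 0.4, 0.8) is above every negative score (0.1, 0.2), so the curve reaches TPR = 1 before any false positive appears and the area is 1.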
|
|
499
|
+
|
|
500
|
+
### EarlyStopping
|
|
501
|
+
|
|
502
|
+
```ts
|
|
503
|
+
import { EarlyStopping } from "@dniskav/neuron";
|
|
262
504
|
|
|
263
|
-
const
|
|
264
|
-
// loss is cross-entropy (not MSE) — decreases from ~2.2 toward 0 as training progresses
|
|
265
|
-
const logits = net.predict(puzzle); // 729 logits (81 × 9)
|
|
505
|
+
const stopper = new EarlyStopping({ patience: 10, minDelta: 1e-4, mode: 'min' });
|
|
266
506
|
|
|
267
|
-
|
|
268
|
-
const
|
|
269
|
-
|
|
507
|
+
for (let epoch = 0; epoch < 1000; epoch++) {
|
|
508
|
+
const valLoss = trainEpoch();
|
|
509
|
+
if (stopper.update(valLoss, epoch)) {
|
|
510
|
+
console.log(`Stopped at epoch ${epoch}`);
|
|
511
|
+
break;
|
|
512
|
+
}
|
|
513
|
+
}
|
|
270
514
|
```
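The patience logic behind a stopper like the one above fits in a small class. The sketch below is a hypothetical re-implementation for `mode: 'min'` only, written to show the bookkeeping; `PatienceStopper` and `bestSoFar` are illustrative names, and best-weight restore is left out.

```ts
// Hypothetical sketch of patience-based early stopping (lower metric is better).
class PatienceStopper {
  private best = Infinity;
  private bestEpoch = -1;
  private wait = 0; // epochs since the last real improvement

  constructor(
    private readonly patience: number,
    private readonly minDelta: number,
  ) {}

  // Returns true when training should stop.
  update(value: number, epoch: number): boolean {
    if (value < this.best - this.minDelta) {
      this.best = value;
      this.bestEpoch = epoch;
      this.wait = 0;
      return false;
    }
    this.wait += 1;
    return this.wait >= this.patience;
  }

  bestSoFar(): { value: number; epoch: number } {
    return { value: this.best, epoch: this.bestEpoch };
  }
}

const demoStopper = new PatienceStopper(3, 1e-4);
const losses = [1.0, 0.8, 0.7, 0.71, 0.7, 0.72, 0.69];
let stoppedAt = -1;
for (let epoch = 0; epoch < losses.length; epoch++) {
  if (demoStopper.update(losses[epoch], epoch)) {
    stoppedAt = epoch;
    break;
  }
}
console.log(stoppedAt, demoStopper.bestSoFar()); // stops at epoch 5; best value 0.7 at epoch 2
```

Note that the final 0.69 is never seen: three epochs without an improvement larger than `minDelta` end the run first, which is the trade-off a small `patience` makes.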
|
|
271
515
|
|
|
272
|
-
|
|
273
|
-
3×3 box). The network figures this out by itself through training.
|
|
516
|
+
### LossPlotter — ASCII loss curve
|
|
274
517
|
|
|
275
|
-
|
|
518
|
+
```ts
|
|
519
|
+
import { LossPlotter } from "@dniskav/neuron";
|
|
276
520
|
|
|
277
|
-
|
|
521
|
+
const plotter = new LossPlotter({ width: 60, height: 12, title: 'Training Loss' });
|
|
522
|
+
|
|
523
|
+
for (let e = 0; e < 500; e++) {
|
|
524
|
+
const loss = trainStep();
|
|
525
|
+
plotter.add(loss, e);
|
|
526
|
+
}
|
|
527
|
+
|
|
528
|
+
plotter.print();
|
|
529
|
+
// Training Loss
|
|
530
|
+
// ┌────────────────────────────────────────────────────────────┐
|
|
531
|
+
// │ 2.31 ·
|
|
532
|
+
// │ · ·
|
|
533
|
+
// │ · · ·
|
|
534
|
+
// │ · · · · · · ·
|
|
535
|
+
// │ 0.02 · · · · · · · · · · · · · · ·
|
|
536
|
+
// └────────────────────────────────────────────────────────────┘
|
|
537
|
+
// 0 250 499
|
|
538
|
+
```
|
|
539
|
+
|
|
540
|
+
### DataAugmentation
|
|
278
541
|
|
|
279
542
|
```ts
|
|
280
|
-
import {
|
|
281
|
-
|
|
282
|
-
// Agent sees the last 8 steps, each step is a 7-value sensor vector → 4 actions
|
|
283
|
-
const net = new NetworkTransformerRL(8, 7, {
|
|
284
|
-
d_model: 32,
|
|
285
|
-
nHeads: 2,
|
|
286
|
-
d_ff: 64,
|
|
287
|
-
nBlocks: 2,
|
|
288
|
-
nActions: 4,
|
|
289
|
-
});
|
|
543
|
+
import { DataAugmentation } from "@dniskav/neuron";
|
|
290
544
|
|
|
291
|
-
//
|
|
292
|
-
const
|
|
293
|
-
const qValues = net.predict(sequence); // number[4]
|
|
545
|
+
// Split dataset
|
|
546
|
+
const { trainX, trainY, valX, valY } = DataAugmentation.split(X, y, 0.8, 0.1);
|
|
294
547
|
|
|
295
|
-
//
|
|
296
|
-
const
|
|
297
|
-
const
|
|
298
|
-
const targets = qValues.slice();
|
|
299
|
-
targets[action] = reward + 0.99 * Math.max(...net.predict(nextSequence));
|
|
548
|
+
// Normalize (fit on train, apply to all)
|
|
549
|
+
const { normalized: normTrain, min, max } = DataAugmentation.normalize(trainX);
|
|
550
|
+
const normVal = valX.map(x => DataAugmentation.normalizePoint(x, min, max));
|
|
300
551
|
|
|
301
|
-
|
|
552
|
+
// Augment training set (×3 copies with Gaussian noise)
|
|
553
|
+
const { X: augX, y: augY } = DataAugmentation.augmentBatch(normTrain, trainY, 3, 0.02);
|
|
302
554
|
```
|
|
303
555
|
|
|
304
|
-
|
|
556
|
+
### WeightInspector — diagnose your network
|
|
305
557
|
|
|
306
558
|
```ts
|
|
307
|
-
|
|
308
|
-
|
|
309
|
-
|
|
559
|
+
import { NetworkN, WeightInspector, relu } from "@dniskav/neuron";
|
|
560
|
+
|
|
561
|
+
const net = new NetworkN([784, 256, 128, 10], { activations: [relu, relu, relu] });
|
|
562
|
+
// ... train ...
|
|
563
|
+
|
|
564
|
+
WeightInspector.print(net);
|
|
565
|
+
// Layer 0: mean=0.001 std=0.056 min=-0.21 max=0.19 dead=0 params=200960
|
|
566
|
+
// Layer 1: mean=0.000 std=0.079 min=-0.31 max=0.28 dead=3 params=32896
|
|
567
|
+
// Layer 2: mean=-0.001 std=0.091 min=-0.28 max=0.32 dead=0 params=1290
|
|
310
568
|
```
|
|
311
569
|
|
|
570
|
+
## How it works
|
|
571
|
+
|
|
572
|
+
Each neural-network class applies an **activation function** to the weighted sum of its inputs and uses **gradient descent** to update weights:
|
|
573
|
+
|
|
574
|
+
```
|
|
575
|
+
weight += lr × delta × input
|
|
576
|
+
bias += lr × delta
|
|
577
|
+
```
|
|
578
|
+
|
|
579
|
+
`NetworkN` implements full **backpropagation** across all layers, propagating deltas from the output back to the first layer using the chain rule. `NeuronN` uses **Xavier initialization** — weights start in `[-√(1/n), +√(1/n)]`.
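The update rule and initialization described above can be sketched for a single sigmoid unit. This is a minimal illustration under stated assumptions, not the library's code: `xavierInit`, `predict`, and `trainStep` are hypothetical names, `n` is taken to be the number of inputs, and `delta` is assumed to be `(target - output) × f'(output)`.

```ts
// Hypothetical standalone sketch; not the library's implementation.

// Draw n weights uniformly from [-sqrt(1/n), +sqrt(1/n)].
function xavierInit(n: number, rand: () => number = Math.random): number[] {
  const limit = Math.sqrt(1 / n);
  return Array.from({ length: n }, () => (rand() * 2 - 1) * limit);
}

const sigmoid = (x: number): number => 1 / (1 + Math.exp(-x));

interface Unit {
  weights: number[];
  bias: number;
}

function predict(unit: Unit, inputs: number[]): number {
  const sum = inputs.reduce((acc, x, i) => acc + x * unit.weights[i], unit.bias);
  return sigmoid(sum);
}

// One gradient-descent step: weight += lr * delta * input, bias += lr * delta.
function trainStep(unit: Unit, inputs: number[], target: number, lr: number): void {
  const out = predict(unit, inputs);
  // Sigmoid derivative computed from the output: out * (1 - out).
  const delta = (target - out) * out * (1 - out);
  unit.weights = unit.weights.map((w, i) => w + lr * delta * inputs[i]);
  unit.bias += lr * delta;
}

const unit: Unit = { weights: xavierInit(2), bias: 0 };
const outBefore = predict(unit, [1, 0]);
for (let i = 0; i < 200; i++) trainStep(unit, [1, 0], 1, 0.5);
const outAfter = predict(unit, [1, 0]);
console.log(outBefore.toFixed(3), "->", outAfter.toFixed(3)); // output moves toward the target 1
```

Backpropagation in a multi-layer network repeats the same step per layer, with each hidden layer's `delta` computed from the deltas of the layer after it via the chain rule.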
|
|
580
|
+
|
|
581
|
+
When an **optimizer** is used (e.g., Adam), the raw gradient is passed to the optimizer instead of being applied directly. Each weight maintains its own optimizer state.
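What "each weight maintains its own optimizer state" means can be shown with a per-scalar Adam. The sketch below is a hypothetical standalone version using the standard Adam update with the usual defaults (`beta1 = 0.9`, `beta2 = 0.999`); the class name, the `step` signature, and the `eps` value are assumptions, and the sign convention here subtracts the gradient of a loss rather than adding `lr × delta`.

```ts
// Hypothetical per-scalar Adam; class and method names are illustrative.
class ScalarAdam {
  private m = 0; // first-moment estimate (running mean of gradients)
  private v = 0; // second-moment estimate (running mean of squared gradients)
  private t = 0; // step counter, used for bias correction

  constructor(
    private readonly beta1 = 0.9,
    private readonly beta2 = 0.999,
    private readonly eps = 1e-8,
  ) {}

  // Returns the amount to subtract from the parameter for this gradient.
  step(grad: number, lr: number): number {
    this.t += 1;
    this.m = this.beta1 * this.m + (1 - this.beta1) * grad;
    this.v = this.beta2 * this.v + (1 - this.beta2) * grad * grad;
    const mHat = this.m / (1 - Math.pow(this.beta1, this.t));
    const vHat = this.v / (1 - Math.pow(this.beta2, this.t));
    return (lr * mHat) / (Math.sqrt(vHat) + this.eps);
  }
}

// Minimize f(p) = (p - 3)^2, whose gradient is 2 * (p - 3).
let param = 0;
const adamState = new ScalarAdam();
for (let i = 0; i < 2000; i++) {
  param -= adamState.step(2 * (param - 3), 0.05);
}
console.log(param.toFixed(3)); // close to 3
```

A weight matrix built this way holds one such state object per entry, so the effective step size adapts independently for every scalar.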
|
|
582
|
+
|
|
583
|
+
The `Value` class implements **reverse-mode automatic differentiation**: every operation records its inputs and a backward function. Calling `.backward()` on the output node performs a topological sort and propagates `∂L/∂w` through the entire graph.
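The mechanism in the paragraph above (each operation stores its inputs and a backward function, then `.backward()` walks the graph in reverse topological order) can be sketched in miniature. This is a hypothetical standalone class named `Scalar`, written micrograd-style for illustration; it is not the package's `Value` source and supports only `add`, `mul`, and `tanh`.

```ts
// Hypothetical miniature scalar autograd node; not the library's source.
class Scalar {
  grad = 0;
  // Set by each operation: pushes this node's grad into its inputs.
  private backwardFn: () => void = () => {};

  constructor(
    public readonly data: number,
    private readonly inputs: Scalar[] = [],
  ) {}

  add(other: Scalar): Scalar {
    const out = new Scalar(this.data + other.data, [this, other]);
    out.backwardFn = () => {
      this.grad += out.grad;
      other.grad += out.grad;
    };
    return out;
  }

  mul(other: Scalar): Scalar {
    const out = new Scalar(this.data * other.data, [this, other]);
    out.backwardFn = () => {
      this.grad += other.data * out.grad;
      other.grad += this.data * out.grad;
    };
    return out;
  }

  tanh(): Scalar {
    const t = Math.tanh(this.data);
    const out = new Scalar(t, [this]);
    out.backwardFn = () => {
      this.grad += (1 - t * t) * out.grad;
    };
    return out;
  }

  backward(): void {
    // Topological order: every node appears after all of its inputs.
    const order: Scalar[] = [];
    const seen = new Set<Scalar>();
    const visit = (node: Scalar): void => {
      if (seen.has(node)) return;
      seen.add(node);
      node.inputs.forEach((input) => visit(input));
      order.push(node);
    };
    visit(this);

    this.grad = 1; // d(out)/d(out)
    for (let i = order.length - 1; i >= 0; i--) order[i].backwardFn();
  }
}

const x = new Scalar(2.0);
const wt = new Scalar(-3.0);
const b = new Scalar(6.7);
const o = x.mul(wt).add(b).tanh(); // o = tanh(x * wt + b)
o.backward();
console.log(x.grad, wt.grad, b.grad); // ∂o/∂x, ∂o/∂wt, ∂o/∂b
```

Gradients are accumulated with `+=` so that a node used in several places receives the sum of all its downstream contributions.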
|
|
584
|
+
|
|
585
|
+
## Build
|
|
586
|
+
|
|
587
|
+
```bash
|
|
588
|
+
npm run build # outputs CJS + ESM + type declarations to dist/
|
|
589
|
+
npm run dev # watch mode
|
|
590
|
+
npm test # run test suite
|
|
591
|
+
```
|
|
592
|
+
|
|
593
|
+
## For AI agents
|
|
594
|
+
|
|
595
|
+
If you are an AI agent or LLM working with this codebase, read [AGENTS.md](AGENTS.md) first. It contains the full class hierarchy, design constraints, and what this library does not do.
|
|
596
|
+
|
|
312
597
|
## Changelog
|
|
313
598
|
|
|
599
|
+
### v0.3.0
|
|
600
|
+
- **New — Classical ML:** `Perceptron`, `LinearRegression` (normal equation + GD), `LogisticRegression`, `SoftmaxRegression`, `GaussianNaiveBayes`, `DecisionTree` (CART, Gini/MSE)
|
|
601
|
+
- **New — Unsupervised:** `KMeans` (K-Means++ init), `PCA` (power iteration + Hotelling deflation), `SOM` (Kohonen map), `HopfieldNetwork` (Hebbian storage + energy), `Autoencoder`
|
|
602
|
+
- **New — Deep Learning:** `Conv2D` (full forward/backward), `MaxPool2D` (position mask for exact backprop), `Flatten`, `RNN` (BPTT, documents vanishing gradient), `Seq2Seq` (encoder-decoder LSTM), `CausalConv1D`, `TCN` (dilated temporal convolutions)
|
|
603
|
+
- **New — Generative:** `GAN` (min-max game, Box-Muller sampling), `VAE` (reparametrization trick, ELBO = MSE + KL)
|
|
604
|
+
- **New — Autograd:** `Value` / `Tape` — scalar reverse-mode AD with topological backprop (micrograd-style)
|
|
605
|
+
- **New — Metrics:** `confusionMatrix`, `accuracy`, `precision`, `recall`, `f1Score`, `rocCurve`, `auc`, `mae`, `rmse`, `r2Score`, `perplexity`, `printConfusionMatrix`, `classificationReport`
|
|
606
|
+
- **New — Utilities:** `EarlyStopping` (patience + best-weight restore), `LossPlotter` (ASCII terminal curve), `WeightInspector` (per-layer stats, dead ReLU detection), `DataAugmentation` (noise, normalize, z-score, shuffle, split)
|
|
607
|
+
|
|
608
|
+
### v0.2.7
|
|
609
|
+
- **Docs:** Added architecture diagram to README
|
|
610
|
+
|
|
314
611
|
### v0.2.6
|
|
315
612
|
- **Fix:** `Network.predict` now returns `number[]` (consistent with all other network classes)
|
|
316
|
-
- **Fix:** `Network.train` now uses the configured optimizer and `activation.dfn()`
|
|
317
|
-
- **Fix:** `LayerNorm.backwardOne`
|
|
318
|
-
- **Fix:** LSTM and GRU gate initialization corrected
|
|
319
|
-
- **New:** `BiasVector` — 1D counterpart to `WeightMatrix`
|
|
320
|
-
- **New:** `defaultOptimizer`
|
|
321
|
-
- **Refactor:** `NetworkN
|
|
322
|
-
- **Refactor:** `Transformer` backward methods now throw descriptive errors instead of crashing with a cryptic `TypeError` when called before `predict()`
|
|
323
|
-
- **Refactor:** `NetworkTransformer.setWeights()` and `NetworkTransformerRL.setWeightsFlat()` use each component's own `setWeights()` instead of direct `.W` mutation
|
|
613
|
+
- **Fix:** `Network.train` now uses the configured optimizer and `activation.dfn()`
|
|
614
|
+
- **Fix:** `LayerNorm.backwardOne` correctly uses pre-update γ
|
|
615
|
+
- **Fix:** LSTM and GRU gate initialization corrected to Xavier fan-in+out
|
|
616
|
+
- **New:** `BiasVector` — 1D counterpart to `WeightMatrix`
|
|
617
|
+
- **New:** `defaultOptimizer` — shared default factory
|
|
618
|
+
- **Refactor:** `NetworkN` extracts `_forwardAll()` and `_backpropLayers()`
|
|
324
619
|
|
|
325
620
|
### v0.2.5
|
|
326
|
-
- Unified optimizer factories for `LSTMLayer`, `GRULayer`, `Conv1D`
|
|
327
|
-
- `NetworkN`: residual connections
|
|
328
|
-
- `Conv1D`: multi-channel input
|
|
329
|
-
- `
|
|
330
|
-
- `
|
|
331
|
-
- `
|
|
332
|
-
- `ModelSaver`: universal serialization via flat `getWeights()`/`setWeights()` for all classes
|
|
333
|
-
- Gradient check test suite (`tests/GradientCheck.test.ts`)
|
|
334
|
-
|
|
335
|
-
## Possible improvements
|
|
336
|
-
|
|
337
|
-
1. **Support for batches** in training to improve efficiency and gradient stability.
|
|
338
|
-
2. **Global gradient norm clipping** — `WeightMatrix.update` supports per-element clipping; a utility to clip across all matrices by total norm would be more principled.
|
|
339
|
-
3. **Learning rate warmup** — standard practice for Transformers; ramp LR from 0 to target over the first N steps.
|
|
340
|
-
4. **Pre-norm architecture** — LayerNorm before the residual add (instead of after) is more stable for deep stacks.
|
|
621
|
+
- Unified optimizer factories for `LSTMLayer`, `GRULayer`, `Conv1D`
|
|
622
|
+
- `NetworkN`: residual connections and dropout
|
|
623
|
+
- `Conv1D`: multi-channel input
|
|
624
|
+
- `Trainer`: weight decay, early stopping, classification metrics
|
|
625
|
+
- `DataLoader`: validation split
|
|
626
|
+
- `ModelSaver`: universal serialization
|
|
341
627
|
|
|
342
628
|
## License
|
|
343
629
|
|