neuronet 6.1.0 → 7.0.230416

data/README.md CHANGED
@@ -1,786 +1,137 @@
1
- # Neuronet 6.0.1
1
+ # Neuronet
2
2
 
3
- Library to create neural networks.
4
-
5
- * Gem: <https://rubygems.org/gems/neuronet>
6
- * Git: <https://github.com/carlosjhr64/neuronet>
7
- * Author: <carlosjhr64@gmail.com>
8
- * Copyright: 2013
9
- * License: [GPL](http://www.gnu.org/licenses/gpl.html)
10
-
11
- ## Installation
12
-
13
- gem install neuronet
14
-
15
- ## Synopsis
16
-
17
- Suppose you have a set of inputs (of length at least 3) and
18
- targets, each an Array of Floats. Then:
19
-
20
- # data = [ [input, target], ... ]
21
- # n = input.length # >= 3
22
- # t = target.length
23
- # m = n + t
24
- # l = data.length
25
- # Then:
26
- # Create a general purpose neuronet
27
-
28
- neuronet = Neuronet::ScaledNetwork.new([n, m, t])
29
-
30
- # "Bless" it as a TaoYinYang,
31
- # a perceptron hybrid with the middle layer
32
- # initially mirroring the input layer and
33
- # mirrored by the output layer.
34
-
35
- Neuronet::TaoYinYang.bless(neuronet)
36
-
37
- # The following sets the learning constant
38
- # to something I think is reasonable.
39
-
40
- neuronet.num(l)
41
-
42
- # Start training
43
-
44
- MANY.times do # MANY being some suitably large number of passes
45
- data.shuffle.each do |input, target|
46
- neuronet.reset(input)
47
- neuronet.train!(target)
48
- end
49
- end # or until some small enough error
50
-
51
- # See how well the training went
52
-
53
- require 'pp'
54
- data.each do |input, target|
55
- puts "Input:"
56
- pp input
57
- puts "Output:"
58
- neuronet.reset(input) # sets the input values
59
- pp neuronet.output # gets the output values
60
- puts "Target:"
61
- pp target
62
- end
63
-
64
- ## Introduction
65
-
66
- Neuronet is a pure Ruby 1.9, sigmoid-squashed, neural network building library.
67
- It allows one to build a network by connecting one neuron at a time, or a layer at a time,
68
- or up to a full feed-forward network that automatically scales the inputs and outputs.
69
-
70
- I chose a TaoYinYang'ed ScaledNetwork neuronet for the synopsis because
71
- it will probably handle almost anything with 3 or more input variables you'd throw at it.
72
- But there's a lot you can do to the data before throwing it at a neuronet.
73
- And you can build a neuronet specifically to solve a particular kind of problem.
74
- Properly transforming the data and choosing the right neuronet architecture
75
- can greatly reduce the amount of training time the neuronet will require.
76
- A neuronet with the wrong architecture for a problem will be unable to solve it.
77
- Raw data without hints as to what's important will take longer to solve.
78
-
79
- As an analogy, think of what you can do with
80
- [linear regression](http://en.wikipedia.org/wiki/Linear_regression).
81
- Your raw data might not be linear, but if a transform converts it to a linear form,
82
- you can use linear regression to find the best fit line, and
83
- from that deduce the properties of the untransformed data.
84
- Likewise, if you can transform the data into something the neuronet can solve,
85
- you can invert the transform to get back the answer you're looking for.
86
-
87
- # Examples
88
-
89
- ## Time Series
90
-
91
- A common use for a neural net is to attempt to forecast a future set of data points
92
- based on a past set of data points, a [Time series](http://en.wikipedia.org/wiki/Time_series).
93
- To demonstrate, I'll train a network with the following function:
94
-
95
- f(t) = A + B sine(C + D t), t in [0,1,2,3,...]
96
-
97
- I'll set A, B, C, and D with random numbers and see
98
- if eventually the network can predict the next set of values based on previous values.
99
- I'll try:
100
-
101
- [f(n),...,f(n+19)] => [f(n+20),...,f(n+24)]
102
-
103
- That is... given 20 consecutive values, give the next 5 in the series.
104
- There is no loss, and probably greater generality,
105
- if I set the phase (C above) at random, so that for any given random phase we want:
106
-
107
- [f(0),...,f(19)] => [f(20),...,f(24)]
108
-
109
- I'll be using [Neuronet::ScaledNetwork](http://rubydoc.info/gems/neuronet/Neuronet/ScaledNetwork).
110
- Also note that the sine function is entirely defined within one cycle (2*Math::PI), and
111
- so the parameters (particularly C) need only be set within this cycle.
112
- After a lot of testing, I've verified that a
113
- [Perceptron](http://en.wikipedia.org/wiki/Perceptron) is enough to solve the problem.
114
- The Sine function is [Linearly separable](http://en.wikipedia.org/wiki/Linearly_separable).
115
- Adding hidden layers needlessly adds training time, but training still converges.
116
-
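- To make the setup concrete, here's a rough sketch of how such training pairs
- could be generated (hypothetical names and ranges, not the actual example file):
-
- a, b, d = 2.0 + rand, 3.0 + rand, 0.1 + 2.0*rand
- f = ->(c, t){ a + b*Math.sin(c + d*t) }
- data = Array.new(100) do
-   c = 2.0*Math::PI*rand # random phase
-   input  = (0...20).map{ |t| f.call(c, t) }
-   target = (20...25).map{ |t| f.call(c, t) }
-   [input, target]
- end
-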
117
- The gist of the
118
- [example code](https://github.com/carlosjhr64/neuronet/blob/master/examples/sine_series.rb)
119
- is:
120
-
121
- ...
122
- # The constructor
123
- neuronet = Neuronet::ScaledNetwork.new([INPUTS, OUTPUTS])
124
- ...
125
- # Setting learning constant
126
- neuronet.num(1.0)
127
- ...
128
- # Setting the input values
129
- neuronet.reset(input)
130
- ...
131
- # Getting the neuronet's output
132
- output = neuronet.output
133
- ...
134
- # Training the target
135
- neuronet.train!(target)
136
- ...
137
-
138
- Here's a sample output:
139
-
140
- f(phase, t) = 3.002 + 3.28*Sin(phase + 1.694*t)
141
- Cycle step = 0.27
142
-
143
- Iterations: 1738
144
- Relative Error (std/B): 0.79% Standard Deviation: 0.026
145
- Examples:
146
-
147
- Input: 0.522, 1.178, 5.932, 4.104, -0.199, 2.689, 6.28, 2.506, -0.154, 4.276, 5.844, 1.028, 0.647, 5.557, 4.727, 0.022, 2.011, 6.227, 3.198, -0.271
148
- Target: 3.613, 6.124, 1.621, 0.22, 5.069
149
- Output: 3.575, 6.101, 1.664, 0.227, 5.028
150
-
151
- Input: 5.265, 5.079, 0.227, 1.609, 6.12, 3.626, -0.27, 3.184, 6.229, 2.024, 0.016, 4.716, 5.565, 0.656, 1.017, 5.837, 4.288, -0.151, 2.493, 6.28
152
- Target: 2.703, -0.202, 4.091, 5.938, 1.189
153
- Output: 2.728, -0.186, 4.062, 5.931, 1.216
154
-
155
- Input: 5.028, 0.193, 1.669, 6.14, 3.561, -0.274, 3.25, 6.217, 1.961, 0.044, 4.772, 5.524, 0.61, 1.07, 5.87, 4.227, -0.168, 2.558, 6.281, 2.637
156
- Target: -0.188, 4.153, 5.908, 1.135, 0.557
157
- Output: -0.158, 4.112, 5.887, 1.175, 0.564
158
-
159
- ScaledNetwork automatically scales each input via
160
- [Neuronet::Gaussian](http://rubydoc.info/gems/neuronet/Neuronet/Gaussian),
161
- so the input needs to have several variables and
162
- the output must be entirely determined by the shape of the input and not its scale.
163
- That is, two inputs that are different only in scale should
164
- produce outputs that are different only in scale.
165
- The input must have at least three points.
166
-
167
- You can tackle many problems just with
168
- [Neuronet::ScaledNetwork](http://rubydoc.info/gems/neuronet/Neuronet/ScaledNetwork)
169
- as described above.
170
-
171
- # Component Architecture
172
-
173
- ## Nodes and Neurons
174
-
175
- [Nodes](http://rubydoc.info/gems/neuronet/Neuronet/Node)
176
- are used to set inputs while
177
- [Neurons](http://rubydoc.info/gems/neuronet/Neuronet/Neuron)
178
- are used for outputs and middle layers.
179
- It's easy to create and connect Nodes and Neurons.
180
- You can assemble custom neuronets one neuron at a time.
181
- To illustrate, here's a simple network that adds two random numbers.
182
-
183
- require 'neuronet'
184
- include Neuronet
185
-
186
- def random
187
- rand - rand
188
- end
189
-
190
- # create the input nodes
191
- a = Node.new
192
- b = Node.new
193
-
194
- # create the output neuron
195
- sum = Neuron.new
196
-
197
- # and a neuron on the side
198
- adjuster = Neuron.new
199
-
200
- # connect the adjuster to a and b
201
- adjuster.connect(a)
202
- adjuster.connect(b)
203
-
204
- # connect sum to a and b
205
- sum.connect(a)
206
- sum.connect(b)
207
- # and to the adjuster
208
- sum.connect(adjuster)
209
-
210
- # The learning constant is about...
211
- learning = 0.1
212
-
213
- # Train the tiny network
214
- 10_000.times do
215
- a.value = x = random
216
- b.value = y = random
217
- target = x+y
218
- output = sum.update
219
- sum.backpropagate(learning*(target-output))
220
- end
221
-
222
- # Let's see how well the training went
223
- 10.times do
224
- a.value = x = random
225
- b.value = y = random
226
- target = x+y
227
- output = sum.update
228
- puts "#{x.round(3)} + #{y.round(3)} = #{target.round(3)}"
229
- puts " Neuron says #{output.round(3)}, #{(100.0*(target-output)/target).round(2)}% error."
230
- end
231
-
232
-
233
- Here's a sample output:
234
-
235
- 0.003 + -0.413 = -0.41
236
- Neuron says -0.413, -0.87% error.
237
- -0.458 + 0.528 = 0.07
238
- Neuron says 0.07, -0.45% error.
239
- 0.434 + -0.125 = 0.309
240
- Neuron says 0.313, -1.43% error.
241
- -0.212 + 0.34 = 0.127
242
- Neuron says 0.131, -2.83% error.
243
- -0.364 + 0.659 = 0.294
244
- Neuron says 0.286, 2.86% error.
245
- 0.045 + 0.323 = 0.368
246
- Neuron says 0.378, -2.75% error.
247
- 0.545 + 0.901 = 1.446
248
- Neuron says 1.418, 1.9% error.
249
- -0.451 + -0.486 = -0.937
250
- Neuron says -0.944, -0.77% error.
251
- -0.008 + 0.219 = 0.211
252
- Neuron says 0.219, -3.58% error.
253
- 0.61 + 0.554 = 1.163
254
- Neuron says 1.166, -0.25% error.
255
-
256
- Note that the tiny neuronet has a limit on how precisely it can match the target, and
257
- even after a million training iterations it won't do any better than it does after a few thousand.
258
- [code](https://github.com/carlosjhr64/neuronet/blob/master/examples/neurons.rb)
259
-
260
-
261
- ## InputLayer and Layer
262
-
263
- Instead of working with individual neurons, you can work with layers.
264
- Here we build a [Perceptron](http://en.wikipedia.org/wiki/Perceptron):
265
-
266
- input = InputLayer.new(9) # "in" is a Ruby keyword, so call it "input"
267
- out = Layer.new(1)
268
- out.connect(input)
269
-
270
- When making connections, keep in mind "outputs connect to inputs",
271
- not the other way around.
272
- You can set the input values and update this way:
273
-
274
- input.set([1,2,3,4,5,6,7,8,9])
275
- out.partial
276
-
277
- Partial means the update won't travel further than the current layer,
278
- which is all we have in this case anyway.
279
- You get the output this way:
280
-
281
- output = out.output # returns an array of values
282
-
283
- You train this way:
284
-
285
- target = [1] #<= whatever value you want in the array
286
- learning = 0.1
287
- out.train(target, learning)
288
-
289
- ## FeedForward Network
290
-
291
- Most of the time, you'll just use a network created with the
292
- [FeedForward](http://rubydoc.info/gems/neuronet/Neuronet/FeedForward) class,
293
- or a modified version or subclass of it.
294
- Here we build a neuronet with four layers.
295
- The input layer has four neurons, and the output has three.
296
- Then we train it with a list of inputs and targets
297
- using the method [#exemplar](http://rubydoc.info/gems/neuronet/Neuronet/FeedForward:exemplar):
298
-
299
- neuronet = Neuronet::FeedForward.new([4,5,6,3])
300
- LIST.each do |input, target|
301
- neuronet.exemplar(input, target)
302
- # you could also train this way:
303
- # neuronet.set(input)
304
- # neuronet.train!(target)
305
- end
306
-
307
- The first layer is the input layer and the last layer is the output layer.
308
- Neuronet also names the second and the second-to-last layers.
309
- The second layer is called yin.
310
- The second-to-last layer is called yang.
311
- For the example above, we can check their lengths.
312
-
313
- puts neuronet.in.length #=> 4
314
- puts neuronet.yin.length #=> 5
315
- puts neuronet.yang.length #=> 6
316
- puts neuronet.out.length #=> 3
317
-
318
- ## Tao, Yin, Yang, and Brahma
319
-
320
- Tao
321
- : The absolute principle underlying the universe,
322
- combining within itself the principles of yin and yang and
323
- signifying the way, or code of behavior,
324
- that is in harmony with the natural order.
325
-
326
- Perceptrons are already very capable and quick to train.
327
- By connecting the input layer to the output layer of a multilayer FeedForward network,
328
- you'll get the Perceptron solution quicker while the middle layers work on the harder problem.
329
- You can do that this way:
330
-
331
- neuronet.out.connect(neuronet.in)
332
-
333
- But giving that a name, [Tao](http://rubydoc.info/gems/neuronet/Neuronet/Tao),
334
- and using a prototype pattern to modify the instance is more fun:
335
-
336
- Tao.bless(neuronet)
337
-
338
- Yin
339
- : The passive female principle of the universe, characterized as female and
340
- sustaining and associated with earth, dark, and cold.
341
-
342
- Initially FeedForward sets the weights of all connections to zero.
343
- That is, there is no association made from input to output.
344
- Changes in the inputs have no effect on the output.
345
- Training begins the process that sets the weights to associate the two.
346
- But you can also manually set the initial weights.
347
- One useful way to initially set the weights is to have one layer mirror another.
348
- The [Yin](http://rubydoc.info/gems/neuronet/Neuronet/Yin) bless makes yin mirror the input.
349
- The length of yin must be at least that of in.
350
- The pairing starts with in.first and yin.first on up.
351
-
352
- Yin.bless(neuronet)
353
-
354
- Yang
355
- : The active male principle of the universe, characterized as male and
356
- creative and associated with heaven, heat, and light.
357
-
358
- On the other hand, the [Yang](http://rubydoc.info/gems/neuronet/Neuronet/Yang)
359
- bless makes the output mirror yang.
360
- The length of yang must be at least that of out.
361
- The pairing starts from yang.last and out.last on down.
362
-
363
- Yang.bless(neuronet)
364
-
365
- Brahma
366
- : The creator god in later Hinduism, who forms a triad with Vishnu the preserver and Shiva the destroyer.
367
-
368
- [Brahma](http://rubydoc.info/gems/neuronet/Neuronet/Brahma)
369
- pairs each input node with two yin neurons, sending them respectively the positive and negative values of its activation.
370
- I'd say then that yin both mirrors and shadows input.
371
- The length of yin must be at least twice that of in.
372
- The pairing starts with in.first and yin.first on up.
373
-
374
- Brahma.bless(neuronet)
375
-
376
- Bless
377
- : Pronounce words in a religious rite, to confer or invoke divine favor upon.
378
-
379
- The reason Tao, Yin, and Yang are not classes unto themselves is that
380
- you can combine them, and a prototype pattern (bless) works better in this case.
381
- Bless is the keyword used in [Perl](http://www.perl.org/) to create objects,
382
- so it's not without precedent.
383
- To combine all three features, Tao, Yin, and Yang, do this:
384
-
385
- Tao.bless Yin.bless Yang.bless neuronet
386
-
387
- To save typing, the library provides the possible combinations.
388
- For example:
389
-
390
- TaoYinYang.bless neuronet
391
-
392
- # Scaling The Problem
3
+ * [VERSION 7.0.230416](https://github.com/carlosjhr64/neuronet/releases)
4
+ * [github](https://github.com/carlosjhr64/neuronet)
5
+ * [rubygems](https://rubygems.org/gems/neuronet)
393
6
 
394
- The squashing function, sigmoid, maps real numbers (negative infinity, positive infinity)
395
- to the segment zero to one (0,1).
396
- But for the sake of computation in a neural net,
397
- sigmoid works best if the problem is scaled to numbers
398
- between negative one and positive one (-1, 1).
399
- Study the following table and see if you can see why:
7
+ ## DESCRIPTION:
400
8
 
401
- x => sigmoid(x)
402
- 9 => 0.99987...
403
- 3 => 0.95257...
404
- 2 => 0.88079...
405
- 1 => 0.73105...
406
- 0 => 0.50000...
407
- -1 => 0.26894...
408
- -2 => 0.11920...
409
- -3 => 0.04742...
410
- -9 => 0.00012...
411
-
412
- As x gets much higher than 3, sigmoid(x) gets to be pretty close to just 1, and
413
- as x gets much lower than -3, sigmoid(x) gets to be pretty close to 0.
414
- Note that sigmoid is centered about 0.5 which maps to 0.0 in problem space.
415
- It is for this reason that I suggest the problem be displaced (subtracted)
416
- by its average to be centered about zero and scaled (divided) by its standard deviation.
417
- Try to get most of the data to fit within sigmoid's central "field of view" (-1, 1).
418
-
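- For example, a minimal sketch of that centering and scaling in plain Ruby
- (not using the library's scaling classes, which are described next):
-
- values = [8.0, 11.0, 14.0, 9.0, 13.0]
- mean   = values.inject(:+) / values.length
- std    = Math.sqrt(values.map{ |v| (v - mean)**2 }.inject(:+) / values.length)
- scaled = values.map{ |v| (v - mean) / std }
- # => roughly [-1.32, 0.0, 1.32, -0.88, 0.88], centered about zero with unit spread
-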
419
- ## Scale, Gaussian, and Log Normal
420
-
421
- Neuronet provides three classes to help scale the problem space.
422
- [Neuronet::Scale](http://rubydoc.info/gems/neuronet/Neuronet/Scale)
423
- is the simplest and most straightforward.
424
- It finds the range and center of a list of values, and
425
- linearly transforms it to a range of (-1,1) centered at 0.
426
- For example:
427
-
428
- scale = Neuronet::Scale.new
429
- values = [ 1, -3, 5, -2 ]
430
- scale.set( values )
431
- mapped = scale.mapped( values )
432
- puts mapped.join(', ') # 0.0, -1.0, 1.0, -0.75
433
- puts scale.unmapped( mapped ).join(', ') # 1.0, -3.0, 5.0, -2.0
434
-
435
- The mapping is the following:
436
-
437
- center = (maximum + minimum) / 2.0 if center.nil? # calculate center if not given
438
- spread = (maximum - minimum) / 2.0 if spread.nil? # calculate spread if not given
439
- inputs.map{ |value| (value - center) / (factor * spread) }
440
-
441
- One can change the range of the map to (-1/factor, 1/factor)
442
- where factor is the spread multiplier, and force
443
- a (perhaps pre-calculated) value for center and spread.
444
- The constructor is:
445
-
446
- scale = Neuronet::Scale.new( factor=1.0, center=nil, spread=nil )
447
-
448
- In the constructor, if the value of center is provided, then
449
- that value will be used instead of it being calculated from the values passed to method set.
450
- Likewise, if spread is provided, that value of spread will be used.
451
-
452
- [Neuronet::Gaussian](http://rubydoc.info/gems/neuronet/Neuronet/Gaussian)
453
- works the same way, except that it uses the average value of the list given
454
- for the center, and the standard deviation for the spread.
455
-
456
- And [Neuronet::LogNormal](http://rubydoc.info/gems/neuronet/Neuronet/LogNormal)
457
- is just like Gaussian except that it first pipes values through a logarithm, and
458
- then pipes the output back through exponentiation.
459
-
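- Roughly, the idea is (a sketch of the round trip, not the library's implementation):
-
- values = [1.0, 10.0, 100.0]
- logs   = values.map{ |v| Math.log(v) } # scale these as Gaussian would...
- logs.map{ |l| Math.exp(l) }            # ...and exponentiation undoes the log on the way out
-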
460
- ## ScaledNetwork
461
-
462
- [Neuronet::ScaledNetwork](http://rubydoc.info/gems/neuronet/Neuronet/ScaledNetwork)
463
- automates the problem space scaling.
464
- You can choose to do your scaling over the entire data set if you think
465
- the relative scale of the individual inputs matters.
466
- For example, if in the problem one apple is good but two is too many...
467
- In that case do this:
468
-
469
- scaled_network.distribution.set( data_set.flatten )
470
- data_set.each do |inputs,outputs|
471
- # ... do your stuff using scaled_network.set( inputs )
472
- end
473
-
474
- If, on the other hand, the scale of the individual inputs is not the relevant feature,
475
- you can do your scaling per individual input.
476
- For example, a small apple is an apple, and so is the big one. They're both apples.
477
- Then do this:
478
-
479
- data_set.each do |inputs,outputs|
480
- # ... do your stuff using scaled_network.reset( inputs )
481
- end
482
-
483
- Note that in the first case you are using
484
- [#set](http://rubydoc.info/gems/neuronet/Neuronet/ScaledNetwork:set)
485
- and in the second case you are using
486
- [#reset](http://rubydoc.info/gems/neuronet/Neuronet/ScaledNetwork:reset).
487
-
488
- # Pitfalls
489
-
490
- When sub-classing a Neuronet::Scale type class,
491
- make sure mapped\_input, mapped\_output, unmapped\_input,
492
- and unmapped\_output are defined as you intended.
493
- If you don't override them, they will point to the first ancestor that defines them.
494
- Overriding #mapped does not piggyback the aliases and
495
- they will continue to point to the original #mapped method.
496
-
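- Here's a tiny illustration of that alias behavior in plain Ruby (hypothetical
- classes, not the library's):
-
- class Base
-   def mapped(x); x * 2; end
-   alias mapped_input mapped
- end
- class Custom < Base
-   def mapped(x); x * 3; end # overrides #mapped only
- end
- Custom.new.mapped(1)        #=> 3
- Custom.new.mapped_input(1)  #=> 2, still the original #mapped
-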
497
- Another pitfall is confusing the input/output flow in connections and back-propagation.
498
- Remember to connect outputs to inputs (out.connect(in)) and
499
- to back-propagate from outputs to inputs (out.train(targets)).
500
-
501
- # Interesting Custom Networks
502
-
503
- Note that a particularly interesting YinYang with n inputs and m outputs
504
- would be constructed this way:
505
-
506
- yinyang = YinYang.bless FeedForward.new( [n, n+m, m] )
507
-
508
- Here yinyang's hidden layer (which is both yin and yang)
509
- initially would have the first n neurons mirror the input and
510
- the last m neurons be mirrored by the output.
511
- Another interesting YinYang would be an input to output mirror:
512
-
513
- yinyang = YinYang.bless FeedForward.new( [n, n, n] )
514
-
515
- # Theory
516
-
517
- ## The Biological Description of a Neuron
518
-
519
- Usually a neuron is described as being either on or off.
520
- I think it is more useful to describe a neuron as having a pulse rate.
521
- A neuron would either have a high or a low pulse rate.
522
- In the absence of any stimuli from neighboring neurons, the neuron may also have a rest pulse rate.
523
- A neuron receives stimuli from other neurons through the axons that connect them.
524
- These axons communicate to the receiving neuron the pulse rates of the transmitting neurons.
525
- The signals from other neurons are either strengthened or weakened at the synapse, and
526
- might either inhibit or excite the receiving neuron.
527
- Regardless of how much stimulation the neuron gets,
528
- a neuron has a maximum pulse it cannot exceed.
529
-
530
- ## The Mathematical Model of a Neuron
531
-
532
- Since my readers here are probably Ruby programmers, I'll write the math in a Ruby-ish way.
533
- Allow me to sum this way:
534
-
535
- module Enumerable
536
- def sum
537
- map{|a| yield(a)}.inject(0, :+)
538
- end
539
- end
540
- [1,2,3].sum{|i| 2*i} == 2+4+6 # => true
541
-
542
- Can I convince you that taking the derivative of a function looks like this?
543
-
544
- def d(x)
545
- dx = SMALL # where SMALL is some suitably small number, e.g. 1.0e-6
546
- f = yield(x)
547
- (yield(x+dx) - f)/dx
548
- end
549
- dfdx = d(a){|x| f(x)}
550
-
551
- So the Ruby-ish way to write one of the rules of Calculus is:
552
-
553
- d{|x| Ax^n} == nAx^(n-1)
554
-
555
- We won't bother distinguishing integers from floats.
556
- The sigmoid function is:
557
-
558
- def sigmoid(x)
559
- 1/(1+exp(-x))
560
- end
561
- sigmoid(a) == 1/(1+exp(-a))
562
-
563
- A neuron's pulserate increases with increasing stimulus, so
564
- we need a model that adds up all the stimuli a neuron gets.
565
- The sum of all stimuli we will call the neuron's value.
566
- (I find this confusing, but
567
- it works out that it is this sum that will give us the problem space value.)
568
- To model the neuron's rest pulse, we'll say that it has a bias value, its own stimulus.
569
- Stimuli from other neurons come through the connections,
570
- so there is a sum over all the connections.
571
- The stimuli from other transmitting neurons are proportional to their own pulserates and
572
- the weight the receiving neuron gives them.
573
- In the model we will call the pulserate the neuron's activation.
574
- Lastly, to more closely match the code, a neuron is a node.
575
- This is what we have so far:
576
-
577
- value = bias + connections.sum{|connection| connection.weight * connection.node.activation }
578
-
579
- # or by their biological synonyms
580
-
581
- stimulus = unsquashed_rest_pulse_rate +
582
- connections.sum{|connection| connection.weight * connection.neuron.pulserate}
583
-
584
- Unsquashed rest pulse rate? Yeah, I'm about to close the loop here.
585
- As described, a neuron can have a very low pulse rate, effectively zero,
586
- and a maximum pulse which I will define as being one.
587
- The sigmoid function will take any amount it gets and
588
- squashes it to a number between zero and one,
589
- which is what we need to model the neuron's behavior.
590
- To get the node's activation (aka neuron's pulserate)
591
- from the node's value (aka neuron's stimulus),
592
- we squash the value with the sigmoid function.
593
-
594
- # the node's activation from its value
595
- activation = sigmoid(value)
596
-
597
- # or by their biological synonyms
598
-
599
- # the neuron's pulserate from its stimulus
600
- pulserate = sigmoid(stimulus)
601
-
602
- So the "rest pulse rate" is sigmoid("unsquashed rest pulse rate").
603
-
604
- ## Backpropagation of Errors
605
-
606
- There's a lot of really complicated math in understanding how neural networks work.
607
- But if we concentrate on just the part pertinent to the backpropagation code, it's not that bad.
608
- The trick is to do the analysis in the problem space (otherwise things get really ugly).
609
- When we train a neuron, we want the neuron's value to match a target as closely as possible.
610
- The deviation from the target is the error:
611
-
612
- error = target - value
613
-
614
- Where does the error come from?
615
- It comes from deviations from the ideal bias and weights the neuron should have.
616
-
617
- target = value + error
618
- target = bias + bias_error +
619
- connections.sum{|connection| (connection.weight + weight_error) * connection.node.activation }
620
- error = bias_error + connections.sum{|connection| weight_error * connection.node.activation }
621
-
622
- Next we assume that the errors are equally likely everywhere,
623
- so that the bias error is expected to be the same on average as the weight error.
624
- That's where the learning constant comes in.
625
- We need to divide the error equally among all contributors, say 1/N.
626
- Then:
627
-
628
- error = error/N + connections.sum{|connection| error/N * connection.node.activation }
629
-
630
- Note that if the equation above represents the entire network, then
631
-
632
- N = 1 + connections.length
633
-
634
- So now that we know the error, we can modify the bias and weights.
635
-
636
- bias += error/N
637
- connection.weight += connection.node.activation * error/N
638
-
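- As a stand-alone sanity check, here's that update rule for a single neuron in
- plain Ruby with made-up numbers (independent of the library's classes):
-
- activations = [0.3, 0.7]           # from two connected nodes
- weights     = [0.5, -0.25]
- bias        = 0.1
- value       = bias + weights.zip(activations).map{ |w, a| w * a }.inject(:+)
- error       = 1.0 - value          # say the target is 1.0
- n           = 1 + weights.length   # the bias plus each connection
- bias       += error / n
- weights     = weights.zip(activations).map{ |w, a| w + a * error / n }
- bias + weights.zip(activations).map{ |w, a| w * a }.inject(:+) # closer to 1.0
-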
639
- The Calculus is:
640
-
641
- d{|bias| bias + connections.sum{|connection| connection.weight * connection.node.activation }}
642
- == d{|bias| bias}
643
-
644
- d{|connection.weight| bias + connections.sum{|connection| connection.weight * connection.node.activation }}
645
- == connection.node.activation * d{|weight| connection.weight }
646
-
647
- So what's all the ugly math you'll see elsewhere?
648
- Well, you can try to do the above analysis in neuron space.
649
- Then you're inside the squash function.
650
- I'll just show the derivative of the sigmoid function:
651
-
652
- d{|x| sigmoid(x)} ==
653
- d{|x| 1/(1+exp(-x))} ==
654
- -1/(1+exp(-x))^2 * d{|x| (1+exp(-x))} ==
655
- -1/(1+exp(-x))^2 * d{|x| exp(-x)} ==
656
- -1/(1+exp(-x))^2 * d{|x| -x}*exp(-x) ==
657
- -1/(1+exp(-x))^2 * (-1)*exp(-x) ==
658
- exp(-x)/(1+exp(-x))^2 ==
659
- (1 + exp(-x) - 1)/(1+exp(-x))^2 ==
660
- ((1 + exp(-x)) - 1)/(1+exp(-x))^2 ==
661
- (1/sigmoid(x) - 1) * sigmoid^2(x) ==
662
- (1 - sigmoid(x)) * sigmoid(x) ==
663
- sigmoid(x)*(1 - sigmoid(x))
664
- # =>
665
- d{|x| sigmoid(x)} == sigmoid(x)*(1 - sigmoid(x))
666
-
667
- From there you try to find the errors from the point of view of the activation instead of the value.
668
- But as the code clearly shows, the analysis need not get this deep.
669
-
670
- ## Learning Constant
671
-
672
- One can think of a neural network as a sheet of very elastic rubber
673
- which one pokes and pulls to fit the training data while
674
- otherwise keeping the sheet as smooth as possible.
675
- One concern is that the training data may contain noise, random errors.
676
- So the training of the network should add up the true signal in the data
677
- while canceling out the noise. This balance is set via the learning constant.
678
-
679
- neuronet.learning
680
- # Returns the current value of the network's learning constant
681
-
682
- neuronet.learning = float
683
- # where float is greater than zero but less than one.
684
-
685
- By default, Neuronet::FeedForward sets the learning constant to 1/N, where
686
- N is the number of biases and weights in the network
687
- (plus one, just because...). You can get the value of N with
688
- [#mu](http://rubydoc.info/gems/neuronet/Neuronet/FeedForward:mu).
689
-
690
- So I'm now making up a few more names for stuff.
691
- The number of contributors to errors in the network is #mu.
692
- The learning constant based on #mu is
693
- [#muk](http://rubydoc.info/gems/neuronet/Neuronet/FeedForward:muk).
694
- You can modify the learning constant to some fraction of muk, say 0.7, this way:
695
-
696
- neuronet.muk(0.7)
697
-
698
- I've not come across any hard rule for the learning constant.
699
- I have my own intuition derived from the behavior of random walks.
700
- The distance away from a starting point in a random walk is
701
- proportional to the square root of the number of steps.
702
- I conjecture that the number of training data points is related to
703
- the optimal learning constant in the same way.
704
- So I provide a way to set the learning constant based on the size of the data with
705
- [#num](http://rubydoc.info/gems/neuronet/Neuronet/FeedForward:num)
706
-
707
- neuronet.num(n)
708
-
709
- The value of #num(n) is #muk(1.0)/Math.sqrt(n).
710
-
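- With made-up numbers, reading #muk(1.0) as 1/#mu per the description above:
-
- mu  = 61                   # say, biases + weights (plus one) in some network
- n   = 400                  # number of training data points
- muk = 1.0 / mu             # => about 0.0164
- num = muk / Math.sqrt(n)   # => about 0.0008
-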
711
- ## Mirroring
712
-
713
- Because the squash function is not linear, mirroring is going to be warped.
714
- Nonetheless, I'd like to map zeroes to zeroes and ones to ones.
715
- That gives us the following two equations:
716
-
717
- weight*sigmoid(1.0) + bias = 1.0
718
- weight*sigmoid(0.0) + bias = 0.0
719
-
720
- We can solve that! Consider the zeroes to zeroes map:
721
-
722
- weight*sigmoid(0.0) + bias = 0.0
723
- weight*sigmoid(0.0) = -bias
724
- weight*0.5 = -bias
725
- weight = -2*bias
726
-
727
- Now the ones to ones:
728
-
729
- weight*sigmoid(1.0) + bias = 1.0
730
- -2.0*bias*sigmoid(1.0) + bias = 1.0
731
- bias*(-2.0*sigmoid(1.0) + 1.0) = 1.0
732
- bias = 1.0 / (1.0 - 2.0*sigmoid(1.0))
733
-
734
- We get the numerical values:
735
-
736
- bias = -2.163953413738653 # BZERO
737
- weight = 4.327906827477306 # WONE
738
-
739
- In the code I call this bias and weight BZERO and WONE respectively.
740
- What about "shadowing"?
741
-
742
- weight*sigmoid(1.0) + bias = -1.0
743
- weight*sigmoid(0.0) + bias = 0.0
744
-
745
- weight = -2.0*bias # <== same as before
746
-
747
- weight*sigmoid(1.0) + bias = -1.0
748
- -2.0*bias*sigmoid(1.0) + bias = -1.0
749
- bias*(-2.0*sigmoid(1.0) + 1.0) = -1.0
750
- bias = -1.0 / (-2.0*sigmoid(1.0) + 1.0)
751
- bias = 1.0 / (2.0*sigmoid(1.0) - 1.0)
752
- # ^== this is just the negative of what we got before.
753
-
754
- Shadowing is just the negative of mirroring.
755
- There's a test, [tests/mirror.rb](https://github.com/carlosjhr64/neuronet/blob/master/tests/mirror.rb),
756
- which demonstrates mirroring. Here's the output:
757
-
758
- ### YinYang ###
759
- Input:
760
- -1.0, 0.0, 1.0
761
- In:
762
- 0.2689414213699951, 0.5, 0.7310585786300049
763
- Yin/Yang:
764
- 0.2689414213699951, 0.5, 0.7310585786300049
765
- 0.2689414213699951, 0.5, 0.7310585786300049
766
- Out:
767
- 0.2689414213699951, 0.5, 0.7310585786300049
768
- Output:
769
- -1.0000000000000002, 0.0, 1.0
770
-
771
- ### BrahmaYang ###
772
- Input:
773
- -1.0, 0.0, 1.0
774
- In:
775
- 0.2689414213699951, 0.5, 0.7310585786300049
776
- Yin/Yang:
777
- 0.2689414213699951, 0.7310585786300049, 0.5, 0.5, 0.7310585786300049, 0.2689414213699951
778
- 0.2689414213699951, 0.7310585786300049, 0.5, 0.5, 0.7310585786300049, 0.2689414213699951
779
- Out:
780
- 0.2689414213699951, 0.7310585786300049, 0.5, 0.5, 0.7310585786300049, 0.2689414213699951
781
- Output:
782
- -1.0000000000000002, 1.0, 0.0, 0.0, 1.0, -1.0000000000000002
783
-
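- A quick numeric check of the mirroring constants derived above (plain Ruby):
-
- sigmoid = ->(x){ 1.0 / (1.0 + Math.exp(-x)) }
- bzero   = 1.0 / (1.0 - 2.0*sigmoid.call(1.0)) # => about -2.16395, BZERO
- wone    = -2.0*bzero                          # => about  4.32791, WONE
- wone*sigmoid.call(1.0) + bzero                # => about  1.0, ones to ones
- wone*sigmoid.call(0.0) + bzero                # => about  0.0, zeroes to zeroes
-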
784
- # Questions?
9
+ Library to create neural networks.
785
10
 
786
- Email me!
11
+ This is primarily a math project meant to be used to investigate the behavior of
12
+ different small neural networks.
13
+
14
+ ## INSTALL:
15
+ ```console
16
+ gem install neuronet
17
+ ```
18
+ ## SYNOPSIS:
19
+
20
+ The library is meant to be read, but here is a motivating example:
21
+ ```ruby
22
+ require 'neuronet'
23
+ include Neuronet
24
+
25
+ ff = FeedForward.new([3,3])
26
+ # It can mirror, equivalent to "copy":
27
+ ff.last.mirror
28
+ values = ff * [-1, 0, 1]
29
+ values.map { '%.13g' % _1 } #=> ["-1", "0", "1"]
30
+ # It can anti-mirror, equivalent to "not":
31
+ ff.last.mirror(-1)
32
+ values = ff * [-1, 0, 1]
33
+ values.map { '%.13g' % _1 } #=> ["1", "0", "-1"]
34
+
35
+ # It can "and";
36
+ ff = FeedForward.new([2,2,1])
37
+ ff[1].mirror(-1)
38
+ ff.last.connect(ff.first)
39
+ ff.last.average
40
+ # Training "and" pairs:
41
+ pairs = [
42
+ [[1, 1], [1]],
43
+ [[-1, 1], [-1]],
44
+ [[1, -1], [-1]],
45
+ [[-1, -1], [-1]],
46
+ ]
47
+ # Train until values match:
48
+ ff.pairs(pairs) do
49
+ pairs.any? { |input, target| (ff * input).map { _1.round(1) } != target }
50
+ end
51
+ (ff * [-1, -1]).map{ _1.round } #=> [-1]
52
+ (ff * [-1, 1]).map{ _1.round } #=> [-1]
53
+ (ff * [ 1, -1]).map{ _1.round } #=> [-1]
54
+ (ff * [ 1, 1]).map{ _1.round } #=> [1]
55
+
56
+ # It can "or";
57
+ ff = FeedForward.new([2,2,1])
58
+ ff[1].mirror(-1)
59
+ ff.last.connect(ff.first)
60
+ ff.last.average
61
+ # Training "or" pairs:
62
+ pairs = [
63
+ [[1, 1], [1]],
64
+ [[-1, 1], [1]],
65
+ [[1, -1], [1]],
66
+ [[-1, -1], [-1]],
67
+ ]
68
+ # Train until values match:
69
+ ff.pairs(pairs) do
70
+ pairs.any? { |input, target| (ff * input).map { _1.round(1) } != target }
71
+ end
72
+ (ff * [-1, -1]).map{ _1.round } #=> [-1]
73
+ (ff * [-1, 1]).map{ _1.round } #=> [1]
74
+ (ff * [ 1, -1]).map{ _1.round } #=> [1]
75
+ (ff * [ 1, 1]).map{ _1.round } #=> [1]
76
+ ```
77
+ ## CONTENTS:
78
+
79
+ * [Neuronet wiki](https://github.com/carlosjhr64/neuronet/wiki)
80
+
81
+ ### Mju
82
+
83
+ Mju is a Marklar whose value depends on which Marklar is asked.
84
+ Other known Marklars are Mu and Kappa.
85
+ Hope it's not confusing...
86
+ I tried to give related Marklars the same name.
87
+ ![Marklar](img/marklar.png)
88
+
89
+ ### Marshal
90
+
91
+ Marshal works with Neuronet to save your networks:
92
+ ```ruby
93
+ dump = Marshal.dump ff
94
+ ff2 = Marshal.load dump
95
+ ff2.inspect == ff.inspect #=> true
96
+ ```
97
+ ### Base
98
+
99
+ * [Requires and autoloads](lib/neuronet.rb)
100
+ * [Constants and lambdas](lib/neuronet/constants.rb)
101
+ * [Connection](lib/neuronet/connection.rb)
102
+ * [Neuron](lib/neuronet/neuron.rb)
103
+ * [Layer](lib/neuronet/layer.rb)
104
+ * [FeedForward](lib/neuronet/feed_forward.rb)
105
+
106
+ ### Scaled
107
+
108
+ * [Scale](lib/neuronet/scale.rb)
109
+ * [Gaussian](lib/neuronet/gaussian.rb)
110
+ * [LogNormal](lib/neuronet/log_normal.rb)
111
+ * [ScaledNetwork](lib/neuronet/scaled_network.rb)
112
+
113
+ ## LICENSE:
114
+
115
+ Copyright (c) 2023 CarlosJHR64
116
+
117
+ Permission is hereby granted, free of charge,
118
+ to any person obtaining a copy of this software and
119
+ associated documentation files (the "Software"),
120
+ to deal in the Software without restriction,
121
+ including without limitation the rights
122
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
123
+ copies of the Software, and
124
+ to permit persons to whom the Software is furnished to do so,
125
+ subject to the following conditions:
126
+
127
+ The above copyright notice and this permission notice
128
+ shall be included in all copies or substantial portions of the Software.
129
+
130
+ THE SOFTWARE IS PROVIDED "AS IS",
131
+ WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
132
+ INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
133
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
134
+ IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
135
+ DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
136
+ TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH
137
+ THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.