neuronet 6.0.1 → 7.0.230416

data/README.md CHANGED
@@ -1,714 +1,137 @@
1
- # Neuronet 6.0.1
1
+ # Neuronet
2
2
 
3
- Library to create neural networks.
4
-
5
- * Gem: <https://rubygems.org/gems/neuronet>
6
- * Git: <https://github.com/carlosjhr64/neuronet>
7
- * Author: <carlosjhr64@gmail.com>
8
- * Copyright: 2013
9
- * License: [GPL](http://www.gnu.org/licenses/gpl.html)
10
-
11
- ## Installation
12
-
13
- gem install neuronet
14
-
15
- ## Synopsis
16
-
17
- Given some set of inputs (of at least length 3) and
18
- targets that are Arrays of Floats. Then:
19
-
20
- # data = [ [input, target], ... ]
21
- # n = input.length # >= 3
22
- # t = target.length
23
- # m = n + t
24
- # l = data.length
25
- # Then:
26
- # Create a general purpose neuronet
27
-
28
- neuronet = Neuronet::ScaledNetwork.new([n, m, t])
29
-
30
- # "Bless" it as a TaoYinYang,
31
- # a perceptron hybrid with the middle layer
32
- # initially mirroring the input layer and
33
- # mirrored by the output layer.
34
-
35
- Neuronet::TaoYinYang.bless(neuronet)
36
-
37
- # The following sets the learning constant
38
- # to something I think is reasonable.
39
-
40
- neuronet.num(l)
41
-
42
- # Start training
43
-
44
- MANY.times do
45
- data.shuffle.each do |input, target|
46
- neuronet.reset(input)
47
- neuronet.train!(target)
48
- end
49
- end # or until some small enough error
50
-
51
- # See how well the training went
52
-
53
- require 'pp'
54
- data.each do |input, target|
55
- puts "Input:"
56
- pp input
57
- puts "Output:"
58
- neuronet.reset(input) # sets the input values
59
- pp neuronet.output # gets the output values
60
- puts "Target:"
61
- pp target
62
- end
63
-
64
- ## Introduction
65
-
66
- Neuronet is a pure Ruby 1.9, sigmoid squashed, neural network building library.
67
- It allows one to build a network by connecting one neuron at a time, or a layer at a time,
68
- or up to a full feed forward network that automatically scales the inputs and outputs.
69
-
70
- I chose a TaoYinYang'ed ScaledNetwork neuronet for the synopsis because
71
- it will probably handle most anything with 3 or more input variables you'd throw at it.
72
- But there's a lot you can do to the data before throwing it at a neuronet.
73
- And you can build a neuronet specifically to solve a particular kind of problem.
74
- Properly transforming the data and choosing the right neuronet architecture
75
- can greatly reduce the amount of training time the neuronet will require.
76
- A neuronet with the wrong architecture for a problem will be unable to solve it.
77
- Raw data without hints as to what's important in the data will take longer to solve.
78
-
79
- As an analogy, think of what you can do with
80
- [linear regression](http://en.wikipedia.org/wiki/Linear_regression).
81
- Your raw data might not be linear, but if a transform converts it to a linear form,
82
- you can use linear regression to find the best fit line, and
83
- from that deduce the properties of the untransformed data.
84
- Likewise, if you can transform the data into something the neuronet can solve,
85
- you can invert the transform to get back the answer you're looking for.
86
-
87
- # Examples
88
-
89
- ## Time Series
90
-
91
- A common use for a neural net is to attempt to forecast a future set of data points
92
- based on a past set of data points, a [Time series](http://en.wikipedia.org/wiki/Time_series).
93
- To demonstrate, I'll train a network with the following function:
94
-
95
- f(t) = A + B sine(C + D t), t in [0,1,2,3,...]
96
-
97
- I'll set A, B, C, and D with random numbers and see
98
- if eventually the network can predict the next set of values based on previous values.
99
- I'll try:
100
-
101
- [f(n),...,f(n+19)] => [f(n+20),...,f(n+24)]
102
-
103
- That is... given 20 consecutive values, give the next 5 in the series.
104
- There is no loss, and probably greater generality,
105
- if I set at random the phase (C above), so that for any given random phase we want:
106
-
107
- [f(0),...,f(19)] => [f(20),...,f(24)]
108
-
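Here is a minimal plain-Ruby sketch of how such training pairs could be generated; it follows the windowing described above (the random ranges for A, B, and D are arbitrary choices for the sketch, not values from the example script):

```ruby
# Sketch only: generate [f(0)..f(19)] => [f(20)..f(24)] pairs with a random phase.
a, b, d = 6.0 * rand, 3.0 * rand, 2.0 * rand     # arbitrary A, B, D
pairs = 100.times.map do
  c = 2.0 * Math::PI * rand                      # random phase C per pair
  f = ->(t) { a + b * Math.sin(c + d * t) }
  [(0..19).map { |t| f.(t) }, (20..24).map { |t| f.(t) }]
end
input, target = pairs.first
input.length  #=> 20
target.length #=> 5
```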
109
- I'll be using [Neuronet::ScaledNetwork](http://rubydoc.info/gems/neuronet/Neuronet/ScaledNetwork).
110
- Also note that the Sine function is entirely defined within a cycle (2*Math::PI), and
111
- so the parameters (particularly C) need only be set within this cycle.
112
- After a lot of testing, I've verified that a
113
- [Perceptron](http://en.wikipedia.org/wiki/Perceptron) is enough to solve the problem.
114
- The Sine function is [Linearly separable](http://en.wikipedia.org/wiki/Linearly_separable).
115
- Adding hidden layers needlessly adds training time, but the network does still converge.
116
-
117
- The gist of the
118
- [example code](https://github.com/carlosjhr64/neuronet/blob/master/examples/sine_series.rb)
119
- is:
120
-
121
- ...
122
- # The constructor
123
- neuronet = Neuronet::ScaledNetwork.new([INPUTS, OUTPUTS])
124
- ...
125
- # Setting learning constant
126
- neuronet.num(1.0)
127
- ...
128
- # Setting the input values
129
- neuronet.reset(input)
130
- ...
131
- # Getting the neuronet's output
132
- output = neuronet.output
133
- ...
134
- # Training the target
135
- neuronet.train!(target)
136
- ...
137
-
138
- Here's a sample output:
139
-
140
- f(phase, t) = 3.002 + 3.28*Sin(phase + 1.694*t)
141
- Cycle step = 0.27
142
-
143
- Iterations: 1738
144
- Relative Error (std/B): 0.79% Standard Deviation: 0.026
145
- Examples:
146
-
147
- Input: 0.522, 1.178, 5.932, 4.104, -0.199, 2.689, 6.28, 2.506, -0.154, 4.276, 5.844, 1.028, 0.647, 5.557, 4.727, 0.022, 2.011, 6.227, 3.198, -0.271
148
- Target: 3.613, 6.124, 1.621, 0.22, 5.069
149
- Output: 3.575, 6.101, 1.664, 0.227, 5.028
150
-
151
- Input: 5.265, 5.079, 0.227, 1.609, 6.12, 3.626, -0.27, 3.184, 6.229, 2.024, 0.016, 4.716, 5.565, 0.656, 1.017, 5.837, 4.288, -0.151, 2.493, 6.28
152
- Target: 2.703, -0.202, 4.091, 5.938, 1.189
153
- Output: 2.728, -0.186, 4.062, 5.931, 1.216
154
-
155
- Input: 5.028, 0.193, 1.669, 6.14, 3.561, -0.274, 3.25, 6.217, 1.961, 0.044, 4.772, 5.524, 0.61, 1.07, 5.87, 4.227, -0.168, 2.558, 6.281, 2.637
156
- Target: -0.188, 4.153, 5.908, 1.135, 0.557
157
- Output: -0.158, 4.112, 5.887, 1.175, 0.564
158
-
159
- ScaledNetwork automatically scales each input via
160
- [Neuronet::Gaussian](http://rubydoc.info/gems/neuronet/Neuronet/Gaussian),
161
- so the input needs to have many variables, and
162
- the output must be entirely determined by the shape of the input and not its scale.
163
- That is, two inputs that are different only in scale should
164
- produce outputs that are different only in scale.
165
- The input must have at least three points.
166
-
167
- You can tackle many problems just with
168
- [Neuronet::ScaledNetwork](http://rubydoc.info/gems/neuronet/Neuronet/ScaledNetwork)
169
- as described above.
170
-
171
- # Component Architecture
172
-
173
- ## Nodes and Neurons
174
-
175
- [Nodes](http://rubydoc.info/gems/neuronet/Neuronet/Node)
176
- are used to set inputs while
177
- [Neurons](http://rubydoc.info/gems/neuronet/Neuronet/Neuron)
178
- are used for outputs and middle layers.
179
- It's easy to create and connect Nodes and Neurons.
180
- You can assemble custom neuronets one neuron at a time.
181
- To illustrate, here's a simple network that adds two random numbers.
182
-
183
- require 'neuronet'
184
- include Neuronet
185
-
186
- def random
187
- rand - rand
188
- end
189
-
190
- # create the input nodes
191
- a = Node.new
192
- b = Node.new
193
-
194
- # create the output neuron
195
- sum = Neuron.new
196
-
197
- # and a neuron on the side
198
- adjuster = Neuron.new
199
-
200
- # connect the adjuster to a and b
201
- adjuster.connect(a)
202
- adjuster.connect(b)
203
-
204
- # connect sum to a and b
205
- sum.connect(a)
206
- sum.connect(b)
207
- # and to the adjuster
208
- sum.connect(adjuster)
209
-
210
- # The learning constant is about...
211
- learning = 0.1
212
-
213
- # Train the tiny network
214
- 10_000.times do
215
- a.value = x = random
216
- b.value = y = random
217
- target = x+y
218
- output = sum.update
219
- sum.backpropagate(learning*(target-output))
220
- end
221
-
222
- # Let's see how well the training went
223
- 10.times do
224
- a.value = x = random
225
- b.value = y = random
226
- target = x+y
227
- output = sum.update
228
- puts "#{x.round(3)} + #{y.round(3)} = #{target.round(3)}"
229
- puts " Neuron says #{output.round(3)}, #{(100.0*(target-output)/target).round(2)}% error."
230
- end
231
-
232
-
233
- Here's a sample output:
234
-
235
- 0.003 + -0.413 = -0.41
236
- Neuron says -0.413, -0.87% error.
237
- -0.458 + 0.528 = 0.07
238
- Neuron says 0.07, -0.45% error.
239
- 0.434 + -0.125 = 0.309
240
- Neuron says 0.313, -1.43% error.
241
- -0.212 + 0.34 = 0.127
242
- Neuron says 0.131, -2.83% error.
243
- -0.364 + 0.659 = 0.294
244
- Neuron says 0.286, 2.86% error.
245
- 0.045 + 0.323 = 0.368
246
- Neuron says 0.378, -2.75% error.
247
- 0.545 + 0.901 = 1.446
248
- Neuron says 1.418, 1.9% error.
249
- -0.451 + -0.486 = -0.937
250
- Neuron says -0.944, -0.77% error.
251
- -0.008 + 0.219 = 0.211
252
- Neuron says 0.219, -3.58% error.
253
- 0.61 + 0.554 = 1.163
254
- Neuron says 1.166, -0.25% error.
255
-
256
- Note that the tiny neuronet has a limit on how precisely it can match the target, and
257
- even after a million training iterations it won't do any better than it does after a few thousand.
258
- [code](https://github.com/carlosjhr64/neuronet/blob/master/examples/neurons.rb)
259
-
260
-
261
- ## InputLayer and Layer
262
-
263
- Instead of working with individual neurons, you can work with layers.
264
- Here we build a [Perceptron](http://en.wikipedia.org/wiki/Perceptron):
265
-
266
- input = InputLayer.new(9) # `in` is a Ruby keyword, so name the variable `input`
267
- out = Layer.new(1)
268
- out.connect(input)
269
-
270
- When making connections keep in mind "outputs connect to inputs",
271
- not the other way around.
272
- You can set the input values and update this way:
273
-
274
- input.set([1,2,3,4,5,6,7,8,9])
275
- out.partial
276
-
277
- Partial means the update won't travel farther than the current layer,
278
- which is all we have in this case anyway.
279
- You get the output this way:
280
-
281
- output = out.output # returns an array of values
282
-
283
- You train this way:
284
-
285
- target = [1] #<= whatever value you want in the array
286
- learning = 0.1
287
- out.train(target, learning)
288
-
289
- ## FeedForward Network
290
-
291
- Most of the time, you'll just use a network created with the
292
- [FeedForward](http://rubydoc.info/gems/neuronet/Neuronet/FeedForward) class,
293
- or a modified version or subclass of it.
294
- Here we build a neuronet with four layers.
295
- The input layer has four neurons, and the output has three.
296
- Then we train it with a list of inputs and targets
297
- using the method [#exemplar](http://rubydoc.info/gems/neuronet/Neuronet/FeedForward:exemplar):
298
-
299
- neuronet = Neuronet::FeedForward.new([4,5,6,3])
300
- LIST.each do |input, target|
301
- neuronet.exemplar(input, target)
302
- # you could also train this way:
303
- # neuronet.set(input)
304
- # neuronet.train!(target)
305
- end
306
-
307
- The first layer is the input layer and the last layer is the output layer.
308
- Neuronet also names the second and the second-to-last layers.
309
- The second layer is called yin.
310
- The second-to-last layer is called yang.
311
- For the example above, we can check their lengths.
312
-
313
- puts neuronet.in.length #=> 4
314
- puts neuronet.yin.length #=> 5
315
- puts neuronet.yang.length #=> 6
316
- puts neuronet.out.length #=> 3
317
-
318
- ## Tao, Yin, and Yang
319
-
320
- Tao
321
- : The absolute principle underlying the universe,
322
- combining within itself the principles of yin and yang and
323
- signifying the way, or code of behavior,
324
- that is in harmony with the natural order.
325
-
326
- Perceptrons are already very capable and quick to train.
327
- By connecting the input layer to the output layer of a multilayer FeedForward network,
328
- you'll get the Perceptron solution quicker while the middle layers work on the harder problem.
329
- You can do that this way:
330
-
331
- neuronet.out.connect(neuronet.in)
332
-
333
- But giving that a name, [Tao](http://rubydoc.info/gems/neuronet/Neuronet/Tao),
334
- and using a prototype pattern to modify the instance is more fun:
335
-
336
- Tao.bless(neuronet)
337
-
338
- Yin
339
- : The passive female principle of the universe, characterized as female and
340
- sustaining and associated with earth, dark, and cold.
341
-
342
- Initially FeedForward sets the weights of all connections to zero.
343
- That is, there is no association made from input to output.
344
- Changes in the inputs have no effect on the output.
345
- Training begins the process that sets the weights to associate the two.
346
- But you can also manually set the initial weights.
347
- One useful way to initially set the weights is to have one layer mirror another.
348
- The [Yin](http://rubydoc.info/gems/neuronet/Neuronet/Yin) bless makes yin mirror the input.
349
-
350
- Yin.bless(neuronet)
351
-
352
- Yang
353
- : The active male principle of the universe, characterized as male and
354
- creative and associated with heaven, heat, and light.
355
-
356
- On the other hand, the [Yang](http://rubydoc.info/gems/neuronet/Neuronet/Yang)
357
- bless makes the output mirror yang.
358
-
359
- Yang.bless(neuronet)
360
-
361
- Bless
362
- : Pronounce words in a religious rite, to confer or invoke divine favor upon.
363
-
364
- The reason Tao, Yin, and Yang are not classes unto themselves is that
365
- you can combine these, and a prototype pattern (bless) works better in this case.
366
- Bless is the keyword used in [Perl](http://www.perl.org/) to create objects,
367
- so it's not without precedent.
368
- To combine all three features, Tao, Yin, and Yang, do this:
3
+ * [VERSION 7.0.230416](https://github.com/carlosjhr64/neuronet/releases)
4
+ * [github](https://github.com/carlosjhr64/neuronet)
5
+ * [rubygems](https://rubygems.org/gems/neuronet)
369
6
 
370
- Tao.bless Yin.bless Yang.bless neuronet
7
+ ## DESCRIPTION:
371
8
 
372
- To save typing, the library provides the possible combinations.
373
- For example:
374
-
375
- TaoYinYang.bless neuronet
376
-
377
- # Scaling The Problem
378
-
379
- The squashing function, sigmoid, maps real numbers (negative infinity, positive infinity)
380
- to the segment zero to one (0,1).
381
- But for the sake of computation in a neural net,
382
- sigmoid works best if the problem is scaled to numbers
383
- between negative one and positive one (-1, 1).
384
- Study the following table and see if you can see why:
385
-
386
- x => sigmoid(x)
387
- 9 => 0.99987...
388
- 3 => 0.95257...
389
- 2 => 0.88079...
390
- 1 => 0.73105...
391
- 0 => 0.50000...
392
- -1 => 0.26894...
393
- -2 => 0.11920...
394
- -3 => 0.04742...
395
- -9 => 0.00012...
396
-
397
- As x gets much higher than 3, sigmoid(x) gets to be pretty close to just 1, and
398
- as x gets much lower than -3, sigmoid(x) gets to be pretty close to 0.
399
- Note that sigmoid is centered about 0.5 which maps to 0.0 in problem space.
400
- It is for this reason that I suggest the problem be displaced (subtracted)
401
- by its average to be centered about zero and scaled (divided) by its standard deviation.
402
- Try to get most of the data to fit within sigmoid's central "field of view" (-1, 1).
403
-
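As a concrete illustration of that suggestion, here is a plain-Ruby sketch (no library calls) that centers a list by its average and scales it by its standard deviation:

```ruby
values  = [20.0, 30.0, 25.0, 45.0, 30.0]
average = values.sum / values.length
stddev  = Math.sqrt(values.sum { |v| (v - average)**2 } / values.length)
scaled  = values.map { |v| (v - average) / stddev }
# `scaled` is now centered about zero with unit spread,
# so most of it lies within sigmoid's central field of view (-1, 1).
```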
404
- ## Scale, Gaussian, and Log Normal
405
-
406
- Neuronet provides three classes to help scale the problem space.
407
- [Neuronet::Scale](http://rubydoc.info/gems/neuronet/Neuronet/Scale)
408
- is the simplest and most straightforward.
409
- It finds the range and center of a list of values, and
410
- linearly transforms it to a range of (-1,1) centered at 0.
411
- For example:
412
-
413
- scale = Neuronet::Scale.new
414
- values = [ 1, -3, 5, -2 ]
415
- scale.set( values )
416
- mapped = scale.mapped( values )
417
- puts mapped.join(', ') # 0.0, -1.0, 1.0, -0.75
418
- puts scale.unmapped( mapped ).join(', ') # 1.0, -3.0, 5.0, -2.0
419
-
420
- The mapping is the following:
421
-
422
- center = (maximum + minimum) / 2.0 if center.nil? # calculate center if not given
423
- spread = (maximum - minimum) / 2.0 if spread.nil? # calculate spread if not given
424
- inputs.map{ |value| (value - center) / (factor * spread) }
425
-
426
- One can change the range of the map to (-1/factor, 1/factor)
427
- where factor is the spread multiplier and force
428
- a (perhaps pre-calculated) value for center and spread.
429
- The constructor is:
430
-
431
- scale = Neuronet::Scale.new( factor=1.0, center=nil, spread=nil )
432
-
433
- In the constructor, if the value of center is provided, then
434
- that value will be used instead of being calculated from the values passed to the set method.
435
- Likewise, if spread is provided, that value of spread will be used.
436
-
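To make the roles of center, spread, and factor concrete, here is a toy re-implementation of just the quoted formulas in plain Ruby; it reproduces the earlier Scale example, but it is a sketch, not the library's actual source:

```ruby
# Toy version of the mapping formulas quoted above (not the library's Scale class).
class ToyScale
  def initialize(factor = 1.0, center = nil, spread = nil)
    @factor, @center, @spread = factor, center, spread
  end

  def set(values)
    @center ||= (values.max + values.min) / 2.0   # calculate center if not given
    @spread ||= (values.max - values.min) / 2.0   # calculate spread if not given
    self
  end

  def mapped(values)
    values.map { |v| (v - @center) / (@factor * @spread) }
  end

  def unmapped(values)
    values.map { |v| (v * @factor * @spread) + @center }
  end
end

toy = ToyScale.new.set([1, -3, 5, -2])
toy.mapped([1, -3, 5, -2])            #=> [0.0, -1.0, 1.0, -0.75]
toy.unmapped([0.0, -1.0, 1.0, -0.75]) #=> [1.0, -3.0, 5.0, -2.0]
```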
437
- [Neuronet::Gaussian](http://rubydoc.info/gems/neuronet/Neuronet/Gaussian)
438
- works the same way, except that it uses the average value of the list given
439
- for the center, and the standard deviation for the spread.
440
-
441
- And [Neuronet::LogNormal](http://rubydoc.info/gems/neuronet/Neuronet/LogNormal)
442
- is just like Gaussian except that it first pipes values through a logarithm, and
443
- then pipes the output back through exponentiation.
444
-
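A plain-Ruby sketch of those two ideas (the formulas only, not the library's implementation); Gaussian standardizes the raw values, while LogNormal does the same on their logarithms and exponentiates on the way back:

```ruby
values  = [1.0, 2.0, 4.0, 8.0]
mean    = values.sum / values.length
stddev  = Math.sqrt(values.sum { |v| (v - mean)**2 } / values.length)
gaussian_mapped = values.map { |v| (v - mean) / stddev }

# LogNormal: same idea, but on the logarithms of the values...
logs     = values.map { |v| Math.log(v) }
log_mean = logs.sum / logs.length
log_std  = Math.sqrt(logs.sum { |l| (l - log_mean)**2 } / logs.length)
lognormal_mapped = logs.map { |l| (l - log_mean) / log_std }
# ...and unmapping pipes back through exponentiation:
unmapped = lognormal_mapped.map { |m| Math.exp(m * log_std + log_mean) }
unmapped.map { |v| v.round(6) } #=> [1.0, 2.0, 4.0, 8.0]
```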
445
- ## ScaledNetwork
446
-
447
- [Neuronet::ScaledNetwork](http://rubydoc.info/gems/neuronet/Neuronet/ScaledNetwork)
448
- automates the problem space scaling.
449
- You can choose to do your scaling over the entire data set if you think
450
- the relative scale of the individual inputs matters.
451
- For example, if in the problem one apple is good but two is too many...
452
- In that case do this:
453
-
454
- scaled_network.distribution.set( data_set.flatten )
455
- data_set.each do |inputs,outputs|
456
- # ... do your stuff using scaled_network.set( inputs )
457
- end
458
-
459
- If on the other hand the scale of the individual inputs is not the relevant feature,
460
- you can do your scaling per individual input.
461
- For example a small apple is an apple, and so is the big one. They're both apples.
462
- Then do this:
463
-
464
- data_set.each do |inputs,outputs|
465
- # ... do your stuff using scaled_network.reset( inputs )
466
- end
467
-
468
- Note that in the first case you are using
469
- [#set](http://rubydoc.info/gems/neuronet/Neuronet/ScaledNetwork:set)
470
- and in the second case you are using
471
- [#reset](http://rubydoc.info/gems/neuronet/Neuronet/ScaledNetwork:reset).
472
-
473
- # Pitfalls
474
-
475
- When sub-classing a Neuronet::Scale type class,
476
- make sure mapped\_input, mapped\_output, unmapped\_input,
477
- and unmapped\_output are defined as you intended.
478
- If you don't override them, they will point to the first ancestor that defines them.
479
- Overriding #mapped does not piggyback the aliases and
480
- they will continue to point to the original #mapped method.
481
-
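The alias behavior involved is plain Ruby, so a tiny hypothetical example (classes made up for illustration) shows the trap:

```ruby
# Hypothetical classes demonstrating the alias pitfall described above.
class Parent
  def mapped(v)
    v
  end
  alias mapped_input mapped    # alias copies Parent#mapped as it is now
end

class Child < Parent
  def mapped(v)                # overrides #mapped only
    2 * v
  end
end

Child.new.mapped(3)       #=> 6
Child.new.mapped_input(3) #=> 3  # the alias still points to Parent#mapped
```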
482
- Another pitfall is confusing the input/output flow in connections and back-propagation.
483
- Remember to connect outputs to inputs (out.connect(in)) and
484
- to back-propagate from outputs to inputs (out.train(targets)).
485
-
486
- # Interesting Custom Networks
487
-
488
- Note that a particularly interesting YinYang with n inputs and m outputs
489
- would be constructed this way:
490
-
491
- yinyang = YinYang.bless FeedForward.new( [n, n+m, m] )
492
-
493
- Here yinyang's hidden layer (which is both yin and yang)
494
- initially would have the first n neurons mirror the input and
495
- the last m neurons be mirrored by the output.
496
- Another interesting YinYang would be:
497
-
498
- yinyang = YinYang.bless FeedForward.new( [n, n, n] )
499
-
500
- The following code demonstrates what is meant by "mirroring":
501
-
502
- yinyang = YinYang.bless FeedForward.new( [3, 3, 3] )
503
- yinyang.set( [-1,0,1] )
504
- puts yinyang.in.map{|x| x.activation}.join(', ')
505
- puts yinyang.yin.map{|x| x.activation}.join(', ')
506
- puts yinyang.out.map{|x| x.activation}.join(', ')
507
- puts yinyang.output.join(', ')
508
-
509
- Here's the output:
510
-
511
- 0.268941421369995, 0.5, 0.731058578630005
512
- 0.442490985892539, 0.5, 0.557509014107461
513
- 0.485626707638021, 0.5, 0.514373292361979
514
- -0.0575090141074614, 0.0, 0.057509014107461
515
-
516
- # Theory
517
-
518
- ## The Biological Description of a Neuron
519
-
520
- Usually a neuron is described as being either on or off.
521
- I think it is more useful to describe a neuron as having a pulse rate.
522
- A neuron would either have a high or a low pulse rate.
523
- In the absence of any stimuli from neighboring neurons, the neuron may also have a rest pulse rate.
524
- A neuron receives stimuli from other neurons through the axons that connect them.
525
- These axons communicate to the receiving neuron the pulse rates of the transmitting neurons.
526
- The signals from other neurons are either strengthened or weakened at the synapse, and
527
- might either inhibit or excite the receiving neuron.
528
- Regardless of how much stimuli the neuron gets,
529
- a neuron has a maximum pulse rate it cannot exceed.
530
-
531
- ## The Mathematical Model of a Neuron
532
-
533
- Since my readers here are probably Ruby programmers, I'll write the math in a Ruby-ish way.
534
- Allow me to sum this way:
535
-
536
- module Enumerable
537
- def sum
538
- map{|a| yield(a)}.inject(0, :+)
539
- end
540
- end
541
- [1,2,3].sum{|i| 2*i} == 2+4+6 # => true
542
-
543
- Can I convince you that taking the derivative of a function looks like this?
544
-
545
- def d(x)
546
- dx = SMALL # some small step, e.g. 1e-6
547
- f = yield(x)
548
- (yield(x+dx) - f)/dx
549
- end
550
- dfdx = d(a){|x| f(x)}
551
-
552
- So the Ruby-ish way to write one of the rules of Calculus is:
553
-
554
- d{|x| Ax^n} == nAx^(n-1)
555
-
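As a sanity check that this d-notation behaves like a derivative, here's a small numeric test; SMALL is given a concrete value here just for the sketch:

```ruby
SMALL = 1.0e-6
def d(x)
  dx = SMALL
  f = yield(x)
  (yield(x + dx) - f) / dx
end

# d{|x| 5x^3} == 15x^2, so at x = 2 the slope should be about 60:
d(2.0) { |x| 5.0 * x**3 }.round(2) #=> 60.0
```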
556
- We won't bother distinguishing integers from floats.
557
- The sigmoid function is:
558
-
559
- def sigmoid(x)
560
- 1/(1+exp(-x))
561
- end
562
- sigmoid(a) == 1/(1+exp(-a))
563
-
564
- A neuron's pulserate increases with increasing stimulus, so
565
- we need a model that adds up all the stimuli a neuron gets.
566
- The sum of all stimuli we will call the neuron's value.
567
- (I find this confusing, but
568
- it works out that it is this sum that will give us the problem space value.)
569
- To model the neuron's rest pulse, we'll say that it has a bias value, its own stimulus.
570
- Stimuli from other neurons comes through the connections,
571
- so there is a sum over all the connections.
572
- The stimuli from other transmitting neurons are proportional to their own pulse rates and
573
- the weight the receiving neuron gives them.
574
- In the model we will call the pulserate the neuron's activation.
575
- Lastly, to more closely match the code, a neuron is a node.
576
- This is what we have so far:
577
-
578
- value = bias + connections.sum{|connection| connection.weight * connection.node.activation }
579
-
580
- # or by their biological synonyms
581
-
582
- stimulus = unsquashed_rest_pulse_rate +
583
- connections.sum{|connection| connection.weight * connection.neuron.pulserate}
584
-
585
- Unsquashed rest pulse rate? Yeah, I'm about to close the loop here.
586
- As described, a neuron can have a very low pulse rate, effectively zero,
587
- and a maximum pulse which I will define as being one.
588
- The sigmoid function will take any amount it gets and
589
- squashes it to a number between zero and one,
590
- which is what we need to model the neuron's behavior.
591
- To get the node's activation (aka neuron's pulserate)
592
- from the node's value (aka neuron's stimulus),
593
- we squash the value with the sigmoid function.
594
-
595
- # the node's activation from its value
596
- activation = sigmoid(value)
597
-
598
- # or by their biological synonyms
599
-
600
- # the neuron's pulserate from its stimulus
601
- pulserate = sigmoid(stimulus)
602
-
603
- So the "rest pulse rate" is sigmoid("unsquashed rest pulse rate").
604
-
605
- ## Backpropagation of Errors
606
-
607
- There's a lot of really complicated math in understanding how neural networks work.
608
- But if we concentrate on just the part pertinent to the backpropagation code, it's not that bad.
609
- The trick is to do the analysis in the problem space (otherwise things get real ugly).
610
- When we train a neuron, we want the neuron's value to match a target as closely as possible.
611
- The deviation from the target is the error:
612
-
613
- error = target - value
614
-
615
- Where does the error come from?
616
- It comes from deviations from the ideal bias and weights the neuron should have.
617
-
618
- target = value + error
619
- target = bias + bias_error +
620
- connections.sum{|connection| (connection.weight + weight_error) * connection.node.activation }
621
- error = bias_error + connections.sum{|connection| weight_error * connection.node.activation }
622
-
623
- Next we assume that the errors are equally likely everywhere,
624
- so that the bias error is expected to be the same on average as the weight error.
625
- That's where the learning constant comes in.
626
- We need to divide the error equally among all contributors, say 1/N.
627
- Then:
628
-
629
- error = error/N + connections.sum{|connection| error/N * connection.node.activation }
630
-
631
- Note that if the equation above represents the entire network, then
632
-
633
- N = 1 + connections.length
634
-
635
- So now that we know the error, we can modify the bias and weights.
636
-
637
- bias += error/N
638
- connection.weight += connection.node.activation * error/N
639
-
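Here is that update rule worked through for a single neuron in plain Ruby; it is a sketch of the arithmetic above, not the library's train!/backpropagate implementation:

```ruby
# One neuron, two connections, all parameters starting at zero.
activations = [0.3, 0.7]          # connection.node.activation values
weights     = [0.0, 0.0]
bias        = 0.0

value  = bias + weights.each_with_index.sum { |w, i| w * activations[i] }
target = 0.5
error  = target - value           #=> 0.5
n      = 1 + weights.length       # the bias and each connection contribute

bias += error / n
weights = weights.each_with_index.map { |w, i| w + activations[i] * error / n }

# One step moves the value toward the target:
new_value = bias + weights.each_with_index.sum { |w, i| w * activations[i] }
new_value.round(4) #=> 0.2633
```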
640
- The Calculus is:
641
-
642
- d{|bias| bias + connections.sum{|connection| connection.weight * connection.node.activation }}
643
- == d{|bias| bias}
644
-
645
- d{|connection.weight| bias + connections.sum{|connection| connection.weight * connection.node.activation }}
646
- == connection.node.activation * d{|weight| connection.weight }
647
-
648
- So what's all the ugly math you'll see elsewhere?
649
- Well, you can try to do the above analysis in neuron space.
650
- Then you're inside the squash function.
651
- I'll just show the derivative of the sigmoid function:
652
-
653
- d{|x| sigmoid(x)} ==
654
- d{|x| 1/(1+exp(-x))} ==
655
- -1/(1+exp(-x))^2 * d{|x| 1+exp(-x)} ==
656
- -1/(1+exp(-x))^2 * d{|x| exp(-x)} ==
657
- -1/(1+exp(-x))^2 * d{|x| -x}*exp(-x) ==
658
- -1/(1+exp(-x))^2 * (-1)*exp(-x) ==
659
- exp(-x)/(1+exp(-x))^2 ==
660
- (1 + exp(-x) - 1)/(1+exp(-x))^2 ==
661
- ((1 + exp(-x)) - 1)/(1+exp(-x))^2 ==
662
- (1/sigmoid(x) - 1) * sigmoid^2(x) ==
663
- (1 - sigmoid(x)) * sigmoid(x) ==
664
- sigmoid(x)*(1 - sigmoid(x))
665
- # =>
666
- d{|x| sigmoid(x)} == sigmoid(x)*(1 - sigmoid(x))
667
-
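A quick numeric spot check of that identity in plain Ruby:

```ruby
def sigmoid(x)
  1.0 / (1.0 + Math.exp(-x))
end

x  = 0.5
dx = 1.0e-6
numeric  = (sigmoid(x + dx) - sigmoid(x)) / dx
analytic = sigmoid(x) * (1 - sigmoid(x))
(numeric - analytic).abs < 1.0e-5 #=> true
```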
668
- From there you try to find the errors from the point of view of the activation instead of the value.
669
- But as the code clearly shows, the analysis need not get this deep.
670
-
671
- ## Learning Constant
672
-
673
- One can think of a neural network as a sheet of very elastic rubber
674
- which one pokes and pulls to fit the training data while
675
- otherwise keeping the sheet as smooth as possible.
676
- One concern is that the training data may contain noise, random errors.
677
- So the training of the network should add up the true signal in the data
678
- while canceling out the noise. This balance is set via the learning constant.
679
-
680
- neuronet.learning
681
- # Returns the current value of the network's learning constant
682
-
683
- neuronet.learning = float
684
- # where float is greater than zero but less than one.
685
-
686
- By default, Neuronet::FeedForward sets the learning constant to 1/N, where
687
- N is the number of biases and weights in the network
688
- (plus one, just because...). You can get the value of N with
689
- [#mu](http://rubydoc.info/gems/neuronet/Neuronet/FeedForward:mu).
690
-
691
- So I'm now making up a few more names for stuff.
692
- The number of contributors to errors in the network is #mu.
693
- The learning constant based on #mu is
694
- [#muk](http://rubydoc.info/gems/neuronet/Neuronet/FeedForward:muk).
695
- You can modify the learning constant to some fraction of muk, say 0.7, this way:
696
-
697
- neuronet.muk(0.7)
698
-
699
- I've not come across any hard rule for the learning constant.
700
- I have my own intuition derived from the behavior of random walks.
701
- The distance away from a starting point in a random walk is
702
- proportional to the square root of the number of steps.
703
- I conjecture that the number of training data points is related to
704
- the optimal learning constant in the same way.
705
- So I provide a way to set the learning constant based on the size of the data with
706
- [#num](http://rubydoc.info/gems/neuronet/Neuronet/FeedForward:num)
707
-
708
- neuronet.num(n)
709
-
710
- The value of #num(n) is #muk(1.0)/Math.sqrt(n).
711
-
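For a feel of the sizes involved, here is back-of-the-envelope arithmetic for a hypothetical [4, 5, 6, 3] network; it assumes only non-input neurons carry biases and that each layer connects fully to the previous one, so it sketches the stated relationships rather than calling the library's #mu, #muk, or #num:

```ruby
layers  = [4, 5, 6, 3]
biases  = layers[1..-1].sum                          # one bias per non-input neuron
weights = layers.each_cons(2).sum { |a, b| a * b }   # full connections between layers
mu      = 1 + biases + weights                       # "plus one, just because..."
muk     = ->(k) { k / mu.to_f }                      # fraction k of 1/mu
num     = ->(n) { muk.(1.0) / Math.sqrt(n) }         # shrunk by sqrt of data size

mu                 #=> 83
muk.(0.7).round(6) #=> 0.008434
num.(100).round(6) #=> 0.001205
```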
712
- # Questions?
9
+ Library to create neural networks.
713
10
 
714
- Email me!
11
+ This is primarily a math project meant to be used to investigate the behavior of
12
+ different small neural networks.
13
+
14
+ ## INSTALL:
15
+ ```console
16
+ gem install neuronet
17
+ ```
18
+ ## SYNOPSIS:
19
+
20
+ The library is meant to be read, but here is a motivating example:
21
+ ```ruby
22
+ require 'neuronet'
23
+ include Neuronet
24
+
25
+ ff = FeedForward.new([3,3])
26
+ # It can mirror, equivalent to "copy":
27
+ ff.last.mirror
28
+ values = ff * [-1, 0, 1]
29
+ values.map { '%.13g' % _1 } #=> ["-1", "0", "1"]
30
+ # It can anti-mirror, equivalent to "not":
31
+ ff.last.mirror(-1)
32
+ values = ff * [-1, 0, 1]
33
+ values.map { '%.13g' % _1 } #=> ["1", "0", "-1"]
34
+
35
+ # It can "and";
36
+ ff = FeedForward.new([2,2,1])
37
+ ff[1].mirror(-1)
38
+ ff.last.connect(ff.first)
39
+ ff.last.average
40
+ # Training "and" pairs:
41
+ pairs = [
42
+ [[1, 1], [1]],
43
+ [[-1, 1], [-1]],
44
+ [[1, -1], [-1]],
45
+ [[-1, -1], [-1]],
46
+ ]
47
+ # Train until values match:
48
+ ff.pairs(pairs) do
49
+ pairs.any? { |input, target| (ff * input).map { _1.round(1) } != target }
50
+ end
51
+ (ff * [-1, -1]).map{ _1.round } #=> [-1]
52
+ (ff * [-1, 1]).map{ _1.round } #=> [-1]
53
+ (ff * [ 1, -1]).map{ _1.round } #=> [-1]
54
+ (ff * [ 1, 1]).map{ _1.round } #=> [1]
55
+
56
+ # It can "or";
57
+ ff = FeedForward.new([2,2,1])
58
+ ff[1].mirror(-1)
59
+ ff.last.connect(ff.first)
60
+ ff.last.average
61
+ # Training "or" pairs:
62
+ pairs = [
63
+ [[1, 1], [1]],
64
+ [[-1, 1], [1]],
65
+ [[1, -1], [1]],
66
+ [[-1, -1], [-1]],
67
+ ]
68
+ # Train until values match:
69
+ ff.pairs(pairs) do
70
+ pairs.any? { |input, target| (ff * input).map { _1.round(1) } != target }
71
+ end
72
+ (ff * [-1, -1]).map{ _1.round } #=> [-1]
73
+ (ff * [-1, 1]).map{ _1.round } #=> [1]
74
+ (ff * [ 1, -1]).map{ _1.round } #=> [1]
75
+ (ff * [ 1, 1]).map{ _1.round } #=> [1]
76
+ ```
77
+ ## CONTENTS:
78
+
79
+ * [Neuronet wiki](https://github.com/carlosjhr64/neuronet/wiki)
80
+
81
+ ### Mju
82
+
83
+ Mju is a Marklar whose value depends on which Marklar is asked.
84
+ Other known Marklars are Mu and Kappa.
85
+ Hope it's not confusing...
86
+ I tried to give related Marklars the same name.
87
+ ![Marklar](img/marklar.png)
88
+
89
+ ### Marshal
90
+
91
+ Marshal works with Neuronet to save your networks:
92
+ ```ruby
93
+ dump = Marshal.dump ff
94
+ ff2 = Marshal.load dump
95
+ ff2.inspect == ff.inspect #=> true
96
+ ```
97
+ ### Base
98
+
99
+ * [Requires and autoloads](lib/neuronet.rb)
100
+ * [Constants and lambdas](lib/neuronet/constants.rb)
101
+ * [Connection](lib/neuronet/connection.rb)
102
+ * [Neuron](lib/neuronet/neuron.rb)
103
+ * [Layer](lib/neuronet/layer.rb)
104
+ * [FeedForward](lib/neuronet/feed_forward.rb)
105
+
106
+ ### Scaled
107
+
108
+ * [Scale](lib/neuronet/scale.rb)
109
+ * [Gaussian](lib/neuronet/gaussian.rb)
110
+ * [LogNormal](lib/neuronet/log_normal.rb)
111
+ * [ScaledNetwork](lib/neuronet/scaled_network.rb)
112
+
113
+ ## LICENSE:
114
+
115
+ Copyright (c) 2023 CarlosJHR64
116
+
117
+ Permission is hereby granted, free of charge,
118
+ to any person obtaining a copy of this software and
119
+ associated documentation files (the "Software"),
120
+ to deal in the Software without restriction,
121
+ including without limitation the rights
122
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
123
+ copies of the Software, and
124
+ to permit persons to whom the Software is furnished to do so,
125
+ subject to the following conditions:
126
+
127
+ The above copyright notice and this permission notice
128
+ shall be included in all copies or substantial portions of the Software.
129
+
130
+ THE SOFTWARE IS PROVIDED "AS IS",
131
+ WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
132
+ INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
133
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
134
+ IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
135
+ DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
136
+ TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH
137
+ THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.