prolly 0.0.1 → 0.0.2
- checksums.yaml +4 -4
- data/README.markdown +286 -173
- data/lib/prolly/ps.rb +3 -3
- data/lib/prolly/ps/storage/rubylist.rb +1 -3
- metadata +1 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 2c026ea02e8ac2de33f7278ac2fe87e3d8864cf0
|
4
|
+
data.tar.gz: 7f9c5e4186921a36dcf74fe7662aa1b58551fd3e
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 0bdfe8480833e3f61552ae656763f24acff8e8ca331e4c3d5e452286ed29f0b1e537067730fa1df043beec4e680c04c530f0151f0d025c282c837dceedf678f3
|
7
|
+
data.tar.gz: 6b5fcc2c664c0c18c123c0bc8c81bf109623293732696a886db3880212ee1f38898343f463b2e322180bb4f860f3b3d5d4cc959c179d2c63800dfd9e4bbf09a2
|
data/README.markdown
CHANGED
@@ -6,9 +6,10 @@ specifically for answering questions about probabilities of events based on the
|
|
6
6
|
samples you've seen before.
|
7
7
|
|
8
8
|
So instead of counting all the events yourself, you just express
|
9
|
-
probabilities much like how math books express it.
|
10
|
-
probabilities is useful for writing machine learning
|
11
|
-
of abstraction. The right level abstraction makes things
|
9
|
+
probabilities, entropies, and information gain much like how math books express them.
|
10
|
+
Being able to express probabilities is useful for writing machine learning
|
11
|
+
algorithms at a higher level of abstraction. The right level of abstraction makes things
|
12
|
+
easier to build.
|
12
13
|
|
13
14
|
We can now make decisions in code not just based on the current data, like `if`
|
14
15
|
statements do, but we can make decisions based on the chance of prior data and
|
@@ -90,17 +91,99 @@ Ps.rv(color: :blue).given(size: :small).prob
|
|
90
91
|
```
|
91
92
|
That gives you the probability that the random variable Color is :blue given that Size is :small.
|
92
93
|
|
93
|
-
###
|
94
|
+
### Probabilities
|
95
|
+
|
96
|
+
What is the probability there is a blue marble?
|
97
|
+
```ruby
|
98
|
+
# P(C = blue)
|
99
|
+
Ps.rv(color: :blue).prob
|
100
|
+
```
|
101
|
+
|
102
|
+
What is the joint probability there is a blue marble that also has a rough texture?
|
103
|
+
```ruby
|
104
|
+
# P(C = blue, T = rough)
|
105
|
+
Ps.rv(color: :blue, texture: :rough).prob
|
106
|
+
```
|
107
|
+
|
108
|
+
What is the probability a marble is small or med sized?
|
109
|
+
```ruby
|
110
|
+
# P(S = small, med)
|
111
|
+
Ps.rv(size: [:small, :med]).prob
|
112
|
+
```
|
113
|
+
|
114
|
+
What is the probability of a blue marble given that the marble is small?
|
115
|
+
```ruby
|
116
|
+
# P(C = blue | S = small)
|
117
|
+
Ps.rv(color: :blue).given(size: :small).prob
|
118
|
+
```
|
119
|
+
|
120
|
+
What is the probability of a blue marble and rough texture given that the marble is small?
|
121
|
+
```ruby
|
122
|
+
# P(C = blue, T = rough | S = small)
|
123
|
+
Ps.rv(color: :blue, texture: :rough).given(size: :small).prob
|
124
|
+
```
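As a rough illustration of what these queries compute, here is a minimal plain-Ruby sketch that derives a conditional probability from raw counts. It does not use Prolly; `MARBLES`, `count_matching`, and `prob` are hypothetical names made up for this example, and the sample data is invented.

```ruby
# Hypothetical sample data: each event maps a random variable to a value.
MARBLES = [
  { color: :blue,  size: :small, texture: :rough  },
  { color: :blue,  size: :small, texture: :smooth },
  { color: :green, size: :small, texture: :rough  },
  { color: :blue,  size: :large, texture: :rough  }
].freeze

# Count events matching every key/value pair.
# An array value means "any of these values".
def count_matching(data, conditions)
  data.count do |event|
    conditions.all? do |key, val|
      vals = val.is_a?(Array) ? val : [val]
      vals.include?(event[key])
    end
  end
end

# P(rv | given) = count(rv and given) / count(given).
# With an empty `given`, this reduces to the plain (joint) probability.
def prob(data, rv, given = {})
  denominator = count_matching(data, given)
  return 0.0 if denominator.zero?

  count_matching(data, rv.merge(given)).to_f / denominator
end

prob(MARBLES, { color: :blue })                    # P(C = blue) => 0.75
prob(MARBLES, { color: :blue }, { size: :small })  # P(C = blue | S = small)
```

Returning `0.0` when nothing matches the condition is a choice made for this sketch only; a real implementation might raise instead.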
|
125
|
+
|
126
|
+
### Probability density functions
|
127
|
+
|
128
|
+
Probability density for a random variable.
|
129
|
+
```ruby
|
130
|
+
Ps.rv(:color).pdf
|
131
|
+
```
|
132
|
+
|
133
|
+
Probability density for a conditional random variable.
|
134
|
+
```ruby
|
135
|
+
Ps.rv(:color).given(size: :small).pdf
|
136
|
+
```
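The reference below describes the result of `.pdf` as a hash of probabilities for the random variable. A minimal sketch of that idea in plain Ruby follows, assuming every event carries the variable being asked about. `PDF_SAMPLES` and `pdf_of` are hypothetical names for this example, not part of Prolly.

```ruby
# Hypothetical sample data for the sketch.
PDF_SAMPLES = [
  { color: :blue,  size: :small },
  { color: :blue,  size: :small },
  { color: :green, size: :small },
  { color: :blue,  size: :large }
].freeze

# Build a hash of value => probability for one random variable,
# optionally restricted to events that match `given`.
def pdf_of(data, rv, given = {})
  matching = data.select { |event| given.all? { |key, val| event[key] == val } }
  counts = matching.each_with_object(Hash.new(0)) { |event, acc| acc[event[rv]] += 1 }
  counts.transform_values { |n| n.to_f / matching.size }
end

pdf_of(PDF_SAMPLES, :color)                   # distribution over all events
pdf_of(PDF_SAMPLES, :color, { size: :small }) # distribution among small marbles
```

The probabilities in each returned hash sum to 1 (up to floating-point rounding), since they are normalized by the number of matching events.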
|
137
|
+
|
138
|
+
### Entropy
|
139
|
+
|
140
|
+
Entropy of the RV color.
|
141
|
+
```ruby
|
142
|
+
# H(C)
|
143
|
+
Ps.rv(:color).entropy
|
144
|
+
```
|
145
|
+
|
146
|
+
Entropy of color given that the marble is small.
|
147
|
+
```ruby
|
148
|
+
# H(C | S = small)
|
149
|
+
Ps.rv(:color).given(size: :small).entropy
|
150
|
+
```
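For readers who want to see the quantity itself, here is a minimal sketch of Shannon entropy computed from a distribution hash. It assumes a base-2 logarithm (entropy in bits); the README does not say which base Prolly uses, and `shannon_entropy` is a hypothetical helper, not a Prolly method.

```ruby
# Shannon entropy of a distribution given as value => probability.
# Assumes log base 2, so the result is in bits.
# Zero-probability entries are skipped, since p * log(p) tends to 0 as p -> 0.
def shannon_entropy(distribution)
  distribution.values.reject(&:zero?).sum { |p| -p * Math.log2(p) }
end

shannon_entropy({ blue: 0.5, green: 0.5 }) # an even two-way split: 1.0 bit
shannon_entropy({ blue: 1.0 })             # a certain outcome: no uncertainty
```

A flat distribution gives the highest entropy for a given number of outcomes, and a distribution concentrated on one value gives zero.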
|
151
|
+
|
152
|
+
### Information Gain
|
153
|
+
|
154
|
+
Information gain of color and size.
|
155
|
+
```ruby
|
156
|
+
# IG(C | S)
|
157
|
+
Ps.rv(:color).given(:size).infogain
|
158
|
+
```
|
159
|
+
|
160
|
+
Information gain of color and size, when we already know texture and opacity.
|
161
|
+
```ruby
|
162
|
+
# IG(C | S, T=smooth, O=opaque)
|
163
|
+
Ps.rv(:color).given(:size, { texture: :smooth, opacity: :opaque }).infogain
|
164
|
+
```
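One common way to define information gain is IG(C | S) = H(C) - Σ P(S = s) · H(C | S = s). The sketch below computes that from a list of events in plain Ruby. Whether Prolly computes it exactly this way is an assumption; `entropy_of` and `info_gain` are hypothetical names, and base-2 logarithms are assumed.

```ruby
# Shannon entropy (bits, assuming log base 2) of a list of observed values.
def entropy_of(values)
  total = values.size.to_f
  values.group_by(&:itself).values.sum do |group|
    p = group.size / total
    -p * Math.log2(p)
  end
end

# IG(target | feature) = H(target) - sum over s of P(feature = s) * H(target | feature = s)
def info_gain(data, target, feature)
  h_target = entropy_of(data.map { |event| event[target] })
  h_conditional = data.group_by { |event| event[feature] }.values.sum do |subset|
    (subset.size.to_f / data.size) * entropy_of(subset.map { |event| event[target] })
  end
  h_target - h_conditional
end

# Invented data in which size fully determines color.
determined = [
  { color: :blue,  size: :small },
  { color: :blue,  size: :small },
  { color: :green, size: :large },
  { color: :green, size: :large }
]
info_gain(determined, :color, :size) # knowing size removes all uncertainty about color
```

When the feature tells you nothing about the target, the conditional entropy equals the unconditional one and the gain is zero.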
|
165
|
+
|
166
|
+
### Counts
|
167
|
+
|
168
|
+
At the base of all the probabilities are counts of stuff.
|
169
|
+
```ruby
|
170
|
+
Ps.rv(color: :blue).count
|
171
|
+
```
|
172
|
+
|
173
|
+
```ruby
|
174
|
+
Ps.rv(:color).given(:size).count
|
175
|
+
```
|
176
|
+
## Full Reference
|
94
177
|
|
95
178
|
A random variable can be specified, as in `Ps.rv(color: :blue)`, or unspecified, as in `Ps.rv(:color)`. Conditional random variables can likewise be specified or unspecified.
|
96
179
|
|
97
180
|
Prolly currently supports five operations.
|
98
181
|
|
99
|
-
- .prob · Calculates probability, a fractional number representing the belief you have that an event will occur; based on the amount of evidence you've seen for that event.
|
100
|
-
- .pdf · Calculates probability density function, a hash of all possible probabilities for the random variable.
|
101
|
-
- .entropy · Calculates entropy, a fractional number representing the spikiness or smoothness of a density function, which implies how much information is in the random variable.
|
102
|
-
- .infogain · Calculates information gain, a fractional number representing the amount of information (that is, reduction in uncertainty) that knowing either variable provides about the other.
|
103
|
-
- .count · Counts the number of events satisfying the conditions.
|
182
|
+
- .prob() · Calculates probability, a fractional number representing the belief you have that an event will occur; based on the amount of evidence you've seen for that event.
|
183
|
+
- .pdf() · Calculates probability density function, a hash of all possible probabilities for the random variable.
|
184
|
+
- .entropy() · Calculates entropy, a fractional number representing the spikiness or smoothness of a density function, which implies how much information is in the random variable.
|
185
|
+
- .infogain() · Calculates information gain, a fractional number representing the amount of information (that is, reduction in uncertainty) that knowing either variable provides about the other.
|
186
|
+
- .count() · Counts the number of events satisfying the conditions.
|
104
187
|
|
105
188
|
Each of the operations will only work with certain combinations of random variables. The possibilities are listed below, and Prolly will raise an exception if an unsupported combination is used.
|
106
189
|
|
@@ -108,271 +191,301 @@ Legend:
|
|
108
191
|
- ✓ available for this operator
|
109
192
|
- Δ! available, but not yet implemented for this operator.
|
110
193
|
|
194
|
+
### The Probability Operator: .prob()
|
195
|
+
|
111
196
|
<table>
|
112
197
|
<tr>
|
113
|
-
<th
|
114
|
-
<th>
|
115
|
-
<th>.
|
116
|
-
<th>.
|
117
|
-
<th>.
|
118
|
-
<th>.
|
119
|
-
<th>.
|
198
|
+
<th></th>
|
199
|
+
<th>n/a</th>
|
200
|
+
<th>.given(:size)</th>
|
201
|
+
<th>.given(size: :small)</th>
|
202
|
+
<th>.given(size: :small, weight: :fat)</th>
|
203
|
+
<th>.given(:size, weight: :fat)</th>
|
204
|
+
<th>.given(:size, :weight)</th>
|
120
205
|
</tr>
|
121
206
|
<tr>
|
122
|
-
<th>
|
207
|
+
<th>rv(color: :blue)</th>
|
208
|
+
<th>✓</th>
|
209
|
+
<th>✓</th>
|
210
|
+
<th>✓</th>
|
211
|
+
<th>✓</th>
|
123
212
|
<th></th>
|
213
|
+
<th></th>
|
214
|
+
</tr>
|
215
|
+
<tr>
|
216
|
+
<th>rv(color: [:blue, :green])</th>
|
124
217
|
<th>✓</th>
|
125
218
|
<th></th>
|
126
219
|
<th></th>
|
127
220
|
<th></th>
|
128
|
-
<th
|
221
|
+
<th></th>
|
222
|
+
<th></th>
|
129
223
|
</tr>
|
130
224
|
<tr>
|
131
|
-
<th>
|
132
|
-
<th
|
225
|
+
<th>rv(color: :blue, texture: :rough)</th>
|
226
|
+
<th>✓</th>
|
133
227
|
<th>✓</th>
|
228
|
+
<th>✓</th>
|
229
|
+
<th>✓</th>
|
230
|
+
<th></th>
|
231
|
+
<th></th>
|
232
|
+
</tr>
|
233
|
+
<tr>
|
234
|
+
<th>rv(:color)</th>
|
235
|
+
<th></th>
|
236
|
+
<th></th>
|
237
|
+
<th></th>
|
134
238
|
<th></th>
|
135
239
|
<th></th>
|
136
240
|
<th></th>
|
137
|
-
<th>✓</th>
|
138
241
|
</tr>
|
139
242
|
<tr>
|
140
|
-
<th>
|
141
|
-
<th
|
142
|
-
<th
|
243
|
+
<th>rv(:color, :texture)</th>
|
244
|
+
<th></th>
|
245
|
+
<th></th>
|
246
|
+
<th></th>
|
143
247
|
<th></th>
|
144
248
|
<th></th>
|
145
249
|
<th></th>
|
146
|
-
<th>✓</th>
|
147
250
|
</tr>
|
251
|
+
</table>
|
252
|
+
|
253
|
+
### The Probability Density Function Operator: .pdf()
|
254
|
+
|
255
|
+
<table>
|
148
256
|
<tr>
|
149
|
-
<th
|
257
|
+
<th></th>
|
258
|
+
<th>n/a</th>
|
259
|
+
<th>.given(:size)</th>
|
260
|
+
<th>.given(size: :small)</th>
|
150
261
|
<th>.given(size: :small, weight: :fat)</th>
|
151
|
-
<th
|
262
|
+
<th>.given(:size, weight: :fat)</th>
|
263
|
+
<th>.given(:size, :weight)</th>
|
264
|
+
</tr>
|
265
|
+
<tr>
|
266
|
+
<th>rv(color: :blue)</th>
|
267
|
+
<th></th>
|
268
|
+
<th></th>
|
269
|
+
<th></th>
|
152
270
|
<th></th>
|
153
271
|
<th></th>
|
154
272
|
<th></th>
|
155
|
-
<th>✓</th>
|
156
273
|
</tr>
|
157
|
-
<tr>
|
158
|
-
<th>Ps.rv(color: [:blue, :green])</th>
|
159
|
-
<th></th>
|
160
|
-
<th>✓</th>
|
161
|
-
<th></th>
|
162
|
-
<th></th>
|
163
|
-
<th></th>
|
164
|
-
<th>✓</th>
|
165
|
-
</tr>
|
166
274
|
<tr>
|
167
|
-
<th>
|
275
|
+
<th>rv(color: [:blue, :green])</th>
|
276
|
+
<th></th>
|
277
|
+
<th></th>
|
168
278
|
<th></th>
|
169
|
-
<th>✓</th>
|
170
279
|
<th></th>
|
171
280
|
<th></th>
|
172
281
|
<th></th>
|
173
|
-
<th>✓</th>
|
174
282
|
</tr>
|
175
283
|
<tr>
|
176
|
-
<th>
|
177
|
-
<th
|
178
|
-
<th
|
284
|
+
<th>rv(color: :blue, texture: :rough)</th>
|
285
|
+
<th></th>
|
286
|
+
<th></th>
|
287
|
+
<th></th>
|
179
288
|
<th></th>
|
180
289
|
<th></th>
|
181
290
|
<th></th>
|
182
|
-
<th>✓</th>
|
183
291
|
</tr>
|
184
292
|
<tr>
|
185
|
-
<th>
|
186
|
-
<th
|
293
|
+
<th>rv(:color)</th>
|
294
|
+
<th>✓</th>
|
295
|
+
<th>✓</th>
|
296
|
+
<th>✓</th>
|
187
297
|
<th>✓</th>
|
188
298
|
<th></th>
|
189
299
|
<th></th>
|
300
|
+
</tr>
|
301
|
+
<tr>
|
302
|
+
<th>rv(:color, :texture)</th>
|
303
|
+
<th>Δ!</th>
|
304
|
+
<th>Δ!</th>
|
305
|
+
<th>Δ!</th>
|
306
|
+
<th>Δ!</th>
|
307
|
+
<th>Δ!</th>
|
190
308
|
<th></th>
|
191
|
-
<th>✓</th>
|
192
309
|
</tr>
|
310
|
+
</table>
|
311
|
+
|
312
|
+
### The Entropy Operator: .entropy()
|
313
|
+
|
314
|
+
<table>
|
193
315
|
<tr>
|
194
|
-
<th
|
316
|
+
<th></th>
|
317
|
+
<th>n/a</th>
|
318
|
+
<th>.given(:size)</th>
|
319
|
+
<th>.given(size: :small)</th>
|
195
320
|
<th>.given(size: :small, weight: :fat)</th>
|
196
|
-
<th
|
321
|
+
<th>.given(:size, weight: :fat)</th>
|
322
|
+
<th>.given(:size, :weight)</th>
|
323
|
+
</tr>
|
324
|
+
<tr>
|
325
|
+
<th>rv(color: :blue)</th>
|
326
|
+
<th></th>
|
327
|
+
<th></th>
|
328
|
+
<th></th>
|
197
329
|
<th></th>
|
198
330
|
<th></th>
|
199
331
|
<th></th>
|
200
|
-
<th>✓</th>
|
201
332
|
</tr>
|
202
333
|
<tr>
|
203
|
-
<th>
|
334
|
+
<th>rv(color: [:blue, :green])</th>
|
335
|
+
<th></th>
|
336
|
+
<th></th>
|
337
|
+
<th></th>
|
204
338
|
<th></th>
|
205
339
|
<th></th>
|
206
|
-
<th>✓</th>
|
207
|
-
<th>✓</th>
|
208
340
|
<th></th>
|
209
|
-
<th>✓</th>
|
210
341
|
</tr>
|
211
342
|
<tr>
|
212
|
-
<th>
|
213
|
-
<th
|
343
|
+
<th>rv(color: :blue, texture: :rough)</th>
|
344
|
+
<th></th>
|
345
|
+
<th></th>
|
346
|
+
<th></th>
|
347
|
+
<th></th>
|
214
348
|
<th></th>
|
349
|
+
<th></th>
|
350
|
+
</tr>
|
351
|
+
<tr>
|
352
|
+
<th>rv(:color)</th>
|
215
353
|
<th>✓</th>
|
216
354
|
<th>✓</th>
|
217
355
|
<th>✓</th>
|
218
356
|
<th>✓</th>
|
357
|
+
<th>✓</th>
|
358
|
+
<th></th>
|
219
359
|
</tr>
|
220
360
|
<tr>
|
221
|
-
<th>
|
222
|
-
<th
|
223
|
-
<th
|
361
|
+
<th>rv(:color, :texture)</th>
|
362
|
+
<th>✓</th>
|
363
|
+
<th>Δ!</th>
|
224
364
|
<th>✓</th>
|
365
|
+
<th>Δ!</th>
|
225
366
|
<th>✓</th>
|
226
367
|
<th></th>
|
227
|
-
<th>✓</th>
|
228
368
|
</tr>
|
369
|
+
</table>
|
370
|
+
|
371
|
+
### The Information Gain Operator: .infogain()
|
372
|
+
|
373
|
+
<table>
|
229
374
|
<tr>
|
230
|
-
<th
|
375
|
+
<th></th>
|
376
|
+
<th>n/a</th>
|
377
|
+
<th>.given(:size)</th>
|
378
|
+
<th>.given(size: :small)</th>
|
231
379
|
<th>.given(size: :small, weight: :fat)</th>
|
380
|
+
<th>.given(:size, weight: :fat)</th>
|
381
|
+
<th>.given(:size, :weight)</th>
|
382
|
+
</tr>
|
383
|
+
<tr>
|
384
|
+
<th>rv(color: :blue)</th>
|
385
|
+
<th></th>
|
386
|
+
<th></th>
|
387
|
+
<th></th>
|
388
|
+
<th></th>
|
232
389
|
<th></th>
|
233
|
-
<th>✓</th>
|
234
|
-
<th>✓</th>
|
235
390
|
<th></th>
|
236
|
-
<th>✓</th>
|
237
391
|
</tr>
|
238
392
|
<tr>
|
239
|
-
<th>
|
240
|
-
<th
|
393
|
+
<th>rv(color: [:blue, :green])</th>
|
394
|
+
<th></th>
|
395
|
+
<th></th>
|
396
|
+
<th></th>
|
397
|
+
<th></th>
|
241
398
|
<th></th>
|
242
399
|
<th></th>
|
243
|
-
<th>✓</th>
|
244
|
-
<th>✓</th>
|
245
|
-
<th>✓</th>
|
246
400
|
</tr>
|
247
401
|
<tr>
|
248
|
-
<th>
|
402
|
+
<th>rv(color: :blue, texture: :rough)</th>
|
249
403
|
<th></th>
|
250
404
|
<th></th>
|
251
|
-
<th
|
405
|
+
<th></th>
|
406
|
+
<th></th>
|
407
|
+
<th></th>
|
408
|
+
<th></th>
|
409
|
+
</tr>
|
410
|
+
<tr>
|
411
|
+
<th>rv(:color)</th>
|
412
|
+
<th></th>
|
252
413
|
<th>✓</th>
|
253
414
|
<th></th>
|
415
|
+
<th></th>
|
254
416
|
<th>✓</th>
|
417
|
+
<th></th>
|
255
418
|
</tr>
|
256
419
|
<tr>
|
257
|
-
<th>
|
258
|
-
<th
|
420
|
+
<th>rv(:color, :texture)</th>
|
421
|
+
<th></th>
|
422
|
+
<th></th>
|
423
|
+
<th></th>
|
424
|
+
<th></th>
|
259
425
|
<th></th>
|
260
|
-
<th>Δ!</th>
|
261
|
-
<th>Δ!</th>
|
262
426
|
<th></th>
|
263
|
-
<th>✓</th>
|
264
427
|
</tr>
|
428
|
+
</table>
|
429
|
+
|
430
|
+
### The Count Operator: .count()
|
431
|
+
|
432
|
+
<table>
|
265
433
|
<tr>
|
266
|
-
<th>Ps.rv(:color, :texture)</th>
|
267
|
-
<th>.given(size: :small)</th>
|
268
434
|
<th></th>
|
269
|
-
<th
|
435
|
+
<th>n/a</th>
|
436
|
+
<th>.given(:size)</th>
|
437
|
+
<th>.given(size: :small)</th>
|
438
|
+
<th>.given(size: :small, weight: :fat)</th>
|
439
|
+
<th>.given(:size, weight: :fat)</th>
|
440
|
+
<th>.given(:size, :weight)</th>
|
441
|
+
</tr>
|
442
|
+
<tr>
|
443
|
+
<th>rv(color: :blue)</th>
|
444
|
+
<th>✓</th>
|
445
|
+
<th>✓</th>
|
446
|
+
<th>✓</th>
|
447
|
+
<th>✓</th>
|
270
448
|
<th>✓</th>
|
271
|
-
<th></th>
|
272
449
|
<th>✓</th>
|
273
450
|
</tr>
|
274
451
|
<tr>
|
275
|
-
<th>
|
276
|
-
<th
|
277
|
-
<th
|
278
|
-
<th
|
279
|
-
<th
|
280
|
-
<th
|
452
|
+
<th>rv(color: [:blue, :green])</th>
|
453
|
+
<th>✓</th>
|
454
|
+
<th>✓</th>
|
455
|
+
<th>✓</th>
|
456
|
+
<th>✓</th>
|
457
|
+
<th>✓</th>
|
281
458
|
<th>✓</th>
|
282
459
|
</tr>
|
283
460
|
<tr>
|
284
|
-
<th>
|
285
|
-
<th
|
286
|
-
<th
|
287
|
-
<th
|
461
|
+
<th>rv(color: :blue, texture: :rough)</th>
|
462
|
+
<th>✓</th>
|
463
|
+
<th>✓</th>
|
464
|
+
<th>✓</th>
|
465
|
+
<th>✓</th>
|
466
|
+
<th>✓</th>
|
467
|
+
<th>✓</th>
|
468
|
+
</tr>
|
469
|
+
<tr>
|
470
|
+
<th>rv(:color)</th>
|
471
|
+
<th>✓</th>
|
472
|
+
<th>✓</th>
|
473
|
+
<th>✓</th>
|
474
|
+
<th>✓</th>
|
475
|
+
<th>✓</th>
|
476
|
+
<th>✓</th>
|
477
|
+
</tr>
|
478
|
+
<tr>
|
479
|
+
<th>rv(:color, :texture)</th>
|
480
|
+
<th>✓</th>
|
481
|
+
<th>✓</th>
|
482
|
+
<th>✓</th>
|
483
|
+
<th>✓</th>
|
288
484
|
<th>✓</th>
|
289
|
-
<th></th>
|
290
485
|
<th>✓</th>
|
291
486
|
</tr>
|
292
487
|
</table>
|
293
488
|
|
294
|
-
## Examples
|
295
|
-
|
296
|
-
There are examples of using Prolly to write learning algorithms.
|
297
|
-
|
298
|
-
- [Decision Tree](https://github.com/iamwilhelm/prolly/tree/master/examples/decision_tree)
|
299
|
-
|
300
|
-
### Probabilities
|
301
|
-
|
302
|
-
What is the probability there is a blue marble?
|
303
|
-
```ruby
|
304
|
-
# P(C = blue)
|
305
|
-
Ps.rv(color: :blue).prob
|
306
|
-
```
|
307
|
-
|
308
|
-
What is the joint probability there is a blue marble that also has a rough texture?
|
309
|
-
```ruby
|
310
|
-
# P(C = blue, T = rough)
|
311
|
-
Ps.rv(color: :blue, texture: :rough).prob
|
312
|
-
```
|
313
|
-
|
314
|
-
What is the probability a marble is small or med sized?
|
315
|
-
```ruby
|
316
|
-
# P(S = small, med)
|
317
|
-
Ps.rv(size: [:small, :med]).prob
|
318
|
-
```
|
319
|
-
|
320
|
-
What is the probability of a blue marble given that the marble is small?
|
321
|
-
```ruby
|
322
|
-
# P(C = blue | S = small)
|
323
|
-
Ps.rv(color: :blue).given(size: :small).prob
|
324
|
-
```
|
325
|
-
|
326
|
-
What is the probability of a blue marble and rough texture given that the marble is small?
|
327
|
-
```ruby
|
328
|
-
# P(C = blue, T = rough | S = small)
|
329
|
-
Ps.rv(color: :blue, texture: :rough).given(size: :small).prob
|
330
|
-
```
|
331
|
-
|
332
|
-
### Probability density functions
|
333
|
-
|
334
|
-
Probability density for a random variable.
|
335
|
-
```ruby
|
336
|
-
Ps.rv(:color).pdf
|
337
|
-
```
|
338
|
-
|
339
|
-
Probability density for a conditional random variable.
|
340
|
-
```ruby
|
341
|
-
Ps.rv(:color).given(size: :small).pdf
|
342
|
-
```
|
343
|
-
|
344
|
-
### Entropy
|
345
|
-
|
346
|
-
Entropy of the RV color.
|
347
|
-
```ruby
|
348
|
-
# H(C)
|
349
|
-
Ps.rv(:color).entropy
|
350
|
-
```
|
351
|
-
|
352
|
-
Entropy of color given the marble is small
|
353
|
-
```ruby
|
354
|
-
# H(C | S = small)
|
355
|
-
Ps.rv(:color).given(size: :small).entropy
|
356
|
-
```
|
357
|
-
|
358
|
-
### Information Gain
|
359
|
-
|
360
|
-
Information gain of color and size.
|
361
|
-
```ruby
|
362
|
-
# IG(C | S)
|
363
|
-
Ps.rv(:color).given(:size).infogain
|
364
|
-
```
|
365
|
-
### Counts
|
366
|
-
|
367
|
-
At the base of all the probabilities are counts of stuff.
|
368
|
-
```ruby
|
369
|
-
Ps.rv(color: :blue).count
|
370
|
-
```
|
371
|
-
|
372
|
-
```ruby
|
373
|
-
Ps.rv(:color).given(:size).count
|
374
|
-
```
|
375
|
-
|
376
489
|
## Stores
|
377
490
|
|
378
491
|
Prolly can use different stores to remember the prior event data from which it
|
data/lib/prolly/ps.rb
CHANGED
@@ -60,9 +60,9 @@ module Prolly
|
|
60
60
|
|
61
61
|
def_delegators :@storage, :reset, :add, :count, :rand_vars, :uniq_vals, :import
|
62
62
|
|
63
|
-
def initialize
|
64
|
-
#@storage = Storage::
|
65
|
-
@storage = Storage::
|
63
|
+
def initialize(storage = nil)
|
64
|
+
#@storage = Storage::Mongodb.new()
|
65
|
+
@storage = Storage::Rubylist.new()
|
66
66
|
#@storage = Storage::Redis.new()
|
67
67
|
end
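# Note on the hunk above: the new `storage` argument is accepted but the store
# is still hard-coded to Rubylist. A minimal sketch of how an injected store
# could be honored is shown below. Everything under `Sketch` is a hypothetical
# stand-in written for this note, not the gem's actual classes.

```ruby
module Sketch
  module Storage
    # Hypothetical in-memory store; only its shape matters for this example.
    class Rubylist
      def initialize
        @data = []
      end

      def add(datum)
        @data << datum
      end

      def size
        @data.size
      end
    end
  end

  class Ps
    attr_reader :storage

    # Use the injected store when one is given,
    # otherwise fall back to the in-memory default.
    def initialize(storage = nil)
      @storage = storage || Storage::Rubylist.new
    end
  end
end

default_ps = Sketch::Ps.new                               # falls back to Rubylist
custom_ps  = Sketch::Ps.new(Sketch::Storage::Rubylist.new) # uses the given store
```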
|
68
68
|
|
data/lib/prolly/ps/storage/rubylist.rb
CHANGED
@@ -22,18 +22,16 @@ module Prolly
|
|
22
22
|
|
23
23
|
def count(rvs, options = {})
|
24
24
|
reload = options[:reload] || false
|
25
|
-
start_time = Time.now
|
26
25
|
if rvs.kind_of?(Array)
|
27
26
|
value = @data.count { |e| rvs.all? { |rv| e.has_key?(rv) } }
|
28
27
|
elsif rvs.kind_of?(Hash)
|
29
28
|
value = @data.count { |e|
|
30
29
|
rvs.map { |rkey, rval|
|
31
30
|
vals = rval.kind_of?(Array) ? rval : [rval]
|
32
|
-
vals.include?(e[rkey])
|
31
|
+
vals.include?(e[rkey])
|
33
32
|
}.all?
|
34
33
|
}
|
35
34
|
end
|
36
|
-
elapsed = Time.now - start_time
|
37
35
|
return value
|
38
36
|
end
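# Note on `count` above: when `rvs` is neither an Array nor a Hash, `value` is
# never assigned and the method returns nil. The standalone sketch below mirrors
# the two branches and makes that case explicit. `count_events` is a
# hypothetical helper that takes the data as an argument; it is not the gem's method.

```ruby
# Count events in `data` (an array of hashes).
# - Array of names: events that have every named variable.
# - Hash of name => value(s): events whose values match every pair,
#   where an array value means "any of these".
def count_events(data, rvs)
  case rvs
  when Array
    data.count { |event| rvs.all? { |rv| event.key?(rv) } }
  when Hash
    data.count do |event|
      rvs.all? do |key, val|
        vals = val.is_a?(Array) ? val : [val]
        vals.include?(event[key])
      end
    end
  else
    raise ArgumentError, "expected an Array or Hash, got #{rvs.class}"
  end
end

events = [{ color: :blue, size: :small }, { color: :green }]
count_events(events, [:color, :size])            # only the first event has both
count_events(events, { color: [:blue, :green] }) # both events match
```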
|
39
37
|
|