fathom 0.2.1 → 0.2.2

data/README.md CHANGED
@@ -1,8 +1,8 @@
1
1
  Fathom
2
- ------
2
+ ======
3
3
 
4
4
  Introduction
5
- ============
5
+ ------------
6
6
 
7
7
  This is a library for decision support. It is useful for recording various types of information, and then combining it in useful ways. As of right now, it's not very useful, but I'm actively working on it again.
8
8
 
@@ -22,8 +22,8 @@ Setting up the data and models starts with a decoupled Ruby library. I'll give
22
22
 
23
23
  Keeping the data and models in context is more of a user interface question, which I'll build in another library. I'm considering hosting that solution myself and just making it available publicly. We'll see after all the core ideas are gathered.
24
24
 
25
- Usage
26
- =====
25
+ Fathom Basics
26
+ -------------
27
27
 
28
28
  Enrico Fermi [said](http://www.lucidcafe.com/library/95sep/fermi.html):
29
29
  There are two possible outcomes: if the result confirms the hypothesis, then you've made a measurement.
@@ -113,6 +113,153 @@ To use imported data in a ValueDescription, just reference this knowledge base:
113
113
  ...
114
114
  end
115
115
 
116
+ Serial Agent Based Modeling
117
+ ---------------------------
118
+
119
+ I have added some basic support for Agent Based Modeling (ABM). Right now, this only supports serial simulations. I will be adding an Agent Cluster, which will allow us to run large simulations asynchronously using EventMachine. Until then, here's a really simple example of how to do things.
120
+
121
+ First, let's create a couple agents, a Cola and a Consumer:
122
+
123
+ class Cola < Agent
124
+ property :sweetness
125
+ property :number_sold
126
+
127
+ def on_purchase(consumer)
128
+ self.number_sold += 1
129
+ log_purchase
130
+ end
131
+
132
+ def on_tick(simulation)
133
+ self.sweetness = suggest_sweetness
134
+ end
135
+
136
+ def inspect
137
+ "Cola: sweetness: #{self.sweetness}, sales: #{self.number_sold}"
138
+ end
139
+
140
+ protected
141
+
142
+ # This is where the fun is as well. This is an admittedly poor suggestion engine.
143
+ def suggest_sweetness
144
+ case purchases.length
145
+ when *(0..10).to_a
146
+ self.node_for_sweetness.rand
147
+ when *(10..50).to_a
148
+ (self.node_for_sweetness.rand * 0.4) +
149
+ (average_purchase_sweetness * 0.6)
150
+ when *(50..250).to_a
151
+ (self.node_for_sweetness.rand * 0.2) +
152
+ (average_purchase_sweetness * 0.8)
153
+ else
154
+ (self.node_for_sweetness.rand * 0.05) +
155
+ (average_purchase_sweetness * 0.95)
156
+ end
157
+ end
158
+
159
+ def average_purchase_sweetness
160
+ purchases.inject(0.0) {|s, e| s += e} / purchases.length
161
+ end
162
+
163
+ def log_purchase
164
+ purchases << sweetness
165
+ end
166
+
167
+ def purchases
168
+ @purchases ||= []
169
+ end
170
+ end
171
+
172
+ class Consumer < Agent
173
+ property :sweetness_preference
174
+
175
+ attr_reader :simulation
176
+
177
+ def on_tick(simulation)
178
+ @simulation ||= simulation
179
+ purchase_cola
180
+ end
181
+
182
+ def inspect
183
+ "Consumer: preferred sweetness: #{self.sweetness_preference}"
184
+ end
185
+
186
+ protected
187
+ def agents_using_purchase
188
+ @agents_using_purchase ||= simulation.agents_using_purchase
189
+ end
190
+
191
+ # This is where all the fun happens.
192
+ def purchase_cola
193
+ if rand < 0.1
194
+ agents_using_purchase.rand.on_purchase(self)
195
+ else
196
+ distances = agents_using_purchase.map {|agent| [agent, (self.sweetness_preference - agent.sweetness).abs] }
197
+ sorted_distances = distances.sort {|a, b| a.last <=> b.last }
198
+ purchased = sorted_distances.first.first
199
+ purchased.on_purchase(self)
200
+ end
201
+ end
202
+ end
203
+
204
+ Agents need to do just a few things:
205
+
206
+ * define their properties
207
+ * define which events they listen to
208
+ * define the behavior we're after for each event
209
+
210
+ Properties can be whatever you're after. Usually, these are seeded with some knowledge that we're working on in the knowledge base. Declaring a property gives us a getter and a setter for that property, as well as access to the seed objects we use when setting up the agent.
211
+
212
+ Events are set up by defining a method starting with on_. A consumer responds to on_tick, and the cola responds to on_tick and on_purchase. We set up events with this convention so that it's a little easier to coordinate the traffic among the agents and between the agents and the simulation. When we start using EventMachine for agent clusters, it will be more important to have this interface explicitly defined so that things don't get confused.
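As a rough illustration of the on_ naming convention described in the README text, here is a minimal sketch of how an event could be routed to an agent by method name. The classes `SketchAgent`, `SketchCola`, and `SketchConsumer` and the `notify` and `events` methods are hypothetical stand-ins invented for this example; they are not Fathom's `Agent` or `TickSimulation`, whose internals this diff does not show.

```ruby
# Hypothetical stand-in for an agent base class (NOT Fathom's Agent).
class SketchAgent
  # Event names this agent listens to, derived from its public on_* methods.
  def self.events
    public_instance_methods.map(&:to_s).grep(/\Aon_/)
                           .map { |name| name.sub(/\Aon_/, '').to_sym }.sort
  end

  # Deliver an event if the agent defines a handler; returns true when handled.
  def notify(event, *args)
    handler = "on_#{event}"
    return false unless respond_to?(handler)
    public_send(handler, *args)
    true
  end
end

class SketchCola < SketchAgent
  attr_reader :ticks, :sold

  def initialize
    @ticks = 0
    @sold = 0
  end

  def on_tick(_simulation)
    @ticks += 1
  end

  def on_purchase(_consumer)
    @sold += 1
  end
end

class SketchConsumer < SketchAgent
  def initialize(cola)
    @cola = cola
  end

  # Buys one cola on every tick, as the README's consumer does.
  def on_tick(_simulation)
    @cola.notify(:purchase, self)
  end
end

cola = SketchCola.new
consumer = SketchConsumer.new(cola)

# A serial "simulation": every agent receives each tick in turn.
3.times { [cola, consumer].each { |agent| agent.notify(:tick, nil) } }

puts SketchCola.events.inspect
puts "ticks: #{cola.ticks}, sold: #{cola.sold}"
```

Deriving the event list from method names means an agent declares what it listens to simply by defining a handler, which is the property the README says will matter once events are routed between processes.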
213
+
214
+ The underlying behavior is where we can have a lot of fun. We can start adopting reinforcement learning techniques, or mimic real-world interactions. For this example, I had the consumer purchase some cola at every tick. Right now, it optimizes for the cola that's nearest its preference for sweetness. You can imagine how fun it would be to introduce different types of consumers, or to start mimicking a satisficing algorithm (allowing the consumers to make a choice that's good enough, rather than optimal). We could start adding budgets, ages, and proximity to the cola. Once the behaviors and properties are set up, models can be iterated over extensively until the system dynamics are thoroughly explored, or even until some prognostic value begins to emerge from the experiments.
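To make the difference between the optimizing choice and a satisficing choice concrete, here is a small sketch. The helper names `nearest_cola` and `satisficing_cola`, the `tolerance` parameter, and the sample sweetness values are assumptions made for illustration; colas are plain hashes here rather than Fathom agents.

```ruby
# Optimizing: pick the cola whose sweetness is closest to the preference.
def nearest_cola(colas, preference)
  colas.min_by { |cola| (preference - cola[:sweetness]).abs }
end

# Satisficing: take the first cola that is "good enough" (within tolerance),
# falling back to the nearest one when nothing qualifies.
def satisficing_cola(colas, preference, tolerance)
  good_enough = colas.find { |cola| (preference - cola[:sweetness]).abs <= tolerance }
  good_enough || nearest_cola(colas, preference)
end

# Illustrative values only, not output from the README's simulation.
colas = [
  { name: "rb_cola",   sweetness: 0.36 },
  { name: "ruby_cola", sweetness: 0.28 }
]

puts nearest_cola(colas, 0.25)[:name]
puts satisficing_cola(colas, 0.25, 0.15)[:name]
```

With a generous tolerance the satisficing consumer stops at the first acceptable cola, so the order in which colas are presented starts to matter, which is one way such a rule changes the dynamics.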
215
+
216
+ To show the whole example, let me give you some configuration data I stored in a YAML file:
217
+
218
+ :american_consumer_sweetness_preference:
219
+ hard_lower_bound: 0
220
+ hard_upper_bound: 1
221
+ min: 0.2
222
+ max: 0.3
223
+ name: American Consumer Sweetness Preference
224
+
225
+ :cola_sweetness_range:
226
+ hard_lower_bound: 0
227
+ hard_upper_bound: 1
228
+
229
+ Also, here is the actual simulation:
230
+
231
+ require 'rubygems'
232
+ require 'fathom'
233
+ require 'cola'
234
+ require 'consumer'
235
+
236
+ YAMLImport.import(File.expand_path('nodes.yml'))
237
+
238
+ @rb_cola = Cola.new(:sweetness => Fathom.kb[:cola_sweetness_range], :number_sold => 0)
239
+ @ruby_cola = Cola.new(:sweetness => Fathom.kb[:cola_sweetness_range], :number_sold => 0)
240
+ @american_consumer = Consumer.new(
241
+ :sweetness_preference => Fathom.kb[:american_consumer_sweetness_preference],
242
+ :budget => Fathom.kb[:american_cola_budget]
243
+ )
244
+
245
+ @simulation = TickSimulation.new(@rb_cola, @ruby_cola, @american_consumer)
246
+ @simulation.process(1_000)
247
+ puts @american_consumer.inspect, @rb_cola.inspect, @ruby_cola.inspect
248
+
249
+ The output from this experiment looks like this:
250
+
251
+ demo_abm : ruby sim.rb
252
+ Consumer: preferred sweetness: 0.258095065252885
253
+ Cola: sweetness: 0.362263199218971, sales: 626
254
+ Cola: sweetness: 0.377573124603715, sales: 374
255
+
256
+ You can see that our single consumer wanted a sweetness of around 0.25, and ended up buying more of the cola whose sweetness settled near 0.36. With better goal-seeking behavior, the agents could actually optimize to the consumer's preferences. With some verification of the seed nodes against market data, the simulations could look more and more like the real world.
257
+
258
+ I've written up an article on our company blog to give a better background to Agent Based Models, which can be [found here](http://fleetventures.com/2010/11/07/agent-based-modeling/).
259
+
260
+ Future Development
261
+ ------------------
262
+
116
263
  This code is certainly not production ready. There are many things I'll want to add just to have basic Monte Carlo methods up to snuff:
117
264
 
118
265
  * More distributions to choose from
@@ -122,26 +269,27 @@ This code is certainly not production ready. There are many things I'll want to
122
269
  * Better visualization with plotutils support and possibly other graphics support
123
270
  * Project organization: decision descriptions, owners, sharing
124
271
  * Measurement values: use Shannon's entropy and some value calculations to point out which measurements have the highest potential ROI
272
+ * EventMachine to drive agent clusters, as well as possibly other parts of the system
125
273
 
126
274
  On a bigger level, I still haven't implemented other major ideas:
127
275
 
128
- * Agent-based modeling
129
276
  * System dynamics
130
277
  * Belief updating in Causal Graphs
131
278
  * Fathom as a Web service
132
279
 
133
- Documentation TODO:
280
+ Dependencies
281
+ ------------
134
282
 
135
- * Document using this library from the command line
136
- * Document these classes as RabbitMQ consumers
283
+ This project relies on the GNU Scientific Library and the ruby/gsl bindings for the GSL. It has only minimal extensions to external libraries:
137
284
 
138
- Dependencies
139
- ============
285
+ * Array responds to rand (so [1,2,3].rand returns a random value from that array)
286
+ * OpenStruct exposes its underlying table, keys, and values
287
+ * FasterCSV has a :strip header converter now
140
288
 
141
- This project relies on the GNU Scientific Library and the ruby/gsl bindings for the GSL.
289
+ In the future, more dependencies will be introduced for parts of the library: EventMachine is one that I'm sure will be added. The goal of this project is to allow a reasonable number of dependencies to make the project performant and useful, but without making it a headache to set up or use with other projects.
142
290
 
143
291
  Note on Patches/Pull Requests
144
- =============================
292
+ -----------------------------
145
293
 
146
294
  * Fork the project.
147
295
  * Make your feature addition or bug fix.
@@ -153,7 +301,7 @@ Note on Patches/Pull Requests
153
301
  * Send me a pull request. Bonus points for topic branches.
154
302
 
155
303
  Copyright
156
- =========
304
+ ---------
157
305
 
158
306
  Copyright (c) 2010 David Richards
159
307
 
data/VERSION CHANGED
@@ -1 +1 @@
1
- 0.2.1
1
+ 0.2.2
@@ -7,6 +7,12 @@ class Fathom::MonteCarloSet
7
7
  self.samples[key.to_sym]
8
8
  end
9
9
  end
10
+
11
+ def define_summary_method(field)
12
+ define_method("#{field}_summary".to_sym) do
13
+ self.summary(field.to_sym)
14
+ end
15
+ end
10
16
  end
11
17
 
12
18
  attr_reader :value_description, :samples_taken, :samples
@@ -30,8 +36,33 @@ class Fathom::MonteCarloSet
30
36
  @keys_asserted = nil
31
37
  end
32
38
 
39
+ def fields
40
+ @samples.keys
41
+ end
42
+
43
+ def summary(field=nil)
44
+ return summarize_field(field) if field
45
+ fields.inject({}) do |h, field|
46
+ h[field] = summarize_field(field)
47
+ h
48
+ end
49
+ end
50
+
33
51
  protected
34
52
 
53
+ def summarize_field(field)
54
+ raise "No fields are defined. Have you processed this model yet?" if fields.empty?
55
+ raise ArgumentError, "#{field} is not a field in this set." unless fields.include?(field)
56
+ vector = self.send(field)
57
+ {
58
+ :coefficient_of_variation => (vector.sd / vector.mean),
59
+ :max => vector.max,
60
+ :mean => vector.mean,
61
+ :min => vector.min,
62
+ :sd => vector.sd
63
+ }
64
+ end
65
+
35
66
  def assert_sample_vectors
36
67
  vectors = @samples.inject({}) do |h, o|
37
68
  key, array = o.first, o.last
@@ -58,6 +89,7 @@ class Fathom::MonteCarloSet
58
89
  return true if @keys_asserted
59
90
  result.keys.each do |key|
60
91
  assert_key(key)
92
+ self.class.define_summary_method(key)
61
93
  end
62
94
  @keys_asserted = true
63
95
  end
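The summary behavior added in this hunk can be approximated without GSL. The sketch below uses plain Ruby arrays in place of GSL vectors and a hypothetical class named `SketchSampleSet`; it is not `Fathom::MonteCarloSet`. It assumes the standard deviation is the sample form (n - 1 denominator), so each field needs at least two values.

```ruby
# Hypothetical stand-in for a sample set (NOT Fathom::MonteCarloSet).
class SketchSampleSet
  # Defines <field>_summary on the class, mirroring the hunk's approach.
  def self.define_summary_method(field)
    define_method("#{field}_summary") { summary(field.to_sym) }
  end

  def initialize(samples)
    @samples = samples
    fields.each { |field| self.class.define_summary_method(field) }
  end

  def fields
    @samples.keys
  end

  # With a field, summarize just that field; otherwise summarize all fields.
  def summary(field = nil)
    return summarize_field(field) if field
    fields.each_with_object({}) { |f, h| h[f] = summarize_field(f) }
  end

  private

  def summarize_field(field)
    raise ArgumentError, "#{field} is not a field in this set." unless fields.include?(field)
    values = @samples[field].map(&:to_f)
    mean = values.sum / values.length
    # Sample variance: squared deviations divided by n - 1.
    variance = values.sum { |v| (v - mean)**2 } / (values.length - 1)
    sd = Math.sqrt(variance)
    {
      coefficient_of_variation: sd / mean,
      max: values.max,
      mean: mean,
      min: values.min,
      sd: sd
    }
  end
end

# Illustrative sample data only.
samples = { revenue: [90.0, 100.0, 110.0], commissions_paid: [9.0, 10.0, 11.0] }
set = SketchSampleSet.new(samples)
puts set.revenue_summary.inspect
```

Defining the per-field methods on the class, as the hunk does, means every instance gains a `<field>_summary` method once any instance has seen that field; defining them on the singleton instead would be one alternative if that sharing is unwanted.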
@@ -15,6 +15,9 @@ describe MonteCarloSet do
15
15
  gross_margins = revenue - commissions_paid
16
16
  {:revenue => revenue, :commissions_paid => commissions_paid, :gross_margins => gross_margins}
17
17
  end
18
+
19
+ @fields = [:commissions_paid, :gross_margins, :revenue]
20
+ @summary_fields = [:coefficient_of_variation, :max, :mean, :min, :sd]
18
21
  end
19
22
 
20
23
  before do
@@ -55,4 +58,40 @@ describe MonteCarloSet do
55
58
  @mcs.reset!
56
59
  lambda{@mcs.process(1)}.should_not raise_error
57
60
  end
61
+
62
+ it "should expose the fields from the samples" do
63
+ @mcs.process(1)
64
+ sort_array_of_symbols(@mcs.fields).should eql(@fields)
65
+ end
66
+
67
+ it "should offer a summary of the fields" do
68
+ @mcs.process(1)
69
+ summary = @mcs.summary
70
+ summary.should be_a(Hash)
71
+ sort_array_of_symbols(summary.keys).should eql(@fields)
72
+ summary.each do |key, value|
73
+ value.should be_a(Hash)
74
+ sort_array_of_symbols(value.keys).should eql(@summary_fields)
75
+ end
76
+ end
77
+
78
+ it "should be able to summarize a single field" do
79
+ @mcs.process(2)
80
+ summary = @mcs.summary(:revenue)
81
+ summary.should be_a(Hash)
82
+ sort_array_of_symbols(summary.keys).should eql(@summary_fields)
83
+ summary[:coefficient_of_variation].should eql(@mcs.revenue.sd / @mcs.revenue.mean)
84
+ summary[:max].should eql(@mcs.revenue.max)
85
+ summary[:min].should eql(@mcs.revenue.min)
86
+ summary[:sd].should eql(@mcs.revenue.sd)
87
+ end
88
+
89
+ it "should define summary methods on the object" do
90
+ @mcs.process(2)
91
+ @mcs.revenue_summary.should eql(@mcs.summary(:revenue))
92
+ end
58
93
  end
94
+
95
+ def sort_array_of_symbols(array)
96
+ array.map {|e| e.to_s}.sort.map {|e| e.to_sym}
97
+ end
metadata CHANGED
@@ -1,13 +1,13 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: fathom
3
3
  version: !ruby/object:Gem::Version
4
- hash: 21
4
+ hash: 19
5
5
  prerelease: false
6
6
  segments:
7
7
  - 0
8
8
  - 2
9
- - 1
10
- version: 0.2.1
9
+ - 2
10
+ version: 0.2.2
11
11
  platform: ruby
12
12
  authors:
13
13
  - David
@@ -15,7 +15,7 @@ autorequire:
15
15
  bindir: bin
16
16
  cert_chain: []
17
17
 
18
- date: 2010-11-07 00:00:00 -06:00
18
+ date: 2010-11-09 00:00:00 -07:00
19
19
  default_executable:
20
20
  dependencies:
21
21
  - !ruby/object:Gem::Dependency