bonito 0.1.0 → 0.2.0
- checksums.yaml +4 -4
- data/.travis.yml +1 -1
- data/README.md +191 -91
- data/bonito.gemspec +3 -3
- data/lib/bonito/progress.rb +2 -0
- data/lib/bonito/timeline.rb +1 -1
- data/lib/bonito/version.rb +1 -1
- metadata +8 -8
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 599d257d6b113cc345ae71a20bb2a00be5bfba1639f647b597a0c98d6cf37df5
|
4
|
+
data.tar.gz: ee359e2a42753140a5b23a864764fe798110f3761a492335d64d2d24b4b8fad5
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 9b560ad80a6bc02d69e14a171be711ba6d5599c774b9e85f920d8b9ecc760128f9732227e22ba2c17d01eb69c04e5667a92934a37847777a3605e2675c74e512
|
7
|
+
data.tar.gz: 43ca1a80a0d874de18060eff0ef45dd06c009ded65f6305fa61560ac6715288927f4c8e459ef1717ee561ac3503368fa55a97f7d883da1c8af59872fc15cb50c
|
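The digests recorded in `checksums.yaml` can be compared against the two members of a downloaded `.gem` archive. The following is a minimal sketch, assuming `metadata.gz` and `data.tar.gz` have already been extracted into a directory; the helper name `verify_checksums` is hypothetical and only Ruby's standard `digest` library is used:

```ruby
require 'digest'

# Compare the SHA256 digest of each file in `dir` with an expected hex string.
# `expected` maps a file name to its expected SHA256 hex digest.
# Returns a hash of file name => true (match) or false (mismatch).
def verify_checksums(dir, expected)
  expected.to_h do |name, hex|
    actual = Digest::SHA256.file(File.join(dir, name)).hexdigest
    [name, actual == hex]
  end
end
```

Swapping `Digest::SHA256` for `Digest::SHA512` would check the second group of digests in the same way.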
data/.travis.yml
CHANGED
data/README.md
CHANGED
@@ -1,118 +1,90 @@
|
|
1
1
|
# Bonito
|
2
2
|
|
3
|
-
|
3
|
+
_Data, in a can_
|
4
4
|
|
5
|
-
|
6
|
-
|
7
|
-
**Bonito** is a ruby DSL for generating canned data. It can simulate, by
|
8
|
-
_freezing time_, sequences of events happening in series and parallel in
|
9
|
-
order to approximate any kind of live data.
|
5
|
+
[Maintainability](https://codeclimate.com/github/tmfnll/bonito/maintainability) [Test Coverage](https://codeclimate.com/github/tmfnll/bonito/test_coverage)
|
10
6
|
|
11
|
-
|
12
|
-
perform this _freezing_.
|
7
|
+
## TL;DR
|
13
8
|
|
14
|
-
|
9
|
+
_`Bonito` is a ruby DSL for generating canned data where timing is important._
|
15
10
|
|
16
|
-
|
11
|
+
`Bonito` uses [Timecop](https://github.com/travisjeffery/timecop) to simulate
|
12
|
+
the flow of time as it generates data.
|
17
13
|
|
18
|
-
|
14
|
+
## Introduction
|
19
15
|
|
20
|
-
|
21
|
-
|
22
|
-
|
16
|
+
At the core of `Bonito` is the concept of a _timeline_. A timeline is a sort
|
17
|
+
of schema that defines in what time period each of a sequence of actions can
|
18
|
+
occur.
|
23
19
|
|
24
|
-
|
25
|
-
|
26
|
-
aforementioned `Article`s.
|
20
|
+
An action in `Bonito` is called a `Moment` and is considered to have a duration
|
21
|
+
of `0` itself.
|
27
22
|
|
28
|
-
|
29
|
-
creation time of each `Comment` should be _after_ that of its associated
|
30
|
-
`Article`. In fact, we can consider the data to consist of a collection of
|
31
|
-
_timelines_ where each timeline includes the creation of an `Article` by an
|
32
|
-
`Author` with this being followed afterwards by a series of `Comment`s on the
|
33
|
-
`Article` being created by `User`s.
|
34
|
-
|
35
|
-
`Bonito` offers a `Window` object to model such timelines.
|
23
|
+
#### `Bonito` can generate data in series:
|
36
24
|
|
37
|
-
|
38
|
-
|
25
|
+
Suppose we wish to simulate an online content creator's data, where _authors_
|
26
|
+
write _articles_ and _users_ _comment_ on these articles.
|
39
27
|
|
40
|
-
|
41
|
-
created as follows:
|
28
|
+
We could use `Bonito` to define a `serial timeline`:
|
42
29
|
|
43
30
|
```ruby
|
44
|
-
|
45
|
-
|
46
|
-
|
31
|
+
# First we create data structures to store our data and keep track of them in
|
32
|
+
# a `Scope` object:
|
33
|
+
scope = Bonito::Scope.new.tap do |scope|
|
34
|
+
scope.authors = []
|
35
|
+
scope.articles = []
|
36
|
+
scope.users = []
|
37
|
+
scope.comments = []
|
38
|
+
scope.users_and_authors = []
|
39
|
+
end
|
40
|
+
|
41
|
+
# Next we define our serial timeline:
|
42
|
+
serial = Bonito.over 1.week do
|
43
|
+
please do |scope| # The `please` method denotes the definition of an action
|
44
|
+
author = scope.authors.sample
|
47
45
|
title = Faker::Company.bs
|
48
|
-
|
49
|
-
articles << article
|
46
|
+
scope.article = Article.new(title, author)
|
47
|
+
scope.articles << scope.article
|
50
48
|
end
|
51
49
|
|
52
50
|
repeat times: rand(10), over: 5.days do
|
53
|
-
please do
|
54
|
-
user = users.sample
|
51
|
+
please do |scope|
|
52
|
+
user = scope.users.sample
|
55
53
|
content = Faker::Lorem.sentence
|
56
|
-
comments << Comment.new(content, article, user)
|
54
|
+
scope.comments << Comment.new(content, scope.article, user)
|
57
55
|
end
|
58
56
|
end
|
59
57
|
end
|
60
58
|
```
|
61
59
|
|
62
|
-
|
63
|
-
|
64
|
-
|
65
|
-
`Article
|
66
|
-
|
67
|
-
After this, we wish to define `Moment`s in which many `Comment`s are created for
|
68
|
-
the `Article`.
|
60
|
+
This _timeline_ specifies the creation of an instance of `Article` _followed by_
|
61
|
+
the creation of _up to_ 9 `Comment`s belonging to that `Article`.
|
62
|
+
The `created_at` time on the `Article` will be before that of each of the
|
63
|
+
`Comment`s. The total elapsed time between the creation of the `Article` and
|
64
|
+
the creation of the final `Comment` will not be more than `1.week`.
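To make the ordering guarantee concrete, here is a minimal sketch of one way moments could be placed in series inside a window. It is not Bonito's implementation: `distribute_in_series` is a hypothetical helper written in core Ruby, and durations are plain seconds rather than ActiveSupport durations.

```ruby
# Pick `count` random offsets inside a window of `duration` seconds and sort
# them, so that each moment happens no earlier than the one before it.
def distribute_in_series(start, duration, count, rng: Random.new)
  offsets = Array.new(count) { rng.rand * duration }.sort
  offsets.map { |offset| start + offset }
end

week = 7 * 24 * 60 * 60
# The first time stands in for the article, the rest for its comments.
article_at, *comment_times = distribute_in_series(Time.now, week, 4)
```

Because the offsets are sorted before being applied, every time in `comment_times` is at or after `article_at`, and all of them fall within one week of the start.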
|
69
65
|
|
70
|
-
To do this we use the `Window#repeat` method. This method accepts a block along
|
71
|
-
with a `times` parameter and an `over` parameter and inserts a new, child `Window`
|
72
|
-
into the current, parent window. The contents of the child window will be
|
73
|
-
that defined by the block repeated `times` times.
|
74
66
|
|
75
|
-
|
76
|
-
where each such `Moment` creates a `Comment` belonging to the previously
|
77
|
-
created `Article`.
|
67
|
+
#### `Bonito` can generate data in parallel:
|
78
68
|
|
79
|
-
|
80
|
-
|
81
|
-
|
69
|
+
Consider the timeline `serial` we defined above. We might realistically want
|
70
|
+
to generate data that represents many authors working _simultaneously_ on
|
71
|
+
articles with users then commenting on these once they have been published.
|
82
72
|
|
83
|
-
|
84
|
-
serial_window = Bonito.over 10.weeks do
|
85
|
-
repeat times: 5 do
|
86
|
-
use example_window
|
87
|
-
end
|
88
|
-
end
|
89
|
-
```
|
90
|
-
|
91
|
-
However this approach has a serious drawback: all events will occur _in series_.
|
92
|
-
The second `Article` will not be created until all the `Comment`s on the first
|
93
|
-
`Article` have been created and similarly the third `Article` will be preceded by
|
94
|
-
all `Comment`s on the second.
|
95
|
-
|
96
|
-
Ideally what we want is for `Articles` and `Comment`s to be _interleaved_.
|
97
|
-
|
98
|
-
We can achieve this using the `Window#simultaneously`
|
99
|
-
method to create a `Container` object, used to define parallel timelines. We
|
100
|
-
then fill that container with the same, `Window` five times, using the
|
101
|
-
`Container#use` method.
|
73
|
+
This can be done as follows:
|
102
74
|
|
103
75
|
```ruby
|
104
|
-
|
76
|
+
parallel = Bonito.over 2.weeks do
|
105
77
|
simultaneously do
|
106
78
|
repeat times: 5 do
|
107
|
-
use
|
79
|
+
use serial
|
108
80
|
end
|
109
81
|
end
|
110
82
|
end
|
111
83
|
```
|
112
84
|
|
113
|
-
The above will create 5 `Article`s, each having up to 9 `Comment`s where the
|
114
|
-
moment at which `Article` is created is independent of any other `Article` or
|
115
|
-
`Comment`. The times at which the `Comment`s are created, meanwhile, are
|
85
|
+
The above will create 5 `Article`s, each having up to 9 `Comment`s where the
|
86
|
+
moment at which `Article` is created is independent of any other `Article` or
|
87
|
+
`Comment`. The times at which the `Comment`s are created, meanwhile, are
|
116
88
|
dependent _only_ on the `Article` to which they belong.
|
117
89
|
|
118
90
|
#### Execution
|
@@ -120,40 +92,168 @@ dependent _only_ on the `Article` to which they belong.
|
|
120
92
|
Now we have defined the _shape_ of the data we wish to create, it remains
|
121
93
|
to actually create it.
|
122
94
|
|
123
|
-
This is achieved via a `Runner` object that takes a
|
124
|
-
evaluate
|
95
|
+
This is achieved via a `Runner` object that takes a timeline and uses it to
|
96
|
+
evaluate the individual actions. It distributes these actions randomly yet
|
97
|
+
within the confines of the schedule defined by the timeline.
|
125
98
|
|
126
99
|
```ruby
|
127
|
-
Bonito.run parallel_window, starting: 8.weeks_ago
|
100
|
+
Bonito.run parallel_window, scope: scope, starting: 8.weeks_ago
|
128
101
|
```
|
129
102
|
|
130
103
|
This will take the `Window` object `parallel_window` and distribute the `Moment`s
|
131
|
-
it contains according to its configuration, mapping each `Moment` to a point
|
104
|
+
it contains according to its configuration, mapping each `Moment` to a point
|
132
105
|
in time relative to the start time given by the `starting` parameter.
|
133
106
|
|
134
|
-
|
107
|
+
## Scaling
|
108
|
+
|
109
|
+
A typical use case may require different data set sizes for
|
135
110
|
different applications: For example, a large dataset to live in a staging
|
136
111
|
environment in order to sanity check releases and a small, easy to load dataset
|
137
112
|
that can be used locally while developing.
|
138
113
|
|
139
|
-
Suppose we have certain events that we wish to occur only once,
|
140
|
-
(using the above example, this may be the creation of some `
|
141
|
-
representing the
|
114
|
+
Suppose we have certain events that we wish to occur only once,
|
115
|
+
(using the above example, this may be the creation of some `Publication` object
|
116
|
+
representing the newspaper for which articles are being written) as well
|
142
117
|
as events that we wish to be able to scale, such as the creation of `Article`
|
143
118
|
objects and their associated `Comment`s.
|
144
119
|
|
145
|
-
Using `Bonito`, we could define two
|
146
|
-
|
120
|
+
Using `Bonito`, we could define two timelines: a `singleton_timeline`
|
121
|
+
that is run once per dataset and generates our
|
122
|
+
`Publication` model, as well as a `scalable_timeline`
|
147
123
|
that results in different sizes of data according to some size parameter.
|
148
124
|
|
149
|
-
These
|
125
|
+
These timelines can then be combined as follows, where the size parameter `n`
|
150
126
|
can be provided dynamically as, say, an argument provided to a `Rake` task.
|
151
127
|
|
152
128
|
```ruby
|
153
|
-
|
129
|
+
scaled = singleton_timeline + (scalable_timeline ** n)
|
154
130
|
```
|
155
131
|
|
156
|
-
The `
|
132
|
+
The `scaled` timeline, when run, will run the `scalable_timeline` `n` times
|
157
133
|
in parallel.
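The `+` and `**` expression above relies on Ruby's operator overloading. As a rough illustration only, the toy class below records the shape such an expression would build; `ToyTimeline` is hypothetical and is not one of Bonito's classes, and it assumes `+` means "one after the other" while `**` means "`n` copies side by side".

```ruby
# A toy stand-in for a timeline: it only stores a nested description of its shape.
ToyTimeline = Struct.new(:shape) do
  # `a + b` describes running `a` and then `b`.
  def +(other)
    self.class.new([:series, shape, other.shape])
  end

  # `a ** n` describes running `n` copies of `a` in parallel.
  def **(n)
    self.class.new([:parallel, *Array.new(n) { shape }])
  end
end

singleton = ToyTimeline.new(:create_publication)
scalable  = ToyTimeline.new(:create_articles)
scaled    = singleton + (scalable**2)
scaled.shape
# => [:series, :create_publication, [:parallel, :create_articles, :create_articles]]
```

The parentheses around the `**` term are kept for readability; Ruby already gives `**` higher precedence than `+`.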
|
158
134
|
|
135
|
+
## Scoping
|
136
|
+
|
137
|
+
Data can be shared amongst `Moments` via a `Scope` object.
|
138
|
+
Attributes on a `Scope` object are available within the **current** serial
|
139
|
+
timeline and in all **child** serial timelines.
|
140
|
+
|
141
|
+
Consider the following example:
|
142
|
+
|
143
|
+
```ruby
|
144
|
+
Bonito.over 1.week do
|
145
|
+
please do |scope|
|
146
|
+
scope.foo = 'bar'
|
147
|
+
end
|
148
|
+
|
149
|
+
over 2.days do
|
150
|
+
please do |scope|
|
151
|
+
puts scope.foo # prints 'bar'
|
152
|
+
end
|
153
|
+
|
154
|
+
please do |scope|
|
155
|
+
scope.foo = 'baz'
|
156
|
+
end
|
157
|
+
|
158
|
+
please do |scope|
|
159
|
+
puts scope.foo # now prints 'baz'
|
160
|
+
end
|
161
|
+
end
|
162
|
+
|
163
|
+
please do |scope|
|
164
|
+
puts scope.foo # still prints 'bar'
|
165
|
+
end
|
166
|
+
end
|
167
|
+
```
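The behaviour in the example above (children can read a parent's attributes, but a child's writes do not leak back) can be modelled with a small parent-chained object. This is a sketch under that assumption, not Bonito's `Scope` class; `ToyScope` and its `push` method are hypothetical names.

```ruby
# A toy scope: reads fall back to the parent, writes stay local.
class ToyScope
  def initialize(parent = nil)
    @parent = parent
    @values = {}
  end

  # Create a child scope, as a nested timeline might.
  def push
    ToyScope.new(self)
  end

  def method_missing(name, *args)
    key = name.to_s
    if key.end_with?('=')
      @values[key.chomp('=').to_sym] = args.first # write locally
    elsif @values.key?(name)
      @values[name]                               # read a local value
    elsif @parent
      @parent.public_send(name)                   # fall back to the parent
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.end_with?('=') || known?(name) || super
  end

  protected

  # True if this scope or any ancestor holds a value for `name`.
  def known?(name)
    @values.key?(name) || (!@parent.nil? && @parent.known?(name))
  end
end

outer = ToyScope.new
outer.foo = 'bar'
inner = outer.push
inner.foo          # => "bar" (inherited from the parent)
inner.foo = 'baz'
inner.foo          # => "baz"
outer.foo          # => "bar" (the parent is unaffected)
```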
|
168
|
+
|
169
|
+
## An Example
|
170
|
+
|
171
|
+
Consider the following:
|
172
|
+
|
173
|
+
```ruby
|
174
|
+
# Initialise the data store, in practice a database would probably be used for
|
175
|
+
# this. Defined this way, these variables are available globally.
|
176
|
+
|
177
|
+
scope = Bonito::Scope.new.tap do |scope|
|
178
|
+
scope.publications = []
|
179
|
+
scope.authors = []
|
180
|
+
scope.articles = []
|
181
|
+
scope.users = []
|
182
|
+
scope.comments = []
|
183
|
+
scope.users_and_authors = []
|
184
|
+
end
|
185
|
+
|
186
|
+
# We only ever want to create one publication, regardless of how we scale, so we
|
187
|
+
# create a setup timeline to handle this
|
188
|
+
singleton_timeline = Bonito.over 1.day do
|
189
|
+
please do |scope|
|
190
|
+
scope.publications << Publication.new
|
191
|
+
end
|
192
|
+
end
|
193
|
+
|
194
|
+
|
195
|
+
scalable_timeline = Bonito.over 1.week do
|
196
|
+
# Make the publication available to the current timeline.
|
197
|
+
please do |scope|
|
198
|
+
scope.publication = scope.publications.first
|
199
|
+
end
|
200
|
+
# Simultaneously create authors and users, interweaving the two.
|
201
|
+
simultaneously do
|
202
|
+
over 1.day do
|
203
|
+
# Create 5 authors over the course of a day
|
204
|
+
repeat times: 5, over: 1.day do
|
205
|
+
please do |scope|
|
206
|
+
name = Faker::Name.name
|
207
|
+
author = Author.new(name)
|
208
|
+
scope.authors << author
|
209
|
+
scope.users_and_authors << author
|
210
|
+
end
|
211
|
+
end
|
212
|
+
end
|
213
|
+
|
214
|
+
# Create 10 users, also over one day, waiting at least 2 hours before
|
215
|
+
# creating the first.
|
216
|
+
also over: 1.day, after: 2.hours do
|
217
|
+
repeat times: 10, over: 1.day do
|
218
|
+
please do |scope|
|
219
|
+
name = Faker::Name.name
|
220
|
+
email = Faker::Internet.safe_email(name)
|
221
|
+
user = User.new(name, email)
|
222
|
+
scope.users << user
|
223
|
+
scope.users_and_authors << user
|
224
|
+
end
|
225
|
+
end
|
226
|
+
end
|
227
|
+
end
|
228
|
+
|
229
|
+
# Repeat the following sequence of events 5 times over 5 days
|
230
|
+
repeat times: 5, over: 5.days do
|
231
|
+
# Choose one of the existing authors and create an article belonging to that
|
232
|
+
# author.
|
233
|
+
please do |scope|
|
234
|
+
author = scope.authors.sample
|
235
|
+
title = Faker::Company.bs
|
236
|
+
scope.article = Article.new(title, author, scope.publication)
|
237
|
+
scope.articles << scope.article
|
238
|
+
end
|
239
|
+
|
240
|
+
# Choose one of the existing users and have them leave a comment on the
|
241
|
+
# article that was just created.
|
242
|
+
repeat times: rand(10), over: 5.hours do
|
243
|
+
please do |scope|
|
244
|
+
user = scope.users.sample
|
245
|
+
content = Faker::Lorem.sentence
|
246
|
+
scope.comments << Comment.new(content, scope.article, user)
|
247
|
+
end
|
248
|
+
end
|
249
|
+
end
|
250
|
+
end
|
251
|
+
|
252
|
+
# Finally, we run the timeline to generate our data
|
253
|
+
scale = 5
|
254
|
+
scaled_timeline = singleton_timeline + (scalable_timeline ** scale)
|
255
|
+
|
256
|
+
Bonito.run scaled_timeline, scope: scope, starting: 8.weeks_ago
|
257
|
+
```
|
258
|
+
|
159
259
|
|
data/bonito.gemspec
CHANGED
@@ -12,7 +12,7 @@ Gem::Specification.new do |spec|
|
|
12
12
|
|
13
13
|
spec.summary = 'A simple tool to create demo data'
|
14
14
|
spec.description = 'Create realistic demo data by simulating events occurring over some time period'
|
15
|
-
spec.homepage = 'https://github.com/
|
15
|
+
spec.homepage = 'https://github.com/tmfnll/bonito'
|
16
16
|
spec.license = 'MIT'
|
17
17
|
|
18
18
|
# Specify which files should be added to the gem when it is released.
|
@@ -34,8 +34,8 @@ Gem::Specification.new do |spec|
|
|
34
34
|
spec.add_development_dependency 'activesupport'
|
35
35
|
spec.add_development_dependency 'bundler', '~> 2.0'
|
36
36
|
spec.add_development_dependency 'factory_bot'
|
37
|
-
spec.add_development_dependency 'faker', '~>
|
38
|
-
spec.add_development_dependency 'rake', '~>
|
37
|
+
spec.add_development_dependency 'faker', '~> 2.10.1'
|
38
|
+
spec.add_development_dependency 'rake', '~> 13'
|
39
39
|
spec.add_development_dependency 'rdoc'
|
40
40
|
spec.add_development_dependency 'reek'
|
41
41
|
spec.add_development_dependency 'rspec', '~> 3.0'
|
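The new `faker` and `rake` constraints use RubyGems' pessimistic operator `~>`. The sketch below shows which versions each constraint admits, using the `Gem::Requirement` and `Gem::Version` classes that ship with Ruby; the helper `allowed?` is a hypothetical name introduced for illustration.

```ruby
require 'rubygems'

# True if `version` satisfies the RubyGems requirement string `requirement`.
def allowed?(requirement, version)
  Gem::Requirement.new(requirement).satisfied_by?(Gem::Version.new(version))
end

# `~> 2.10.1` means `>= 2.10.1, < 2.11`; `~> 13` means `>= 13, < 14`.
allowed?('~> 2.10.1', '2.10.5') # => true
allowed?('~> 2.10.1', '2.11.0') # => false
allowed?('~> 13', '13.0.6')     # => true
allowed?('~> 13', '14.0.0')     # => false
```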
data/lib/bonito/progress.rb
CHANGED
data/lib/bonito/timeline.rb
CHANGED
data/lib/bonito/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: bonito
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.2.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Tom Finill
|
8
8
|
autorequire:
|
9
9
|
bindir: exe
|
10
10
|
cert_chain: []
|
11
|
-
date:
|
11
|
+
date: 2020-03-15 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: algorithms
|
@@ -100,28 +100,28 @@ dependencies:
|
|
100
100
|
requirements:
|
101
101
|
- - "~>"
|
102
102
|
- !ruby/object:Gem::Version
|
103
|
-
version:
|
103
|
+
version: 2.10.1
|
104
104
|
type: :development
|
105
105
|
prerelease: false
|
106
106
|
version_requirements: !ruby/object:Gem::Requirement
|
107
107
|
requirements:
|
108
108
|
- - "~>"
|
109
109
|
- !ruby/object:Gem::Version
|
110
|
-
version:
|
110
|
+
version: 2.10.1
|
111
111
|
- !ruby/object:Gem::Dependency
|
112
112
|
name: rake
|
113
113
|
requirement: !ruby/object:Gem::Requirement
|
114
114
|
requirements:
|
115
115
|
- - "~>"
|
116
116
|
- !ruby/object:Gem::Version
|
117
|
-
version: '
|
117
|
+
version: '13'
|
118
118
|
type: :development
|
119
119
|
prerelease: false
|
120
120
|
version_requirements: !ruby/object:Gem::Requirement
|
121
121
|
requirements:
|
122
122
|
- - "~>"
|
123
123
|
- !ruby/object:Gem::Version
|
124
|
-
version: '
|
124
|
+
version: '13'
|
125
125
|
- !ruby/object:Gem::Dependency
|
126
126
|
name: rdoc
|
127
127
|
requirement: !ruby/object:Gem::Requirement
|
@@ -225,7 +225,7 @@ files:
|
|
225
225
|
- lib/bonito/serial_timeline.rb
|
226
226
|
- lib/bonito/timeline.rb
|
227
227
|
- lib/bonito/version.rb
|
228
|
-
homepage: https://github.com/
|
228
|
+
homepage: https://github.com/tmfnll/bonito
|
229
229
|
licenses:
|
230
230
|
- MIT
|
231
231
|
metadata: {}
|
@@ -244,7 +244,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
244
244
|
- !ruby/object:Gem::Version
|
245
245
|
version: '0'
|
246
246
|
requirements: []
|
247
|
-
rubygems_version: 3.0.
|
247
|
+
rubygems_version: 3.0.6
|
248
248
|
signing_key:
|
249
249
|
specification_version: 4
|
250
250
|
summary: A simple tool to create demo data
|