pick_me_too 1.0.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: ac4906dd27b93c363b8e9af9e82ffe636c3149519d79654a7b8a2f0a2bdfdf25
4
+ data.tar.gz: a902f070e98dc4fab62695fe7e556e499ad33c3d1dd2b0dc21764923b37f310f
5
+ SHA512:
6
+ metadata.gz: 73f4816247afc780c1dae9c5399fbc1280db8795d191f575fd4829d39ea8b5f39f61974d9953a0883f25160dcd256feabefe0e75ddc323728523c9fa3a2c997c
7
+ data.tar.gz: ef409f69716637d562c45b52d2ea71843e2bfb3fe568d2dfd82447084a27cd93cae9fdf99f1a4decbe37036db1639d2332fe6debfb13ed5f268c5dffae632456
data/.gitignore ADDED
@@ -0,0 +1,4 @@
1
+ .byebug_history
2
+ *.swp
3
+ .ruby-version
4
+
data/CHANGES.md ADDED
@@ -0,0 +1,4 @@
1
+ # Change Log
2
+
3
+ ## 1.0.0 *2022-8-14*
4
+ * initial release
data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2022 David Fairchild Houghton
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,101 @@
1
+ # pick_me_too
2
+
3
+ Pick things randomly from a list of things with specified frequencies.
4
+
5
+ This is what is known as an [urn model](https://en.wikipedia.org/wiki/Urn_problem).
6
+
7
+ # Synopsis
8
+
9
+ ```ruby
10
+ require 'pick_me_too'
11
+
12
+ # optionally make a seeded random number sequence
13
+ rng = Random.new 1
14
+
15
+ # make a picker that uses this
16
+ picker = PickMeToo.new([["prevention", 1], ["cure", 16]], -> { rng.rand })
17
+
18
+ counter = Hash.new 0
19
+ 32.times { counter[picker.pick] += 1 }
20
+ counter
21
+ # => {"cure"=>29, "prevention"=>3}
22
+
23
+ # you can also use a hash to map items to frequencies
24
+ # frequencies don't need to be whole numbers
25
+ # items don't need to be strings
26
+
27
+ rng = Random.new 1
28
+ picker = PickMeToo.new({foo: 1, bar: 2, baz: 0.5}, -> { rng.rand })
29
+ counter = Hash.new 0
30
+ 32.times { counter[picker.pick] += 1 }
31
+ counter
32
+ # => {:foo=>13, :bar=>12, :baz=>7}
33
+
34
+ # you don't need to provide your own random number sequence
35
+ picker = PickMeToo.new({a: 1, b: 2, c: 3})
36
+ # ...
37
+ ```
38
+
39
+ # What is this for?
40
+
41
+ Suppose you are simulating some phenomenon: wandering monsters in a dungeon, say, the weather on a particular day, or
42
+ the vowel in a random syllable in a random word in a random language. These things are representable as a list
43
+ of frequencies:
44
+ - goblin: 10; orc: 5; centipede: 15; ...
45
+ - sunny: 10; rainy: 5; cloudy: 15; ...
46
+ - a: 10; u: 5; e: 15; ...
47
+
48
+ What you need is something that will randomly pick these things for you with the frequencies you specify.
49
+
50
+ One way to do this would be to make an array, filling it with the items according to the frequencies specified
51
+ and then pick randomly from the array:
52
+
53
+ ```ruby
54
+ monsters = %i[goblin] * 10 + %i[orc] * 5 + %i[centipede] * 15
55
+ monster = monsters[(monsters.length * rand).floor]
56
+ ```
57
+
58
+ This is inefficient if the frequencies are huge and impossible if they are not whole numbers. Say you want pi orcs in your list and e goblins.
59
+
60
+ `PickMeToo` requires just one item of each type. It converts the frequencies to probabilities and walks a tree of numeric comparisons to
61
+ choose an item given a random number. The tree of comparisons is optimized so, for example, if one item represents 50% or more of the
62
+ frequencies it will be selected with a single comparison.
63
+
64
+
65
+ # API
66
+
67
+ ## `PickMeToo`
68
+
69
+ This is the "[urn](https://en.wikipedia.org/wiki/Urn_problem)" containing the items selected.
70
+
71
+ ## `PickMeToo.new(frequencies, [rnd])`
72
+
73
+ "Fill" the urn.
74
+
75
+ The required `frequencies` parameter must be something that is effectively a list of pairs:
76
+ things to pick paired with their frequency. The "frequency" is just any positive number.
77
+
78
+ The optional `rnd` parameter is a `Proc` that when called returns a number, ideally in the interval
79
+ [0, 1]. This parameter allows you to provide a seeded random number generator, so the choices
80
+ occur in a predictable sequence, which is useful for testing.
81
+
82
+ This constructor method will raise a `PickMeToo::Error` if
83
+ - there are no pairs in the frequency list
84
+ - any of the frequencies is non-positive
85
+ - any of the items in the list isn't something followed by a number
86
+
87
+ ## `PickMeToo#pick()`
88
+
89
+ Draw an item from the urn.
90
+
91
+ # Installation
92
+
93
+ `pick_me_too` is available as a gem: run `gem install pick_me_too` or add `gem 'pick_me_too'` to your Gemfile.
94
+
95
+ # License
96
+
97
+ MIT. See the LICENSE file alongside this README.
98
+
99
+ # Acknowledgements
100
+
101
+ My son Jude helped me contemplate how to balance the probability tree.
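Note on the README's "What is this for?" section: the idea of converting frequencies to probabilities and choosing an item from a single random number can be sketched without the gem. The following is a minimal illustration only. It uses a linear scan over cumulative probabilities, not the balanced comparison tree that `PickMeToo` builds, and `build_thresholds` and `weighted_pick` are hypothetical names introduced here, not part of the package.

```ruby
# Hypothetical helpers, not part of pick_me_too.
# Turn [item, frequency] pairs into [item, cumulative probability] pairs.
def build_thresholds(frequencies)
  total = frequencies.sum { |_, n| n }.to_f
  running = 0.0
  frequencies.map do |item, n|
    running += n / total
    [item, running]
  end
end

# Return the first item whose cumulative probability exceeds r.
# Falls back to the last item if r is 1.0 or rounding leaves a gap.
def weighted_pick(thresholds, r)
  found = thresholds.find { |_, upper| r < upper }
  (found || thresholds.last).first
end

# Irrational frequencies work because only one entry per item is stored.
MONSTERS = build_thresholds([[:orc, Math::PI], [:goblin, Math::E]])

rng = Random.new(1)
SAMPLE = Array.new(1000) { weighted_pick(MONSTERS, rng.rand) }
```

The linear scan costs one comparison per item in the worst case. The comparison tree described in the README aims to reduce the expected number of comparisons by putting high-probability items near the root.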
data/Rakefile ADDED
@@ -0,0 +1,11 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'bundler/gem_tasks'
4
+
5
+ require 'rake/testtask'
6
+
7
+ Rake::TestTask.new do |t|
8
+ t.test_files = FileList['test/*_test.rb']
9
+ end
10
+
11
+ task default: :test
data/lib/pick_me_too.rb ADDED
@@ -0,0 +1,162 @@
1
+ # frozen_string_literal: true
2
+
3
+ ##
4
+ # An "urn" from which you can pick things with specified frequencies.
5
+ #
6
+ # require 'pick_me_too'
7
+ #
8
+ # wandering_monsters = PickMeToo.new({goblin: 10, bugbear: 2, orc: 5, spider: 3, troll: 1})
9
+ # 10.times.map { wandering_monsters.pick }
10
+ # # => [:goblin, :orc, :bugbear, :orc, :goblin, :bugbear, :goblin, :goblin, :orc, :goblin]
11
+ #
12
+ # irrational = PickMeToo.new({e: Math::E, pi: Math::PI})
13
+ #   10.times.map { irrational.pick }
14
+ # # => [:e, :e, :e, :pi, :e, :e, :e, :pi, :pi, :e]
15
+ #
16
+ # Items once picked are "placed back in the urn", so if you pick a cat this doesn't reduce the
17
+ # probability the next thing you pick is also a cat, and the urn will never be picked empty. (And of course
18
+ # this is all a metaphor.)
19
+ class PickMeToo
20
+ VERSION = '1.0.0'
21
+
22
+ class Error < StandardError; end
23
+
24
+ ##
25
+ # "Fill" the urn.
26
+ #
27
+ # The required frequencies parameter must be something that is effectively a list of pairs:
28
+ # things to pick paired with their frequency. The "frequency" is just any positive number.
29
+ #
30
+ # The optional rnd parameter is a Proc that when called returns a number, ideally in the interval
31
+ # [0, 1]. This parameter allows you to provide a seeded random number generator, so the choices
32
+ # occur in a predictable sequence, which is useful for testing.
33
+ #
34
+ # This constructor method will raise a `PickMeToo::Error` if
35
+ # - there are no pairs in the frequency list
36
+ # - any of the frequencies is non-positive
37
+ # - any of the items in the list isn't something followed by a number
38
+ def initialize(frequencies, rnd = -> { rand })
39
+ @rnd = rnd
40
+ frequencies = prepare(Array(frequencies))
41
+ @objects = frequencies.map(&:first)
42
+ if @objects.length == 1
43
+ @picker = ->(_p) { 0 }
44
+ else
45
+ frequencies = frequencies.map(&:last)
46
+ balanced_binary_tree = bifurcate(frequencies.dup)
47
+ probability_tree = probabilities(frequencies, balanced_binary_tree)
48
+ # compile everything into a nested ternary expression
49
+ @picker = eval "->(p) { #{ternerize(probability_tree)} }"
50
+ end
51
+ end
52
+
53
+ ##
54
+ # Pick an item from the urn.
55
+ def pick
56
+ @objects[@picker.call(@rnd.call)]
57
+ end
58
+
59
+ private
60
+
61
+ # sanity check and normalization of frequencies
62
+ def prepare(frequencies)
63
+ raise Error, 'no frequencies given' unless frequencies.any?
64
+ unless frequencies.all? { |f| f.is_a?(Array) && f.length == 2 && f[1].is_a?(Numeric) }
65
+ raise Error, 'all frequencies must be two-member arrays the second member of which is Numeric'
66
+ end
67
+
68
+ good, bad = frequencies.partition { |*, n| n.positive? }
69
+ raise Error, "the following have non-positive frequencies: #{bad.inspect}" if bad.any?
70
+
71
+ total = good.map(&:last).sum.to_f
72
+ good.map { |o, n| [o, n / total] }
73
+ end
74
+
75
+ # reduce the probability tree to nested ternary expressions
76
+ def ternerize(ptree)
77
+ p, left, right = ptree.values_at :p, :left, :right
78
+ left = left.is_a?(Numeric) ? left : ternerize(left)
79
+ right = right.is_a?(Numeric) ? right : ternerize(right)
80
+ "(p > #{p} ? #{right} : #{left})"
81
+ end
82
+
83
+ def probabilities(frequencies, tree)
84
+ tree = sum_probabilities(tree, 0)
85
+ replace_frequencies_with_indices(tree, frequencies.each_with_index.to_a)
86
+ tree
87
+ end
88
+
89
+ def replace_frequencies_with_indices(tree, frequencies)
90
+ left, right = tree.values_at :left, :right
91
+ if left.is_a?(Numeric)
92
+ i = frequencies.index { |v,| v == left }
93
+ *, i = frequencies.slice!(i)
94
+ tree[:left] = i
95
+ else
96
+ replace_frequencies_with_indices(left, frequencies)
97
+ end
98
+ if right.is_a?(Numeric)
99
+ i = frequencies.index { |v,| v == right }
100
+ *, i = frequencies.slice!(i)
101
+ tree[:right] = i
102
+ else
103
+ replace_frequencies_with_indices(right, frequencies)
104
+ end
105
+ end
106
+
107
+ # convert the frequency numbers to probabilities
108
+ def sum_probabilities(tree, base)
109
+ left, right = tree
110
+ p = left.flatten.sum + base
111
+ left = left.length == 1 ? left.first : sum_probabilities(left, base)
112
+ right = right.length == 1 ? right.first : sum_probabilities(right, p)
113
+ { p: p, left: left, right: right }
114
+ end
115
+
116
+ # distribute the frequencies so they are as balanced as possible
117
+ # the better to reduce expected length of the binary search
118
+ def bifurcate(nums)
119
+ return nums if nums.length < 2
120
+
121
+ max = total = 0
122
+ max_index = -1
123
+ # make one loop find all these things
124
+ nums.each_with_index do |n, i|
125
+ total += n
126
+ if n > max
127
+ max = n
128
+ max_index = i
129
+ end
130
+ end
131
+ half = total / 2.0
132
+ right = [nums.slice!(max_index)]
133
+ if max >= half
134
+ [bifurcate(nums), right]
135
+ else
136
+ gap = half - max
137
+ while rv = fit_gap(gap, nums)
138
+ removed, remaining_gap = rv
139
+ right << removed
140
+ break unless gap = remaining_gap
141
+ end
142
+ [bifurcate(nums), bifurcate(right)]
143
+ end
144
+ end
145
+
146
+ # look for the frequency best suited to balance the two branches
147
+ def fit_gap(gap, nums)
148
+ best_index = 0
149
+ best_fit = (gap - nums[0]).abs
150
+ nums.each_with_index.drop(1).each do |n, i|
151
+ fit = (gap - n).abs
152
+ if fit < best_fit
153
+ best_index = i
154
+ best_fit = fit
155
+ end
156
+ end
157
+ if nums[best_index] < gap * 2
158
+ n = nums.slice!(best_index)
159
+ [n, n < gap ? gap - n : nil]
160
+ end
161
+ end
162
+ end
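Note on the "nested ternary expression" step in `initialize`: the sketch below copies the logic of the private `ternerize` method into a standalone method and applies it to a hand-built probability tree, to show what kind of source string gets passed to `eval`. The tree is written by hand for three items with probabilities 0.25, 0.25 and 0.5. It appears to match the shape the class would build for frequencies 1, 1, 2, but treat it as an illustration rather than captured output. `TREE`, `SOURCE` and `PICKER` are names introduced here.

```ruby
# Standalone copy of the ternerize logic: leaves are item indices,
# inner nodes are hashes holding a threshold p and two branches.
def ternerize(ptree)
  p, left, right = ptree.values_at(:p, :left, :right)
  left = ternerize(left) unless left.is_a?(Numeric)
  right = ternerize(right) unless right.is_a?(Numeric)
  "(p > #{p} ? #{right} : #{left})"
end

# Hand-built tree (an assumption for illustration): item 2 holds half the
# probability mass, so it sits directly under the root.
TREE = { p: 0.5, left: { p: 0.25, left: 0, right: 1 }, right: 2 }.freeze

SOURCE = ternerize(TREE)
# SOURCE is "(p > 0.5 ? 2 : (p > 0.25 ? 1 : 0))"

# Compile the string into a lambda, as the constructor does with eval.
PICKER = eval("->(p) { #{SOURCE} }")
```

With this tree, any random number above 0.5 resolves to index 2 after a single comparison, which is the optimization the README describes for items that account for 50% or more of the total.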
data/pick_me_too.gemspec ADDED
@@ -0,0 +1,32 @@
1
+ # frozen_string_literal: true
2
+
3
+ lib = File.expand_path('lib', __dir__)
4
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
5
+ require 'pick_me_too'
6
+
7
+ Gem::Specification.new do |s|
8
+ s.name = 'pick_me_too'
9
+ s.version = PickMeToo::VERSION
10
+ s.summary = 'Randomly select items from a list with specified frequencies'
11
+ s.description = <<-DESC.strip.gsub(/\s+/, ' ')
12
+ PickMeToo is a Ruby urn model. It allows you to generate
13
+ an "urn" from which you can randomly sample items with
14
+ specified frequencies. This facilitates modeling things that occur
15
+ with known frequencies, like weather or wandering monsters.
16
+ DESC
17
+ s.authors = ['David F. Houghton']
18
+ s.email = 'dfhoughton@gmail.com'
19
+ s.homepage =
20
+ 'https://github.com/dfhoughton/pick_me_too'
21
+ s.license = 'MIT'
22
+ s.required_ruby_version = '>= 2.5.3'
23
+ s.files = `git ls-files -z`.split("\x0")
24
+ s.test_files = s.files.grep(%r{^(test|spec|features)/})
25
+ s.require_paths = ['lib']
26
+
27
+ s.add_development_dependency 'bundler', '~> 1.7'
28
+ s.add_development_dependency 'byebug', '~> 9.1.0'
29
+ s.add_development_dependency 'json', '~> 2'
30
+ s.add_development_dependency 'minitest', '~> 5'
31
+ s.add_development_dependency 'rake', '~> 10.0'
32
+ end
data/test/basic_test.rb ADDED
@@ -0,0 +1,73 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'minitest/autorun'
4
+
5
+ require 'pick_me_too'
6
+ require 'byebug'
7
+
8
+ # :stopdoc:
9
+
10
+ class BasicTest < Minitest::Test
11
+ def test_basic
12
+ rnd = Random.new 1
13
+ picker = PickMeToo.new([['cat', 2], ['dog', 1]], -> { rnd.rand })
14
+ counter = Hash.new(0)
15
+ 3000.times { counter[picker.pick] += 1 }
16
+ assert_equal 2, (counter['cat'] / 1000.0).round, 'right number of cats'
17
+ assert_equal 1, (counter['dog'] / 1000.0).round, 'right number of dogs'
18
+ end
19
+
20
+ def test_hash
21
+ rnd = Random.new 1
22
+ picker = PickMeToo.new({'cat' => 2, 'dog' => 1}, -> { rnd.rand })
23
+ counter = Hash.new(0)
24
+ 3000.times { counter[picker.pick] += 1 }
25
+ assert_equal 2, (counter['cat'] / 1000.0).round, 'right number of cats'
26
+ assert_equal 1, (counter['dog'] / 1000.0).round, 'right number of dogs'
27
+ end
28
+
29
+ def test_bigger
30
+ rnd = Random.new 1
31
+ frequencies = [['cat', 1], ['dog', 2], ['horse', 3], ['camel', 4], ['lizard', 5], ['fish', 6]]
32
+ picker = PickMeToo.new(frequencies, -> { rnd.rand })
33
+ counter = Hash.new(0)
34
+ (frequencies.map(&:last).sum * 1000).times { counter[picker.pick] += 1 }
35
+ frequencies.each do |key, n|
36
+ assert_equal n, (counter[key] / 1000.0).round, "right number of #{key}"
37
+ end
38
+ end
39
+
40
+ def test_degenerate
41
+ rnd = Random.new 1
42
+ picker = PickMeToo.new([['cat', 2]], -> { rnd.rand })
43
+ counter = Hash.new(0)
44
+ 3000.times { counter[picker.pick] += 1 }
45
+ assert_equal 3.0, counter['cat'] / 1000.0, 'right number of cats'
46
+ end
47
+
48
+ def test_no_frequencies_error
49
+ assert_raises(PickMeToo::Error, 'no frequencies given') do
50
+ PickMeToo.new([])
51
+ end
52
+ end
53
+
54
+ def test_any_negative_frequency
55
+ assert_raises(PickMeToo::Error, "the following have non-positive frequencies: #{[['bar', -1]].inspect}") do
56
+ PickMeToo.new([['foo', 1], ['bar', -1]])
57
+ end
58
+ end
59
+
60
+ def test_all_tuples
61
+ assert_raises(PickMeToo::Error,
62
+ 'all frequencies must be two-member arrays the second member of which is Numeric') do
63
+ PickMeToo.new([['foo', nil, 1]])
64
+ end
65
+ end
66
+
67
+ def test_frequency_required
68
+ assert_raises(PickMeToo::Error,
69
+ 'all frequencies must be two-member arrays the second member of which is Numeric') do
70
+ PickMeToo.new([['foo', nil]])
71
+ end
72
+ end
73
+ end
data/test/documentation_test.rb ADDED
@@ -0,0 +1,26 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'minitest/autorun'
4
+
5
+ require 'pick_me_too'
6
+ require 'byebug'
7
+
8
+ # :stopdoc:
9
+
10
+ class BasicTest < Minitest::Test
11
+ def test_synopsis
12
+ rng = Random.new 1
13
+ picker = PickMeToo.new([['prevention', 1], ['cure', 16]], -> { rng.rand })
14
+ counter = Hash.new 0
15
+ 32.times { counter[picker.pick] += 1 }
16
+ assert_equal({ 'cure' => 29, 'prevention' => 3 }, counter)
17
+ end
18
+
19
+ def test_synopsis_hash
20
+ rng = Random.new 1
21
+ picker = PickMeToo.new({ foo: 1, bar: 2, baz: 0.5 }, -> { rng.rand })
22
+ counter = Hash.new 0
23
+ 32.times { counter[picker.pick] += 1 }
24
+ assert_equal({ foo: 13, bar: 12, baz: 7 }, counter)
25
+ end
26
+ end
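Note on the seeded tests above: they can assert exact counts because two `Random` instances created with the same seed produce the same sequence, so a picker driven by `-> { rng.rand }` makes the same choices on every run. A minimal check of that property, using only the Ruby standard library (`SEQ_A` and `SEQ_B` are names introduced here):

```ruby
# Two generators with the same seed yield identical sequences.
first = Random.new(1)
second = Random.new(1)

SEQ_A = Array.new(5) { first.rand }
SEQ_B = Array.new(5) { second.rand }

# Random#rand with no argument returns a Float in the half-open range [0, 1).
IN_RANGE = SEQ_A.all? { |x| x >= 0.0 && x < 1.0 }
```

The exact counts asserted in the tests depend on the generator's output for a given seed, so they are tied to the behavior of Ruby's default `Random` implementation.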
metadata ADDED
@@ -0,0 +1,125 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: pick_me_too
3
+ version: !ruby/object:Gem::Version
4
+ version: 1.0.0
5
+ platform: ruby
6
+ authors:
7
+ - David F. Houghton
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2022-08-14 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: bundler
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - "~>"
18
+ - !ruby/object:Gem::Version
19
+ version: '1.7'
20
+ type: :development
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - "~>"
25
+ - !ruby/object:Gem::Version
26
+ version: '1.7'
27
+ - !ruby/object:Gem::Dependency
28
+ name: byebug
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: 9.1.0
34
+ type: :development
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: 9.1.0
41
+ - !ruby/object:Gem::Dependency
42
+ name: json
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - "~>"
46
+ - !ruby/object:Gem::Version
47
+ version: '2'
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - "~>"
53
+ - !ruby/object:Gem::Version
54
+ version: '2'
55
+ - !ruby/object:Gem::Dependency
56
+ name: minitest
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - "~>"
60
+ - !ruby/object:Gem::Version
61
+ version: '5'
62
+ type: :development
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - "~>"
67
+ - !ruby/object:Gem::Version
68
+ version: '5'
69
+ - !ruby/object:Gem::Dependency
70
+ name: rake
71
+ requirement: !ruby/object:Gem::Requirement
72
+ requirements:
73
+ - - "~>"
74
+ - !ruby/object:Gem::Version
75
+ version: '10.0'
76
+ type: :development
77
+ prerelease: false
78
+ version_requirements: !ruby/object:Gem::Requirement
79
+ requirements:
80
+ - - "~>"
81
+ - !ruby/object:Gem::Version
82
+ version: '10.0'
83
+ description: PickMeToo is a Ruby urn model. It allows you to generate an "urn" from
84
+ which you can randomly sample items with specified frequencies. This facilitates
85
+ modeling things that occur with known frequencies, like weather or wandering monsters.
86
+ email: dfhoughton@gmail.com
87
+ executables: []
88
+ extensions: []
89
+ extra_rdoc_files: []
90
+ files:
91
+ - ".gitignore"
92
+ - CHANGES.md
93
+ - LICENSE
94
+ - README.md
95
+ - Rakefile
96
+ - lib/pick_me_too.rb
97
+ - pick_me_too.gemspec
98
+ - test/basic_test.rb
99
+ - test/documentation_test.rb
100
+ homepage: https://github.com/dfhoughton/pick_me_too
101
+ licenses:
102
+ - MIT
103
+ metadata: {}
104
+ post_install_message:
105
+ rdoc_options: []
106
+ require_paths:
107
+ - lib
108
+ required_ruby_version: !ruby/object:Gem::Requirement
109
+ requirements:
110
+ - - ">="
111
+ - !ruby/object:Gem::Version
112
+ version: 2.5.3
113
+ required_rubygems_version: !ruby/object:Gem::Requirement
114
+ requirements:
115
+ - - ">="
116
+ - !ruby/object:Gem::Version
117
+ version: '0'
118
+ requirements: []
119
+ rubygems_version: 3.3.7
120
+ signing_key:
121
+ specification_version: 4
122
+ summary: Randomly select items from a list with specified frequencies
123
+ test_files:
124
+ - test/basic_test.rb
125
+ - test/documentation_test.rb