segment_tree 0.0.1

data/.gitignore ADDED
@@ -0,0 +1,17 @@
1
+ *.gem
2
+ *.rbc
3
+ .bundle
4
+ .config
5
+ .yardoc
6
+ Gemfile.lock
7
+ InstalledFiles
8
+ _yardoc
9
+ coverage
10
+ doc/
11
+ lib/bundler/man
12
+ pkg
13
+ rdoc
14
+ spec/reports
15
+ test/tmp
16
+ test/version_tmp
17
+ tmp
data/.rspec ADDED
@@ -0,0 +1,2 @@
1
+ --color
2
+ --format documentation
data/Gemfile ADDED
@@ -0,0 +1,6 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in segment_tree.gemspec
4
+ gemspec
5
+
6
+ gem "simplecov", "~> 0.6.4"
data/LICENSE ADDED
@@ -0,0 +1,22 @@
1
+ Copyright (c) 2012 Alexei Mikhailov
2
+
3
+ MIT License
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining
6
+ a copy of this software and associated documentation files (the
7
+ "Software"), to deal in the Software without restriction, including
8
+ without limitation the rights to use, copy, modify, merge, publish,
9
+ distribute, sublicense, and/or sell copies of the Software, and to
10
+ permit persons to whom the Software is furnished to do so, subject to
11
+ the following conditions:
12
+
13
+ The above copyright notice and this permission notice shall be
14
+ included in all copies or substantial portions of the Software.
15
+
16
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
20
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
22
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,94 @@
1
+ # SegmentTree
2
+
3
+ A Ruby implementation of the [segment tree](http://en.wikipedia.org/wiki/Segment_tree) data structure.
4
+ A segment tree is a tree data structure for storing intervals, or segments. It allows querying which of the stored segments contain a given point. It is, in principle, a static structure; that is, its content cannot be modified once the structure is built.
5
+
6
+ A segment tree requires <tt>O(n log n)</tt> storage.
7
+ A query takes <tt>O(log n + k)</tt> time, where <tt>k</tt> is the number of reported intervals.
8
+
9
+ Queries stay fast on trees with roughly 10 million segments, though building a tree that large takes a long time.
10
+
11
+ ## Installation
12
+
13
+ Add this line to your application's Gemfile:
14
+
15
+ gem 'segment_tree'
16
+
17
+ And then execute:
18
+
19
+ $ bundle
20
+
21
+ Or install it yourself as:
22
+
23
+ $ gem install segment_tree
24
+
25
+ ## Usage
26
+
27
+ A segment tree consists of segments (<tt>Range</tt> objects in Ruby) and their corresponding values. The easiest way to build a segment tree is from a hash whose keys are the segments:
28
+ ```ruby
29
+ tree = SegmentTree.new(1..10 => "a", 11..20 => "b", 21..30 => "c") # => #<SegmentTree:0xa47eadc @root=#<SegmentTree::Container:0x523f3b6 @range=1..30>>
30
+ ```
31
+
32
+ After that you can query the tree for the segments that contain a given point:
33
+ ```ruby
34
+ tree.find(5) # => [#<SegmentTree::Segment:0xa47ea8c @range=1..10, @value="a">]
35
+ ```
36
+
37
+ Or fetch only one segment:
38
+ ```ruby
39
+ segment = tree.find_first(5) # => #<SegmentTree::Segment:0xa47ea8c @range=1..10, @value="a">
40
+ segment.value # => "a"
41
+ ```
42
+
43
+ ## Real world example
44
+
45
+ A segment tree can be used in applications that need IP-address geocoding.
46
+
47
+ ```ruby
48
+ data = [
49
+ [IPAddr.new('87.224.241.0/24').to_range, {:city => "YEKT"}],
50
+ [IPAddr.new('195.58.18.0/24').to_range, {:city => "MSK"}]
51
+ # and so on
52
+ ]
53
+ ip_tree = SegmentTree.new(data)
54
+
55
+ client_ip = IPAddr.new("87.224.241.66")
56
+ ip_tree.find_first(client_ip).value # => {:city=>"YEKT"}
57
+ ```
58
+
59
+ ## Contributing
60
+
61
+ 1. Fork it
62
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
63
+ 3. Commit your changes (`git commit -am 'Added some feature'`)
64
+ 4. Push to the branch (`git push origin my-new-feature`)
65
+ 5. Create new Pull Request
66
+
67
+ ## TODO
68
+ 1. Fix README typos and grammatical errors (English-speaking contributors are welcome).
69
+ 2. Implement C binding for MRI.
70
+ 3. Test on different versions of Ruby.
71
+
72
+ ## LICENSE
73
+ Copyright (c) 2012 Alexei Mikhailov
74
+
75
+ MIT License
76
+
77
+ Permission is hereby granted, free of charge, to any person obtaining
78
+ a copy of this software and associated documentation files (the
79
+ "Software"), to deal in the Software without restriction, including
80
+ without limitation the rights to use, copy, modify, merge, publish,
81
+ distribute, sublicense, and/or sell copies of the Software, and to
82
+ permit persons to whom the Software is furnished to do so, subject to
83
+ the following conditions:
84
+
85
+ The above copyright notice and this permission notice shall be
86
+ included in all copies or substantial portions of the Software.
87
+
88
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
89
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
90
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
91
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
92
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
93
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
94
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
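The geocoding example in the README works because `IPAddr#to_range` returns a `Range` whose endpoints are `IPAddr` objects, and `Range#cover?` compares the probe against those endpoints. A minimal standard-library sketch of that containment check, independent of the gem (the variable names are illustrative only):

```ruby
require "ipaddr"

# A /24 network expressed as a Range of IPAddr endpoints.
network = IPAddr.new("87.224.241.0/24").to_range

client  = IPAddr.new("87.224.241.66") # inside the /24
outside = IPAddr.new("195.58.18.1")   # belongs to a different network

inside = network.cover?(client)
missed = network.cover?(outside)

puts "client in network:  #{inside}"
puts "outside in network: #{missed}"
```

Because the comparison only touches the two endpoints, the check does not enumerate the addresses in the network.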
data/Rakefile ADDED
@@ -0,0 +1,2 @@
1
+ #!/usr/bin/env rake
2
+ require "bundler/gem_tasks"
data/benchmark/benchmark.rb ADDED
@@ -0,0 +1,47 @@
1
+ #!/usr/bin/env ruby
2
+ require "bundler/setup"
3
+ require "benchmark"
4
+ require "segment_tree"
5
+
6
+ # Generate a tree with +n+ intervals
7
+ def tree(n)
8
+ SegmentTree.new list(n)
9
+ end
10
+ def list(n)
11
+ (0...n).map { |num| [(num * 10)..((num + 1) * 10 - 1), num] }
12
+ end
13
+
14
+ puts "Pregenerating data..."
15
+ tests = [100, 1000, 10_000, 100_000, 1_000_000]
16
+
17
+ lists = Hash[tests.map { |n| [n, list(n)] }]
18
+ trees = Hash[tests.map { |n| [n, tree(n)] }]
19
+
20
+ puts "Done"
21
+ puts
22
+
23
+ puts "Building a tree of N intervals"
24
+ Benchmark.bmbm do |x|
25
+ tests.each do |n|
26
+ x.report(n.to_s) { tree(n) }
27
+ end
28
+ end
29
+
30
+ puts "Finding matching interval in tree of N intervals"
31
+ Benchmark.bmbm do |x|
32
+ tests.each do |n|
33
+ t = trees[n]
34
+
35
+ x.report(n.to_s) { t.find_first(rand(n)) }
36
+ end
37
+ end
38
+
39
+ puts
40
+ puts "Finding matching interval in list of N intervals"
41
+ Benchmark.bmbm do |x|
42
+ tests.each do |n|
43
+ data = lists[n]
44
+
45
+ x.report(n.to_s) { data.find { |range, _| range.cover?(rand(n)) } }
46
+ end
47
+ end
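The list benchmark above draws its probe with `rand(n)` inside the timed block, so every probe falls within `0...n` while the generated intervals extend to roughly `10 * n`. A hedged sketch of one alternative setup, not the author's benchmark: the probes are pre-generated across the whole covered span and timed in one batch with `Benchmark.realtime`. The `list` helper here is modeled on the one above but builds exactly `n` intervals, and the seed and sizes are arbitrary assumptions.

```ruby
require "benchmark"

# Build n adjacent intervals: [0..9, 0], [10..19, 1], ...
def list(n)
  (0...n).map { |num| [(num * 10)..((num + 1) * 10 - 1), num] }
end

n = 1_000
data = list(n)

# Pre-generate probes over the full covered span 0...(n * 10),
# so random number generation is not part of the measurement.
rng = Random.new(42)
points = Array.new(500) { rng.rand(n * 10) }

hits = nil
elapsed = Benchmark.realtime do
  # Linear scan: the first [range, value] pair whose range covers x.
  hits = points.map { |x| data.find { |range, _| range.cover?(x) } }
end

puts format("linear scan: %d lookups in %.4f s", hits.length, elapsed)
```

Spreading the probes over the full span matters most for the linear scan, whose cost depends on how far into the list the matching interval sits.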
data/lib/segment_tree.rb ADDED
@@ -0,0 +1,104 @@
1
+ require "forwardable"
2
+ require "segment_tree/version"
3
+
4
+ class SegmentTree
5
+ # An abstract tree node
6
+ class Node #:nodoc:all:
7
+ extend Forwardable
8
+ def_delegators :@range, :cover?, :begin, :end
9
+ end
10
+
11
+ # A container for elementary intervals or other nodes
12
+ class Container < Node #:nodoc:all:
13
+ extend Forwardable
14
+
15
+ attr_reader :left, :right
16
+
17
+ # Node constructor, accepts both +Node+ and +Segment+
18
+ def initialize(left, right)
19
+ @left, @right = left, right
20
+
21
+ @range = left.begin..[left.end, (right || left).end].max
22
+ end
23
+
24
+ # Find all intervals containing point +x+ within node's children. Returns array
25
+ def find(x)
26
+ [@left, @right].compact.
27
+ select { |node| node.cover?(x) }.
28
+ map { |node| node.find(x) }.
29
+ flatten
30
+ end
31
+
32
+ # Find first interval containing point +x+ within node's children
33
+ def find_first(x)
34
+ subset = [@left, @right].compact.find { |node| node.cover?(x) }
35
+ subset && subset.find_first(x)
36
+ end
37
+
38
+ # Do not expose left and right, otherwise the output would be too long for large trees
39
+ def inspect
40
+ "#<#{self.class.name}:0x#{object_id.to_s(16)} @range=#{@range.inspect}>"
41
+ end
42
+ end
43
+
44
+ # An elementary interval
45
+ class Segment < Node #:nodoc:all:
46
+ attr_reader :value
47
+
48
+ def initialize(range, value)
49
+ raise ArgumentError, 'Range expected, %s given' % range.class.name unless range.is_a?(Range)
50
+
51
+ @range, @value = range, value
52
+ end
53
+
54
+ def find(x)
55
+ [find_first(x)].compact
56
+ end
57
+
58
+ def find_first(x)
59
+ cover?(x) ? self : nil
60
+ end
61
+ end
62
+
63
+ # Build a segment tree from +data+.
64
+ #
65
+ # Data can be one of the following:
66
+ # 1. Hash - a hash, where ranges are keys,
67
+ # i.e. <code>{(0..3) => some_value1, (4..6) => some_value2, ...}</code>
68
+ # 2. 2-dimensional array - an array of pairs, where the first element of
68
+ # each pair is a range and the second is its value:
70
+ # <code>[[(0..3), some_value1], [(4..6), some_value2] ...]</code>
71
+ def initialize(data)
72
+ # build elementary segments
73
+ nodes = case data
74
+ when Hash, Array, Enumerable then
75
+ data.collect { |range, value| Segment.new(range, value) }
76
+ else raise ArgumentError, '2-dim Array or Hash expected'
77
+ end.sort! do |x, y|
78
+ # intervals are sorted from left to right, from shortest to longest
79
+ x.begin == y.begin ?
80
+ x.end <=> y.end :
81
+ x.begin <=> y.begin
82
+ end
83
+
84
+ # now build binary tree
85
+ while nodes.length > 1
86
+ nodes = nodes.each_slice(2).collect { |left, right| Container.new(left, right) }
87
+ end
88
+
89
+ # root node is first node or nil when tree is empty
90
+ @root = nodes.first
91
+ end
92
+
93
+ # Find all intervals containing point +x+
94
+ # @return [Array]
95
+ def find(x)
96
+ @root ? @root.find(x) : []
97
+ end
98
+
99
+ # Find first interval containing point +x+.
100
+ # @return [Segment|NilClass]
101
+ def find_first(x)
102
+ @root && @root.find_first(x)
103
+ end
104
+ end
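The constructor above sorts the elementary segments and then pairs neighbours with `each_slice(2)` until a single root remains. A simplified, self-contained sketch of the same bottom-up idea follows. It is not the gem's code: `MiniNode`, `build`, and `stab` are hypothetical names, values are omitted, and each parent range is taken to end at the larger of its children's ends, which is an assumption made here so that nested intervals stay reachable.

```ruby
# Hypothetical stand-in for the gem's node classes; leaves have no children.
MiniNode = Struct.new(:range, :left, :right)

# Build a binary tree bottom-up from an array of Ranges.
def build(ranges)
  nodes = ranges.sort_by { |r| [r.begin, r.end] }
                .map { |r| MiniNode.new(r, nil, nil) }

  while nodes.length > 1
    # Pair neighbours; the last node is left unpaired when the count is odd.
    nodes = nodes.each_slice(2).map do |left, right|
      last = [left.range.end, (right || left).range.end].max
      MiniNode.new(left.range.begin..last, left, right)
    end
  end

  nodes.first # nil for empty input
end

# Collect every stored range that contains point x.
def stab(node, x)
  return [] if node.nil? || !node.range.cover?(x)
  return [node.range] if node.left.nil? # leaf

  stab(node.left, x) + stab(node.right, x)
end

root = build([20..32, 0..12, 10..22])
p stab(root, 11) # => [0..12, 10..22]
p stab(root, 40) # => []
```

A subtree is skipped as soon as its covering range misses the point, which is what keeps a lookup from visiting every leaf.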
data/lib/segment_tree/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ class SegmentTree
2
+ VERSION = "0.0.1"
3
+ end
data/segment_tree.gemspec ADDED
@@ -0,0 +1,19 @@
1
+ # -*- encoding: utf-8 -*-
2
+ require File.expand_path('../lib/segment_tree/version', __FILE__)
3
+
4
+ Gem::Specification.new do |gem|
5
+ gem.author = "Alexei Mikhailov"
6
+ gem.email = "amikhailov83@gmail.com"
7
+ gem.description = %q{Tree data structure for storing segments. It allows querying which of the stored segments contain a given point.}
8
+ gem.summary = %q{Tree data structure for storing segments. It allows querying which of the stored segments contain a given point.}
9
+ gem.homepage = "https://github.com/take-five/segment_tree"
10
+
11
+ gem.files = `git ls-files`.split($\)
12
+ gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
13
+ gem.name = "segment_tree"
14
+ gem.require_paths = %W(lib)
15
+ gem.version = SegmentTree::VERSION
16
+
17
+ gem.add_development_dependency "bundler", ">= 1.0"
18
+ gem.add_development_dependency "rspec", ">= 2.11"
19
+ end
data/spec/segment_tree_spec.rb ADDED
@@ -0,0 +1,169 @@
1
+ require "spec_helper"
2
+ require "segment_tree"
3
+
4
+ describe SegmentTree do
5
+ # some fixtures
6
+ # [[0..9, "a"], [10..19, "b"], ..., [90..99, "j"]] - spanned intervals
7
+ let(:sample_spanned) { (0..9).zip("a".."j").map { |num, letter| [(num * 10)..(num + 1) * 10 - 1, letter] } }
8
+ # [[0..12, "a"], [10..22, "b"], ..., [90..102, "j"]] - partially overlapping intervals
9
+ let(:sample_overlapping) { (0..9).zip("a".."j").map { |num, letter| [(num * 10)..(num + 1) * 10 + 2, letter] } }
10
+ # [[0..5, "a"], [10..15, "b"], ..., [90..95, "j"]] - sparse intervals
11
+ let(:sample_sparsed) { (0..9).zip("a".."j").map { |num, letter| [(num * 10)..(num + 1) * 10 - 5, letter] } }
12
+
13
+ describe ".new" do
14
+ context "given a hash with ranges as keys" do
15
+ let :data do
16
+ {0..3 => "a",
17
+ 4..6 => "b",
18
+ 7..9 => "c",
19
+ 10..12 => "d"}
20
+ end
21
+
22
+ subject(:tree) { SegmentTree.new(data) }
23
+
24
+ it { should be_a SegmentTree }
25
+
26
+ it "should have a root" do
27
+ root = tree.instance_variable_get :@root
28
+ root.should be_a SegmentTree::Container
29
+ end
30
+ end
31
+
32
+ context "given an array of arrays" do
33
+ let :data do
34
+ [[0..3, "a"],
35
+ [4..6, "b"],
36
+ [7..9, "c"],
37
+ [10..12, "d"]]
38
+ end
39
+
40
+ subject(:tree) { SegmentTree.new(data) }
41
+
42
+ it { should be_a SegmentTree }
43
+
44
+ it "should have a root" do
45
+ root = tree.instance_variable_get :@root
46
+ root.should be_a SegmentTree::Container
47
+ end
48
+ end
49
+
50
+ context "given neither hash nor array" do
51
+ it { expect{ SegmentTree.new(Object.new) }.to raise_error(ArgumentError) }
52
+ end
53
+
54
+ context "given 1-dimensional array" do
55
+ let :data do
56
+ [0..3, "a",
57
+ 4..6, "b",
58
+ 7..9, "c",
59
+ 10..12, "d"]
60
+ end
61
+
62
+ it { expect{ SegmentTree.new(data) }.to raise_error(ArgumentError) }
63
+ end
64
+ end
65
+
66
+ describe "#find" do
67
+ context "given spanned intervals" do
68
+ let(:tree) { SegmentTree.new(sample_spanned) }
69
+
70
+ context "and looking up an existing point" do
71
+ subject { tree.find 12 }
72
+
73
+ it { should be_a Array }
74
+ it { should have_exactly(1).item }
75
+ its(:first) { should be_a SegmentTree::Segment }
76
+ its('first.value') { should eq 'b' }
77
+ end
78
+
79
+ context "and looking up a non-existent point" do
80
+ subject { tree.find 101 }
81
+
82
+ it { should be_a Array }
83
+ it { should be_empty }
84
+ end
85
+ end
86
+
87
+ context "given partially overlapping intervals" do
88
+ let(:tree) { SegmentTree.new(sample_overlapping) }
89
+
90
+ context "and looking up an existing point" do
91
+ subject { tree.find 11 }
92
+
93
+ it { should be_a Array }
94
+ it { should have_exactly(2).item }
95
+ its(:first) { should be_a SegmentTree::Segment }
96
+ its('first.value') { should eq 'a' }
97
+ its(:last) { should be_a SegmentTree::Segment }
98
+ its('last.value') { should eq 'b' }
99
+ end
100
+ end
101
+
102
+ context "given sparse intervals" do
103
+ let(:tree) { SegmentTree.new(sample_sparsed) }
104
+
105
+ context "and looking up an existing point" do
106
+ subject { tree.find 12 }
107
+
108
+ it { should be_a Array }
109
+ it { should have_exactly(1).item }
110
+ its(:first) { should be_a SegmentTree::Segment }
111
+ its('first.value') { should eq 'b' }
112
+ end
113
+
114
+ context "and looking up a non-existent point" do
115
+ subject { tree.find 8 }
116
+
117
+ it { should be_a Array }
118
+ it { should be_empty }
119
+ end
120
+ end
121
+ end
122
+
123
+ describe "#find_first" do
124
+ context "given spanned intervals" do
125
+ let(:tree) { SegmentTree.new(sample_spanned) }
126
+
127
+ context "and looking up an existing point" do
128
+ subject { tree.find_first 12 }
129
+
130
+ it { should be_a SegmentTree::Segment }
131
+ its(:value) { should eq 'b' }
132
+ end
133
+
134
+ context "and looking up a non-existent point" do
135
+ subject { tree.find_first 101 }
136
+
137
+ it { should be_nil }
138
+ end
139
+ end
140
+
141
+ context "given partially overlapping intervals" do
142
+ let(:tree) { SegmentTree.new(sample_overlapping) }
143
+
144
+ context "and looking up an existing point" do
145
+ subject { tree.find_first 11 }
146
+
147
+ it { should be_a SegmentTree::Segment }
148
+ its(:value) { should eq 'a' }
149
+ end
150
+ end
151
+
152
+ context "given sparse intervals" do
153
+ let(:tree) { SegmentTree.new(sample_sparsed) }
154
+
155
+ context "and looking up an existing point" do
156
+ subject { tree.find_first 12 }
157
+
158
+ it { should be_a SegmentTree::Segment }
159
+ its(:value) { should eq 'b' }
160
+ end
161
+
162
+ context "and looking up a non-existent point" do
163
+ subject { tree.find_first 8 }
164
+
165
+ it { should be_nil }
166
+ end
167
+ end
168
+ end
169
+ end
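The `.new` specs above pass a hash, an array of pairs, and a flat array to the constructor. The following sketch illustrates, with plain Ruby and no gem code, how a two-parameter block destructures each of those inputs; `pairs` is a hypothetical helper. In the flat case the second element is a `String` bound to the first block parameter, which is consistent with the spec expecting the constructor's `Range` check to raise `ArgumentError`.

```ruby
# Hypothetical helper: shows what a |range, value| block receives per element.
def pairs(data)
  data.collect { |range, value| [range, value] }
end

from_hash  = pairs({ (0..3) => "a", (4..6) => "b" })
from_array = pairs([[0..3, "a"], [4..6, "b"]])
from_flat  = pairs([0..3, "a"])

p from_hash  # => [[0..3, "a"], [4..6, "b"]]
p from_array # => [[0..3, "a"], [4..6, "b"]]
p from_flat  # => [[0..3, nil], ["a", nil]]
```

A hash and an array of pairs yield the same sequence, while a flat array yields elements that are not pairs, so the value slot is `nil` and non-range elements reach the range slot.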
data/spec/spec_helper.rb ADDED
@@ -0,0 +1,12 @@
1
+ require "bundler/setup"
2
+ require "simplecov"
3
+
4
+ RSpec.configure do |config|
5
+ # Run specs in random order to surface order dependencies. If you find an
6
+ # order dependency and want to debug it, you can fix the order by providing
7
+ # the seed, which is printed after each run.
8
+ # --seed 1234
9
+ config.order = 'random'
10
+ end
11
+
12
+ SimpleCov.start
metadata ADDED
@@ -0,0 +1,82 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: segment_tree
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.0.1
5
+ prerelease:
6
+ platform: ruby
7
+ authors:
8
+ - Alexei Mikhailov
9
+ autorequire:
10
+ bindir: bin
11
+ cert_chain: []
12
+ date: 2012-08-01 00:00:00.000000000Z
13
+ dependencies:
14
+ - !ruby/object:Gem::Dependency
15
+ name: bundler
16
+ requirement: &84177160 !ruby/object:Gem::Requirement
17
+ none: false
18
+ requirements:
19
+ - - ! '>='
20
+ - !ruby/object:Gem::Version
21
+ version: '1.0'
22
+ type: :development
23
+ prerelease: false
24
+ version_requirements: *84177160
25
+ - !ruby/object:Gem::Dependency
26
+ name: rspec
27
+ requirement: &84176910 !ruby/object:Gem::Requirement
28
+ none: false
29
+ requirements:
30
+ - - ! '>='
31
+ - !ruby/object:Gem::Version
32
+ version: '2.11'
33
+ type: :development
34
+ prerelease: false
35
+ version_requirements: *84176910
36
+ description: Tree data structure for storing segments. It allows querying which of
37
+ the stored segments contain a given point.
38
+ email: amikhailov83@gmail.com
39
+ executables: []
40
+ extensions: []
41
+ extra_rdoc_files: []
42
+ files:
43
+ - .gitignore
44
+ - .rspec
45
+ - Gemfile
46
+ - LICENSE
47
+ - README.md
48
+ - Rakefile
49
+ - benchmark/benchmark.rb
50
+ - lib/segment_tree.rb
51
+ - lib/segment_tree/version.rb
52
+ - segment_tree.gemspec
53
+ - spec/segment_tree_spec.rb
54
+ - spec/spec_helper.rb
55
+ homepage: https://github.com/take-five/segment_tree
56
+ licenses: []
57
+ post_install_message:
58
+ rdoc_options: []
59
+ require_paths:
60
+ - lib
61
+ required_ruby_version: !ruby/object:Gem::Requirement
62
+ none: false
63
+ requirements:
64
+ - - ! '>='
65
+ - !ruby/object:Gem::Version
66
+ version: '0'
67
+ required_rubygems_version: !ruby/object:Gem::Requirement
68
+ none: false
69
+ requirements:
70
+ - - ! '>='
71
+ - !ruby/object:Gem::Version
72
+ version: '0'
73
+ requirements: []
74
+ rubyforge_project:
75
+ rubygems_version: 1.8.17
76
+ signing_key:
77
+ specification_version: 3
78
+ summary: Tree data structure for storing segments. It allows querying which of the
79
+ stored segments contain a given point.
80
+ test_files:
81
+ - spec/segment_tree_spec.rb
82
+ - spec/spec_helper.rb