array_trie 0.1.0
- checksums.yaml +7 -0
- data/.gitignore +9 -0
- data/.rspec +1 -0
- data/Gemfile +6 -0
- data/README.md +35 -0
- data/Rakefile +2 -0
- data/array_trie.gemspec +32 -0
- data/bin/console +14 -0
- data/bin/setup +8 -0
- data/bin/test +6 -0
- data/lib/array_trie.rb +198 -0
- data/lib/array_trie/prefix_trie.rb +130 -0
- data/lib/array_trie/version.rb +3 -0
- metadata +100 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA1:
+  metadata.gz: c5ee196913bc37a7c274bc7d36689a40d19838a6
+  data.tar.gz: b4a0dc46652a9964f29db20def6b0d648407c933
+SHA512:
+  metadata.gz: 4694fea10a3c317b20e026deb9ff1dd57fe6640b038b6d79622dc8ee8cc9713f87a6b79555627a01cd8c3df104a357826428f01ff45a2b534fee63ec96cd2b8e
+  data.tar.gz: 0258d42deceb72fc9c0df2c97c4b3212b0573291359554040be3775aa48a170a61513d04a62b2f6c42d868c3d75dded782fbe2a35d02e47a51a87aa5c22c6da4
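The digests listed in checksums.yaml can be compared against local copies of the archive members. The following is a minimal sketch, assuming `data.tar.gz` has already been extracted from the `.gem` file (which is a tar archive) into the current directory. The helper name `sha512_matches?` and the local path are hypothetical and not part of the gem.

```ruby
require "digest"

# Hypothetical helper: compare a file's SHA-512 digest with an expected hex string.
def sha512_matches?(path, expected_hex)
  Digest::SHA512.file(path).hexdigest == expected_hex
end

# Assumed local path; the check only runs when the file is actually present.
path = "data.tar.gz"
expected = "0258d42deceb72fc9c0df2c97c4b3212b0573291359554040be3775aa48a170a" \
           "61513d04a62b2f6c42d868c3d75dded782fbe2a35d02e47a51a87aa5c22c6da4"
puts sha512_matches?(path, expected) if File.exist?(path)
```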
data/.gitignore
ADDED
data/.rspec
ADDED
@@ -0,0 +1 @@
+--require spec_helper
data/Gemfile
ADDED
data/README.md
ADDED
@@ -0,0 +1,35 @@
+# ArrayTrie
+
+Welcome to your new gem! In this directory, you'll find the files you need to be able to package up your Ruby library into a gem. Put your Ruby code in the file `lib/array_trie`. To experiment with that code, run `bin/console` for an interactive prompt.
+
+TODO: Delete this and the text above, and describe your gem
+
+## Installation
+
+Add this line to your application's Gemfile:
+
+```ruby
+gem 'array_trie'
+```
+
+And then execute:
+
+    $ bundle
+
+Or install it yourself as:
+
+    $ gem install array_trie
+
+## Usage
+
+TODO: Write usage instructions here
+
+## Development
+
+After checking out the repo, run `bin/setup` to install dependencies. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
+
+To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
+
+## Contributing
+
+Bug reports and pull requests are welcome on GitHub at https://github.com/justjake/array_trie.
data/Rakefile
ADDED
data/array_trie.gemspec
ADDED
@@ -0,0 +1,32 @@
+# coding: utf-8
+lib = File.expand_path("../lib", __FILE__)
+$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
+require "array_trie/version"
+
+Gem::Specification.new do |spec|
+  spec.name = "array_trie"
+  spec.version = ArrayTrie::VERSION
+  spec.authors = ["Jake Teton-Landis"]
+  spec.email = ["jake.tl@airbnb.com"]
+
+  spec.summary = <<-EOS
+Trie-like, prefix-tree data structures that maps from ordered keys to values.
+  EOS
+  spec.description = <<-EOS
+Trie-like, prefix-tree data structures. First, a prefix-tree based on Arrays, which differs from a traditional trie, which maps strings to values. Second, a more general prefix-tree data structure that works for any type of keys, provided those keys can be transformed to and from an array.
+
+Both of these data structures are implemented in terms of hashes.
+  EOS
+  spec.homepage = "https://github.com/justjake/array-trie"
+
+  spec.files = `git ls-files -z`.split("\x0").reject do |f|
+    f.match(%r{^(test|spec|features)/})
+  end
+  spec.bindir = "exe"
+  spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
+  spec.require_paths = ["lib"]
+
+  spec.add_development_dependency "bundler", "~> 1.15"
+  spec.add_development_dependency "rake", "~> 10.0"
+  spec.add_development_dependency "rspec", "~> 3.7"
+end
data/bin/console
ADDED
@@ -0,0 +1,14 @@
+#!/usr/bin/env ruby
+
+require "bundler/setup"
+require "array_trie"
+
+# You can add fixtures and/or initialization code here to make experimenting
+# with your gem easier. You can also use a different console, if you like.
+
+# (If you use this, don't forget to add pry to your Gemfile!)
+# require "pry"
+# Pry.start
+
+require "irb"
+IRB.start(__FILE__)
data/bin/setup
ADDED
data/bin/test
ADDED
data/lib/array_trie.rb
ADDED
@@ -0,0 +1,198 @@
+# ArrayTrie is a trie-like, prefix-tree data structure that maps from arrays to
+# values. This differs from a traditional trie, which maps strings to values.
+#
+# ArrayTrie is implemented in terms of Ruby hashes, so members of your array
+# keys must behave by Ruby's hash contracts:
+# https://ruby-doc.org/core-2.3.1/Hash.html#class-Hash-label-Hash+Keys
+#
+# If you wish to construct a prefix tree with non-Array keys, please see
+# {ArrayTrie::PrefixTrie}, which can map arbitrary keys to values, so long as
+# your keys can be converted to and from arrays.
+class ArrayTrie
+  # Just a unique marker value.
+  # Using Class.new is better than using Object.new, because it makes sense
+  # when inspected.
+  #
+  # @api private
+  STOP = Class.new
+
+  def initialize(root = {})
+    @root = root
+  end
+
+  # Retrieve a value for the given key
+  #
+  # @param parts [Array]
+  # @return [Any] the previously stored value
+  # @return [nil] if the given key was not found
+  def [](parts)
+    last_node, remaining = traverse(@root, parts)
+    return nil unless remaining.empty?
+    last_node[STOP]
+  end
+
+  # Set a key to a value
+  #
+  # @param parts [Array] the key
+  # @param value [Any] the value
+  def []=(parts, value)
+    last_node, * = traverse(@root, parts, true)
+    last_node[STOP] = value
+  end
+
+  # insert a subtrie into this trie.
+  #
+  # @param parts [Array]
+  # @param trie [ArrayTrie]
+  # @return self
+  def insert_subtrie(parts, trie)
+    raise ArgumentError.new("trie must be a trie") unless trie.is_a? self.class
+    raise ArgumentError.new("cannot insert a subtrie at the root") if parts.empty?
+    parent, * = traverse(@root, parts[0...-1], true)
+    parent[parts.last] = trie.root
+    self
+  end
+
+  # Retrieve a view into this trie at the given key. Underlying data storage is
+  # shared with the subtrie.
+  #
+  # @param parts [Array]
+  # @return [ArrayTrie]
+  def subtrie(parts)
+    last_node, remaining = traverse(@root, parts)
+    return nil unless remaining.empty?
+    self.class.new(last_node)
+  end
+
+  # Returns true if this array is a key in this trie.
+  #
+  # @param parts [Array]
+  # @return [Boolean]
+  def include?(parts)
+    last_node, remaining = traverse(@root, parts)
+    return false unless remaining.empty?
+    last_node.key?(STOP)
+  end
+
+  # Returns true if this array is a prefix of an array in this trie
+  #
+  # @param parts [Array]
+  # @return [Boolean]
+  def include_prefix?(parts)
+    _, remaining = traverse(@root, parts)
+    remaining.empty?
+  end
+
+  # @return [Integer] Number of keys under the given prefix
+  def count_prefix(parts)
+    trie = subtrie(parts)
+    trie ? trie.count : 0
+  end
+
+  # @return [Enumerator] a depth-first enumerator
+  def depth_first
+    enum = depth_first_enumerator(@root)
+    return enum unless block_given?
+    enum.each { |path, value| yield(path, value) }
+  end
+
+  # @return [Enumerator] a breadth-first enumerator
+  def breadth_first
+    enum = breadth_first_enumerator(@root)
+    return enum unless block_given?
+    enum.each { |path, value| yield(path, value) }
+  end
+
+  # Count the number of key-value pairs in this trie.
+  #
+  # @return [Integer]
+  def count
+    breadth_first.count
+  end
+
+  protected
+
+  attr_reader :root
+
+  private
+
+  def depth_first_enumerator(node, current_path = [])
+    ::Enumerator.new do |y|
+      depth_first_scan(node, current_path) { |path, value| y.yield(path, value) }
+    end
+  end
+
+  def breadth_first_enumerator(node, start_path = [])
+    ::Enumerator.new do |y|
+      breadth_first_scan(node, start_path) { |path, value| y.yield(path, value) }
+    end
+  end
+
+  # recursive version was just too much easier
+  def depth_first_scan(current_node, current_path = [], &block)
+    if current_node.key?(STOP)
+      yield(current_path, current_node[STOP])
+    end
+
+    current_node.each do |key, value|
+      # already handled
+      next if key == STOP
+
+      # recurse
+      depth_first_scan(value, current_path + [key], &block)
+    end
+  end
+
+  def breadth_first_scan(node, start_path = [])
+    raise ::ArgumentError.new('block required') unless block_given?
+
+    queue = [ [node, start_path] ]
+    loop do
+      break if queue.empty?
+      current_node, current_path = queue.shift
+
+      if current_node.key?(STOP)
+        yield(current_path, current_node[STOP])
+      end
+
+      current_node.each do |key, value|
+        next if key == STOP
+        queue << [value, current_path + [key]]
+      end
+    end
+  end
+
+  # traverse from node `start`, to the node at path `parts`
+  #
+  # traversals are best-effort, and return the furthest node they can
+  # reach, and the remaining parts of the traversal path.
+  def traverse(start, parts, inserting = false)
+    assert_is_array!(parts)
+
+    if parts.empty?
+      return [start, []]
+    end
+
+    current_node = start
+    parts.each_with_index do |part, index|
+      if inserting
+        next_node = current_node[part] ||= {}
+      else
+        next_node = current_node[part]
+        return [current_node, parts[index..-1]] unless next_node
+      end
+
+      if index == parts.length - 1
+        return [next_node, []]
+      end
+
+      current_node = next_node
+    end
+  end
+
+  def assert_is_array!(parts)
+    unless parts.is_a?(::Array)
+      raise ::ArgumentError.new("key must be an array, instead is #{parts.inspect}")
+    end
+  end
+end
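The nested-hash layout that lib/array_trie.rb relies on (one hash per node, with a sentinel key holding the node's value) can be illustrated with a condensed, standalone sketch. `MiniArrayTrie` and its private `find` helper are hypothetical names used only for illustration. This is not the gem's code and it omits subtries, enumeration, and argument checks.

```ruby
# Hypothetical, condensed illustration of the nested-hash layout; not the gem's code.
class MiniArrayTrie
  # Sentinel key under which a node stores its value.
  STOP = Class.new

  def initialize
    @root = {}
  end

  # Store value under the array key, creating intermediate nodes as needed.
  def []=(parts, value)
    node = parts.reduce(@root) { |current, part| current[part] ||= {} }
    node[STOP] = value
  end

  # Return the stored value, or nil when the key is absent.
  def [](parts)
    node = find(parts)
    node && node[STOP]
  end

  # True when at least one stored key starts with parts.
  def include_prefix?(parts)
    !find(parts).nil?
  end

  private

  # Walk the nested hashes; returns nil as soon as a part is missing.
  def find(parts)
    parts.reduce(@root) do |current, part|
      return nil unless current.key?(part)
      current[part]
    end
  end
end

trie = MiniArrayTrie.new
trie[%w[usr local bin]] = :executables
trie[%w[usr local]] = :prefix_dir

p trie[%w[usr local bin]]        # prints :executables
p trie[%w[usr]]                  # prints nil (a prefix, but not a stored key)
p trie.include_prefix?(%w[usr])  # prints true
p trie.include_prefix?(%w[etc])  # prints false
```

Keeping the value under a sentinel key rather than in a separate field lets a node hold both a value and children, which is why `%w[usr local]` and `%w[usr local bin]` can coexist.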
data/lib/array_trie/prefix_trie.rb
ADDED
@@ -0,0 +1,130 @@
+class ArrayTrie
+  # A trie-like, prefix-tree data structure that maps arbitrary keys to values,
+  # provided that the keys may be transformed to and from arrays, as each
+  # instance uses an underlying {ArrayTrie} instance for storage.
+  #
+  # If you only care about trie membership, just map your values to `true`. If
+  # your trie keys are already arrays, you can use {ArrayTrie}
+  # directly for improved performance.
+  class PrefixTrie
+    # Create a new Trie for mapping paths to values. This could be useful for
+    # storing a large amount of paths-to-data mappings, and querying the trie
+    # based on path prefix.
+    #
+    # @return [PrefixTrie]
+    def self.of_paths
+      of_strings_split_by('/')
+    end
+
+    # Create a new trie for mapping strings to values. This could be useful
+    #
+    # @return [PrefixTrie]
+    def self.of_strings
+      of_strings_split_by('')
+    end
+
+    # Create a trie for mapping delimited strings to values.
+    # String keys will be split by the delimiter for storage.
+    #
+    # @return [PrefixTrie]
+    def self.of_strings_split_by(delim)
+      new(
+        proc { |str| str.split(delim) },
+        proc { |parts| parts.join(delim) }
+      )
+    end
+
+    # Create a new trie for mapping arrays to values.
+    #
+    # @return [PrefixTrie]
+    def self.of_arrays
+      new(
+        proc { |x| x },
+        proc { |x| x }
+      )
+    end
+
+    # Create a new Trie
+    #
+    # @param to_a [#call] A callable that given a key, returns that key as an
+    # array.
+    # @param from_a [#call] Inverse of to_a. A callable that given an array,
+    # returns the key form of that array.
+    # @param trie [ArrayTrie] (ArrayTrie.new) Underlying trie to use for storage.
+    def initialize(to_a, from_a, trie = ArrayTrie.new)
+      @to_a = to_a
+      @from_a = from_a
+      @trie = trie
+    end
+
+    def [](key)
+      @trie[to_a(key)]
+    end
+
+    def []=(key, value)
+      @trie[to_a(key)] = value
+    end
+
+    def insert_subtrie(key, subtrie)
+      @trie.insert_subtrie(to_a(key), subtrie.trie)
+    end
+
+    def subtrie(key)
+      lower_subtrie = @trie.subtrie(to_a(key))
+      return nil unless lower_subtrie
+      self.class.new(@to_a, @from_a, lower_subtrie)
+    end
+
+    def include?(key)
+      @trie.include?(to_a key)
+    end
+
+    def include_prefix?(key)
+      @trie.include_prefix?(to_a key)
+    end
+
+    def count_prefix(key)
+      @trie.count_prefix(to_a key)
+    end
+
+    def depth_first
+      enum = transform_enumerator(@trie.depth_first)
+      return enum unless block_given?
+      enum.each { |path, value| yield(path, value) }
+    end
+
+    def breadth_first
+      enum = transform_enumerator(@trie.breadth_first)
+      return enum unless block_given?
+      enum.each { |path, value| yield(path, value) }
+    end
+
+    def count
+      @trie.count
+    end
+
+    protected
+
+    attr_reader :trie
+
+    private
+
+    def transform_enumerator(enum)
+      ::Enumerator.new do |y|
+        loop do
+          parts, value = enum.next
+          y.yield(from_a(parts), value)
+        end
+      end
+    end
+
+    def to_a(key)
+      @to_a.call(key)
+    end
+
+    def from_a(parts)
+      @from_a.call(parts)
+    end
+  end
+end
+
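The key transforms that `PrefixTrie.of_strings_split_by` passes to the constructor are plain `split`/`join` procs, so their behaviour can be checked without the gem. The sketch below uses a hypothetical helper, `split_transforms`, that builds the same kind of proc pair. It only illustrates how keys map to arrays and back, not how the trie stores them.

```ruby
# Hypothetical helper mirroring the pair of procs passed to PrefixTrie.new.
def split_transforms(delim)
  [
    proc { |str| str.split(delim) },    # key -> array
    proc { |parts| parts.join(delim) }  # array -> key
  ]
end

to_a, from_a = split_transforms("/")

parts = to_a.call("/usr/local/bin")
p parts               # prints ["", "usr", "local", "bin"]
p from_a.call(parts)  # prints "/usr/local/bin"

# String#split drops trailing empty fields, so these two keys become the same array.
p to_a.call("usr/local/") == to_a.call("usr/local")  # prints true

# An empty delimiter splits a string into single characters, as in of_strings.
p "trie".split("")    # prints ["t", "r", "i", "e"]
```

One consequence worth noting is that the transforms are not a strict inverse pair for every input: a key with a trailing delimiter round-trips to the form without it.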
metadata
ADDED
@@ -0,0 +1,100 @@
+--- !ruby/object:Gem::Specification
+name: array_trie
+version: !ruby/object:Gem::Version
+  version: 0.1.0
+platform: ruby
+authors:
+- Jake Teton-Landis
+autorequire:
+bindir: exe
+cert_chain: []
+date: 2017-11-16 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: bundler
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: '1.15'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: '1.15'
+- !ruby/object:Gem::Dependency
+  name: rake
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: '10.0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: '10.0'
+- !ruby/object:Gem::Dependency
+  name: rspec
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: '3.7'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: '3.7'
+description: |
+  Trie-like, prefix-tree data structures. First, a prefix-tree based on Arrays, which differs from a traditional trie, which maps strings to values. Second, a more general prefix-tree data structure that works for any type of keys, provided those keys can be transformed to and from an array.
+
+  Both of these data structures are implemented in terms of hashes.
+email:
+- jake.tl@airbnb.com
+executables: []
+extensions: []
+extra_rdoc_files: []
+files:
+- .gitignore
+- .rspec
+- Gemfile
+- README.md
+- Rakefile
+- array_trie.gemspec
+- bin/console
+- bin/setup
+- bin/test
+- lib/array_trie.rb
+- lib/array_trie/prefix_trie.rb
+- lib/array_trie/version.rb
+homepage: https://github.com/justjake/array-trie
+licenses: []
+metadata: {}
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+requirements: []
+rubyforge_project:
+rubygems_version: 2.6.10
+signing_key:
+specification_version: 4
+summary: Trie-like, prefix-tree data structures that maps from ordered keys to values.
+test_files: []