rambling-trie 0.5.2 → 0.6.0
- checksums.yaml +7 -0
- data/.travis.yml +1 -0
- data/Gemfile +1 -5
- data/LICENSE +3 -1
- data/README.markdown +19 -8
- data/lib/rambling/trie.rb +4 -3
- data/lib/rambling/trie/branches.rb +24 -29
- data/lib/rambling/trie/compressor.rb +9 -5
- data/lib/rambling/trie/enumerable.rb +1 -1
- data/lib/rambling/trie/inspector.rb +1 -1
- data/lib/rambling/trie/node.rb +26 -11
- data/lib/rambling/trie/root.rb +28 -14
- data/lib/rambling/trie/tasks/performance.rb +4 -29
- data/lib/rambling/trie/version.rb +1 -1
- data/spec/lib/rambling/trie/node_spec.rb +82 -47
- data/spec/lib/rambling/trie/root_spec.rb +121 -97
- data/spec/spec_helper.rb +2 -0
- metadata +17 -38
- data/lib/rambling/trie/children_hash_deferer.rb +0 -35
- data/spec/lib/rambling/trie/children_hash_deferer_spec.rb +0 -59
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA1:
+  metadata.gz: 403d826dbe4417d5ed4215bb0fd463a4142d51b9
+  data.tar.gz: 363b65990628f645c0d48bdc50803d22c7efbe93
+SHA512:
+  metadata.gz: 428848bed7b6bda02217fcec568001bddc6f6ad880af8ac7ca7714b803cb3175a17c70073102c55b270e648140f9e7ff79d40f5f531eda653e022b99e8cedd5e
+  data.tar.gz: 9b7950cf8dd31fc652f07295b968eb680fe535d4f8258d8357d5dd5646f644102ad27ec57012005c1634492e591863e4753607e88b8ff1a73ce8accef85199c7
data/.travis.yml
CHANGED
data/Gemfile
CHANGED
data/LICENSE
CHANGED
data/README.markdown
CHANGED
@@ -1,4 +1,4 @@
-# Rambling Trie [![Build Status](https://secure.travis-ci.org/gonzedge/rambling-trie.png)](http://travis-ci.org/gonzedge/rambling-trie) [![Dependency Status](https://gemnasium.com/gonzedge/rambling-trie.png)](https://gemnasium.com/gonzedge/rambling-trie) [![Code Climate](https://codeclimate.com/
+# Rambling Trie [![Build Status](https://secure.travis-ci.org/gonzedge/rambling-trie.png)](http://travis-ci.org/gonzedge/rambling-trie) [![Dependency Status](https://gemnasium.com/gonzedge/rambling-trie.png)](https://gemnasium.com/gonzedge/rambling-trie) [![Code Climate](https://codeclimate.com/github/gonzedge/rambling-trie.png)](https://codeclimate.com/github/gonzedge/rambling-trie)
 
 The Rambling Trie is a custom implementation of the Trie data structure with Ruby, which includes compression abilities and is designed to be very fast to traverse.
 
@@ -71,8 +71,17 @@ If you want to use a custom file format, you will need to provide a custom file
 
 - - -
 
+#### Breaking changes
+
+* Starting from version 0.6.0, the `children` method returns an array of nodes instead of a hash. If you still need access to the underlying hash, use `children_tree` instead.
+
+- - -
+
+- - -
+
 #### Deprecation warnings
 
+* Starting from version 0.6.0, the `branch?` method is deprecated. The `partial_word?` method should be used instead.
 * Starting from version 0.5.0, the `has_branch_for?`, `is_word?` and `add_branch_from` methods are deprecated. The methods `branch?`, `word?` and `add` should be used respectively.
 
 - - -
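To make the breaking change in this hunk concrete, here is a minimal sketch. `SketchNode` and `add_child` are hypothetical names and this is not the gem's `Node` class; the sketch only assumes what the README states, namely that `children_tree` is a hash keyed by letter and `children` is an array of nodes.

```ruby
# Hypothetical stand-in for a trie node; not the gem's Node class.
class SketchNode
  attr_reader :letter, :children_tree

  def initialize(letter = nil)
    @letter = letter
    @children_tree = {} # maps :letter => SketchNode
  end

  # 0.6.0-style accessor: an array of child nodes rather than a hash.
  def children
    children_tree.values
  end

  # Hypothetical helper used only to populate the sketch.
  def add_child(letter)
    children_tree[letter] ||= SketchNode.new(letter)
  end
end

root = SketchNode.new
root.add_child(:a)
root.add_child(:b)

child_letters = root.children.map(&:letter) # array-style access
child_keys = root.children_tree.keys        # hash-style access
```

Code that previously indexed `children` by letter would, under this reading, index `children_tree` instead.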
@@ -91,10 +100,11 @@ trie.word? 'word'
 trie.include? 'word'
 ```
 
-If you wish to find if part of a word exists in the trie instance, you should call `
+If you wish to find if part of a word exists in the trie instance, you should call `partial_word?`:
 
 ``` ruby
-trie.
+trie.partial_word? 'partial_word'
+trie.match? 'partial_word'
 ```
 
 ### Compression
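The difference between `word?` and `partial_word?` (aliased as `match?`) in the hunk above can be sketched with a tiny uncompressed trie. `SketchTrie` and its hash-based node layout are assumptions made for illustration, not the gem's implementation.

```ruby
# Hypothetical minimal trie, for illustration only.
class SketchTrie
  def initialize
    @root = { children: {}, terminal: false }
  end

  def add(word)
    node = @root
    word.each_char do |char|
      node = (node[:children][char] ||= { children: {}, terminal: false })
    end
    node[:terminal] = true
    self
  end

  # True only when the whole word was added.
  def word?(word)
    node = find(word)
    !node.nil? && node[:terminal]
  end

  # True when the characters form a path from the root,
  # even if that path does not end on a complete word.
  def partial_word?(word)
    !find(word).nil?
  end
  alias_method :match?, :partial_word?

  private

  # Walks the trie character by character; returns nil if the path breaks.
  def find(word)
    node = @root
    word.each_char do |char|
      node = node[:children][char]
      return nil if node.nil?
    end
    node
  end
end

trie = SketchTrie.new.add('partial_word')
full_match = trie.word?('partial_word')
prefix_is_word = trie.word?('partial')
prefix_matches = trie.partial_word?('partial')
alias_matches = trie.match?('partial')
```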
@@ -109,8 +119,6 @@ trie.compress!
 
 This will reduce the amount of Trie nodes by eliminating the redundant ones, which are the only-child non-terminal nodes.
 
-Starting from version 0.3.2, the `has_branch_for?` (now `has_branch?`) and `is_word?` (now `word?`) methods work as expected on a compressed trie.
-
 __Note that the `compress!` method acts over the trie instance it belongs to.__
 __Also, adding words after compression is not supported.__
 
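As a rough illustration of "eliminating the only-child non-terminal nodes", the sketch below merges such nodes into their single child by concatenating keys. The hash layout and the helper names (`build`, `compress`, `compress_children`, `count_nodes`) are hypothetical; unlike the gem's `compress!`, this sketch returns a new structure instead of modifying the trie in place.

```ruby
# Hypothetical node layout: { terminal: Boolean, children: { "key" => node } }.
def build(words)
  root = { terminal: false, children: {} }
  words.each do |word|
    node = root
    word.each_char do |char|
      node = (node[:children][char] ||= { terminal: false, children: {} })
    end
    node[:terminal] = true
  end
  root
end

# Returns a new children hash in which every only-child, non-terminal node
# has been merged with its single child (their keys are concatenated).
def compress_children(children)
  children.each_with_object({}) do |(key, node), result|
    while !node[:terminal] && node[:children].size == 1
      child_key, child = node[:children].first
      key += child_key
      node = child
    end
    result[key] = { terminal: node[:terminal], children: compress_children(node[:children]) }
  end
end

# The root itself is never merged; only its descendants are.
def compress(root)
  { terminal: root[:terminal], children: compress_children(root[:children]) }
end

def count_nodes(node)
  1 + node[:children].values.sum { |child| count_nodes(child) }
end

root = build(%w[hello help])
compressed = compress(root)
nodes_before = count_nodes(root)
nodes_after = count_nodes(compressed)
```

For the two words above, the shared prefix collapses into a single `"hel"` node with children `"lo"` and `"p"`.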
@@ -141,13 +149,14 @@ You can find further API documentation on the autogenerated [rambling-trie gem R
 
 The Rambling Trie has been tested with the following Ruby versions:
 
-*
+* 2.0.0
 * 1.9.3
+* 1.9.2
 
 And the following Rails versions:
 
-* 3.1.x
 * 3.2.x
+* 3.1.x
 
 It's possible that Rails 3.0.x is supported, but there is no guarantee.
 Ruby 1.8.7 is not supported.
@@ -159,7 +168,9 @@ Also, be sure to add tests for any feature you may develop or bug you may fix.
 
 ## License and copyright
 
-Copyright (c) 2012
+Copyright (c) 2012-2013 Edgar Gonzalez
+
+MIT License
 
 Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
 
data/lib/rambling/trie.rb
CHANGED
@@ -1,7 +1,8 @@
+require 'forwardable'
 %w{
-  branches
-
-  root version
+  branches compressor enumerable
+  inspector invalid_operation node
+  plain_text_reader root version
 }.map { |file| File.join 'rambling', 'trie', file }.each &method(:require)
 
 # General namespace for all Rambling gems.
data/lib/rambling/trie/branches.rb
CHANGED
@@ -16,13 +16,13 @@ module Rambling
 
         first_letter = word.slice(0).to_sym
 
-        if
+        if children_tree.has_key? first_letter
           word.slice! 0
-          child =
+          child = children_tree[first_letter]
           child << word
           child
         else
-
+          children_tree[first_letter] = Node.new word, self
         end
       end
 
@@ -30,31 +30,12 @@ module Rambling
 
       protected
 
-      def
-        chars.empty? || fulfills_uncompressed_condition?(:
+      def partial_word_when_uncompressed?(chars)
+        chars.empty? || fulfills_uncompressed_condition?(:partial_word_when_uncompressed?, chars)
       end
 
-      def
-
-
-        first_letter = chars.slice! 0
-        current_key, current_key_string = current_key first_letter
-
-        unless current_key.nil?
-          return children[current_key].branch_when_compressed?(chars) if current_key_string.length == first_letter.length
-
-          while not chars.empty?
-            char = chars.slice! 0
-
-            break unless current_key_string[first_letter.length] == char
-
-            return true if chars.empty?
-            first_letter << char
-            return children[current_key].branch_when_compressed?(chars) if current_key_string.length == first_letter.length
-          end
-        end
-
-        false
+      def partial_word_when_compressed?(chars)
+        chars.empty? || compressed_trie_has_partial_word?(chars)
       end
 
       def word_when_uncompressed?(chars)
@@ -68,7 +49,7 @@ module Rambling
         while not chars.empty?
           first_letter << chars.slice!(0)
           key = first_letter.to_sym
-          return
+          return children_tree[key].word_when_compressed?(chars) if children_tree.has_key? key
         end
 
         false
@@ -76,10 +57,24 @@ module Rambling
 
       private
 
+      def compressed_trie_has_partial_word?(chars)
+        current_length = 0
+        current_key, current_key_string = current_key chars.slice!(0)
+
+        begin
+          current_length += 1
+
+          if current_key_string.length == current_length || chars.empty?
+            return children_tree[current_key].partial_word_when_compressed?(chars)
+          end
+        end while current_key_string[current_length] == chars.slice!(0)
+        false
+      end
+
       def current_key(letter)
         current_key_string = current_key = nil
 
-
+        children_tree.keys.each do |key|
           key_string = key.to_s
           if key_string.start_with? letter
             current_key = key
@@ -95,7 +90,7 @@ module Rambling
         first_letter = chars.slice! 0
         unless first_letter.nil?
           first_letter_sym = first_letter.to_sym
-          return
+          return children_tree[first_letter_sym].send(method, chars) if children_tree.has_key? first_letter_sym
         end
 
         false
data/lib/rambling/trie/compressor.rb
CHANGED
@@ -11,23 +11,27 @@ module Rambling
       # Compress the current node using redundant node elimination.
       # @return [Root, Node] the compressed node.
       def compress_tree!
-        if
-          merge_with! children.
+        if compressable?
+          merge_with! children.first
           compress_tree!
         end
 
-        children.
+        children.each &:compress_tree!
 
         self
       end
 
       private
 
+      def compressable?
+        !(root? || terminal?) && children_tree.size == 1
+      end
+
       def merge_with!(child)
         delete_old_key_on_parent!
         redefine_self! child
 
-        children.each { |
+        children.each { |node| node.parent = self }
       end
 
       def delete_old_key_on_parent!
@@ -38,7 +42,7 @@ module Rambling
 
       def redefine_self!(merged_node)
         self.letter = letter.to_s << merged_node.letter.to_s
-        self.
+        self.children_tree = merged_node.children_tree
         self.terminal = merged_node.terminal?
       end
     end
data/lib/rambling/trie/enumerable.rb
CHANGED
@@ -10,7 +10,7 @@ module Rambling
       def each(&block)
         enumerator = Enumerator.new do |words|
           words << as_word if terminal?
-          children.each { |
+          children.each { |child| child.each { |word| words << word } }
         end
 
         block.nil? ? enumerator : enumerator.each(&block)
data/lib/rambling/trie/inspector.rb
CHANGED
@@ -4,7 +4,7 @@ module Rambling
     module Inspector
       # @return [String] a string representation of the current node.
       def inspect
-        "#<#{self.class.name} letter: #{letter.inspect || 'nil'}, children: #{
+        "#<#{self.class.name} letter: #{letter.inspect || 'nil'}, children: #{children_tree.keys}>"
       end
     end
   end
data/lib/rambling/trie/node.rb
CHANGED
@@ -2,7 +2,10 @@ module Rambling
   module Trie
     # A representation of a node in the Trie data structure.
     class Node
-
+      extend Forwardable
+
+      delegate [:[], :[]=, :delete, :has_key?] => :children_tree
+
       include Compressor
       include Branches
       include Enumerable
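The `extend Forwardable` / `delegate` lines added here forward hash-style calls on a node to its `children_tree`. A small sketch of that pattern follows; `DelegatingNode` is a hypothetical class, and only the `Forwardable` calls mirror the diff.

```ruby
require 'forwardable'

# Hypothetical class for illustration; only the Forwardable usage mirrors the diff.
class DelegatingNode
  extend Forwardable

  # Forward hash-style access to the underlying children_tree hash.
  delegate [:[], :[]=, :delete, :has_key?] => :children_tree

  attr_reader :children_tree

  def initialize
    @children_tree = {}
  end
end

node = DelegatingNode.new
node[:a] = 'child a'                  # forwarded to children_tree[:a] = ...
has_a = node.has_key?(:a)             # forwarded to children_tree.has_key?
value = node[:a]                      # forwarded to children_tree[:a]
node.delete(:a)                       # forwarded to children_tree.delete
has_a_after_delete = node.has_key?(:a)
```

Delegating these four methods lets callers keep writing `node[key]` even though the public `children` accessor now returns an array.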
@@ -13,8 +16,8 @@ module Rambling
       attr_reader :letter
 
       # Children nodes.
-      # @return [Hash] the
-      attr_reader :
+      # @return [Hash] the children_tree hash, consisting of :letter => node.
+      attr_reader :children_tree
 
       # Parent node.
       # @return [Node, nil] the parent node or nil for the root element.
@@ -25,7 +28,7 @@ module Rambling
       # @param [Node, nil] parent the parent of this node.
       def initialize(word = nil, parent = nil)
         self.parent = parent
-        self.
+        self.children_tree = {}
 
         unless word.nil? || word.empty?
           self.letter = word.slice! 0
@@ -34,12 +37,6 @@ module Rambling
         end
       end
 
-      # Flag for terminal nodes.
-      # @return [Boolean] `true` for terminal nodes, `false` otherwise.
-      def terminal?
-        !!terminal
-      end
-
       # String representation of the current node, if it is a terminal node.
       # @return [String] the string representation of the current node.
       # @raise [InvalidOperation] if node is not terminal or is root.
@@ -48,6 +45,24 @@ module Rambling
         to_s
       end
 
+      # Children nodes of the current node.
+      # @return [Array] the array of children nodes contained in the current node.
+      def children
+        children_tree.values
+      end
+
+      # If the current node is the root node.
+      # @return [Boolean] `false`
+      def root?
+        false
+      end
+
+      # Flag for terminal nodes.
+      # @return [Boolean] `true` for terminal nodes, `false` otherwise.
+      def terminal?
+        !!terminal
+      end
+
       # String representation of the current node.
       # @return [String] the string representation of the current node.
       def to_s
@@ -56,7 +71,7 @@ module Rambling
 
       protected
 
-      attr_writer :
+      attr_writer :children_tree
       attr_accessor :terminal
 
       def letter=(letter)
data/lib/rambling/trie/root.rb
CHANGED
@@ -10,6 +10,24 @@ module Rambling
         yield self if block_given?
       end
 
+      # Adds a branch to the trie based on the word, without changing the passed word.
+      # @param [String] word the word to add the branch from.
+      # @return [Node] the just added branch's root node.
+      # @raise [InvalidOperation] if the trie is already compressed.
+      # @see Branches#add
+      # @note Avoids clearing the contents of the word variable.
+      def add(word)
+        super word.clone
+      end
+
+      alias_method :<<, :add
+
+      # @deprecated Use `#partial_word?` instead.
+      def branch?(word = '')
+        warn 'The `#branch?` method will be deprecated, please use `#partial_word?` instead.'
+        partial_word? word
+      end
+
       # Compresses the existing tree using redundant node elimination. Flags the trie as compressed.
       # @return [Root] self
       def compress!
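The `@note` on `Root#add` says it avoids clearing the caller's word. The sketch below shows why the `word.clone` matters when the underlying insertion consumes the string with `slice!`. `SketchBranch` and `SketchRoot` are hypothetical classes, not the gem's `Branches` and `Root`.

```ruby
# Hypothetical classes for illustration; not the gem's Branches/Root code.
class SketchBranch
  attr_reader :letters

  def initialize
    @letters = []
  end

  # Consumes the word destructively, one character at a time.
  def add(word)
    letters << word.slice!(0) until word.empty?
    self
  end
end

class SketchRoot < SketchBranch
  # Pass a copy down so the caller's string is left untouched.
  def add(word)
    super word.clone
  end

  alias_method :<<, :add
end

destroyed = String.new('word')
SketchBranch.new.add(destroyed) # the caller's string is emptied

preserved = String.new('word')
root = SketchRoot.new
root << preserved               # the caller's string survives
```

Without the clone, a caller that reuses its string after adding it to the trie would find it empty.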
@@ -26,8 +44,16 @@ module Rambling
       # Checks if a path for a word or partial word exists in the trie.
       # @param [String] word the word or partial word to look for in the trie.
       # @return [Boolean] `true` if the word or partial word is found, `false` otherwise.
-      def
-        is? :
+      def partial_word?(word = '')
+        is? :partial_word, word
+      end
+
+      alias_method :match?, :partial_word?
+
+      # If the current node is the root node.
+      # @return [Boolean] `true`
+      def root?
+        true
       end
 
       # Checks if a whole word exists in the trie.
@@ -39,18 +65,6 @@ module Rambling
 
       alias_method :include?, :word?
 
-      # Adds a branch to the trie based on the word, without changing the passed word.
-      # @param [String] word the word to add the branch from.
-      # @return [Node] the just added branch's root node.
-      # @raise [InvalidOperation] if the trie is already compressed.
-      # @see Branches#add
-      # @note Avoids clearing the contents of the word variable.
-      def add(word)
-        super word.clone
-      end
-
-      alias_method :<<, :add
-
       private
 
       attr_accessor :compressed
data/lib/rambling/trie/tasks/performance.rb
CHANGED
@@ -2,8 +2,8 @@ require 'benchmark'
 
 namespace :performance do
   def report(name, trie, output)
-    words =
-    methods = [:word?, :
+    words = %w(hi help beautiful impressionism anthropological)
+    methods = [:word?, :partial_word?]
 
     output.puts "==> #{name}"
     methods.each do |method|
@@ -65,8 +65,8 @@ namespace :performance do
     puts 'Generating profiling reports...'
 
     rambling_trie = Rambling::Trie.create path('assets', 'dictionaries', 'words_with_friends.txt')
-    words =
-    methods = [:
+    words = %w(hi help beautiful impressionism anthropological)
+    methods = [:word?, :partial_word?]
     tries = [lambda {rambling_trie.clone}, lambda {rambling_trie.clone.compress!}]
 
     methods.each do |method|
@@ -87,31 +87,6 @@ namespace :performance do
     puts 'Done'
   end
 
-  desc 'Generate CPU profiling reports'
-  task :cpu_profile do
-    require 'perftools'
-
-    puts 'Generating cpu profiling reports...'
-
-    rambling_trie = Rambling::Trie.create path('assets', 'dictionaries', 'words_with_friends.txt')
-    words = ['hi', 'help', 'beautiful', 'impressionism', 'anthropological']
-    methods = [:branch?, :word?]
-    tries = [lambda {rambling_trie.clone}, lambda {rambling_trie.clone.compress!}]
-
-    methods.each do |method|
-      tries.each do |trie_generator|
-        trie = trie_generator.call
-        result = PerfTools::CpuProfiler.start path('reports', "cpu_profile-#{trie.compressed? ? 'compressed' : 'uncompressed'}-#{method.to_s.sub(/\?/, '')}-#{Time.now.to_i}") do
-          words.each do |word|
-            200_000.times { trie.send method, word }
-          end
-        end
-      end
-    end
-
-    puts 'Done'
-  end
-
   desc 'Generate profiling and performance reports'
   task all: [:profile, :report]
 end
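The removed `cpu_profile` task looped over a list of methods and words and called `trie.send method, word` many times. A comparable timing loop that uses only the standard library can be sketched as follows. The `Set` stands in for a trie, and the method names and iteration count are assumptions chosen so the example runs without the gem or `perftools`.

```ruby
require 'benchmark'
require 'set'

# Hypothetical stand-in for a trie: any object responding to the listed methods works.
dictionary = Set.new(%w(hi help beautiful impressionism anthropological))
words = %w(hi help beautiful impressionism anthropological)
methods = [:include?, :member?]

# Time each method over every word, mirroring the methods/words loop in the task.
timings = methods.each_with_object({}) do |method, result|
  result[method] = Benchmark.realtime do
    words.each do |word|
      1_000.times { dictionary.send method, word }
    end
  end
end

timings.each { |method, seconds| puts format('%-10s %.6fs', method, seconds) }
```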