waistband 0.8.5 → 0.9.0

data/Gemfile CHANGED
@@ -5,5 +5,5 @@ gemspec
5
5
 
6
6
  gem 'rspec'
7
7
  gem 'debugger'
8
- gem 'kaminari'
9
- gem 'timecop'
8
+ gem 'kaminari', require: false
9
+ gem 'timecop'
@@ -1,4 +1,4 @@
1
- Copyright (c) 2013 David Jairala
1
+ Copyright (c) 2014 David Jairala
2
2
 
3
3
  MIT License
4
4
 
data/README.md CHANGED
@@ -1,8 +1,8 @@
1
1
  # Waistband
2
2
 
3
- Ruby interface to Elastic Search
3
+ Configuration and sensible defaults for ElasticSearch on Ruby. Handles configuration, index creation, quality of life, etc, of Elastic Search in Ruby.
4
4
 
5
- ## Installation
5
+ # Installation
6
6
 
7
7
  Install ElasticSearch:
8
8
 
@@ -20,7 +20,7 @@ And then execute:
20
20
 
21
21
  Or install it yourself as:
22
22
 
23
- $ gem install waistband
23
+ $ gem install waistband
24
24
 
25
25
  ## Configuration
26
26
 
@@ -29,11 +29,14 @@ Configuration is generally pretty simple. First, create a folder where you'll s
29
29
  ```yml
30
30
  # #{APP_DIR}/config/waistband/waistband.yml
31
31
  development:
32
- timeout: 2
33
- servers:
34
- server1:
35
- host: http://localhost
36
- port: 9200
32
+ retries: 5
33
+ timeout: 2
34
+ reload_on_failure: true
35
+ servers:
36
+ server1:
37
+ protocol: http
38
+ host: localhost
39
+ port: 9200
37
40
  ```
38
41
 
39
42
  You can name the servers whatever you want, and one of them is selected at random using `Array.sample`, excluding blacklisted servers, when conducting operations on the server. Here's an example with two servers:
@@ -41,14 +44,18 @@ You can name the servers whatever you want, and one of them is selected at rando
41
44
  ```yml
42
45
  # #{APP_DIR}/config/waistband/waistband.yml
43
46
  development:
44
- timeout: 2
45
- servers:
46
- server1:
47
- host: http://173.247.192.214
48
- port: 9200
49
- server2:
50
- host: http://173.247.192.215
51
- port: 9200
47
+ retries: 5
48
+ timeout: 2
49
+ reload_on_failure: true
50
+ servers:
51
+ server1:
52
+ protocol: http
53
+ host: 173.247.192.214
54
+ port: 9200
55
+ server2:
56
+ protocol: http
57
+ host: 173.247.192.215
58
+ port: 9200
52
59
  ```
53
60
 
54
61
  You'll need a separate config file for each index you use, containing the index settings and mappings. For example, for my search index, I use something akin to this:
@@ -56,31 +63,35 @@ You'll need a separate config file for each index you use, containing the index
56
63
  ```yml
57
64
  # #{APP_DIR}/config/waistband/waistband_search.yml
58
65
  development:
59
- stringify: false
60
- settings:
66
+ stringify: false
67
+ settings:
61
68
  index:
62
- number_of_shards: 4
63
- mappings:
64
- event:
65
- _source:
66
- includes: ["*"]
69
+ number_of_shards: 4
70
+ mappings:
71
+ event:
72
+ _source:
73
+ includes: ["*"]
67
74
  ```
68
75
 
69
76
  ## List of config settings:
70
77
 
71
78
  * `settings`: settings for the Elastic Search index. Refer to the ["admin indices update settings"](http://www.elasticsearch.org/guide/reference/api/admin-indices-update-settings/) document for more info.
72
79
  * `mappings`: the index mappings. More often than not you'll want to include all of the document attributes, so you'll do something like the example above. For more info, refer to the [mapping reference](http://www.elasticsearch.org/guide/reference/mapping/).
80
+ * `retries`: number of times to retry before moving on to the next server node.
81
+ * `reload_on_failure`: whether to reload the node list from the server on failure.
82
+ * `timeout`: seconds until a timeout exception is raised when trying to connect to the node.
73
83
  * `name`: optional - name of the index. You can (and probably should) have a different name for the index for your test environment. If not specified, it defaults to the name of the yml file minus the `waistband_` portion, so in the above example, the index name would become `search_#{env}`, where env is your environment variable as defined in `Waistband::Configuration#setup` (determined by `RAILS_ENV` or `RACK_ENV`).
74
84
  * `stringify`: optional - determines whether whatever is stored into the index is going to be converted to a string before storage. Usually false unless you need it to be true for specific cases, like if for some `key => value` pairs the value is sometimes of different types.
75
85
 
76
86
  ## Initializer
77
87
 
78
- After getting all the YML config files in place, you'll just need to hook up an initializer to these files:
88
+
89
+ Waistband will look for config files by default in `File.join(Rails.root, 'config')`. You can override the default location of the config folder. After getting all the YML config files in place, you'll just need to hook up an initializer to these files:
79
90
 
80
91
  ```ruby
81
92
  # #{APP_DIR}/config/initializers/waistband.rb
82
93
  Waistband.configure do |c|
83
- c.config_dir = "#{APP_DIR}/spec/config/waistband"
94
+ c.config_dir = "#{APP_DIR}/spec/config/waistband"
84
95
  end
85
96
  ```
86
97
 
@@ -91,20 +102,20 @@ end
91
102
 
92
103
  #### Creating and destroying the indexes
93
104
 
94
- For each index you have, you'll probably want to make sure it's created on initialization, so either in the same waistband initializer or in another initializer, depending on your preferences, you'll have to create them. For our search example:
105
+ For each index you have, you'll probably want to make sure it's created on initialization, or in a Rake task, so either in the same waistband initializer or in another initializer, depending on your preferences, you'll have to create them. For our search example:
95
106
 
96
107
  ```ruby
97
108
  # #{APP_DIR}/config/initializers/waistband.rb
98
109
  # ...
99
- Waistband::Index.new('search').create!
110
+ Waistband::Index.new('search').create
100
111
  ```
101
112
 
102
- This will create the index if it's not been created already or return nil if it already exists.
113
+ This will create the index if it hasn't been created already, or return `true` without raising if it already exists. If you want an exception raised when it already exists, use the `#create!` method.
103
114
 
104
115
  Destroying an index is equally easy:
105
116
 
106
117
  ```ruby
107
- Waistband::Index.new('search').destroy!
118
+ Waistband::Index.new('search').delete
108
119
  ```
109
120
 
110
121
  When writing tests, it would generally be advisable to destroy and create the indexes in a `before(:each)` or `before(:all)`, depending on your circumstances. Also, remember for testing that replication and data availability are not immediate on the indexes, so if you create an immediate expectation for data to be there, you should refresh the index before it:
@@ -113,7 +124,7 @@ When writing tests, it would generally be advisable to destroy and create the in
113
124
  Waistband::Index.new('search').refresh
114
125
  ```
115
126
 
116
- Note: most index methods such as `create`, `destroy`, `read`, etc, have an equivalent bang method (`destroy!`) that will actually throw an exception if something goes wrong. For example, `destroy` will return nil if the index doesn't exist, but will raise any other unrelated exceptions, whereas `destroy!` will raise even the Index Not Found exception.
127
+ Note: most index methods such as `create`, `delete`, etc, have an equivalent bang method (`delete!`) that will actually throw an exception if something goes wrong. For example, `delete` will return `true` if the index doesn't exist, but will raise any other unrelated exceptions, whereas `delete!` will raise even the Index Not Found exception.
117
128
 
118
129
  #### Writing, reading and deleting from an index
119
130
 
@@ -121,55 +132,63 @@ Note: most index methods such as `create`, `destroy`, `read`, etc, have an equiv
121
132
  index = Waistband::Index.new('search')
122
133
 
123
134
  # writing
124
- index.store!('my_data', {'important' => true, 'valuable' => {'always' => true}})
125
- # => "{\"ok\":true,\"_index\":\"search\",\"_type\":\"search\",\"_id\":\"my_data\",\"_version\":1}"
135
+ index.save('my_data', {'important' => true, 'valuable' => {'always' => true}}) # => true
126
136
 
127
137
  # reading
128
- index.read('my_data')
129
- # => {"important"=>true, "valuable"=>{"always"=>true}}
138
+ index.find('my_data') # => {"important"=>true, "valuable"=>{"always"=>true}}
139
+
140
+ # reading with all the internal data
141
+ index.read('my_data') # => {'_id' => '123123', '_source' => {"important"=>true, "valuable"=>{"always"=>true}}, ...}
130
142
 
131
143
  # deleting
132
- index.delete!('my_data')
133
- # => "{\"ok\":true,\"found\":true,\"_index\":\"search\",\"_type\":\"search\",\"_id\":\"my_data\",\"_version\":2}"
144
+ index.destroy('my_data') # => "{\"ok\":true,\"found\":true,\"_index\":\"search\",\"_type\":\"search\",\"_id\":\"my_data\",\"_version\":2}"
134
145
 
135
146
  # reading non-existent data
136
- index.read('my_data')
137
- # => nil
147
+ index.find('my_data') # => nil
138
148
  ```
139
149
 
140
150
  ### Searching
141
151
 
142
- For searching, you construct a query from your index:
152
+ For searching, you construct a search from your index:
143
153
 
144
154
  ```ruby
145
155
  index = Waistband::Index.new('search')
146
- query = index.query(page_size: 5).prepare({
147
- query: {
148
- term: { hidden: false }
149
- },
150
- sort: { created_at: {order: 'desc' } }
156
+ results = index.search({
157
+ query: {
158
+ term: { hidden: false }
159
+ },
160
+ sort: { created_at: {order: 'desc' } },
161
+ page: 1,
162
+ page_size: 5
151
163
  })
152
164
 
153
- query.results # => returns an array of Waistband::QueryResult
165
+ results.hits # => returns a search results object
166
+ results.total_hits # => 28481
154
167
 
155
- query.total_hits
156
- # => 28481
168
+ For paginating the results, you can use the `#paginated_results` method, which requires the [Kaminari](https://github.com/amatsuda/kaminari) gem. If you use another gem, you can just override the method, etc.
157
169
 
158
- # get the second page of results:
159
- query.page = 2
160
- query.results
170
+ For more information and extra methods, take a peek into the class docs.
161
171
 
162
- # change the page size:
163
- query.page_size = 50
164
- query.page = 1
165
- query.results
166
- ```
172
+ Also, for convenience, the gem provides the `Result` class, which just provides some quality-of-life methods for working with search result hashes or their inner `_source` hashes, for example:
167
173
 
168
- For paginating the results, you can use the `#paginated_results` method, which requires the [Kaminari](https://github.com/amatsuda/kaminari), gem. If you use another gem, you can just override the method, etc.
174
+ ```ruby
175
+ search = index.search({
176
+ query: {
177
+ term: { hidden: false }
178
+ }
179
+ })
180
+ results = search.results
181
+ result = results.first
169
182
 
170
- For more information and extra methods, take a peek into the class docs.
183
+ result._id # => '123123'
184
+ result._type # => 'search_result'
185
+ result._index # => 'search'
186
+ result.task_id # => 991122 -- note that this is a method missing interface directly either to the search result hash, or to the _source sub-hash
187
+ ```
188
+
189
+ The `Result` class is directly exposed via two methods in the `SearchResults` class: `#results` and `#paginated_results`. You can use `#paginated_results` if you're using Kaminari for pagination and wish to use the awesomeness it provides.
171
190
 
172
- ### Sub-Indexes
191
+ ### Index Aliasing
173
192
 
174
193
  Sometimes it can be useful to sub-divide your index into smaller indexes based on dates or other partitioning schemes. To do this, the `Index` class exposes the `subs` option on instantiation:
175
194
 
@@ -187,13 +206,11 @@ Part of subbing is gonna be creating the correct aliases that group up your sub-
187
206
  ```ruby
188
207
  index = Waistband::Index.new('events', subs: %w(2013 01))
189
208
  index.create!
190
- index.alias!('my_super_events_alias')
191
- => true
192
- index.fetch_alias('my_super_events_alias')
193
- => {"events__2013_01"=>{"aliases"=>{"my_super_events_alias"=>{}}}}
209
+ index.alias('my_super_events_alias') # => true
210
+ index.alias_exists?('my_super_events_alias') # => true
194
211
  ```
195
212
 
196
- The `alias!` methods receives a param to define the alias name.
213
+ The `alias` method receives a param that defines the alias name.
197
214
 
198
215
  ## Contributing
199
216
 
@@ -202,3 +219,4 @@ The `alias!` methods receives a param to define the alias name.
202
219
  3. Commit your changes (`git commit -am 'Add some feature'`)
203
220
  4. Push to the branch (`git push origin my-new-feature`)
204
221
  5. Create new Pull Request
222
+
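As a reading aid for the README hunks above, the index naming rules they describe (a default of `search_#{env}`, and sub-index names such as `events__2013_01`) can be sketched in plain Ruby. This is a minimal sketch and not Waistband's own code: `resolve_index_name` is a hypothetical helper, and the environment is passed in explicitly instead of being read from `RAILS_ENV` or `RACK_ENV`.

```ruby
# Hypothetical helper, not part of Waistband: mirrors the naming rules the
# README describes for an index and its optional sub-index parts.
def resolve_index_name(base, env:, custom_name: nil, subs: nil)
  # A `name` set in the index's YAML file wins; otherwise the environment
  # is appended to the base name, e.g. "search" becomes "search_development".
  resolved = custom_name || "#{base}_#{env}"

  # Sub-index parts are joined with "_" and appended after a double underscore.
  parts = Array(subs).flatten
  parts.empty? ? resolved : "#{resolved}__#{parts.join('_')}"
end

puts resolve_index_name('search', env: 'development')
puts resolve_index_name('events', env: 'test', subs: %w(2013 01))
puts resolve_index_name('events', env: 'test', custom_name: 'events', subs: %w(2013 01))
```

The last call shows why the README suggests a different `name` per environment: once a custom name is set, the sketch no longer appends the environment.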
@@ -1,17 +1,14 @@
1
1
  require "waistband/version"
2
2
 
3
3
  module Waistband
4
-
5
- autoload :Configuration, "waistband/configuration"
6
- autoload :Connection, "waistband/connection"
4
+
5
+ autoload :Errors, "waistband/errors"
7
6
  autoload :StringifiedArray, "waistband/stringified_array"
8
7
  autoload :StringifiedHash, "waistband/stringified_hash"
9
- autoload :QueryResult, "waistband/query_result"
10
- autoload :QueryHelpers, "waistband/query_helpers"
11
- autoload :Query, "waistband/query"
8
+ autoload :Configuration, "waistband/configuration"
12
9
  autoload :Index, "waistband/index"
13
- autoload :QuickError, "waistband/quick_error"
14
- autoload :Model, "waistband/model"
10
+ autoload :SearchResults, "waistband/search_results"
11
+ autoload :Result, "waistband/result"
15
12
 
16
13
  class << self
17
14
 
@@ -23,6 +20,10 @@ module Waistband
23
20
  end
24
21
  alias_method :config, :configure
25
22
 
23
+ def client
24
+ ::Waistband.config.client
25
+ end
26
+
26
27
  end
27
28
 
28
29
  end
@@ -17,6 +17,8 @@ module Waistband
17
17
  end
18
18
 
19
19
  def setup
20
+ self.config_dir = default_config_dir unless config_dir
21
+
20
22
  raise "Please define a valid `config_dir` configuration variable!" unless config_dir
21
23
  raise "Couldn't find configuration directory #{config_dir}" unless File.exist?(config_dir)
22
24
 
@@ -29,23 +31,32 @@ module Waistband
29
31
  end
30
32
 
31
33
  def method_missing(method_name, *args, &block)
32
- return current_server[method_name] if current_server[method_name]
33
- return @yml_config[method_name] if @yml_config[method_name]
34
+ return @yml_config[method_name] if @yml_config[method_name]
34
35
  super
35
36
  end
36
37
 
37
- def servers
38
- @servers ||= @yml_config['servers'].map do |server_name, config|
39
- config.merge({
40
- '_id' => Digest::SHA1.hexdigest("#{config['host']}:#{config['port']}")
41
- })
38
+ def hosts
39
+ @hosts ||= @yml_config['servers'].map do |server_name, config|
40
+ config
42
41
  end
43
42
  end
44
43
 
44
+ def client
45
+ Elasticsearch::Client.new(
46
+ hosts: hosts,
47
+ randomize_hosts: true,
48
+ retry_on_failure: retries,
49
+ reload_on_failure: reload_on_failure
50
+ )
51
+ end
52
+
45
53
  private
46
54
 
47
- def current_server
48
- servers.sample
55
+ def default_config_dir
56
+ @default_config_dir ||= begin
57
+ return nil unless defined?(Rails)
58
+ File.join(Rails.root, 'config')
59
+ end
49
60
  end
50
61
 
51
62
  # /private
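The new `hosts` method in the hunk above drops the server names and keeps only each server's settings hash, which is then handed to `Elasticsearch::Client.new` as `hosts:`. A minimal sketch of that mapping, using only the standard library, might look like the following. The YAML nesting is an assumption (the indentation is flattened in this diff view), and `SAMPLE_YAML` and `hosts_for` are hypothetical names used only for illustration.

```ruby
require 'yaml'

# Assumed layout of waistband.yml in the 0.9.0 style; the nesting is inferred
# from how the `hosts` method reads `@yml_config['servers']`.
SAMPLE_YAML = <<~YML
  development:
    retries: 5
    timeout: 2
    reload_on_failure: true
    servers:
      server1:
        protocol: http
        host: localhost
        port: 9200
      server2:
        protocol: http
        host: 127.0.0.1
        port: 9201
YML

# Hypothetical helper: returns the per-server hashes for one environment,
# discarding the arbitrary server names.
def hosts_for(yaml_text, env)
  env_config = YAML.safe_load(yaml_text).fetch(env)
  env_config.fetch('servers').map { |_server_name, server_config| server_config }
end

hosts_for(SAMPLE_YAML, 'development').each do |host|
  puts "#{host['protocol']}://#{host['host']}:#{host['port']}"
end
```

Using `fetch` rather than `[]` is a choice made for the sketch so that a missing environment or `servers` key fails loudly with a `KeyError`.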
@@ -0,0 +1,9 @@
1
+ module Waistband
2
+ module Errors
3
+
4
+ class IndexExists < StandardError; end
5
+ class IndexNotFound < StandardError; end
6
+ class NoSearchHits < StandardError; end
7
+
8
+ end
9
+ end
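The error classes above back the bang and non-bang method pairs that appear in the `Index` hunk that follows: the bang method raises, and the plain method rescues only the one expected error. A self-contained sketch of that pattern is shown below. `SketchErrors` and `SketchIndex` are hypothetical stand-ins, and an in-memory flag replaces the real existence check against Elasticsearch.

```ruby
# Stand-in error classes named after those in waistband/errors.
module SketchErrors
  class IndexExists < StandardError; end
  class IndexNotFound < StandardError; end
end

# Hypothetical in-memory index; `@created` stands in for asking the
# Elasticsearch client whether the index exists.
class SketchIndex
  def initialize
    @created = false
  end

  def exists?
    @created
  end

  # Strict variant: raises when the index is already there.
  def create!
    raise SketchErrors::IndexExists, 'Index already exists' if exists?
    @created = true
  end

  # Lenient variant: swallows only IndexExists; other errors still propagate.
  def create
    create!
  rescue SketchErrors::IndexExists
    true
  end

  # Strict variant: raises when there is nothing to delete.
  def delete!
    raise SketchErrors::IndexNotFound, 'Index not found' unless exists?
    @created = false
    true
  end

  # Lenient variant: swallows only IndexNotFound.
  def delete
    delete!
  rescue SketchErrors::IndexNotFound
    true
  end
end

index = SketchIndex.new
index.create
index.create # second call is a no-op instead of an error
begin
  index.create!
rescue SketchErrors::IndexExists => e
  puts "create! raised #{e.class}"
end
```

Rescuing a narrow, library-specific class keeps unrelated failures, such as connection errors, visible to the caller.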
@@ -1,79 +1,199 @@
1
- require 'active_support/core_ext/object/blank'
2
1
  require 'active_support/core_ext/hash/keys'
2
+ require 'active_support/core_ext/array/extract_options'
3
+ require 'elasticsearch'
3
4
 
4
5
  module Waistband
5
6
  class Index
6
7
 
7
- delegate :create!, :create, :destroy!, :destroy,
8
- :update_settings!, :update_settings,
9
- :delete!, :delete, :read!, :read,
10
- :alias, :alias!,
11
- :fetch_alias, :mapping, :exists?,
12
- :refresh!, :refresh,
13
- :search_url,
14
- to: :connection
15
-
16
- attr_reader :base_name
17
-
18
- def initialize(index, options = {})
8
+ def initialize(index_name, options = {})
19
9
  options = options.stringify_keys
20
10
 
21
- @index = index
22
- @base_name = index
23
- @stringify = config['stringify']
11
+ @index_name = index_name
12
+ @stringify = config['stringify']
24
13
 
14
+ # alias subs
25
15
  @subs = [options['subs']] if options['subs'].present?
26
16
  @subs = @subs.flatten if @subs.is_a?(Array)
27
17
  end
28
18
 
29
- def store!(key, data)
30
- # map everything to strings
31
- if @stringify
32
- original_data = data
33
- data = stringify_all data
34
- end
19
+ def exists?
20
+ client.indices.exists index: config_name
21
+ end
22
+
23
+ def refresh
24
+ client.indices.refresh index: config_name
25
+ end
26
+
27
+ def update_mapping(type)
28
+ client.indices.put_mapping(
29
+ index: config_name,
30
+ type: type,
31
+ body: config['mappings'][type]
32
+ )
33
+ end
34
+
35
+ def update_settings
36
+ client.indices.put_settings(
37
+ index: config_name,
38
+ body: settings
39
+ )
40
+ end
41
+
42
+ def create
43
+ create!
44
+ rescue ::Waistband::Errors::IndexExists => ex
45
+ true
46
+ end
47
+
48
+ def create!
49
+ raise ::Waistband::Errors::IndexExists.new("Index already exists") if exists?
50
+ client.indices.create index: config_name, body: config.except('name', 'stringify')
51
+ end
52
+
53
+ def delete
54
+ delete!
55
+ rescue ::Waistband::Errors::IndexNotFound => ex
56
+ true
57
+ end
58
+
59
+ def delete!
60
+ raise ::Waistband::Errors::IndexNotFound.new("Index not found") unless exists?
61
+ client.indices.delete index: config_name
62
+ end
63
+
64
+ def save(*args)
65
+ body_hash = args.extract_options!
66
+ id = args.first
67
+ type = body_hash.delete(:type) || default_type_name
68
+
69
+ # map everything to strings if need be
70
+ body_hash = stringify_all(body_hash) if @stringify
71
+
72
+ saved = client.index(
73
+ index: config_name,
74
+ type: type,
75
+ id: id,
76
+ body: body_hash
77
+ )
78
+
79
+ saved['_id'].present?
80
+ end
81
+
82
+ def find(id, options = {})
83
+ find!(id, options)
84
+ rescue Elasticsearch::Transport::Transport::Errors::NotFound
85
+ nil
86
+ end
35
87
 
36
- result = connection.put key, data
37
- data = original_data if @stringify
88
+ def find!(id, options = {})
89
+ doc = read!(id, options)
90
+ doc['_source']
91
+ end
38
92
 
39
- result
93
+ def read(id, options = {})
94
+ read!(id, options)
95
+ rescue Elasticsearch::Transport::Transport::Errors::NotFound
96
+ nil
40
97
  end
41
98
 
42
- def query(options = {})
43
- ::Waistband::Query.new self, options
99
+ def read!(id, options = {})
100
+ options = options.with_indifferent_access
101
+ type = options[:type] || default_type_name
102
+
103
+ client.get(
104
+ index: config_name,
105
+ type: type,
106
+ id: id
107
+ ).with_indifferent_access
44
108
  end
45
109
 
46
- def name
47
- @subs ? "#{@index}__#{@subs.join('_')}" : @index
110
+ def destroy(id, options = {})
111
+ destroy!(id, options)
112
+ rescue Elasticsearch::Transport::Transport::Errors::NotFound
113
+ nil
48
114
  end
49
115
 
50
- def config_name
51
- @subs ? "#{base_config_name}__#{@subs.join('_')}" : base_config_name
116
+ def destroy!(id, options = {})
117
+ options = options.with_indifferent_access
118
+ type = options[:type] || default_type_name
119
+
120
+ client.delete(
121
+ index: config_name,
122
+ id: id,
123
+ type: type
124
+ )
52
125
  end
53
126
 
54
- def config_json
55
- config.except('name', 'stringify').to_json
127
+ def search(body_hash)
128
+ page, page_size = get_page_info body_hash
129
+ body_hash = parse_search_body(body_hash)
130
+
131
+ search_hash = client.search(
132
+ index: config_name,
133
+ body: body_hash
134
+ )
135
+
136
+ ::Waistband::SearchResults.new(search_hash, page: page, page_size: page_size)
56
137
  end
57
138
 
58
- def base_config_name
59
- return config['name'] if config['name']
60
- "#{@base_name}_#{env}"
139
+ def alias(alias_name)
140
+ alias_name = full_alias_name alias_name
141
+ client.indices.put_alias(
142
+ index: config_name,
143
+ name: alias_name
144
+ )
61
145
  end
62
146
 
63
- def custom_name?
64
- !!config['name']
147
+ def alias_exists?(alias_name)
148
+ alias_name = full_alias_name alias_name
149
+ client.indices.alias_exists?(
150
+ index: config_name,
151
+ name: alias_name
152
+ )
65
153
  end
66
154
 
67
155
  def config
68
- Waistband.config.index @index
156
+ ::Waistband.config.index @index_name
69
157
  end
70
158
 
71
- def env
72
- Waistband.config.env
159
+ def client
160
+ @client ||= ::Waistband.config.client
73
161
  end
74
162
 
75
163
  private
76
164
 
165
+ def get_page_info(body_hash)
166
+ page = body_hash[:page]
167
+ page_size = body_hash[:page_size]
168
+ [page, page_size]
169
+ end
170
+
171
+ def parse_search_body(body_hash)
172
+ body_hash = body_hash.with_indifferent_access
173
+
174
+ page = body_hash.delete(:page)
175
+ page_size = body_hash.delete(:page_size)
176
+
177
+ if page
178
+ page = page.to_i
179
+ page_size ||= 20
180
+ body_hash[:from] = page_size * (page - 1) unless body_hash[:from]
181
+ body_hash[:size] = page_size unless body_hash[:size]
182
+ end
183
+
184
+ body_hash
185
+ end
186
+
187
+ def full_alias_name(alias_name)
188
+ name = alias_name
189
+ name << "_#{::Waistband.config.env}" unless custom_name?
190
+ name
191
+ end
192
+
193
+ def custom_name?
194
+ !!config['name']
195
+ end
196
+
77
197
  def stringify_all(data)
78
198
  data = if data.is_a? Array
79
199
  ::Waistband::StringifiedArray.new data
@@ -85,8 +205,22 @@ module Waistband
85
205
  data
86
206
  end
87
207
 
88
- def connection
89
- ::Waistband::Connection.new self
208
+ def default_type_name
209
+ @index_name.singularize
210
+ end
211
+
212
+ def settings
213
+ settings = config['settings']['index'].except('number_of_shards')
214
+ {index: settings}
215
+ end
216
+
217
+ def config_name
218
+ @subs ? "#{base_config_name}__#{@subs.join('_')}" : base_config_name
219
+ end
220
+
221
+ def base_config_name
222
+ return config['name'] if config['name']
223
+ "#{@index_name}_#{::Waistband.config.env}"
90
224
  end
91
225
 
92
226
  # /private
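The page arithmetic in `parse_search_body` above turns a 1-based `page` and a `page_size` into the `from` and `size` keys of the search body. A stand-alone sketch of that translation follows. It is an approximation under stated assumptions: `paginate_body` and `DEFAULT_PAGE_SIZE` are hypothetical names, only symbol keys are handled (the original uses `with_indifferent_access`), and `page_size` is coerced with `to_i`, which the original does not do.

```ruby
# Hypothetical stand-alone version of the page arithmetic; the default of 20
# matches the fallback used in the diff.
DEFAULT_PAGE_SIZE = 20

def paginate_body(body)
  body = body.dup # avoid mutating the caller's hash
  page = body.delete(:page)
  page_size = body.delete(:page_size)

  # Without a page there is nothing to translate; the pagination keys are
  # still stripped so they are not sent as part of the query.
  return body unless page

  page = page.to_i
  page_size = (page_size || DEFAULT_PAGE_SIZE).to_i

  # Explicit :from / :size values supplied by the caller take precedence.
  body[:from] ||= page_size * (page - 1)
  body[:size] ||= page_size
  body
end

body = paginate_body({ query: { term: { hidden: false } }, page: 3, page_size: 5 })
puts "from=#{body[:from]} size=#{body[:size]}"
```

With `page: 3` and `page_size: 5`, the first two pages cover offsets 0 through 9, so the third page starts at offset 10.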