elasticsearch-transport-sinneduy 1.0.12

Files changed (38)
  1. checksums.yaml +7 -0
  2. data/.gitignore +17 -0
  3. data/Gemfile +16 -0
  4. data/LICENSE.txt +13 -0
  5. data/README.md +441 -0
  6. data/Rakefile +80 -0
  7. data/elasticsearch-transport.gemspec +74 -0
  8. data/lib/elasticsearch-transport.rb +1 -0
  9. data/lib/elasticsearch/transport.rb +30 -0
  10. data/lib/elasticsearch/transport/client.rb +195 -0
  11. data/lib/elasticsearch/transport/transport/base.rb +284 -0
  12. data/lib/elasticsearch/transport/transport/connections/collection.rb +93 -0
  13. data/lib/elasticsearch/transport/transport/connections/connection.rb +121 -0
  14. data/lib/elasticsearch/transport/transport/connections/selector.rb +63 -0
  15. data/lib/elasticsearch/transport/transport/errors.rb +73 -0
  16. data/lib/elasticsearch/transport/transport/http/curb.rb +87 -0
  17. data/lib/elasticsearch/transport/transport/http/faraday.rb +60 -0
  18. data/lib/elasticsearch/transport/transport/http/manticore.rb +124 -0
  19. data/lib/elasticsearch/transport/transport/response.rb +21 -0
  20. data/lib/elasticsearch/transport/transport/serializer/multi_json.rb +36 -0
  21. data/lib/elasticsearch/transport/transport/sniffer.rb +46 -0
  22. data/lib/elasticsearch/transport/version.rb +5 -0
  23. data/test/integration/client_test.rb +144 -0
  24. data/test/integration/transport_test.rb +73 -0
  25. data/test/profile/client_benchmark_test.rb +125 -0
  26. data/test/test_helper.rb +76 -0
  27. data/test/unit/client_test.rb +274 -0
  28. data/test/unit/connection_collection_test.rb +88 -0
  29. data/test/unit/connection_selector_test.rb +64 -0
  30. data/test/unit/connection_test.rb +100 -0
  31. data/test/unit/response_test.rb +15 -0
  32. data/test/unit/serializer_test.rb +16 -0
  33. data/test/unit/sniffer_test.rb +145 -0
  34. data/test/unit/transport_base_test.rb +478 -0
  35. data/test/unit/transport_curb_test.rb +97 -0
  36. data/test/unit/transport_faraday_test.rb +140 -0
  37. data/test/unit/transport_manticore_test.rb +118 -0
  38. metadata +408 -0
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: ccea5bbc0bac232c14a2b9edfcd999b67097711f
4
+ data.tar.gz: cbedd9065932ec155d016a719c1eb9a255ce16de
5
+ SHA512:
6
+ metadata.gz: 99109f3f1a801a276f77395c31771d2fc6ffc7791df57290baf6df5e5025660b7ec913e428f97f4c0f188b8a55a01fdd612c0f9dabbb4fca23c0e48a3f49260b
7
+ data.tar.gz: ca1cb8294dd8716659235df49c470218519882a79994e258efe9cc1d7f5777c62b3faf282076ce2303610fab441ebdb86ab6ffd13ffbcf7b21b5fa0d265c9d44
@@ -0,0 +1,17 @@
1
+ *.gem
2
+ *.rbc
3
+ .bundle
4
+ .config
5
+ .yardoc
6
+ Gemfile.lock
7
+ InstalledFiles
8
+ _yardoc
9
+ coverage
10
+ doc/
11
+ lib/bundler/man
12
+ pkg
13
+ rdoc
14
+ spec/reports
15
+ test/tmp
16
+ test/version_tmp
17
+ tmp
data/Gemfile ADDED
@@ -0,0 +1,16 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in elasticsearch-transport.gemspec
4
+ gemspec
5
+
6
+ if File.exist? File.expand_path("../../elasticsearch-api/elasticsearch-api.gemspec", __FILE__)
7
+ gem 'elasticsearch-api', :path => File.expand_path("../../elasticsearch-api", __FILE__), :require => false
8
+ end
9
+
10
+ if File.exist? File.expand_path("../../elasticsearch-extensions", __FILE__)
11
+ gem 'elasticsearch-extensions', :path => File.expand_path("../../elasticsearch-extensions", __FILE__), :require => false
12
+ end
13
+
14
+ if File.exist? File.expand_path("../../elasticsearch/elasticsearch.gemspec", __FILE__)
15
+ gem 'elasticsearch', :path => File.expand_path("../../elasticsearch", __FILE__), :require => false
16
+ end
@@ -0,0 +1,13 @@
1
+ Copyright (c) 2013 Elasticsearch
2
+
3
+ Licensed under the Apache License, Version 2.0 (the "License");
4
+ you may not use this file except in compliance with the License.
5
+ You may obtain a copy of the License at
6
+
7
+ http://www.apache.org/licenses/LICENSE-2.0
8
+
9
+ Unless required by applicable law or agreed to in writing, software
10
+ distributed under the License is distributed on an "AS IS" BASIS,
11
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
+ See the License for the specific language governing permissions and
13
+ limitations under the License.
@@ -0,0 +1,441 @@
1
+ # Elasticsearch::Transport
2
+
3
+ **This library is part of the [`elasticsearch-ruby`](https://github.com/elasticsearch/elasticsearch-ruby/) package;
4
+ please refer to it, unless you want to use this library standalone.**
5
+
6
+ ----
7
+
8
+ The `elasticsearch-transport` library provides a low-level Ruby client for connecting
9
+ to an [Elasticsearch](http://elasticsearch.org) cluster.
10
+
11
+ It handles connecting to multiple nodes in the cluster, rotating across connections,
12
+ logging and tracing requests and responses, maintaining failed connections,
13
+ discovering nodes in the cluster, and provides an abstraction for
14
+ data serialization and transport.
15
+
16
+ It does not handle calling the Elasticsearch API;
17
+ see the [`elasticsearch-api`](https://github.com/elasticsearch/elasticsearch-ruby/tree/master/elasticsearch-api) library.
18
+
19
+ The library is compatible with Ruby 1.8.7 or higher and with Elasticsearch 0.90 and 1.0.
20
+
21
+ Features overview:
22
+
23
+ * Pluggable logging and tracing
24
+ * Pluggable connection selection strategies (round-robin, random, custom)
25
+ * Pluggable transport implementation, customizable and extendable
26
+ * Pluggable serializer implementation
27
+ * Request retries and dead connections handling
28
+ * Node reloading (based on cluster state) on errors or on demand
29
+
30
+ For optimal performance, use an HTTP library which supports persistent ("keep-alive") connections,
31
+ such as [Typhoeus](https://github.com/typhoeus/typhoeus).
32
+ Just require the library (`require 'typhoeus'; require 'typhoeus/adapters/faraday'`) in your code,
33
+ and it will be used automatically. The following libraries are currently detected and used automatically:
34
+ [Patron](https://github.com/toland/patron),
35
+ [HTTPClient](https://rubygems.org/gems/httpclient) and
36
+ [Net::HTTP::Persistent](https://rubygems.org/gems/net-http-persistent).
37
+
38
+ For detailed information, see example configurations [below](#transport-implementations).
39
+
40
+ ## Installation
41
+
42
+ Install the package from [Rubygems](https://rubygems.org):
43
+
44
+ gem install elasticsearch-transport
45
+
46
+ To use an unreleased version, either add it to your `Gemfile` for [Bundler](http://gembundler.com):
47
+
48
+ gem 'elasticsearch-transport', git: 'https://github.com/elasticsearch/elasticsearch-ruby.git'
49
+
50
+ or install it from a source code checkout:
51
+
52
+ git clone https://github.com/elasticsearch/elasticsearch-ruby.git
53
+ cd elasticsearch-ruby/elasticsearch-transport
54
+ bundle install
55
+ rake install
56
+
57
+ ## Example Usage
58
+
59
+ In the simplest form, connect to Elasticsearch running on <http://localhost:9200>
60
+ without any configuration:
61
+
62
+ require 'elasticsearch/transport'
63
+
64
+ client = Elasticsearch::Client.new
65
+ response = client.perform_request 'GET', '_cluster/health'
66
+ # => #<Elasticsearch::Transport::Transport::Response:0x007fc5d506ce38 @status=200, @body={ ... } >
67
+
68
+ Full documentation is available at <http://rubydoc.info/gems/elasticsearch-transport>.
69
+
70
+ ## Configuration
71
+
72
+ The client supports many configuration options for setting up and managing connections,
73
+ configuring logging, customizing the transport library, etc.
74
+
75
+ ### Setting Hosts
76
+
77
+ To connect to a specific Elasticsearch host:
78
+
79
+ Elasticsearch::Client.new host: 'search.myserver.com'
80
+
81
+ To connect to a host with a specific port:
82
+
83
+ Elasticsearch::Client.new host: 'myhost:8080'
84
+
85
+ To connect to multiple hosts:
86
+
87
+ Elasticsearch::Client.new hosts: ['myhost1', 'myhost2']
88
+
89
+ Instead of Strings, you can pass host information as an array of Hashes:
90
+
91
+ Elasticsearch::Client.new hosts: [ { host: 'myhost1', port: 8080 }, { host: 'myhost2', port: 8080 } ]
92
+
93
+ Common URL parts -- scheme, HTTP authentication credentials, URL prefixes, etc. -- are handled automatically:
94
+
95
+ Elasticsearch::Client.new url: 'https://username:password@api.server.org:4430/search'
96
+
97
+ You can pass multiple URLs separated by a comma:
98
+
99
+ Elasticsearch::Client.new urls: 'http://localhost:9200,http://localhost:9201'
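To make the "handled automatically" part more concrete, here is a minimal sketch of how a comma-separated URL string could be split into per-host hashes using only Ruby's standard `URI` library. The `parse_urls` helper and the keys of the resulting hashes are assumptions made for illustration; this is not the library's own parsing code.

```ruby
require 'uri'

# Hypothetical helper (not part of elasticsearch-transport): split a
# comma-separated URL string into one hash per host.
def parse_urls(urls)
  urls.split(',').map do |url|
    uri = URI.parse(url.strip)
    host = { scheme: uri.scheme, host: uri.host, port: uri.port }
    # Credentials and URL prefix are only recorded when present
    host[:user]     = uri.user     if uri.user
    host[:password] = uri.password if uri.password
    host[:path]     = uri.path     unless uri.path.empty?
    host
  end
end

hosts = parse_urls('https://username:password@api.server.org:4430/search,http://localhost:9201')
hosts.each { |h| puts h.inspect }
```

The first URL yields a hash with scheme, credentials, port and the `/search` prefix; the second yields only scheme, host and port.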
100
+
101
+ Another way to configure the URL(s) is to export the `ELASTICSEARCH_URL` variable.
102
+
103
+ The client will automatically round-robin across the hosts
104
+ (unless you select or implement a different [connection selector](#connection-selector)).
105
+
106
+ ### Authentication
107
+
108
+ You can pass the authentication credentials, scheme and port in the host configuration hash:
109
+
110
+ Elasticsearch::Client.new hosts: [
111
+ { host: 'my-protected-host',
112
+ port: '443',
113
+ user: 'USERNAME',
114
+ password: 'PASSWORD',
115
+ scheme: 'https'
116
+ } ]
117
+
118
+ ... or simply use the common URL format:
119
+
120
+ Elasticsearch::Client.new url: 'https://username:password@example.com:9200'
121
+
122
+ To pass a custom certificate for SSL peer verification to Faraday-based clients,
123
+ use the `transport_options` option:
124
+
125
+ Elasticsearch::Client.new url: 'https://username:password@example.com:9200',
126
+ transport_options: { ssl: { ca_file: '/path/to/cacert.pem' } }
127
+
128
+ ### Logging
129
+
130
+ To log requests and responses to standard output with the default logger (an instance of Ruby's {::Logger} class),
131
+ set the `log` argument:
132
+
133
+ Elasticsearch::Client.new log: true
134
+
135
+ To trace requests and responses in the _Curl_ format, set the `trace` argument:
136
+
137
+ Elasticsearch::Client.new trace: true
138
+
139
+ You can customize the default logger or tracer:
140
+
141
+ client.transport.logger.formatter = proc { |s, d, p, m| "#{s}: #{m}\n" }
142
+ client.transport.logger.level = Logger::INFO
143
+
144
+ Or, you can use a custom {::Logger} instance:
145
+
146
+ Elasticsearch::Client.new logger: Logger.new(STDERR)
147
+
148
+ You can pass the client any conforming logger implementation:
149
+
150
+ require 'logging' # https://github.com/TwP/logging/
151
+
152
+ log = Logging.logger['elasticsearch']
153
+ log.add_appenders Logging.appenders.stdout
154
+ log.level = :info
155
+
156
+ client = Elasticsearch::Client.new logger: log
157
+
158
+ ### Setting Timeouts
159
+
160
+ For many operations in Elasticsearch, the default timeouts of HTTP libraries are too low.
161
+ To increase the timeout, you can use the `request_timeout` parameter:
162
+
163
+ Elasticsearch::Client.new request_timeout: 5*60
164
+
165
+ You can also use the `transport_options` argument documented below.
166
+
167
+ ### Randomizing Hosts
168
+
169
+ If you pass multiple hosts to the client, it rotates across them in a round-robin fashion, by default.
170
+ When the same client runs in multiple processes (e.g. in a Ruby web server such as Thin),
171
+ it might keep connecting to the same nodes "at once". To prevent this, you can randomize the hosts
172
+ collection on initialization and reloading:
173
+
174
+ Elasticsearch::Client.new hosts: ['localhost:9200', 'localhost:9201'], randomize_hosts: true
175
+
176
+ ### Retrying on Failures
177
+
178
+ When the client is initialized with multiple hosts, it makes sense to retry a failed request
179
+ on a different host:
180
+
181
+ Elasticsearch::Client.new hosts: ['localhost:9200', 'localhost:9201'], retry_on_failure: true
182
+
183
+ You can specify how many times the client should retry the request before it raises an exception
184
+ (the default is 3 times):
185
+
186
+ Elasticsearch::Client.new hosts: ['localhost:9200', 'localhost:9201'], retry_on_failure: 5
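The retry behaviour can be pictured with a small standalone sketch. The `with_retries` helper and the `HostUnreachable` error class below are hypothetical stand-ins written for this illustration; the client's actual retry logic lives in the transport and may differ in detail.

```ruby
# Stand-in error class for this sketch; the real library defines its own errors.
class HostUnreachable < StandardError; end

# Hypothetical helper: yield hosts in rotation and retry a failed call
# up to max_retries times before re-raising the last error.
def with_retries(hosts, max_retries = 3)
  tries = 0
  begin
    host = hosts[tries % hosts.size]
    yield host
  rescue HostUnreachable
    tries += 1
    retry if tries <= max_retries
    raise
  end
end

attempted = []
result = with_retries(['localhost:9200', 'localhost:9201'], 5) do |host|
  attempted << host
  # Simulate the first node being down
  raise HostUnreachable, host if host.end_with?('9200')
  "200 OK from #{host}"
end

puts result
```

Here the first attempt fails on `localhost:9200` and the retry succeeds on `localhost:9201`.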
187
+
188
+ ### Reloading Hosts
189
+
190
+ Elasticsearch by default dynamically discovers new nodes in the cluster. You can leverage this
191
+ in the client, and periodically check for new nodes to spread the load.
192
+
193
+ To retrieve and use the information from the
194
+ [_Nodes Info API_](http://www.elasticsearch.org/guide/reference/api/admin-cluster-nodes-info/)
195
+ on every 10,000th request:
196
+
197
+ Elasticsearch::Client.new hosts: ['localhost:9200', 'localhost:9201'], reload_connections: true
198
+
199
+ You can pass a specific number of requests after which the reloading should be performed:
200
+
201
+ Elasticsearch::Client.new hosts: ['localhost:9200', 'localhost:9201'], reload_connections: 1_000
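As a rough illustration of "reload after every N requests", the sketch below counts requests and records when a reload would be triggered. `ReloadCounter` is a hypothetical class invented for this example and only models the counting; it does not talk to a cluster and is not how the transport is implemented.

```ruby
# Hypothetical counter: notes a reload after every `reload_after` requests.
class ReloadCounter
  attr_reader :reloads

  def initialize(reload_after = 10_000)
    @reload_after = reload_after
    @requests = 0
    @reloads = 0
  end

  # Call once per request; bumps the reload count on every Nth request
  def record_request
    @requests += 1
    @reloads += 1 if (@requests % @reload_after).zero?
  end
end

counter = ReloadCounter.new(1_000)
2_500.times { counter.record_request }
puts counter.reloads
```

With a threshold of 1,000, the 2,500 simulated requests cross it twice.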
202
+
203
+ To reload connections on failures, use:
204
+
205
+ Elasticsearch::Client.new hosts: ['localhost:9200', 'localhost:9201'], reload_on_failure: true
206
+
207
+ By default, reloading times out if it does not finish within 1 second. To change the setting:
208
+
209
+ Elasticsearch::Client.new hosts: ['localhost:9200', 'localhost:9201'], sniffer_timeout: 3
210
+
211
+ ### Connection Selector
212
+
213
+ By default, the client will rotate the connections in a round-robin fashion, using the
214
+ {Elasticsearch::Transport::Transport::Connections::Selector::RoundRobin} strategy.
215
+
216
+ You can implement your own strategy to customize the behaviour. For example,
217
+ let's have a "rack aware" strategy, which will prefer the nodes with a specific
218
+ [attribute](https://github.com/elasticsearch/elasticsearch/blob/1.0/config/elasticsearch.yml#L81-L85).
219
+ Only when these are unavailable will the strategy use the other nodes:
220
+
221
+ class RackIdSelector
222
+   include Elasticsearch::Transport::Transport::Connections::Selector::Base
223
+
224
+   def select(options={})
225
+     connections.select do |c|
226
+       # Try selecting the nodes with a `rack_id:x1` attribute first
227
+       c.host[:attributes] && c.host[:attributes][:rack_id] == 'x1'
228
+     end.sample || connections.to_a.sample
229
+   end
230
+ end
231
+
232
+ Elasticsearch::Client.new hosts: ['x1.search.org', 'x2.search.org'], selector_class: RackIdSelector
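For comparison with the custom strategy above, the default round-robin rotation can be sketched in a few lines of plain Ruby. `RoundRobinSketch` is a stand-in written for this example, operating on an array of host names; it is not the library's `RoundRobin` class and does not include the `Selector::Base` module.

```ruby
# Stand-in for a round-robin strategy; not the library's RoundRobin class.
class RoundRobinSketch
  def initialize(connections)
    @connections = connections
    @current = -1
  end

  # Return the next connection, wrapping around at the end of the list
  def select(options = {})
    @current = (@current + 1) % @connections.size
    @connections[@current]
  end
end

selector = RoundRobinSketch.new(['x1.search.org', 'x2.search.org'])
picks = Array.new(4) { selector.select }
puts picks.inspect
```

Four consecutive calls alternate between the two hosts, starting with the first one.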
233
+
234
+ ### Transport Implementations
235
+
236
+ By default, the client will use the [_Faraday_](https://rubygems.org/gems/faraday) HTTP library
237
+ as a transport implementation.
238
+
239
+ It will auto-detect and use an _adapter_ for _Faraday_ based on gems loaded in your code,
240
+ preferring HTTP clients with support for persistent connections.
241
+
242
+ To use the [_Patron_](https://github.com/toland/patron) HTTP client, for example, just require it:
243
+
244
+ require 'patron'
245
+
246
+ Then, create a new client, and the _Patron_ gem will be used as the "driver":
247
+
248
+ client = Elasticsearch::Client.new
249
+
250
+ client.transport.connections.first.connection.builder.handlers
251
+ # => [Faraday::Adapter::Patron]
252
+
253
+ 10.times do
254
+ client.nodes.stats(metric: 'http')['nodes'].values.each do |n|
255
+ puts "#{n['name']} : #{n['http']['total_opened']}"
256
+ end
257
+ end
258
+
259
+ # => Stiletoo : 24
260
+ # => Stiletoo : 24
261
+ # => Stiletoo : 24
262
+ # => ...
263
+
264
+ To use a specific adapter for _Faraday_, pass it as the `adapter` argument:
265
+
266
+ client = Elasticsearch::Client.new adapter: :net_http_persistent
267
+
268
+ client.transport.connections.first.connection.builder.handlers
269
+ # => [Faraday::Adapter::NetHttpPersistent]
270
+
271
+ To configure the _Faraday_ instance, pass a configuration block to the transport constructor:
272
+
273
+ require 'typhoeus'
274
+ require 'typhoeus/adapters/faraday'
275
+
276
+ transport_configuration = lambda do |f|
277
+ f.response :logger
278
+ f.adapter :typhoeus
279
+ end
280
+
281
+ transport = Elasticsearch::Transport::Transport::HTTP::Faraday.new \
282
+ hosts: [ { host: 'localhost', port: '9200' } ],
283
+ &transport_configuration
284
+
285
+ # Pass the transport to the client
286
+ #
287
+ client = Elasticsearch::Client.new transport: transport
288
+
289
+ To pass options to the
290
+ [`Faraday::Connection`](https://github.com/lostisland/faraday/blob/master/lib/faraday/connection.rb)
291
+ constructor, use the `transport_options` key:
292
+
293
+ client = Elasticsearch::Client.new transport_options: {
294
+ request: { open_timeout: 1 },
295
+ headers: { user_agent: 'MyApp' },
296
+ params: { :format => 'yaml' },
297
+ ssl: { verify: false }
298
+ }
299
+
300
+ You can also use a bundled [_Curb_](https://rubygems.org/gems/curb) based transport implementation:
301
+
302
+ require 'curb'
303
+ require 'elasticsearch/transport/transport/http/curb'
304
+
305
+ client = Elasticsearch::Client.new transport_class: Elasticsearch::Transport::Transport::HTTP::Curb
306
+
307
+ client.transport.connections.first.connection
308
+ # => #<Curl::Easy http://localhost:9200/>
309
+
310
+ It's possible to customize the _Curb_ instance by passing a block to the constructor as well
311
+ (in this case, as an inline block):
312
+
313
+ transport = Elasticsearch::Transport::Transport::HTTP::Curb.new \
314
+ hosts: [ { host: 'localhost', port: '9200' } ],
315
+ & lambda { |c| c.verbose = true }
316
+
317
+ client = Elasticsearch::Client.new transport: transport
318
+
319
+ Instead of passing the transport to the constructor, you can inject it at run time:
320
+
321
+ # Set up the transport
322
+ #
323
+ faraday_configuration = lambda do |f|
324
+ f.instance_variable_set :@ssl, { verify: false }
325
+ f.adapter :excon
326
+ end
327
+
328
+ faraday_client = Elasticsearch::Transport::Transport::HTTP::Faraday.new \
329
+ hosts: [ { host: 'my-protected-host',
330
+ port: '443',
331
+ user: 'USERNAME',
332
+ password: 'PASSWORD',
333
+ scheme: 'https'
334
+ }],
335
+ &faraday_configuration
336
+
337
+ # Create a default client
338
+ #
339
+ client = Elasticsearch::Client.new
340
+
341
+ # Inject the transport to the client
342
+ #
343
+ client.transport = faraday_client
344
+
345
+ You can write your own transport implementation easily, by including the
346
+ {Elasticsearch::Transport::Transport::Base} module, implementing the required contract,
347
+ and passing it to the client as the `transport_class` parameter -- or injecting it directly.
348
+
349
+ ### Serializer Implementations
350
+
351
+ By default, the [MultiJSON](http://rubygems.org/gems/multi_json) library is used as the
352
+ serializer implementation, and it will pick up the "right" adapter based on gems available.
353
+
354
+ The serialization component is pluggable, though, so you can write your own by including the
355
+ {Elasticsearch::Transport::Transport::Serializer::Base} module, implementing the required contract,
356
+ and passing it to the client as the `serializer_class` or `serializer` parameter.
357
+
358
+ ### Exception Handling
359
+
360
+ The library defines a [number of exception classes](https://github.com/elasticsearch/elasticsearch-ruby/blob/master/elasticsearch-transport/lib/elasticsearch/transport/transport/errors.rb)
361
+ for various client and server errors, as well as unsuccessful HTTP responses,
362
+ making it possible to `rescue` specific exceptions with desired granularity.
363
+
364
+ The highest-level exception is {Elasticsearch::Transport::Transport::Error}
365
+ and will be raised for any generic client *or* server errors.
366
+
367
+ {Elasticsearch::Transport::Transport::ServerError} will be raised for server errors only.
368
+
369
+ As an example for response-specific errors, a `404` response status will raise
370
+ an {Elasticsearch::Transport::Transport::Errors::NotFound} exception.
371
+
372
+ Finally, {Elasticsearch::Transport::Transport::SnifferTimeoutError} will be raised
373
+ when connection reloading ("sniffing") times out.
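The value of the hierarchy is that `rescue` clauses can be ordered from most to least specific. The sketch below uses locally defined stand-in classes in a hypothetical `SketchTransport` module, assuming that `NotFound` inherits from `ServerError`, which in turn inherits from `Error`; check the linked `errors.rb` for the actual class definitions. The `classify` helper is likewise invented for illustration.

```ruby
# Stand-in classes mirroring the hierarchy described above (assumed nesting:
# NotFound < ServerError < Error); the real ones live in the gem's errors.rb.
module SketchTransport
  class Error < StandardError; end
  class ServerError < Error; end
  module Errors
    class NotFound < ServerError; end
  end
end

# Hypothetical helper: simulate a response status and show which
# rescue clause handles it.
def classify(status)
  raise SketchTransport::Errors::NotFound, '[404] Not Found' if status == 404
  raise SketchTransport::ServerError, "[#{status}] Server error" if status >= 500
  :ok
rescue SketchTransport::Errors::NotFound
  :missing          # most specific clause first
rescue SketchTransport::ServerError
  :server_error     # any other server-side failure
rescue SketchTransport::Error
  :transport_error  # generic fallback
end

[200, 404, 503].each { |status| puts "#{status}: #{classify(status)}" }
```

Because Ruby checks `rescue` clauses in order, the `NotFound` clause must come before the broader `ServerError` and `Error` clauses to take effect.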
374
+
375
+ ## Development and Community
376
+
377
+ For local development, clone the repository and run `bundle install`. See `rake -T` for a list of
378
+ available Rake tasks for running tests, generating documentation, starting a testing cluster, etc.
379
+
380
+ Bug fixes and features must be covered by unit tests. Integration tests are written in Ruby 1.9 syntax.
381
+
382
+ GitHub pull requests and issues are used for communication, bug reports, and code contributions.
383
+
384
+ ## The Architecture
385
+
386
+ * {Elasticsearch::Transport::Client} is composed of {Elasticsearch::Transport::Transport}
387
+
388
+ * {Elasticsearch::Transport::Transport} is composed of {Elasticsearch::Transport::Transport::Connections},
389
+ and an instance of logger, tracer, serializer and sniffer.
390
+
391
+ * Logger and tracer can be any object conforming to the Ruby logging interface,
392
+ i.e. an instance of [`Logger`](http://www.ruby-doc.org/stdlib-1.9.3/libdoc/logger/rdoc/Logger.html),
393
+ [_log4r_](https://rubygems.org/gems/log4r), [_logging_](https://github.com/TwP/logging/), etc.
394
+
395
+ * The {Elasticsearch::Transport::Transport::Serializer::Base} implementations handle converting data for Elasticsearch
396
+ (e.g. to JSON). You can implement your own serializer.
397
+
398
+ * {Elasticsearch::Transport::Transport::Sniffer} allows you to discover nodes in the cluster and use them as connections.
399
+
400
+ * {Elasticsearch::Transport::Transport::Connections::Collection} is composed of
401
+ {Elasticsearch::Transport::Transport::Connections::Connection} instances and a selector instance.
402
+
403
+ * {Elasticsearch::Transport::Transport::Connections::Connection} contains the connection attributes such as hostname and port,
404
+ as well as the concrete persistent "session" connected to a specific node.
405
+
406
+ * The {Elasticsearch::Transport::Transport::Connections::Selector::Base} implementations allow you to choose connections
407
+ from the pool, e.g. in a round-robin or random fashion. You can implement your own selector strategy.
408
+
409
+ ## Development
410
+
411
+ To work on the code, clone and bootstrap the main repository first --
412
+ please see instructions in the main [README](../README.md#development).
413
+
414
+ To run tests, launch a testing cluster -- again, see instructions
415
+ in the main [README](../README.md#development) -- and use the Rake tasks:
416
+
417
+ ```
418
+ time rake test:unit
419
+ time rake test:integration
420
+ ```
421
+
422
+ Unit tests have to use Ruby 1.8 compatible syntax, integration tests
423
+ can use Ruby 2.x syntax and features.
424
+
425
+ ## License
426
+
427
+ This software is licensed under the Apache 2 license, quoted below.
428
+
429
+ Copyright (c) 2013 Elasticsearch <http://www.elasticsearch.org>
430
+
431
+ Licensed under the Apache License, Version 2.0 (the "License");
432
+ you may not use this file except in compliance with the License.
433
+ You may obtain a copy of the License at
434
+
435
+ http://www.apache.org/licenses/LICENSE-2.0
436
+
437
+ Unless required by applicable law or agreed to in writing, software
438
+ distributed under the License is distributed on an "AS IS" BASIS,
439
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
440
+ See the License for the specific language governing permissions and
441
+ limitations under the License.