elasticsearch 7.1.0 → 8.10.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 89c5a9a719fabbe7d5bd4c1e1c98ae686e7da2552689213fafc49d4ade5a8c62
4
- data.tar.gz: 06ac4a280af9bd879e62bf8d244d046b2b5238631427487198fa9fd0fa4e9ca7
3
+ metadata.gz: 01703b098a99215d5b9b9591ed0e978843b55314b9dfa2d8fcf4abeebbd7dd1e
4
+ data.tar.gz: 907bf243f4dbc446d962b4055e80a08688b2a28c0fe3a087c8c4e8b648600380
5
5
  SHA512:
6
- metadata.gz: 793d00f82d71ec2b909d48cccd51993a53a5c937584fd5c1ee8c083754fcd712fe19a002bbd09113de5577a082947885528319b972b304dec6dcbc4a876b6324
7
- data.tar.gz: 19d977143f99755f022b04afd534951c303767b5e7d7436b12e17412aabc9e5150d33eaa41af77f3fa9de117d305f7d1dfce3dd43e9e9f801efcb4d0c20fca3a
6
+ metadata.gz: 0c1f8db8b41885f95e07b12e28f7d402c2695cb2497d0e1de348918c0f0ca7740731dfad4770bd3319e9df2c216b23e00e73ba4ca51e5be2686ee86f4b3b4889
7
+ data.tar.gz: 4e770597919914ff503d30c3a81bfa38d0361d626eb04e75a7168728baf402ea479bc7a84febde17abf1e7ca85f0a6efdafde429f2d28755966e2ddb06383334
data/.gitignore CHANGED
@@ -1,19 +1,6 @@
1
- # Licensed to Elasticsearch B.V. under one or more contributor
2
- # license agreements. See the NOTICE file distributed with
3
- # this work for additional information regarding copyright
4
- # ownership. Elasticsearch B.V. licenses this file to you under
5
- # the Apache License, Version 2.0 (the "License"); you may
6
- # not use this file except in compliance with the License.
7
- # You may obtain a copy of the License at
8
- #
9
- # http://www.apache.org/licenses/LICENSE-2.0
10
- #
11
- # Unless required by applicable law or agreed to in writing,
12
- # software distributed under the License is distributed on an
13
- # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
14
- # KIND, either express or implied. See the License for the
15
- # specific language governing permissions and limitations
16
- # under the License.
1
+ # Licensed to Elasticsearch B.V under one or more agreements.
2
+ # Elasticsearch B.V licenses this file to you under the Apache 2.0 License.
3
+ # See the LICENSE file in the project root for more information
17
4
 
18
5
  *.gem
19
6
  *.rbc
data/Gemfile CHANGED
@@ -1,3 +1,5 @@
1
+ # frozen_string_literal: true
2
+
1
3
  # Licensed to Elasticsearch B.V. under one or more contributor
2
4
  # license agreements. See the NOTICE file distributed with
3
5
  # this work for additional information regarding copyright
@@ -6,7 +8,7 @@
6
8
  # not use this file except in compliance with the License.
7
9
  # You may obtain a copy of the License at
8
10
  #
9
- # http://www.apache.org/licenses/LICENSE-2.0
11
+ # http://www.apache.org/licenses/LICENSE-2.0
10
12
  #
11
13
  # Unless required by applicable law or agreed to in writing,
12
14
  # software distributed under the License is distributed on an
@@ -20,14 +22,6 @@ source 'https://rubygems.org'
20
22
  # Specify your gem's dependencies in elasticsearch.gemspec
21
23
  gemspec
22
24
 
23
- if File.exist? File.expand_path("../../elasticsearch-api/elasticsearch-api.gemspec", __FILE__)
24
- gem 'elasticsearch-api', :path => File.expand_path("../../elasticsearch-api", __FILE__), :require => false
25
- end
26
-
27
- if File.exist? File.expand_path("../../elasticsearch-transport/elasticsearch-transport.gemspec", __FILE__)
28
- gem 'elasticsearch-transport', :path => File.expand_path("../../elasticsearch-transport", __FILE__), :require => false
29
- end
30
-
31
- if File.exist? File.expand_path("../../elasticsearch-extensions", __FILE__)
32
- gem 'elasticsearch-extensions', :path => File.expand_path("../../elasticsearch-extensions", __FILE__), :require => true
25
+ if File.exist? File.expand_path('../elasticsearch-api/elasticsearch-api.gemspec', __dir__)
26
+ gem 'elasticsearch-api', path: File.expand_path('../elasticsearch-api', __dir__), require: false
33
27
  end
data/README.md CHANGED
@@ -2,6 +2,29 @@
2
2
 
3
3
  The `elasticsearch` library provides a Ruby client and API for [Elasticsearch](http://elasticsearch.com).
4
4
 
5
+ ## Usage
6
+
7
+ This gem is a wrapper for two separate libraries:
8
+
9
+ * [`elastic-transport`](https://github.com/elastic/elastic-transport-ruby/), which provides a low-level Ruby client for connecting to [Elastic](http://elasticsearch.com) services.
10
+ * [`elasticsearch-api`](https://github.com/elasticsearch/elasticsearch-ruby/tree/main/elasticsearch-api), which provides a Ruby API for the Elasticsearch RESTful API.
11
+
12
+ Install the `elasticsearch` package and use the API directly:
13
+
14
+ ```ruby
15
+ require 'elasticsearch'
16
+
17
+ client = Elasticsearch::Client.new(log: true)
18
+
19
+ client.cluster.health
20
+
21
+ client.transport.reload_connections!
22
+
23
+ client.search(q: 'test')
24
+
25
+ # etc.
26
+ ```
27
+
5
28
  Features overview:
6
29
 
7
30
  * Pluggable logging and tracing
@@ -14,24 +37,13 @@ Features overview:
14
37
  * Extensive documentation and examples
15
38
  * Emphasis on modularity and extendability of both the client and API libraries
16
39
 
17
- (For integration with Ruby models and Rails applications,
18
- see the <https://github.com/elasticsearch/elasticsearch-rails> project.)
40
+ (For integration with Ruby models and Rails applications, see the <https://github.com/elasticsearch/elasticsearch-rails> project.)
19
41
 
20
42
  ## Compatibility
21
43
 
22
- The Elasticsearch client for Ruby is compatible with Ruby 1.9 and higher.
44
+ We follow Ruby’s own maintenance policy and officially support all currently maintained versions per [Ruby Maintenance Branches](https://www.ruby-lang.org/en/downloads/branches/).
23
45
 
24
- The client's API is compatible with Elasticsearch's API versions from 0.90 till current,
25
- just use a release matching major version of Elasticsearch.
26
-
27
- | Ruby | | Elasticsearch |
28
- |:-------------:|:-:| :-----------: |
29
- | 0.90 | → | 0.90 |
30
- | 1.x | → | 1.x |
31
- | 2.x | → | 2.x |
32
- | 5.x | → | 5.x |
33
- | 6.x | → | 6.x |
34
- | master | → | master |
46
+ Language clients are forward compatible; meaning that clients support communicating with greater minor versions of Elasticsearch. Elastic language clients are also backwards compatible with lesser supported minor Elasticsearch versions.
35
47
 
36
48
  ## Installation
37
49
 
@@ -50,58 +62,68 @@ or install it from a source code checkout:
50
62
  bundle install
51
63
  rake install
52
64
 
53
- ## Usage
65
+ ## Configuration
54
66
 
55
- This library is a wrapper for two separate libraries:
67
+ * [Identifying running tasks with X-Opaque-Id](#identifying-running-tasks-with-x-opaque-id)
68
+ * [Api Key Authentication](#api-key-authentication)
56
69
 
57
- * [`elasticsearch-transport`](https://github.com/elasticsearch/elasticsearch-ruby/tree/master/elasticsearch-transport),
58
- which provides a low-level Ruby client for connecting to an [Elasticsearch](http://elasticsearch.com) cluster
59
- * [`elasticsearch-api`](https://github.com/elasticsearch/elasticsearch-ruby/tree/master/elasticsearch-api),
60
- which provides a Ruby API for the Elasticsearch RESTful API
70
+ ### Identifying running tasks with X-Opaque-Id
61
71
 
62
- Install the `elasticsearch` package and use the API directly:
72
+ The X-Opaque-Id header lets you track certain calls, or associate certain tasks with the client that started them ([more on the Elasticsearch docs](https://www.elastic.co/guide/en/elasticsearch/reference/master/tasks.html#_identifying_running_tasks)). To use this feature, set an id for `opaque_id` on the client on each request. Example:
63
73
 
64
74
  ```ruby
65
- require 'elasticsearch'
75
+ client = Elasticsearch::Client.new
76
+ client.search(index: 'myindex', q: 'title:test', opaque_id: '123456')
77
+ ```
78
+ The search request will include the following HTTP Header:
79
+ ```
80
+ X-Opaque-Id: 123456
81
+ ```
66
82
 
67
- client = Elasticsearch::Client.new log: true
83
+ You can also set a prefix for X-Opaque-Id when initializing the client. This will be prepended to the id you set before each request if you're using X-Opaque-Id. Example:
84
+ ```ruby
85
+ client = Elastic::Transport::Client.new(opaque_id_prefix: 'eu-west1_')
86
+ client.search(index: 'myindex', q: 'title:test', opaque_id: '123456')
87
+ ```
88
+ The request will include the following HTTP Header:
89
+ ```
90
+ X-Opaque-Id: eu-west1_123456
91
+ ```
68
92
 
69
- client.cluster.health
93
+ ### Api Key Authentication
70
94
 
71
- client.transport.reload_connections!
95
+ You can use [**API Key authentication**](https://www.elastic.co/guide/en/elasticsearch/reference/current/security-api-create-api-key.html):
72
96
 
73
- client.search q: 'test'
97
+ ``` ruby
98
+ Elasticsearch::Client.new(
99
+ host: host,
100
+ transport_options: transport_options,
101
+ api_key: credentials
102
+ )
103
+ ```
74
104
 
75
- # etc.
105
+ Where credentials is either the base64 encoding of `id` and `api_key` joined by a colon or a hash with the `id` and `api_key`:
106
+
107
+ ``` ruby
108
+ Elasticsearch::Client.new(
109
+ host: host,
110
+ transport_options: transport_options,
111
+ api_key: {id: 'my_id', api_key: 'my_api_key'}
112
+ )
76
113
  ```
77
114
 
115
+ ## API and Transport
116
+
78
117
  Please refer to the specific library documentation for details:
79
118
 
80
119
  * **Transport**:
81
- [[README]](https://github.com/elasticsearch/elasticsearch-ruby/blob/master/elasticsearch-transport/README.md)
82
- [[Documentation]](http://rubydoc.info/gems/elasticsearch-transport/file/README.markdown)
120
+ [[README]](https://github.com/elastic/elastic-transport-ruby#elastic-transport)
121
+ [[Documentation]](https://rubydoc.info/github/elastic/elastic-transport-ruby/)
83
122
 
84
123
  * **API**:
85
- [[README]](https://github.com/elasticsearch/elasticsearch-ruby/blob/master/elasticsearch-api/README.md)
86
- [[Documentation]](http://rubydoc.info/gems/elasticsearch-api/file/README.markdown)
124
+ [[README]](https://github.com/elastic/elasticsearch-ruby/tree/main/elasticsearch-api#elasticsearchapi)
125
+ [[Documentation]](https://rubydoc.info/gems/elasticsearch-api)
87
126
 
88
127
  ## License
89
128
 
90
- This software is licensed under the Apache 2 license, quoted below.
91
-
92
- Licensed to Elasticsearch B.V. under one or more contributor
93
- license agreements. See the NOTICE file distributed with
94
- this work for additional information regarding copyright
95
- ownership. Elasticsearch B.V. licenses this file to you under
96
- the Apache License, Version 2.0 (the "License"); you may
97
- not use this file except in compliance with the License.
98
- You may obtain a copy of the License at
99
-
100
- http://www.apache.org/licenses/LICENSE-2.0
101
-
102
- Unless required by applicable law or agreed to in writing,
103
- software distributed under the License is distributed on an
104
- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
105
- KIND, either express or implied. See the License for the
106
- specific language governing permissions and limitations
107
- under the License.
129
+ This software is licensed under the [Apache 2 license](./LICENSE).
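
The API key section of the README diff above describes `credentials` as either a Base64 string of `id` and `api_key` joined by a colon, or a hash with both values. A minimal sketch of building the string form with core Ruby follows. `my_id` and `my_api_key` are the README's placeholder values, and the helper name `encode_api_key` is hypothetical, not part of the gem.

```ruby
# Hypothetical helper (not part of the gem): join id and api_key with a
# colon and Base64-encode the result, as the README describes.
# Array#pack('m0') produces Base64 without line breaks.
def encode_api_key(id, api_key)
  ["#{id}:#{api_key}"].pack('m0')
end

credentials = encode_api_key('my_id', 'my_api_key')
puts credentials

# Decoding the string recovers the original "id:api_key" pair.
puts credentials.unpack1('m0')
```

The resulting string would be passed as `api_key: credentials`; the hash form shown in the README avoids the manual encoding step.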
data/Rakefile CHANGED
@@ -6,7 +6,7 @@
6
6
  # not use this file except in compliance with the License.
7
7
  # You may obtain a copy of the License at
8
8
  #
9
- # http://www.apache.org/licenses/LICENSE-2.0
9
+ # http://www.apache.org/licenses/LICENSE-2.0
10
10
  #
11
11
  # Unless required by applicable law or agreed to in writing,
12
12
  # software distributed under the License is distributed on an
@@ -15,41 +15,35 @@
15
15
  # specific language governing permissions and limitations
16
16
  # under the License.
17
17
 
18
- require "bundler/gem_tasks"
18
+ require 'bundler/gem_tasks'
19
19
 
20
- desc "Run unit tests"
21
- task :test => 'test:unit'
20
+ task(:default) { system 'rake --tasks' }
21
+
22
+ desc 'Run unit tests'
23
+ task test: 'test:spec'
22
24
 
23
25
  # ----- Test tasks ------------------------------------------------------------
26
+ require 'rspec/core/rake_task'
24
27
 
25
- require 'rake/testtask'
26
28
  namespace :test do
27
-
28
- desc "Wait for Elasticsearch to be in a green state"
29
+ desc 'Wait for Elasticsearch to be in a green state'
29
30
  task :wait_for_green do
30
31
  sh '../scripts/wait-cluster.sh'
31
32
  end
32
33
 
33
- Rake::TestTask.new(:unit) do |test|
34
- test.libs << 'lib' << 'test'
35
- test.test_files = FileList["test/unit/**/*_test.rb"]
36
- test.verbose = false
37
- test.warning = false
34
+ RSpec::Core::RakeTask.new(:integration) do |t|
35
+ t.pattern = 'spec/integration/**{,/*/**}/*_spec.rb'
38
36
  end
39
37
 
40
- Rake::TestTask.new(:integration) do |test|
41
- test.deps = [ :wait_for_green ]
42
- test.libs << 'lib' << 'test'
43
- test.test_files = FileList["test/integration/**/*_test.rb"]
44
- test.verbose = false
45
- test.warning = false
38
+ RSpec::Core::RakeTask.new(:unit) do |t|
39
+ t.pattern = 'spec/unit/**{,/*/**}/*_spec.rb'
46
40
  end
47
41
 
48
- Rake::TestTask.new(:all) do |test|
49
- test.libs << 'lib' << 'test'
50
- test.test_files = FileList["test/unit/**/*_test.rb", "test/integration/**/*_test.rb"]
42
+ desc 'Run unit and integration tests'
43
+ task :all do
44
+ Rake::Task['test:unit'].invoke
45
+ Rake::Task['test:integration'].invoke
51
46
  end
52
-
53
47
  end
54
48
 
55
49
  # ----- Documentation tasks ---------------------------------------------------
@@ -1,30 +1,14 @@
1
1
  #!/usr/bin/env ruby
2
2
 
3
3
  $LOAD_PATH.unshift(File.expand_path('../../elasticsearch/lib', __dir__))
4
- $LOAD_PATH.unshift(File.expand_path('../../elasticsearch-transport/lib', __dir__))
5
- $LOAD_PATH.unshift(File.expand_path('../../elasticsearch-dsl/lib', __dir__))
6
4
  $LOAD_PATH.unshift(File.expand_path('../../elasticsearch-api/lib', __dir__))
7
- $LOAD_PATH.unshift(File.expand_path('../../elasticsearch-xpack/lib', __dir__))
8
- $LOAD_PATH.unshift(File.expand_path('../../elasticsearch-extensions/lib', __dir__))
5
+ $LOAD_PATH.unshift(File.expand_path('../../elasticsearch/lib/elasticsearch/helpers', __dir__))
9
6
 
10
7
  require 'elasticsearch'
11
- require 'elasticsearch-transport'
12
8
  require 'elasticsearch-api'
13
-
14
- gems_not_loaded = ['elasticsearch-dsl', 'elasticsearch/xpack', 'elasticsearch-extensions'].reject do |gem|
15
- begin
16
- (require gem) || true
17
- rescue LoadError
18
- false
19
- end
20
- end
21
-
22
- unless gems_not_loaded.empty?
23
- warn "The following gems were not loaded: [#{gems_not_loaded.join(', ')}]. Please install and require them explicitly."
24
- end
9
+ require 'elasticsearch/helpers/bulk_helper'
25
10
 
26
11
  include Elasticsearch
27
- include Elasticsearch::DSL if defined?(Elasticsearch::DSL)
28
12
 
29
13
  begin
30
14
  require 'pry'
@@ -6,7 +6,7 @@
6
6
  # not use this file except in compliance with the License.
7
7
  # You may obtain a copy of the License at
8
8
  #
9
- # http://www.apache.org/licenses/LICENSE-2.0
9
+ # http://www.apache.org/licenses/LICENSE-2.0
10
10
  #
11
11
  # Unless required by applicable law or agreed to in writing,
12
12
  # software distributed under the License is distributed on an
@@ -20,66 +20,44 @@ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
20
20
  require 'elasticsearch/version'
21
21
 
22
22
  Gem::Specification.new do |s|
23
- s.name = "elasticsearch"
23
+ s.name = 'elasticsearch'
24
24
  s.version = Elasticsearch::VERSION
25
- s.authors = ["Karel Minarik"]
26
- s.email = ["karel.minarik@elasticsearch.org"]
27
- s.summary = "Ruby integrations for Elasticsearch"
28
- s.homepage = "http://github.com/elasticsearch/elasticsearch-ruby"
29
- s.license = "Apache-2.0"
25
+ s.authors = ['Karel Minarik', 'Emily Stolfo', 'Fernando Briano']
26
+ s.email = ['clients-team@elastic.co']
27
+ s.summary = 'Ruby integrations for Elasticsearch'
28
+ s.homepage = 'https://www.elastic.co/guide/en/elasticsearch/client/ruby-api/current/index.html'
29
+ s.license = 'Apache-2.0'
30
+ s.metadata = {
31
+ 'homepage_uri' => 'https://www.elastic.co/guide/en/elasticsearch/client/ruby-api/current/index.html',
32
+ 'changelog_uri' => 'https://github.com/elastic/elasticsearch-ruby/blob/main/CHANGELOG.md',
33
+ 'source_code_uri' => 'https://github.com/elastic/elasticsearch-ruby/tree/main',
34
+ 'bug_tracker_uri' => 'https://github.com/elastic/elasticsearch-ruby/issues'
35
+ }
36
+ s.files = `git ls-files`.split($/)
37
+ s.executables = s.files.grep(%r{^bin/}) { |f| File.basename(f) }
38
+ s.executables << 'elastic_ruby_console'
39
+ s.test_files = s.files.grep(%r{^(test|spec|features)/})
40
+ s.require_paths = ['lib']
41
+ s.bindir = 'bin'
30
42
 
31
- s.files = `git ls-files`.split($/)
32
- s.executables = s.files.grep(%r{^bin/}) { |f| File.basename(f) }
33
- s.executables << 'elastic_ruby_console'
34
- s.test_files = s.files.grep(%r{^(test|spec|features)/})
35
- s.require_paths = ["lib"]
36
- s.bindir = "bin"
43
+ s.extra_rdoc_files = ['README.md', 'LICENSE.txt']
44
+ s.rdoc_options = ['--charset=UTF-8']
37
45
 
38
- s.extra_rdoc_files = [ "README.md", "LICENSE.txt" ]
39
- s.rdoc_options = [ "--charset=UTF-8" ]
46
+ s.required_ruby_version = '>= 2.5'
40
47
 
41
- s.required_ruby_version = '>= 1.9'
48
+ s.add_dependency 'elastic-transport', '~> 8'
49
+ s.add_dependency 'elasticsearch-api', '8.10.0'
42
50
 
43
- s.add_dependency "elasticsearch-transport", '7.1.0'
44
- s.add_dependency "elasticsearch-api", '7.1.0'
45
-
46
- s.add_development_dependency "bundler"
47
-
48
- if defined?(RUBY_VERSION) && RUBY_VERSION > '1.9'
49
- s.add_development_dependency "rake", "~> 11.1"
50
- else
51
- s.add_development_dependency "rake", "< 11.0"
52
- end
53
-
54
- if defined?(RUBY_VERSION) && RUBY_VERSION > '1.9'
55
- s.add_development_dependency "elasticsearch-extensions"
56
- end
57
-
58
- s.add_development_dependency "ansi"
59
- s.add_development_dependency "shoulda-context"
60
- s.add_development_dependency "mocha"
61
- s.add_development_dependency "yard"
62
- s.add_development_dependency "pry"
63
-
64
- # Prevent unit test failures on Ruby 1.8
65
- if defined?(RUBY_VERSION) && RUBY_VERSION < '1.9'
66
- s.add_development_dependency "test-unit", '~> 2'
67
- s.add_development_dependency "json", '~> 1.8'
68
- end
69
-
70
- if defined?(RUBY_VERSION) && RUBY_VERSION > '1.9'
71
- s.add_development_dependency "minitest"
72
- s.add_development_dependency "minitest-reporters"
73
- s.add_development_dependency "ruby-prof" unless defined?(JRUBY_VERSION) || defined?(Rubinius)
74
- s.add_development_dependency "require-prof" unless defined?(JRUBY_VERSION) || defined?(Rubinius)
75
- s.add_development_dependency "simplecov"
76
- s.add_development_dependency "simplecov-rcov"
77
- s.add_development_dependency "cane"
78
- end
79
-
80
- if defined?(RUBY_VERSION) && RUBY_VERSION > '2.2'
81
- s.add_development_dependency "test-unit", '~> 2'
82
- end
51
+ s.add_development_dependency 'bundler'
52
+ s.add_development_dependency 'byebug' unless defined?(JRUBY_VERSION) || defined?(Rubinius)
53
+ s.add_development_dependency 'pry'
54
+ s.add_development_dependency 'rake'
55
+ s.add_development_dependency 'require-prof' unless defined?(JRUBY_VERSION) || defined?(Rubinius)
56
+ s.add_development_dependency 'rspec'
57
+ s.add_development_dependency 'ruby-prof' unless defined?(JRUBY_VERSION) || defined?(Rubinius)
58
+ s.add_development_dependency 'simplecov'
59
+ s.add_development_dependency 'webmock'
60
+ s.add_development_dependency 'yard'
83
61
 
84
62
  s.description = <<-DESC.gsub(/^ /, '')
85
63
  Ruby integrations for Elasticsearch (client, API, etc.)
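
The gemspec hunk above swaps the exact `elasticsearch-transport` pin for `elastic-transport` at `'~> 8'`, pins `elasticsearch-api` at `'8.10.0'`, and raises the minimum Ruby version to 2.5. A small sketch using RubyGems' own `Gem::Requirement` shows what each constraint accepts. The version numbers tested are arbitrary illustrations, and the helper `satisfied?` is hypothetical.

```ruby
require 'rubygems'

# '~> 8' is the pessimistic operator: any version >= 8 and < 9.
TRANSPORT = Gem::Requirement.new('~> 8')
# A bare version string pins exactly that version.
API = Gem::Requirement.new('8.10.0')
# Minimum supported Ruby version.
RUBY = Gem::Requirement.new('>= 2.5')

# Hypothetical convenience wrapper around Gem::Requirement#satisfied_by?.
def satisfied?(requirement, version)
  requirement.satisfied_by?(Gem::Version.new(version))
end

puts satisfied?(TRANSPORT, '8.3.0')   # inside the 8.x range
puts satisfied?(TRANSPORT, '9.0.0')   # next major version
puts satisfied?(API, '8.10.1')        # anything but the exact pin
puts satisfied?(RUBY, '2.4.10')       # older than the minimum
```

The loose transport constraint lets the transport library release independently, while the API gem stays in lockstep with the `elasticsearch` gem version.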
@@ -0,0 +1,127 @@
1
+ # Licensed to Elasticsearch B.V. under one or more contributor
2
+ # license agreements. See the NOTICE file distributed with
3
+ # this work for additional information regarding copyright
4
+ # ownership. Elasticsearch B.V. licenses this file to you under
5
+ # the Apache License, Version 2.0 (the "License"); you may
6
+ # not use this file except in compliance with the License.
7
+ # You may obtain a copy of the License at
8
+ #
9
+ # http://www.apache.org/licenses/LICENSE-2.0
10
+ #
11
+ # Unless required by applicable law or agreed to in writing,
12
+ # software distributed under the License is distributed on an
13
+ # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
14
+ # KIND, either express or implied. See the License for the
15
+ # specific language governing permissions and limitations
16
+ # under the License.
17
+
18
+ module Elasticsearch
19
+ module Helpers
20
+ # Elasticsearch Client Helper for the Bulk API
21
+ #
22
+ # @see https://www.elastic.co/guide/en/elasticsearch/reference/master/docs-bulk.html
23
+ #
24
+ class BulkHelper
25
+ attr_accessor :index
26
+
27
+ # Create a BulkHelper
28
+ #
29
+ # @param [Elasticsearch::Client] client Instance of Elasticsearch client to use.
30
+ # @param [String] index Index on which to perform the Bulk actions.
31
+ # @param [Hash] params Parameters to re-use in every bulk call
32
+ #
33
+ def initialize(client, index, params = {})
34
+ @client = client
35
+ @index = index
36
+ @params = params
37
+ end
38
+
39
+ # Index documents using the Bulk API.
40
+ #
41
+ # @param [Array<Hash>] docs The documents to be indexed.
42
+ # @param [Hash] params Parameters to use in the bulk ingestion. See the official Elastic documentation for Bulk API for parameters to send to the Bulk API.
43
+ # @option params [Integer] slice number of documents to send to the Bulk API for each batch of ingestion.
44
+ # @param block [Block] Optional block to run after ingesting a batch of documents.
45
+ # @yieldparam response [Elasticsearch::Transport::Response] The response object from calling the Bulk API.
46
+ # @yieldparam ingest_docs [Array<Hash>] The collection of documents sent in the bulk request.
47
+ #
48
+ def ingest(docs, params = {}, body = {}, &block)
49
+ ingest_docs = docs.map { |doc| { index: { _index: @index, data: doc} } }
50
+ if (slice = params.delete(:slice))
51
+ ingest_docs.each_slice(slice) { |items| ingest(items, params, &block) }
52
+ else
53
+ bulk_request(ingest_docs, params, &block)
54
+ end
55
+ end
56
+
57
+ # Delete documents using the Bulk API
58
+ #
59
+ # @param [Array] ids Array of id's for documents to delete.
60
+ # @param [Hash] params Parameters to send to bulk delete.
61
+ #
62
+ def delete(ids, params = {}, body = {})
63
+ delete_docs = ids.map { |id| { delete: { _index: @index, _id: id} } }
64
+ @client.bulk({ body: delete_docs }.merge(params.merge(@params)))
65
+ end
66
+
67
+ # Update documents using the Bulk API
68
+ #
69
+ # @param [Array<Hash>] docs (Required) The documents to be updated.
70
+ # @option params [Integer] slice number of documents to send to the Bulk API for each batch of updates.
71
+ # @param block [Block] Optional block to run after ingesting a batch of documents.
72
+ #
73
+ # @yieldparam response [Elasticsearch::Transport::Response] The response object from calling the Bulk API.
74
+ # @yieldparam ingest_docs [Array<Hash>] The collection of documents sent in the bulk request.
75
+ #
76
+ def update(docs, params = {}, body = {}, &block)
77
+ ingest_docs = docs.map do |doc|
78
+ { update: { _index: @index, _id: doc.delete('id'), data: { doc: doc } } }
79
+ end
80
+ if (slice = params.delete(:slice))
81
+ ingest_docs.each_slice(slice) { |items| update(items, params, &block) }
82
+ else
83
+ bulk_request(ingest_docs, params, &block)
84
+ end
85
+ end
86
+
87
+ # Ingest data directly from a JSON file
88
+ #
89
+ # @param [String] file (Required) The file path.
90
+ # @param [Hash] params Parameters to use in the bulk ingestion.
91
+ # @option params [Integer] slice number of documents to send to the Bulk API for each batch of updates.
92
+ # @option params [Array|String] keys If the data needs to be dug out of the JSON file, the
93
+ # keys can be passed in with this parameter to find it.
94
+ #
95
+ # E.g.: If the data in the parsed JSON Hash is found in
96
+ # +json_parsed['data']['items']+, keys would be passed
97
+ # like this (as an Array):
98
+ #
99
+ # +bulk_helper.ingest_json(file, { keys: ['data', 'items'] })+
100
+ #
101
+ # or as a String:
102
+ #
103
+ # +bulk_helper.ingest_json(file, { keys: 'data, items' })+
104
+ #
105
+ # @yieldparam response [Elasticsearch::Transport::Response] The response object from calling the Bulk API.
106
+ # @yieldparam ingest_docs [Array<Hash>] The collection of documents sent in the bulk request.
107
+ #
108
+ def ingest_json(file, params = {}, &block)
109
+ data = JSON.parse(File.read(file))
110
+ if (keys = params.delete(:keys))
111
+ keys = keys.split(',') if keys.is_a?(String)
112
+ data = data.dig(*keys)
113
+ end
114
+
115
+ ingest(data, params, &block)
116
+ end
117
+
118
+ private
119
+
120
+ def bulk_request(ingest_docs, params, &block)
121
+ response = @client.bulk({ body: ingest_docs }.merge(params.merge(@params)))
122
+ yield response, ingest_docs if block_given?
123
+ response
124
+ end
125
+ end
126
+ end
127
+ end
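
The `slice` option of `BulkHelper#ingest` above sends documents in batches. As written in the hunk, the `slice` branch passes already wrapped actions back into `ingest`, which appears to wrap them a second time. The sketch below shows the batching idea with the raw documents sliced first, so each document is wrapped exactly once. `RecordingClient` and `ingest_in_batches` are hypothetical names for illustration; the stand-in client only records what it receives and is not the gem's client.

```ruby
# Hypothetical stand-in for Elasticsearch::Client: it records each bulk body
# instead of talking to a cluster.
class RecordingClient
  attr_reader :requests

  def initialize
    @requests = []
  end

  def bulk(arguments = {})
    @requests << arguments[:body]
    { 'errors' => false, 'items' => arguments[:body] }
  end
end

# Slice the raw documents, then wrap each batch in `index` actions using the
# same action shape as the helper above.
def ingest_in_batches(client, index, docs, slice)
  docs.each_slice(slice).map do |batch|
    body = batch.map { |doc| { index: { _index: index, data: doc } } }
    response = client.bulk(body: body)
    # Mirror the helper's block contract: response first, then the batch sent.
    yield response, body if block_given?
    response
  end
end

client = RecordingClient.new
docs = (1..5).map { |n| { 'title' => "doc #{n}" } }

ingest_in_batches(client, 'myindex', docs, 2) do |_response, batch|
  puts "sent #{batch.size} actions"
end
```

With five documents and a slice of two, the stand-in client receives three bulk requests, the last one holding a single action.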
@@ -0,0 +1,95 @@
1
+ # Licensed to Elasticsearch B.V. under one or more contributor
2
+ # license agreements. See the NOTICE file distributed with
3
+ # this work for additional information regarding copyright
4
+ # ownership. Elasticsearch B.V. licenses this file to you under
5
+ # the Apache License, Version 2.0 (the "License"); you may
6
+ # not use this file except in compliance with the License.
7
+ # You may obtain a copy of the License at
8
+ #
9
+ # http://www.apache.org/licenses/LICENSE-2.0
10
+ #
11
+ # Unless required by applicable law or agreed to in writing,
12
+ # software distributed under the License is distributed on an
13
+ # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
14
+ # KIND, either express or implied. See the License for the
15
+ # specific language governing permissions and limitations
16
+ # under the License.
17
+
18
+ module Elasticsearch
19
+ module Helpers
20
+ # Elasticsearch Client Helper for the Scroll API
21
+ #
22
+ # @see https://www.elastic.co/guide/en/elasticsearch/reference/current/scroll-api.html
23
+ #
24
+ class ScrollHelper
25
+ include Enumerable
26
+
27
+ # Create a ScrollHelper
28
+ #
29
+ # @param [Elasticsearch::Client] client (Required) Instance of Elasticsearch client to use.
30
+ # @param [String] index (Required) Index on which to perform the Bulk actions.
31
+ # @param [Hash] body Body parameters to re-use in every scroll request
32
+ # @param [Time] scroll Specify how long a consistent view of the index should be maintained for scrolled search
33
+ #
34
+ def initialize(client, index, body, scroll = '1m')
35
+ @index = index
36
+ @client = client
37
+ @scroll = scroll
38
+ @body = body
39
+ end
40
+
41
+ # Implementation of +each+ for Enumerable module inclusion
42
+ #
43
+ # @yieldparam document [Hash] yields a document found in the search hits.
44
+ #
45
+ def each(&block)
46
+ @docs = []
47
+ @scroll_id = nil
48
+ refresh_docs
49
+ for doc in @docs do
50
+ refresh_docs
51
+ yield doc
52
+ end
53
+ clear
54
+ end
55
+
56
+ # Results from a scroll.
57
+ # Can be called repeatedly (e.g. in a loop) to get the scroll pages.
58
+ #
59
+ def results
60
+ if @scroll_id
61
+ scroll_request
62
+ else
63
+ initial_search
64
+ end
65
+ rescue StandardError => e
66
+ raise e
67
+ end
68
+
69
+ # Clear Scroll and resets inner documents collection
70
+ #
71
+ def clear
72
+ @client.clear_scroll(body: { scroll_id: @scroll_id }) if @scroll_id
73
+ @docs = []
74
+ end
75
+
76
+ private
77
+
78
+ def refresh_docs
79
+ @docs ||= []
80
+ @docs << results
81
+ @docs.flatten!
82
+ end
83
+
84
+ def initial_search
85
+ response = @client.search(index: @index, scroll: @scroll, body: @body)
86
+ @scroll_id = response['_scroll_id']
87
+ response['hits']['hits']
88
+ end
89
+
90
+ def scroll_request
91
+ @client.scroll(body: {scroll: @scroll, scroll_id: @scroll_id})['hits']['hits']
92
+ end
93
+ end
94
+ end
95
+ end
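
`ScrollHelper` above includes `Enumerable` and implements `each` on top of an initial search followed by scroll requests. The sketch below illustrates that pattern in a simplified form that issues one request per page rather than one per yielded document. `FakeScrollClient` and `PagedScroll` are hypothetical names, the in-memory client only imitates the response shape (`_scroll_id`, `hits.hits`) used by the helper, and none of this is the gem's implementation.

```ruby
# Hypothetical in-memory client that pages through a fixed list of hits.
class FakeScrollClient
  attr_reader :cleared

  def initialize(hits, page_size)
    @pages = hits.each_slice(page_size).to_a
    @cleared = false
  end

  def search(_arguments = {})
    @cursor = 0
    { '_scroll_id' => 'scroll-1', 'hits' => { 'hits' => next_page } }
  end

  def scroll(_arguments = {})
    { 'hits' => { 'hits' => next_page } }
  end

  def clear_scroll(_arguments = {})
    @cleared = true
  end

  private

  # Return the current page (or an empty page once exhausted) and advance.
  def next_page
    page = @pages[@cursor] || []
    @cursor += 1
    page
  end
end

# Hypothetical simplified scroller: fetches the next page only when the
# current one has been fully yielded.
class PagedScroll
  include Enumerable

  def initialize(client, index, body, scroll = '1m')
    @client = client
    @index = index
    @body = body
    @scroll = scroll
  end

  def each
    return to_enum(:each) unless block_given?

    response = @client.search(index: @index, scroll: @scroll, body: @body)
    scroll_id = response['_scroll_id']
    hits = response['hits']['hits']
    until hits.empty?
      hits.each { |hit| yield hit }
      hits = @client.scroll(body: { scroll: @scroll, scroll_id: scroll_id })['hits']['hits']
    end
  ensure
    # Release the scroll context even if the caller stops iterating early.
    @client.clear_scroll(body: { scroll_id: scroll_id }) if scroll_id
  end
end

hits = (1..5).map { |n| { '_id' => n.to_s } }
client = FakeScrollClient.new(hits, 2)
scroller = PagedScroll.new(client, 'myindex', { query: { match_all: {} } })

puts scroller.map { |hit| hit['_id'] }.join(',')
```

Because `each` is defined, every `Enumerable` method (`map`, `select`, `first`, and so on) works on the scroller, and the `ensure` clause clears the scroll even when iteration ends early.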