logstash-input-elastic_jdbc 0.1.4 → 1.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: eb430d13b3d23a0c91d2a550aa65efc1beef6a4ff7dfdd726192ebb39c8427e1
- data.tar.gz: b443c0f66c569b9b3c2ce3dd9ab5d18cff46c7ec7601bbf0211a18fd8addb541
+ metadata.gz: 956e0978c2ea4e70c4b8410a90fc610113f3c46c12afdec85014ce6ffcd2d1ae
+ data.tar.gz: c191601527d553ba277c4383122ba053c5fb02a836f77416b65b117260a36cab
  SHA512:
- metadata.gz: d0ffd506739f0cf07b6536ca137aaa9b46f54ac74a8a9e86fd048447760371cb0701ab0eaea8729f258ea82ef7165b9446d86fa4dc28d18b3907f7b5865457b8
- data.tar.gz: 9c7f4e71931d00c2fd5d8ca46893290dd4cf2a57d9ddf6b82fc63ac3015814db949d29dbca2971136a1742a1913ed51afd88c49c2397332e3701270bc4d7e9bd
+ metadata.gz: d89379f8a8a87ffde484cd105b1fb9fe3760498375ee3b8bf90e230da185feaeea199a4b942b74e61049035ad8ad076898aecd9f9e2a95ab9a61a90107a2d42e
+ data.tar.gz: ba304b692417cd56fdc4aba032c4a58be96033b44bd2b522aadf531e6156b3b64c5d95f388bb3912b7079283fdf571690b14fcb6fed7571d812d9d5c35b2cc8a
data/README.md CHANGED
@@ -1,33 +1,66 @@
  # Logstash Plugin
- [GitHub](https://github.com/ernesrocker/RubyGems).
+ [GitHub](https://github.com/ernesrocker/logstash-input-elastic_jdbc).
  This is a plugin for [Logstash](https://github.com/elastic/logstash).
 
  It is fully free and fully open source.
 
+ ## How to install
+ ```sh
+ sudo /usr/share/logstash/bin/logstash-plugin install logstash-input-elastic_jdbc.gem
+ ```
+
  ## Documentation
- This plugin inherit of elasticsearch input plugin, and added a tracking_column
+ This plugin inherits from the elasticsearch (**ES**) input plugin and adds a tracking_column,
  as used in the jdbc input plugin, to query for updated values.
  Sample:
- input {
- # Read all documents from Elasticsearch matching the given query
- elastic_jdbc {
+ ```logstash
+ input{
+ elastic_jdbc{
  hosts => "localhost"
+ index => "documents"
  tracking_column => "last_update"
- last_run_metadata_path => "/path_file"
- }
- }
-
- Logstash provides infrastructure to automatically generate documentation for this plugin. We use the asciidoc format to write documentation so any comments in the source code will be first converted into asciidoc and then into html. All plugin documentation are placed under one [central location](http://www.elastic.co/guide/en/logstash/current/).
-
- - For formatting code or config example, you can use the asciidoc `[source,ruby]` directive
- - For more asciidoc formatting tips, see the excellent reference here https://github.com/elastic/docs#asciidoc-guide
-
- ## Need Help?
-
- Need help? Try #logstash on freenode IRC or the https://discuss.elastic.co/c/logstash discussion forum.
-
- ## Developing
-
+ query => '{"query":{"range":{"created":{"gte":"2021-08-13T00:17:58+00:00"}}}}'
+ last_run_metadata_path => "/opt/logstash/last_run/elastic_jdbc_documents"
+ }
+ }
+ filter {
+ }
+ output{
+ stdout{}
+ }
+ ```
+ In the sample above, we read from the **documents** index of the ES cluster, where each document has a `last_update` field of
+ type **date** (we recommend using [Ingest pipelines](https://www.elastic.co/guide/en/elasticsearch/reference/7.x/ingest.html)),
+ and fetch all documents whose **last_update** value is greater than the value stored in `/opt/logstash/last_run/elastic_jdbc_documents`.
+
+ #### Required parameters:
+ * `hosts`: ES cluster URL
+ * `index`: ES index
+ * `tracking_column`: Date field to track in the ES index
+ * `last_run_metadata_path`: Path of the file that stores the tracking value of the last hit read from the ES index. By default it holds the date `1960-01-01`
+
+ #### Optional parameters:
+ * All [logstash-input-elasticsearch](https://rubygems.org/gems/logstash-input-elasticsearch) parameters can be used in this plugin.
+ * `query`: By default we use a bool query that returns the hits whose `tracking_column` is greater than the last value stored in `last_run_metadata_path`.
+ You can supply your own query, but keep in mind that it is always combined with the default query (*if you don't need to search by tracking column,
+ please use the [logstash-input-elasticsearch](https://rubygems.org/gems/logstash-input-elasticsearch) plugin*).
+ For example, with the parameter ``query => '{"query":{"range":{"created":{"gte":"2021-08-13T00:17:58+00:00"}}}}'``,
+ the final query built by this plugin would be:
+
+ ```json
+ {
+ "query":{
+ "bool":{
+ "must":[
+ {"range": {"last_update":{"gt": "date_time_value_stored"}}},
+ {"range":{"created":{"gte": "2021-08-13T00:17:58+00:00"}}}
+ ]
+ }
+ },
+ "sort": [{"last_update": {"order": "asc"}}]
+ }
+ ```
+ **Note:** If you include a `sort` attribute in your query, it is always overwritten with the sort shown above.
+
  ### 1. Plugin Development and Testing
 
  #### Code
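The README's description of how a user-supplied `query` is combined with the tracking-column range can be sketched in plain Ruby. This is only an illustration under stated assumptions: `compose_query` is a hypothetical helper name, it uses the standard `json` library rather than `LogStash::Json`, and the sample `last_value` is made up. It is not the plugin's own code.

```ruby
require "json"

# Hypothetical helper (not part of the plugin): combines an optional user query
# with a range filter on the tracking column, sorted ascending by that column.
def compose_query(user_query_json, tracking_column, last_value)
  user_query = JSON.parse(user_query_json.to_s.empty? ? "{}" : user_query_json)
  # The tracking-column range is always the first "must" clause.
  must = [{ "range" => { tracking_column => { "gt" => last_value } } }]
  # Only the user's "query" clause is reused; any user-supplied "sort" is ignored.
  must << user_query["query"] if user_query["query"]
  {
    "query" => { "bool" => { "must" => must } },
    "sort"  => [{ tracking_column => { "order" => "asc" } }]
  }
end

user_query = '{"query":{"range":{"created":{"gte":"2021-08-13T00:17:58+00:00"}}},"sort":[{"created":"desc"}]}'
final_query = compose_query(user_query, "last_update", "1960-01-01T00:00:00Z")
puts JSON.pretty_generate(final_query)
```

Sorting ascending by the tracking column matters here: it means the last hit processed carries the highest tracking value, which is what gets stored for the next run.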
@@ -1,8 +1,7 @@
  # encoding: utf-8
  require "logstash/inputs/base"
  require "logstash/inputs/elasticsearch"
- require_relative "value_tracking"
- require "json"
+ require_relative "../../logstash/inputs/value_tracking"
  require "logstash/json"
  require "time"
 
@@ -23,6 +22,9 @@ require "time"
  class LogStash::Inputs::ElasticJdbc < LogStash::Inputs::Elasticsearch
  config_name "elastic_jdbc"
 
+ # Override the default query of the elasticsearch plugin. This plugin builds its own default query.
+ config :query, :validate => :string, :default => '{}'
+
  #region tracking configuration
  # Path to file with last run time
  config :last_run_metadata_path, :validate => :string, :default => "#{ENV['HOME']}/.logstash_jdbc_last_run"
@@ -43,16 +45,12 @@ class LogStash::Inputs::ElasticJdbc < LogStash::Inputs::Elasticsearch
 
  public
  def register
- super
- begin
- if @tracking_column.nil?
- raise(LogStash::ConfigurationError, "Must set :tracking_column if :use_column_value is true.")
- end
- set_value_tracker(ValueTracking.build_last_value_tracker(self))
- build_query()
- rescue
- puts "[ERROR:ELASTIC_JDBC]: #{e}"
+ if @tracking_column.nil?
+ raise(LogStash::ConfigurationError, "Must set :tracking_column if :use_column_value is true.")
  end
+ @value_tracker = ValueTracking.build_last_value_tracker(self)
+ super
+ build_query
  end # def register
 
  def set_value_tracker(instance)
@@ -60,25 +58,79 @@ class LogStash::Inputs::ElasticJdbc < LogStash::Inputs::Elasticsearch
  end
 
  def build_query
+ input_query = @base_query
+ # Remove sort tag from base query. We only sort by tracking column
+ input_query.delete("sort")
  time_now = Time.now.utc
  last_value = @value_tracker ? Time.parse(@value_tracker.value.to_s).iso8601 : time_now.iso8601
  column = @tracking_column.to_s
- query = {query: { range: {column => {gt: last_value.to_s}}}, sort: [{column => {order: "desc"}}]}
- @query = query.to_json
- @base_query = LogStash::Json.load(@query)
+ query_default = {query: { bool: { must: [ {range: {column => {gt: last_value.to_s}}} ]}}}
+ if !input_query.nil? and !input_query.empty?
+ query_conditions = input_query["query"]
+ if query_conditions
+ must_statement = query_default[:query][:bool][:must]
+ final_must_cond = must_statement.append(query_conditions)
+ query_default[:query][:bool][:must] = final_must_cond
+ end
+ end
+ sort_condition = [{column => {order: "asc"}}]
+ query_default[:sort] = sort_condition
+ @base_query = LogStash::Json.load(query_default.to_json)
  end
 
  def run(output_queue)
- begin
- super
- @value_tracker.set_value(Time.now.to_s)
- @value_tracker.write
- rescue Exception =>e
- puts "[ERROR:ELASTIC_JDBC]: #{e.message}"
- puts "Full exception: #{e}"
- end
+ super
  end # def run
 
+ def do_run_slice(output_queue, slice_id=nil)
+ slice_query = @base_query
+ slice_query = slice_query.merge('slice' => { 'id' => slice_id, 'max' => @slices}) unless slice_id.nil?
+
+ slice_options = @options.merge(:body => LogStash::Json.dump(slice_query) )
+ logger.info("Slice starting", slice_id: slice_id, slices: @slices) unless slice_id.nil?
+ r = search_request(slice_options)
+
+ r['hits']['hits'].each { |hit| push_hit(hit, output_queue) }
+ logger.debug("Slice progress", slice_id: slice_id, slices: @slices) unless slice_id.nil?
+
+ has_hits = r['hits']['hits'].any?
+
+ while has_hits && r['_scroll_id'] && !stop?
+ r = process_next_scroll(output_queue, r['_scroll_id'])
+ logger.debug("Slice progress", slice_id: slice_id, slices: @slices) unless slice_id.nil?
+ has_hits = r['has_hits']
+ end
+ logger.info("Slice complete", slice_id: slice_id, slices: @slices) unless slice_id.nil?
+ end
+
+ def push_hit(hit, output_queue)
+ event = LogStash::Event.new(hit['_source'])
+
+ if @docinfo
+ # do not assume event[@docinfo_target] to be in-place updatable. first get it, update it, then at the end set it in the event.
+ docinfo_target = event.get(@docinfo_target) || {}
+
+ unless docinfo_target.is_a?(Hash)
+ @logger.error("Elasticsearch Input: Incompatible Event, incompatible type for the docinfo_target=#{@docinfo_target} field in the `_source` document, expected a hash got:", :docinfo_target_type => docinfo_target.class, :event => event)
+
+ # TODO: (colin) I am not sure raising is a good strategy here?
+ raise Exception.new("Elasticsearch input: incompatible event")
+ end
+
+ @docinfo_fields.each do |field|
+ docinfo_target[field] = hit[field]
+ end
+
+ event.set(@docinfo_target, docinfo_target)
+ end
+
+ decorate(event)
+ output_queue << event
+ # Write in the file the last_update value register in the event.
+ @value_tracker.set_value(event.get(@tracking_column))
+ @value_tracker.write
+ end
+
  def stop
  super
  end
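In the new `push_hit`, the tracking value is written to disk after every event instead of once at the end of `run`. A minimal sketch of that idea follows. `FileValueTracker` is a hypothetical stand-in for the plugin's `ValueTracking` class, whose source is not part of this diff, and the exact default timestamp format is an assumption based on the README's `1960-01-01`.

```ruby
require "time"
require "tmpdir"

# Hypothetical stand-in for the plugin's ValueTracking class (source not shown here).
class FileValueTracker
  DEFAULT_VALUE = "1960-01-01T00:00:00Z" # assumed format of the README's default date

  attr_reader :value

  def initialize(path)
    @path = path
    # Resume from the stored value if the file exists, else start from the default.
    @value = File.exist?(path) ? File.read(path).strip : DEFAULT_VALUE
  end

  # Remember the tracking-column value of the most recent hit, normalized to UTC ISO 8601.
  def set_value(raw)
    @value = Time.parse(raw.to_s).utc.iso8601
  end

  # Persist the value so the next run resumes after it.
  def write
    File.write(@path, @value)
  end
end

path = File.join(Dir.mktmpdir, "elastic_jdbc_last_run")
tracker = FileValueTracker.new(path)
first_value = tracker.value

# Hits arrive sorted ascending by the tracking column, so the last one written wins.
["2021-08-13T00:17:58+00:00", "2021-09-01T10:00:00+00:00"].each do |last_update|
  tracker.set_value(last_update)
  tracker.write
end

resumed = FileValueTracker.new(path)
puts "first run starts at #{first_value}, next run resumes after #{resumed.value}"
```

Writing per hit means an interrupted run can resume from the last emitted document, at the cost of one file write per event; this only works because the query sorts ascending by the tracking column.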
@@ -1,10 +1,12 @@
+ # coding: utf-8
  Gem::Specification.new do |s|
  s.name = 'logstash-input-elastic_jdbc'
- s.version = '0.1.4'
+ s.version = '1.0.0'
  s.licenses = ['Apache-2.0']
- s.summary = 'Logstash elastic_jdbc'
- s.description = 'This plugin inherit of elasticsearch input plugin, but added tracking_column like jdbc input plugin.'
- s.homepage = 'https://github.com/ernesrocker/RubyGems'
+ s.summary = 'Reads querys from Elasticsearch cluster and write last run file.'
+ s.description = 'This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname.
+ This gem is not a stand-alone program. Also, this plugin inherit of elasticsearch input plugin, but added tracking_column like jdbc input plugin.'
+ s.homepage = 'https://github.com/ernesrocker/logstash-input-elastic_jdbc'
  s.authors = ['Ernesto Soler Calaña']
  s.email = 'ernes920825@gmail.com'
  s.require_paths = ['lib']
@@ -22,4 +24,5 @@ Gem::Specification.new do |s|
  s.add_runtime_dependency 'logstash-codec-plain'
  s.add_runtime_dependency 'stud', '>= 0.0.22'
  s.add_development_dependency 'logstash-devutils', '~> 0.0', '>= 0.0.16'
+ s.add_development_dependency 'logstash-input-elasticsearch', '>= 4.3.1'
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: logstash-input-elastic_jdbc
  version: !ruby/object:Gem::Version
- version: 0.1.4
+ version: 1.0.0
  platform: ruby
  authors:
  - Ernesto Soler Calaña
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2019-11-06 00:00:00.000000000 Z
+ date: 2021-09-13 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: logstash-core-plugin-api
@@ -72,8 +72,23 @@ dependencies:
  - - ">="
  - !ruby/object:Gem::Version
  version: 0.0.16
- description: This plugin inherit of elasticsearch input plugin, but added tracking_column
- like jdbc input plugin.
+ - !ruby/object:Gem::Dependency
+ name: logstash-input-elasticsearch
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 4.3.1
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 4.3.1
+ description: |-
+ This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname.
+ This gem is not a stand-alone program. Also, this plugin inherit of elasticsearch input plugin, but added tracking_column like jdbc input plugin.
  email: ernes920825@gmail.com
  executables: []
  extensions: []
@@ -89,7 +104,7 @@ files:
  - lib/logstash/inputs/value_tracking.rb
  - logstash-input-elastic-jdbc.gemspec
  - spec/inputs/elastic-jdbc_spec.rb
- homepage: https://github.com/ernesrocker/RubyGems
+ homepage: https://github.com/ernesrocker/logstash-input-elastic_jdbc
  licenses:
  - Apache-2.0
  metadata:
@@ -114,6 +129,6 @@ rubyforge_project:
  rubygems_version: 2.7.6
  signing_key:
  specification_version: 4
- summary: Logstash elastic_jdbc
+ summary: Reads querys from Elasticsearch cluster and write last run file.
  test_files:
  - spec/inputs/elastic-jdbc_spec.rb