fluent-plugin-output-solr 0.4.12 → 0.4.13

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 69c8450973f1699e1a6d085914b2aff6b33336c9
4
- data.tar.gz: 1c45581028a1e2554c9c34859af1353b2078214c
3
+ metadata.gz: e7291cfc870c7f8be862adb23b42f1cf9072f697
4
+ data.tar.gz: 64a33feafa219c2798fd1a3897d9122f24bd3b08
5
5
  SHA512:
6
- metadata.gz: 3b34e96a6bf48d9c402970c54feb4ebeab5dae6a3b1671e8508cb38343d418a46e2451446cb2fd6fbaa71942129def9bae6cc80c0b13ef3c072a315ed8990e46
7
- data.tar.gz: 7f6687e3baba7bd6c7ae7cd97367bf82757a3383cb58c322720fa69c6bf5fd056805dbfb8d3459e2ed31b7d4a284069c495b7c7983e7c00a6000b91a7ba50a78
6
+ metadata.gz: f99cf5bfcb8dde3440bf65183594508923a20e09670e9ab78f122b64f53fa8efad0a82befa3cf95a26bf2e9438cbf8aa1bb3c681e4888e6b42fbbc4bc8844e54
7
+ data.tar.gz: b847b79fa7cfcd2311ad515c4732971dd47065ef745c237431fe0bb1b11324480cc08b4c0648c2528e06d41114c3c03efe7653d7f80d6111fa2fff3ceb7a61f7
data/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # Fluent::Plugin::SolrOutput
2
2
 
3
- This is a [Fluentd](http://fluentd.org/) output plugin for send data to [Apache Solr](http://lucene.apache.org/solr/). It support [SolrCloud](https://cwiki.apache.org/confluence/display/solr/SolrCloud) not only Standalone Solr.
3
+ This is a [Fluentd](http://fluentd.org/) output plugin for send data to [Apache Solr](http://lucene.apache.org/solr/).
4
4
 
5
5
  ## Requirements
6
6
 
@@ -32,12 +32,12 @@ $ rake install
32
32
 
33
33
  ## Config parameters
34
34
 
35
- ### url
35
+ ### base_url
36
36
 
37
- The Solr server url (for example http://localhost:8983/solr/collection1).
37
+ The Solr base url (for example http://localhost:8983/solr).
38
38
 
39
39
  ```
40
- url http://localhost:8983/solr/collection1
40
+ base_url http://localhost:8983/solr
41
41
  ```
42
42
 
43
43
  ### zk_host
@@ -50,20 +50,12 @@ zk_host localhost:2181/solr
50
50
 
51
51
  ### collection
52
52
 
53
- The SolrCloud collection name (default collection1).
53
+ The Solr collection/core name (default collection1).
54
54
 
55
55
  ```
56
56
  collection collection1
57
57
  ```
58
58
 
59
- ### defined_fields
60
-
61
- The defined fields in the Solr schema.xml. If omitted, it will get fields via Solr Schema API.
62
-
63
- ```
64
- defined_fields ["id", "title"]
65
- ```
66
-
67
59
  ### ignore_undefined_fields
68
60
 
69
61
  Ignore undefined fields in the Solr schema.xml.
@@ -72,20 +64,12 @@ Ignore undefined fields in the Solr schema.xml.
72
64
  ignore_undefined_fields false
73
65
  ```
74
66
 
75
- ### unique_key_field
76
-
77
- A field name of unique key in the Solr schema.xml. If omitted, it will get unique key via Solr Schema API.
78
-
79
- ```
80
- unique_key_field id
81
- ```
82
-
83
- ### string_field_value_max_length
67
+ ### tag_field
84
68
 
85
- A string field value max length. If set -1, it means unlimited (default -1). However, there is a limit of Solr.
69
+ A field name of fluentd tag in the Solr schema.xml (default tag).
86
70
 
87
71
  ```
88
- string_field_value_max_length -1
72
+ tag_field tag
89
73
  ```
90
74
 
91
75
  ### time_field
@@ -135,8 +119,11 @@ commit_with_flush true
135
119
  <match something.logs>
136
120
  @type solr
137
121
 
138
- # The Solr server url (for example http://localhost:8983/solr/collection1).
139
- url http://localhost:8983/solr/collection1
122
+ # The Solr base url (for example http://localhost:8983/solr).
123
+ base_url http://localhost:8983/solr
124
+
125
+ # The Solr collection/core name (default collection1).
126
+ collection collection1
140
127
  </match>
141
128
  ```
142
129
 
@@ -148,83 +135,11 @@ commit_with_flush true
148
135
  # The ZooKeeper connection string that SolrCloud refers to (for example localhost:2181/solr).
149
136
  zk_host localhost:2181/solr
150
137
 
151
- # The SolrCloud collection name (default collection1).
138
+ # The Solr collection/core name (default collection1).
152
139
  collection collection1
153
140
  </match>
154
141
  ```
155
142
 
156
- ## Solr setup examples
157
-
158
- ### How to setup Standalone Solr using data-driven schemaless mode.
159
-
160
- 1.Download and install Solr
161
-
162
- ```sh
163
- $ mkdir $HOME/solr
164
- $ cd $HOME/solr
165
- $ wget https://archive.apache.org/dist/lucene/solr/5.4.0/solr-5.4.0.tgz
166
- $ tar zxvf solr-5.4.0.tgz
167
- $ cd solr-5.4.0
168
- ```
169
-
170
- 2.Start standalone Solr
171
-
172
- ```sh
173
- $ ./bin/solr start -p 8983 -s server/solr
174
- ```
175
-
176
- 3.Create core
177
-
178
- ```sh
179
- $ ./bin/solr create -c collection1 -d server/solr/configsets/data_driven_schema_configs -n collection1_configs
180
- ```
181
-
182
- ### How to setup SolrCloud using data-driven schemaless mode (shards=1 and replicationfactor=2).
183
-
184
- 1.Download and install ZooKeeper
185
-
186
- ```sh
187
- $ mkdir $HOME/zookeeper
188
- $ cd $HOME/zookeeper
189
- $ wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
190
- $ tar zxvf zookeeper-3.4.6.tar.gz
191
- $ cd zookeeper-3.4.6
192
- $ cp -p ./conf/zoo_sample.cfg ./conf/zoo.cfg
193
- ```
194
-
195
- 2.Start standalone ZooKeeper
196
-
197
- ```sh
198
- $ ./bin/zkServer.sh start
199
- ```
200
-
201
- 3.Download an install Solr
202
-
203
- ```sh
204
- $ mkdir $HOME/solr
205
- $ cd $HOME/solr
206
- $ wget https://archive.apache.org/dist/lucene/solr/5.4.0/solr-5.4.0.tgz
207
- $ tar zxvf solr-5.4.0.tgz
208
- $ cd solr-5.4.0
209
- $ ./server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd clear /solr
210
- $ ./server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd makepath /solr
211
- $ cp -pr server/solr server/solr1
212
- $ cp -pr server/solr server/solr2
213
- ```
214
-
215
- 4.Start SolrCloud
216
-
217
- ```sh
218
- $ ./bin/solr start -h localhost -p 8983 -z localhost:2181/solr -s server/solr1
219
- $ ./bin/solr start -h localhost -p 8985 -z localhost:2181/solr -s server/solr2
220
- ```
221
-
222
- 5.Create collection
223
-
224
- ```sh
225
- $ ./bin/solr create -c collection1 -d server/solr1/configsets/data_driven_schema_configs -n collection1_configs -shards 1 -replicationFactor 2
226
- ```
227
-
228
143
  ## Development
229
144
 
230
145
  After checking out the repo, run `bundle install` to install dependencies. Then, run `rake test` to run the tests.
data/fluent-plugin-output-solr.gemspec CHANGED
@@ -3,21 +3,21 @@ lib = File.expand_path('../lib', __FILE__)
3
3
  $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
4
 
5
5
  Gem::Specification.new do |spec|
6
- spec.name = "fluent-plugin-output-solr"
7
- spec.version = "0.4.12"
8
- spec.authors = ["Minoru Osuka"]
9
- spec.email = ["minoru.osuka@gmail.com"]
6
+ spec.name = 'fluent-plugin-output-solr'
7
+ spec.version = '0.4.13'
8
+ spec.authors = ['Minoru Osuka']
9
+ spec.email = ['minoru.osuka@gmail.com']
10
10
 
11
- spec.summary = "Fluent output plugin for sending data to Apache Solr."
12
- spec.description = "Fluent output plugin for sending data to Apache Solr. It support SolrCloud not only Standalone Solr."
13
- spec.homepage = "https://github.com/mosuka/fluent-plugin-output-solr"
11
+ spec.summary = 'Fluent output plugin for sending data to Apache Solr.'
12
+ spec.description = 'Fluent output plugin for sending data to Apache Solr.'
13
+ spec.homepage = 'https://github.com/mosuka/fluent-plugin-output-solr'
14
14
 
15
- spec.license = "Apache-2.0"
15
+ spec.license = 'Apache-2.0'
16
16
 
17
17
  spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
18
- spec.bindir = "exe"
18
+ spec.bindir = 'exe'
19
19
  spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
20
- spec.require_paths = ["lib"]
20
+ spec.require_paths = ['lib']
21
21
 
22
22
  spec.add_runtime_dependency 'fluentd', '~> 0.12.0'
23
23
  spec.add_runtime_dependency 'rsolr-cloud', '~> 1.1.0'
data/fluent.conf CHANGED
@@ -15,11 +15,17 @@
15
15
 
16
16
  <match messages>
17
17
  @type solr
18
- # url http://localhost:8983/solr/collection1
19
- zk_host localhost:2181/solr
18
+ base_url http://localhost:8983/solr
19
+ # zk_host localhost:2181/solr
20
20
  collection collection1
21
21
  ignore_undefined_fields false
22
+ tag_field tag
23
+ time_field time
24
+ time_format %FT%TZ
25
+ millisecond true
22
26
  flush_size 100
27
+ commit_with_flush true
28
+
23
29
  buffer_type memory
24
30
  buffer_queue_limit 64m
25
31
  buffer_chunk_limit 8m
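Note on `time_field`, `time_format` and `millisecond`: the sample configuration above now sets these explicitly. The sketch below illustrates how they interact, based on the refactored `write` method in this release. The method name `solr_timestamp` and its signature are hypothetical and exist only for this illustration. It also rescues only `ArgumentError` and `TypeError`, which is an assumption about which errors matter, whereas the plugin rescues `Exception`.

```ruby
require 'time'

# Hypothetical helper, not part of the gem.
# event_time   - the Fluentd event time (epoch seconds)
# record_value - the value of the time field in the record, if any
def solr_timestamp(event_time, record_value = nil, time_format: '%FT%TZ', millisecond: false)
  # Start from the event time, as the refactored write method does.
  t = Time.at(event_time).utc

  unless record_value.nil?
    begin
      # Prefer the time found in the record when it parses with time_format.
      t = Time.strptime(record_value, time_format).utc
    rescue ArgumentError, TypeError
      # Parsing failed: keep the event time as the fallback.
    end
  end

  if millisecond
    # Integer division truncates to whole milliseconds, like '%03d' does in the plugin.
    format('%s.%03dZ', t.strftime('%FT%T'), t.usec / 1000)
  else
    t.strftime('%FT%TZ')
  end
end

puts solr_timestamp(0)                      # 1970-01-01T00:00:00Z
puts solr_timestamp(1.5, millisecond: true) # 1970-01-01T00:00:01.500Z
```

With an unparseable record value, for example `solr_timestamp(0, 'not a time')`, the helper falls back to the event time instead of raising.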
data/lib/fluent/plugin/out_solr.rb CHANGED
@@ -8,12 +8,15 @@ module Fluent
8
8
  Fluent::Plugin.register_output('solr', self)
9
9
 
10
10
  DEFAULT_COLLECTION = 'collection1'
11
- DEFAULT_IGNORE_UNDEFINED_FIELDS = false
12
- DEFAULT_STRING_FIELD_VALUE_MAX_LENGTH = -1
11
+
13
12
  DEFAULT_TAG_FIELD = 'tag'
13
+
14
14
  DEFAULT_TIME_FIELD = 'time'
15
15
  DEFAULT_TIME_FORMAT = '%FT%TZ'
16
16
  DEFAULT_MILLISECOND = false
17
+
18
+ DEFAULT_IGNORE_UNDEFINED_FIELDS = false
19
+
17
20
  DEFAULT_FLUSH_SIZE = 100
18
21
  DEFAULT_COMMIT_WITH_FLUSH = true
19
22
 
@@ -26,25 +29,21 @@ module Fluent
26
29
  include Fluent::SetTimeKeyMixin
27
30
  config_set_default :include_time_key, false
28
31
 
29
- config_param :url, :string, :default => nil,
30
- :desc => 'The Solr server url (for example http://localhost:8983/solr/collection1).'
32
+ config_param :base_url, :string, :default => nil,
33
+ :desc => 'The Solr base url (for example http://localhost:8983/solr).'
31
34
 
32
35
  config_param :zk_host, :string, :default => nil,
33
36
  :desc => 'The ZooKeeper connection string that SolrCloud refers to (for example localhost:2181/solr).'
37
+
34
38
  config_param :collection, :string, :default => DEFAULT_COLLECTION,
35
- :desc => 'The SolrCloud collection name (default collection1).'
39
+ :desc => 'The Solr collection/core name (default collection1).'
36
40
 
37
- config_param :defined_fields, :array, :default => nil,
38
- :desc => 'The defined fields in the Solr schema.xml. If omitted, it will get fields via Solr Schema API.'
39
41
  config_param :ignore_undefined_fields, :bool, :default => DEFAULT_IGNORE_UNDEFINED_FIELDS,
40
42
  :desc => 'Ignore undefined fields in the Solr schema.xml.'
41
- config_param :string_field_value_max_length, :integer, :default => DEFAULT_STRING_FIELD_VALUE_MAX_LENGTH,
42
- :desc => 'Field value max length.'
43
43
 
44
- config_param :unique_key_field, :string, :default => nil,
45
- :desc => 'A field name of unique key in the Solr schema.xml. If omitted, it will get unique key via Solr Schema API.'
46
44
  config_param :tag_field, :string, :default => DEFAULT_TAG_FIELD,
47
45
  :desc => 'A field name of fluentd tag in the Solr schema.xml (default tag).'
46
+
48
47
  config_param :time_field, :string, :default => DEFAULT_TIME_FIELD,
49
48
  :desc => 'A field name of event timestamp in the Solr schema.xml (default time).'
50
49
  config_param :time_format, :string, :default => DEFAULT_TIME_FORMAT,
@@ -70,7 +69,7 @@ module Fluent
70
69
  super
71
70
 
72
71
  @mode = nil
73
- if ! @url.nil? then
72
+ if ! @base_url.nil? then
74
73
  @mode = MODE_STANDALONE
75
74
  elsif ! @zk_host.nil?
76
75
  @mode = MODE_SOLRCLOUD
@@ -80,7 +79,7 @@ module Fluent
80
79
  @zk = nil
81
80
 
82
81
  if @mode == MODE_STANDALONE then
83
- @solr = RSolr.connect :url => @url
82
+ @solr = RSolr.connect :url => @base_url.end_with?('/') ? @base_url + @collection : @base_url + '/' + @collection
84
83
  elsif @mode == MODE_SOLRCLOUD then
85
84
  @zk = ZK.new(@zk_host)
86
85
  cloud_connection = RSolr::Cloud::Connection.new(@zk)
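Note on the standalone URL: in the hunk above, the URL passed to `RSolr.connect` is now assembled from `base_url` and `collection`, tolerating a trailing slash on `base_url`. A minimal sketch of that join in isolation follows. The method name `collection_url` is hypothetical, and the `chomp` variant is only an equivalent way to write the same expression, not what the plugin uses.

```ruby
# Hypothetical helper mirroring the expression used in the plugin.
def collection_url(base_url, collection)
  base_url.end_with?('/') ? base_url + collection : base_url + '/' + collection
end

# Equivalent formulation for inputs with at most one trailing slash.
def collection_url_chomp(base_url, collection)
  "#{base_url.chomp('/')}/#{collection}"
end

puts collection_url('http://localhost:8983/solr', 'collection1')
puts collection_url('http://localhost:8983/solr/', 'collection1')
# Both print http://localhost:8983/solr/collection1
```

A configuration migrated from 0.4.12 therefore moves the last path segment of the old `url` value into `collection`.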
@@ -103,145 +102,114 @@ module Fluent
103
102
  def write(chunk)
104
103
  documents = []
105
104
 
106
- @fields = @defined_fields.nil? ? get_fields : @defined_fields
107
- @unique_key = @unique_key_field.nil? ? get_unique_key : @unique_key_field
105
+ # Get fields from Solr
106
+ fields = get_fields
107
+
108
+ # Get unique key field from Solr
109
+ unique_key = get_unique_key
108
110
 
109
111
  chunk.msgpack_each do |tag, time, record|
110
- unless record.has_key?(@unique_key) then
111
- record.merge!({@unique_key => SecureRandom.uuid})
112
+ # Set unique key and value
113
+ unless record.has_key?(unique_key) then
114
+ record.merge!({unique_key => SecureRandom.uuid})
112
115
  end
113
116
 
117
+ # Set Fluentd tag to Solr tag field
114
118
  unless record.has_key?(@tag_field) then
115
119
  record.merge!({@tag_field => tag})
116
120
  end
117
121
 
122
+ # Set time
123
+ tmp_time = Time.at(time).utc
118
124
  if record.has_key?(@time_field) then
125
+ # Parsing the time field in the record by the specified format.
119
126
  begin
120
127
  tmp_time = Time.strptime(record[@time_field], @time_format).utc
121
- if @millisecond then
122
- record.merge!({@time_field => '%s.%03dZ' % [tmp_time.strftime('%FT%T'), tmp_time.usec / 1000.0]})
123
- else
124
- record.merge!({@time_field => tmp_time.strftime('%FT%TZ')})
125
- end
126
- rescue
127
- tmp_time = Time.at(time).utc
128
- if @millisecond then
129
- record.merge!({@time_field => '%s.%03dZ' % [tmp_time.strftime('%FT%T'), tmp_time.usec / 1000.0]})
130
- else
131
- record.merge!({@time_field => tmp_time.strftime('%FT%TZ')})
132
- end
128
+ rescue Exception => e
129
+ log.warn "An error occurred in parsing the time field: #{e.message}"
133
130
  end
131
+ end
132
+ if @millisecond then
133
+ record.merge!({@time_field => '%s.%03dZ' % [tmp_time.strftime('%FT%T'), tmp_time.usec / 1000.0]})
134
134
  else
135
- tmp_time = Time.at(time).utc
136
- if @millisecond then
137
- record.merge!({@time_field => '%s.%03dZ' % [tmp_time.strftime('%FT%T'), tmp_time.usec / 1000.0]})
138
- else
139
- record.merge!({@time_field => tmp_time.strftime('%FT%TZ')})
140
- end
135
+ record.merge!({@time_field => tmp_time.strftime('%FT%TZ')})
141
136
  end
142
137
 
138
+ # Ignore undefined fields
143
139
  if @ignore_undefined_fields then
144
140
  record.each_key do |key|
145
- unless @fields.include?(key) then
141
+ unless fields.include?(key) then
146
142
  record.delete(key)
147
143
  end
148
144
  end
149
145
  end
150
146
 
151
- if @string_field_value_max_length >= 0 then
152
- record.each_key do |key|
153
- if record[key].instance_of?(Array) then
154
- values = []
155
- record[key].each do |value|
156
- if value.instance_of?(String) then
157
- if value.length > @string_field_value_max_length then
158
- log.warn "#{key} is too long (#{value.length}, max is #{@string_field_value_max_length})."
159
- values.push(value.slice(0, @string_field_value_max_length))
160
- else
161
- values.push(value)
162
- end
163
- end
164
- end
165
- record[key] = values
166
- elsif record[key].instance_of?(String) then
167
- if record[key].length > @string_field_value_max_length then
168
- log.warn "#{key} is too long (#{record[key].length}, max is #{@string_field_value_max_length})."
169
- record[key] = record[key].slice(0, @string_field_value_max_length)
170
- end
171
- end
172
- end
173
- end
174
-
175
- #
176
- # delete reserved fields
177
- # https://cwiki.apache.org/confluence/display/solr/Defining+Fields
178
- #
179
- record.each_key do |key|
180
- if key[0] == '_' and key[-1] == '_' then
181
- record.delete(key)
182
- end
183
- end
184
-
147
+ # Add record to documents
185
148
  documents << record
186
149
 
150
+ # Update when flash size is reached
187
151
  if documents.count >= @flush_size
188
152
  update documents
189
153
  documents.clear
190
154
  end
191
155
  end
192
156
 
157
+ # Update remaining documents
193
158
  update documents unless documents.empty?
194
159
  end
195
160
 
196
161
  def update(documents)
197
- if @mode == MODE_STANDALONE then
198
- @solr.add documents, :params => {:commit => @commit_with_flush}
199
- log.debug "Added #{documents.count} document(s) to Solr"
200
- elsif @mode == MODE_SOLRCLOUD then
201
- @solr.add documents, collection: @collection, :params => {:commit => @commit_with_flush}
202
- log.debug "Added #{documents.count} document(s) to Solr"
162
+ begin
163
+ if @mode == MODE_STANDALONE then
164
+ @solr.add documents, :params => {:commit => @commit_with_flush}
165
+ elsif @mode == MODE_SOLRCLOUD then
166
+ @solr.add documents, collection: @collection, :params => {:commit => @commit_with_flush}
167
+ end
168
+ log.debug "Sent #{documents.count} document(s) to Solr"
169
+ rescue Exception
170
+ log.warn "An error occurred while sending #{documents.count} document(s) to Solr"
203
171
  end
204
- rescue Exception => e
205
- log.warn "An error occurred while indexing"
206
172
  end
207
173
 
208
174
  def get_unique_key
209
- response = nil
210
-
211
- if @mode == MODE_STANDALONE then
212
- response = @solr.get 'schema/uniquekey'
213
- elsif @mode == MODE_SOLRCLOUD then
214
- response = @solr.get 'schema/uniquekey', collection: @collection
175
+ unique_key = 'id'
176
+
177
+ begin
178
+ response = nil
179
+ if @mode == MODE_STANDALONE then
180
+ response = @solr.get 'schema/uniquekey'
181
+ elsif @mode == MODE_SOLRCLOUD then
182
+ response = @solr.get 'schema/uniquekey', collection: @collection
183
+ end
184
+ unique_key = response['uniqueKey']
185
+ log.debug "Unique key: #{unique_key}"
186
+ rescue Exception
187
+ log.warn 'An error occurred while getting unique key'
215
188
  end
216
189
 
217
- unique_key = response['uniqueKey']
218
- log.debug "Unique key: #{unique_key}"
219
-
220
190
  return unique_key
221
-
222
- rescue Exception => e
223
- log.warn "An error occurred while getting unique key"
224
191
  end
225
192
 
226
193
  def get_fields
227
- response = nil
194
+ fields = []
228
195
 
229
- if @mode == MODE_STANDALONE then
230
- response = @solr.get 'schema/fields'
231
- elsif @mode == MODE_SOLRCLOUD then
232
- response = @solr.get 'schema/fields', collection: @collection
233
- end
196
+ begin
197
+ response = nil
234
198
 
235
- fields = []
236
- response['fields'].each do |field|
237
- fields.push(field['name'])
199
+ if @mode == MODE_STANDALONE then
200
+ response = @solr.get 'schema/fields'
201
+ elsif @mode == MODE_SOLRCLOUD then
202
+ response = @solr.get 'schema/fields', collection: @collection
203
+ end
204
+ response['fields'].each do |field|
205
+ fields.push(field['name'])
206
+ end
207
+ log.debug "Fields: #{fields}"
208
+ rescue Exception
209
+ log.warn 'An error occurred while getting fields'
238
210
  end
239
- log.debug "Fields: #{fields}"
240
211
 
241
212
  return fields
242
-
243
- rescue Exception => e
244
- log.warn "An error occurred while getting fields"
245
213
  end
246
214
  end
247
215
  end
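Note on record preparation: after this change, `write` always asks Solr for the field list and the unique key, then fills in the unique key and tag and optionally drops undefined fields. The sketch below shows those steps on a plain hash. The method name `prepare_record` and its keyword arguments are hypothetical, timestamp handling is left out, and it builds a filtered copy with `Hash#select` rather than deleting keys while iterating, which is a substitution made here for clarity and not what the plugin does.

```ruby
require 'securerandom'

# Hypothetical helper, not part of the gem.
# fields stands in for the list that get_fields returns.
def prepare_record(record, tag, fields:, unique_key: 'id', tag_field: 'tag',
                   ignore_undefined_fields: false)
  doc = record.dup

  # Assign a random UUID when the record has no unique key yet.
  doc[unique_key] = SecureRandom.uuid unless doc.key?(unique_key)

  # Store the Fluentd tag unless the record already carries one.
  doc[tag_field] = tag unless doc.key?(tag_field)

  # Keep only fields known to the schema when filtering is enabled.
  doc = doc.select { |key, _value| fields.include?(key) } if ignore_undefined_fields

  doc
end

doc = prepare_record({ 'message' => 'hello', 'extra' => 1 }, 'app.logs',
                     fields: %w[id tag message], ignore_undefined_fields: true)
puts doc.keys.sort.inspect # ["id", "message", "tag"]
```

One consequence that appears to follow from the diff: `get_fields` now returns an empty list when the Schema API call fails, so with `ignore_undefined_fields true` every field of a record would be treated as undefined in that case.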
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: fluent-plugin-output-solr
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.4.12
4
+ version: 0.4.13
5
5
  platform: ruby
6
6
  authors:
7
7
  - Minoru Osuka
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2017-06-21 00:00:00.000000000 Z
11
+ date: 2017-10-25 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: fluentd
@@ -150,8 +150,7 @@ dependencies:
150
150
  - - "~>"
151
151
  - !ruby/object:Gem::Version
152
152
  version: 1.1.8
153
- description: Fluent output plugin for sending data to Apache Solr. It support SolrCloud
154
- not only Standalone Solr.
153
+ description: Fluent output plugin for sending data to Apache Solr.
155
154
  email:
156
155
  - minoru.osuka@gmail.com
157
156
  executables: []