fluent-plugin-output-solr 0.4.12 → 0.4.13

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 69c8450973f1699e1a6d085914b2aff6b33336c9
- data.tar.gz: 1c45581028a1e2554c9c34859af1353b2078214c
+ metadata.gz: e7291cfc870c7f8be862adb23b42f1cf9072f697
+ data.tar.gz: 64a33feafa219c2798fd1a3897d9122f24bd3b08
  SHA512:
- metadata.gz: 3b34e96a6bf48d9c402970c54feb4ebeab5dae6a3b1671e8508cb38343d418a46e2451446cb2fd6fbaa71942129def9bae6cc80c0b13ef3c072a315ed8990e46
- data.tar.gz: 7f6687e3baba7bd6c7ae7cd97367bf82757a3383cb58c322720fa69c6bf5fd056805dbfb8d3459e2ed31b7d4a284069c495b7c7983e7c00a6000b91a7ba50a78
+ metadata.gz: f99cf5bfcb8dde3440bf65183594508923a20e09670e9ab78f122b64f53fa8efad0a82befa3cf95a26bf2e9438cbf8aa1bb3c681e4888e6b42fbbc4bc8844e54
+ data.tar.gz: b847b79fa7cfcd2311ad515c4732971dd47065ef745c237431fe0bb1b11324480cc08b4c0648c2528e06d41114c3c03efe7653d7f80d6111fa2fff3ceb7a61f7
data/README.md CHANGED
@@ -1,6 +1,6 @@
  # Fluent::Plugin::SolrOutput

- This is a [Fluentd](http://fluentd.org/) output plugin for send data to [Apache Solr](http://lucene.apache.org/solr/). It support [SolrCloud](https://cwiki.apache.org/confluence/display/solr/SolrCloud) not only Standalone Solr.
+ This is a [Fluentd](http://fluentd.org/) output plugin for send data to [Apache Solr](http://lucene.apache.org/solr/).

  ## Requirements

@@ -32,12 +32,12 @@ $ rake install

  ## Config parameters

- ### url
+ ### base_url

- The Solr server url (for example http://localhost:8983/solr/collection1).
+ The Solr base url (for example http://localhost:8983/solr).

  ```
- url http://localhost:8983/solr/collection1
+ base_url http://localhost:8983/solr
  ```

  ### zk_host
@@ -50,20 +50,12 @@ zk_host localhost:2181/solr

  ### collection

- The SolrCloud collection name (default collection1).
+ The Solr collection/core name (default collection1).

  ```
  collection collection1
  ```

- ### defined_fields
-
- The defined fields in the Solr schema.xml. If omitted, it will get fields via Solr Schema API.
-
- ```
- defined_fields ["id", "title"]
- ```
-
  ### ignore_undefined_fields

  Ignore undefined fields in the Solr schema.xml.
@@ -72,20 +64,12 @@ Ignore undefined fields in the Solr schema.xml.
  ignore_undefined_fields false
  ```

- ### unique_key_field
-
- A field name of unique key in the Solr schema.xml. If omitted, it will get unique key via Solr Schema API.
-
- ```
- unique_key_field id
- ```
-
- ### string_field_value_max_length
+ ### tag_field

- A string field value max length. If set -1, it means unlimited (default -1). However, there is a limit of Solr.
+ A field name of fluentd tag in the Solr schema.xml (default tag).

  ```
- string_field_value_max_length -1
+ tag_field tag
  ```

  ### time_field
@@ -135,8 +119,11 @@ commit_with_flush true
  <match something.logs>
  @type solr

- # The Solr server url (for example http://localhost:8983/solr/collection1).
- url http://localhost:8983/solr/collection1
+ # The Solr base url (for example http://localhost:8983/solr).
+ base_url http://localhost:8983/solr
+
+ # The Solr collection/core name (default collection1).
+ collection collection1
  </match>
  ```

@@ -148,83 +135,11 @@ commit_with_flush true
  # The ZooKeeper connection string that SolrCloud refers to (for example localhost:2181/solr).
  zk_host localhost:2181/solr

- # The SolrCloud collection name (default collection1).
+ # The Solr collection/core name (default collection1).
  collection collection1
  </match>
  ```

- ## Solr setup examples
-
- ### How to setup Standalone Solr using data-driven schemaless mode.
-
- 1.Download and install Solr
-
- ```sh
- $ mkdir $HOME/solr
- $ cd $HOME/solr
- $ wget https://archive.apache.org/dist/lucene/solr/5.4.0/solr-5.4.0.tgz
- $ tar zxvf solr-5.4.0.tgz
- $ cd solr-5.4.0
- ```
-
- 2.Start standalone Solr
-
- ```sh
- $ ./bin/solr start -p 8983 -s server/solr
- ```
-
- 3.Create core
-
- ```sh
- $ ./bin/solr create -c collection1 -d server/solr/configsets/data_driven_schema_configs -n collection1_configs
- ```
-
- ### How to setup SolrCloud using data-driven schemaless mode (shards=1 and replicationfactor=2).
-
- 1.Download and install ZooKeeper
-
- ```sh
- $ mkdir $HOME/zookeeper
- $ cd $HOME/zookeeper
- $ wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
- $ tar zxvf zookeeper-3.4.6.tar.gz
- $ cd zookeeper-3.4.6
- $ cp -p ./conf/zoo_sample.cfg ./conf/zoo.cfg
- ```
-
- 2.Start standalone ZooKeeper
-
- ```sh
- $ ./bin/zkServer.sh start
- ```
-
- 3.Download an install Solr
-
- ```sh
- $ mkdir $HOME/solr
- $ cd $HOME/solr
- $ wget https://archive.apache.org/dist/lucene/solr/5.4.0/solr-5.4.0.tgz
- $ tar zxvf solr-5.4.0.tgz
- $ cd solr-5.4.0
- $ ./server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd clear /solr
- $ ./server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd makepath /solr
- $ cp -pr server/solr server/solr1
- $ cp -pr server/solr server/solr2
- ```
-
- 4.Start SolrCloud
-
- ```sh
- $ ./bin/solr start -h localhost -p 8983 -z localhost:2181/solr -s server/solr1
- $ ./bin/solr start -h localhost -p 8985 -z localhost:2181/solr -s server/solr2
- ```
-
- 5.Create collection
-
- ```sh
- $ ./bin/solr create -c collection1 -d server/solr1/configsets/data_driven_schema_configs -n collection1_configs -shards 1 -replicationFactor 2
- ```
-
  ## Development

  After checking out the repo, run `bundle install` to install dependencies. Then, run `rake test` to run the tests.
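
Note on the `url` to `base_url` change in the README hunks above: a 0.4.12 configuration carries the collection as the last path segment of `url`, while 0.4.13 expects it in `collection`. The following is a minimal Ruby sketch of splitting an old value, assuming the last path segment is the collection/core name. `split_legacy_url` is a hypothetical helper for illustration and is not part of the gem.

```ruby
require 'uri'

# Hypothetical migration helper (not part of fluent-plugin-output-solr):
# split a 0.4.12-style `url` into the 0.4.13 `base_url` and `collection`.
# Assumption: the last path segment is the collection/core name.
def split_legacy_url(url)
  uri = URI.parse(url)
  segments = uri.path.split('/').reject(&:empty?)
  raise ArgumentError, "no collection segment in #{url}" if segments.empty?

  collection = segments.pop
  uri.path = '/' + segments.join('/')
  [uri.to_s, collection]
end

base_url, collection = split_legacy_url('http://localhost:8983/solr/collection1')
puts "base_url #{base_url}"
puts "collection #{collection}"
```

The two printed lines can be pasted into the `<match>` block in place of the old `url` line.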
@@ -3,21 +3,21 @@ lib = File.expand_path('../lib', __FILE__)
  $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)

  Gem::Specification.new do |spec|
- spec.name = "fluent-plugin-output-solr"
- spec.version = "0.4.12"
- spec.authors = ["Minoru Osuka"]
- spec.email = ["minoru.osuka@gmail.com"]
+ spec.name = 'fluent-plugin-output-solr'
+ spec.version = '0.4.13'
+ spec.authors = ['Minoru Osuka']
+ spec.email = ['minoru.osuka@gmail.com']

- spec.summary = "Fluent output plugin for sending data to Apache Solr."
- spec.description = "Fluent output plugin for sending data to Apache Solr. It support SolrCloud not only Standalone Solr."
- spec.homepage = "https://github.com/mosuka/fluent-plugin-output-solr"
+ spec.summary = 'Fluent output plugin for sending data to Apache Solr.'
+ spec.description = 'Fluent output plugin for sending data to Apache Solr.'
+ spec.homepage = 'https://github.com/mosuka/fluent-plugin-output-solr'

- spec.license = "Apache-2.0"
+ spec.license = 'Apache-2.0'

  spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
- spec.bindir = "exe"
+ spec.bindir = 'exe'
  spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
- spec.require_paths = ["lib"]
+ spec.require_paths = ['lib']

  spec.add_runtime_dependency 'fluentd', '~> 0.12.0'
  spec.add_runtime_dependency 'rsolr-cloud', '~> 1.1.0'
data/fluent.conf CHANGED
@@ -15,11 +15,17 @@

  <match messages>
  @type solr
- # url http://localhost:8983/solr/collection1
- zk_host localhost:2181/solr
+ base_url http://localhost:8983/solr
+ # zk_host localhost:2181/solr
  collection collection1
  ignore_undefined_fields false
+ tag_field tag
+ time_field time
+ time_format %FT%TZ
+ millisecond true
  flush_size 100
+ commit_with_flush true
+
  buffer_type memory
  buffer_queue_limit 64m
  buffer_chunk_limit 8m
@@ -8,12 +8,15 @@ module Fluent
  Fluent::Plugin.register_output('solr', self)

  DEFAULT_COLLECTION = 'collection1'
- DEFAULT_IGNORE_UNDEFINED_FIELDS = false
- DEFAULT_STRING_FIELD_VALUE_MAX_LENGTH = -1
+
  DEFAULT_TAG_FIELD = 'tag'
+
  DEFAULT_TIME_FIELD = 'time'
  DEFAULT_TIME_FORMAT = '%FT%TZ'
  DEFAULT_MILLISECOND = false
+
+ DEFAULT_IGNORE_UNDEFINED_FIELDS = false
+
  DEFAULT_FLUSH_SIZE = 100
  DEFAULT_COMMIT_WITH_FLUSH = true

@@ -26,25 +29,21 @@ module Fluent
  include Fluent::SetTimeKeyMixin
  config_set_default :include_time_key, false

- config_param :url, :string, :default => nil,
- :desc => 'The Solr server url (for example http://localhost:8983/solr/collection1).'
+ config_param :base_url, :string, :default => nil,
+ :desc => 'The Solr base url (for example http://localhost:8983/solr).'

  config_param :zk_host, :string, :default => nil,
  :desc => 'The ZooKeeper connection string that SolrCloud refers to (for example localhost:2181/solr).'
+
  config_param :collection, :string, :default => DEFAULT_COLLECTION,
- :desc => 'The SolrCloud collection name (default collection1).'
+ :desc => 'The Solr collection/core name (default collection1).'

- config_param :defined_fields, :array, :default => nil,
- :desc => 'The defined fields in the Solr schema.xml. If omitted, it will get fields via Solr Schema API.'
  config_param :ignore_undefined_fields, :bool, :default => DEFAULT_IGNORE_UNDEFINED_FIELDS,
  :desc => 'Ignore undefined fields in the Solr schema.xml.'
- config_param :string_field_value_max_length, :integer, :default => DEFAULT_STRING_FIELD_VALUE_MAX_LENGTH,
- :desc => 'Field value max length.'

- config_param :unique_key_field, :string, :default => nil,
- :desc => 'A field name of unique key in the Solr schema.xml. If omitted, it will get unique key via Solr Schema API.'
  config_param :tag_field, :string, :default => DEFAULT_TAG_FIELD,
  :desc => 'A field name of fluentd tag in the Solr schema.xml (default tag).'
+
  config_param :time_field, :string, :default => DEFAULT_TIME_FIELD,
  :desc => 'A field name of event timestamp in the Solr schema.xml (default time).'
  config_param :time_format, :string, :default => DEFAULT_TIME_FORMAT,
@@ -70,7 +69,7 @@ module Fluent
  super

  @mode = nil
- if ! @url.nil? then
+ if ! @base_url.nil? then
  @mode = MODE_STANDALONE
  elsif ! @zk_host.nil?
  @mode = MODE_SOLRCLOUD
@@ -80,7 +79,7 @@ module Fluent
  @zk = nil

  if @mode == MODE_STANDALONE then
- @solr = RSolr.connect :url => @url
+ @solr = RSolr.connect :url => @base_url.end_with?('/') ? @base_url + @collection : @base_url + '/' + @collection
  elsif @mode == MODE_SOLRCLOUD then
  @zk = ZK.new(@zk_host)
  cloud_connection = RSolr::Cloud::Connection.new(@zk)
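
Note on the hunk above: in standalone mode the connection URL is now built from `base_url` and `collection`, with a trailing slash on `base_url` tolerated. The Ruby sketch below isolates that expression so the two cases can be checked by hand. `solr_core_url` is a hypothetical name used only for illustration; the plugin inlines the expression in its `RSolr.connect` call.

```ruby
# Hypothetical helper mirroring the expression added in this hunk:
# join base_url and collection with exactly one '/' between them.
def solr_core_url(base_url, collection)
  base_url.end_with?('/') ? base_url + collection : base_url + '/' + collection
end

# Both spellings of the base URL yield the same core URL.
puts solr_core_url('http://localhost:8983/solr', 'collection1')
puts solr_core_url('http://localhost:8983/solr/', 'collection1')
```

Since `collection` defaults to `collection1`, a configuration that sets only `base_url` resolves to the `collection1` core under this logic.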
@@ -103,145 +102,114 @@ module Fluent
  def write(chunk)
  documents = []

- @fields = @defined_fields.nil? ? get_fields : @defined_fields
- @unique_key = @unique_key_field.nil? ? get_unique_key : @unique_key_field
+ # Get fields from Solr
+ fields = get_fields
+
+ # Get unique key field from Solr
+ unique_key = get_unique_key

  chunk.msgpack_each do |tag, time, record|
- unless record.has_key?(@unique_key) then
- record.merge!({@unique_key => SecureRandom.uuid})
+ # Set unique key and value
+ unless record.has_key?(unique_key) then
+ record.merge!({unique_key => SecureRandom.uuid})
  end

+ # Set Fluentd tag to Solr tag field
  unless record.has_key?(@tag_field) then
  record.merge!({@tag_field => tag})
  end

+ # Set time
+ tmp_time = Time.at(time).utc
  if record.has_key?(@time_field) then
+ # Parsing the time field in the record by the specified format.
  begin
  tmp_time = Time.strptime(record[@time_field], @time_format).utc
- if @millisecond then
- record.merge!({@time_field => '%s.%03dZ' % [tmp_time.strftime('%FT%T'), tmp_time.usec / 1000.0]})
- else
- record.merge!({@time_field => tmp_time.strftime('%FT%TZ')})
- end
- rescue
- tmp_time = Time.at(time).utc
- if @millisecond then
- record.merge!({@time_field => '%s.%03dZ' % [tmp_time.strftime('%FT%T'), tmp_time.usec / 1000.0]})
- else
- record.merge!({@time_field => tmp_time.strftime('%FT%TZ')})
- end
+ rescue Exception => e
+ log.warn "An error occurred in parsing the time field: #{e.message}"
  end
+ end
+ if @millisecond then
+ record.merge!({@time_field => '%s.%03dZ' % [tmp_time.strftime('%FT%T'), tmp_time.usec / 1000.0]})
  else
- tmp_time = Time.at(time).utc
- if @millisecond then
- record.merge!({@time_field => '%s.%03dZ' % [tmp_time.strftime('%FT%T'), tmp_time.usec / 1000.0]})
- else
- record.merge!({@time_field => tmp_time.strftime('%FT%TZ')})
- end
+ record.merge!({@time_field => tmp_time.strftime('%FT%TZ')})
  end

+ # Ignore undefined fields
  if @ignore_undefined_fields then
  record.each_key do |key|
- unless @fields.include?(key) then
+ unless fields.include?(key) then
  record.delete(key)
  end
  end
  end

- if @string_field_value_max_length >= 0 then
- record.each_key do |key|
- if record[key].instance_of?(Array) then
- values = []
- record[key].each do |value|
- if value.instance_of?(String) then
- if value.length > @string_field_value_max_length then
- log.warn "#{key} is too long (#{value.length}, max is #{@string_field_value_max_length})."
- values.push(value.slice(0, @string_field_value_max_length))
- else
- values.push(value)
- end
- end
- end
- record[key] = values
- elsif record[key].instance_of?(String) then
- if record[key].length > @string_field_value_max_length then
- log.warn "#{key} is too long (#{record[key].length}, max is #{@string_field_value_max_length})."
- record[key] = record[key].slice(0, @string_field_value_max_length)
- end
- end
- end
- end
-
- #
- # delete reserved fields
- # https://cwiki.apache.org/confluence/display/solr/Defining+Fields
- #
- record.each_key do |key|
- if key[0] == '_' and key[-1] == '_' then
- record.delete(key)
- end
- end
-
+ # Add record to documents
  documents << record

+ # Update when flash size is reached
  if documents.count >= @flush_size
  update documents
  documents.clear
  end
  end

+ # Update remaining documents
  update documents unless documents.empty?
  end

  def update(documents)
- if @mode == MODE_STANDALONE then
- @solr.add documents, :params => {:commit => @commit_with_flush}
- log.debug "Added #{documents.count} document(s) to Solr"
- elsif @mode == MODE_SOLRCLOUD then
- @solr.add documents, collection: @collection, :params => {:commit => @commit_with_flush}
- log.debug "Added #{documents.count} document(s) to Solr"
+ begin
+ if @mode == MODE_STANDALONE then
+ @solr.add documents, :params => {:commit => @commit_with_flush}
+ elsif @mode == MODE_SOLRCLOUD then
+ @solr.add documents, collection: @collection, :params => {:commit => @commit_with_flush}
+ end
+ log.debug "Sent #{documents.count} document(s) to Solr"
+ rescue Exception
+ log.warn "An error occurred while sending #{documents.count} document(s) to Solr"
  end
- rescue Exception => e
- log.warn "An error occurred while indexing"
  end

  def get_unique_key
- response = nil
-
- if @mode == MODE_STANDALONE then
- response = @solr.get 'schema/uniquekey'
- elsif @mode == MODE_SOLRCLOUD then
- response = @solr.get 'schema/uniquekey', collection: @collection
+ unique_key = 'id'
+
+ begin
+ response = nil
+ if @mode == MODE_STANDALONE then
+ response = @solr.get 'schema/uniquekey'
+ elsif @mode == MODE_SOLRCLOUD then
+ response = @solr.get 'schema/uniquekey', collection: @collection
+ end
+ unique_key = response['uniqueKey']
+ log.debug "Unique key: #{unique_key}"
+ rescue Exception
+ log.warn 'An error occurred while getting unique key'
  end

- unique_key = response['uniqueKey']
- log.debug "Unique key: #{unique_key}"
-
  return unique_key
-
- rescue Exception => e
- log.warn "An error occurred while getting unique key"
  end

  def get_fields
- response = nil
+ fields = []

- if @mode == MODE_STANDALONE then
- response = @solr.get 'schema/fields'
- elsif @mode == MODE_SOLRCLOUD then
- response = @solr.get 'schema/fields', collection: @collection
- end
+ begin
+ response = nil

- fields = []
- response['fields'].each do |field|
- fields.push(field['name'])
+ if @mode == MODE_STANDALONE then
+ response = @solr.get 'schema/fields'
+ elsif @mode == MODE_SOLRCLOUD then
+ response = @solr.get 'schema/fields', collection: @collection
+ end
+ response['fields'].each do |field|
+ fields.push(field['name'])
+ end
+ log.debug "Fields: #{fields}"
+ rescue Exception
+ log.warn 'An error occurred while getting fields'
  end
- log.debug "Fields: #{fields}"

  return fields
-
- rescue Exception => e
- log.warn "An error occurred while getting fields"
  end
  end
  end
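
Note on the `write` refactor above: the three copies of the time formatting collapse into one path that starts from the Fluentd event time, overrides it when the record's own time field parses, and formats once. The following is a stand-alone Ruby sketch of that flow under stated assumptions: `solr_time` is a hypothetical function, it rescues only `ArgumentError` and `TypeError` rather than the diff's `rescue Exception`, and it uses integer division for the millisecond part.

```ruby
require 'time'

# Hypothetical stand-alone version of the time handling in `write`:
# start from the event time, prefer the record's time value when it
# parses with the given format, then format the result once.
def solr_time(event_time, raw, time_format = '%FT%TZ', millisecond = false)
  t = Time.at(event_time).utc
  unless raw.nil?
    begin
      t = Time.strptime(raw, time_format).utc
    rescue ArgumentError, TypeError
      # Unparseable value: keep the event time.
      # (The plugin logs a warning at this point.)
    end
  end

  if millisecond
    format('%s.%03dZ', t.strftime('%FT%T'), t.usec / 1000)
  else
    t.strftime('%FT%TZ')
  end
end

puts solr_time(0, nil)                   # no time field in the record
puts solr_time(0, 'not a time')          # parse failure falls back to event time
puts solr_time(0, nil, '%FT%TZ', true)   # millisecond output
```

One visible consequence of the new flow is that a parse failure no longer needs its own formatting branch, since the fallback value is assigned before parsing is attempted.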
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: fluent-plugin-output-solr
  version: !ruby/object:Gem::Version
- version: 0.4.12
+ version: 0.4.13
  platform: ruby
  authors:
  - Minoru Osuka
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2017-06-21 00:00:00.000000000 Z
+ date: 2017-10-25 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: fluentd
@@ -150,8 +150,7 @@ dependencies:
  - - "~>"
  - !ruby/object:Gem::Version
  version: 1.1.8
- description: Fluent output plugin for sending data to Apache Solr. It support SolrCloud
- not only Standalone Solr.
+ description: Fluent output plugin for sending data to Apache Solr.
  email:
  - minoru.osuka@gmail.com
  executables: []