logstash-input-azureblob-json-head-tail 0.9.9

@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+   metadata.gz: 36d2a3fcea0fdf5f3af0b3175acab68a94901b7b
+   data.tar.gz: 85b71255879d3b0b15c6277b4aa9530406982255
+ SHA512:
+   metadata.gz: c7e8986e035347e2a482877ae18683fbd1001cc7c0b7bdbf67077090a718d81be5da4a05fa1ba6f388874bc28bb9183927b22faaab8840db6bc69e29c282c607
+   data.tar.gz: 6b99f505724ed327e11eb38c5b971d970691ff9903668482936ee01ab3a89920174218e2b0432e09d56fb9dac0e4f2e79ce466488252ee47fab1586d437bafe7
data/CHANGELOG.md ADDED
@@ -0,0 +1,7 @@
+ ## 2016.08.17
+ * Added a new configuration parameter for a custom endpoint.
+
+ ## 2016.05.05
+ * Made the plugin respect the Logstash shutdown signal.
+ * Updated the *logstash-core* runtime dependency requirement to '~> 2.0'.
+ * Updated the *logstash-devutils* development dependency requirement to '>= 0.0.16'.
data/Gemfile ADDED
@@ -0,0 +1,2 @@
+ source 'https://rubygems.org'
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,17 @@
+
+ Copyright (c) Microsoft. All rights reserved.
+ Microsoft would like to thank its contributors, a list
+ of whom are at http://aka.ms/entlib-contributors
+
+ Licensed under the Apache License, Version 2.0 (the "License"); you
+ may not use this file except in compliance with the License. You may
+ obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied. See the License for the specific language governing permissions
+ and limitations under the License.
+
data/README.md ADDED
@@ -0,0 +1,243 @@
+ # Logstash input plugin for Azure Storage Blobs
+
+ ## Summary
+ This plugin reads and parses data from Azure Storage Blobs.
+
+ ## Installation
+ You can install this plugin using the Logstash "plugin" or "logstash-plugin" (for newer versions of Logstash) command:
+ ```sh
+ logstash-plugin install logstash-input-azureblob
+ ```
+ For more information, see the Logstash reference [Working with plugins](https://www.elastic.co/guide/en/logstash/current/working-with-plugins.html).
+
+ ## Configuration
+ ### Required Parameters
+ __*storage_account_name*__
+
+ The storage account name.
+
+ __*storage_access_key*__
+
+ The access key to the storage account.
+
+ __*container*__
+
+ The blob container name.
+
+ ### Optional Parameters
+ __*endpoint*__
+
+ Specifies the endpoint of Azure Service Management. The default value is `core.windows.net`.
+
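+ For example, to target a sovereign or national cloud, point the plugin at that cloud's blob endpoint suffix (the value below is illustrative; confirm the correct suffix for your cloud):
+
+ ```
+ input {
+     azureblob {
+         storage_account_name => "mystorageaccount"
+         storage_access_key => "VGhpcyBpcyBhIGZha2Uga2V5Lg=="
+         container => "mycontainer"
+         endpoint => "core.chinacloudapi.cn"
+     }
+ }
+ ```
+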
+ __*registry_path*__
+
+ Specifies the file path for the registry file, which records offsets and coordinates multiple clients. The default value is `data/registry`.
+
+ Override this value when a file already exists at the `data/registry` path in the Azure blob container.
+
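+ The registry is a JSON file the plugin writes into the container. Based on the plugin source, an entry looks roughly like this (blob name and values are hypothetical):
+
+ ```json
+ {
+     "wad-iis/W3SVC1/u_ex170823.log": {
+         "file_path": "wad-iis/W3SVC1/u_ex170823.log",
+         "etag": "0x8D4E9A2B3C4D5E6",
+         "reader": null,
+         "offset": 12345,
+         "gen": 0
+     }
+ }
+ ```
+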
+ __*interval*__
+
+ Sets how many seconds to idle before checking for new logs. The default is `30`.
+
+ __*registry_create_policy*__
+
+ Specifies how offsets are initially set for existing blob files.
+
+ This option only applies to registry creation.
+
+ Valid values include:
+
+ - resume
+ - start_over
+
+ The default, `resume`, means that when the registry is initially created, the plugin assumes all blobs have been consumed and will only pick up new content appended to them.
+
+ When set to `start_over`, it assumes none of the blobs have been consumed and reads all blob files from the beginning.
+
+ Offsets will be picked up from the registry file whenever it exists.
+
+ __*file_head_bytes*__
+
+ Specifies the number of bytes at the head of the file that do not repeat over records. Usually, these are JSON opening tags. The default value is `0`.
+
+ __*file_tail_bytes*__
+
+ Specifies the number of bytes at the tail of the file that do not repeat over records. Usually, these are JSON closing tags. The default value is `0`.
+
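+ As an illustration, consider a blob that is one growing JSON document (a hypothetical layout; the byte counts to configure depend on the actual file, as in the NSG example below):
+
+ ```json
+ {"records":[
+ {"time":"..."},
+ {"time":"..."}
+ ]}
+ ```
+
+ Here the opening `{"records":[` falls under `file_head_bytes` and the closing `]}` under `file_tail_bytes`, so the plugin can stitch the header onto each newly read chunk and still decode valid JSON.
+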
+ __*record_preprocess_reg_exp*__
+
+ Specifies a regular expression applied to the content before pushing the event. Whatever matches is removed. For example, `^\s*,` removes a leading `,` from the content. The regular expression uses multiline mode.
+
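+ A minimal sketch of what this substitution does (plain Ruby, mirroring the plugin's use of `Regexp::MULTILINE`; the sample content is hypothetical):
+
+ ```ruby
+ # The plugin compiles the parameter into a multiline regexp and strips the match.
+ reg_exp = Regexp.new('^\s*,', Regexp::MULTILINE)
+ content = "  ,{\"time\":\"2017-08-23T00:00:00Z\"}"
+ processed = content.sub(reg_exp, '')
+ puts processed  # => {"time":"2017-08-23T00:00:00Z"}
+ ```
+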
+ __*blob_list_page_size*__
+
+ Specifies the page size for returned blob items. Too large a value can exhaust the heap; too small a value leads to too many requests. The default, `100`, works well with a heap size of 1 GB.
+
+ ### Examples
+
+ * Bare-bone settings:
+
+ ```
+ input
+ {
+     azureblob
+     {
+         storage_account_name => "mystorageaccount"
+         storage_access_key => "VGhpcyBpcyBhIGZha2Uga2V5Lg=="
+         container => "mycontainer"
+     }
+ }
+ ```
+
+ * Example for Wad-IIS
+
+ ```
+ input {
+     azureblob
+     {
+         storage_account_name => 'mystorageaccount'
+         storage_access_key => 'VGhpcyBpcyBhIGZha2Uga2V5Lg=='
+         container => 'wad-iis-logfiles'
+         codec => line
+     }
+ }
+ filter {
+     ## Ignore the comments that IIS will add to the start of the W3C logs
+     #
+     if [message] =~ "^#" {
+         drop {}
+     }
+
+     grok {
+         # https://grokdebug.herokuapp.com/
+         match => ["message", "%{TIMESTAMP_ISO8601:log_timestamp} %{WORD:sitename} %{WORD:computername} %{IP:server_ip} %{WORD:method} %{URIPATH:uriStem} %{NOTSPACE:uriQuery} %{NUMBER:port} %{NOTSPACE:username} %{IPORHOST:clientIP} %{NOTSPACE:protocolVersion} %{NOTSPACE:userAgent} %{NOTSPACE:cookie} %{NOTSPACE:referer} %{NOTSPACE:requestHost} %{NUMBER:response} %{NUMBER:subresponse} %{NUMBER:win32response} %{NUMBER:bytesSent} %{NUMBER:bytesReceived} %{NUMBER:timetaken}"]
+     }
+
+     ## Set the event timestamp from the log
+     #
+     date {
+         match => [ "log_timestamp", "YYYY-MM-dd HH:mm:ss" ]
+         timezone => "Etc/UTC"
+     }
+
+     ## If the log record has a value for 'bytesSent', then add a new field
+     # to the event that converts it to kilobytes
+     #
+     if [bytesSent] {
+         ruby {
+             code => "event['kilobytesSent'] = event['bytesSent'].to_i / 1024.0"
+         }
+     }
+
+     ## Do the same conversion for the bytes received value
+     #
+     if [bytesReceived] {
+         ruby {
+             code => "event['kilobytesReceived'] = event['bytesReceived'].to_i / 1024.0"
+         }
+     }
+
+     ## Perform some mutations on the records to prep them for Elastic
+     #
+     mutate {
+         ## Convert some fields from strings to integers
+         #
+         convert => ["bytesSent", "integer"]
+         convert => ["bytesReceived", "integer"]
+         convert => ["timetaken", "integer"]
+
+         ## Create a new field for the reverse DNS lookup below
+         #
+         add_field => { "clientHostname" => "%{clientIP}" }
+
+         ## Finally, remove the original log_timestamp field since the event will
+         # have the proper date on it
+         #
+         remove_field => [ "log_timestamp" ]
+     }
+
+     ## Do a reverse lookup on the client IP to get their hostname
+     #
+     dns {
+         ## Now that we've copied the clientIP into a new field we can
+         # simply replace it here using a reverse lookup
+         #
+         action => "replace"
+         reverse => ["clientHostname"]
+     }
+
+     ## Parse out the user agent (captured by the grok pattern above as 'userAgent')
+     #
+     useragent {
+         source => "userAgent"
+         prefix => "browser"
+     }
+ }
+ output {
+     file {
+         path => '/var/tmp/logstash-file-output'
+         codec => rubydebug
+     }
+     stdout {
+         codec => rubydebug
+     }
+ }
+ ```
+
+ * NSG Logs
+
+ ```
+ input {
+     azureblob
+     {
+         storage_account_name => "mystorageaccount"
+         storage_access_key => "VGhpcyBpcyBhIGZha2Uga2V5Lg=="
+         container => "insights-logs-networksecuritygroupflowevent"
+         codec => "json"
+         file_head_bytes => 21
+         file_tail_bytes => 9
+         record_preprocess_reg_exp => "^\s*,"
+     }
+ }
+
+ filter {
+     split { field => "[records]" }
+     split { field => "[records][properties][flows]" }
+     split { field => "[records][properties][flows][flows]" }
+     split { field => "[records][properties][flows][flows][flowTuples]" }
+
+     mutate {
+         split => { "[records][resourceId]" => "/" }
+         add_field => {
+             "Subscription" => "%{[records][resourceId][2]}"
+             "ResourceGroup" => "%{[records][resourceId][4]}"
+             "NetworkSecurityGroup" => "%{[records][resourceId][8]}"
+         }
+         convert => { "Subscription" => "string" }
+         convert => { "ResourceGroup" => "string" }
+         convert => { "NetworkSecurityGroup" => "string" }
+         split => { "[records][properties][flows][flows][flowTuples]" => "," }
+         add_field => {
+             "unixtimestamp" => "%{[records][properties][flows][flows][flowTuples][0]}"
+             "srcIp" => "%{[records][properties][flows][flows][flowTuples][1]}"
+             "destIp" => "%{[records][properties][flows][flows][flowTuples][2]}"
+             "srcPort" => "%{[records][properties][flows][flows][flowTuples][3]}"
+             "destPort" => "%{[records][properties][flows][flows][flowTuples][4]}"
+             "protocol" => "%{[records][properties][flows][flows][flowTuples][5]}"
+             "trafficflow" => "%{[records][properties][flows][flows][flowTuples][6]}"
+             "traffic" => "%{[records][properties][flows][flows][flowTuples][7]}"
+         }
+         convert => { "unixtimestamp" => "integer" }
+         convert => { "srcPort" => "integer" }
+         convert => { "destPort" => "integer" }
+     }
+
+     date {
+         match => ["unixtimestamp", "UNIX"]
+     }
+ }
+
+ output {
+     stdout { codec => rubydebug }
+ }
+ ```
+
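+ For reference, each `flowTuples` entry that the filter splits on `,` is a comma-separated string along these lines (hypothetical values, in the field order the filter assumes):
+
+ ```
+ 1503014400,10.0.0.4,52.160.92.112,44391,443,T,O,A
+ ```
+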
+ ## More information
+ The source code of this plugin is hosted in the GitHub repository [Microsoft Azure Diagnostics with ELK](https://github.com/Azure/azure-diagnostics-tools). We welcome your feedback and contributions to the project.
data/lib/logstash/inputs/azureblob.rb ADDED
@@ -0,0 +1,383 @@
+ # encoding: utf-8
+ require "logstash/inputs/base"
+ require "logstash/namespace"
+ require "set" # for Set, used when listing blobs
+ require "stud/interval" # for Stud.stoppable_sleep (stud is a declared runtime dependency)
+
+ # Azure Storage SDK for Ruby
+ require "azure/storage"
+ require 'json' # for registry content
+ require "securerandom" # for generating uuid.
+
+ # Registry item to coordinate between multiple clients
+ class LogStash::Inputs::RegistryItem
+   attr_accessor :file_path, :etag, :offset, :reader, :gen
+   # Allow json serialization.
+   def as_json(options = {})
+     {
+       file_path: @file_path,
+       etag: @etag,
+       reader: @reader,
+       offset: @offset,
+       gen: @gen
+     }
+   end # as_json
+
+   def to_json(*options)
+     as_json(*options).to_json(*options)
+   end # to_json
+
+   def initialize(file_path, etag, reader, offset = 0, gen = 0)
+     @file_path = file_path
+     @etag = etag
+     @reader = reader
+     @offset = offset
+     @gen = gen
+   end # initialize
+ end # class RegistryItem
+
+
+ # Logstash input plugin for Azure Blobs
+ #
+ # This Logstash plugin gathers data from Microsoft Azure Blobs.
+ class LogStash::Inputs::LogstashInputAzureblob < LogStash::Inputs::Base
+   config_name "azureblob"
+
+   # If undefined, Logstash will complain, even if codec is unused.
+   default :codec, "json_lines"
+
+   # Set the account name for the azure storage account.
+   config :storage_account_name, :validate => :string
+
+   # Set the key to access the storage account.
+   config :storage_access_key, :validate => :string
+
+   # Set the container of the blobs.
+   config :container, :validate => :string
+
+   # Set the endpoint for the blobs.
+   #
+   # The default, `core.windows.net`, targets the public Azure cloud.
+   config :endpoint, :validate => :string, :default => 'core.windows.net'
+
+   # Set the value of using backup mode.
+   config :backupmode, :validate => :boolean, :default => false, :deprecated => true, :obsolete => 'This option is obsolete and the setting will be ignored.'
+
+   # Set the value for the registry file.
+   #
+   # The default, `data/registry`, is used to coordinate readings for various instances of the clients.
+   config :registry_path, :validate => :string, :default => 'data/registry'
+
+   # Set how many seconds to stay idle before checking for new logs.
+   #
+   # The default, `30`, means a read of the logs is triggered every 30 seconds after entering idle.
+   config :interval, :validate => :number, :default => 30
+
+   # Set the registry create mode.
+   #
+   # The default, `resume`, means that when the registry is initially created, it assumes all logs have been handled.
+   # When set to `start_over`, it will read all log files from the beginning.
+   config :registry_create_policy, :validate => :string, :default => 'resume'
+
+   # Sets the header of the file that does not repeat over records. Usually, these are JSON opening tags.
+   config :file_head_bytes, :validate => :number, :default => 0
+
+   # Sets the tail of the file that does not repeat over records. Usually, these are JSON closing tags.
+   config :file_tail_bytes, :validate => :number, :default => 0
+
+   # Sets the regular expression to process content before pushing the event.
+   config :record_preprocess_reg_exp, :validate => :string
+
+   # Sets the page size for returned blob items. Too large a value can exhaust the heap; too small a value leads to too many requests.
+   #
+   # The default, `100`, works well with the default heap size of 1 GB.
+   config :blob_list_page_size, :validate => :number, :default => 100
+
+   # Constant of the max integer ([42].pack('i').size is the native int width in bytes; this approximates the platform's largest Fixnum).
+   MAX = 2**([42].pack('i').size * 16 - 2) - 1
+
+   public
+   def register
+     # This is the reader ID for this specific plugin instance.
+     @reader = SecureRandom.uuid
+     @registry_locker = "#{@registry_path}.lock"
+
+     # Set up a specific instance of an Azure::Storage::Client.
+     client = Azure::Storage::Client.create(:storage_account_name => @storage_account_name, :storage_access_key => @storage_access_key, :storage_blob_host => "https://#{@storage_account_name}.blob.#{@endpoint}")
+     # Get an Azure storage blob service object from the client.
+     @azure_blob = client.blob_client
+     # Add a retry filter to the service object.
+     @azure_blob.with_filter(Azure::Storage::Core::Filter::ExponentialRetryPolicyFilter.new)
+   end # def register
+
+   def run(queue)
+     # We can abort the loop if stop? becomes true.
+     while !stop?
+       process(queue)
+       Stud.stoppable_sleep(@interval) { stop? }
+     end # loop
+   end # def run
+
+   def stop
+     cleanup_registry
+   end # def stop
+
+   # Start processing the next item.
+   def process(queue)
+     begin
+       blob, start_index, gen = register_for_read
+
+       if !blob.nil?
+         begin
+           blob_name = blob.name
+           # Work-around: after being returned by get_blob, the etag will contain quotes.
+           new_etag = blob.properties[:etag]
+           # ~ Work-around
+
+           # Read the non-repeating file header once, if one is configured.
+           header = nil
+           if !@file_head_bytes.nil? && @file_head_bytes > 0
+             blob, header = @azure_blob.get_blob(@container, blob_name, {:end_range => @file_head_bytes})
+           end
+
+           if start_index == 0
+             # Skip the header since it is already read.
+             start_index = start_index + @file_head_bytes
+           else
+             # Adjust the offset on subsequent reads, then read to the end of the file, including the tail.
+             start_index = start_index - @file_tail_bytes
+             start_index = 0 if start_index < 0
+           end
+
+           blob, content = @azure_blob.get_blob(@container, blob_name, {:start_range => start_index})
+
+           # content will be used to calculate the new offset. Create a new variable for the processed content.
+           processed_content = content
+           if !@record_preprocess_reg_exp.nil?
+             reg_exp = Regexp.new(@record_preprocess_reg_exp, Regexp::MULTILINE)
+             processed_content = content.sub(reg_exp, '')
+           end
+
+           # Put the header and content together before pushing into the event queue.
+           processed_content = "#{header}#{processed_content}" unless header.nil? || header.length == 0
+
+           @codec.decode(processed_content) do |event|
+             decorate(event)
+             queue << event
+           end # decode
+         ensure
+           # Make sure the reader is removed from the registry even when there's an exception.
+           new_offset = start_index
+           new_offset = new_offset + content.length unless content.nil?
+           new_registry_item = LogStash::Inputs::RegistryItem.new(blob_name, new_etag, nil, new_offset, gen)
+           update_registry(new_registry_item)
+         end # begin
+       end # if
+     rescue StandardError => e
+       @logger.error("Oh My, An error occurred. \nError:#{e}:\nTrace:\n#{e.backtrace}", :exception => e)
+     end # begin
+   end # process
+
+   # Deserialize the registry hash from a json string.
+   def deserialize_registry_hash(json_string)
+     result = Hash.new
+     temp_hash = JSON.parse(json_string)
+     temp_hash.values.each { |kvp|
+       result[kvp['file_path']] = LogStash::Inputs::RegistryItem.new(kvp['file_path'], kvp['etag'], kvp['reader'], kvp['offset'], kvp['gen'])
+     }
+     return result
+   end # deserialize_registry_hash
+
+   # List all the blobs in the given container.
+   def list_all_blobs
+     blobs = Set.new []
+     continuation_token = nil
+     @blob_list_page_size = 100 if @blob_list_page_size <= 0
+     loop do
+       # Limit the number of returned entries to avoid an out-of-memory exception.
+       entries = @azure_blob.list_blobs(@container, { :timeout => 10, :marker => continuation_token, :max_results => @blob_list_page_size })
+       entries.each do |entry|
+         blobs << entry
+       end # each
+       continuation_token = entries.continuation_token
+       break if continuation_token.empty?
+     end # loop
+     return blobs
+   end # def list_all_blobs
+
+   # Raise the generation for a blob in the registry.
+   def raise_gen(registry_hash, file_path)
+     begin
+       target_item = registry_hash[file_path]
+       begin
+         target_item.gen += 1
+         # Protect gen from overflow.
+         target_item.gen = target_item.gen / 2 if target_item.gen == MAX
+       rescue StandardError => e
+         @logger.error("Failed to get the next generation for target item #{target_item}.", :exception => e)
+         target_item.gen = 0
+       end
+
+       min_gen_item = registry_hash.values.min_by { |x| x.gen }
+       while min_gen_item.gen > 0
+         registry_hash.values.each { |value|
+           value.gen -= 1
+         }
+         min_gen_item = registry_hash.values.min_by { |x| x.gen }
+       end
+     end
+   end # raise_gen
+
+   # Acquire a lease on a blob item with retries.
+   #
+   # By default, it will retry 30 times with a 1-second interval.
+   def acquire_lease(blob_name, retry_times = 30, interval_sec = 1)
+     lease = nil
+     retried = 0
+     while lease.nil? do
+       begin
+         lease = @azure_blob.acquire_blob_lease(@container, blob_name, {:timeout => 10})
+       rescue StandardError => e
+         if e.type == 'LeaseAlreadyPresent'
+           if retried > retry_times
+             raise
+           end
+           retried += 1
+           sleep interval_sec
+         end
+       end
+     end # while
+     return lease
+   end # acquire_lease
+
+   # Return the next blob for reading as well as the start index.
+   def register_for_read
+     begin
+       all_blobs = list_all_blobs
+       registry = all_blobs.find { |item| item.name.downcase == @registry_path }
+       registry_locker = all_blobs.find { |item| item.name.downcase == @registry_locker }
+
+       candidate_blobs = all_blobs.select { |item| (item.name.downcase != @registry_path) && (item.name.downcase != @registry_locker) }
+
+       start_index = 0
+       gen = 0
+       lease = nil
+
+       # Put the lease on the locker file rather than the registry file to allow updates of the registry,
+       # as a workaround for Azure Storage Ruby SDK issue #16.
+       # Workaround: https://github.com/Azure/azure-storage-ruby/issues/16
+       registry_locker = @azure_blob.create_block_blob(@container, @registry_locker, @reader) if registry_locker.nil?
+       lease = acquire_lease(@registry_locker)
+       # ~ Workaround
+
+       if registry.nil?
+         registry_hash = create_registry(candidate_blobs)
+       else
+         registry_hash = load_registry
+       end # if
+
+       picked_blobs = Set.new []
+       # Pick up the next candidate.
+       picked_blob = nil
+       candidate_blobs.each { |candidate_blob|
+         registry_item = registry_hash[candidate_blob.name]
+
+         # Append items that do not exist in the hash table yet.
+         if registry_item.nil?
+           registry_item = LogStash::Inputs::RegistryItem.new(candidate_blob.name, candidate_blob.properties[:etag], nil, 0, 0)
+           registry_hash[candidate_blob.name] = registry_item
+         end # if
+
+         if (registry_item.offset < candidate_blob.properties[:content_length]) && (registry_item.reader.nil? || registry_item.reader == @reader)
+           picked_blobs << candidate_blob
+         end
+       }
+
+       picked_blob = picked_blobs.min_by { |b| registry_hash[b.name].gen }
+       if !picked_blob.nil?
+         registry_item = registry_hash[picked_blob.name]
+         registry_item.reader = @reader
+         registry_hash[picked_blob.name] = registry_item
+         start_index = registry_item.offset
+         raise_gen(registry_hash, picked_blob.name)
+         gen = registry_item.gen
+       end # if
+
+       # Save the change to the registry.
+       save_registry(registry_hash)
+
+       @azure_blob.release_blob_lease(@container, @registry_locker, lease)
+       lease = nil
+
+       return picked_blob, start_index, gen
+     rescue StandardError => e
+       @logger.error("Oh My, An error occurred. #{e}:\n#{e.backtrace}", :exception => e)
+       return nil, nil, nil
+     ensure
+       @azure_blob.release_blob_lease(@container, @registry_locker, lease) unless lease.nil?
+       lease = nil
+     end # rescue
+   end # register_for_read
+
+   # Update the registry.
+   def update_registry(registry_item)
+     begin
+       lease = nil
+       lease = acquire_lease(@registry_locker)
+       registry_hash = load_registry
+       registry_hash[registry_item.file_path] = registry_item
+       save_registry(registry_hash)
+       @azure_blob.release_blob_lease(@container, @registry_locker, lease)
+       lease = nil
+     rescue StandardError => e
+       @logger.error("Oh My, An error occurred. #{e}:\n#{e.backtrace}", :exception => e)
+     ensure
+       @azure_blob.release_blob_lease(@container, @registry_locker, lease) unless lease.nil?
+       lease = nil
+     end # rescue
+   end # def update_registry
+
+   # Clean up the registry.
+   def cleanup_registry
+     begin
+       lease = nil
+       lease = acquire_lease(@registry_locker)
+       registry_hash = load_registry
+       registry_hash.each { |key, registry_item|
+         registry_item.reader = nil if registry_item.reader == @reader
+       }
+       save_registry(registry_hash)
+       @azure_blob.release_blob_lease(@container, @registry_locker, lease)
+       lease = nil
+     rescue StandardError => e
+       @logger.error("Oh My, An error occurred. #{e}:\n#{e.backtrace}", :exception => e)
+     ensure
+       @azure_blob.release_blob_lease(@container, @registry_locker, lease) unless lease.nil?
+       lease = nil
+     end # rescue
+   end # def cleanup_registry
+
+   # Create a registry file to coordinate between multiple azure blob inputs.
+   def create_registry(blob_items)
+     registry_hash = Hash.new
+
+     blob_items.each do |blob_item|
+       initial_offset = 0
+       initial_offset = blob_item.properties[:content_length] if @registry_create_policy == 'resume'
+       registry_item = LogStash::Inputs::RegistryItem.new(blob_item.name, blob_item.properties[:etag], nil, initial_offset, 0)
+       registry_hash[blob_item.name] = registry_item
+     end # each
+     save_registry(registry_hash)
+     return registry_hash
+   end # create_registry
+
+   # Load the content of the registry into the registry hash and return it.
+   def load_registry
+     # Get the content.
+     registry_blob, registry_blob_body = @azure_blob.get_blob(@container, @registry_path)
+     registry_hash = deserialize_registry_hash(registry_blob_body)
+     return registry_hash
+   end # def load_registry
+
+   # Serialize the registry hash and save it.
+   def save_registry(registry_hash)
+     # Serialize the hash to json.
+     registry_hash_json = JSON.generate(registry_hash)
+
+     # Upload the registry to the blob.
+     @azure_blob.create_block_blob(@container, @registry_path, registry_hash_json)
+   end # def save_registry
+ end # class LogStash::Inputs::LogstashInputAzureblob
data/logstash-input-azureblob.gemspec ADDED
@@ -0,0 +1,26 @@
+ Gem::Specification.new do |s|
+   s.name = 'logstash-input-azureblob-json-head-tail'
+   s.version = '0.9.9'
+   s.licenses = ['Apache License (2.0)']
+   s.summary = 'This plugin collects Microsoft Azure Diagnostics data from Azure Storage Blobs.'
+   s.description = 'This gem is a Logstash plugin. It reads and parses data from Azure Storage Blobs.'
+   s.homepage = 'https://github.com/Azure/azure-diagnostics-tools'
+   s.authors = ['Microsoft Corporation']
+   s.email = 'azdiag@microsoft.com'
+   s.require_paths = ['lib']
+
+   # Files
+   s.files = Dir['lib/**/*', 'spec/**/*', 'vendor/**/*', '*.gemspec', '*.md', 'Gemfile', 'LICENSE']
+   # Tests
+   s.test_files = s.files.grep(%r{^(test|spec|features)/})
+
+   # Special flag to let us know this is actually a logstash plugin
+   s.metadata = { "logstash_plugin" => "true", "logstash_group" => "input" }
+
+   # Gem dependencies
+   s.add_runtime_dependency "logstash-core-plugin-api", '>= 1.60', '<= 2.99'
+   s.add_runtime_dependency 'logstash-codec-json_lines'
+   s.add_runtime_dependency 'stud', '>= 0.0.22'
+   s.add_runtime_dependency 'azure-storage', '~> 0.12.3.preview'
+   s.add_development_dependency 'logstash-devutils'
+ end
data/spec/inputs/azureblob_spec.rb ADDED
@@ -0,0 +1 @@
+ require "logstash/devutils/rspec/spec_helper"
metadata ADDED
@@ -0,0 +1,131 @@
+ --- !ruby/object:Gem::Specification
+ name: logstash-input-azureblob-json-head-tail
+ version: !ruby/object:Gem::Version
+   version: 0.9.9
+ platform: ruby
+ authors:
+ - Microsoft Corporation
+ autorequire:
+ bindir: bin
+ cert_chain: []
+ date: 2017-08-23 00:00:00.000000000 Z
+ dependencies:
+ - !ruby/object:Gem::Dependency
+   name: logstash-core-plugin-api
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - '>='
+       - !ruby/object:Gem::Version
+         version: '1.60'
+     - - <=
+       - !ruby/object:Gem::Version
+         version: '2.99'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - '>='
+       - !ruby/object:Gem::Version
+         version: '1.60'
+     - - <=
+       - !ruby/object:Gem::Version
+         version: '2.99'
+ - !ruby/object:Gem::Dependency
+   name: logstash-codec-json_lines
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - '>='
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - '>='
+       - !ruby/object:Gem::Version
+         version: '0'
+ - !ruby/object:Gem::Dependency
+   name: stud
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - '>='
+       - !ruby/object:Gem::Version
+         version: 0.0.22
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - '>='
+       - !ruby/object:Gem::Version
+         version: 0.0.22
+ - !ruby/object:Gem::Dependency
+   name: azure-storage
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ~>
+       - !ruby/object:Gem::Version
+         version: 0.12.3.preview
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ~>
+       - !ruby/object:Gem::Version
+         version: 0.12.3.preview
+ - !ruby/object:Gem::Dependency
+   name: logstash-devutils
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - '>='
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - '>='
+       - !ruby/object:Gem::Version
+         version: '0'
+ description: This gem is a Logstash plugin. It reads and parses data from Azure Storage
+   Blobs.
+ email: azdiag@microsoft.com
+ executables: []
+ extensions: []
+ extra_rdoc_files: []
+ files:
+ - lib/logstash/inputs/azureblob.rb
+ - spec/inputs/azureblob_spec.rb
+ - logstash-input-azureblob.gemspec
+ - CHANGELOG.md
+ - README.md
+ - Gemfile
+ - LICENSE
+ homepage: https://github.com/Azure/azure-diagnostics-tools
+ licenses:
+ - Apache License (2.0)
+ metadata:
+   logstash_plugin: 'true'
+   logstash_group: input
+ post_install_message:
+ rdoc_options: []
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - '>='
+     - !ruby/object:Gem::Version
+       version: '0'
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - '>='
+     - !ruby/object:Gem::Version
+       version: '0'
+ requirements: []
+ rubyforge_project:
+ rubygems_version: 2.0.14.1
+ signing_key:
+ specification_version: 4
+ summary: This plugin collects Microsoft Azure Diagnostics data from Azure Storage
+   Blobs.
+ test_files:
+ - spec/inputs/azureblob_spec.rb