logstash-input-azure_blob_storage 0.11.4 → 0.12.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 158d9ef3b7997fb3ec67f4e2278861ae367c3e4a73f362dc56f145482d802e34
- data.tar.gz: 89f5b1bc848a97cbf31b1323aa64d021d86a05292d3d7d006994ad170666a37d
+ metadata.gz: bcf097b26eafe13b09cbaca77a097c10cc6e429b51125c6e82f27e8057e6ccab
+ data.tar.gz: cf229e45283fc69d29d751b75c4fce42b432103ef49d7ec018dea810477d4b32
  SHA512:
- metadata.gz: 80f12e364ba3fd81375d2b88d24567d92ec83decac371552e3a814194f6dcae2f1c6991ac87f50e0012a8cb177f67da92790d40a71af953b211e5043a1691170
- data.tar.gz: 0e54b9c0b9f63737ef8046d362c47f1c20f2d9f702db0311993def976f1a40c14534c7fae9a7a90e098ce4b3bdd18d00517f420e9cc6c4b7810f3709aee797e1
+ metadata.gz: ad5a05a919398a665b70ee177ba2a43f53e74462cfa4b1afb308caa2d065ece6d923142b12f16c5385e3edc840cbe453c7937b5242dd697239a240c8295e4418
+ data.tar.gz: 6be87f645933465f9edc34b675d7dd7dd861bbf1990d9fe9919731da14fd450523baa5bafefbdd9fcad4ec9aef70528d258090c2c545d4879c27069a514384ec
data/CHANGELOG.md CHANGED
@@ -1,6 +1,29 @@
+ ## 0.12.0
+ - version 2 of azure-storage-blob
+ - the registry now saves only the current files, no longer keeping historical files
+
+ ## 0.11.7
+ - implemented skip_learning
+ - failed files are now ignored instead of retried
+
+ ## 0.11.6
+ - fixed the max_results in json head and tail learning
+ - broke out connection setup so it can be called again when connection exceptions occur
+ - deal better with skipping empty files
+
+ ## 0.11.5
+ - added optional addfilename to add the filename to the message
+ - NSGFLOWLOG version 2 uses 0 as value instead of NULL in src and dst values
+ - added connection exception handling when doing a full_read of files
+ - rewrote json header and footer learning to skip the registry file
+ - plumbing for emulator
+
  ## 0.11.4
  - fixed listing 3 times, rather than retrying to list max 3 times
- - added log entries for better tracing in which phase the application is now and how long it takes
+ - added option to migrate/save to a local registry
+ - rewrote interval timing
+ - reduced saving of the registry to at most once per interval, protecting against duplicate simultaneous writes
+ - added debug_timer for better tracing of how long operations take
  - removing pipeline name from logfiles, logstash 7.6 and up have this in the log4j2 by default now
  - moved initialization from register to run. should make logs more readable
 
data/README.md CHANGED
@@ -1,30 +1,34 @@
- # Logstash Plugin
+ # Logstash
 
- This is a plugin for [Logstash](https://github.com/elastic/logstash).
+ This is a plugin for [Logstash](https://github.com/elastic/logstash). It is fully free and fully open source. The license is Apache 2.0, meaning you are pretty much free to use it however you want in whatever way. All logstash plugin documentation is placed under one [central location](http://www.elastic.co/guide/en/logstash/current/). Need generic logstash help? Try #logstash on freenode IRC or the https://discuss.elastic.co/c/logstash discussion forum.
 
- It is fully free and fully open source. The license is Apache 2.0, meaning you are pretty much free to use it however you want in whatever way.
+ For problems or feature requests with this specific plugin, raise a github issue at [GITHUB/janmg/logstash-input-azure_blob_storage/](https://github.com/janmg/logstash-input-azure_blob_storage). Pull requests are also welcome after discussion through an issue.
 
- ## Documentation
-
- All logstash plugin documentation are placed under one [central location](http://www.elastic.co/guide/en/logstash/current/).
+ ## Purpose
+ This plugin can read from Azure Storage Blobs, for instance JSON diagnostics logs for NSG flow logs or LINE based access logs from App Services.
+ [Azure Blob Storage](https://azure.microsoft.com/en-us/services/storage/blobs/)
 
- ## Need Help?
+ The plugin depends on the [Ruby library azure-storage-blob](https://rubygems.org/gems/azure-storage-blob/versions/1.1.0) from Microsoft, which depends on Faraday for the HTTPS connection to Azure.
 
- Need help? Try #logstash on freenode IRC or the https://discuss.elastic.co/c/logstash discussion forum. For real problems or feature requests, raise a github issue [GITHUB/janmg/logstash-input-azure_blob_storage/](https://github.com/janmg/logstash-input-azure_blob_storage). Pull requests will ionly be merged after discussion through an issue.
+ The plugin executes the following steps (a simplified sketch of the worklist logic follows the list)
+ 1. Lists all the files in the azure storage account where the path of the files matches pathprefix
+ 2. Filters on path_filters to only include files that match the directory and file glob (e.g. **/*.json)
+ 3. Saves the listed files in a registry of known files and filesizes (data/registry.dat on azure, or in a file on the logstash instance)
+ 4. Lists all the files again, compares the registry with the new filelist and puts the delta in a worklist
+ 5. Processes the worklist and puts all events in the logstash queue
+ 6. If there is time left, sleeps to complete the interval. If processing takes more than an interval, saves the registry and continues processing
+ 7. If logstash is stopped, a stop signal will try to finish the current file, save the registry and then quit
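
Steps 3 to 5 amount to a diff between the saved registry and a fresh listing. A minimal Ruby sketch of that selection, modelled on the worklist code further down in azure_blob_storage.rb; the data and variable names here are illustrative, not the plugin's exact implementation:
```
# Illustrative only: blob name => { offset: bytes already processed, length: current size }
registry = { "a.json" => { offset: 100, length: 100 },   # fully read in the last interval
             "b.json" => { offset:  80, length:  80 } }  # fully read in the last interval
listing  = { "a.json" => { offset: 0, length: 120 },     # grew by 20 bytes since then
             "b.json" => { offset: 0, length:  80 },
             "c.json" => { offset: 0, length:  10 } }    # new file

# carry the known offsets over; files never seen before start at offset 0
newreg = {}
listing.each do |name, file|
  off = registry.key?(name) ? registry[name][:offset] : 0
  newreg[name] = { offset: off, length: file[:length] }
end

# the worklist is every file with unread bytes left (same select as in the plugin)
worklist = newreg.select { |_name, file| file[:offset] < file[:length] }
puts worklist.keys.inspect   # => ["a.json", "c.json"]
```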
 
- ## Purpose
- This plugin can read from Azure Storage Blobs, for instance diagnostics logs for NSG flow logs or accesslogs from App Services.
- [Azure Blob Storage](https://azure.microsoft.com/en-us/services/storage/blobs/)
- This
  ## Installation
  This plugin can be installed through logstash-plugin
  ```
- logstash-plugin install logstash-input-azure_blob_storage
+ /usr/share/logstash/bin/logstash-plugin install logstash-input-azure_blob_storage
  ```
 
  ## Minimal Configuration
  The minimum configuration required as input is storageaccount, access_key and container.
 
+ /etc/logstash/conf.d/test.conf
  ```
  input {
  azure_blob_storage {
@@ -36,23 +40,29 @@ input {
  ```
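
The hunk above cuts off the minimal example. Filled out with the placeholder values reused in the larger example further down (not values shipped with this release), it would look roughly like:
```
input {
    azure_blob_storage {
        storageaccount => "yourstorageaccountname"
        access_key => "Ba5e64c0d3=="
        container => "insights-logs-networksecuritygroupflowevent"
    }
}
```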
 
  ## Additional Configuration
- The registry_create_policy is used when the pipeline is started to either resume from the last known unprocessed file, or to start_fresh ignoring old files or start_over to process all the files from the beginning.
+ The registry keeps track of the files in the storage account, their size and how many bytes have been processed. Files can grow and the added part will be processed as a partial file. The registry is saved to disk every interval.
+
+ The registry_create_policy determines at the start of the pipeline whether processing should resume from the last known unprocessed file, start_fresh ignoring old files and only processing events that arrive after the pipeline has started, or start_over to process all the files, ignoring the registry.
 
- interval defines the minimum time the registry should be saved to the registry file (by default 'data/registry.dat'), this is only needed in case the pipeline dies unexpectedly. During a normal shutdown the registry is also saved.
+ interval defines the minimum time the registry should be saved to the registry file (by default to 'data/registry.dat'); this is only needed in case the pipeline dies unexpectedly. During a normal shutdown the registry is also saved.
 
- During the pipeline start the plugin uses one file to learn how the JSON header and tail look like, they can also be configured manually.
+ When registry_local_path is set to a directory, the registry is saved on the logstash server in that directory. The filename is the pipe.id.
+
+ With registry_create_policy set to resume and registry_local_path set to a directory where the registry has not been created yet, the plugin loads the registry from the storage account and saves it on the local server. This allows for a migration to local storage (a configuration sketch follows below).
+
+ For pipelines that use the JSON codec or the JSON_LINE codec, the plugin uses one file to learn what the JSON header and tail look like; they can also be configured manually. With skip_learning the learning can be disabled.
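
A sketch of how these registry options combine in an input block; the option values are taken from the larger example further down and are illustrative, not recommendations:
```
input {
    azure_blob_storage {
        storageaccount => "yourstorageaccountname"
        access_key => "Ba5e64c0d3=="
        container => "insights-logs-networksecuritygroupflowevent"
        registry_create_policy => "resume"
        registry_local_path => "/usr/share/logstash/plugin"
        interval => 300
    }
}
```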
 
  ## Running the pipeline
  The pipeline can be started in several ways.
  - On the commandline
  ```
- /usr/share/logstash/bin/logtash -f /etc/logstash/pipeline.d/test.yml
+ /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/test.conf
  ```
 
  - In the pipeline.yml
  ```
  /etc/logstash/pipeline.yml
  pipe.id = test
- pipe.path = /etc/logstash/pipeline.d/test.yml
+ pipe.path = /etc/logstash/conf.d/test.conf
  ```
  - As managed pipeline from Kibana
 
@@ -91,6 +101,9 @@ The log level of the plugin can be put into DEBUG through
  curl -XPUT 'localhost:9600/_node/logging?pretty' -H 'Content-Type: application/json' -d'{"logger.logstash.inputs.azureblobstorage" : "DEBUG"}'
  ```
 
+ Because logstash debug mode makes logstash very chatty, the option debug_until will log extra detail for the first given number of processed events and then stop debugging. One file can easily contain thousands of events. debug_until is useful for monitoring the start of the plugin and the processing of the first files.
+
+ debug_timer will show detailed information on how much time the listing of files took and how long the plugin will sleep to fill the interval before the listing and processing start again.
 
  ## Other Configuration Examples
  For nsgflowlogs, a simple configuration looks like this
@@ -116,6 +129,10 @@ filter {
  }
  }
 
+ output {
+ stdout { }
+ }
+
  output {
  elasticsearch {
  hosts => "elasticsearch"
@@ -123,21 +140,35 @@ output {
  }
  }
  ```
-
+ A more elaborate input configuration example
  ```
  input {
  azure_blob_storage {
+ codec => "json"
  storageaccount => "yourstorageaccountname"
  access_key => "Ba5e64c0d3=="
  container => "insights-logs-networksecuritygroupflowevent"
- codec => "json"
  logtype => "nsgflowlog"
  prefix => "resourceId=/"
+ path_filters => ['**/*.json']
+ addfilename => true
  registry_create_policy => "resume"
+ registry_local_path => "/usr/share/logstash/plugin"
  interval => 300
+ debug_timer => true
+ debug_until => 100
+ }
+ }
+
+ output {
+ elasticsearch {
+ hosts => "elasticsearch"
+ index => "nsg-flow-logs-%{+xxxx.ww}"
  }
  }
  ```
+ The configuration documentation is in the first 100 lines of the code
+ [GITHUB/janmg/logstash-input-azure_blob_storage/blob/master/lib/logstash/inputs/azure_blob_storage.rb](https://github.com/janmg/logstash-input-azure_blob_storage/blob/master/lib/logstash/inputs/azure_blob_storage.rb)
 
  For WAD IIS and App Services the HTTP AccessLogs can be retrieved from a storage account as line based events and parsed through GROK. The date stamp can also be parsed with %{TIMESTAMP_ISO8601:log_timestamp}. For WAD IIS logfiles the container is wad-iis-logfiles. In the future grokking may happen already by the plugin.
  ```
@@ -176,7 +207,7 @@ filter {
  remove_field => ["subresponse"]
  remove_field => ["username"]
  remove_field => ["clientPort"]
- remove_field => ["port"]
+ remove_field => ["port"]
  remove_field => ["timestamp"]
  }
  }
data/lib/logstash/inputs/azure_blob_storage.rb CHANGED
@@ -25,6 +25,9 @@ config :storageaccount, :validate => :string, :required => false
  # DNS Suffix other then blob.core.windows.net
  config :dns_suffix, :validate => :string, :required => false, :default => 'core.windows.net'
 
+ # For development this can be used to emulate an accountstorage when not available from azure
+ #config :use_development_storage, :validate => :boolean, :required => false
+
  # The (primary or secondary) Access Key for the the storage account. The key can be found in the portal.azure.com or through the azure api StorageAccounts/ListKeys. For example the PowerShell command Get-AzStorageAccountKey.
  config :access_key, :validate => :password, :required => false
 
@@ -58,6 +61,9 @@ config :registry_create_policy, :validate => ['resume','start_over','start_fresh
  # Z00000000000000000000000000000000 2 ]}
  config :interval, :validate => :number, :default => 60
 
+ # add the filename into the events
+ config :addfilename, :validate => :boolean, :default => false, :required => false
+
  # debug_until will for a maximum amount of processed messages shows 3 types of log printouts including processed filenames. This is a lightweight alternative to switching the loglevel from info to debug or even trace
  config :debug_until, :validate => :number, :default => 0, :required => false
 
@@ -67,6 +73,9 @@ config :debug_timer, :validate => :boolean, :default => false, :required => fals
  # WAD IIS Grok Pattern
  #config :grokpattern, :validate => :string, :required => false, :default => '%{TIMESTAMP_ISO8601:log_timestamp} %{NOTSPACE:instanceId} %{NOTSPACE:instanceId2} %{IPORHOST:ServerIP} %{WORD:httpMethod} %{URIPATH:requestUri} %{NOTSPACE:requestQuery} %{NUMBER:port} %{NOTSPACE:username} %{IPORHOST:clientIP} %{NOTSPACE:httpVersion} %{NOTSPACE:userAgent} %{NOTSPACE:cookie} %{NOTSPACE:referer} %{NOTSPACE:host} %{NUMBER:httpStatus} %{NUMBER:subresponse} %{NUMBER:win32response} %{NUMBER:sentBytes:int} %{NUMBER:receivedBytes:int} %{NUMBER:timeTaken:int}'
 
+ # skip learning if you use json and don't want to learn the head and tail, but use either the defaults or configure them.
+ config :skip_learning, :validate => :boolean, :default => false, :required => false
+
  # The string that starts the JSON. Only needed when the codec is JSON. When partial file are read, the result will not be valid JSON unless the start and end are put back. the file_head and file_tail are learned at startup, by reading the first file in the blob_list and taking the first and last block, this would work for blobs that are appended like nsgflowlogs. The configuration can be set to override the learning. In case learning fails and the option is not set, the default is to use the 'records' as set by nsgflowlogs.
  config :file_head, :validate => :string, :required => false, :default => '{"records":['
  # The string that ends the JSON
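
If learning should be skipped, the head and tail can be set explicitly. A hedged sketch, assuming the nsgflowlog defaults (the '{"records":[' head shown above and a matching ']}' tail); adjust to your own blob layout:
```
input {
    azure_blob_storage {
        codec => "json"
        skip_learning => true
        file_head => '{"records":['
        file_tail => ']}'
        # plus storageaccount, access_key and container as in the README examples
    }
}
```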
@@ -109,30 +118,7 @@ def run(queue)
  @processed = 0
  @regsaved = @processed
 
- # Try in this order to access the storageaccount
- # 1. storageaccount / sas_token
- # 2. connection_string
- # 3. storageaccount / access_key
-
- unless connection_string.nil?
- conn = connection_string.value
- end
- unless sas_token.nil?
- unless sas_token.value.start_with?('?')
- conn = "BlobEndpoint=https://#{storageaccount}.#{dns_suffix};SharedAccessSignature=#{sas_token.value}"
- else
- conn = sas_token.value
- end
- end
- unless conn.nil?
- @blob_client = Azure::Storage::Blob::BlobService.create_from_connection_string(conn)
- else
- @blob_client = Azure::Storage::Blob::BlobService.create(
- storage_account_name: storageaccount,
- storage_dns_suffix: dns_suffix,
- storage_access_key: access_key.value,
- )
- end
+ connect
 
  @registry = Hash.new
  if registry_create_policy == "resume"
@@ -167,7 +153,7 @@ def run(queue)
  if registry_create_policy == "start_fresh"
  @registry = list_blobs(true)
  save_registry(@registry)
- @logger.info("starting fresh, overwriting the registry to contain #{@registry.size} blobs/files")
+ @logger.info("starting fresh, writing a clean registry to contain #{@registry.size} blobs/files")
  end
 
  @is_json = false
@@ -180,12 +166,14 @@ def run(queue)
  @tail = ''
  # if codec=json sniff one files blocks A and Z to learn file_head and file_tail
  if @is_json
- learn_encapsulation
  if file_head
- @head = file_head
+ @head = file_head
  end
  if file_tail
- @tail = file_tail
+ @tail = file_tail
+ end
+ if file_head and file_tail and !skip_learning
+ learn_encapsulation
  end
  @logger.info("head will be: #{@head} and tail is set to #{@tail}")
  end
@@ -223,33 +211,55 @@ def run(queue)
  newreg.store(name, { :offset => off, :length => file[:length] })
  if (@debug_until > @processed) then @logger.info("2: adding offsets: #{name} #{off} #{file[:length]}") end
  end
+ # size nilClass when the list doesn't grow?!
  # Worklist is the subset of files where the already read offset is smaller than the file size
- worklist.clear
+ @registry = newreg
+ worklist.clear
+ chunk = nil
+
  worklist = newreg.select {|name,file| file[:offset] < file[:length]}
  if (worklist.size > 4) then @logger.info("worklist contains #{worklist.size} blobs") end
 
  # Start of processing
  # This would be ideal for threading since it's IO intensive, would be nice with a ruby native ThreadPool
- worklist.each do |name, file|
+ if (worklist.size > 0) then
+ worklist.each do |name, file|
  start = Time.now.to_i
  if (@debug_until > @processed) then @logger.info("3: processing #{name} from #{file[:offset]} to #{file[:length]}") end
  size = 0
  if file[:offset] == 0
- chunk = full_read(name)
- size=chunk.size
+ # This is where Sera4000 issue starts
+ # For an append blob, reading full and crashing, retry, last_modified? ... lenght? ... committed? ...
+ # length and skip reg value
+ if (file[:length] > 0)
+ begin
+ chunk = full_read(name)
+ size=chunk.size
+ rescue Exception => e
+ @logger.error("Failed to read #{name} because of: #{e.message} .. will continue, set file as read and pretend this never happened")
+ @logger.error("#{size} size and #{file[:length]} file length")
+ size = file[:length]
+ end
+ else
+ @logger.info("found a zero size file #{name}")
+ chunk = nil
+ end
  else
  chunk = partial_read_json(name, file[:offset], file[:length])
  @logger.debug("partial file #{name} from #{file[:offset]} to #{file[:length]}")
  end
  if logtype == "nsgflowlog" && @is_json
+ # skip empty chunks
+ unless chunk.nil?
  res = resource(name)
  begin
  fingjson = JSON.parse(chunk)
- @processed += nsgflowlog(queue, fingjson)
+ @processed += nsgflowlog(queue, fingjson, name)
  @logger.debug("Processed #{res[:nsg]} [#{res[:date]}] #{@processed} events")
  rescue JSON::ParserError
  @logger.error("parse error on #{res[:nsg]} [#{res[:date]}] offset: #{file[:offset]} length: #{file[:length]}")
  end
+ end
  # TODO: Convert this to line based grokking.
  # TODO: ECS Compliance?
  elsif logtype == "wadiis" && !@is_json
@@ -257,13 +267,17 @@ def run(queue)
  else
  counter = 0
  begin
- @codec.decode(chunk) do |event|
+ @codec.decode(chunk) do |event|
  counter += 1
+ if @addfilename
+ event.set('filename', name)
+ end
  decorate(event)
  queue << event
  end
  rescue Exception => e
  @logger.error("codec exception: #{e.message} .. will continue and pretend this never happened")
+ @registry.store(name, { :offset => file[:length], :length => file[:length] })
  @logger.debug("#{chunk}")
  end
  @processed += counter
@@ -279,6 +293,7 @@ def run(queue)
  if ((Time.now.to_i - @last) > @interval)
  save_registry(@registry)
  end
+ end
  end
  # The files that got processed after the last registry save need to be saved too, in case the worklist is empty for some intervals.
  now = Time.now.to_i
@@ -302,8 +317,54 @@ end
 
 
  private
+ def connect
+ # Try in this order to access the storageaccount
+ # 1. storageaccount / sas_token
+ # 2. connection_string
+ # 3. storageaccount / access_key
+
+ unless connection_string.nil?
+ conn = connection_string.value
+ end
+ unless sas_token.nil?
+ unless sas_token.value.start_with?('?')
+ conn = "BlobEndpoint=https://#{storageaccount}.#{dns_suffix};SharedAccessSignature=#{sas_token.value}"
+ else
+ conn = sas_token.value
+ end
+ end
+ unless conn.nil?
+ @blob_client = Azure::Storage::Blob::BlobService.create_from_connection_string(conn)
+ else
+ # unless use_development_storage?
+ @blob_client = Azure::Storage::Blob::BlobService.create(
+ storage_account_name: storageaccount,
+ storage_dns_suffix: dns_suffix,
+ storage_access_key: access_key.value,
+ )
+ # else
+ # @logger.info("not yet implemented")
+ # end
+ end
+ end
+
  def full_read(filename)
- return @blob_client.get_blob(container, filename)[1]
+ tries ||= 2
+ begin
+ return @blob_client.get_blob(container, filename)[1]
+ rescue Exception => e
+ @logger.error("caught: #{e.message} for full_read")
+ if (tries -= 1) > 0
+ if e.message == "Connection reset by peer"
+ connect
+ end
+ retry
+ end
+ end
+ begin
+ chunk = @blob_client.get_blob(container, filename)[1]
+ end
+ return chunk
  end
 
  def partial_read_json(filename, offset, length)
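
The precedence that connect tries above corresponds to three alternative ways of supplying credentials in the input block. A hedged sketch (use only one alternative; the option names are the plugin's, but every value is a placeholder and the SAS token / connection string shapes are just the usual Azure formats, not values from this release):
```
input {
    azure_blob_storage {
        container => "insights-logs-networksecuritygroupflowevent"
        # alternative 1: storage account name plus SAS token
        storageaccount => "yourstorageaccountname"
        sas_token => "?sv=2020-08-04&ss=b&srt=co&sp=rl&sig=REDACTED"
        # alternative 2: a full connection string instead
        # connection_string => "DefaultEndpointsProtocol=https;AccountName=yourstorageaccountname;AccountKey=Ba5e64c0d3==;EndpointSuffix=core.windows.net"
        # alternative 3: storage account name plus access key
        # access_key => "Ba5e64c0d3=="
    }
}
```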
@@ -326,8 +387,7 @@ def strip_comma(str)
  end
 
 
-
- def nsgflowlog(queue, json)
+ def nsgflowlog(queue, json, name)
  count=0
  json["records"].each do |record|
  res = resource(record["resourceId"])
@@ -340,9 +400,16 @@ def nsgflowlog(queue, json)
  tups = tup.split(',')
  ev = rule.merge({:unixtimestamp => tups[0], :src_ip => tups[1], :dst_ip => tups[2], :src_port => tups[3], :dst_port => tups[4], :protocol => tups[5], :direction => tups[6], :decision => tups[7]})
  if (record["properties"]["Version"]==2)
+ tups[9] = 0 if tups[9].nil?
+ tups[10] = 0 if tups[10].nil?
+ tups[11] = 0 if tups[11].nil?
+ tups[12] = 0 if tups[12].nil?
  ev.merge!( {:flowstate => tups[8], :src_pack => tups[9], :src_bytes => tups[10], :dst_pack => tups[11], :dst_bytes => tups[12]} )
  end
  @logger.trace(ev.to_s)
+ if @addfilename
+ ev.merge!( {:filename => name } )
+ end
  event = LogStash::Event.new('message' => ev.to_json)
  decorate(event)
  queue << event
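
For context on the tups[9..12] defaulting: a version 2 flow tuple carries extra fields (flow state plus packet/byte counters) that can be absent while a flow is still open, in which case the split leaves nils. A hedged Ruby sketch with made-up tuple values, not taken from a real log; the field layout simply mirrors the merge! call above:
```
# Illustrative only
v2_open = "1646126400,10.0.0.4,10.0.0.5,44931,443,T,O,A,B".split(',')
v2_open[9]  = 0 if v2_open[9].nil?    # src packets not reported yet
v2_open[10] = 0 if v2_open[10].nil?   # src bytes
v2_open[11] = 0 if v2_open[11].nil?   # dst packets
v2_open[12] = 0 if v2_open[12].nil?   # dst bytes
puts v2_open.inspect
# => ["1646126400", "10.0.0.4", "10.0.0.5", "44931", "443", "T", "O", "A", "B", 0, 0, 0, 0]
```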
@@ -429,10 +496,10 @@ def save_registry(filelist)
  @busy_writing_registry = true
  unless (@registry_local_path)
  @blob_client.create_block_blob(container, registry_path, Marshal.dump(filelist))
- @logger.info("processed #{@processed} events, saving #{filelist.size} blobs and offsets to registry #{registry_path}")
+ @logger.info("processed #{@processed} events, saving #{filelist.size} blobs and offsets to remote registry #{registry_path}")
  else
  File.open(@registry_local_path+"/"+@pipe_id, 'w') { |file| file.write(Marshal.dump(filelist)) }
- @logger.info("processed #{@processed} events, saving #{filelist.size} blobs and offsets to registry #{registry_local_path+"/"+@pipe_id}")
+ @logger.info("processed #{@processed} events, saving #{filelist.size} blobs and offsets to local registry #{registry_local_path+"/"+@pipe_id}")
  end
  @busy_writing_registry = false
  @last = Time.now.to_i
@@ -446,21 +513,35 @@ def save_registry(filelist)
  end
  end
 
+
  def learn_encapsulation
+ @logger.info("learn_encapsulation, this can be skipped by setting skip_learning => true. Or set both file_head and file_tail")
  # From one file, read first block and last block to learn head and tail
- # If the blobstorage can't be found, an error from farraday middleware will come with the text
- # org.jruby.ext.set.RubySet cannot be cast to class org.jruby.RubyFixnum
- blob = @blob_client.list_blobs(container, { maxresults: 1, prefix: @prefix }).first
- return if blob.nil?
- blocks = @blob_client.list_blob_blocks(container, blob.name)[:committed]
- # TODO add check for empty blocks and log error that the header and footer can't be learned and must be set in the config
- @logger.debug("using #{blob.name} to learn the json header and tail")
- @head = @blob_client.get_blob(container, blob.name, start_range: 0, end_range: blocks.first.size-1)[1]
- @logger.debug("learned header: #{@head}")
- length = blob.properties[:content_length].to_i
- offset = length - blocks.last.size
- @tail = @blob_client.get_blob(container, blob.name, start_range: offset, end_range: length-1)[1]
- @logger.debug("learned tail: #{@tail}")
+ begin
+ blobs = @blob_client.list_blobs(container, { max_results: 3, prefix: @prefix})
+ blobs.each do |blob|
+ unless blob.name == registry_path
+ begin
+ blocks = @blob_client.list_blob_blocks(container, blob.name)[:committed]
+ if blocks.first.name.start_with?('A00')
+ @logger.debug("using #{blob.name}/#{blocks.first.name} to learn the json header")
+ @head = @blob_client.get_blob(container, blob.name, start_range: 0, end_range: blocks.first.size-1)[1]
+ end
+ if blocks.last.name.start_with?('Z00')
+ @logger.debug("using #{blob.name}/#{blocks.last.name} to learn the json footer")
+ length = blob.properties[:content_length].to_i
+ offset = length - blocks.last.size
+ @tail = @blob_client.get_blob(container, blob.name, start_range: offset, end_range: length-1)[1]
+ @logger.debug("learned tail: #{@tail}")
+ end
+ rescue Exception => e
+ @logger.info("learn json one of the attempts failed #{e.message}")
+ end
+ end
+ end
+ rescue Exception => e
+ @logger.info("learn json header and footer failed because #{e.message}")
+ end
  end
 
  def resource(str)
data/logstash-input-azure_blob_storage.gemspec CHANGED
@@ -1,6 +1,6 @@
  Gem::Specification.new do |s|
  s.name = 'logstash-input-azure_blob_storage'
- s.version = '0.11.4'
+ s.version = '0.12.0'
  s.licenses = ['Apache-2.0']
  s.summary = 'This logstash plugin reads and parses data from Azure Storage Blobs.'
  s.description = <<-EOF
@@ -22,6 +22,6 @@ EOF
  # Gem dependencies
  s.add_runtime_dependency 'logstash-core-plugin-api', '~> 2.1'
  s.add_runtime_dependency 'stud', '~> 0.0.23'
- s.add_runtime_dependency 'azure-storage-blob', '~> 1.1'
- s.add_development_dependency 'logstash-devutils', '~> 1.0', '>= 1.0.0'
+ s.add_runtime_dependency 'azure-storage-blob', '~> 2', '>= 2.0.3'
+ #s.add_development_dependency 'logstash-devutils', '~> 2'
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: logstash-input-azure_blob_storage
  version: !ruby/object:Gem::Version
- version: 0.11.4
+ version: 0.12.0
  platform: ruby
  authors:
  - Jan Geertsma
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2020-05-23 00:00:00.000000000 Z
+ date: 2021-12-06 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  requirement: !ruby/object:Gem::Requirement
@@ -17,8 +17,8 @@ dependencies:
  - !ruby/object:Gem::Version
  version: '2.1'
  name: logstash-core-plugin-api
- type: :runtime
  prerelease: false
+ type: :runtime
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
@@ -31,8 +31,8 @@ dependencies:
  - !ruby/object:Gem::Version
  version: 0.0.23
  name: stud
- type: :runtime
  prerelease: false
+ type: :runtime
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
@@ -43,35 +43,21 @@ dependencies:
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '1.1'
+ version: '2'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 2.0.3
  name: azure-storage-blob
- type: :runtime
  prerelease: false
+ type: :runtime
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '1.1'
- - !ruby/object:Gem::Dependency
- requirement: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: 1.0.0
- - - "~>"
- - !ruby/object:Gem::Version
- version: '1.0'
- name: logstash-devutils
- type: :development
- prerelease: false
- version_requirements: !ruby/object:Gem::Requirement
- requirements:
+ version: '2'
  - - ">="
  - !ruby/object:Gem::Version
- version: 1.0.0
- - - "~>"
- - !ruby/object:Gem::Version
- version: '1.0'
+ version: 2.0.3
  description: " This gem is a Logstash plugin. It reads and parses data from Azure\
  \ Storage Blobs. The azure_blob_storage is a reimplementation to replace azureblob\
  \ from azure-diagnostics-tools/Logstash. It can deal with larger volumes and partial\
@@ -112,7 +98,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubygems_version: 3.0.6
+ rubygems_version: 3.1.6
  signing_key:
  specification_version: 4
  summary: This logstash plugin reads and parses data from Azure Storage Blobs.