microsoft-sentinel-logstash-output-plugin 1.0.1

checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA256:
+   metadata.gz: b1e2cf6ccbb30025280b4d8e8fd4d7d1aeaf08ab35d958f29330994c876fb715
+   data.tar.gz: 502651c1aea8fcde496e96d38abbd03ebcef59a7a8d440117b4d846ee9d08933
+ SHA512:
+   metadata.gz: 43f79af90fe870588d342b92f1248b3d66ddd804735f18166381126871b85bdfb071a6d9b08ca95f78e8e0a650b3ad9f5f03bffa60c29dd34f439c4c8b5f9152
+   data.tar.gz: ff11a0adf54fd4298d201aa0db71751f635b6c56c4dda9918299b32aa95b9d4ade947a717c10c79b395e9650988b0b29c3dd839d2bb5d2941e3c56b22d8364a3
data/CHANGELOG.md ADDED
@@ -0,0 +1,2 @@
+ ## 1.0.0
+ * Initial release of the Microsoft Sentinel output plugin for Logstash, using the Log Analytics DCR-based API.
data/Gemfile ADDED
@@ -0,0 +1,2 @@
+ source 'https://rubygems.org'
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) Microsoft Corporation. All rights reserved.
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,230 @@
+ # Microsoft Sentinel output plugin for Logstash
+
+ Microsoft Sentinel provides an output plugin for Logstash. Use this output plugin to send any log via Logstash to the Microsoft Sentinel/Log Analytics workspace, using the Log Analytics DCR-based API.
+ You may send logs to custom or standard tables.
+
+ Plugin version: v1.0.0
+ Released on: 2022-11-14
+
+ This plugin is currently in development and is free to use. We welcome contributions from the open source community on this project, and we request and appreciate feedback from users.
+
+
+ ## Steps to implement the output plugin
+ 1) Install the plugin
+ 2) Create a sample file
+ 3) Create the required DCR-related resources
+ 4) Configure the Logstash configuration file
+ 5) Basic logs transmission
+
+
+ ## 1. Install the plugin
+
+ Microsoft Sentinel provides a Logstash output plugin that sends logs to the Log Analytics workspace using the DCR-based logs API.
+ To install microsoft-sentinel-logstash-output-plugin, follow the [Logstash Offline Plugin Management instructions](<https://www.elastic.co/guide/en/logstash/current/offline-plugins.html>); a sketch of the typical commands appears at the end of this section.
+
+ Microsoft Sentinel's Logstash output plugin supports the following versions:
+ - Logstash 7, versions 7.0 to 7.17.10
+ - Logstash 8, versions 8.0 to 8.8.1
+
+ Please note that when using Logstash 8, it is recommended to disable ECS in the pipeline. For more information, refer to the [Logstash documentation](<https://www.elastic.co/guide/en/logstash/8.4/ecs-ls.html>).
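+
+ For reference, here is a minimal sketch of the install commands (the offline package path is illustrative; adjust it to your environment):
+ ```
+ # Online install, if the Logstash host has internet access:
+ bin/logstash-plugin install microsoft-sentinel-logstash-output-plugin
+
+ # Offline install from a plugin pack prepared per the instructions above:
+ bin/logstash-plugin install file:///path/to/logstash-offline-plugins.zip
+ ```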
+
+
+ ## 2. Create a sample file
+ To create a sample file, follow these steps:
+ 1) Copy the output plugin configuration below to your Logstash configuration file:
+ ```
+ output {
+     microsoft-sentinel-logstash-output-plugin {
+       create_sample_file => true
+       sample_file_path => "<enter the path to the directory in which the sample data will be written>" # for example: "c:\\temp" (for Windows) or "/var/log" (for Linux).
+     }
+ }
+ ```
+ Note: make sure that the path exists before creating the sample file.
+ 2) Start Logstash. The plugin will write up to 10 records to a sample file named "sampleFile<epoch seconds>.json" in the configured path
+ (for example: "c:\temp\sampleFile1648453501.json").
+
+
+ ### Configurations:
+ The following parameters are optional and should be used to create a sample file.
+ - **create_sample_file** - Boolean, false by default. When enabled, up to 10 events will be written to a sample JSON file.
+ - **sample_file_path** - String, empty by default. Required when create_sample_file is enabled. Must be a valid path in which to place the generated sample file.
+
+ ### Complete example
+ 1. Set the pipeline.conf with the following configuration:
+ ```
+ input {
+     generator {
+       lines => [ "This is a test log message"]
+       count => 10
+     }
+ }
+
+ output {
+     microsoft-sentinel-logstash-output-plugin {
+       create_sample_file => true
+       sample_file_path => "<enter the path to the directory in which the sample data will be written>" # for example: "c:\\temp" (for Windows) or "/var/log" (for Linux).
+     }
+ }
+ ```
+
+ 2. The following sample file will be generated:
+ ```
+ [
+     {
+         "host": "logstashMachine",
+         "sequence": 0,
+         "message": "This is a test log message",
+         "ls_timestamp": "2022-10-29T13:19:28.116Z",
+         "ls_version": "1"
+     },
+     ...
+ ]
+ ```
+
+ ## 3. Create the required DCR-related resources
+ To configure the Microsoft Sentinel Logstash plugin you first need to create the DCR-related resources. To create these resources, follow one of the following tutorials:
+ 1) To ingest the data into a custom table, use [Tutorial - Send custom logs to Azure Monitor Logs (preview) - Azure Monitor | Microsoft Docs](<https://docs.microsoft.com/azure/azure-monitor/logs/tutorial-custom-logs>). Note that as part of creating the table and the DCR you will need to provide the sample file that you created in the previous section.
+ 2) To ingest the data into a standard table like Syslog or CommonSecurityLog, use [Tutorial - Send custom logs to Azure Monitor Logs using resource manager templates - Azure Monitor | Microsoft Docs](<https://docs.microsoft.com/azure/azure-monitor/logs/tutorial-custom-logs-api>).
+
+
+ ## 4. Configure the Logstash configuration file
+
+ Use the tutorial from the previous section to retrieve the following attributes:
+ - **client_app_Id** - String, the 'Application (client) ID' value created in step #3 of the "Configure Application" section of the tutorial you used in the previous step.
+ - **client_app_secret** - String, the value of the client secret created in step #5 of the "Configure Application" section of the tutorial you used in the previous step.
+ - **tenant_id** - String, your subscription's tenant ID. You can find it under: Home -> Azure Active Directory -> Overview, under 'Basic information'.
+ - **data_collection_endpoint** - String, the value of the logsIngestion URI (see step #3 of the "Create data collection endpoint" section in [Tutorial - Send custom logs to Azure Monitor Logs using resource manager templates - Azure Monitor | Microsoft Docs](<https://docs.microsoft.com/azure/azure-monitor/logs/tutorial-custom-logs-api#create-data-collection-endpoint>)).
+ - **dcr_immutable_id** - String, the value of the DCR immutableId (see the "Collect information from DCR" section in [Tutorial - Send custom logs to Azure Monitor Logs (preview) - Azure Monitor | Microsoft Docs](<https://docs.microsoft.com/azure/azure-monitor/logs/tutorial-custom-logs#collect-information-from-dcr>)).
+ - **dcr_stream_name** - String, the name of the data stream. Go to the JSON view of the DCR as explained in the "Collect information from DCR" section in [Tutorial - Send custom logs to Azure Monitor Logs (preview) - Azure Monitor | Microsoft Docs](<https://docs.microsoft.com/azure/azure-monitor/logs/tutorial-custom-logs#collect-information-from-dcr>) and copy the value of the "dataFlows -> streams" property (see the illustrative excerpt below).
+
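+ For orientation, here is an illustrative excerpt of a DCR's JSON view (all values are placeholders); the stream name to copy appears under "dataFlows" -> "streams":
+ ```
+ {
+     "properties": {
+         "dataFlows": [
+             {
+                 "streams": [ "Custom-MyTableRawData" ],
+                 "destinations": [ "<your destination>" ]
+             }
+         ]
+     }
+ }
+ ```
+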
+ After retrieving the required values, replace the output section of the Logstash configuration file created in the previous steps with the example below. Then, replace the strings in the brackets with the corresponding values. Make sure you change the "create_sample_file" attribute to false.
+
+ Here is an example of the output plugin configuration section:
+ ```
+ output {
+     microsoft-sentinel-logstash-output-plugin {
+       client_app_Id => "<enter your client_app_id value here>"
+       client_app_secret => "<enter your client_app_secret value here>"
+       tenant_id => "<enter your tenant id here>"
+       data_collection_endpoint => "<enter your DCE logsIngestion URI here>"
+       dcr_immutable_id => "<enter your DCR immutableId here>"
+       dcr_stream_name => "<enter your stream name here>"
+       create_sample_file => false
+       sample_file_path => "c:\\temp"
+     }
+ }
+ ```
+ ### Optional configuration
+ - **key_names** - Array of strings. Use this if you wish to send a subset of the columns to Log Analytics.
+ - **plugin_flush_interval** - Number, 5 by default. Defines the maximum time difference (in seconds) between sending two messages to Log Analytics.
+ - **retransmission_time** - Number, 10 by default. Sets the amount of time (in seconds) allowed for retransmitting messages once sending has failed.
+ - **compress_data** - Boolean, false by default. When true, the event data is compressed before calling the API. Recommended for high-throughput pipelines.
+ - **proxy** - String, empty by default. Specifies which proxy URL to use for all API calls.
+
+ Security notice: For security reasons, we recommend not stating client_app_Id, client_app_secret, tenant_id, data_collection_endpoint, and dcr_immutable_id explicitly in your Logstash configuration.
+ It is best to store this sensitive information in a Logstash keystore, as described in ['Secrets keystore'](<https://www.elastic.co/guide/en/logstash/current/keystore.html>); see the sketch below.
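+
+ For example, a minimal sketch using Logstash's keystore tooling (the key names here are illustrative and must match the ${...} references in your pipeline):
+ ```
+ bin/logstash-keystore create
+ bin/logstash-keystore add CLIENT_APP_ID
+ bin/logstash-keystore add CLIENT_APP_SECRET
+ bin/logstash-keystore add TENANT_ID
+ bin/logstash-keystore add DATA_COLLECTION_ENDPOINT
+ bin/logstash-keystore add DCR_IMMUTABLE_ID
+ ```
+ The stored values can then be referenced in the configuration as ${CLIENT_APP_ID}, ${CLIENT_APP_SECRET}, and so on, as shown in the advanced configuration below.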
+
+
+ ## 5. Basic logs transmission
+
+ Here is an example configuration that parses incoming Syslog data into a custom stream named "Custom-MyTableRawData".
+
+ ### Example Configuration
+
+ - Using a Filebeat input pipe
+
+ ```
+ input {
+     beats {
+         port => "5044"
+     }
+ }
+ filter {
+ }
+ output {
+     microsoft-sentinel-logstash-output-plugin {
+       client_app_Id => "619c1731-15ca-4403-9c61-xxxxxxxxxxxx"
+       client_app_secret => "xxxxxxxxxxxxxxxx"
+       tenant_id => "72f988bf-86f1-41af-91ab-xxxxxxxxxxxx"
+       data_collection_endpoint => "https://my-customlogsv2-test-jz2a.eastus2-1.ingest.monitor.azure.com"
+       dcr_immutable_id => "dcr-xxxxxxxxxxxxxxxxac23b8978251433a"
+       dcr_stream_name => "Custom-MyTableRawData"
+       proxy => "http://proxy.example.com"
+     }
+ }
+ ```
+ - Or using the TCP input pipe
+
+ ```
+ input {
+     tcp {
+         port => "514"
+         type => syslog # optional; will affect the log type in the table
+     }
+ }
+ filter {
+ }
+ output {
+     microsoft-sentinel-logstash-output-plugin {
+       client_app_Id => "619c1731-15ca-4403-9c61-xxxxxxxxxxxx"
+       client_app_secret => "xxxxxxxxxxxxxxxx"
+       tenant_id => "72f988bf-86f1-41af-91ab-xxxxxxxxxxxx"
+       data_collection_endpoint => "https://my-customlogsv2-test-jz2a.eastus2-1.ingest.monitor.azure.com"
+       dcr_immutable_id => "dcr-xxxxxxxxxxxxxxxxac23b8978251433a"
+       dcr_stream_name => "Custom-MyTableRawData"
+     }
+ }
+ ```
+
+ <u>Advanced Configuration</u>
+ ```
+ input {
+     syslog {
+         port => 514
+     }
+ }
+
+ output {
+     microsoft-sentinel-logstash-output-plugin {
+       client_app_Id => "${CLIENT_APP_ID}"
+       client_app_secret => "${CLIENT_APP_SECRET}"
+       tenant_id => "${TENANT_ID}"
+       data_collection_endpoint => "${DATA_COLLECTION_ENDPOINT}"
+       dcr_immutable_id => "${DCR_IMMUTABLE_ID}"
+       dcr_stream_name => "Custom-MyTableRawData"
+       key_names => ['PRI','TIME_TAG','HOSTNAME','MSG']
+     }
+ }
+ ```
+
+ Now you can run Logstash with the example configuration and send mock data using the 'logger' command.
+
+ For example:
+ ```
+ logger -p local4.warn --rfc3164 --tcp -t CEF "0|Microsoft|Device|cef-test|example|data|1|here is some more data for the example" -P 514 -d -n 127.0.0.1
+ ```
+
+ This will produce the following content in the sample file:
+
+ ```
+ [
+     {
+         "logsource": "logstashMachine",
+         "facility": 20,
+         "severity_label": "Warning",
+         "severity": 4,
+         "timestamp": "Apr 7 08:26:04",
+         "program": "CEF:",
+         "host": "127.0.0.1",
+         "facility_label": "local4",
+         "priority": 164,
+         "message": "0|Microsoft|Device|cef-test|example|data|1|here is some more data for the example",
+         "ls_timestamp": "2022-04-07T08:26:04.000Z",
+         "ls_version": "1"
+     }
+ ]
+ ```
data/lib/logstash/outputs/microsoft-sentinel-logstash-output-plugin.rb ADDED
@@ -0,0 +1,103 @@
+ # encoding: utf-8
+ require "logstash/outputs/base"
+ require "logstash/namespace"
+ require "logstash/sentinel/logstashLoganalyticsConfiguration"
+ require "logstash/sentinel/sampleFileCreator"
+ require "logstash/sentinel/logsSender"
+
+
+ class LogStash::Outputs::MicrosoftSentinelOutput < LogStash::Outputs::Base
+
+   config_name "microsoft-sentinel-logstash-output-plugin"
+
+   # Stating that the output plugin will run in concurrent mode
+   concurrency :shared
+
+   # Your registered app ID
+   config :client_app_Id, :validate => :string
+
+   # The registered app's secret, required by the Azure Log Analytics REST API
+   config :client_app_secret, :validate => :string
+
+   # Your Operations Management Suite Tenant ID
+   config :tenant_id, :validate => :string
+
+   # Your data collection endpoint (DCE)
+   config :data_collection_endpoint, :validate => :string
+
+   # Your data collection rule ID
+   config :dcr_immutable_id, :validate => :string
+
+   # Your DCR data stream name
+   config :dcr_stream_name, :validate => :string
+
+   # Subset of keys to send to the Azure Log Analytics workspace
+   config :key_names, :validate => :array, :default => []
+
+   # Max number of seconds to wait between flushes. Default 5
+   config :plugin_flush_interval, :validate => :number, :default => 5
+
+   # Factor for adjusting the number of messages sent per request (used when amount_resizing is enabled)
+   config :decrease_factor, :validate => :number, :default => 100
+
+   # This will trigger message amount resizing in a REST request to LA
+   config :amount_resizing, :validate => :boolean, :default => true
+
+   # Setting the default number of messages sent per request.
+   # If this is set with amount_resizing=false, each request will contain up to max_items messages
+   config :max_items, :validate => :number, :default => 2000
+
+   # Setting the proxy to be used by the Azure Log Analytics REST client
+   config :proxy, :validate => :string, :default => ''
+
+   # This will set the amount of time given for retransmitting messages once sending has failed
+   config :retransmission_time, :validate => :number, :default => 10
+
+   # Compress the message body before sending to LA
+   config :compress_data, :validate => :boolean, :default => false
+
+   # Generate a sample file from incoming events
+   config :create_sample_file, :validate => :boolean, :default => false
+
+   # Path in which to place the generated sample file
+   config :sample_file_path, :validate => :string
+
+   public
+   def register
+     @logstash_configuration = build_logstash_configuration()
+
+     # Validate configuration correctness
+     @logstash_configuration.validate_configuration()
+
+     # Write a sample file when create_sample_file is enabled; otherwise send events to Log Analytics
+     @events_handler = @logstash_configuration.create_sample_file ?
+       LogStash::Outputs::MicrosoftSentinelOutputInternal::SampleFileCreator::new(@logstash_configuration) :
+       LogStash::Outputs::MicrosoftSentinelOutputInternal::LogsSender::new(@logstash_configuration)
+   end # def register
+
+   def multi_receive(events)
+     @events_handler.handle_events(events)
+   end # def multi_receive
+
+   def close
+     @events_handler.close
+   end
+
+   private
+
+   # Building the logstash object configuration from the output configuration provided by the user
+   # Returns a LogstashLoganalyticsOutputConfiguration populated with the configuration values
+   def build_logstash_configuration()
+     logstash_configuration = LogStash::Outputs::MicrosoftSentinelOutputInternal::LogstashLoganalyticsOutputConfiguration::new(@client_app_Id, @client_app_secret, @tenant_id, @data_collection_endpoint, @dcr_immutable_id, @dcr_stream_name, @compress_data, @create_sample_file, @sample_file_path, @logger)
+     logstash_configuration.key_names = @key_names
+     logstash_configuration.plugin_flush_interval = @plugin_flush_interval
+     logstash_configuration.decrease_factor = @decrease_factor
+     logstash_configuration.amount_resizing = @amount_resizing
+     logstash_configuration.max_items = @max_items
+     logstash_configuration.proxy = @proxy
+     logstash_configuration.retransmission_time = @retransmission_time
+
+     return logstash_configuration
+   end # def build_logstash_configuration
+
+ end # class LogStash::Outputs::MicrosoftSentinelOutput
data/lib/logstash/sentinel/customSizeBasedBuffer.rb ADDED
@@ -0,0 +1,293 @@
+ # This code is from a PR for the official repo of ruby-stud
+ # with a small change to calculating the event size in the var_size function
+ # https://github.com/jordansissel/ruby-stud/pull/19
+ #
+ # @author {Alex Dean}[http://github.com/alexdean]
+ #
+ # Implements a generic framework for accepting events which are later flushed
+ # in batches. Flushing occurs whenever +:max_items+ or +:max_interval+ (seconds)
+ # has been reached or if the event size outgrows +:flush_each+ (bytes)
+ #
+ # Including class must implement +flush+, which will be called with all
+ # accumulated items either when the output buffer fills (+:max_items+ or
+ # +:flush_each+) or when a fixed amount of time (+:max_interval+) passes.
+ #
+ # == batch_receive and flush
+ # General receive/flush can be implemented in one of two ways.
+ #
+ # === batch_receive(event) / flush(events)
+ # +flush+ will receive an array of events which were passed to +buffer_receive+.
+ #
+ #   batch_receive('one')
+ #   batch_receive('two')
+ #
+ # will cause a flush invocation like
+ #
+ #   flush(['one', 'two'])
+ #
+ # === batch_receive(event, group) / flush(events, group)
+ # flush() will receive an array of events, plus a grouping key.
+ #
+ #   batch_receive('one',   :server => 'a')
+ #   batch_receive('two',   :server => 'b')
+ #   batch_receive('three', :server => 'a')
+ #   batch_receive('four',  :server => 'b')
+ #
+ # will result in the following flush calls
+ #
+ #   flush(['one', 'three'], {:server => 'a'})
+ #   flush(['two', 'four'],  {:server => 'b'})
+ #
+ # Grouping keys can be anything which are valid Hash keys. (They don't have to
+ # be hashes themselves.) Strings or Fixnums work fine. Use anything which you'd
+ # like to receive in your +flush+ method to help enable different handling for
+ # various groups of events.
+ #
+ # == on_flush_error
+ # Including class may implement +on_flush_error+, which will be called with an
+ # Exception instance whenever buffer_flush encounters an error.
+ #
+ # * +buffer_flush+ will automatically re-try failed flushes, so +on_flush_error+
+ #   should not try to implement retry behavior.
+ # * Exceptions occurring within +on_flush_error+ are not handled by
+ #   +buffer_flush+.
+ #
+ # == on_full_buffer_receive
+ # Including class may implement +on_full_buffer_receive+, which will be called
+ # whenever +buffer_receive+ is called while the buffer is full.
+ #
+ # +on_full_buffer_receive+ will receive a Hash like <code>{:pending => 30,
+ # :outgoing => 20}</code> which describes the internal state of the module at
+ # the moment.
+ #
+ # == final flush
+ # Including class should call <code>buffer_flush(:final => true)</code>
+ # during a teardown/shutdown routine (after the last call to buffer_receive)
+ # to ensure that all accumulated messages are flushed.
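+ #
+ # == usage sketch (illustrative example, not taken from the gem itself;
+ # it assumes the default flush(events, final) form and the option names
+ # documented in +buffer_initialize+ below)
+ #
+ #   class StdoutBatcher
+ #     include LogStash::Outputs::MicrosoftSentinelOutputInternal::CustomSizeBasedBuffer
+ #
+ #     def initialize
+ #       # flush when 100 events or ~1 MB accumulate, or every 5 seconds
+ #       buffer_initialize(:max_items => 100, :flush_each => 1_000_000, :max_interval => 5)
+ #     end
+ #
+ #     # receives the accumulated events once a limit is hit
+ #     def flush(events, final = false)
+ #       puts "flushing #{events.size} events (final=#{final})"
+ #     end
+ #   end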
+ module LogStash; module Outputs; class MicrosoftSentinelOutputInternal
+ module CustomSizeBasedBuffer
+
+   public
+   # Initialize the buffer.
+   #
+   # Call directly from your constructor if you wish to set some non-default
+   # options. Otherwise buffer_initialize will be called automatically during the
+   # first buffer_receive call.
+   #
+   # Options:
+   # * :max_items, Max number of items to buffer before flushing. Default 50.
+   # * :flush_each, Flush each bytes of buffer. Default 0 (no flushing fired by
+   #   a buffer size).
+   # * :max_interval, Max number of seconds to wait between flushes. Default 5.
+   # * :logger, A logger to write log messages to. No default. Optional.
+   #
+   # @param [Hash] options
+   def buffer_initialize(options={})
+     if ! self.class.method_defined?(:flush)
+       raise ArgumentError, "Any class including Stud::Buffer must define a flush() method."
+     end
+
+     @buffer_config = {
+       :max_items => options[:max_items] || 50,
+       :flush_each => options[:flush_each].to_i || 0,
+       :max_interval => options[:max_interval] || 5,
+       :logger => options[:logger] || nil,
+       :has_on_flush_error => self.class.method_defined?(:on_flush_error),
+       :has_on_full_buffer_receive => self.class.method_defined?(:on_full_buffer_receive)
+     }
+     @buffer_state = {
+       # items accepted from including class
+       :pending_items => {},
+       :pending_count => 0,
+       :pending_size => 0,
+
+       # guard access to pending_items & pending_count & pending_size
+       :pending_mutex => Mutex.new,
+
+       # items which are currently being flushed
+       :outgoing_items => {},
+       :outgoing_count => 0,
+       :outgoing_size => 0,
+
+       # ensure only 1 flush is operating at once
+       :flush_mutex => Mutex.new,
+
+       # data for timed flushes
+       :last_flush => Time.now.to_i,
+       :timer => Thread.new do
+         loop do
+           sleep(@buffer_config[:max_interval])
+           buffer_flush(:force => true)
+         end
+       end
+     }
+
+     # events we've accumulated
+     buffer_clear_pending
+   end
+
+   # Determine if +:max_items+ or +:flush_each+ has been reached.
+   #
+   # buffer_receive calls will block while <code>buffer_full? == true</code>.
+   #
+   # @return [bool] Is the buffer full?
+   def buffer_full?
+     (@buffer_state[:pending_count] + @buffer_state[:outgoing_count] >= @buffer_config[:max_items]) || \
+     (@buffer_config[:flush_each] != 0 && @buffer_state[:pending_size] + @buffer_state[:outgoing_size] >= @buffer_config[:flush_each])
+   end
+
+   # Save an event for later delivery
+   #
+   # Events are grouped by the (optional) group parameter you provide.
+   # Groups of events, plus the group name, are later passed to +flush+.
+   #
+   # This call will block if +:max_items+ or +:flush_each+ has been reached.
+   #
+   # @see Stud::Buffer The overview has more information on grouping and flushing.
+   #
+   # @param event An item to buffer for flushing later.
+   # @param group Optional grouping key. All events with the same key will be
+   #   passed to +flush+ together, along with the grouping key itself.
+   def buffer_receive(event, group=nil)
+     buffer_initialize if ! @buffer_state
+
+     # block if we've accumulated too many events
+     while buffer_full? do
+       on_full_buffer_receive(
+         :pending => @buffer_state[:pending_count],
+         :outgoing => @buffer_state[:outgoing_count]
+       ) if @buffer_config[:has_on_full_buffer_receive]
+       sleep 0.1
+     end
+     @buffer_state[:pending_mutex].synchronize do
+       @buffer_state[:pending_items][group] << event
+       @buffer_state[:pending_count] += 1
+       @buffer_state[:pending_size] += var_size(event) if @buffer_config[:flush_each] != 0
+     end
+
+     buffer_flush
+   end
+
+   # Try to flush events.
+   #
+   # Returns immediately if flushing is not necessary/possible at the moment:
+   # * :max_items or :flush_each have not been accumulated
+   # * :max_interval seconds have not elapsed since the last flush
+   # * another flush is in progress
+   #
+   # <code>buffer_flush(:force => true)</code> will cause a flush to occur even
+   # if +:max_items+ or +:flush_each+ or +:max_interval+ have not been reached. A forced flush
+   # will still return immediately (without flushing) if another flush is
+   # currently in progress.
+   #
+   # <code>buffer_flush(:final => true)</code> is identical to <code>buffer_flush(:force => true)</code>,
+   # except that if another flush is already in progress, <code>buffer_flush(:final => true)</code>
+   # will block/wait for the other flush to finish before proceeding.
+   #
+   # @param [Hash] options Optional. May be <code>{:force => true}</code> or <code>{:final => true}</code>.
+   # @return [Fixnum] The number of items successfully passed to +flush+.
+   def buffer_flush(options={})
+     force = options[:force] || options[:final]
+     final = options[:final]
+
+     # final flush will wait for lock, so we are sure to flush out all buffered events
+     if options[:final]
+       @buffer_state[:flush_mutex].lock
+     elsif ! @buffer_state[:flush_mutex].try_lock # failed to get lock, another flush already in progress
+       return 0
+     end
+
+     items_flushed = 0
+
+     begin
+       return 0 if @buffer_state[:pending_count] == 0
+
+       # compute time_since_last_flush only when some item is pending
+       time_since_last_flush = get_time_since_last_flush
+
+       return 0 if (!force) &&
+          (@buffer_state[:pending_count] < @buffer_config[:max_items]) &&
+          (@buffer_config[:flush_each] == 0 || @buffer_state[:pending_size] < @buffer_config[:flush_each]) &&
+          (time_since_last_flush < @buffer_config[:max_interval])
+
+       @buffer_state[:pending_mutex].synchronize do
+         @buffer_state[:outgoing_items] = @buffer_state[:pending_items]
+         @buffer_state[:outgoing_count] = @buffer_state[:pending_count]
+         @buffer_state[:outgoing_size] = @buffer_state[:pending_size]
+         buffer_clear_pending
+       end
+       @buffer_config[:logger].debug("Flushing output",
+         :outgoing_count => @buffer_state[:outgoing_count],
+         :time_since_last_flush => time_since_last_flush,
+         :outgoing_events => @buffer_state[:outgoing_items],
+         :batch_timeout => @buffer_config[:max_interval],
+         :force => force,
+         :final => final
+       ) if @buffer_config[:logger]
+
+       @buffer_state[:outgoing_items].each do |group, events|
+         begin
+
+           if group.nil?
+             flush(events, final)
+           else
+             flush(events, group, final)
+           end
+
+           @buffer_state[:outgoing_items].delete(group)
+           events_size = events.size
+           @buffer_state[:outgoing_count] -= events_size
+           if @buffer_config[:flush_each] != 0
+             events_volume = 0
+             events.each do |event|
+               events_volume += var_size(event)
+             end
+             @buffer_state[:outgoing_size] -= events_volume
+           end
+           items_flushed += events_size
+
+         rescue => e
+           @buffer_config[:logger].warn("Failed to flush outgoing items",
+             :outgoing_count => @buffer_state[:outgoing_count],
+             :exception => e,
+             :backtrace => e.backtrace
+           ) if @buffer_config[:logger]
+
+           if @buffer_config[:has_on_flush_error]
+             on_flush_error e
+           end
+
+           sleep 1
+           retry
+         end
+         @buffer_state[:last_flush] = Time.now.to_i
+       end
+
+     ensure
+       @buffer_state[:flush_mutex].unlock
+     end
+
+     return items_flushed
+   end
+
+   private
+   def buffer_clear_pending
+     @buffer_state[:pending_items] = Hash.new { |h, k| h[k] = [] }
+     @buffer_state[:pending_count] = 0
+     @buffer_state[:pending_size] = 0
+   end
+
+   private
+   def var_size(var)
+     # Calculate the event size as JSON, assuming the event is a hash
+     return var.to_json.bytesize + 2
+   end
+
+   protected
+   def get_time_since_last_flush
+     Time.now.to_i - @buffer_state[:last_flush]
+   end
+
+ end
+ end; end; end