logstash-output-kusto 1.0.5-java → 1.0.6-java

data/README.md CHANGED
@@ -1,94 +1,94 @@
1
- # Logstash Output Plugin for Azure Data Explorer (Kusto)
2
-
3
- ![build](https://github.com/Azure/logstash-output-kusto/workflows/build/badge.svg)
4
- ![build](https://github.com/Azure/logstash-output-kusto/workflows/build/badge.svg?branch=master)
5
- [![Gem](https://img.shields.io/gem/v/logstash-output-kusto.svg)](https://rubygems.org/gems/logstash-output-kusto)
6
- [![Gem](https://img.shields.io/gem/dt/logstash-output-kusto.svg)](https://rubygems.org/gems/logstash-output-kusto)
7
-
8
- This is a plugin for [Logstash](https://github.com/elastic/logstash).
9
-
10
- It is fully free and open source. The license is Apache 2.0.
11
-
12
- This Azure Data Explorer (ADX) Logstash plugin enables you to send events from Logstash to an **Azure Data Explorer** database for later analysis.
13
-
14
- ## Requirements
15
-
16
- - Logstash version 6+. [Installation instructions](https://www.elastic.co/guide/en/logstash/current/installing-logstash.html)
17
- - Azure Data Explorer cluster with a database. Read [Create a cluster and database](https://docs.microsoft.com/en-us/azure/data-explorer/create-cluster-database-portal) for more information.
18
- - AAD Application credentials with permission to ingest data into Azure Data Explorer. Read [Creating an AAD Application](https://docs.microsoft.com/en-us/azure/kusto/management/access-control/how-to-provision-aad-app) for more information.
19
-
20
- ## Installation
21
-
22
- To make the Azure Data Explorer plugin available in your Logstash environment, run the following command:
23
- ```sh
24
- bin/logstash-plugin install logstash-output-kusto
25
- ```
26
-
27
- ## Configuration
28
-
29
- Configure the plugin before sending events from Logstash to Azure Data Explorer. The following example shows the minimum configuration you need to provide; it should be enough for most use cases:
30
-
31
- ```ruby
32
- output {
33
- kusto {
34
- path => "/tmp/kusto/%{+YYYY-MM-dd-HH-mm}.txt"
35
- ingest_url => "https://ingest-<cluster-name>.kusto.windows.net/"
36
- app_id => "<application id>"
37
- app_key => "<application key/secret>"
38
- app_tenant => "<tenant id>"
39
- database => "<database name>"
40
- table => "<target table>"
41
- json_mapping => "<mapping name>"
42
- proxy_host => "<proxy host>"
43
- proxy_port => <proxy port>
44
- proxy_protocol => <"http"|"https">
45
- }
46
- }
47
- ```
48
- More information about configuring Logstash can be found in the [Logstash configuration guide](https://www.elastic.co/guide/en/logstash/current/configuration.html).
49
-
50
- ### Available Configuration Keys
51
-
52
- | Parameter Name | Description | Notes |
53
- | --- | --- | --- |
54
- | **path** | The plugin writes events to temporary files before sending them to ADX. This parameter includes a path where files should be written and a time expression for file rotation to trigger an upload to the ADX service. The example above rotates the files every minute; check the Logstash docs for more information on time expressions. | Required |
55
- | **ingest_url** | The Kusto endpoint for ingestion-related communication. See it on the Azure Portal.| Required|
56
- | **app_id, app_key, app_tenant**| Credentials required to connect to the ADX service. Be sure to use an application with 'ingest' privileges. | Required|
57
- | **database** | Name of the database into which events are written | Required |
58
- | **table** | Name of the target table into which events are written | Required |
59
- | **json_mapping** | Name of the ingestion mapping that maps each attribute of the incoming event JSON to the appropriate column in the table. The mapping must be a JSON mapping, as JSON is the interface between Logstash and Kusto. | Required |
60
- | **recovery** | If set to true (the default), the plugin attempts to resend pre-existing temp files found in the path on startup. | |
61
- | **delete_temp_files** | Determines whether temp files are deleted after a successful upload (default is true; set to false for debugging purposes only). | |
62
- | **flush_interval** | The time (in seconds) between flushes of writes to temporary files. The default is 2 seconds; 0 flushes on every event. Increase this value to reduce IO calls, but keep in mind that buffered events will be lost in case of an abrupt failure. | |
63
- | **proxy_host** | The proxy hostname for redirecting traffic to Kusto.| |
64
- | **proxy_port** | The port of the proxy. Defaults to 80. | |
65
- | **proxy_protocol** | The proxy server protocol; one of http or https. | |
66
-
67
- > Note: `LS_JAVA_OPTS` can also be used to set the proxy parameters (using export or SET).
68
-
69
- ```bash
70
- export LS_JAVA_OPTS="-Dhttp.proxyHost=1.2.3.4 -Dhttp.proxyPort=8989 -Dhttps.proxyHost=1.2.3.4 -Dhttps.proxyPort=8989"
71
- ```
72
-
73
-
74
- ## Development Requirements
75
-
76
- - OpenJDK **8 64-bit** (https://www.openlogic.com/openjdk-downloads)
77
- - JRuby 9.2 or higher, running on OpenJDK 8 64-bit
78
- - Logstash, running on OpenJDK 8 64-bit
79
-
80
- *It is recommended to use the JDK and JRuby bundled with Logstash to avoid compatibility issues.*
81
-
82
- To fully build the gem, run:
83
-
84
- ```shell
85
- bundle install
86
- lock_jars
87
- gem build
88
- ```
89
-
90
- ## Contributing
91
-
92
- All contributions are welcome: ideas, patches, documentation, bug reports, and complaints.
93
- Programming is not a required skill. It is more important to the community that you are able to contribute.
94
- For more information about contributing, see the [CONTRIBUTING](https://github.com/elastic/logstash/blob/master/CONTRIBUTING.md) file.
1
+ # Logstash Output Plugin for Azure Data Explorer (Kusto)
2
+
3
+ ![build](https://github.com/Azure/logstash-output-kusto/workflows/build/badge.svg)
4
+ ![build](https://github.com/Azure/logstash-output-kusto/workflows/build/badge.svg?branch=master)
5
+ [![Gem](https://img.shields.io/gem/v/logstash-output-kusto.svg)](https://rubygems.org/gems/logstash-output-kusto)
6
+ [![Gem](https://img.shields.io/gem/dt/logstash-output-kusto.svg)](https://rubygems.org/gems/logstash-output-kusto)
7
+
8
+ This is a plugin for [Logstash](https://github.com/elastic/logstash).
9
+
10
+ It is fully free and open source. The license is Apache 2.0.
11
+
12
+ This Azure Data Explorer (ADX) Logstash plugin enables you to send events from Logstash to an **Azure Data Explorer** database for later analysis.
13
+
14
+ ## Requirements
15
+
16
+ - Logstash version 6+. [Installation instructions](https://www.elastic.co/guide/en/logstash/current/installing-logstash.html)
17
+ - Azure Data Explorer cluster with a database. Read [Create a cluster and database](https://docs.microsoft.com/en-us/azure/data-explorer/create-cluster-database-portal) for more information.
18
+ - AAD Application credentials with permission to ingest data into Azure Data Explorer. Read [Creating an AAD Application](https://docs.microsoft.com/en-us/azure/kusto/management/access-control/how-to-provision-aad-app) for more information.
19
+
20
+ ## Installation
21
+
22
+ To make the Azure Data Explorer plugin available in your Logstash environment, run the following command:
23
+ ```sh
24
+ bin/logstash-plugin install logstash-output-kusto
25
+ ```
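
To confirm the plugin is available after installation, you can list Logstash's installed plugins. This is an optional check, shown here as a suggestion rather than a step from the original README:

```sh
# Optional sanity check: the plugin should appear in the installed-plugin list
bin/logstash-plugin list | grep kusto
```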
26
+
27
+ ## Configuration
28
+
29
+ Configure the plugin before sending events from Logstash to Azure Data Explorer. The following example shows the minimum configuration you need to provide; it should be enough for most use cases:
30
+
31
+ ```ruby
32
+ output {
33
+ kusto {
34
+ path => "/tmp/kusto/%{+YYYY-MM-dd-HH-mm}.txt"
35
+ ingest_url => "https://ingest-<cluster-name>.kusto.windows.net/"
36
+ app_id => "<application id>"
37
+ app_key => "<application key/secret>"
38
+ app_tenant => "<tenant id>"
39
+ database => "<database name>"
40
+ table => "<target table>"
41
+ json_mapping => "<mapping name>"
42
+ proxy_host => "<proxy host>"
43
+ proxy_port => <proxy port>
44
+ proxy_protocol => <"http"|"https">
45
+ }
46
+ }
47
+ ```
48
+ More information about configuring Logstash can be found in the [Logstash configuration guide](https://www.elastic.co/guide/en/logstash/current/configuration.html).
49
+
50
+ ### Available Configuration Keys
51
+
52
+ | Parameter Name | Description | Notes |
53
+ | --- | --- | --- |
54
+ | **path** | The plugin writes events to temporary files before sending them to ADX. This parameter includes a path where files should be written and a time expression for file rotation to trigger an upload to the ADX service. The example above rotates the files every minute; check the Logstash docs for more information on time expressions. | Required |
55
+ | **ingest_url** | The Kusto endpoint for ingestion-related communication. See it on the Azure Portal.| Required|
56
+ | **app_id, app_key, app_tenant**| Credentials required to connect to the ADX service. Be sure to use an application with 'ingest' privileges. | Required|
57
+ | **database** | Name of the database into which events are written | Required |
58
+ | **table** | Name of the target table into which events are written | Required |
59
+ | **json_mapping** | Name of the ingestion mapping that maps each attribute of the incoming event JSON to the appropriate column in the table. The mapping must be a JSON mapping, as JSON is the interface between Logstash and Kusto. | Required |
60
+ | **recovery** | If set to true (the default), the plugin attempts to resend pre-existing temp files found in the path on startup. | |
61
+ | **delete_temp_files** | Determines whether temp files are deleted after a successful upload (default is true; set to false for debugging purposes only). | |
62
+ | **flush_interval** | The time (in seconds) between flushes of writes to temporary files. The default is 2 seconds; 0 flushes on every event. Increase this value to reduce IO calls, but keep in mind that buffered events will be lost in case of an abrupt failure. | |
63
+ | **proxy_host** | The proxy hostname for redirecting traffic to Kusto.| |
64
+ | **proxy_port** | The port of the proxy. Defaults to 80. | |
65
+ | **proxy_protocol** | The proxy server protocol; one of http or https. | |
66
+
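As a rough sketch of how the optional keys from this table combine with the required ones, the snippet below extends the minimal example shown earlier; all values are illustrative placeholders, not recommendations:

```ruby
output {
  kusto {
    # Required settings (same as the minimal example above)
    path         => "/tmp/kusto/%{+YYYY-MM-dd-HH-mm}.txt"
    ingest_url   => "https://ingest-<cluster-name>.kusto.windows.net/"
    app_id       => "<application id>"
    app_key      => "<application key/secret>"
    app_tenant   => "<tenant id>"
    database     => "<database name>"
    table        => "<target table>"
    json_mapping => "<mapping name>"

    # Optional tuning keys described in the table above (illustrative values)
    recovery          => true   # resend leftover temp files found under `path` on startup
    delete_temp_files => true   # keep temp files only while debugging
    flush_interval    => 10     # flush to temp files every 10 seconds to reduce IO
  }
}
```
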
67
+ > Note: `LS_JAVA_OPTS` can also be used to set the proxy parameters (using export or SET).
68
+
69
+ ```bash
70
+ export LS_JAVA_OPTS="-Dhttp.proxyHost=1.2.3.4 -Dhttp.proxyPort=8989 -Dhttps.proxyHost=1.2.3.4 -Dhttps.proxyPort=8989"
71
+ ```
72
+
73
+
74
+ ## Development Requirements
75
+
76
+ - OpenJDK **8 64-bit** (https://www.openlogic.com/openjdk-downloads)
77
+ - JRuby 9.2 or higher, running on OpenJDK 8 64-bit
78
+ - Logstash, running on OpenJDK 8 64-bit
79
+
80
+ *It is recommended to use the JDK and JRuby bundled with Logstash to avoid compatibility issues.*
81
+
82
+ To fully build the gem, run:
83
+
84
+ ```shell
85
+ bundle install
86
+ lock_jars
87
+ gem build
88
+ ```
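
Once built, the resulting `.gem` file can be installed into a Logstash installation for local testing. The exact file name depends on the version you built, so the path below is only a placeholder:

```shell
# Install the locally built gem into Logstash (adjust the path and version to your build output)
bin/logstash-plugin install /path/to/logstash-output-kusto-<version>.gem
```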
89
+
90
+ ## Contributing
91
+
92
+ All contributions are welcome: ideas, patches, documentation, bug reports, and complaints.
93
+ Programming is not a required skill. It is more important to the community that you are able to contribute.
94
+ For more information about contributing, see the [CONTRIBUTING](https://github.com/elastic/logstash/blob/master/CONTRIBUTING.md) file.
data/SECURITY.md CHANGED
@@ -1,41 +1,41 @@
1
- <!-- BEGIN MICROSOFT SECURITY.MD V0.0.7 BLOCK -->
2
-
3
- ## Security
4
-
5
- Microsoft takes the security of our software products and services seriously, which includes all source code repositories managed through our GitHub organizations, which include [Microsoft](https://github.com/Microsoft), [Azure](https://github.com/Azure), [DotNet](https://github.com/dotnet), [AspNet](https://github.com/aspnet), [Xamarin](https://github.com/xamarin), and [our GitHub organizations](https://opensource.microsoft.com/).
6
-
7
- If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](https://aka.ms/opensource/security/definition), please report it to us as described below.
8
-
9
- ## Reporting Security Issues
10
-
11
- **Please do not report security vulnerabilities through public GitHub issues.**
12
-
13
- Instead, please report them to the Microsoft Security Response Center (MSRC) at [https://msrc.microsoft.com/create-report](https://aka.ms/opensource/security/create-report).
14
-
15
- If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com). If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://aka.ms/opensource/security/pgpkey).
16
-
17
- You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://aka.ms/opensource/security/msrc).
18
-
19
- Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue:
20
-
21
- * Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.)
22
- * Full paths of source file(s) related to the manifestation of the issue
23
- * The location of the affected source code (tag/branch/commit or direct URL)
24
- * Any special configuration required to reproduce the issue
25
- * Step-by-step instructions to reproduce the issue
26
- * Proof-of-concept or exploit code (if possible)
27
- * Impact of the issue, including how an attacker might exploit the issue
28
-
29
- This information will help us triage your report more quickly.
30
-
31
- If you are reporting for a bug bounty, more complete reports can contribute to a higher bounty award. Please visit our [Microsoft Bug Bounty Program](https://aka.ms/opensource/security/bounty) page for more details about our active programs.
32
-
33
- ## Preferred Languages
34
-
35
- We prefer all communications to be in English.
36
-
37
- ## Policy
38
-
39
- Microsoft follows the principle of [Coordinated Vulnerability Disclosure](https://aka.ms/opensource/security/cvd).
40
-
41
- <!-- END MICROSOFT SECURITY.MD BLOCK -->
1
+ <!-- BEGIN MICROSOFT SECURITY.MD V0.0.7 BLOCK -->
2
+
3
+ ## Security
4
+
5
+ Microsoft takes the security of our software products and services seriously, which includes all source code repositories managed through our GitHub organizations, which include [Microsoft](https://github.com/Microsoft), [Azure](https://github.com/Azure), [DotNet](https://github.com/dotnet), [AspNet](https://github.com/aspnet), [Xamarin](https://github.com/xamarin), and [our GitHub organizations](https://opensource.microsoft.com/).
6
+
7
+ If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](https://aka.ms/opensource/security/definition), please report it to us as described below.
8
+
9
+ ## Reporting Security Issues
10
+
11
+ **Please do not report security vulnerabilities through public GitHub issues.**
12
+
13
+ Instead, please report them to the Microsoft Security Response Center (MSRC) at [https://msrc.microsoft.com/create-report](https://aka.ms/opensource/security/create-report).
14
+
15
+ If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com). If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://aka.ms/opensource/security/pgpkey).
16
+
17
+ You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://aka.ms/opensource/security/msrc).
18
+
19
+ Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue:
20
+
21
+ * Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.)
22
+ * Full paths of source file(s) related to the manifestation of the issue
23
+ * The location of the affected source code (tag/branch/commit or direct URL)
24
+ * Any special configuration required to reproduce the issue
25
+ * Step-by-step instructions to reproduce the issue
26
+ * Proof-of-concept or exploit code (if possible)
27
+ * Impact of the issue, including how an attacker might exploit the issue
28
+
29
+ This information will help us triage your report more quickly.
30
+
31
+ If you are reporting for a bug bounty, more complete reports can contribute to a higher bounty award. Please visit our [Microsoft Bug Bounty Program](https://aka.ms/opensource/security/bounty) page for more details about our active programs.
32
+
33
+ ## Preferred Languages
34
+
35
+ We prefer all communications to be in English.
36
+
37
+ ## Policy
38
+
39
+ Microsoft follows the principle of [Coordinated Vulnerability Disclosure](https://aka.ms/opensource/security/cvd).
40
+
41
+ <!-- END MICROSOFT SECURITY.MD BLOCK -->
@@ -1,138 +1,138 @@
1
- # encoding: utf-8
2
-
3
- require 'logstash/outputs/base'
4
- require 'logstash/namespace'
5
- require 'logstash/errors'
6
-
7
- class LogStash::Outputs::Kusto < LogStash::Outputs::Base
8
- ##
9
- # This handles the overall logic and communication with Kusto
10
- #
11
- class Ingestor
12
- require 'logstash-output-kusto_jars'
13
- RETRY_DELAY_SECONDS = 3
14
- DEFAULT_THREADPOOL = Concurrent::ThreadPoolExecutor.new(
15
- min_threads: 1,
16
- max_threads: 8,
17
- max_queue: 1,
18
- fallback_policy: :caller_runs
19
- )
20
- LOW_QUEUE_LENGTH = 3
21
- FIELD_REF = /%\{[^}]+\}/
22
-
23
- def initialize(ingest_url, app_id, app_key, app_tenant, database, table, json_mapping, delete_local, proxy_host , proxy_port , proxy_protocol,logger, threadpool = DEFAULT_THREADPOOL)
24
- @workers_pool = threadpool
25
- @logger = logger
26
- validate_config(database, table, json_mapping,proxy_protocol)
27
- @logger.info('Preparing Kusto resources.')
28
-
29
- kusto_java = Java::com.microsoft.azure.kusto
30
- apache_http = Java::org.apache.http
31
- kusto_connection_string = kusto_java.data.auth.ConnectionStringBuilder.createWithAadApplicationCredentials(ingest_url, app_id, app_key.value, app_tenant)
32
- #
33
- @logger.debug(Gem.loaded_specs.to_s)
34
- # Unfortunately there's no way to avoid using the gem/plugin name directly...
35
- name_for_tracing = "logstash-output-kusto:#{Gem.loaded_specs['logstash-output-kusto']&.version || "unknown"}"
36
- @logger.debug("Client name for tracing: #{name_for_tracing}")
37
- kusto_connection_string.setClientVersionForTracing(name_for_tracing)
38
-
39
- @kusto_client = begin
40
- if proxy_host.nil? || proxy_host.empty?
41
- kusto_java.ingest.IngestClientFactory.createClient(kusto_connection_string)
42
- else
43
- kusto_http_client_properties = kusto_java.data.HttpClientProperties.builder().proxy(apache_http.HttpHost.new(proxy_host,proxy_port,proxy_protocol)).build()
44
- kusto_java.ingest.IngestClientFactory.createClient(kusto_connection_string, kusto_http_client_properties)
45
- end
46
- end
47
-
48
- @ingestion_properties = kusto_java.ingest.IngestionProperties.new(database, table)
49
- @ingestion_properties.setIngestionMapping(json_mapping, kusto_java.ingest.IngestionMapping::IngestionMappingKind::JSON)
50
- @ingestion_properties.setDataFormat(kusto_java.ingest.IngestionProperties::DataFormat::JSON)
51
- @delete_local = delete_local
52
-
53
- @logger.debug('Kusto resources are ready.')
54
- end
55
-
56
- def validate_config(database, table, json_mapping,proxy_protocol)
57
- if database =~ FIELD_REF
58
- @logger.error('database config value should not be dynamic.', database)
59
- raise LogStash::ConfigurationError.new('database config value should not be dynamic.')
60
- end
61
-
62
- if table =~ FIELD_REF
63
- @logger.error('table config value should not be dynamic.', table)
64
- raise LogStash::ConfigurationError.new('table config value should not be dynamic.')
65
- end
66
-
67
- if json_mapping =~ FIELD_REF
68
- @logger.error('json_mapping config value should not be dynamic.', json_mapping)
69
- raise LogStash::ConfigurationError.new('json_mapping config value should not be dynamic.')
70
- end
71
-
72
- if not(["https", "http"].include? proxy_protocol)
73
- @logger.error('proxy_protocol has to be http or https.', proxy_protocol)
74
- raise LogStash::ConfigurationError.new('proxy_protocol has to be http or https.')
75
- end
76
-
77
- end
78
-
79
- def upload_async(path, delete_on_success)
80
- if @workers_pool.remaining_capacity <= LOW_QUEUE_LENGTH
81
- @logger.warn("Ingestor queue capacity is running low with #{@workers_pool.remaining_capacity} free slots.")
82
- end
83
-
84
- @workers_pool.post do
85
- LogStash::Util.set_thread_name("Kusto to ingest file: #{path}")
86
- upload(path, delete_on_success)
87
- end
88
- rescue Exception => e
89
- @logger.error('StandardError.', exception: e.class, message: e.message, path: path, backtrace: e.backtrace)
90
- raise e
91
- end
92
-
93
- def upload(path, delete_on_success)
94
- file_size = File.size(path)
95
- @logger.debug("Sending file to kusto: #{path}. size: #{file_size}")
96
-
97
- # TODO: dynamic routing
98
- # file_metadata = path.partition('.kusto.').last
99
- # file_metadata_parts = file_metadata.split('.')
100
-
101
- # if file_metadata_parts.length == 3
102
- # # this is the number we expect - database, table, json_mapping
103
- # database = file_metadata_parts[0]
104
- # table = file_metadata_parts[1]
105
- # json_mapping = file_metadata_parts[2]
106
-
107
- # local_ingestion_properties = Java::KustoIngestionProperties.new(database, table)
108
- # local_ingestion_properties.addJsonMappingName(json_mapping)
109
- # end
110
-
111
- if file_size > 0
112
- file_source_info = Java::com.microsoft.azure.kusto.ingest.source.FileSourceInfo.new(path, 0); # 0 - let the sdk figure out the size of the file
113
- @kusto_client.ingestFromFile(file_source_info, @ingestion_properties)
114
- else
115
- @logger.warn("File #{path} is an empty file and is not ingested.")
116
- end
117
- File.delete(path) if delete_on_success
118
- @logger.debug("File #{path} sent to kusto.")
119
- rescue Errno::ENOENT => e
120
- @logger.error("File doesn't exist! Unrecoverable error.", exception: e.class, message: e.message, path: path, backtrace: e.backtrace)
121
- rescue Java::JavaNioFile::NoSuchFileException => e
122
- @logger.error("File doesn't exist! Unrecoverable error.", exception: e.class, message: e.message, path: path, backtrace: e.backtrace)
123
- rescue => e
124
- # When the retry limit is reached or another error happens, we will wait and retry.
125
- #
126
- # The thread might be stuck here, but I think that is better than losing anything;
127
- # it is either a transient error or something really bad happened.
128
- @logger.error('Uploading failed, retrying.', exception: e.class, message: e.message, path: path, backtrace: e.backtrace)
129
- sleep RETRY_DELAY_SECONDS
130
- retry
131
- end
132
-
133
- def stop
134
- @workers_pool.shutdown
135
- @workers_pool.wait_for_termination(nil) # block until its done
136
- end
137
- end
138
- end
1
+ # encoding: utf-8
2
+
3
+ require 'logstash/outputs/base'
4
+ require 'logstash/namespace'
5
+ require 'logstash/errors'
6
+
7
+ class LogStash::Outputs::Kusto < LogStash::Outputs::Base
8
+ ##
9
+ # This handles the overall logic and communication with Kusto
10
+ #
11
+ class Ingestor
12
+ require 'logstash-output-kusto_jars'
13
+ RETRY_DELAY_SECONDS = 3
14
+ DEFAULT_THREADPOOL = Concurrent::ThreadPoolExecutor.new(
15
+ min_threads: 1,
16
+ max_threads: 8,
17
+ max_queue: 1,
18
+ fallback_policy: :caller_runs
19
+ )
20
+ LOW_QUEUE_LENGTH = 3
21
+ FIELD_REF = /%\{[^}]+\}/
22
+
23
+ def initialize(ingest_url, app_id, app_key, app_tenant, database, table, json_mapping, delete_local, proxy_host , proxy_port , proxy_protocol,logger, threadpool = DEFAULT_THREADPOOL)
24
+ @workers_pool = threadpool
25
+ @logger = logger
26
+ validate_config(database, table, json_mapping,proxy_protocol)
27
+ @logger.info('Preparing Kusto resources.')
28
+
29
+ kusto_java = Java::com.microsoft.azure.kusto
30
+ apache_http = Java::org.apache.http
31
+ kusto_connection_string = kusto_java.data.auth.ConnectionStringBuilder.createWithAadApplicationCredentials(ingest_url, app_id, app_key.value, app_tenant)
32
+ #
33
+ @logger.debug(Gem.loaded_specs.to_s)
34
+ # Unfortunately there's no way to avoid using the gem/plugin name directly...
35
+ name_for_tracing = "logstash-output-kusto:#{Gem.loaded_specs['logstash-output-kusto']&.version || "unknown"}"
36
+ @logger.debug("Client name for tracing: #{name_for_tracing}")
37
+ kusto_connection_string.setClientVersionForTracing(name_for_tracing)
38
+
39
+ @kusto_client = begin
40
+ if proxy_host.nil? || proxy_host.empty?
41
+ kusto_java.ingest.IngestClientFactory.createClient(kusto_connection_string)
42
+ else
43
+ kusto_http_client_properties = kusto_java.data.HttpClientProperties.builder().proxy(apache_http.HttpHost.new(proxy_host,proxy_port,proxy_protocol)).build()
44
+ kusto_java.ingest.IngestClientFactory.createClient(kusto_connection_string, kusto_http_client_properties)
45
+ end
46
+ end
47
+
48
+ @ingestion_properties = kusto_java.ingest.IngestionProperties.new(database, table)
49
+ @ingestion_properties.setIngestionMapping(json_mapping, kusto_java.ingest.IngestionMapping::IngestionMappingKind::JSON)
50
+ @ingestion_properties.setDataFormat(kusto_java.ingest.IngestionProperties::DataFormat::JSON)
51
+ @delete_local = delete_local
52
+
53
+ @logger.debug('Kusto resources are ready.')
54
+ end
55
+
56
+ def validate_config(database, table, json_mapping,proxy_protocol)
57
+ if database =~ FIELD_REF
58
+ @logger.error('database config value should not be dynamic.', database)
59
+ raise LogStash::ConfigurationError.new('database config value should not be dynamic.')
60
+ end
61
+
62
+ if table =~ FIELD_REF
63
+ @logger.error('table config value should not be dynamic.', table)
64
+ raise LogStash::ConfigurationError.new('table config value should not be dynamic.')
65
+ end
66
+
67
+ if json_mapping =~ FIELD_REF
68
+ @logger.error('json_mapping config value should not be dynamic.', json_mapping)
69
+ raise LogStash::ConfigurationError.new('json_mapping config value should not be dynamic.')
70
+ end
71
+
72
+ if not(["https", "http"].include? proxy_protocol)
73
+ @logger.error('proxy_protocol has to be http or https.', proxy_protocol)
74
+ raise LogStash::ConfigurationError.new('proxy_protocol has to be http or https.')
75
+ end
76
+
77
+ end
78
+
79
+ def upload_async(path, delete_on_success)
80
+ if @workers_pool.remaining_capacity <= LOW_QUEUE_LENGTH
81
+ @logger.warn("Ingestor queue capacity is running low with #{@workers_pool.remaining_capacity} free slots.")
82
+ end
83
+
84
+ @workers_pool.post do
85
+ LogStash::Util.set_thread_name("Kusto to ingest file: #{path}")
86
+ upload(path, delete_on_success)
87
+ end
88
+ rescue Exception => e
89
+ @logger.error('StandardError.', exception: e.class, message: e.message, path: path, backtrace: e.backtrace)
90
+ raise e
91
+ end
92
+
93
+ def upload(path, delete_on_success)
94
+ file_size = File.size(path)
95
+ @logger.debug("Sending file to kusto: #{path}. size: #{file_size}")
96
+
97
+ # TODO: dynamic routing
98
+ # file_metadata = path.partition('.kusto.').last
99
+ # file_metadata_parts = file_metadata.split('.')
100
+
101
+ # if file_metadata_parts.length == 3
102
+ # # this is the number we expect - database, table, json_mapping
103
+ # database = file_metadata_parts[0]
104
+ # table = file_metadata_parts[1]
105
+ # json_mapping = file_metadata_parts[2]
106
+
107
+ # local_ingestion_properties = Java::KustoIngestionProperties.new(database, table)
108
+ # local_ingestion_properties.addJsonMappingName(json_mapping)
109
+ # end
110
+
111
+ if file_size > 0
112
+ file_source_info = Java::com.microsoft.azure.kusto.ingest.source.FileSourceInfo.new(path, 0); # 0 - let the sdk figure out the size of the file
113
+ @kusto_client.ingestFromFile(file_source_info, @ingestion_properties)
114
+ else
115
+ @logger.warn("File #{path} is an empty file and is not ingested.")
116
+ end
117
+ File.delete(path) if delete_on_success
118
+ @logger.debug("File #{path} sent to kusto.")
119
+ rescue Errno::ENOENT => e
120
+ @logger.error("File doesn't exist! Unrecoverable error.", exception: e.class, message: e.message, path: path, backtrace: e.backtrace)
121
+ rescue Java::JavaNioFile::NoSuchFileException => e
122
+ @logger.error("File doesn't exist! Unrecoverable error.", exception: e.class, message: e.message, path: path, backtrace: e.backtrace)
123
+ rescue => e
124
+ # When the retry limit is reached or another error happens, we will wait and retry.
125
+ #
126
+ # The thread might be stuck here, but I think that is better than losing anything;
127
+ # it is either a transient error or something really bad happened.
128
+ @logger.error('Uploading failed, retrying.', exception: e.class, message: e.message, path: path, backtrace: e.backtrace)
129
+ sleep RETRY_DELAY_SECONDS
130
+ retry
131
+ end
132
+
133
+ def stop
134
+ @workers_pool.shutdown
135
+ @workers_pool.wait_for_termination(nil) # block until its done
136
+ end
137
+ end
138
+ end