logstash-output-google_bigquery 4.1.0-java → 4.1.1-java
- checksums.yaml +4 -4
- data/CHANGELOG.md +3 -0
- data/docs/index.asciidoc +31 -22
- data/logstash-output-google_bigquery.gemspec +1 -1
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 1bea90d55eb25689d86a79c1adf326d382d8f2d7c6045a027935b5f19c4e005f
+  data.tar.gz: fd315b0e136a61429a63489a2d3ea21e294d474814d809911b382268037a8ce8
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: d178dea332da37a9f4762624a409415b90b55b74d3c2279cb521c79621d74e2fb70686b1256249d368da22a08c742a10714b8623eb632c423d9d8be4f90e8d2f
+  data.tar.gz: a83de7d1395056256d36889a4ecb8ee30b0d79cfc81ae2b193363a86cc188728218670ad5250a4cf24796eeff0af9d8f4a2aa4972e8c9d169c4a486f1c5df5ff
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,6 @@
+## 4.1.1
+- Fixed inaccuracies in documentation [#46](https://github.com/logstash-plugins/logstash-output-google_bigquery/pull/46)
+
 ## 4.1.0
 - Added `skip_invalid_rows` configuration which will insert all valid rows of a BigQuery insert
   and skip any invalid ones.
data/docs/index.asciidoc
CHANGED
@@ -23,25 +23,24 @@ include::{include_path}/plugin_header.asciidoc[]
 
 ===== Summary
 
-This plugin uploads events to Google BigQuery using the streaming API
-so data can become available nearly immediately.
+This Logstash plugin uploads events to Google BigQuery using the streaming API
+so data can become available to query nearly immediately.
 
 You can configure it to flush periodically, after N events or after
 a certain amount of data is ingested.
 
 ===== Environment Configuration
 
-You must enable BigQuery on your Google Cloud
+You must enable BigQuery on your Google Cloud account and create a dataset to
 hold the tables this plugin generates.
 
-You must also grant the service account this plugin uses access to
-the dataset.
+You must also grant the service account this plugin uses access to the dataset.
 
 You can use https://www.elastic.co/guide/en/logstash/current/event-dependent-configuration.html[Logstash conditionals]
 and multiple configuration blocks to upload events with different structures.
 
 ===== Usage
-This is an example of
+This is an example of Logstash config:
 
 [source,ruby]
 --------------------------
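The body of the usage example falls outside this hunk's context, so it is not shown above. For orientation, here is a minimal sketch of such a config, assembled only from options documented further down this page; the project, dataset, and key-file values are placeholders, not taken from the gem:

[source,ruby]
--------------------------
output {
  google_bigquery {
    project_id => "my-project-id"                            # placeholder GCP project
    dataset => "logs"                                        # placeholder dataset name
    csv_schema => "path:STRING,status:INTEGER,score:FLOAT"   # example schema reused from this page
    json_key_file => "/path/to/key.json"                     # omit on GCE to use ADC
    flush_interval_secs => 30                                # example flush cadence
  }
}
--------------------------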
@@ -65,15 +64,18 @@ https://cloud.google.com/docs/authentication/production[Application Default Cred
 
 ===== Considerations
 
-* There is a small fee to insert data into BigQuery using the streaming API
+* There is a small fee to insert data into BigQuery using the streaming API.
 * This plugin buffers events in-memory, so make sure the flush configurations are appropriate
   for your use-case and consider using
-  https://www.elastic.co/guide/en/logstash/current/persistent-queues.html[Logstash Persistent Queues]
+  https://www.elastic.co/guide/en/logstash/current/persistent-queues.html[Logstash Persistent Queues].
+* Events will be flushed when <<plugins-{type}s-{plugin}-batch_size>>, <<plugins-{type}s-{plugin}-batch_size_bytes>>, or <<plugins-{type}s-{plugin}-flush_interval_secs>> is met, whichever comes first.
+  If you notice a delay in your processing or low throughput, try adjusting those settings.
 
 ===== Additional Resources
 
 * https://cloud.google.com/docs/authentication/production[Application Default Credentials (ADC) Overview]
 * https://cloud.google.com/bigquery/[BigQuery Introduction]
+* https://cloud.google.com/bigquery/quota[BigQuery Quotas and Limits]
 * https://cloud.google.com/bigquery/docs/schemas[BigQuery Schema Formats and Types]
 
 [id="plugins-{type}s-{plugin}-options"]
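To make the new flush-criteria bullet concrete, here is a hedged sketch that sets all three triggers explicitly; the values are illustrative, not recommendations from the gem:

[source,ruby]
--------------------------
output {
  google_bigquery {
    project_id => "my-project-id"   # placeholder
    dataset => "logs"               # placeholder
    batch_size => 512               # flush after 512 events...
    batch_size_bytes => 1_000_000   # ...or roughly 1 MB of data...
    flush_interval_secs => 5        # ...or 5 seconds, whichever comes first
  }
}
--------------------------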
@@ -120,7 +122,12 @@ added[4.0.0]
 * Value type is <<number,number>>
 * Default value is `128`
 
-The number of messages to upload at a single time.
+The maximum number of messages to upload at a single time.
+This number must be < 10,000.
+Batching can increase performance and throughput to a point, but at the cost of per-request latency.
+Too few rows per request and the overhead of each request can make ingestion inefficient.
+Too many rows per request and the throughput may drop.
+BigQuery recommends using about 500 rows per request, but experimentation with representative data (schema and data sizes) will help you determine the ideal batch size.
 
 [id="plugins-{type}s-{plugin}-batch_size_bytes"]
 ===== `batch_size_bytes`
@@ -130,10 +137,11 @@ added[4.0.0]
 * Value type is <<number,number>>
 * Default value is `1_000_000`
 
-An approximate number of bytes to upload as part of a batch.
+An approximate number of bytes to upload as part of a batch.
+This number should be < 10MB or inserts may fail.
 
 [id="plugins-{type}s-{plugin}-csv_schema"]
-===== `csv_schema`
+===== `csv_schema`
 
 * Value type is <<string,string>>
 * Default value is `nil`
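For rough sizing, note how the two batch limits interact: at the recommended ~500 rows per request, the default `batch_size_bytes` of `1_000_000` becomes the binding limit once rows average more than about 2 KB (1,000,000 / 500 = 2,000 bytes); whichever limit your row size hits first will trigger the flush.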
@@ -142,7 +150,7 @@ Schema for log data. It must follow the format `name1:type1(,name2:type2)*`.
 For example, `path:STRING,status:INTEGER,score:FLOAT`.
 
 [id="plugins-{type}s-{plugin}-dataset"]
-===== `dataset`
+===== `dataset`
 
 * This is a required setting.
 * Value type is <<string,string>>
@@ -151,7 +159,7 @@ For example, `path:STRING,status:INTEGER,score:FLOAT`.
 The BigQuery dataset the tables for the events will be added to.
 
 [id="plugins-{type}s-{plugin}-date_pattern"]
-===== `date_pattern`
+===== `date_pattern`
 
 * Value type is <<string,string>>
 * Default value is `"%Y-%m-%dT%H:00"`
@@ -187,15 +195,16 @@ transparently upload to a GCS bucket.
 Files names follow the pattern `[table name]-[UNIX timestamp].log`
 
 [id="plugins-{type}s-{plugin}-flush_interval_secs"]
-===== `flush_interval_secs`
+===== `flush_interval_secs`
 
 * Value type is <<number,number>>
 * Default value is `5`
 
-Uploads all data this often even if other upload criteria aren't met.
+Uploads all data this often even if other upload criteria aren't met.
+
 
 [id="plugins-{type}s-{plugin}-ignore_unknown_values"]
-===== `ignore_unknown_values`
+===== `ignore_unknown_values`
 
 * Value type is <<boolean,boolean>>
 * Default value is `false`
@@ -222,12 +231,12 @@ added[4.0.0, Replaces <<plugins-{type}s-{plugin}-key_password>>, <<plugins-{type
 * Value type is <<string,string>>
 * Default value is `nil`
 
-If
+If Logstash is running within Google Compute Engine, the plugin can use
 GCE's Application Default Credentials. Outside of GCE, you will need to
 specify a Service Account JSON key file.
 
 [id="plugins-{type}s-{plugin}-json_schema"]
-===== `json_schema`
+===== `json_schema`
 
 * Value type is <<hash,hash>>
 * Default value is `nil`
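A sketch of the two credential modes this section describes; the path and project values are placeholders:

[source,ruby]
--------------------------
output {
  google_bigquery {
    project_id => "my-project-id"   # placeholder
    dataset => "logs"               # placeholder
    # Inside Google Compute Engine: omit json_key_file entirely and the
    # plugin falls back to the instance's Application Default Credentials.
    # Outside GCE: point at a Service Account JSON key file:
    json_key_file => "/etc/logstash/bigquery-key.json"
  }
}
--------------------------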
@@ -287,7 +296,7 @@ Please use one of the following mechanisms:
 `gcloud iam service-accounts keys create key.json --iam-account my-sa-123@my-project-123.iam.gserviceaccount.com`
 
 [id="plugins-{type}s-{plugin}-project_id"]
-===== `project_id`
+===== `project_id`
 
 * This is a required setting.
 * Value type is <<string,string>>
@@ -314,7 +323,7 @@ Insert all valid rows of a request, even if invalid rows exist.
 The default value is false, which causes the entire request to fail if any invalid rows exist.
 
 [id="plugins-{type}s-{plugin}-table_prefix"]
-===== `table_prefix`
+===== `table_prefix`
 
 * Value type is <<string,string>>
 * Default value is `"logstash"`
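The context line of the hunk above documents `skip_invalid_rows`; as a sketch, it can be paired with `ignore_unknown_values` (also documented on this page) for lenient ingestion, since both default to `false`:

[source,ruby]
--------------------------
output {
  google_bigquery {
    project_id => "my-project-id"   # placeholder
    dataset => "logs"               # placeholder
    skip_invalid_rows => true       # keep the valid rows of a batch that contains invalid ones
    ignore_unknown_values => true   # ignore event fields that are not in the table schema
  }
}
--------------------------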
@@ -323,7 +332,7 @@ BigQuery table ID prefix to be used when creating new tables for log data.
 Table name will be `<table_prefix><table_separator><date>`
 
 [id="plugins-{type}s-{plugin}-table_separator"]
-===== `table_separator`
+===== `table_separator`
 
 * Value type is <<string,string>>
 * Default value is `"_"`
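Putting the naming rule together with the defaults shown on this page (`table_prefix => "logstash"`, `table_separator => "_"`, `date_pattern => "%Y-%m-%dT%H:00"`), an event processed at 14:05 UTC on 2018-10-25 would be written to a table named `logstash_2018-10-25T14:00`.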
@@ -361,4 +370,4 @@ around one hour).
 [id="plugins-{type}s-{plugin}-common-options"]
 include::{include_path}/{type}.asciidoc[]
 
-:default_codec!:
+:default_codec!:

data/logstash-output-google_bigquery.gemspec
CHANGED
@@ -1,6 +1,6 @@
 Gem::Specification.new do |s|
   s.name = 'logstash-output-google_bigquery'
-  s.version = '4.1.0'
+  s.version = '4.1.1'
   s.licenses = ['Apache License (2.0)']
   s.summary = "Writes events to Google BigQuery"
   s.description = "This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program"
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: logstash-output-google_bigquery
 version: !ruby/object:Gem::Version
-  version: 4.1.0
+  version: 4.1.1
 platform: java
 authors:
 - Elastic
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2018-
+date: 2018-10-25 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   requirement: !ruby/object:Gem::Requirement