fluent-plugin-gcs 0.4.4 → 0.5.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 42a11febaed3fc628877f4825ccf37be26bb72aea69fb403d5a74099353d6af6
-   data.tar.gz: 1a4e644a95db9f2debf96593914f64bdcef2eb1eddf80c983ee5800616cedc68
+   metadata.gz: 832c3cf74b29f669e37a3f6e5e4e720e1e588ec01608169a9dad2d0861b936b4
+   data.tar.gz: 2ef5f2d43dfdedb8516dfbbfd91799d1782a9c9922aa792d2ff45713a17bfb01
  SHA512:
-   metadata.gz: e209ea956498fd8773fe6cf38723cc3faa0336fdfa12a0eef58a81b8447c50027582748d71ed92b4039e8cb4ea77ffad722ad05473c98dcf9442bc870d80735b
-   data.tar.gz: 717e1a3e64cb0c8ba2f1f4fe5eb2f947399558ef7ad85094c39f083840dcf46c88260124f2c9b3463e457e7e705d52348d9f1b7d09a1b1b370ac6a4e5942c16b
+   metadata.gz: be563af89ac3b24afe7114ec12969d993efb4ac44b6911ad2056977679d223eb696afe6e7bce29b301821d3d3ab34421a8bcd08c7d1d433badd395ccaedb32f6
+   data.tar.gz: 8786fbff16b1a96fc733aaad166aa2d16ca33b529ae03fafadb54526824e76025aeff01c9a4ecd56bd5dbfa1164d1ae0bb327a7b85d6095a3c398a4573c949fc
@@ -0,0 +1,20 @@
+ version: 2
+ updates:
+   - package-ecosystem: github-actions
+     directory: /
+     schedule:
+       interval: weekly
+     groups:
+       actions:
+         patterns:
+           - "*"
+
+   - package-ecosystem: bundler
+     directory: /
+     schedule:
+       interval: weekly
+     groups:
+       development:
+         dependency-type: development
+       production:
+         dependency-type: production
@@ -0,0 +1,58 @@
+ name: Test
+
+ on:
+   push:
+     branches: [main]
+   pull_request:
+     branches: [main]
+   workflow_dispatch:
+
+ permissions:
+   contents: read
+
+ concurrency:
+   group: ${{ github.workflow }}-${{ github.ref }}
+   cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
+
+ jobs:
+   test:
+     runs-on: ubuntu-latest
+     strategy:
+       fail-fast: false
+       matrix:
+         ruby-version: ['3.3', '3.4']
+     steps:
+       - uses: actions/checkout@v6
+       - name: Install compression tools
+         run: sudo apt-get update && sudo apt-get install -y lzop xz-utils zstd
+       - name: Set up Ruby
+         uses: ruby/setup-ruby@v1
+         with:
+           ruby-version: ${{ matrix.ruby-version }}
+           bundler-cache: true
+       - name: Run tests
+         run: bundle exec rake test
+
+   build:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v6
+       - name: Set up Ruby
+         uses: ruby/setup-ruby@v1
+         with:
+           ruby-version: '3.4'
+           bundler-cache: true
+       - name: Build gem
+         run: gem build fluent-plugin-gcs.gemspec
+
+   audit:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v6
+       - name: Set up Ruby
+         uses: ruby/setup-ruby@v1
+         with:
+           ruby-version: '3.4'
+           bundler-cache: true
+       - name: Bundler audit
+         run: bundle exec bundler-audit check --update
data/Gemfile CHANGED
@@ -5,9 +5,14 @@ source "https://rubygems.org"
  # Specify your gem's dependencies in fluent-plugin-gcs.gemspec
  gemspec

- gem "rake", "~> 13.0"
- gem "rr", "= 1.1.2"
- gem "test-unit", ">= 3.0.8"
- gem "test-unit-rr", ">= 1.0.3"
- gem "timecop"
- gem "solargraph"
+ group :test do
+   gem "rake", "~> 13.0"
+   gem "test-unit", ">= 3.0.8"
+   gem "mocha", "~> 3.1"
+   gem "timecop"
+   gem "bundler-audit", "~> 0.9"
+ end
+
+ group :development, optional: true do
+   gem "solargraph"
+ end
data/README.md CHANGED
@@ -1,258 +1,340 @@
  # fluent-plugin-gcs
- [![Gem Version](https://badge.fury.io/rb/fluent-plugin-gcs.svg)](https://badge.fury.io/rb/fluent-plugin-gcs) [![Test](https://github.com/daichirata/fluent-plugin-gcs/actions/workflows/test.yaml/badge.svg)](https://github.com/daichirata/fluent-plugin-gcs/actions/workflows/test.yaml) [![Code Climate](https://codeclimate.com/github/daichirata/fluent-plugin-gcs/badges/gpa.svg)](https://codeclimate.com/github/daichirata/fluent-plugin-gcs)

- Google Cloud Storage output plugin for [Fluentd](https://github.com/fluent/fluentd).
+ [![Test](https://github.com/daichirata/fluent-plugin-gcs/actions/workflows/test.yml/badge.svg)](https://github.com/daichirata/fluent-plugin-gcs/actions/workflows/test.yml)
+ [![Gem Version](https://badge.fury.io/rb/fluent-plugin-gcs.svg)](https://badge.fury.io/rb/fluent-plugin-gcs)
+
+ A [Fluentd](https://www.fluentd.org/) output plugin that buffers events and uploads them to [Google Cloud Storage](https://cloud.google.com/storage).
+
+ ## Features
+
+ - **Multiple formats** — store objects as gzip, plain text, or JSON.
+ - **Fast compression** — optionally shell out to the external `gzip` binary, with automatic fallback to the pure-Ruby compressor.
+ - **Flexible object keys** — build paths from time slices, tags, hostnames, random tokens, and UUIDs.
+ - **Server-side controls** — set ACLs, storage class, customer-supplied encryption keys, and custom object metadata.
+ - **Flexible auth** — explicit credentials or Application Default Credentials on GCE / GKE / Cloud Run.
+
+ ## Table of contents
+
+ - [Requirements](#requirements)
+ - [Installation](#installation)
+ - [Quick start](#quick-start)
+ - [Configuration](#configuration)
+   - [Authentication](#authentication)
+   - [Object placement](#object-placement)
+   - [Format and compression](#format-and-compression)
+   - [GCS object settings](#gcs-object-settings)
+   - [Object key format](#object-key-format)
+   - [Object metadata](#object-metadata)
+ - [Examples](#examples)
+ - [Development](#development)
+ - [Author](#author)
+ - [License](#license)

  ## Requirements

- | fluent-plugin-gcs | fluentd | ruby |
- |--------------------|------------|--------|
- | >= 0.4.0 | >= v0.14.0 | >= 2.4 |
- | < 0.4.0 | >= v0.12.0 | >= 1.9 |
+ | fluent-plugin-gcs | fluentd | ruby |
+ |-------------------|----------|--------|
+ | >= 0.5.0 | >= 1.0 | >= 3.3 |

  ## Installation

- ``` shell
- $ gem install fluent-plugin-gcs -v "~> 0.3" --no-document # for fluentd v0.12 or later
- $ gem install fluent-plugin-gcs -v "0.4.0" --no-document # for fluentd v0.14 or later
+ ```shell
+ gem install fluent-plugin-gcs
  ```

- ## Examples
-
- ### For v0.14 style
+ Using td-agent / fluent-package:

+ ```shell
+ fluent-gem install fluent-plugin-gcs
  ```
- <match pattern>
+
+ ## Quick start
+
+ The minimal configuration needs only a bucket. On GCE, GKE, or Cloud Run the credentials are picked up automatically from the environment.
+
+ ```aconf
+ <match your.tag>
    @type gcs

-   project YOUR_PROJECT
-   keyfile YOUR_KEYFILE_PATH
    bucket YOUR_GCS_BUCKET_NAME
-   object_key_format %{path}%{time_slice}_%{index}.%{file_extension}
-   path logs/${tag}/%Y/%m/%d/
+   path logs/

-   # if you want to use ${tag} or %Y/%m/%d/ like syntax in path / object_key_format,
-   # need to specify tag for ${tag} and time for %Y/%m/%d in <buffer> argument.
-   <buffer tag,time>
+   <buffer time>
      @type file
      path /var/log/fluent/gcs
-     timekey 1h # 1 hour partition
+     timekey 1h
      timekey_wait 10m
-     timekey_use_utc true # use utc
+     timekey_use_utc true
    </buffer>
-
-   <format>
-     @type json
-   </format>
  </match>
  ```

- ### For v0.12 style
-
- ```
- <match pattern>
-   @type gcs
-
-   project YOUR_PROJECT
-   keyfile YOUR_KEYFILE_PATH
-   bucket YOUR_GCS_BUCKET_NAME
-   object_key_format %{path}%{time_slice}_%{index}.%{file_extension}
-   path logs/
-   buffer_path /var/log/fluent/gcs
-
-   time_slice_format %Y%m%d-%H
-   time_slice_wait 10m
-   utc
- </match>
- ```
+ This writes gzip-compressed objects such as `logs/2024010112_0.gz`, one per hourly time slice.

  ## Configuration

  ### Authentication

- You can provide the project and credential information to connect to the Storage
- service, or if you are running on Google Compute Engine this configuration is taken care of for you.
+ Provide credentials explicitly, or rely on [Application Default Credentials](https://cloud.google.com/docs/authentication/application-default-credentials) when running on Google Cloud.

- **project**
+ | Option | Type | Default | Description |
+ |--------------------|---------|---------|-------------|
+ | `project` | string | `nil` | GCS project identifier |
+ | `keyfile` | string | `nil` | Path to a service account credentials JSON file |
+ | `credentials_json` | hash | `nil` | Service account credentials inline as JSON. Takes precedence over `keyfile` |
+ | `client_retries` | integer | `nil` | Number of retries on server error |
+ | `client_timeout` | integer | `nil` | Request timeout in seconds |

- Project identifier for GCS. Project are discovered in the following order:
- * Specify project in `project`
- * Discover project in environment variables `STORAGE_PROJECT`, `GOOGLE_CLOUD_PROJECT`, `GCLOUD_PROJECT`
- * Discover GCE credentials
+ `project` is resolved in the following order: the `project` option, then the `STORAGE_PROJECT` / `GOOGLE_CLOUD_PROJECT` / `GCLOUD_PROJECT` environment variables, then GCE metadata.

- **keyfile**
+ `keyfile` is resolved in the following order: the `keyfile` option, the `GOOGLE_CLOUD_KEYFILE` / `GCLOUD_KEYFILE` (path) or `GOOGLE_CLOUD_KEYFILE_JSON` / `GCLOUD_KEYFILE_JSON` (inline) environment variables, the Cloud SDK's well-known path, then GCE metadata.

- Path of GCS service account credentials JSON file. Credentials are discovered in the following order:
- * Specify credentials path in `keyfile`
- * Discover credentials path in environment variables `GOOGLE_CLOUD_KEYFILE`, `GCLOUD_KEYFILE`
- * Discover credentials JSON in environment variables `GOOGLE_CLOUD_KEYFILE_JSON`, `GCLOUD_KEYFILE_JSON`
- * Discover credentials file in the Cloud SDK's path
- * Discover GCE credentials
+ ### Object placement

- **client_retries**
+ | Option | Type | Default | Description |
+ |---------------------|---------|---------|-------------|
+ | `bucket` | string | — | **Required.** GCS bucket name |
+ | `path` | string | `""` | Path prefix for objects |
+ | `object_key_format` | string | `%{path}%{time_slice}_%{index}.%{file_extension}` | Template for object keys. See [Object key format](#object-key-format) |
+ | `hex_random_length` | integer | `4` | Length of the `%{hex_random}` placeholder (max 32) |
+ | `overwrite` | bool | `false` | Overwrite the existing object instead of incrementing `%{index}` |
+ | `blind_write` | bool | `false` | Skip the existence check before writing (see below) |

- Number of times to retry requests on server error.
+ **Avoiding key collisions.** When `object_key_format` contains `%{index}` (the default), the plugin checks GCS for an existing object and increments `%{index}` until it finds an unused key, so existing objects are never overwritten. This existence check requires the `storage.objects.get` permission.

- **client_timeout**
+ **`blind_write`** skips that existence check, so the `storage.objects.get` permission is no longer needed. The trade-off is that `%{index}` stops working (it always stays `0`), so you must keep keys unique another way, with `%{hex_random}` (unique per chunk) or `%{uuid_flush}` (unique per flush).

- Default timeout to use in requests.
+ > [!WARNING]
+ > If a key collides with an existing object (which can happen with `blind_write true`, or with `overwrite true`), uploading it overwrites the existing object, and GCS requires the `storage.objects.delete` permission to do so. Without that permission the flush fails repeatedly and the buffer chunk is eventually lost. With `blind_write true`, include `%{hex_random}` or `%{uuid_flush}` in `object_key_format` to avoid collisions.

- **bucket (*required)**
+ ### Format and compression

- GCS bucket name.
+ | Option | Type | Default | Description |
+ |---------------------|--------|--------------|-------------|
+ | `store_as` | enum | `gzip` | Object format. See the table below |
+ | `command_parameter` | string | (per format) | Override the default arguments for the compression command (`gzip_command` / `lzo` / `lzma2` / `zstd`) |
+ | `transcoding` | bool | `false` | Enable [decompressive transcoding](https://cloud.google.com/storage/docs/transcoding) (gzip only) |

- **store_as**
+ | `store_as` | Compression | Requires | Default args | Extension | content_type |
+ |----------------|-------------|----------|--------------|-----------|--------------|
+ | `gzip` | Ruby's built-in `Zlib::GzipWriter` | (none) | — | `gz` | `application/gzip` |
+ | `gzip_command` | External `gzip`. Faster for large chunks, falls back to `Zlib::GzipWriter` on failure | `gzip` command | (none) | `gz` | `application/gzip` |
+ | `lzo` | External `lzop` | `lzop` command | `-qf1` | `lzo` | `application/x-lzop` |
+ | `lzma2` | External `xz` | `xz` command | `-qf0` | `xz` | `application/x-xz` |
+ | `zstd` | External `zstd` | `zstd` command | (none) | `zst` | `application/x-zst` |
+ | `json` | None (upload as JSON) | (none) | — | `json` | `application/json` |
+ | `text` | None (upload as text) | (none) | — | `txt` | `text/plain` |

- Archive format on GCS. You can use serveral format:
+ The command-based formats (`gzip_command`, `lzo`, `lzma2`, `zstd`) stream the chunk through the command's stdin (no intermediate temp file). Each has a sensible default argument set; override it with `command_parameter`. Multiple arguments are separated by spaces; the value is parsed with `shellsplit`, so it is **not** evaluated by a shell:

- * gzip (default)
- * json
- * text
+ ```aconf
+ store_as gzip_command
+ command_parameter -1 # single argument
+ ```

- **path**
+ ```aconf
+ store_as zstd
+ command_parameter -19 --long # multiple arguments, split on spaces
+ ```

- path prefix of the files on GCS. Default is "" (no prefix).
+ Quote a value that itself contains a space, the same way you would in a shell (`command_parameter -o "with space"`).

- **object_key_format**
+ `gzip_command` falls back to `Zlib::GzipWriter` if the `gzip` command fails. `lzo` / `lzma2` / `zstd` have no fallback, so the command must be installed (checked at startup), and they are not compatible with `transcoding`, which is gzip-specific.

- The format of GCS object keys. You can use several built-in variables:
+ > [!NOTE]
+ > `gzip_command_parameter` is a deprecated alias of `command_parameter`, kept for backward compatibility with v0.4.x configs. New configs should use `command_parameter`.

- * %{path}
- * %{time_slice}
- * %{index}
- * %{file_extension}
- * %{uuid_flush}
- * %{hex_random}
- * %{hostname}
+ The per-line format is configured with a `<format>` section (default `out_file`):

- to decide keys dynamically.
+ ```aconf
+ <format>
+   @type json
+ </format>
+ ```

- * `%{path}` is exactly the value of `path` configured in the configuration file. E.g., "logs/" in the example configuration above.
- * `%{time_slice}` is the time-slice in text that are formatted with `time_slice_format`.
- * `%{index}` is the sequential number starts from 0, increments when multiple files are uploaded to GCS in the same time slice.
- * `%{file_extention}` is changed by the value of `store_as`.
-   * gzip - gz
-   * json - json
-   * text - txt
- * `%{uuid_flush}` a uuid that is replaced everytime the buffer will be flushed
- * `%{hex_random}` a random hex string that is replaced for each buffer chunk, not assured to be unique. You can configure the length of string with a `hex_random_length` parameter (Default: 4).
- * `%{hostname}` is set to the standard host name of the system of the running server.
+ See the [Formatter documentation](https://docs.fluentd.org/formatter) for available types (`out_file`, `json`, `ltsv`, `single_value`, ...).

- The default format is `%{path}%{time_slice}_%{index}.%{file_extension}`.
+ ### GCS object settings

- **hex_random_length**
+ | Option | Type | Default | Description |
+ |----------------------|--------|---------|-------------|
+ | `auto_create_bucket` | bool | `true` | Create the bucket if it does not exist |
+ | `acl` | enum | `nil` | Predefined ACL for uploaded objects (see below) |
+ | `storage_class` | enum | `nil` | Storage class for uploaded objects (see below) |
+ | `encryption_key` | string | `nil` | Customer-supplied AES-256 key for server-side encryption |

- The length of `%{hex_random}` placeholder.
+ **`acl`** accepts one of `auth_read`, `owner_full`, `owner_read`, `private`, `project_private`, `public_read`. Defaults to the bucket's default object ACL. See the [access control documentation](https://cloud.google.com/storage/docs/access-control/lists).

- **transcoding**
+ **`storage_class`** accepts one of `dra`, `nearline`, `coldline`, `multi_regional`, `regional`, `standard`. See the [storage classes documentation](https://cloud.google.com/storage/docs/storage-classes).

- Enable the decompressive form of transcoding.
+ **`encryption_key`** enables [customer-supplied encryption](https://cloud.google.com/storage/docs/encryption#customer-supplied); the `encryption_key_sha256` is computed automatically.

- See also [Transcoding of gzip-compressed files](https://cloud.google.com/storage/docs/transcoding).
+ ### Object key format

- **format**
+ `object_key_format` supports the following placeholders:

- Change one line format in the GCS object. You can use serveral format:
+ | Placeholder | Description |
+ |---------------------|-------------|
+ | `%{path}` | The value of the `path` option |
+ | `%{time_slice}` | Time slice text derived from the `<buffer>` `timekey` |
+ | `%{index}` | Sequential number (from 0) within the same time slice |
+ | `%{file_extension}` | Inferred from `store_as` (`gz` / `lzo` / `xz` / `zst` / `json` / `txt`) |
+ | `%{uuid_flush}` | A UUID generated on every buffer flush |
+ | `%{hex_random}` | A random hex string per chunk, length set by `hex_random_length` |
+ | `%{hostname}` | The hostname of the running server |

- * out_file (default)
- * json
- * ltsv
- * single_value
+ The default is `%{path}%{time_slice}_%{index}.%{file_extension}`.

- See also [official Formatter article](http://docs.fluentd.org/articles/formatter-plugin-overview).
+ ### Object metadata

- **auto_create_bucket**
+ Attach arbitrary `x-goog-meta-*` headers to uploaded objects with one or more `<object_metadata>` sections:

- Create GCS bucket if it does not exists. Default is true.
+ ```aconf
+ <object_metadata>
+   key KEY_1
+   value VALUE_1
+ </object_metadata>

- **acl**
+ <object_metadata>
+   key KEY_2
+   value VALUE_2
+ </object_metadata>
+ ```

- Permission for the object in GCS. Acceptable values are:
+ ## Examples

- * `auth_read` - File owner gets OWNER access, and allAuthenticatedUsers get READER access.
- * `owner_full` - File owner gets OWNER access, and project team owners get OWNER access.
- * `owner_read` - File owner gets OWNER access, and project team owners get READER access.
- * `private` - File owner gets OWNER access.
- * `project_private` - File owner gets OWNER access, and project team members get access according to their roles.
- * `public_read` - File owner gets OWNER access, and allUsers get READER access.
+ ### Partition by tag and date

- Default is nil (bucket default object ACL). See also [official document](https://cloud.google.com/storage/docs/access-control/lists).
+ ```aconf
+ <match app.**>
+   @type gcs

- **storage_class**
+   project YOUR_PROJECT
+   bucket YOUR_GCS_BUCKET_NAME
+   object_key_format %{path}%{time_slice}/%{hostname}_%{index}.%{file_extension}
+   path logs/${tag}/

- Storage class of the file. Acceptable values are:
+   <buffer tag,time>
+     @type file
+     path /var/log/fluent/gcs
+     timekey 1d
+     timekey_wait 10m
+     timekey_use_utc true
+   </buffer>

- * `dra` - Durable Reduced Availability
- * `nearline` - Nearline Storage
- * `coldline` - Coldline Storage
- * `multi_regional` - Multi-Regional Storage
- * `regional` - Regional Storage
- * `standard` - Standard Storage
+   <format>
+     @type json
+   </format>
+ </match>
+ ```

- Default is nil. See also [official document](https://cloud.google.com/storage/docs/storage-classes).
+ For the tag `app.web` on host `web1`, this writes objects such as `logs/app.web/20240101/web1_0.gz`.

- **encryption_key**
+ ### Fine-grained 1-minute partitions

- You can also choose to provide your own AES-256 key for server-side encryption. See also [Customer-supplied encryption keys](https://cloud.google.com/storage/docs/encryption#customer-supplied).
+ When `timekey` is under an hour, `%{time_slice}` automatically resolves to minute granularity (`%Y%m%d%H%M`).

- `encryption_key_sha256` will be calculated using encryption_key.
+ ```aconf
+ <match app.**>
+   @type gcs

- **overwrite**
+   bucket YOUR_GCS_BUCKET_NAME
+   path logs/

- Overwrite already existing path. Default is false, which raises an error
- if a GCS object of the same path already exists, or increment the
- `%{index}` placeholder until finding an absent path.
+   <buffer time>
+     @type file
+     path /var/log/fluent/gcs
+     timekey 1m # 1 minute partition
+     timekey_wait 10s # short wait for late events
+     timekey_use_utc true
+   </buffer>
+ </match>
+ ```

- **buffer_path (*required)**
+ This writes objects such as `logs/202401011230_0.gz`, one (or more) per minute.

- path prefix of the files to buffer logs.
+ ### Fast compression with the external gzip

- **time_slice_format**
+ ```aconf
+ <match app.**>
+   @type gcs

- Format of the time used as the file name. Default is '%Y%m%d'. Use
- '%Y%m%d%H' to split files hourly.
+   bucket YOUR_GCS_BUCKET_NAME
+   path logs/
+   store_as gzip_command
+   command_parameter -1

- **time_slice_wait**
+   <buffer time>
+     @type file
+     path /var/log/fluent/gcs
+     timekey 1h
+     timekey_wait 10m
+   </buffer>
+ </match>
+ ```

- The time to wait old logs. Default is 10 minutes. Specify larger value if
- old logs may reache.
+ Using the default `object_key_format`, this writes objects such as `logs/2024010112_0.gz`, one per hourly slice.

- **localtime**
+ ### Cost-optimized cold storage

- Use Local time instead of UTC.
+ ```aconf
+ <match archive.**>
+   @type gcs

- **utc**
+   bucket YOUR_GCS_BUCKET_NAME
+   path archive/
+   storage_class coldline
+   acl project_private
+
+   <buffer time>
+     @type file
+     path /var/log/fluent/gcs-archive
+     timekey 1d
+     timekey_wait 1h
+   </buffer>
+ </match>
+ ```

- Use UTC instead of local time.
+ Using the default `object_key_format`, this writes objects such as `archive/20240101_0.gz`, one per day, stored in the Coldline class.

- And see [official Time Sliced Output article](http://docs.fluentd.org/articles/output-plugin-overview#time-sliced-output-parameters)
+ ### Write without the get permission (blind_write)

- **blind_write**
+ `blind_write true` skips the existence check, so the `storage.objects.get` permission is not required. Because `%{index}` does not work in this mode, include `%{hex_random}` or `%{uuid_flush}` to keep keys unique.

- Doesn't check if an object exists in GCS before writing. Default is false.
+ ```aconf
+ <match app.**>
+   @type gcs

- Allows to avoid granting of `storage.objects.get` permission.
+   bucket YOUR_GCS_BUCKET_NAME
+   path logs/
+   object_key_format %{path}%{time_slice}_%{hex_random}.%{file_extension}
+   blind_write true

- Warning! If the object exists and `storage.objects.delete` permission is not
- granted, it will result in an unrecoverable error. Usage of `%{hex_random}` is
- recommended.
+   <buffer time>
+     @type file
+     path /var/log/fluent/gcs
+     timekey 1h
+     timekey_wait 10m
+     timekey_use_utc true
+   </buffer>
+ </match>
+ ```

- ### ObjectMetadata
+ This writes objects such as `logs/2024010112_a1b2.gz`, with a per-chunk random suffix instead of an incrementing index.

- User provided web-safe keys and arbitrary string values that will returned with requests for the file as "x-goog-meta-" response headers.
+ ## Development

+ ```shell
+ bundle install
+ bundle exec rake test # run the test suite
+ bundle exec bundler-audit check --update # audit dependencies
+ gem build fluent-plugin-gcs.gemspec # build the gem
  ```
- <match *>
-   @type gcs

-   <object_metadata>
-     key KEY_DATA_1
-     value VALUE_DATA_1
-   </object_metadata>
+ ## Author

-   <object_metadata>
-     key KEY_DATA_2
-     value VALUE_DATA_2
-   </object_metadata>
- </match>
- ```
+ Daichi HIRATA
+
+ ## License
+
+ Apache License 2.0. See [LICENSE.txt](LICENSE.txt).
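
Note on the README change above: the new "Object placement" text says the plugin increments `%{index}` until it finds an unused key. The following is a minimal Ruby sketch of that behaviour, not the plugin's implementation. The helpers `expand_object_key` and `next_free_key` are hypothetical names, and an in-memory `Set` stands in for the existence check against GCS.

```ruby
require "set"

# Hypothetical helper (not plugin code): substitute %{name} placeholders
# in a key template with values from a hash.
def expand_object_key(template, values)
  template.gsub(/%\{(\w+)\}/) do
    name = Regexp.last_match(1)
    values.fetch(name) { raise KeyError, "unknown placeholder: #{name}" }.to_s
  end
end

# Hypothetical helper: bump %{index} from 0 until the expanded key is not
# in `existing`, which stands in for the plugin's lookup against the bucket.
def next_free_key(template, values, existing)
  index = 0
  loop do
    key = expand_object_key(template, values.merge("index" => index))
    return key unless existing.include?(key)
    index += 1
  end
end

# The default object_key_format quoted in the README.
DEFAULT_FORMAT = "%{path}%{time_slice}_%{index}.%{file_extension}"

values = {
  "path" => "logs/",
  "time_slice" => Time.utc(2024, 1, 1, 12).strftime("%Y%m%d%H"), # hourly slice
  "file_extension" => "gz",
}

# One object already exists for this slice, so index 0 is taken.
puts next_free_key(DEFAULT_FORMAT, values, Set["logs/2024010112_0.gz"])
# prints logs/2024010112_1.gz
```

With `blind_write true` there is no such lookup, which is why the README recommends `%{hex_random}` or `%{uuid_flush}` in the template instead of relying on `%{index}`.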
@@ -1,7 +1,4 @@
- # coding: utf-8
- lib = File.expand_path('../lib', __FILE__)
- $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
- require 'fluent/plugin/gcs/version'
+ require_relative "lib/fluent/plugin/gcs/version"

  Gem::Specification.new do |spec|
    spec.name = "fluent-plugin-gcs"
@@ -9,17 +6,25 @@ Gem::Specification.new do |spec|
    spec.authors = ["Daichi HIRATA"]
    spec.email = ["hirata.daichi@gmail.com"]
    spec.summary = "Google Cloud Storage output plugin for Fluentd"
-   spec.description = "Google Cloud Storage output plugin for Fluentd"
+   spec.description = "Fluentd output plugin that buffers events and uploads them to Google Cloud Storage as gzip, json, or text objects."
    spec.homepage = "https://github.com/daichirata/fluent-plugin-gcs"
    spec.license = "Apache-2.0"

-   spec.files = `git ls-files -z`.split("\x0").reject do |f|
+   spec.required_ruby_version = ">= 3.3"
+
+   spec.metadata = {
+     "source_code_uri" => spec.homepage,
+     "bug_tracker_uri" => "#{spec.homepage}/issues",
+     "rubygems_mfa_required" => "true",
+   }
+
+   spec.files = `git ls-files -z`.split("\x0").reject do |f|
      f.match(%r{^(test|spec|features)/})
    end
    spec.bindir = "exe"
    spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
    spec.require_paths = ["lib"]

-   spec.add_runtime_dependency "fluentd", [">= 0.14.22", "< 2"]
+   spec.add_runtime_dependency "fluentd", ">= 1.0", "< 3"
    spec.add_runtime_dependency "google-cloud-storage", "~> 1.1"
  end
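
Note on the gemspec change above: the `fluentd` constraint moves from `>= 0.14.22, < 2` to `>= 1.0, < 3`. A small sketch using RubyGems' own `Gem::Requirement` shows what that means for a few version strings. The versions tested here are illustrative inputs only, not a statement about which fluentd releases exist.

```ruby
require "rubygems"

# The two constraints exactly as they appear in the gemspec diff.
OLD_FLUENTD_REQ = Gem::Requirement.new(">= 0.14.22", "< 2")
NEW_FLUENTD_REQ = Gem::Requirement.new(">= 1.0", "< 3")

# Illustrative version strings chosen to sit on either side of the bounds.
%w[0.14.22 1.16.0 2.0.0].each do |v|
  version = Gem::Version.new(v)
  old_ok = OLD_FLUENTD_REQ.satisfied_by?(version)
  new_ok = NEW_FLUENTD_REQ.satisfied_by?(version)
  puts "fluentd #{v}: old constraint #{old_ok}, new constraint #{new_ok}"
end
```

The 0.14.x series drops out of range and the 2.x series comes into range, which matches the removal of the `compat_parameters` helper and the v0.12-style README example elsewhere in this diff.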
@@ -1,12 +1,26 @@
1
1
  require "tempfile"
2
2
  require "zlib"
3
+ require "open3"
4
+ require "shellwords"
3
5
 
4
6
  module Fluent
5
7
  module GCS
6
- def self.discovered_object_creator(store_as, transcoding: nil)
8
+ def self.discovered_object_creator(store_as, transcoding: nil, command_parameter: nil, log: nil)
7
9
  case store_as
8
10
  when :gzip
9
11
  Fluent::GCS::GZipObjectCreator.new(transcoding)
12
+ when :gzip_command
13
+ Fluent::GCS::GZipCommandObjectCreator.new(
14
+ transcoding: transcoding,
15
+ command_parameter: command_parameter,
16
+ log: log
17
+ )
18
+ when :lzo
19
+ Fluent::GCS::LZOObjectCreator.new(command_parameter: command_parameter, log: log)
20
+ when :lzma2
21
+ Fluent::GCS::LZMA2ObjectCreator.new(command_parameter: command_parameter, log: log)
22
+ when :zstd
23
+ Fluent::GCS::ZstdObjectCreator.new(command_parameter: command_parameter, log: log)
10
24
  when :json
11
25
  Fluent::GCS::JSONObjectCreator.new
12
26
  when :text
@@ -41,6 +55,30 @@ module Fluent
41
55
  end
42
56
  end
43
57
 
58
+ class TextObjectCreator < ObjectCreator
59
+ def content_type
60
+ "text/plain"
61
+ end
62
+
63
+ def file_extension
64
+ "txt"
65
+ end
66
+
67
+ def write(chunk, io)
68
+ chunk.write_to(io)
69
+ end
70
+ end
71
+
72
+ class JSONObjectCreator < TextObjectCreator
73
+ def content_type
74
+ "application/json"
75
+ end
76
+
77
+ def file_extension
78
+ "json"
79
+ end
80
+ end
81
+
44
82
  class GZipObjectCreator < ObjectCreator
45
83
  def initialize(transcoding)
46
84
  @transcoding = transcoding
@@ -65,27 +103,153 @@ module Fluent
65
103
  end
66
104
  end
67
105
 
68
- class TextObjectCreator < ObjectCreator
106
+ class CommandObjectCreator < ObjectCreator
107
+ def initialize(command_parameter: nil, log: nil)
108
+ @command_parameter = command_parameter
109
+ @log = log
110
+ check_command
111
+ end
112
+
113
+ def write(chunk, io)
114
+ parameter = @command_parameter.nil? || @command_parameter.empty? ? default_parameter : @command_parameter
115
+ cmd = [command, *parameter.shellsplit, "-c"]
116
+ status = Open3.pipeline_w(cmd, out: io.path) do |stdin, wait_thrs|
117
+ chunk.write_to(stdin)
118
+ stdin.close
119
+ wait_thrs.last.value
120
+ end
121
+
122
+ handle_failure(chunk, io, status) unless status.success?
123
+ end
124
+
125
+ private
126
+
127
+ def command
128
+ raise NotImplementedError
129
+ end
130
+
131
+ def store_as
132
+ raise NotImplementedError
133
+ end
134
+
135
+ def default_parameter
136
+ ""
137
+ end
138
+
139
+ def handle_failure(chunk, io, status)
140
+ raise "failed to execute #{command} command. status = #{status}"
141
+ end
142
+
143
+ def check_command
144
+ Open3.capture3(command, "--version")
145
+ rescue Errno::ENOENT
146
+ raise Fluent::ConfigError, "'#{command}' utility must be in PATH for #{store_as} compression"
147
+ end
148
+ end
149
+
150
+ class GZipCommandObjectCreator < CommandObjectCreator
151
+ def initialize(transcoding: nil, command_parameter: nil, log: nil)
152
+ @transcoding = transcoding
153
+ super(command_parameter: command_parameter, log: log)
154
+ end
155
+
69
156
  def content_type
70
- "text/plain"
157
+ @transcoding ? "text/plain" : "application/gzip"
158
+ end
159
+
160
+ def content_encoding
161
+ @transcoding ? "gzip" : nil
71
162
  end
72
163
 
73
164
  def file_extension
74
- "txt"
165
+ "gz"
75
166
  end
76
167
 
77
- def write(chunk, io)
78
- chunk.write_to(io)
168
+ private
169
+
170
+ def command
171
+ "gzip"
172
+ end
173
+
174
+ def store_as
175
+ "gzip_command"
176
+ end
177
+
178
+ def handle_failure(chunk, io, status)
179
+ @log&.warn("failed to execute gzip command. Fallback to GzipWriter. status = #{status}")
180
+ io.truncate(0)
181
+ io.rewind
182
+ writer = Zlib::GzipWriter.new(io)
183
+ chunk.write_to(writer)
184
+ writer.finish
79
185
  end
80
186
  end
81
187
 
82
- class JSONObjectCreator < TextObjectCreator
188
+ class LZOObjectCreator < CommandObjectCreator
83
189
  def content_type
84
- "application/json"
190
+ "application/x-lzop"
85
191
  end
86
192
 
87
193
  def file_extension
88
- "json"
194
+ "lzo"
195
+ end
196
+
197
+ private
198
+
199
+ def command
200
+ "lzop"
201
+ end
202
+
203
+ def default_parameter
204
+ "-qf1"
205
+ end
206
+
207
+ def store_as
208
+ "lzo"
209
+ end
210
+ end
211
+
212
+ class LZMA2ObjectCreator < CommandObjectCreator
213
+ def content_type
214
+ "application/x-xz"
215
+ end
216
+
217
+ def file_extension
218
+ "xz"
219
+ end
220
+
221
+ private
222
+
223
+ def command
224
+ "xz"
225
+ end
226
+
227
+ def default_parameter
228
+ "-qf0"
229
+ end
230
+
231
+ def store_as
232
+ "lzma2"
233
+ end
234
+ end
235
+
236
+ class ZstdObjectCreator < CommandObjectCreator
237
+ def content_type
238
+ "application/x-zst"
239
+ end
240
+
241
+ def file_extension
242
+ "zst"
243
+ end
244
+
245
+ private
246
+
247
+ def command
248
+ "zstd"
249
+ end
250
+
251
+ def store_as
252
+ "zstd"
89
253
  end
90
254
  end
91
255
  end
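Review note on the object-creator changes above: the new `CommandObjectCreator#write` pipes the chunk into an external compressor with `Open3.pipeline_w`, and `GZipCommandObjectCreator#handle_failure` falls back to `Zlib::GzipWriter` when that command exits non-zero. Below is a minimal standalone sketch of that pattern, not the plugin's code. The names `compress_with_fallback` and `gzip_in_process` are hypothetical, a plain string stands in for the Fluentd chunk, and the rescue of `Errno::ENOENT` / `Errno::EPIPE` is an assumption added here (the diff instead checks for the command in `check_command`).

```ruby
require "open3"
require "tempfile"
require "zlib"

# Hypothetical helper (not part of the plugin): rewrite the file behind `io`
# as gzip data using only Ruby's zlib binding.
def gzip_in_process(data, io)
  io.truncate(0)          # discard partial output left by a failed command
  io.rewind
  writer = Zlib::GzipWriter.new(io)
  writer.write(data)
  writer.finish           # writes the gzip trailer, leaves `io` open
  io.flush
  :zlib
end

# Hypothetical helper: pipe `data` through `<command> -c`, redirecting the
# command's stdout to io.path. Falls back to zlib if the command is missing,
# closes its stdin early, or exits with a non-zero status.
def compress_with_fallback(data, io, command: "gzip")
  status = Open3.pipeline_w([command, "-c"], out: io.path) do |stdin, wait_thrs|
    stdin.write(data)
    stdin.close
    wait_thrs.last.value  # Process::Status of the last command in the pipeline
  end
  return :command if status.success?

  gzip_in_process(data, io)
rescue Errno::ENOENT, Errno::EPIPE
  # Assumption of this sketch: treat a missing or crashed command like a failure.
  gzip_in_process(data, io)
end
```

Calling `compress_with_fallback(payload, io)` on a writable temp file returns `:command` when the external tool succeeded and `:zlib` when the fallback ran; either way the file should contain a gzip stream readable with `Zlib::GzipReader`.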
@@ -1,5 +1,5 @@
1
1
  module Fluent
2
2
  module GCSPlugin
3
- VERSION = "0.4.4"
3
+ VERSION = "0.5.0"
4
4
  end
5
5
  end
@@ -10,7 +10,7 @@ module Fluent::Plugin
10
10
  class GCSOutput < Output
11
11
  Fluent::Plugin.register_output("gcs", self)
12
12
 
13
- helpers :compat_parameters, :formatter, :inject
13
+ helpers :formatter, :inject
14
14
 
15
15
  def initialize
16
16
  super
@@ -34,18 +34,20 @@ module Fluent::Plugin
34
34
  desc: "Format of GCS object keys"
35
35
  config_param :path, :string, default: "",
36
36
  desc: "Path prefix of the files on GCS"
37
- config_param :store_as, :enum, list: %i(gzip json text), default: :gzip,
37
+ config_param :store_as, :enum, list: %i(gzip gzip_command lzo lzma2 zstd json text), default: :gzip,
38
38
  desc: "Archive format on GCS"
39
39
  config_param :transcoding, :bool, default: false,
40
40
  desc: "Enable the decompressive form of transcoding"
41
+ config_param :gzip_command_parameter, :string, default: nil, deprecated: "Use command_parameter instead.",
42
+ desc: "Deprecated alias of command_parameter for the gzip_command compressor"
43
+ config_param :command_parameter, :string, default: nil,
44
+ desc: "Override the default arguments for the gzip_command / lzo / lzma2 / zstd compression command"
41
45
  config_param :auto_create_bucket, :bool, default: true,
42
46
  desc: "Create GCS bucket if it does not exists"
43
47
  config_param :hex_random_length, :integer, default: 4,
44
48
  desc: "Max length of `%{hex_random}` placeholder(4-16)"
45
49
  config_param :overwrite, :bool, default: false,
46
50
  desc: "Overwrite already existing path"
47
- config_param :format, :string, default: "out_file",
48
- desc: "Change one line format in the GCS object"
49
51
  config_param :acl, :enum, list: %i(auth_read owner_full owner_read private project_private public_read), default: nil,
50
52
  desc: "Permission for the object in GCS"
51
53
  config_param :storage_class, :enum, list: %i(dra nearline coldline multi_regional regional standard), default: nil,
@@ -59,10 +61,8 @@ module Fluent::Plugin
59
61
  config_param :value, :string, default: ""
60
62
  end
61
63
 
62
- DEFAULT_FORMAT_TYPE = "out_file"
63
-
64
64
  config_section :format do
65
- config_set_default :@type, DEFAULT_FORMAT_TYPE
65
+ config_set_default :@type, "out_file"
66
66
  end
67
67
 
68
68
  config_section :buffer do
@@ -73,7 +73,6 @@ module Fluent::Plugin
73
73
  MAX_HEX_RANDOM_LENGTH = 32
74
74
 
75
75
  def configure(conf)
76
- compat_parameters_convert(conf, :buffer, :formatter, :inject)
77
76
  super
78
77
 
79
78
  if @hex_random_length > MAX_HEX_RANDOM_LENGTH
@@ -91,11 +90,17 @@ module Fluent::Plugin
91
90
 
92
91
  @formatter = formatter_create
93
92
 
94
- @object_creator = Fluent::GCS.discovered_object_creator(@store_as, transcoding: @transcoding)
95
- # For backward compatibility
96
- # TODO: Remove time_slice_format when end of support compat_parameters
97
- @configured_time_slice_format = conf['time_slice_format']
98
- @time_slice_with_tz = Fluent::Timezone.formatter(@timekey_zone, @configured_time_slice_format || timekey_to_timeformat(@buffer_config['timekey']))
93
+ # gzip_command_parameter is a deprecated alias of command_parameter; the
94
+ # explicit command_parameter wins when both are set.
95
+ command_parameter = @command_parameter || @gzip_command_parameter
96
+
97
+ @object_creator = Fluent::GCS.discovered_object_creator(
98
+ @store_as,
99
+ transcoding: @transcoding,
100
+ command_parameter: command_parameter,
101
+ log: log
102
+ )
103
+ @time_slice_with_tz = Fluent::Timezone.formatter(@timekey_zone, timekey_to_timeformat(@buffer_config['timekey']))
99
104
 
100
105
  if @credentials_json
101
106
  @credentials = @credentials_json
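Review note on the parameter handling above: `configure` prefers `command_parameter` over the deprecated `gzip_command_parameter`, and `CommandObjectCreator#write` then substitutes the creator's `default_parameter` when the result is nil or empty before splitting it with `shellsplit`. The sketch below replays those two steps outside the plugin; `resolve_command_parameter` and `build_command` are hypothetical names introduced only for illustration.

```ruby
require "shellwords"

# Hypothetical helper mirroring the precedence in `configure`:
# the explicit command_parameter wins over the deprecated alias.
def resolve_command_parameter(command_parameter, gzip_command_parameter)
  command_parameter || gzip_command_parameter
end

# Hypothetical helper mirroring how `write` assembles the argv:
# nil or empty parameters fall back to the creator's default, the
# string is split shell-style, and "-c" (write to stdout) is appended.
def build_command(command, parameter, default_parameter)
  effective = parameter.nil? || parameter.empty? ? default_parameter : parameter
  [command, *effective.shellsplit, "-c"]
end
```

With the defaults from the diff, `build_command("xz", nil, "-qf0")` yields `["xz", "-qf0", "-c"]`, while a user-supplied value such as `"-9 --rsyncable"` replaces the default entirely rather than being merged with it.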
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: fluent-plugin-gcs
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.4.4
4
+ version: 0.5.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Daichi HIRATA
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2024-02-03 00:00:00.000000000 Z
11
+ date: 2026-05-27 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: fluentd
@@ -16,20 +16,20 @@ dependencies:
16
16
  requirements:
17
17
  - - ">="
18
18
  - !ruby/object:Gem::Version
19
- version: 0.14.22
19
+ version: '1.0'
20
20
  - - "<"
21
21
  - !ruby/object:Gem::Version
22
- version: '2'
22
+ version: '3'
23
23
  type: :runtime
24
24
  prerelease: false
25
25
  version_requirements: !ruby/object:Gem::Requirement
26
26
  requirements:
27
27
  - - ">="
28
28
  - !ruby/object:Gem::Version
29
- version: 0.14.22
29
+ version: '1.0'
30
30
  - - "<"
31
31
  - !ruby/object:Gem::Version
32
- version: '2'
32
+ version: '3'
33
33
  - !ruby/object:Gem::Dependency
34
34
  name: google-cloud-storage
35
35
  requirement: !ruby/object:Gem::Requirement
@@ -44,16 +44,17 @@ dependencies:
44
44
  - - "~>"
45
45
  - !ruby/object:Gem::Version
46
46
  version: '1.1'
47
- description: Google Cloud Storage output plugin for Fluentd
47
+ description: Fluentd output plugin that buffers events and uploads them to Google
48
+ Cloud Storage as gzip, json, or text objects.
48
49
  email:
49
50
  - hirata.daichi@gmail.com
50
51
  executables: []
51
52
  extensions: []
52
53
  extra_rdoc_files: []
53
54
  files:
54
- - ".github/workflows/test.yaml"
55
+ - ".github/dependabot.yml"
56
+ - ".github/workflows/test.yml"
55
57
  - ".gitignore"
56
- - CHANGELOG.md
57
58
  - Gemfile
58
59
  - LICENSE.txt
59
60
  - README.md
@@ -67,7 +68,10 @@ files:
67
68
  homepage: https://github.com/daichirata/fluent-plugin-gcs
68
69
  licenses:
69
70
  - Apache-2.0
70
- metadata: {}
71
+ metadata:
72
+ source_code_uri: https://github.com/daichirata/fluent-plugin-gcs
73
+ bug_tracker_uri: https://github.com/daichirata/fluent-plugin-gcs/issues
74
+ rubygems_mfa_required: 'true'
71
75
  post_install_message:
72
76
  rdoc_options: []
73
77
  require_paths:
@@ -76,14 +80,14 @@ required_ruby_version: !ruby/object:Gem::Requirement
76
80
  requirements:
77
81
  - - ">="
78
82
  - !ruby/object:Gem::Version
79
- version: '0'
83
+ version: '3.3'
80
84
  required_rubygems_version: !ruby/object:Gem::Requirement
81
85
  requirements:
82
86
  - - ">="
83
87
  - !ruby/object:Gem::Version
84
88
  version: '0'
85
89
  requirements: []
86
- rubygems_version: 3.3.7
90
+ rubygems_version: 3.5.22
87
91
  signing_key:
88
92
  specification_version: 4
89
93
  summary: Google Cloud Storage output plugin for Fluentd
@@ -1,24 +0,0 @@
1
- name: Test
2
-
3
- on:
4
- push:
5
- branches: [ master ]
6
- pull_request:
7
- branches: [ master ]
8
-
9
- jobs:
10
- test:
11
- runs-on: ubuntu-latest
12
- strategy:
13
- fail-fast: false
14
- matrix:
15
- ruby-version: ['3.1', '3.2']
16
- steps:
17
- - uses: actions/checkout@v4
18
- - name: Set up Ruby
19
- uses: ruby/setup-ruby@v1
20
- with:
21
- ruby-version: ${{ matrix.ruby-version }}
22
- bundler-cache: true
23
- - name: Run tests
24
- run: bundle exec rake test
data/CHANGELOG.md DELETED
@@ -1,46 +0,0 @@
1
- ## [Unreleased]
2
-
3
- New features / Enhancements
4
-
5
- ## [0.4.2] - 2022/08/16
6
-
7
- Bug fixes
8
-
9
- - [Fix automatic conversion from a hash to keyword arguments](https://github.com/daichirata/fluent-plugin-gcs/pull/22)
10
-
11
- ## [0.4.1] - 2020/04/17
12
-
13
- New features
14
- - [Support blind write to GSC](https://github.com/daichirata/fluent-plugin-gcs/pull/14)
15
-
16
- ## [0.4.0] - 2019/04/01
17
-
18
- New features / Enhancements
19
-
20
- - [Support v0.14 (by @cosmo0920)](https://github.com/daichirata/fluent-plugin-gcs/pull/6)
21
-
22
- ## [0.3.0] - 2017/02/28
23
-
24
- New features / Enhancements
25
-
26
- - [Add support for setting a File's storage_class on file creation](https://github.com/daichirata/fluent-plugin-gcs/pull/4)
27
- - see also https://cloud.google.com/storage/docs/storage-classes
28
-
29
- ## [0.2.0] - 2017/01/16
30
-
31
- Bug fixes
32
-
33
- - [Remove encryption_key_sha256 parameter.](https://github.com/daichirata/fluent-plugin-gcs/pull/2)
34
- - see also. https://github.com/GoogleCloudPlatform/google-cloud-ruby/blob/master/google-cloud-storage/CHANGELOG.md#0230--2016-12-8
35
-
36
- ## [0.1.1] - 2016/11/28
37
-
38
- New features / Enhancements
39
-
40
- - Add support for `%{hostname}` of object_key_format
41
-
42
- [Unreleased]: https://github.com/daichirata/fluent-plugin-gcs/compare/v0.4.0...HEAD
43
- [0.4.0]: https://github.com/daichirata/fluent-plugin-gcs/compare/v0.3.0...v0.4.0
44
- [0.3.0]: https://github.com/daichirata/fluent-plugin-gcs/compare/v0.2.0...v0.3.0
45
- [0.2.0]: https://github.com/daichirata/fluent-plugin-gcs/compare/v0.1.0...v0.2.0
46
- [0.1.1]: https://github.com/daichirata/fluent-plugin-gcs/compare/v0.1.0...v0.1.1