logstash-output-vespa_feed 0.8.0 → 1.0.0
- checksums.yaml +4 -4
- data/README.md +114 -32
- data/VERSION +1 -1
- data/docs/index.asciidoc +147 -0
- data/lib/logstash-output-vespa_feed_jars.rb +1 -1
- data/vendor/jar-dependencies/org/logstashplugins/logstash-output-vespa_feed/{0.8.0/logstash-output-vespa_feed-0.8.0.jar → 1.0.0/logstash-output-vespa_feed-1.0.0.jar} +0 -0
- metadata +3 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: b3821846f444b171f16fdd64199d52bd4a6e9e16a0fcaeb70acc4e0e28eb72c1
+  data.tar.gz: 0e479e7f5e999ebd1d6099c85bbab3ea62259326ae31296e4e6f255d52007b52
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 7bacdfd3c56daea32a1501ac69da32da04a98dd396819d7b257c3e046faad491b7ce4ae225f699e2afa8858c56b0f08b46e1b1c9a8c48d69c2b6b8b4d5c672d1
+  data.tar.gz: 9d9f5cbc881d448298c2f60e6667c0e4aa9b073744d57912ac7ddba4a91954de4ad2758a658ebc0bb87404e830096ec8e715346c41b0fa1c9524926b186e83bb
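The new checksums can be compared against a downloaded copy of the gem. A minimal sketch in Python, assuming the standard RubyGems layout in which a `.gem` file is an uncompressed tar archive holding `metadata.gz` and `data.tar.gz`; the helper name `gem_member_digests` is hypothetical and not part of RubyGems or of this plugin.

```python
import hashlib
import io
import tarfile


def gem_member_digests(gem_file, algorithm="sha256"):
    """Return {member_name: hex_digest} for each regular file in a .gem archive.

    gem_file is a binary file object (hypothetical helper; assumes the .gem
    is a plain, uncompressed tar archive).
    """
    digests = {}
    with tarfile.open(fileobj=gem_file, mode="r:") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue
            data = archive.extractfile(member).read()
            digests[member.name] = hashlib.new(algorithm, data).hexdigest()
    return digests
```

Called on an open `logstash-output-vespa_feed-1.0.0.gem` file, the `metadata.gz` and `data.tar.gz` entries should match the `SHA256` values added in this release if the archive is the published one. Passing `algorithm="sha512"` checks the second group.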
data/README.md
CHANGED
@@ -2,48 +2,99 @@
 
 Plugin for [Logstash](https://github.com/elastic/logstash) to write to [Vespa](https://vespa.ai). Apache 2.0 license.
 
-##
-
-
+## Table of Contents
+- [Quick start](#quick-start)
+- [Usage](#usage)
+  - [Mode 1: Generating an application package](#mode-1-generating-an-application-package)
+  - [Mode 2: Sending data to Vespa](#mode-2-sending-data-to-vespa)
+- [Development](#development)
+  - [Integration tests](#integration-tests)
+  - [Publishing the gem](#publishing-the-gem)
+
+## Quick start
+
+[Download and unpack/install Logstash](https://www.elastic.co/downloads/logstash), then install the plugin:
 ```
 bin/logstash-plugin install logstash-output-vespa_feed
 ```
 
-
-If you're developing the plugin, you'll want to do something like:
+Write a Logstash config file (a couple of examples are below), then start Logstash, pointing it to the config file:
 ```
-
-./gradlew gem
-# run tests
-./gradlew test
-# install it as a Logstash plugin
-/opt/logstash/bin/logstash-plugin install /path/to/logstash-output-vespa/logstash-output-vespa_feed-0.7.0.gem
-# profit
-/opt/logstash/bin/logstash
+bin/logstash -f logstash.conf
 ```
-Some more good info about Logstash Java plugins can be found [here](https://www.elastic.co/guide/en/logstash/current/java-output-plugin.html).
 
-
-are
+## Usage
+There are two modes of operation:
+1. **Generating an application package**: this is useful if you're just getting started with Vespa and you don't have e.g. a schema yet.
+2. **Sending data to Vespa**: once you have an application package deployed, you can send data to Vespa.
 
-###
-To run integration tests, you'll need to have a Vespa instance running with an app deployed that supports an "id" field. And Logstash installed.
+### Mode 1: generating an application package
 
-
+If you're just getting started with Vespa, you can use the `detect_schema` option to generate an application package that works with the data you're sending. That application package can be deployed to [Vespa Cloud](https://cloud.vespa.ai) or a local Vespa instance.
+
+To process the data, the `input` and `filter` sections are the same as when you send data to Vespa (see the [config example below](#mode-2-sending-data-to-vespa)). The output section is different. Here's an example for Vespa Cloud:
 
 ```
-
-
+output {
+  vespa_feed {
+    # enable detect schema mode
+    detect_schema => true
+    # to get copy-paste-able Vespa CLI commands
+    deploy_package => true
+
+    # Vespa Cloud application details
+    vespa_cloud_tenant => "my_tenant"
+    vespa_cloud_application => "my_application"
+
+    ### optional settings
+    # where to save the generated application package
+    application_package_dir => "/OS_TMPDIR/vespa_app"
+    # How long to wait (in empty batches) before showing the CLI commands for deploying the application
+    # This is useful if Logstash doesn't exit quickly and waits (e.g. when tailing a file)
+    # Otherwise, the plugin will show the CLI commands before Logstash exits
+    idle_batches => 10
+    # whether to generate mTLS certificates (defaults to true for Vespa Cloud, because you'll need them)
+    generate_mtls_certificates => true
+    # common name for the mTLS certificates
+    certificate_common_name => "cloud.vespa.logstash"
+    # validity days for the mTLS certificates
+    certificate_validity_days => 30
+    # where should the client certificate and key be saved
+    client_cert => "/OS_TMPDIR/vespa_app/security/clients.pem"
+    client_key => "/OS_TMPDIR/vespa_app/data-plane-private-key.pem"
+  }
+}
 ```
 
-
+For self-hosted Vespa, options are slightly different:
+```
+output {
+  vespa_feed {
+    # enable detect schema mode
+    detect_schema => true
+    # whether to actually deploy the application package
+    deploy_package => true
+
+    # config server endpoint (derived from vespa_url, which defaults to http://localhost:8080)
+    # used for deploying the application package, if deploy_package=true
+    config_server => "http://localhost:19071"
+
+    ### same optional settings as for Vespa Cloud
+    ### exception: generate_mtls_certificates defaults to false
+  }
+}
+```
 
-
-
+In the end, you should have an application package in `application_package_dir`, which defaults to your OS's temp directory + `vespa_app`. We encourage you to check it out and change it as needed. If Logstash didn't already deploy it, you can do so with the [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html):
+```
+cd /path/to/application_package
+### show deployment logs for up to 15 minutes
+vespa deploy --wait 900
+```
 
-
+### Mode 2: sending data to Vespa
 
-Some more Logstash config examples can be found [in this blog post](https://blog.vespa.ai/logstash-vespa-tutorials/), but here's one with all the output
+Some more Logstash config examples can be found [in this blog post](https://blog.vespa.ai/logstash-vespa-tutorials/), but here's one with all the relevant output options:
 
 ```
 # read stuff
@@ -94,16 +145,20 @@ output {
     vespa_url => "http://localhost:8080"
 
     # for HTTPS URLS (e.g. Vespa Cloud), you may want to provide a certificate and key for mTLS authentication
-
+    # the defaults are relative to application_package_dir (see the detect_schema section above)
+    client_cert => "/OS_TMPDIR/vespa_app/security/clients.pem"
     # make sure the key isn't password-protected
     # if it is, you can create a new key without a password like this:
     # openssl rsa -in myapp_key_with_pass.pem -out myapp_key.pem
-    client_key => "/
+    client_key => "/OS_TMPDIR/vespa_app/data-plane-private-key.pem"
+
+    # for Vespa Cloud, you can use an auth token instead of mTLS certificates
+    auth_token => "vespa_cloud_TOKEN_GOES_HERE"
 
     # namespace could be static or in the %{field} format, picking from a field in the document
-    namespace => "
+    namespace => "defaults_to_the_document_type_value"
     # similarly, doc type could be static or in the %{field} format
-    document_type => "
+    document_type => "doctype"
 
     # operation can be "put", "update", "remove" or dynamic (in the %{field} format)
     operation => "put"
@@ -155,7 +210,34 @@ output {
 }
 ```
 
-
+## Development
+If you're developing the plugin, you'll want to do something like:
 ```
-
+# build the gem
+./gradlew gem
+# run tests
+./gradlew test
+# install it as a Logstash plugin
+/opt/logstash/bin/logstash-plugin install /path/to/logstash-output-vespa/logstash-output-vespa_feed-1.0.0.gem
+# profit
+/opt/logstash/bin/logstash
+```
+Some more good info about Logstash Java plugins can be found [here](https://www.elastic.co/guide/en/logstash/current/java-output-plugin.html).
+
+It looks like the JVM options from [here](https://github.com/logstash-plugins/.ci/blob/main/dockerjdk17.env)
+are useful to make JRuby's `bundle install` work.
+
+### Integration tests
+To run integration tests, you'll need to have a Vespa instance running with an app deployed that supports an "id" field. And Logstash installed.
+
+Check out the `integration-test` directory for more information.
+
 ```
+cd integration-test
+./run_tests.sh
+```
+
+### Publishing the gem
+
+Note to self: for some reason, `bundle exec rake publish_gem` fails, but `gem push logstash-output-vespa_feed-$VERSION.gem`
+does the trick.
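The README changes describe two derived defaults: `config_server` reuses the host of `vespa_url` with port 19071, and `application_package_dir` is the OS temp directory plus `vespa_app`. A rough Python sketch of that logic follows. The function names are hypothetical, and keeping the scheme of `vespa_url` is an assumption; the plugin's actual Java implementation is not shown in this diff.

```python
import os
import tempfile
from urllib.parse import urlsplit

# Port used for the config server in the README example.
CONFIG_SERVER_PORT = 19071


def default_config_server(vespa_url="http://localhost:8080"):
    """Derive a config server URL from vespa_url (hypothetical helper).

    Assumption: the scheme and host are kept and only the port is swapped.
    """
    parts = urlsplit(vespa_url)
    return f"{parts.scheme}://{parts.hostname}:{CONFIG_SERVER_PORT}"


def default_application_package_dir():
    """Return the OS temp directory joined with 'vespa_app' (hypothetical helper)."""
    return os.path.join(tempfile.gettempdir(), "vespa_app")
```

With the documented default `vespa_url`, the first helper yields the same `http://localhost:19071` value that the self-hosted example sets explicitly.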
data/VERSION
CHANGED
@@ -1 +1 @@
-0.
+1.0.0
data/docs/index.asciidoc
CHANGED
@@ -51,6 +51,26 @@ Writes documents to Vespa.
 | <<plugins-{type}s-{plugin}-max_streams>> |<<number,number>>|No
 | <<plugins-{type}s-{plugin}-operation_timeout>> |<<number,number>>|No
 | <<plugins-{type}s-{plugin}-grace_period>> |<<number,number>>|No
+| <<plugins-{type}s-{plugin}-doom_period>> |<<number,number>>|No
+| <<plugins-{type}s-{plugin}-auth_token>> |<<string,string>>|No
+| <<plugins-{type}s-{plugin}-enable_dlq>> |<<boolean,boolean>>|No
+| <<plugins-{type}s-{plugin}-dlq_path>> |<<string,string>>|No
+| <<plugins-{type}s-{plugin}-max_queue_size>> |<<number,number>>|No
+| <<plugins-{type}s-{plugin}-max_segment_size>> |<<number,number>>|No
+| <<plugins-{type}s-{plugin}-flush_interval>> |<<number,number>>|No
+| <<plugins-{type}s-{plugin}-detect_schema>> |<<boolean,boolean>>|No
+| <<plugins-{type}s-{plugin}-application_package_dir>> |<<string,string>>|No
+| <<plugins-{type}s-{plugin}-deploy_package>> |<<boolean,boolean>>|No
+| <<plugins-{type}s-{plugin}-idle_batches>> |<<number,number>>|No
+| <<plugins-{type}s-{plugin}-config_server>> |<<string,string>>|No
+| <<plugins-{type}s-{plugin}-generate_mtls_certificates>> |<<boolean,boolean>>|No
+| <<plugins-{type}s-{plugin}-certificate_common_name>> |<<string,string>>|No
+| <<plugins-{type}s-{plugin}-certificate_validity_days>> |<<number,number>>|No
+| <<plugins-{type}s-{plugin}-type_mappings_file>> |<<string,string>>|No
+| <<plugins-{type}s-{plugin}-type_conflict_resolution_file>> |<<string,string>>|No
+| <<plugins-{type}s-{plugin}-vespa_cloud_tenant>> |<<string,string>>|No
+| <<plugins-{type}s-{plugin}-vespa_cloud_application>> |<<string,string>>|No
+| <<plugins-{type}s-{plugin}-vespa_cloud_instance>> |<<string,string>>|No
 |=======================================================================
 
 [id="plugins-{type}s-{plugin}-vespa_url"]
@@ -198,6 +218,15 @@ After this time (seconds), the circuit breaker will be half-open:
 it will ping the endpoint to see if it's back,
 then resume sending requests when it's back.
 
+[id="plugins-{type}s-{plugin}-doom_period"]
+===== `doom_period`
+
+* Value type is <<number,number>>
+* Default value is `60`
+
+After this time (seconds), if the connection is still broken, the connection will close
+and stop sending requests until the Vespa service comes back up.
+
 [id="plugins-{type}s-{plugin}-enable_dlq"]
 ===== `enable_dlq`
 
@@ -239,6 +268,124 @@ Maximum size of each Dead Letter Queue segment file in bytes.
 
 How often to commit the Dead Letter Queue to disk, in milliseconds.
 
+[id="plugins-{type}s-{plugin}-detect_schema"]
+===== `detect_schema`
+
+* Value type is <<boolean,boolean>>
+* Default value is `false`
+
+Enable detect schema mode. This will not send documents to Vespa, but will generate an application package
+based on the incoming data. This is useful for quickly getting started with Vespa when you don't have
+a schema yet.
+
+[id="plugins-{type}s-{plugin}-application_package_dir"]
+===== `application_package_dir`
+
+* Value type is <<string,string>>
+* Default value is `/tmp/vespa_app` (system temporary directory)
+
+Directory where the generated application package will be stored. Only relevant if `detect_schema` is `true`.
+
+[id="plugins-{type}s-{plugin}-deploy_package"]
+===== `deploy_package`
+
+* Value type is <<boolean,boolean>>
+* Default value is `true`
+
+Whether to deploy the generated application package. Only relevant if `detect_schema` is `true`.
+
+[id="plugins-{type}s-{plugin}-idle_batches"]
+===== `idle_batches`
+
+* Value type is <<number,number>>
+* Default value is `10`
+
+Number of empty batches to wait before deploying the application package. Only relevant if `detect_schema` is `true`
+and `deploy_package` is `true`.
+
+[id="plugins-{type}s-{plugin}-config_server"]
+===== `config_server`
+
+* Value type is <<string,string>>
+* Default value is `nil` (uses the host from `vespa_url`, if defined, with port 19071)
+
+URL to the Vespa config server for deployment. Only relevant if `detect_schema` is `true` and `deploy_package` is `true`.
+
+[id="plugins-{type}s-{plugin}-generate_mtls_certificates"]
+===== `generate_mtls_certificates`
+
+* Value type is <<boolean,boolean>>
+* Default value is `false`
+
+Whether to generate self-signed mTLS certificates for the application package. Only relevant if `detect_schema` is `true`.
+Defaults to `true` if using Vespa Cloud, otherwise `false`.
+
+[id="plugins-{type}s-{plugin}-certificate_common_name"]
+===== `certificate_common_name`
+
+* Value type is <<string,string>>
+* Default value is `cloud.vespa.logstash`
+
+Common name for the generated mTLS certificates. Only relevant if `detect_schema` is `true` and `generate_mtls_certificates` is `true`.
+
+[id="plugins-{type}s-{plugin}-certificate_validity_days"]
+===== `certificate_validity_days`
+
+* Value type is <<number,number>>
+* Default value is `30`
+
+Validity period in days for the generated mTLS certificates. Only relevant if `detect_schema` is `true` and `generate_mtls_certificates` is `true`.
+
+[id="plugins-{type}s-{plugin}-type_mappings_file"]
+===== `type_mappings_file`
+
+* Value type is <<string,string>>
+* Default value is `nil`
+
+Path to a YAML file containing custom type mappings for schema detection. Only relevant if `detect_schema` is `true`.
+
+[id="plugins-{type}s-{plugin}-type_conflict_resolution_file"]
+===== `type_conflict_resolution_file`
+
+* Value type is <<string,string>>
+* Default value is `nil`
+
+Path to a YAML file containing custom type conflict resolution rules. Only relevant if `detect_schema` is `true`.
+
+[id="plugins-{type}s-{plugin}-vespa_cloud_tenant"]
+===== `vespa_cloud_tenant`
+
+* Value type is <<string,string>>
+* Default value is `nil`
+
+Tenant name for Vespa Cloud deployment. Only relevant if `detect_schema` is `true` and deploying to Vespa Cloud.
+
+[id="plugins-{type}s-{plugin}-vespa_cloud_application"]
+===== `vespa_cloud_application`
+
+* Value type is <<string,string>>
+* Default value is `nil`
+
+Application name for Vespa Cloud deployment. Only relevant if `detect_schema` is `true` and deploying to Vespa Cloud.
+
+[id="plugins-{type}s-{plugin}-vespa_cloud_instance"]
+===== `vespa_cloud_instance`
+
+* Value type is <<string,string>>
+* Default value is `default`
+
+Instance name for Vespa Cloud deployment. Only relevant if `detect_schema` is `true` and deploying to Vespa Cloud.
+
+[id="plugins-{type}s-{plugin}-auth_token"]
+===== `auth_token`
+
+* Value type is <<string,string>>
+* Default value is `nil`
+
+Authentication token for Vespa Cloud. If provided, it will be sent as a Bearer token in the Authorization header.
+
+Note: This is mutually exclusive with client certificate authentication (`client_cert` and `client_key`). If both are provided, the client certificate will be used.
+
 // The full list of Value Types is here:
 // https://www.elastic.co/guide/en/logstash/current/configuration-file-structure.html
 
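The `auth_token` note in the added documentation implies a simple precedence rule: a client certificate wins over a token when both are configured. A minimal Python sketch of that rule, under the assumption that a certificate only counts when both `client_cert` and `client_key` are set; `select_auth` is a hypothetical helper, not the plugin's API.

```python
def select_auth(client_cert=None, client_key=None, auth_token=None):
    """Pick one authentication method (hypothetical helper).

    Returns (headers, cert_pair): extra HTTP headers to send and an
    optional (cert_path, key_path) tuple for mTLS.
    """
    # Documented rule: if both are provided, the client certificate is used.
    if client_cert and client_key:
        return {}, (client_cert, client_key)
    # Otherwise the token is sent as a Bearer token in the Authorization header.
    if auth_token:
        return {"Authorization": f"Bearer {auth_token}"}, None
    # Nothing configured, e.g. a local Vespa over plain HTTP.
    return {}, None
```

How the plugin behaves when only one of `client_cert` or `client_key` is set is not stated in the documentation above, so the sketch treats that case as "no certificate".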
Binary file
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: logstash-output-vespa_feed
 version: !ruby/object:Gem::Version
-  version: 0.
+  version: 1.0.0
 platform: ruby
 authors:
 - Radu Gheorghe
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2025-
+date: 2025-04-11 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   requirement: !ruby/object:Gem::Requirement
@@ -76,7 +76,7 @@ files:
 - lib/logstash-output-vespa_feed_jars.rb
 - lib/logstash/outputs/vespa_feed.rb
 - logstash-output-vespa_feed.gemspec
-- vendor/jar-dependencies/org/logstashplugins/logstash-output-vespa_feed/0.
+- vendor/jar-dependencies/org/logstashplugins/logstash-output-vespa_feed/1.0.0/logstash-output-vespa_feed-1.0.0.jar
 homepage: https://vespa.ai
 licenses:
 - Apache-2.0