fluent-plugin-parser-avro 0.1.0 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: accbc87b882c2e1e0999a9992da794fbde6be745f240e043e77e92d9964406ac
4
- data.tar.gz: 817d393998e57f8ba2352f405e565b0c59de34e3c92b7cbc38653453a621fd4a
3
+ metadata.gz: 9633a316c1de1e4cb83b487d99d5a3814f1cde2b96d20633cfe728c584108761
4
+ data.tar.gz: 6438fe5e248c98540602183a5828a7dbc596fe5000c4993fd428ee9223ed3077
5
5
  SHA512:
6
- metadata.gz: 2dda0e434a73d6e3d1380c3e96e5cfe1595c3041bfe4afa9b99b125bcc8b19c7177139860ea108c7523f418d5fd589bc5edcd673d3025f628c0b47f2ca664c31
7
- data.tar.gz: 4277e90990b38c9d72e5982f817d02c39f57592af64f2b03c54af9ec708b57908f3a13bd41de3ad2a5e82794b4612b555888ace4e10ec197d728e84b89d4718e
6
+ metadata.gz: 8b8894c17aaa33916ef3601634d36139f2817db875eba2c5e552a0309f2c5f3ce6578526c5aad54244cd51ce36cf30bc04f64ce576fcbd4ccb89fc7726d4102d
7
+ data.tar.gz: a9dbc9e144b1e0f7f2b585a51c0519e6c2953d7258b5bd13fd8d6246754f9956bffcee43777378692ea4ef2d8997d8f6648653d8928e2345e70c32e48ab849a6
data/LICENSE CHANGED
@@ -187,7 +187,7 @@
187
187
  same "printed page" as the copyright notice for easier
188
188
  identification within third-party archives.
189
189
 
190
- Copyright 2020- Hiroshi Hatake
190
+ Copyright [yyyy] [name of copyright owner]
191
191
 
192
192
  Licensed under the Apache License, Version 2.0 (the "License");
193
193
  you may not use this file except in compliance with the License.
data/README.md CHANGED
@@ -32,12 +32,22 @@ $ bundle
32
32
  * **schema_file** (string) (optional): avro schema file path.
33
33
  * **schema_json** (string) (optional): avro schema definition hash.
34
34
  * **schema_url** (string) (optional): avro schema remote URL.
35
- * **schema_registery_with_subject_url** (string) (optional): avro schema registry URL.
36
35
  * **schema_url_key** (string) (optional): avro schema registry or something's response schema key.
37
36
  * **writers_schema_file** (string) (optional): avro schema file path for writers definition.
38
37
  * **writers_schema_json** (string) (optional): avro schema definition hash for writers definition.
39
38
  * **readers_schema_file** (string) (optional): avro schema file path for readers definition.
40
39
  * **readers_schema_json** (string) (optional): avro schema definition hash for readers definition.
40
+ * **use_confluent_schema** (bool) (optional): Assume to use confluent schema. Confluent avro schema uses the first 5-bytes for magic byte (1 byte) and schema_id (4 bytes). This parameter specifies to skip reading the first 5-bytes or not.
41
+ * Default value: `true`.
42
+
43
+ ### \<confluent_registry\> section (optional) (single)
44
+
45
+ * **url** (string) (required): confluent schema registry URL.
46
+ * **subject** (string) (required): Specify schema subject.
47
+ * **schema_key** (string) (optional): Specify schema key on confluent registry REST API response.
48
+ * Default value: `schema`.
49
+ * **schema_version** (string) (optional): Specify schema version for the specified subject.
50
+ * Default value: `latest`.
41
51
 
42
52
  ### Configuration Example
43
53
 
@@ -48,7 +58,14 @@ $ bundle
48
58
  # schema_json { "namespace": "org.fluentd.parser.avro", "type": "record", "name": "User", "fields" : [{"name": "username", "type": "string"}, {"name": "age", "type": "int"}, {"name": "verified", "type": ["boolean", "null"], "default": false}]}
49
59
  # schema_url http(s)://[server fqdn]:[port]/subjects/[a great user's subject]/[the latest schema version]
50
60
  # schema_key schema
51
- # schema_registery_with_subject_url http(s)://[server fqdn]:[port]/subjects/[a great user's subject]/
61
+ # When using with confluent registry without <confluent_registry>, this parameter must be true.
62
+ # use_confluent_schema true
63
+ #<confluent_registry>
64
+ # url http://localhost:8081/
65
+ # subject your-awesome-subject
66
+ # # schema_key schema
67
+ # # schema_version 1
68
+ #</confluent_registry>
52
69
  </parse>
53
70
  ```
54
71
 
@@ -58,22 +75,44 @@ Confluent AVRO schema registry should respond with REST API.
58
75
 
59
76
  This plugin uses the following API:
60
77
 
61
- * [`GET /subjects/(string: subject)/versions`](https://docs.confluent.io/current/schema-registry/develop/api.html#get--subjects-(string-%20subject)-versions)
62
78
  * [`GET /subjects/(string: subject)/versions/(versionId: version)`](https://docs.confluent.io/current/schema-registry/develop/api.html#get--subjects-(string-%20subject)-versions)
63
79
 
64
- Users can specify a URL for retrieving the latest schemna information:
80
+ Users can specify a URL for retrieving the latest schema information with `<confluent_registry>`:
65
81
 
66
- e.g.) `http(s)://[server fqdn]:[port]/subjects/[a great user's subject]/`
82
+ e.g.)
83
+ ```
84
+ <confluent_registry>
85
+ url http://[confluent registry server ip]:[port]/
86
+ subject your-awesome-subject
87
+ # schema_key schema
88
+ # schema_version 1
89
+ </confluent_registry>
90
+ ```
67
91
 
68
92
  For example, when specifying the following configuration:
69
93
 
70
94
  ```
71
95
  <parse>
72
96
  @type avro
73
- schema_registery_with_subject_url http://localhost:8081/subjects/persons-avro-value/
97
+ <confluent_registry>
98
+ url http://localhost:8081/
99
+ subject persons-avro-value
100
+ # schema_key schema
101
+ # schema_version 1
102
+ </confluent_registry>
74
103
  ```
75
104
 
76
- Then the parser plugin calls `GET http://localhost:8081/subjects/persons-avro-value/versions/` to retrive the registered schema versions and then calls `GET GET http://localhost:8081/subjects/persons-avro-value/versions/<the latest schema version>`.
105
+ Then the parser plugin calls `GET http://localhost:8081/subjects/persons-avro-value/versions/latest` to retrieve the latest registered schema. If parsing fails, this plugin calls `GET http://localhost:8081/schemas/ids/<schema id read from the 4 bytes that follow the magic byte>`.
106
+
107
+ If you use this plugin to parse confluent schema, please specify `use_confluent_schema` as `true`.
108
+
109
+ This is because, confluent avro schema uses the following structure:
110
+
111
+ MAGIC_BYTE | schema_id | record
112
+ ----------:|:---------:|:---------------
113
+ 1byte | 4bytes | record contents
114
+
115
+ When a `<confluent_registry>` section is specified in the configuration, this plugin automatically consumes the first 5 bytes and reads `schema_id` from them.
77
116
 
78
117
  ## Copyright
79
118
 
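The framing described in the README hunk above (1 magic byte, a 4-byte schema id, then the record) can be sketched in isolation as follows. This is a minimal illustration, not the plugin's code: `split_confluent_frame` is a hypothetical helper, and the payload here is a placeholder string rather than real Avro-encoded data. It assumes the schema id is a 32-bit big-endian unsigned integer, which matches the `unpack("N")` call in the parser hunk further down.

```ruby
# Hypothetical helper (not part of fluent-plugin-parser-avro):
# split a Confluent-framed message into [schema_id, avro_payload].
MAGIC_BYTE = [0].pack("C").freeze

def split_confluent_frame(data)
  raise ArgumentError, "message shorter than the 5-byte header" if data.bytesize < 5

  magic = data.byteslice(0, 1)
  unless magic == MAGIC_BYTE
    raise ArgumentError, "expected magic byte 0x00 but got #{magic.inspect}"
  end

  # "N" is a 32-bit unsigned integer in network (big-endian) byte order.
  schema_id = data.byteslice(1, 4).unpack1("N")
  payload = data.byteslice(5, data.bytesize - 5)
  [schema_id, payload]
end

# Build a framed message by hand and split it again.
framed = MAGIC_BYTE + [42].pack("N") + "avro-bytes".b
schema_id, payload = split_confluent_frame(framed)
```

With `use_confluent_schema false`, none of this applies and the whole message is treated as the Avro record.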
@@ -3,7 +3,7 @@ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
3
3
 
4
4
  Gem::Specification.new do |spec|
5
5
  spec.name = "fluent-plugin-parser-avro"
6
- spec.version = "0.1.0"
6
+ spec.version = "0.2.0"
7
7
  spec.authors = ["Hiroshi Hatake"]
8
8
  spec.email = ["cosmo0920.wp@gmail.com"]
9
9
 
@@ -0,0 +1,49 @@
1
+ #
2
+ # Copyright 2020- Hiroshi Hatake
3
+ #
4
+ # Licensed under the Apache License, Version 2.0 (the "License");
5
+ # you may not use this file except in compliance with the License.
6
+ # You may obtain a copy of the License at
7
+ #
8
+ # http://www.apache.org/licenses/LICENSE-2.0
9
+ #
10
+ # Unless required by applicable law or agreed to in writing, software
11
+ # distributed under the License is distributed on an "AS IS" BASIS,
12
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13
+ # See the License for the specific language governing permissions and
14
+ # limitations under the License.
15
+
16
+ require "net/http"
17
+ require "uri"
18
+
19
+ module Fluent
20
+ module Plugin
21
+ class ConfluentAvroSchemaRegistry
22
+ def initialize(registry_url)
23
+ @registry_url = registry_url
24
+ end
25
+
26
+ def subject_version(subject, schema_key, version = "latest")
27
+ registry_uri = URI.parse(@registry_url)
28
+ registry_uri_with_versions = URI.join(registry_uri, "/subjects/#{subject}/versions/#{version}")
29
+ response = Net::HTTP.get_response(registry_uri_with_versions)
30
+ if schema_key.nil?
31
+ response.body
32
+ else
33
+ Yajl.load(response.body)[schema_key]
34
+ end
35
+ end
36
+
37
+ def schema_with_id(schema_id, schema_key)
38
+ registry_uri = URI.parse(@registry_url)
39
+ registry_uri_with_ids = URI.join(registry_uri, "/schemas/ids/#{schema_id}")
40
+ response = Net::HTTP.get_response(registry_uri_with_ids)
41
+ if schema_key.nil?
42
+ response.body
43
+ else
44
+ Yajl.load(response.body)[schema_key]
45
+ end
46
+ end
47
+ end
48
+ end
49
+ end
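The two request URLs that `ConfluentAvroSchemaRegistry` builds can be checked without a running registry. The sketch below only reproduces the `URI.join` calls from the new file; the `registry.example` host and its `/registry/` prefix are made-up values used to illustrate one consequence of joining against an absolute path.

```ruby
require "uri"

base = URI.parse("http://localhost:8081/")

# Same shape as subject_version: the subject and version are interpolated into the path.
by_version = URI.join(base, "/subjects/persons-avro-value/versions/latest")

# Same shape as schema_with_id: look a schema up by its numeric id.
by_id = URI.join(base, "/schemas/ids/1")

# Because the joined reference starts with "/", any path on the base URL is
# replaced rather than extended (standard RFC 3986 reference resolution).
prefixed = URI.join("http://registry.example/registry/", "/schemas/ids/1")

puts by_version
puts by_id
puts prefixed
```

If the registry is served under a path prefix, the `url` setting alone would therefore not be enough with this approach; the prefix is dropped from the final request URL.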
@@ -18,21 +18,30 @@ require "net/http"
18
18
  require "stringio"
19
19
  require "uri"
20
20
  require "fluent/plugin/parser"
21
+ require_relative "./confluent_avro_schema_registry"
21
22
 
22
23
  module Fluent
23
24
  module Plugin
24
25
  class AvroParser < Fluent::Plugin::Parser
25
26
  Fluent::Plugin.register_parser("avro", self)
26
27
 
28
+ MAGIC_BYTE = [0].pack("C").freeze
29
+
27
30
  config_param :schema_file, :string, :default => nil
28
31
  config_param :schema_json, :string, :default => nil
29
32
  config_param :schema_url, :string, :default => nil
30
- config_param :schema_registery_with_subject_url, :string, :default => nil
31
33
  config_param :schema_url_key, :string, :default => nil
32
34
  config_param :writers_schema_file, :string, :default => nil
33
35
  config_param :writers_schema_json, :string, :default => nil
34
36
  config_param :readers_schema_file, :string, :default => nil
35
37
  config_param :readers_schema_json, :string, :default => nil
38
+ config_param :use_confluent_schema, :bool, :default => true
39
+ config_section :confluent_registry, param_name: :avro_registry, required: false, multi: false do
40
+ config_param :url, :string
41
+ config_param :subject, :string
42
+ config_param :schema_key, :string, :default => "schema"
43
+ config_param :schema_version, :string, :default => "latest"
44
+ end
36
45
 
37
46
  def configure(conf)
38
47
  super
@@ -60,18 +69,20 @@ module Fluent
60
69
  @writers_schema = Avro::Schema.parse(@writers_raw_schema)
61
70
  @readers_schema = Avro::Schema.parse(@readers_raw_schema)
62
71
  @reader = Avro::IO::DatumReader.new(@writers_schema, @readers_schema)
72
+ elsif @avro_registry
73
+ @confluent_registry = Fluent::Plugin::ConfluentAvroSchemaRegistry.new(@avro_registry.url)
74
+ @raw_schema = @confluent_registry.subject_version(@avro_registry.subject,
75
+ @avro_registry.schema_key,
76
+ @avro_registry.schema_version)
77
+ @schema = Avro::Schema.parse(@raw_schema)
78
+ @reader = Avro::IO::DatumReader.new(@schema)
63
79
  else
64
- unless [@schema_json, @schema_file, @schema_url, @schema_registery_with_subject_url].compact.size == 1
80
+ unless [@schema_json, @schema_file, @schema_url].compact.size == 1
65
81
  raise Fluent::ConfigError, "schema_json, schema_file, or schema_url is required, but they cannot specify at the same time!"
66
82
  end
67
- if @schema_registery_with_subject_url && !@schema_registery_with_subject_url.end_with?("/")
68
- raise Fluent::ConfigError, "schema_registery_with_subject_url must contain the trailing slash('/')."
69
- end
70
83
 
71
84
  @raw_schema = if @schema_file
72
85
  File.read(@schema_file)
73
- elsif @schema_registery_with_subject_url
74
- fetch_latest_schema(@schema_registery_with_subject_url, @schema_url_key)
75
86
  elsif @schema_url
76
87
  fetch_schema(@schema_url, @schema_url_key)
77
88
  elsif @schema_json
@@ -91,25 +102,55 @@ module Fluent
91
102
  buffer = StringIO.new(data)
92
103
  decoder = Avro::IO::BinaryDecoder.new(buffer)
93
104
  begin
105
+ if @use_confluent_schema || @avro_registry
106
+ # When using confluent avro schema, record is formatted as follows:
107
+ #
108
+ # MAGIC_BYTE | schema_id | record
109
+ # ----------:|:---------:|:---------------
110
+ # 1byte | 4bytes | record contents
111
+ magic_byte = decoder.read(1)
112
+
113
+ if magic_byte != MAGIC_BYTE
114
+ raise "The first byte should be magic byte but got #{magic_byte.inspect}"
115
+ end
116
+ schema_id = decoder.read(4).unpack("N").first
117
+ end
94
118
  decoded_data = @reader.read(decoder)
95
119
  time, record = convert_values(parse_time(decoded_data), decoded_data)
96
120
  yield time, record
97
- rescue => e
98
- raise e if @schema_url.nil? or @schema_registery_with_subject_url.nil?
121
+ rescue EOFError, RuntimeError => e
122
+ raise e unless [@schema_url, @avro_registry].compact.size == 1
99
123
  begin
100
124
  new_raw_schema = if @schema_url
101
125
  fetch_schema(@schema_url, @schema_url_key)
102
- elsif @schema_registery_with_subject_url
103
- fetch_latest_schema(@schema_registery_with_subject_url, @schema_url_key)
126
+ elsif @avro_registry
127
+ @confluent_registry.schema_with_id(schema_id,
128
+ @avro_registry.schema_key)
104
129
  end
105
130
  new_schema = Avro::Schema.parse(new_raw_schema)
106
- is_changed = (new_raw_schena_== @raw_schema)
131
+ is_changed = (new_raw_schema != @raw_schema)
107
132
  @raw_schema = new_raw_schema
108
- @schame = new_schema
109
- rescue
133
+ @schema = new_schema
134
+ rescue EOFError, RuntimeError
110
135
  # Do nothing.
111
136
  end
112
137
  if is_changed
138
+ buffer = StringIO.new(data)
139
+ decoder = Avro::IO::BinaryDecoder.new(buffer)
140
+ if @use_confluent_schema || @avro_registry
141
+ # When using confluent avro schema, record is formatted as follows:
142
+ #
143
+ # MAGIC_BYTE | schema_id | record
144
+ # ----------:|:---------:|:---------------
145
+ # 1byte | 4bytes | record contents
146
+ magic_byte = decoder.read(1)
147
+
148
+ if magic_byte != MAGIC_BYTE
149
+ raise "The first byte should be magic byte but got #{magic_byte.inspect}"
150
+ end
151
+ schema_id = decoder.read(4).unpack("N").first
152
+ end
153
+ @reader = Avro::IO::DatumReader.new(@schema)
113
154
  decoded_data = @reader.read(decoder)
114
155
  time, record = convert_values(parse_time(decoded_data), decoded_data)
115
156
  yield time, record
@@ -119,24 +160,6 @@ module Fluent
119
160
  end
120
161
  end
121
162
 
122
- def fetch_schema_versions(base_uri_with_versions)
123
- versions_response = Net::HTTP.get_response(base_uri_with_versions)
124
- Yajl.load(versions_response.body)
125
- end
126
-
127
- def fetch_latest_schema(base_url, schema_key)
128
- base_uri = URI.parse(base_url)
129
- base_uri_with_versions = URI.join(base_uri, "versions/")
130
- versions = fetch_schema_versions(base_uri_with_versions)
131
- uri = URI.join(base_uri_with_versions, versions.last.to_s)
132
- response = Net::HTTP.get_response(uri)
133
- if schema_key.nil?
134
- response.body
135
- else
136
- Yajl.load(response.body)[schema_key]
137
- end
138
- end
139
-
140
163
  def fetch_schema(url, schema_key)
141
164
  uri = URI.parse(url)
142
165
  response = Net::HTTP.get_response(uri)
@@ -0,0 +1 @@
1
+ {"schema":"{\"type\":\"record\",\"name\":\"Person\",\"namespace\":\"com.ippontech.kafkatutorials\",\"fields\":[{\"name\":\"firstName\",\"type\":\"string\"},{\"name\":\"lastName\",\"type\":\"string\"},{\"name\":\"birthDate\",\"type\":\"long\"}]}"}
@@ -0,0 +1 @@
1
+ {"schema":"{\"type\":\"record\",\"name\":\"Person\",\"namespace\":\"com.ippontech.kafkatutorials\",\"fields\":[{\"name\":\"firstName\",\"type\":\"string\"},{\"name\":\"lastName\",\"type\":\"string\"},{\"name\":\"birthDate\",\"type\":\"long\"},{\"name\":\"verified\",\"type\":\"boolean\",\"default\":false}]}"}
@@ -0,0 +1 @@
1
+ {"schema":"{\"type\":\"record\",\"name\":\"Person\",\"namespace\":\"com.ippontech.kafkatutorials\",\"fields\":[{\"name\":\"firstName\",\"type\":\"string\"},{\"name\":\"lastName\",\"type\":\"string\"},{\"name\":\"birthDate\",\"type\":\"long\"},{\"name\":\"verified\",\"type\":\"boolean\"}]}"}
@@ -0,0 +1 @@
1
+ {"schema":"{\"type\":\"record\",\"name\":\"Person\",\"namespace\":\"com.ippontech.kafkatutorials\",\"fields\":[{\"name\":\"firstName\",\"type\":\"string\"},{\"name\":\"lastName\",\"type\":\"string\"},{\"name\":\"birthDate\",\"type\":\"long\"},{\"name\":\"verified\",\"type\":[\"boolean\",\"null\"],\"default\":false}]}"}
@@ -76,74 +76,94 @@ class AvroParserTest < Test::Unit::TestCase
76
76
  }
77
77
  EOC
78
78
 
79
- def test_parse
79
+ data("use_confluent_schema" => true,
80
+ "plain" => false)
81
+ def test_parse(data)
82
+ config = data
80
83
  conf = {
81
- 'schema_json' => SCHEMA
84
+ 'schema_json' => SCHEMA,
85
+ 'use_confluent_schema' => config,
82
86
  }
83
87
  d = create_driver(conf)
84
88
  datum = {"username" => "foo", "age" => 42, "verified" => true}
85
- encoded = encode_datum(datum, SCHEMA)
89
+ encoded = encode_datum(datum, SCHEMA, config)
86
90
  d.instance.parse(encoded) do |_time, record|
87
91
  assert_equal datum, record
88
92
  end
89
93
 
90
94
  datum = {"username" => "baz", "age" => 34}
91
- encoded = encode_datum(datum, SCHEMA)
95
+ encoded = encode_datum(datum, SCHEMA, config)
92
96
  d.instance.parse(encoded) do |_time, record|
93
97
  assert_equal datum.merge("verified" => nil), record
94
98
  end
95
99
  end
96
100
 
97
- def test_parse_with_avro_schema
101
+ data("use_confluent_schema" => true,
102
+ "plain" => false)
103
+ def test_parse_with_avro_schema(data)
104
+ config = data
98
105
  conf = {
99
- 'schema_file' => File.join(__dir__, "..", "data", "user.avsc")
106
+ 'schema_file' => File.join(__dir__, "..", "data", "user.avsc"),
107
+ 'use_confluent_schema' => config,
100
108
  }
101
109
  d = create_driver(conf)
102
110
  datum = {"username" => "foo", "age" => 42, "verified" => true}
103
- encoded = encode_datum(datum, SCHEMA)
111
+ encoded = encode_datum(datum, SCHEMA, config)
104
112
  d.instance.parse(encoded) do |_time, record|
105
113
  assert_equal datum, record
106
114
  end
107
115
 
108
116
  datum = {"username" => "baz", "age" => 34}
109
- encoded = encode_datum(datum, SCHEMA)
117
+ encoded = encode_datum(datum, SCHEMA, config)
110
118
  d.instance.parse(encoded) do |_time, record|
111
119
  assert_equal datum.merge("verified" => nil), record
112
120
  end
113
121
  end
114
122
 
115
- def test_parse_with_readers_and_writers_schema
123
+ data("use_confluent_schema" => true,
124
+ "plain" => false)
125
+ def test_parse_with_readers_and_writers_schema(data)
126
+ config = data
116
127
  conf = {
117
128
  'writers_schema_json' => SCHEMA,
118
129
  'readers_schema_json' => READERS_SCHEMA,
130
+ 'use_confluent_schema' => config,
119
131
  }
120
132
  d = create_driver(conf)
121
133
  datum = {"username" => "foo", "age" => 42, "verified" => true}
122
- encoded = encode_datum(datum, SCHEMA)
134
+ encoded = encode_datum(datum, SCHEMA, config)
123
135
  d.instance.parse(encoded) do |_time, record|
124
136
  datum.delete("verified")
125
137
  assert_equal datum, record
126
138
  end
127
139
  end
128
140
 
129
- def test_parse_with_readers_and_writers_schema_files
141
+ data("use_confluent_schema" => true,
142
+ "plain" => false)
143
+ def test_parse_with_readers_and_writers_schema_files(data)
144
+ config = data
130
145
  conf = {
131
146
  'writers_schema_file' => File.join(__dir__, "..", "data", "writer_user.avsc"),
132
147
  'readers_schema_file' => File.join(__dir__, "..", "data", "reader_user.avsc"),
148
+ 'use_confluent_schema' => config,
133
149
  }
134
150
  d = create_driver(conf)
135
151
  datum = {"username" => "foo", "age" => 42, "verified" => true}
136
- encoded = encode_datum(datum, SCHEMA)
152
+ encoded = encode_datum(datum, SCHEMA, config)
137
153
  d.instance.parse(encoded) do |_time, record|
138
154
  datum.delete("verified")
139
155
  assert_equal datum, record
140
156
  end
141
157
  end
142
158
 
143
- def test_parse_with_complex_schema
159
+ data("use_confluent_schema" => true,
160
+ "plain" => false)
161
+ def test_parse_with_complex_schema(data)
162
+ config = data
144
163
  conf = {
145
164
  'schema_json' => COMPLEX_SCHEMA,
146
- 'time_key' => 'time'
165
+ 'time_key' => 'time',
166
+ 'use_confluent_schema' => config,
147
167
  }
148
168
  d = create_driver(conf)
149
169
  time_str = "2020-09-25 15:08:09.082113 +0900"
@@ -162,7 +182,7 @@ class AvroParserTest < Test::Unit::TestCase
162
182
  }
163
183
  }
164
184
 
165
- encoded = encode_datum(datum, COMPLEX_SCHEMA)
185
+ encoded = encode_datum(datum, COMPLEX_SCHEMA, config)
166
186
  d.instance.parse(encoded) do |time, record|
167
187
  assert_equal Time.parse(time_str).to_r, time.to_r
168
188
  datum.delete("time")
@@ -185,6 +205,22 @@ class AvroParserTest < Test::Unit::TestCase
185
205
  res.status = 200
186
206
  res.body = 'running'
187
207
  end
208
+ server.mount_proc("/schemas/ids") do |req, res|
209
+ req.path =~ /^\/schemas\/ids\/([^\/]*)$/
210
+ version = $1
211
+ @got.push({
212
+ version: version,
213
+ })
214
+ if version == "1"
215
+ res.body = File.read(File.join(__dir__, "..", "data", "schema-persions-value-1.avsc"))
216
+ elsif version == "21"
217
+ res.body = File.read(File.join(__dir__, "..", "data", "schema-persions-value-21.avsc"))
218
+ elsif version == "41"
219
+ res.body = File.read(File.join(__dir__, "..", "data", "schema-persions-value-41.avsc"))
220
+ elsif version == "42"
221
+ res.body = File.read(File.join(__dir__, "..", "data", "schema-persions-value-42.avsc"))
222
+ end
223
+ end
188
224
  server.mount_proc("/subjects") do |req, res|
189
225
  req.path =~ /^\/subjects\/([^\/]*)\/([^\/]*)\/(.*)$/
190
226
  avro_registered_name = $1
@@ -204,6 +240,8 @@ class AvroParserTest < Test::Unit::TestCase
204
240
  res.body = File.read(File.join(__dir__, "..", "data", "persons-avro-value3.avsc"))
205
241
  elsif version == "4"
206
242
  res.body = File.read(File.join(__dir__, "..", "data", "persons-avro-value4.avsc"))
243
+ elsif version == "latest"
244
+ res.body = File.read(File.join(__dir__, "..", "data", "persons-avro-value4.avsc"))
207
245
  end
208
246
  end
209
247
  server.start
@@ -318,63 +356,109 @@ class AvroParserTest < Test::Unit::TestCase
318
356
  assert_equal 4, @got.size
319
357
  assert_equal 'persons-avro-value', @got[3][:registered_name]
320
358
  assert_equal '3', @got[3][:version]
359
+
360
+ assert_equal '200', client.request_get('/schemas/ids/1').code
361
+ assert_equal 5, @got.size
362
+ assert_nil @got[4][:registered_name]
363
+ assert_equal '1', @got[4][:version]
364
+
365
+ assert_equal '200', client.request_get('/schemas/ids/21').code
366
+ assert_equal 6, @got.size
367
+ assert_nil @got[5][:registered_name]
368
+ assert_equal '21', @got[5][:version]
369
+
370
+ assert_equal '200', client.request_get('/schemas/ids/41').code
371
+ assert_equal 7, @got.size
372
+ assert_nil @got[6][:registered_name]
373
+ assert_equal '41', @got[6][:version]
374
+
375
+ assert_equal '200', client.request_get('/schemas/ids/42').code
376
+ assert_equal 8, @got.size
377
+ assert_nil @got[7][:registered_name]
378
+ assert_equal '42', @got[7][:version]
321
379
  end
322
380
 
323
- def test_schema_url
381
+ data("use_confluent_schema" => true,
382
+ "plain" => false)
383
+ def test_schema_url(data)
384
+ config = data
324
385
  conf = {
325
386
  'schema_url' => "http://localhost:8081/subjects/persons-avro-value/versions/1",
326
- 'schema_url_key' => 'schema'
387
+ 'schema_url_key' => 'schema',
388
+ 'use_confluent_schema' => config,
327
389
  }
328
390
  d = create_driver(conf)
329
391
  datum = {"firstName" => "Aleen","lastName" => "Terry","birthDate" => 159202477258}
330
- encoded = encode_datum(datum, REMOTE_SCHEMA)
392
+ encoded = encode_datum(datum, REMOTE_SCHEMA, config)
331
393
  d.instance.parse(encoded) do |_time, record|
332
394
  assert_equal datum, record
333
395
  end
334
396
  end
335
397
 
336
- def test_schema_url_with_version2
398
+ data("use_confluent_schema" => true,
399
+ "plain" => false)
400
+ def test_schema_url_with_version2(data)
401
+ config = data
337
402
  conf = {
338
403
  'schema_url' => "http://localhost:8081/subjects/persons-avro-value/versions/2",
339
- 'schema_url_key' => 'schema'
404
+ 'schema_url_key' => 'schema',
405
+ 'use_confluent_schema' => config,
340
406
  }
341
407
  d = create_driver(conf)
342
408
  datum = {"firstName" => "Aleen","lastName" => "Terry","birthDate" => 159202477258}
343
- encoded = encode_datum(datum, REMOTE_SCHEMA2)
409
+ encoded = encode_datum(datum, REMOTE_SCHEMA2, config)
344
410
  d.instance.parse(encoded) do |_time, record|
345
411
  assert_equal datum.merge("verified" => false), record
346
412
  end
347
413
  end
348
414
 
349
- def test_schema_registery_with_subject_url
350
- conf = {
351
- 'schema_registery_with_subject_url' => "http://localhost:8081/subjects/persons-avro-value/",
352
- 'schema_url_key' => 'schema'
353
- }
415
+ def test_confluent_registry_with_schema_version
416
+ conf = Fluent::Config::Element.new(
417
+ '', '', {'@type' => 'avro'}, [
418
+ Fluent::Config::Element.new('confluent_registry', '', {
419
+ 'url' => 'http://localhost:8081',
420
+ 'subject' => 'persons-avro-value',
421
+ 'schema_key' => 'schema',
422
+ 'schema_version' => '1',
423
+ }, [])
424
+ ])
354
425
  d = create_driver(conf)
355
426
  datum = {"firstName" => "Aleen","lastName" => "Terry","birthDate" => 159202477258}
356
- encoded = encode_datum(datum, REMOTE_SCHEMA2)
427
+ schema = Yajl.load(File.read(File.join(__dir__, "..", "data", "schema-persions-value-1.avsc")))
428
+ encoded = encode_datum(datum, schema.fetch("schema"), true, 1)
357
429
  d.instance.parse(encoded) do |_time, record|
358
- assert_equal datum.merge("verified" => nil), record
430
+ assert_equal datum, record
359
431
  end
360
432
  end
361
433
 
362
- def test_schema_registery_with_invalid_subject_url
363
- conf = {
364
- 'schema_registery_with_subject_url' => "http://localhost:8081/subjects/persons-avro-value",
365
- 'schema_url_key' => 'schema'
366
- }
367
- assert_raise(Fluent::ConfigError) do
368
- create_driver(conf)
434
+ def test_confluent_registry_with_fallback
435
+ conf = Fluent::Config::Element.new(
436
+ '', '', {'@type' => 'avro'}, [
437
+ Fluent::Config::Element.new('confluent_registry', '', {
438
+ 'url' => 'http://localhost:8081',
439
+ 'subject' => 'persons-avro-value',
440
+ 'schema_key' => 'schema',
441
+ }, [])
442
+ ])
443
+ d = create_driver(conf)
444
+ datum = {"firstName" => "Aleen","lastName" => "Terry","birthDate" => 159202477258}
445
+ schema = Yajl.load(File.read(File.join(__dir__, "..", "data", "schema-persions-value-1.avsc")))
446
+ encoded = encode_datum(datum, schema.fetch("schema"), true, 1)
447
+ d.instance.parse(encoded) do |_time, record|
448
+ assert_equal datum, record
369
449
  end
370
450
  end
371
451
  end
372
452
 
373
453
  private
374
454
 
375
- def encode_datum(datum, string_schema)
455
+ def encode_datum(datum, string_schema, use_confluent_schema = true, schema_id = 1)
376
456
  buffer = StringIO.new
377
457
  encoder = Avro::IO::BinaryEncoder.new(buffer)
458
+ if use_confluent_schema
459
+ encoder.write(Fluent::Plugin::AvroParser::MAGIC_BYTE)
460
+ encoder.write([schema_id].pack("N"))
461
+ end
378
462
  schema = Avro::Schema.parse(string_schema)
379
463
  writer = Avro::IO::DatumWriter.new(schema)
380
464
  writer.write(datum, encoder)
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: fluent-plugin-parser-avro
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.0
4
+ version: 0.2.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Hiroshi Hatake
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2020-09-29 00:00:00.000000000 Z
11
+ date: 2020-09-30 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: avro
@@ -100,13 +100,18 @@ files:
100
100
  - LICENSE
101
101
  - README.md
102
102
  - Rakefile
103
- - fluent-plugin-avro.gemspec
103
+ - fluent-plugin-parser-avro.gemspec
104
+ - lib/fluent/plugin/confluent_avro_schema_registry.rb
104
105
  - lib/fluent/plugin/parser_avro.rb
105
106
  - test/data/persons-avro-value.avsc
106
107
  - test/data/persons-avro-value2.avsc
107
108
  - test/data/persons-avro-value3.avsc
108
109
  - test/data/persons-avro-value4.avsc
109
110
  - test/data/reader_user.avsc
111
+ - test/data/schema-persions-value-1.avsc
112
+ - test/data/schema-persions-value-21.avsc
113
+ - test/data/schema-persions-value-41.avsc
114
+ - test/data/schema-persions-value-42.avsc
110
115
  - test/data/user.avsc
111
116
  - test/data/writer_user.avsc
112
117
  - test/helper.rb
@@ -140,6 +145,10 @@ test_files:
140
145
  - test/data/persons-avro-value3.avsc
141
146
  - test/data/persons-avro-value4.avsc
142
147
  - test/data/reader_user.avsc
148
+ - test/data/schema-persions-value-1.avsc
149
+ - test/data/schema-persions-value-21.avsc
150
+ - test/data/schema-persions-value-41.avsc
151
+ - test/data/schema-persions-value-42.avsc
143
152
  - test/data/user.avsc
144
153
  - test/data/writer_user.avsc
145
154
  - test/helper.rb