avro-patches 1.0.0.pre0 → 1.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: e1818ee748fd5ee8c17e74d141bcfc9dc29c195762e3e05e10a475a609528c9c
4
- data.tar.gz: 2d5ad54b308815f87dc0d2ac05d238926f14ecfd081f2dc5ef3aecf7cdefa4b4
3
+ metadata.gz: cd089766c05ce4cdd751d3daec9f49fe60d54b579c9c27b38caca027a1551653
4
+ data.tar.gz: dffabc9b71215adc99ce91578db8967dc494df108c83007d23274731db6d4502
5
5
  SHA512:
6
- metadata.gz: b07ce6b1231d84258d3e88d7342bd76ea0b28244de5805f9e75cad6ade35c4b46f6c445899bb180766bd66c59d3fa0cb27c8c79281b3d1176dc105fd5df4bfa6
7
- data.tar.gz: 430350d55b3096351c7a6b00fd45682476d574585ce2184319496c7206ece97a2b5ade02d4422597b27700c6c84370f30fd7b978b1b1a0e804daaf840a5c2eb5
6
+ metadata.gz: 41d70773537c150975e7cb409a5c6a2cc3924de378f11593c596eb6bd50d0380cbbcf5cc43a095c621d8caec0ae3b0ea2e7ecc4ddad16e01412dcb80475d8dff
7
+ data.tar.gz: 8e7e5ee24b8e56fc1c0b675cd9cbe456fc372b00b62a745bbedbf61068ec1aa92cc6f25be7c02967dfba360db6cb6cf8297ec2526eb1daf62cc1b8a385056d0c
data/.travis.yml CHANGED
@@ -1,8 +1,8 @@
1
1
  language: ruby
2
2
  rvm:
3
- - 2.3.5
4
- - 2.4.4
5
- - 2.5.1
6
- before_install: gem install bundler -v 1.16.2 --no-document
3
+ - 2.4.6
4
+ - 2.5.5
5
+ - 2.6.3
6
+ before_install: gem install bundler -v 2.0.1 --no-document
7
7
  script:
8
8
  - bundle exec rake test
data/CHANGELOG.md CHANGED
@@ -4,6 +4,9 @@
4
4
  - Release for Avro v1.9.0. This removes all patches as all changes
5
5
  from the previous release are included in Avro v1.9.0.
6
6
 
7
+ ## v0.4.1
8
+ - Optimize binary encoder and decoder.
9
+
7
10
  ## v0.4.0
8
11
  - Optionally fail validation when extra fields are present.
9
12
  - Check that field defaults have the correct type.
data/README.md CHANGED
@@ -1,25 +1,18 @@
1
1
  # avro-patches
2
2
 
3
- This gem contains patches to the official [Apache Avro](https://avro.apache.org/)
4
- Ruby gem v1.8.2.
5
-
6
- We have attempted to follow the coding conventions used in the official `avro`
7
- repo.
8
-
9
- The following pending or unreleased changes are included:
10
- - [AVRO-1886: Add validation messages](https://github.com/apache/avro/pull/111)
11
- - [AVRO-1695: Ruby support for logical types revisited](https://github.com/apache/avro/pull/116)
12
- - [AVRO-1969: Add schema compatibility checker for Ruby](https://github.com/apache/avro/pull/170)
13
- - [AVRO-2039: Ruby encoding performance improvements](https://github.com/apache/avro/pull/230)
14
- - [AVRO-2200: Option to fail when extra fields are in the payload](https://github.com/apache/avro/pull/321)
15
- - [AVRO-2199: Validate that field defaults have the correct type](https://github.com/apache/avro/pull/320)
16
-
17
- In addition, compatibility with Ruby 2.4 (https://github.com/apache/avro/pull/191)
18
- has been integrated with the changes above.
19
-
20
- The following Ruby changes are not included, but could be added in the future:
21
- - [AVRO-2001: Adding support for doc attribute](https://github.com/apache/avro/pull/197)
22
- - [AVRO-1873: Add CRC32 checksum to Snappy-compressed blocks](https://github.com/apache/avro/pull/121)
3
+ ## Avro v1.9.0
4
+
5
+ After the official release of [Apache Avro](https://avro.apache.org/) v1.9.0 this
6
+ gem non longer contains any patches. This version is being released as a compatibility
7
+ layer for Avro v1.9.0.
8
+
9
+ As Ruby changes are submitted for the next Avro release, it is expected that they
10
+ be collected in future releases of this gem.
11
+
12
+ ## Avro v1.8.2
13
+
14
+ See the [avro-v1.8.2 branch](https://github.com/salsify/avro-patches/tree/avro-1.8.2)
15
+ for details about the previous version of this gem which supported Avro v1.8.2.
23
16
 
24
17
  ## Installation
25
18
 
data/avro-patches.gemspec CHANGED
@@ -26,7 +26,7 @@ Gem::Specification.new do |spec|
26
26
  spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(bin|test|spec|features)/}) }
27
27
  spec.require_paths = ['lib']
28
28
 
29
- spec.add_development_dependency 'bundler', '~> 1.15'
29
+ spec.add_development_dependency 'bundler', '~> 2.0'
30
30
  spec.add_development_dependency 'rake', '~> 10.0'
31
31
  spec.add_development_dependency 'test-unit'
32
32
  spec.add_development_dependency 'overcommit'
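The hunk above moves the Bundler development dependency from `~> 1.15` to `~> 2.0`. As a rough illustration of what the pessimistic operator admits, the sketch below uses RubyGems' own `Gem::Requirement` and `Gem::Version`. The helper name `allowed?` is hypothetical and is not part of the gem; the version strings are the ones that appear in this diff.

```ruby
require 'rubygems'

# Hypothetical helper (not part of avro-patches): does `version`
# satisfy the requirement string?
def allowed?(requirement, version)
  Gem::Requirement.new(requirement).satisfied_by?(Gem::Version.new(version))
end

OLD_CONSTRAINT = '~> 1.15' # admits >= 1.15 and < 2.0
NEW_CONSTRAINT = '~> 2.0'  # admits >= 2.0 and < 3.0

puts allowed?(OLD_CONSTRAINT, '1.16.2') # Bundler version from the old CI config
puts allowed?(NEW_CONSTRAINT, '2.0.1')  # Bundler version from the new CI config
puts allowed?(NEW_CONSTRAINT, '1.16.2') # the old Bundler no longer qualifies
```

Under these assumptions, a checkout that still has only Bundler 1.x installed would need to upgrade before `bundle install` can satisfy the development dependencies.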
data/lib/avro-patches/version.rb CHANGED
@@ -1,3 +1,3 @@
1
1
  module AvroPatches
2
- VERSION = '1.0.0.pre0'.freeze
2
+ VERSION = '1.0.0'.freeze
3
3
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: avro-patches
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.0.0.pre0
4
+ version: 1.0.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Salsify, Inc
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2019-04-26 00:00:00.000000000 Z
11
+ date: 2019-05-22 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: bundler
@@ -16,14 +16,14 @@ dependencies:
16
16
  requirements:
17
17
  - - "~>"
18
18
  - !ruby/object:Gem::Version
19
- version: '1.15'
19
+ version: '2.0'
20
20
  type: :development
21
21
  prerelease: false
22
22
  version_requirements: !ruby/object:Gem::Requirement
23
23
  requirements:
24
24
  - - "~>"
25
25
  - !ruby/object:Gem::Version
26
- version: '1.15'
26
+ version: '2.0'
27
27
  - !ruby/object:Gem::Dependency
28
28
  name: rake
29
29
  requirement: !ruby/object:Gem::Requirement
@@ -100,23 +100,6 @@ files:
100
100
  - Rakefile
101
101
  - avro-patches.gemspec
102
102
  - lib/avro-patches.rb
103
- - lib/avro-patches/default_validation.rb
104
- - lib/avro-patches/default_validation/schema.rb
105
- - lib/avro-patches/ensure_encoding.rb
106
- - lib/avro-patches/ensure_encoding/io.rb
107
- - lib/avro-patches/logical_types.rb
108
- - lib/avro-patches/logical_types/io.rb
109
- - lib/avro-patches/logical_types/logical_types.rb
110
- - lib/avro-patches/logical_types/schema.rb
111
- - lib/avro-patches/logical_types/schema_validator.rb
112
- - lib/avro-patches/schema_compatibility.rb
113
- - lib/avro-patches/schema_compatibility/io.rb
114
- - lib/avro-patches/schema_compatibility/schema.rb
115
- - lib/avro-patches/schema_compatibility/schema_compatibility.rb
116
- - lib/avro-patches/schema_validator.rb
117
- - lib/avro-patches/schema_validator/io.rb
118
- - lib/avro-patches/schema_validator/schema.rb
119
- - lib/avro-patches/schema_validator/schema_validator.rb
120
103
  - lib/avro-patches/version.rb
121
104
  - lib/avro_patches.rb
122
105
  homepage: https://github.com/salsify/avro-patches
@@ -135,9 +118,9 @@ required_ruby_version: !ruby/object:Gem::Requirement
135
118
  version: '0'
136
119
  required_rubygems_version: !ruby/object:Gem::Requirement
137
120
  requirements:
138
- - - ">"
121
+ - - ">="
139
122
  - !ruby/object:Gem::Version
140
- version: 1.3.1
123
+ version: '0'
141
124
  requirements: []
142
125
  rubygems_version: 3.0.3
143
126
  signing_key:
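The metadata hunk above relaxes `required_rubygems_version` from `> 1.3.1` to `>= 0` while the version moves from `1.0.0.pre0` to `1.0.0`. This is consistent with RubyGems treating version strings that contain letters as prereleases and, as far as I know, applying the stricter requirement to them. The sketch below only shows how `Gem::Version` classifies and orders the two strings; it does not reproduce the packaging step itself.

```ruby
require 'rubygems'

pre   = Gem::Version.new('1.0.0.pre0')
final = Gem::Version.new('1.0.0')

puts pre.prerelease?   # true: the segment "pre0" contains letters
puts final.prerelease? # false
puts pre < final       # true: a prerelease sorts before its final release
```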
data/lib/avro-patches/default_validation.rb DELETED
@@ -1 +0,0 @@
1
- require "avro-patches/default_validation/schema"
data/lib/avro-patches/default_validation/schema.rb DELETED
@@ -1,27 +0,0 @@
1
- module AvroPatches
2
- module DefaultValidation
3
- module FieldPatch
4
- def initialize(type, name, default=:no_default, order=nil, names=nil, namespace=nil)
5
- super
6
-
7
- validate_default! if default?
8
- end
9
-
10
- private
11
-
12
- def validate_default!
13
- type_for_default = if type.type_sym == :union
14
- type.schemas.first
15
- else
16
- type
17
- end
18
-
19
- Avro::SchemaValidator.validate!(type_for_default, default)
20
- rescue Avro::SchemaValidator::ValidationError => e
21
- raise Avro::SchemaParseError, "Error validating default for #{name}: #{e.message}"
22
- end
23
- end
24
- end
25
- end
26
-
27
- Avro::Schema::Field.prepend(AvroPatches::DefaultValidation::FieldPatch)
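The removed patch above relies on `Module#prepend` so that its `initialize` runs first, calls `super` to reach the original constructor, and then validates the default. The following is a minimal, self-contained sketch of that pattern. `Field` here is a stand-in class, not `Avro::Schema::Field`, and `DefaultCheck` with its toy integer rule is hypothetical.

```ruby
# Stand-in for Avro::Schema::Field; NOT the real class.
class Field
  attr_reader :type, :name, :default

  def initialize(type, name, default = :no_default)
    @type = type
    @name = name
    @default = default
  end

  def default?
    @default != :no_default
  end
end

# Hypothetical patch with the same shape as the removed FieldPatch.
module DefaultCheck
  def initialize(type, name, default = :no_default)
    super # runs Field#initialize with the same arguments

    validate_default! if default?
  end

  private

  # Toy rule for illustration only: :int fields need an Integer default.
  def validate_default!
    return unless type == :int && !default.is_a?(Integer)

    raise ArgumentError, "Error validating default for #{name}: expected Integer"
  end
end

Field.prepend(DefaultCheck)

Field.new(:int, 'count', 0) # accepted
begin
  Field.new(:int, 'count', 'zero')
rescue ArgumentError => e
  puts e.message
end
```

Because the module is prepended rather than included, its `initialize` sits in front of the class's own method in the ancestor chain, which is what lets `super` reach the original constructor.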
data/lib/avro-patches/ensure_encoding.rb DELETED
@@ -1,5 +0,0 @@
1
- # Change from "AVRO-1783: Ruby: Ensure correct binary encoding for byte strings"
2
- # https://github.com/apache/avro/commit/315d842148d57590a58fafecf6e5ea378e9e0d74
3
-
4
- # Only part of the above commit is included as we are not using protocols and RPC
5
- require_relative 'ensure_encoding/io'
data/lib/avro-patches/ensure_encoding/io.rb DELETED
@@ -1,12 +0,0 @@
1
- Avro::IO::DatumWriter.class_eval do
2
- # A string is encoded as a long followed by that many bytes of
3
- # UTF-8 encoded character data
4
- def write_string(datum)
5
- # The original commit used:
6
- # datum = datum.encode('utf-8') if datum.respond_to? :encode
7
- # This always allocated a new string even if the string was already UTF-8 encoded.
8
- # The form below is slightly more efficient.
9
- datum = datum.encode(Encoding::UTF_8) if datum.respond_to?(:encode) && datum.encoding != Encoding::UTF_8
10
- write_bytes(datum)
11
- end
12
- end
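The comment in the removed `write_string` explains that the encoding guard avoids allocating a new string when the input is already UTF-8. A small sketch of that guard in isolation follows. The helper name `to_utf8` is hypothetical, and the snippet does not touch any Avro class.

```ruby
# Hypothetical helper reproducing the guard from the removed write_string.
def to_utf8(datum)
  if datum.respond_to?(:encode) && datum.encoding != Encoding::UTF_8
    datum.encode(Encoding::UTF_8)
  else
    datum
  end
end

utf8   = "caf\u00e9"                          # already UTF-8
latin1 = utf8.encode(Encoding::ISO_8859_1)    # same text, different bytes

puts to_utf8(utf8).equal?(utf8)               # same object, nothing allocated
puts to_utf8(latin1).encoding                 # transcoded to UTF-8
puts to_utf8(latin1) == utf8                  # same text after transcoding
```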
data/lib/avro-patches/logical_types.rb DELETED
@@ -1,6 +0,0 @@
1
- # Changes from "AVRO-1695: Ruby support for logical types revisited"
2
- # https://github.com/apache/avro/pull/116
3
- require_relative 'logical_types/logical_types'
4
- require_relative 'logical_types/schema_validator'
5
- require_relative 'logical_types/schema'
6
- require_relative 'logical_types/io'
data/lib/avro-patches/logical_types/io.rb DELETED
@@ -1,42 +0,0 @@
1
- Avro::IO::DatumWriter.class_eval do
2
- def write_data(writers_schema, logical_datum, encoder)
3
- datum = writers_schema.type_adapter.encode(logical_datum)
4
-
5
- unless Avro::Schema.validate(writers_schema, datum, { recursive: false, encoded: true })
6
- raise Avro::IO::AvroTypeError.new(writers_schema, datum)
7
- end
8
-
9
- # function dispatch to write datum
10
- case writers_schema.type_sym
11
- when :null; encoder.write_null(datum)
12
- when :boolean; encoder.write_boolean(datum)
13
- when :string; encoder.write_string(datum)
14
- when :int; encoder.write_int(datum)
15
- when :long; encoder.write_long(datum)
16
- when :float; encoder.write_float(datum)
17
- when :double; encoder.write_double(datum)
18
- when :bytes; encoder.write_bytes(datum)
19
- when :fixed; write_fixed(writers_schema, datum, encoder)
20
- when :enum; write_enum(writers_schema, datum, encoder)
21
- when :array; write_array(writers_schema, datum, encoder)
22
- when :map; write_map(writers_schema, datum, encoder)
23
- when :union; write_union(writers_schema, datum, encoder)
24
- when :record, :error, :request; write_record(writers_schema, datum, encoder)
25
- else
26
- raise Avro::AvroError.new("Unknown type: #{writers_schema.type}")
27
- end
28
- end
29
- end
30
-
31
- module AvroPatches
32
- module LogicalTypes
33
- module DatumReaderPatch
34
- def read_data(writers_schema, readers_schema, decoder)
35
- datum = super
36
- readers_schema.type_adapter.decode(datum)
37
- end
38
- end
39
- end
40
- end
41
-
42
- Avro::IO::DatumReader.prepend(AvroPatches::LogicalTypes::DatumReaderPatch)
data/lib/avro-patches/logical_types/logical_types.rb DELETED
@@ -1,73 +0,0 @@
1
- require 'date'
2
-
3
- module Avro
4
- module LogicalTypes
5
- module IntDate
6
- EPOCH_START = Date.new(1970, 1, 1)
7
-
8
- def self.encode(date)
9
- return date.to_i if date.is_a?(Numeric)
10
-
11
- (date - EPOCH_START).to_i
12
- end
13
-
14
- def self.decode(int)
15
- EPOCH_START + int
16
- end
17
- end
18
-
19
- module TimestampMillis
20
- def self.encode(value)
21
- return value.to_i if value.is_a?(Numeric)
22
-
23
- time = value.to_time
24
- time.to_i * 1000 + time.usec / 1000
25
- end
26
-
27
- def self.decode(int)
28
- s, ms = int / 1000, int % 1000
29
- Time.at(s, ms * 1000).utc
30
- end
31
- end
32
-
33
- module TimestampMicros
34
- def self.encode(value)
35
- return value.to_i if value.is_a?(Numeric)
36
-
37
- time = value.to_time
38
- time.to_i * 1000_000 + time.usec
39
- end
40
-
41
- def self.decode(int)
42
- s, us = int / 1000_000, int % 1000_000
43
- Time.at(s, us).utc
44
- end
45
- end
46
-
47
- module Identity
48
- def self.encode(datum)
49
- datum
50
- end
51
-
52
- def self.decode(datum)
53
- datum
54
- end
55
- end
56
-
57
- TYPES = {
58
- "int" => {
59
- "date" => IntDate
60
- },
61
- "long" => {
62
- "timestamp-millis" => TimestampMillis,
63
- "timestamp-micros" => TimestampMicros
64
- },
65
- }.freeze
66
-
67
- def self.type_adapter(type, logical_type)
68
- return unless logical_type
69
-
70
- TYPES.fetch(type, {}.freeze).fetch(logical_type, Identity)
71
- end
72
- end
73
- end
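To see what the removed timestamp adapters do to a value, the sketch below restates the `timestamp-millis` arithmetic in a standalone module and round-trips one `Time`. The module name `MillisAdapter` is hypothetical; the arithmetic follows the removed `TimestampMillis` code, with `divmod` used in place of the separate `/` and `%`.

```ruby
require 'date' # provides Time#to_time, as in the removed file

# Standalone restatement of the timestamp-millis arithmetic;
# the module name is hypothetical.
module MillisAdapter
  def self.encode(value)
    return value.to_i if value.is_a?(Numeric)

    time = value.to_time
    time.to_i * 1000 + time.usec / 1000
  end

  def self.decode(int)
    seconds, millis = int.divmod(1000)
    Time.at(seconds, millis * 1000).utc
  end
end

original = Time.utc(2019, 5, 22, 12, 0, 0, 123_456) # 123,456 microseconds
encoded  = MillisAdapter.encode(original)
decoded  = MillisAdapter.decode(encoded)

puts decoded.usec        # 123000: sub-millisecond digits are dropped
puts decoded == original # false, because of that truncation
```

The round trip is therefore lossless only for values that already have millisecond precision; the `timestamp-micros` adapter in the same removed file keeps the full microsecond count.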
data/lib/avro-patches/logical_types/schema.rb DELETED
@@ -1,110 +0,0 @@
1
- Avro::Schema.class_eval do
2
- attr_reader :logical_type
3
-
4
- # Build Avro Schema from data parsed out of JSON string.
5
- def self.real_parse(json_obj, names=nil, default_namespace=nil)
6
- if json_obj.is_a? Hash
7
- type = json_obj['type']
8
- logical_type = json_obj['logicalType']
9
- raise Avro::SchemaParseError, %Q(No "type" property: #{json_obj}) if type.nil?
10
-
11
- # Check that the type is valid before calling #to_sym, since symbols are never garbage
12
- # collected (important to avoid DoS if we're accepting schemas from untrusted clients)
13
- unless Avro::Schema::VALID_TYPES.include?(type)
14
- raise Avro::SchemaParseError, "Unknown type: #{type}"
15
- end
16
-
17
- type_sym = type.to_sym
18
- if Avro::Schema::PRIMITIVE_TYPES_SYM.include?(type_sym)
19
- return Avro::Schema::PrimitiveSchema.new(type_sym, logical_type)
20
-
21
- elsif Avro::Schema::NAMED_TYPES_SYM.include? type_sym
22
- name = json_obj['name']
23
- namespace = json_obj.include?('namespace') ? json_obj['namespace'] : default_namespace
24
- case type_sym
25
- when :fixed
26
- size = json_obj['size']
27
- return Avro::Schema::FixedSchema.new(name, namespace, size, names, logical_type)
28
- when :enum
29
- symbols = json_obj['symbols']
30
- return Avro::Schema::EnumSchema.new(name, namespace, symbols, names)
31
- when :record, :error
32
- fields = json_obj['fields']
33
- return Avro::Schema::RecordSchema.new(name, namespace, fields, names, type_sym)
34
- end
35
-
36
- else
37
- case type_sym
38
- when :array
39
- return Avro::Schema::ArraySchema.new(json_obj['items'], names, default_namespace)
40
- when :map
41
- return Avro::Schema::MapSchema.new(json_obj['values'], names, default_namespace)
42
- else
43
- raise Avro::SchemaParseError.new("Unknown Valid Type: #{type}")
44
- end
45
- end
46
-
47
- elsif json_obj.is_a? Array
48
- # JSON array (union)
49
- return Avro::Schema::UnionSchema.new(json_obj, names, default_namespace)
50
- elsif Avro::Schema::PRIMITIVE_TYPES.include? json_obj
51
- return Avro::Schema::PrimitiveSchema.new(json_obj)
52
- else
53
- raise Avro::UnknownSchemaError.new(json_obj)
54
- end
55
- end
56
-
57
- # Determine if a ruby datum is an instance of a schema
58
- def self.validate(expected_schema, logical_datum, options = { recursive: true, encoded: false })
59
- Avro::SchemaValidator.validate!(expected_schema, logical_datum, options)
60
- true
61
- rescue Avro::SchemaValidator::ValidationError
62
- false
63
- end
64
-
65
- def initialize(type, logical_type=nil)
66
- @type_sym = type.is_a?(Symbol) ? type : type.to_sym
67
- @logical_type = logical_type
68
- end
69
-
70
- def type_adapter
71
- @type_adapter ||= Avro::LogicalTypes.type_adapter(type, logical_type) || Avro::LogicalTypes::Identity
72
- end
73
-
74
- def to_avro(names=nil)
75
- props = {'type' => type}
76
- props['logicalType'] = logical_type if logical_type
77
- props
78
- end
79
- end
80
-
81
- Avro::Schema::NamedSchema.class_eval do
82
- def initialize(type, name, namespace=nil, names=nil, logical_type=nil)
83
- super(type, logical_type)
84
- @name, @namespace = Avro::Name.extract_namespace(name, namespace)
85
- Avro::Name.add_name(names, self)
86
- end
87
- end
88
-
89
- Avro::Schema::PrimitiveSchema.class_eval do
90
- def initialize(type, logical_type=nil)
91
- if Avro::Schema::PRIMITIVE_TYPES_SYM.include?(type)
92
- super(type, logical_type)
93
- elsif Avro::Schema::PRIMITIVE_TYPES.include?(type)
94
- super(type.to_sym, logical_type)
95
- else
96
- raise Avro::AvroError.new("#{type} is not a valid primitive type.")
97
- end
98
- end
99
- end
100
-
101
- Avro::Schema::FixedSchema.class_eval do
102
- def initialize(name, space, size, names=nil, logical_type=nil)
103
- # Ensure valid cto args
104
- unless size.is_a?(Integer)
105
- raise Avro::AvroError, 'Fixed Schema requires a valid integer for size property.'
106
- end
107
- super(:fixed, name, space, names, logical_type)
108
- @size = size
109
- end
110
- end
data/lib/avro-patches/logical_types/schema_validator.rb DELETED
@@ -1,69 +0,0 @@
1
- module AvroPatches
2
- module LogicalTypes
3
- module SchemaValidatorPatch
4
- def validate!(expected_schema, logical_datum, options = { recursive: true, encoded: false, fail_on_extra_fields: false})
5
- options ||= {}
6
- options[:recursive] = true unless options.key?(:recursive)
7
-
8
- result = Avro::SchemaValidator::Result.new
9
- if options[:recursive]
10
- validate_recursive(expected_schema, logical_datum,
11
- Avro::SchemaValidator::ROOT_IDENTIFIER, result, options)
12
- else
13
- validate_simple(expected_schema, logical_datum,
14
- Avro::SchemaValidator::ROOT_IDENTIFIER, result, options)
15
- end
16
- fail Avro::SchemaValidator::ValidationError, result if result.failure?
17
- result
18
- end
19
-
20
- private
21
-
22
- def validate_recursive(expected_schema, logical_datum, path, result, options = {})
23
- datum = resolve_datum(expected_schema, logical_datum, options[:encoded])
24
-
25
- # The entire method is overridden so that encoded: true can be passed here
26
- validate_simple(expected_schema, datum, path, result, encoded: true)
27
-
28
- case expected_schema.type_sym
29
- when :array
30
- validate_array(expected_schema, datum, path, result)
31
- when :map
32
- validate_map(expected_schema, datum, path, result)
33
- when :union
34
- validate_union(expected_schema, datum, path, result)
35
- when :record, :error, :request
36
- fail Avro::SchemaValidator::TypeMismatchError unless datum.is_a?(Hash)
37
- expected_schema.fields.each do |field|
38
- deeper_path = deeper_path_for_hash(field.name, path)
39
- validate_recursive(field.type, datum[field.name], deeper_path, result)
40
- end
41
- if options[:fail_on_extra_fields]
42
- datum_fields = datum.keys.map(&:to_s)
43
- schema_fields = expected_schema.fields.map(&:name)
44
- (datum_fields - schema_fields).each do |extra_field|
45
- result.add_error(path, "extra field '#{extra_field}' - not in schema")
46
- end
47
- end
48
- end
49
- rescue Avro::SchemaValidator::TypeMismatchError
50
- result.add_error(path, "expected type #{expected_schema.type_sym}, got #{actual_value_message(datum)}")
51
- end
52
-
53
- def validate_simple(expected_schema, logical_datum, path, result, options = {})
54
- datum = resolve_datum(expected_schema, logical_datum, options[:encoded])
55
- super(expected_schema, datum, path, result)
56
- end
57
-
58
- def resolve_datum(expected_schema, logical_datum, encoded)
59
- if encoded
60
- logical_datum
61
- else
62
- expected_schema.type_adapter.encode(logical_datum) rescue nil
63
- end
64
- end
65
- end
66
- end
67
- end
68
-
69
- Avro::SchemaValidator.singleton_class.prepend(AvroPatches::LogicalTypes::SchemaValidatorPatch)
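The `fail_on_extra_fields` branch in the removed validator compares the datum's keys, converted to strings, against the schema's field names and records one error per leftover key. A minimal sketch of just that comparison, using plain arrays and hashes instead of Avro schema objects, could look like this. The helper name `extra_field_errors` is hypothetical; the message format follows the removed `add_error` calls.

```ruby
# Hypothetical helper: list keys in `datum` that the schema does not declare.
# `schema_fields` is a plain array of field names, not an Avro schema.
def extra_field_errors(schema_fields, datum, path = '.')
  (datum.keys.map(&:to_s) - schema_fields).map do |extra_field|
    "at #{path} extra field '#{extra_field}' - not in schema"
  end
end

datum  = { 'id' => 1, name: 'widget', 'color' => 'red' }
errors = extra_field_errors(%w[id name], datum)

puts errors # only 'color' is reported; the symbol key :name matches "name"
```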
data/lib/avro-patches/schema_compatibility.rb DELETED
@@ -1,5 +0,0 @@
1
- # Changes from "AVRO-1969: Add schema compatibility checker for Ruby"
2
- # https://github.com/apache/avro/pull/170
3
- require_relative 'schema_compatibility/schema_compatibility'
4
- require_relative 'schema_compatibility/schema'
5
- require_relative 'schema_compatibility/io'
data/lib/avro-patches/schema_compatibility/io.rb DELETED
@@ -1,35 +0,0 @@
1
- Avro::IO::DatumReader.class_eval do
2
- def self.match_schemas(writers_schema, readers_schema)
3
- Avro::SchemaCompatibility.match_schemas(writers_schema, readers_schema)
4
- end
5
-
6
- def read_record(writers_schema, readers_schema, decoder)
7
- readers_fields_hash = readers_schema.fields_hash
8
- read_record = {}
9
- writers_schema.fields.each do |field|
10
- if readers_field = readers_fields_hash[field.name]
11
- field_val = read_data(field.type, readers_field.type, decoder)
12
- read_record[field.name] = field_val
13
- else
14
- skip_data(field.type, decoder)
15
- end
16
- end
17
-
18
- # fill in the default values
19
- if readers_fields_hash.size > read_record.size
20
- writers_fields_hash = writers_schema.fields_hash
21
- readers_fields_hash.each do |field_name, field|
22
- unless writers_fields_hash.has_key? field_name
23
- if field.default?
24
- field_val = read_default_value(field.type, field.default)
25
- read_record[field.name] = field_val
26
- else
27
- raise Avro::AvroError, "Missing data for #{field.type} with no default"
28
- end
29
- end
30
- end
31
- end
32
-
33
- read_record
34
- end
35
- end
data/lib/avro-patches/schema_compatibility/schema.rb DELETED
@@ -1,69 +0,0 @@
1
- Avro::Schema.class_eval do
2
- def read?(writers_schema)
3
- Avro::SchemaCompatibility.can_read?(writers_schema, self)
4
- end
5
-
6
- def be_read?(other_schema)
7
- other_schema.read?(self)
8
- end
9
-
10
- def mutual_read?(other_schema)
11
- Avro::SchemaCompatibility.mutual_read?(other_schema, self)
12
- end
13
- end
14
-
15
- Avro::Schema::RecordSchema.class_eval do
16
- def initialize(name, namespace, fields, names=nil, schema_type=:record)
17
- if schema_type == :request || schema_type == 'request'
18
- @type_sym = schema_type.to_sym
19
- @namespace = namespace
20
- else
21
- super(schema_type, name, namespace, names)
22
- end
23
- @fields = if fields
24
- self.class.make_field_objects(fields, names, self.namespace)
25
- else
26
- {}
27
- end
28
- end
29
- end
30
-
31
- Avro::Schema::UnionSchema.class_eval do
32
- def initialize(schemas, names=nil, default_namespace=nil)
33
- super(:union)
34
-
35
- @schemas = schemas.each_with_object([]) do |schema, schema_objects|
36
- new_schema = subparse(schema, names, default_namespace)
37
- ns_type = new_schema.type_sym
38
-
39
- if Avro::Schema::VALID_TYPES_SYM.include?(ns_type) &&
40
- !Avro::Schema::NAMED_TYPES_SYM.include?(ns_type) &&
41
- schema_objects.any?{|o| o.type_sym == ns_type }
42
- raise Avro::SchemaParseError, "#{ns_type} is already in Union"
43
- elsif ns_type == :union
44
- raise Avro::SchemaParseError, "Unions cannot contain other unions"
45
- else
46
- schema_objects << new_schema
47
- end
48
- end
49
- end
50
- end
51
-
52
-
53
- module AvroPatches
54
- module SchemaCompatibility
55
- module FieldPatch
56
- def default?
57
- @default != :no_default
58
- end
59
-
60
- def to_avro(names = Set.new)
61
- super.tap do |avro|
62
- avro['default'] = default if default?
63
- end
64
- end
65
- end
66
- end
67
- end
68
-
69
- Avro::Schema::Field.prepend(AvroPatches::SchemaCompatibility::FieldPatch)
data/lib/avro-patches/schema_compatibility/schema_compatibility.rb DELETED
@@ -1,154 +0,0 @@
1
- module Avro
2
-
3
- # see http://avro.apache.org/docs/current/spec.html#Schema+Resolution for what this should do
4
- module SchemaCompatibility
5
- def self.can_read?(writers_schema, readers_schema)
6
- Checker.new.can_read?(writers_schema, readers_schema)
7
- end
8
-
9
- def self.mutual_read?(writers_schema, readers_schema)
10
- Checker.new.mutual_read?(writers_schema, readers_schema)
11
- end
12
-
13
- def self.match_schemas(writers_schema, readers_schema)
14
- # Note: this does not support aliases!
15
- w_type = writers_schema.type_sym
16
- r_type = readers_schema.type_sym
17
-
18
- # This conditional is begging for some OO love.
19
- if w_type == :union || r_type == :union
20
- return true
21
- end
22
-
23
- if w_type == r_type
24
- return true if Avro::Schema::PRIMITIVE_TYPES_SYM.include?(r_type)
25
-
26
- case r_type
27
- when :record
28
- return writers_schema.fullname == readers_schema.fullname
29
- when :error
30
- return writers_schema.fullname == readers_schema.fullname
31
- when :request
32
- return true
33
- when :fixed
34
- return writers_schema.fullname == readers_schema.fullname &&
35
- writers_schema.size == readers_schema.size
36
- when :enum
37
- return writers_schema.fullname == readers_schema.fullname
38
- when :map
39
- return match_schemas(writers_schema.values, readers_schema.values)
40
- when :array
41
- return match_schemas(writers_schema.items, readers_schema.items)
42
- end
43
- end
44
-
45
- # Handle schema promotion
46
- if w_type == :int && [:long, :float, :double].include?(r_type)
47
- return true
48
- elsif w_type == :long && [:float, :double].include?(r_type)
49
- return true
50
- elsif w_type == :float && r_type == :double
51
- return true
52
- elsif w_type == :string && r_type == :bytes
53
- return true
54
- elsif w_type == :bytes && r_type == :string
55
- return true
56
- end
57
-
58
- return false
59
- end
60
-
61
- class Checker
62
- SIMPLE_CHECKS = Avro::Schema::PRIMITIVE_TYPES_SYM.dup.add(:fixed).freeze
63
-
64
- attr_reader :recursion_set
65
- private :recursion_set
66
-
67
- def initialize
68
- @recursion_set = Set.new
69
- end
70
-
71
- def can_read?(writers_schema, readers_schema)
72
- full_match_schemas(writers_schema, readers_schema)
73
- end
74
-
75
- def mutual_read?(writers_schema, readers_schema)
76
- can_read?(writers_schema, readers_schema) && can_read?(readers_schema, writers_schema)
77
- end
78
-
79
- private
80
-
81
- def full_match_schemas(writers_schema, readers_schema)
82
- return true if recursion_in_progress?(writers_schema, readers_schema)
83
-
84
- return false unless Avro::SchemaCompatibility.match_schemas(writers_schema, readers_schema)
85
-
86
- if writers_schema.type_sym != :union && SIMPLE_CHECKS.include?(readers_schema.type_sym)
87
- return true
88
- end
89
-
90
- case readers_schema.type_sym
91
- when :record
92
- match_record_schemas(writers_schema, readers_schema)
93
- when :map
94
- full_match_schemas(writers_schema.values, readers_schema.values)
95
- when :array
96
- full_match_schemas(writers_schema.items, readers_schema.items)
97
- when :union
98
- match_union_schemas(writers_schema, readers_schema)
99
- when :enum
100
- # reader's symbols must contain all writer's symbols
101
- (writers_schema.symbols - readers_schema.symbols).empty?
102
- else
103
- if writers_schema.type_sym == :union && writers_schema.schemas.size == 1
104
- full_match_schemas(writers_schema.schemas.first, readers_schema)
105
- else
106
- false
107
- end
108
- end
109
- end
110
-
111
- # reader is a union
112
- def match_union_schemas(writers_schema, readers_schema)
113
- raise 'readers_schema must be a union' unless readers_schema.type_sym == :union
114
-
115
- case writers_schema.type_sym
116
- when :union
117
- writers_schema.schemas.all? { |writer_type| full_match_schemas(writer_type, readers_schema) }
118
- else
119
- readers_schema.schemas.any? { |reader_type| full_match_schemas(writers_schema, reader_type) }
120
- end
121
- end
122
-
123
- # reader is a record
124
- def match_record_schemas(writers_schema, readers_schema)
125
- case writers_schema.type_sym
126
- when :union
127
- return false
128
- else
129
- writer_fields_hash = writers_schema.fields_hash
130
- readers_schema.fields.each do |field|
131
- if writer_fields_hash.key?(field.name)
132
- return false unless full_match_schemas(writer_fields_hash[field.name].type, field.type)
133
- else
134
- return false unless field.default?
135
- end
136
- end
137
-
138
- return true
139
- end
140
- end
141
-
142
- def recursion_in_progress?(writers_schema, readers_schema)
143
- key = [writers_schema.object_id, readers_schema.object_id]
144
-
145
- if recursion_set.include?(key)
146
- true
147
- else
148
- recursion_set.add(key)
149
- false
150
- end
151
- end
152
- end
153
- end
154
- end
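The "schema promotion" branch of the removed `match_schemas` can be read as a small lookup table from the writer's primitive type to the reader types allowed to consume it. The sketch below restates only that table with plain symbols. `PROMOTIONS` and `primitive_readable?` are hypothetical names, and named or complex types (records, enums, unions, and so on) are deliberately left out.

```ruby
# Promotion table restated from the removed match_schemas:
# writer type => reader types that may read it.
PROMOTIONS = {
  int: %i[long float double],
  long: %i[float double],
  float: %i[double],
  string: %i[bytes],
  bytes: %i[string]
}.freeze

# Hypothetical helper covering primitive type symbols only.
def primitive_readable?(writer_type, reader_type)
  writer_type == reader_type ||
    PROMOTIONS.fetch(writer_type, []).include?(reader_type)
end

puts primitive_readable?(:int, :long)     # true: widening is allowed
puts primitive_readable?(:long, :int)     # false: narrowing is not
puts primitive_readable?(:string, :bytes) # true: allowed in both directions
```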
data/lib/avro-patches/schema_validator.rb DELETED
@@ -1,5 +0,0 @@
1
- # Changes from "AVRO-1886: Add validation messages"
2
- # https://github.com/apache/avro/pull/111
3
- require_relative 'schema_validator/schema_validator'
4
- require_relative 'schema_validator/schema'
5
- require_relative 'schema_validator/io'
data/lib/avro-patches/schema_validator/io.rb DELETED
@@ -1,50 +0,0 @@
1
- Avro::IO::DatumWriter.class_eval do
2
- def write_data(writers_schema, datum, encoder)
3
- unless Avro::Schema.validate(writers_schema, datum, recursive: false)
4
- raise Avro::IO::AvroTypeError.new(writers_schema, datum)
5
- end
6
-
7
- # function dispatch to write datum
8
- case writers_schema.type_sym
9
- when :null; encoder.write_null(datum)
10
- when :boolean; encoder.write_boolean(datum)
11
- when :string; encoder.write_string(datum)
12
- when :int; encoder.write_int(datum)
13
- when :long; encoder.write_long(datum)
14
- when :float; encoder.write_float(datum)
15
- when :double; encoder.write_double(datum)
16
- when :bytes; encoder.write_bytes(datum)
17
- when :fixed; write_fixed(writers_schema, datum, encoder)
18
- when :enum; write_enum(writers_schema, datum, encoder)
19
- when :array; write_array(writers_schema, datum, encoder)
20
- when :map; write_map(writers_schema, datum, encoder)
21
- when :union; write_union(writers_schema, datum, encoder)
22
- when :record, :error, :request; write_record(writers_schema, datum, encoder)
23
- else
24
- raise Avro::AvroError.new("Unknown type: #{writers_schema.type}")
25
- end
26
- end
27
- end
28
-
29
- module AvroPatches
30
- module SchemaValidator
31
- module IOPatches
32
- def write_record(writers_schema, datum, encoder)
33
- raise Avro::IO::AvroTypeError.new(writers_schema, datum) unless datum.is_a?(Hash)
34
- super
35
- end
36
-
37
- def write_array(writers_schema, datum, encoder)
38
- raise Avro::IO::AvroTypeError.new(writers_schema, datum) unless datum.is_a?(Array)
39
- super
40
- end
41
-
42
- def write_map(writers_schema, datum, encoder)
43
- raise Avro::IO::AvroTypeError.new(writers_schema, datum) unless datum.is_a?(Hash)
44
- super
45
- end
46
- end
47
- end
48
- end
49
-
50
- Avro::IO::DatumWriter.prepend(AvroPatches::SchemaValidator::IOPatches)
data/lib/avro-patches/schema_validator/schema.rb DELETED
@@ -1,9 +0,0 @@
1
- Avro::Schema.class_eval do
2
- # Determine if a ruby datum is an instance of a schema
3
- def self.validate(expected_schema, datum, options = { recursive: true })
4
- Avro::SchemaValidator.validate!(expected_schema, datum, options)
5
- true
6
- rescue Avro::SchemaValidator::ValidationError
7
- false
8
- end
9
- end
data/lib/avro-patches/schema_validator/schema_validator.rb DELETED
@@ -1,228 +0,0 @@
1
- # Licensed to the Apache Software Foundation (ASF) under one
2
- # or more contributor license agreements. See the NOTICE file
3
- # distributed with this work for additional information
4
- # regarding copyright ownership. The ASF licenses this file
5
- # to you under the Apache License, Version 2.0 (the
6
- # "License"); you may not use this file except in compliance
7
- # with the License. You may obtain a copy of the License at
8
- #
9
- # http://www.apache.org/licenses/LICENSE-2.0
10
- #
11
- # Unless required by applicable law or agreed to in writing, software
12
- # distributed under the License is distributed on an "AS IS" BASIS,
13
- # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14
- # See the License for the specific language governing permissions and
15
- # limitations under the License.
16
-
17
- module Avro
18
- class SchemaValidator
19
- ROOT_IDENTIFIER = '.'.freeze
20
- PATH_SEPARATOR = '.'.freeze
21
- INT_RANGE = Schema::INT_MIN_VALUE..Schema::INT_MAX_VALUE
22
- LONG_RANGE = Schema::LONG_MIN_VALUE..Schema::LONG_MAX_VALUE
23
- COMPLEX_TYPES = [:array, :error, :map, :record, :request].freeze
24
- BOOLEAN_VALUES = [true, false].freeze
25
-
26
- class Result
27
- attr_reader :errors
28
-
29
- def initialize
30
- @errors = []
31
- end
32
-
33
- def <<(error)
34
- @errors << error
35
- end
36
-
37
- def add_error(path, message)
38
- self << "at #{path} #{message}"
39
- end
40
-
41
- def failure?
42
- @errors.any?
43
- end
44
-
45
- def to_s
46
- errors.join("\n")
47
- end
48
- end
49
-
50
- class ValidationError < StandardError
51
- attr_reader :result
52
-
53
- def initialize(result = Result.new)
54
- @result = result
55
- super
56
- end
57
-
58
- def to_s
59
- result.to_s
60
- end
61
- end
62
-
63
- TypeMismatchError = Class.new(ValidationError)
64
-
65
- class << self
66
- # This method is replaced by code in AvroPatches::LogicalTypes::SchemaValidatorPatch.
67
- def validate!(expected_schema, datum, options = { recursive: true })
68
- options ||= {}
69
- options[:recursive] = true unless options.key?(:recursive)
70
-
71
- result = Avro::SchemaValidator::Result.new
72
- if options[:recursive]
73
- validate_recursive(expected_schema, datum, ROOT_IDENTIFIER, result)
74
- else
75
- validate_simple(expected_schema, datum, ROOT_IDENTIFIER, result)
76
- end
77
- fail Avro::SchemaValidator::ValidationError, result if result.failure?
78
- result
79
- end
80
-
81
- private
82
-
83
- def validate_type(expected_schema)
84
- unless Avro::Schema::VALID_TYPES_SYM.include?(expected_schema.type_sym)
85
- fail "Unexpected schema type #{expected_schema.type_sym} #{expected_schema.inspect}"
86
- end
87
- end
88
-
89
- # This method is replaced by code in AvroPatches::LogicalTypes::SchemaValidatorPatch.
90
- # The patches are layered this way because SchemaValidator exists on
91
- # avro's master branch but logical type support is still in PR.
92
- def validate_recursive(expected_schema, datum, path, result)
93
- validate_simple(expected_schema, datum, path, result)
94
-
95
- case expected_schema.type_sym
96
- when :array
97
- validate_array(expected_schema, datum, path, result)
98
- when :map
99
- validate_map(expected_schema, datum, path, result)
100
- when :union
101
- validate_union(expected_schema, datum, path, result)
102
- when :record, :error, :request
103
- fail TypeMismatchError unless datum.is_a?(Hash)
104
- expected_schema.fields.each do |field|
105
- deeper_path = deeper_path_for_hash(field.name, path)
106
- validate_recursive(field.type, datum[field.name], deeper_path, result)
107
- end
108
- end
109
- rescue TypeMismatchError
110
- result.add_error(path, "expected type #{expected_schema.type_sym}, got #{actual_value_message(datum)}")
111
- end
112
-
113
- def validate_simple(expected_schema, datum, path, result)
114
- validate_type(expected_schema)
115
-
116
- case expected_schema.type_sym
117
- when :null
118
- fail TypeMismatchError unless datum.nil?
119
- when :boolean
120
- fail TypeMismatchError unless BOOLEAN_VALUES.include?(datum)
121
- when :string, :bytes
122
- fail TypeMismatchError unless datum.is_a?(String)
123
- when :int
124
- fail TypeMismatchError unless datum.is_a?(Integer)
125
- result.add_error(path, "out of bound value #{datum}") unless INT_RANGE.cover?(datum)
126
- when :long
127
- fail TypeMismatchError unless datum.is_a?(Integer)
128
- result.add_error(path, "out of bound value #{datum}") unless LONG_RANGE.cover?(datum)
129
- when :float, :double
130
- fail TypeMismatchError unless datum.is_a?(Float) || datum.is_a?(Integer)
131
- when :fixed
132
- if datum.is_a? String
133
- result.add_error(path, fixed_string_message(expected_schema.size, datum)) unless datum.bytesize == expected_schema.size
134
- else
135
- result.add_error(path, "expected fixed with size #{expected_schema.size}, got #{actual_value_message(datum)}")
136
- end
137
- when :enum
138
- result.add_error(path, enum_message(expected_schema.symbols, datum)) unless expected_schema.symbols.include?(datum)
139
- end
140
- rescue TypeMismatchError
141
- result.add_error(path, "expected type #{expected_schema.type_sym}, got #{actual_value_message(datum)}")
142
- end
143
-
144
- def fixed_string_message(size, datum)
145
- "expected fixed with size #{size}, got \"#{datum}\" with size #{datum.size}"
146
- end
147
-
148
- def enum_message(symbols, datum)
149
- "expected enum with values #{symbols}, got #{actual_value_message(datum)}"
150
- end
151
-
152
- def validate_array(expected_schema, datum, path, result)
153
- fail TypeMismatchError unless datum.is_a?(Array)
154
- datum.each_with_index do |d, i|
155
- validate_recursive(expected_schema.items, d, path + "[#{i}]", result)
156
- end
157
- end
158
-
159
- def validate_map(expected_schema, datum, path, result)
160
- fail TypeMismatchError unless datum.is_a?(Hash)
161
- datum.keys.each do |k|
162
- result.add_error(path, "unexpected key type '#{ruby_to_avro_type(k.class)}' in map") unless k.is_a?(String)
163
- end
164
- datum.each do |k, v|
165
- deeper_path = deeper_path_for_hash(k, path)
166
- validate_recursive(expected_schema.values, v, deeper_path, result)
167
- end
168
- end
169
-
170
- def validate_union(expected_schema, datum, path, result)
171
- if expected_schema.schemas.size == 1
172
- validate_recursive(expected_schema.schemas.first, datum, path, result)
173
- return
174
- end
175
- failures = []
176
- compatible_type = first_compatible_type(datum, expected_schema, path, failures)
177
- return unless compatible_type.nil?
178
-
179
- complex_type_failed = failures.detect { |r| COMPLEX_TYPES.include?(r[:type]) }
180
- if complex_type_failed
181
- complex_type_failed[:result].errors.each { |error| result << error }
182
- else
183
- types = expected_schema.schemas.map { |s| "'#{s.type_sym}'" }.join(', ')
184
- result.add_error(path, "expected union of [#{types}], got #{actual_value_message(datum)}")
185
- end
186
- end
187
-
188
- def first_compatible_type(datum, expected_schema, path, failures)
189
- expected_schema.schemas.find do |schema|
190
- result = Result.new
191
- validate_recursive(schema, datum, path, result)
192
- failures << { type: schema.type_sym, result: result } if result.failure?
193
- !result.failure?
194
- end
195
- end
196
-
197
- def deeper_path_for_hash(sub_key, path)
198
- "#{path}#{PATH_SEPARATOR}#{sub_key}".squeeze(PATH_SEPARATOR)
199
- end
200
-
201
- def actual_value_message(value)
202
- avro_type = if value.is_a?(Integer)
203
- ruby_integer_to_avro_type(value)
204
- else
205
- ruby_to_avro_type(value.class)
206
- end
207
- if value.nil?
208
- avro_type
209
- else
210
- "#{avro_type} with value #{value.inspect}"
211
- end
212
- end
213
-
214
- def ruby_to_avro_type(ruby_class)
215
- {
216
- NilClass => 'null',
217
- String => 'string',
218
- Float => 'float',
219
- Hash => 'record'
220
- }.fetch(ruby_class, ruby_class)
221
- end
222
-
223
- def ruby_integer_to_avro_type(value)
224
- INT_RANGE.cover?(value) ? 'int' : 'long'
225
- end
226
- end
227
- end
228
- end