avro_turf 0.8.0 → 1.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
- SHA1:
-   metadata.gz: a708b9aabeca7d45e1db532e180b2d80e4a5aecb
-   data.tar.gz: acb21f2435fd5126efed47803395d84eb4f5a220
+ SHA256:
+   metadata.gz: 692b0814cd5b8fcaaf55e6d607687714ae9bc496461fe0b838253d5fbeb2d218
+   data.tar.gz: 506a5d9bbe91a9386b1b92eb5c7e0366ad06856615442424725a82d4834cc6a6
  SHA512:
-   metadata.gz: 4bcc5e9832804eafb4f295a6fc85273a5a52c17a515f87ab3da0deb31034ed428c6aaee3754abcaa45d78d37daf6062c6c1d1007644f8578e2561b17f70f1614
-   data.tar.gz: 89071ed406d0be937344cc70814499e29206f05e52484fd4ba4f797f12bcb9236c36c4b080f22ffedd815ac3f6081e13405cb10fa1dc4184cfe73c8bbb279952
+   metadata.gz: 0db4e3e78577224cdb07ca4747c759bcdaae620dc03bbcac03194941a3bdf52bc69871cfad88dc14fc8ee6fd798a34d5acab289f8c841a97581290f258213f06
+   data.tar.gz: 672aec874705185e68dce6c8806ccf93b18ec8c3d0f34eda074e8fca6c372d4761d682ff316d3ba3d08c4b69b61a853bdb3c31ff68d4da2f0e37463a407ca656
@@ -0,0 +1,36 @@
+ version: 2
+ jobs:
+   build:
+     environment:
+       CIRCLE_ARTIFACTS: /tmp/circleci-artifacts
+       CIRCLE_TEST_REPORTS: /tmp/circleci-test-results
+     docker:
+     - image: circleci/ruby:2.6.2
+     steps:
+     - checkout
+     - run: mkdir -p $CIRCLE_ARTIFACTS $CIRCLE_TEST_REPORTS
+     - restore_cache:
+         keys:
+         # This branch if available
+         - v1-dep-{{ .Branch }}-
+         # Default branch if not
+         - v1-dep-master-
+         # Any branch if there are none on the default branch - this should be unnecessary if you have your default branch configured correctly
+         - v1-dep-
+     - run: gem install bundler --no-document
+     - run: 'bundle check --path=vendor/bundle || bundle install --path=vendor/bundle --jobs=4 --retry=3'
+     # Save dependency cache
+     - save_cache:
+         key: v1-dep-{{ .Branch }}-{{ epoch }}
+         paths:
+         - vendor/bundle
+         - ~/.bundle
+     - run: mkdir -p $CIRCLE_TEST_REPORTS/rspec
+     - run:
+         command: bundle exec rspec --color --require spec_helper --format progress
+     - store_test_results:
+         path: /tmp/circleci-test-results
+     - store_artifacts:
+         path: /tmp/circleci-artifacts
+     - store_artifacts:
+         path: /tmp/circleci-test-results
@@ -0,0 +1,20 @@
+ name: Ruby
+
+ on: [push, pull_request]
+
+ jobs:
+   build:
+
+     runs-on: ubuntu-latest
+
+     steps:
+     - uses: actions/checkout@v1
+     - name: Set up Ruby 2.6
+       uses: actions/setup-ruby@v1
+       with:
+         ruby-version: 2.6.x
+     - name: Build and test with RSpec
+       run: |
+         gem install bundler
+         bundle install --jobs 4 --retry 3
+         bundle exec rspec
@@ -0,0 +1,19 @@
+ name: Mark stale issues and pull requests
+
+ on:
+   schedule:
+   - cron: "0 0 * * *"
+
+ jobs:
+   stale:
+
+     runs-on: ubuntu-latest
+
+     steps:
+     - uses: actions/stale@v1
+       with:
+         repo-token: ${{ secrets.GITHUB_TOKEN }}
+         stale-issue-message: 'Stale issue message'
+         stale-pr-message: 'Stale pull request message'
+         stale-issue-label: 'no-issue-activity'
+         stale-pr-label: 'no-pr-activity'
@@ -1,6 +1,35 @@
- # avro_turf
+ # AvroTurf
+
+ ## Unreleased
+
+ ## v1.0.0
+
+ - Stop caching nested sub-schemas (#111)
+
+ ## v0.11.0
+
+ - Add proxy support (#107)
+ - Adding support for client certs (#109)
+
+ ## v0.10.0
+
+ - Add more disk caching (#103)
+ - Include schema information when decoding (#100, #101, #104)
+
+ ## v0.9.0
+
+ - Compatibility with Avro v1.9.0 (#94)
+ - Disable the auto registration of schema (#95)
+ - abstracted caching from CachedConfluentSchemaRegistry (#74)
+ - Load avro-patches if installed to silence deprecation errors (#85)
+ - Make schema store to be thread safe (#92)
+
+ ## v0.8.1
+
+ - Allow accessing schema store from outside AvroTurf (#68).
 
  ## v0.8.0
+
  - The names `AvroTurf::SchemaRegistry`, `AvroTurf::CachedSchemaRegistry`, and
    `FakeSchemaRegistryServer` are deprecated and will be removed in a future release.
    Use `AvroTurf::ConfluentSchemaRegistry`, `AvroTurf::CachedConfluentSchemaRegistry`,
data/Gemfile CHANGED
@@ -2,6 +2,3 @@ source 'https://rubygems.org'
 
  # Specify your gem's dependencies in avro_turf.gemspec
  gemspec
-
- # Used by CircleCI to format RSpec results.
- gem 'rspec_junit_formatter', :git => 'git@github.com:circleci/rspec_junit_formatter.git'
data/README.md CHANGED
@@ -16,6 +16,48 @@ These classes have been renamed to `AvroTurf::ConfluentSchemaRegistry`,
 
  The aliases for the original names will be removed in a future release.
 
+ ## Note about finding nested schemas
+
+ As of AvroTurf version 0.12.0, only top-level schemas that have their own .avsc file will be loaded and resolvable by the `AvroTurf::SchemaStore#find` method. This change will likely not affect most users. However, if you use `AvroTurf::SchemaStore#load_schemas!` to pre-cache all your schemas and then rely on `AvroTurf::SchemaStore#find` to access nested schemas that are not defined by their own .avsc files, your code may stop working when you upgrade to v0.12.0.
+
+ As an example, if you have a `person` schema (defined in `my/schemas/contacts/person.avsc`) that defines a nested `address` schema like this:
+
+ ```json
+ {
+   "name": "person",
+   "namespace": "contacts",
+   "type": "record",
+   "fields": [
+     {
+       "name": "address",
+       "type": {
+         "name": "address",
+         "type": "record",
+         "fields": [
+           { "name": "addr1", "type": "string" },
+           { "name": "addr2", "type": "string" },
+           { "name": "city", "type": "string" },
+           { "name": "zip", "type": "string" }
+         ]
+       }
+     }
+   ]
+ }
+ ```
+ ...this will no longer work in v0.12.0:
+ ```ruby
+ store = AvroTurf::SchemaStore.new(path: 'my/schemas')
+ store.load_schemas!
+
+ # Accessing 'person' is correct and works fine.
+ person = store.find('person', 'contacts') # my/schemas/contacts/person.avsc exists
+
+ # Trying to access 'address' raises AvroTurf::SchemaNotFoundError
+ address = store.find('address', 'contacts') # my/schemas/contacts/address.avsc is not found
+ ```
+
+ For details and context, see [this pull request](https://github.com/dasch/avro_turf/pull/111).
+
  ## Installation
 
  Add this line to your application's Gemfile:
@@ -124,9 +166,29 @@ avro = AvroTurf::Messaging.new(registry_url: "http://my-registry:8081/")
  # time a schema is used.
  data = avro.encode({ "title" => "hello, world" }, schema_name: "greeting")
 
+ # If you don't want to automatically register new schemas, you can pass explicitly
+ # subject and version to specify which schema should be used for encoding.
+ # It will fetch that schema from the registry and cache it. Subsequent instances
+ # of the same schema version will be served by the cache.
+ data = avro.encode({ "title" => "hello, world" }, subject: 'greeting', version: 1)
+
+ # You can also pass explicitly schema_id to specify which schema
+ # should be used for encoding.
+ # It will fetch that schema from the registry and cache it. Subsequent instances
+ # of the same schema version will be served by the cache.
+ data = avro.encode({ "title" => "hello, world" }, schema_id: 2)
+
  # When decoding, the schema will be fetched from the registry and cached. Subsequent
  # instances of the same schema id will be served by the cache.
  avro.decode(data) #=> { "title" => "hello, world" }
+
+ # If you want to get decoded message as well as the schema used to encode the message,
+ # you can use `#decode_message` method.
+ result = avro.decode_message(data)
+ result.message #=> { "title" => "hello, world" }
+ result.schema_id #=> 3
+ result.writer_schema #=> #<Avro::Schema: ...>
+ result.reader_schema #=> nil
  ```
 
  ### Confluent Schema Registry Client
@@ -17,16 +17,17 @@ Gem::Specification.new do |spec|
    spec.test_files = spec.files.grep(%r{^(test|spec|features)/})
    spec.require_paths = ["lib"]
 
-   spec.add_dependency "avro", ">= 1.7.7", "< 1.9"
-   spec.add_dependency "excon", "~> 0.45"
+   spec.add_dependency "avro", ">= 1.7.7", "< 1.10"
+   spec.add_dependency "excon", "~> 0.71"
 
-   spec.add_development_dependency "bundler", "~> 1.7"
-   spec.add_development_dependency "rake", "~> 10.0"
-   spec.add_development_dependency "rspec", "~> 3.2.0"
-   spec.add_development_dependency "fakefs", "~> 0.6.7"
+   spec.add_development_dependency "bundler", "~> 2.0"
+   spec.add_development_dependency "rake", "~> 13.0"
+   spec.add_development_dependency "rspec", "~> 3.2"
+   spec.add_development_dependency "fakefs", "~> 0.20.0"
    spec.add_development_dependency "webmock"
    spec.add_development_dependency "sinatra"
    spec.add_development_dependency "json_spec"
+   spec.add_development_dependency "rack-test"
 
    spec.post_install_message = %{
  avro_turf v0.8.0 deprecates the names AvroTurf::SchemaRegistry,
@@ -1,9 +1,18 @@
+ begin
+   require 'avro-patches'
+ rescue LoadError
+   false
+ end
  require 'avro_turf/version'
  require 'avro'
  require 'json'
  require 'avro_turf/schema_store'
  require 'avro_turf/core_ext'
- require 'avro_turf/schema_to_avro_patch'
+
+ # check for something that indicates Avro v1.9.0 or later
+ unless defined?(::Avro::LogicalTypes)
+   require 'avro_turf/schema_to_avro_patch'
+ end
 
  class AvroTurf
    class Error < StandardError; end
@@ -15,13 +24,15 @@ class AvroTurf
    # Create a new AvroTurf instance with the specified configuration.
    #
    # schemas_path - The String path to the root directory containing Avro schemas (default: "./schemas").
+   # schema_store - A schema store object that responds to #find(schema_name, namespace).
    # namespace - The String namespace that should be used to qualify schema names (optional).
    # codec - The String name of a codec that should be used to compress messages (optional).
    #
    # Currently, the only valid codec name is `deflate`.
-   def initialize(schemas_path: nil, namespace: nil, codec: nil)
+   def initialize(schemas_path: nil, schema_store: nil, namespace: nil, codec: nil)
      @namespace = namespace
-     @schema_store = SchemaStore.new(path: schemas_path || DEFAULT_SCHEMAS_PATH)
+     @schema_store = schema_store ||
+       SchemaStore.new(path: schemas_path || DEFAULT_SCHEMAS_PATH)
      @codec = codec
    end
 
@@ -1,16 +1,23 @@
  require 'avro_turf/confluent_schema_registry'
+ require 'avro_turf/in_memory_cache'
+ require 'avro_turf/disk_cache'
 
  # Caches registrations and lookups to the schema registry in memory.
  class AvroTurf::CachedConfluentSchemaRegistry
 
-   def initialize(upstream)
+   # Instantiate a new CachedConfluentSchemaRegistry instance with the given configuration.
+   # By default, uses a provided InMemoryCache to prevent repeated calls to the upstream registry.
+   #
+   # upstream - The upstream schema registry object that fully responds to all methods in the
+   #            AvroTurf::ConfluentSchemaRegistry interface.
+   # cache - Optional user provided Cache object that responds to all methods in the AvroTurf::InMemoryCache interface.
+   def initialize(upstream, cache: nil)
      @upstream = upstream
-     @schemas_by_id = {}
-     @ids_by_schema = {}
+     @cache = cache || AvroTurf::InMemoryCache.new()
    end
 
    # Delegate the following methods to the upstream
-   %i(subjects subject_versions subject_version check compatible?
+   %i(subjects subject_versions check compatible?
       global_config update_global_config subject_config update_subject_config).each do |name|
      define_method(name) do |*args|
        instance_variable_get(:@upstream).send(name, *args)
@@ -18,10 +25,15 @@ class AvroTurf::CachedConfluentSchemaRegistry
      end
    end
 
    def fetch(id)
-     @schemas_by_id[id] ||= @upstream.fetch(id)
+     @cache.lookup_by_id(id) || @cache.store_by_id(id, @upstream.fetch(id))
    end
 
    def register(subject, schema)
-     @ids_by_schema[subject + schema.to_s] ||= @upstream.register(subject, schema)
+     @cache.lookup_by_schema(subject, schema) || @cache.store_by_schema(subject, schema, @upstream.register(subject, schema))
+   end
+
+   def subject_version(subject, version = 'latest')
+     @cache.lookup_by_version(subject, version) ||
+       @cache.store_by_version(subject, version, @upstream.subject_version(subject, version))
    end
  end
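Reviewer note: the hunk above replaces the `||=` memoization with a `lookup || store` pair against a pluggable cache object. The following is a minimal, self-contained sketch of that pattern, not the gem's code. `CountingUpstream`, `TinyCache` and `CachingFetcher` are hypothetical stand-ins introduced here so the example runs without a registry. It relies on `store_by_id` returning the stored value, which is what a bare `hash[key] = value` method body evaluates to.

```ruby
# Hypothetical stand-in for an upstream registry; counts how often it is hit.
class CountingUpstream
  attr_reader :fetch_calls

  def initialize(schemas)
    @schemas = schemas
    @fetch_calls = 0
  end

  def fetch(id)
    @fetch_calls += 1
    @schemas.fetch(id)
  end
end

# Hypothetical minimal cache exposing the two id-based methods used below.
class TinyCache
  def initialize
    @schemas_by_id = {}
  end

  def lookup_by_id(id)
    @schemas_by_id[id]
  end

  # An assignment evaluates to the assigned value, so the schema is returned.
  def store_by_id(id, schema)
    @schemas_by_id[id] = schema
  end
end

# Same shape as the wrapper's #fetch: try the cache, fall back to upstream.
class CachingFetcher
  def initialize(upstream, cache: nil)
    @upstream = upstream
    @cache = cache || TinyCache.new
  end

  def fetch(id)
    @cache.lookup_by_id(id) || @cache.store_by_id(id, @upstream.fetch(id))
  end
end

upstream = CountingUpstream.new(1 => '{"type":"string"}')
fetcher = CachingFetcher.new(upstream)
first = fetcher.fetch(1)   # goes to the upstream and fills the cache
second = fetcher.fetch(1)  # served from the cache
```

Because the right-hand side of `||` is only evaluated on a cache miss, the upstream is consulted once per id, and any object with the same method names can be passed as `cache:`.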
@@ -3,11 +3,30 @@ require 'excon'
  class AvroTurf::ConfluentSchemaRegistry
    CONTENT_TYPE = "application/vnd.schemaregistry.v1+json".freeze
 
-   def initialize(url, logger: Logger.new($stdout))
+   def initialize(
+     url,
+     logger: Logger.new($stdout),
+     proxy: nil,
+     client_cert: nil,
+     client_key: nil,
+     client_key_pass: nil,
+     client_cert_data: nil,
+     client_key_data: nil
+   )
      @logger = logger
-     @connection = Excon.new(url, headers: {
-       "Content-Type" => CONTENT_TYPE,
-     })
+     headers = {
+       "Content-Type" => CONTENT_TYPE
+     }
+     headers[:proxy] = proxy if proxy&.present?
+     @connection = Excon.new(
+       url,
+       headers: headers,
+       client_cert: client_cert,
+       client_key: client_key,
+       client_key_pass: client_key_pass,
+       client_cert_data: client_cert_data,
+       client_key_data: client_key_data
+     )
    end
 
    def fetch(id)
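Reviewer note: `proxy&.present?` in the hunk above depends on `String#present?`, which plain Ruby does not define (it comes from ActiveSupport), so a non-nil proxy string would raise `NoMethodError` unless ActiveSupport is loaded. The value is also stored under `headers[:proxy]`, whereas Excon documents `:proxy` as a connection option. The sketch below shows one plain-Ruby way to assemble such options. `connection_options` is a hypothetical helper, not part of the gem, and it only builds a Hash so that it runs without Excon installed.

```ruby
# Hypothetical helper: collect only the connection options that were supplied.
def connection_options(proxy: nil, client_cert: nil, client_key: nil)
  options = {
    headers: { "Content-Type" => "application/vnd.schemaregistry.v1+json" }
  }
  # Plain-Ruby emptiness check instead of ActiveSupport's present?.
  options[:proxy] = proxy unless proxy.nil? || proxy.empty?
  options[:client_cert] = client_cert if client_cert
  options[:client_key] = client_key if client_key
  options
end

with_proxy = connection_options(proxy: "http://proxy.example:3128")
without_proxy = connection_options(proxy: "")
```

The resulting Hash could then be splatted into the connection constructor. Whether the proxy belongs at the top level of the options is an assumption based on Excon's documented `:proxy` option, not on anything shown in this diff.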
@@ -0,0 +1,83 @@
+ # A cache for the CachedConfluentSchemaRegistry.
+ # Extends the InMemoryCache to provide a write-thru to disk for persistent cache.
+ class AvroTurf::DiskCache < AvroTurf::InMemoryCache
+
+   def initialize(disk_path)
+     super()
+
+     # load the write-thru cache on startup, if it exists
+     @schemas_by_id_path = File.join(disk_path, 'schemas_by_id.json')
+     @schemas_by_id = JSON.parse(File.read(@schemas_by_id_path)) if File.exist?(@schemas_by_id_path)
+
+     @ids_by_schema_path = File.join(disk_path, 'ids_by_schema.json')
+     @ids_by_schema = JSON.parse(File.read(@ids_by_schema_path)) if File.exist?(@ids_by_schema_path)
+
+     @schemas_by_subject_version_path = File.join(disk_path, 'schemas_by_subject_version.json')
+     @schemas_by_subject_version = {}
+   end
+
+   # override
+   # the write-thru cache (json) does not store keys in numeric format
+   # so, convert id to a string for caching purposes
+   def lookup_by_id(id)
+     super(id.to_s)
+   end
+
+   # override to include write-thru cache after storing result from upstream
+   def store_by_id(id, schema)
+     # must return the value from storing the result (i.e. do not return result from file write)
+     value = super(id.to_s, schema)
+     File.write(@schemas_by_id_path, JSON.pretty_generate(@schemas_by_id))
+     return value
+   end
+
+   # override to include write-thru cache after storing result from upstream
+   def store_by_schema(subject, schema, id)
+     # must return the value from storing the result (i.e. do not return result from file write)
+     value = super
+     File.write(@ids_by_schema_path, JSON.pretty_generate(@ids_by_schema))
+     return value
+   end
+
+   # checks instance var (in-memory cache) for schema
+   # checks disk cache if in-memory cache doesn't exist
+   # if file exists but no in-memory cache, read from file and sync in-memory cache
+   # finally, if file doesn't exist return nil
+   def lookup_by_version(subject, version)
+     key = "#{subject}#{version}"
+     schema = @schemas_by_subject_version[key]
+
+     return schema unless schema.nil?
+
+     hash = JSON.parse(File.read(@schemas_by_subject_version_path)) if File.exist?(@schemas_by_subject_version_path)
+     if hash
+       @schemas_by_subject_version = hash
+       @schemas_by_subject_version[key]
+     end
+   end
+
+   # check if file exists and parse json into a hash
+   # if file exists take json and overwrite/insert schema at key
+   # if file doesn't exist create new hash
+   # write the new/updated hash to file
+   # update instance var (in memory-cache) to match
+   def store_by_version(subject, version, schema)
+     key = "#{subject}#{version}"
+     hash = JSON.parse(File.read(@schemas_by_subject_version_path)) if File.exist?(@schemas_by_subject_version_path)
+     hash = if hash
+              hash[key] = schema
+              hash
+            else
+              {key => schema}
+            end
+
+     write_to_disk_cache(@schemas_by_subject_version_path, hash)
+
+     @schemas_by_subject_version = hash
+     @schemas_by_subject_version[key]
+   end
+
+   private def write_to_disk_cache(path, hash)
+     File.write(path, JSON.pretty_generate(hash))
+   end
+ end
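Reviewer note: the `id.to_s` conversion in `lookup_by_id` and `store_by_id` above follows from how JSON treats object keys. A short sketch using only the standard `json` library (the schema string is an arbitrary placeholder) shows why an Integer id would miss after the cache file is reloaded:

```ruby
require 'json'

# An in-memory hash keyed by Integer id, as a registry client would use it.
schemas_by_id = { 42 => '{"type":"string"}' }

# Round-trip through JSON, as a disk-backed cache does across restarts.
reloaded = JSON.parse(JSON.generate(schemas_by_id))

integer_hit = reloaded[42]       # miss: JSON object keys are always strings
string_hit = reloaded[42.to_s]   # hit: the key came back as "42"
```

Normalizing the id to a String on both the store and the lookup path keeps entries written in this process and entries loaded from disk under the same key.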
@@ -0,0 +1,38 @@
+ # A cache for the CachedConfluentSchemaRegistry.
+ # Simply stores the schemas and ids in in-memory hashes.
+ class AvroTurf::InMemoryCache
+
+   def initialize
+     @schemas_by_id = {}
+     @ids_by_schema = {}
+     @schema_by_subject_version = {}
+   end
+
+   def lookup_by_id(id)
+     @schemas_by_id[id]
+   end
+
+   def store_by_id(id, schema)
+     @schemas_by_id[id] = schema
+   end
+
+   def lookup_by_schema(subject, schema)
+     key = subject + schema.to_s
+     @ids_by_schema[key]
+   end
+
+   def store_by_schema(subject, schema, id)
+     key = subject + schema.to_s
+     @ids_by_schema[key] = id
+   end
+
+   def lookup_by_version(subject, version)
+     key = "#{subject}#{version}"
+     @schema_by_subject_version[key]
+   end
+
+   def store_by_version(subject, version, schema)
+     key = "#{subject}#{version}"
+     @schema_by_subject_version[key] = schema
+   end
+ end
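Reviewer note: both caches build the version key as `"#{subject}#{version}"` with no separator, so two different subject/version pairs can map to the same key when a subject name ends in digits. The sketch below demonstrates the collision and one possible alternative. `concatenated_key` mirrors the expression in the diff, while `separated_key` is a hypothetical variant that assumes the version (an Integer or `'latest'`) never contains the separator character.

```ruby
# Mirrors the key expression used in lookup_by_version / store_by_version.
def concatenated_key(subject, version)
  "#{subject}#{version}"
end

# Hypothetical alternative: a separator keeps the two parts distinguishable,
# assuming the version itself never contains "/".
def separated_key(subject, version)
  "#{subject}/#{version}"
end

collision_a = concatenated_key("orders1", 2)   # "orders12"
collision_b = concatenated_key("orders", 12)   # "orders12"

distinct_a = separated_key("orders1", 2)       # "orders1/2"
distinct_b = separated_key("orders", 12)       # "orders/12"
```

Whether this matters in practice depends on the subject naming scheme in use. A String key with a separator also stays compatible with the JSON file written by the disk cache.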