w3c_api 0.2.0 → 0.3.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: b1b6d2f332eeee4355bee1550931410695851be8e5349ee1ef0168fe370dab36
-   data.tar.gz: b9d8dfa4caae188d47acbd6ebe28e5c29a37035a9e43b60bae880467de5f22df
+   metadata.gz: ceb115051fa4d11d8bdfb690b26c93305556c3086a911a95e1605787e91ad437
+   data.tar.gz: 4c67a5092e6888768c13ff63c1b77b94b59db847e9df99c01501bdd401934199
  SHA512:
-   metadata.gz: 3b056eed82f4d6d6134f1e6f51b2b18a04d7e92683c359cdb5762495885afe5c142d8da2603cf05bf4c09eb75639343f7e0ec6c0433d38d9b8c602f417c4bae6
-   data.tar.gz: 43030538729e62863ddf3166c726c1f05a841d0b59a2c62f5590dce395a489601220be7ebd52b853ee5b55bb970209a60bc4ec74d29b091aa51af2f9de134580
+   metadata.gz: ed887189c60decd73e63fd1ee772baa656efdef09a667c921b0b438d3c6d53b92795b764526b461fb81223b92095ba552f18f0000063600f3d9a0f3b05e7b2d8
+   data.tar.gz: 33852b89ce9551bad83c5a1a524e06fe82393047daff471cffb540a4670c22f3dbd02986d91531fe4fb40a73c5ceb4a8628d14178b04f74c973e7b320fa396a2
data/CLAUDE.md ADDED
@@ -0,0 +1,113 @@
+ # CLAUDE.md
+
+ This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+ ## Overview
+
+ `w3c_api` is a Ruby client + Thor CLI for the W3C API (https://api.w3.org).
+ It is a thin layer over [lutaml-hal](https://github.com/lutaml/lutaml-hal):
+ lutaml-hal owns the HTTP client, HAL link realization, and pagination; this gem
+ contributes the endpoint registry, the resource models, a convenience `Client`
+ facade, and the CLI.
+
+ ## Commands
+
+ ```sh
+ bundle install # install dependencies
+ bundle exec rake # default task: spec + rubocop
+ bundle exec rake spec # run all tests
+ bundle exec rake rubocop # lint only
+ bundle exec rspec spec/w3c_api/client_spec.rb # single file
+ bundle exec rspec spec/w3c_api/client_spec.rb:42 # single example by line
+ exe/w3c_api <command> # run the CLI without installing
+ ```
+
+ Note: the README's `bin/setup` and `bin/console` do not exist in this repo —
+ use `bundle install` and `bundle exec`.
+
+ ## Architecture
+
+ The request path is: **CLI command → `Client` → `Hal` register → lutaml-hal →
+ `Models`**.
+
+ - **`lib/w3c_api/hal.rb`** — the heart of the gem. `Hal` is a `Singleton` that
+   builds the Faraday connection (with retry + cache config) and registers
+   *every* API endpoint in `setup`
+   (one `add_endpoint` call per endpoint, mapping an endpoint id like
+   `:specification_resource` → URL template + model + parameters). To add or
+   change an API endpoint, edit `setup` here. `SimpleParameter` exists only to
+   satisfy lutaml-hal's parameter-validation interface without pulling in its
+   full `EndpointParameter` machinery.
+
+ - **`lib/w3c_api/client.rb`** — `Client` is a flat convenience facade. Almost
+   every method is one line delegating to `Hal.instance.register.fetch(endpoint_id, **params)`
+   via the private `fetch_resource`. Many method families
+   (`group_*`, `user_*`, `affiliation_*`, `ecosystem_*`) are generated with
+   `define_method` loops — follow that pattern rather than writing them out.
+
+ - **`lib/w3c_api/models/`** — one file per resource. Two kinds:
+   - **Resources** subclass `Lutaml::Hal::Resource` (e.g. `Specification`).
+   - **Indexes** subclass `Lutaml::Hal::Page` (e.g. `SpecificationIndex`) and
+     provide pagination.
+   Models declare attributes, `hal_link` relationships (each link names a
+   `realize_class` and may be a `collection`), and a `key_value` block mapping
+   the API's hyphenated JSON keys (`series-version`) to underscored Ruby
+   attributes (`series_version`). `models.rb` requires every model file.
+
+ - **`lib/w3c_api/commands/`** — one Thor class per resource, all wired into the
+   root `Cli` in `cli.rb` as subcommands. Commands instantiate `Client`, call
+   it, and print via the shared `OutputFormatter` (`--format yaml|json`, YAML
+   default). `exe/w3c_api` is the executable.
+
+ ### HAL link realization
+
+ The defining feature: links are lazy. Calling `.realize` on a link follows its
+ `href` and returns the typed model — and chains
+ (`spec.links.latest_version.realize.links.editors...`). With `embed: true` on
+ supported index endpoints (`:specification_index`, `:group_index`,
+ `:serie_index`), the index response embeds child resources and `.realize` uses
+ that embedded data instead of issuing a new HTTP request (see
+ `lib/w3c_api/embed.rb` and `Client.embed_supported_endpoints`).
+
+ ### Rate limiting & retries (`hal.rb`)
+
+ Two cooperating layers, both tuned to grow 1→2→4→8→16s:
+ - lutaml-hal's `RateLimiter` (via `rate_limiting_options`) retries **429 and
+   5xx**.
+ - A Faraday `:retry` middleware in `connection` covers what lutaml-hal does not:
+   the W3C API signals rate-limiting with **HTTP 403**, plus connection/timeout
+   errors.
+
+ Owning retries in the client means consumers get resilience without wrapping.
+ Tune via `Hal.instance.configure_rate_limiting(...)`; changing options resets
+ the memoized client.
+
+ ### Caching (`hal.rb`)
+
+ lutaml-hal caches realized objects keyed by their canonical URL, so a document
+ linked from many places (editors, working groups, versions) is fetched once per
+ process. It is **enabled by default** (in-memory) — the register is built with
+ `cache: cache_options`. Tune or persist via `Hal.instance.configure_cache(...)`
+ (e.g. a `:filesystem` adapter) or turn it off with `disable_cache`; like the
+ rate-limiting setters these rebuild the register. `reset_register` also
+ unregisters from lutaml-hal's `GlobalRegister`, otherwise the rebuild raises
+ "replacing another one".
+
+ ## Testing
+
+ Specs use **VCR** (`hook_into :faraday`) with cassettes in
+ `spec/fixtures/vcr_cassettes/`. Default record mode is `:new_episodes` and
+ requests match on `method, uri, body` — so a new test that hits an unrecorded
+ request will perform a real HTTP call and record it. `spec_helper.rb` resets the
+ `Hal` singleton's register and the lutaml-hal `GlobalRegister` around every
+ example to prevent cross-test endpoint-registration bleed — which also gives
+ each example a fresh object cache, so caching doesn't mask expected requests.
+
+ ## Conventions
+
+ - Ruby >= 3.1. RuboCop inherits Ribose's shared OSS config; `LineLength` max is
+   180. `.rubocop_todo.yml` holds grandfathered offences — prefer fixing over
+   growing it.
+ - Endpoint ids follow `<resource>_resource` (single) / `<resource>_index`
+   (collection) naming; keep new ids consistent so the `define_method` loops and
+   `embed_supported?` discovery keep working.
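
The 1→2→4→8→16s schedule described under "Rate limiting & retries" can be reproduced with a few lines of Ruby. This is a minimal sketch, assuming the formula quoted in the `hal.rb` comment (`base_delay * backoff_factor**(attempt - 1)`, capped at `max_delay`). The `backoff_delays` helper is hypothetical and is not part of w3c_api or lutaml-hal. The 0.2.0 figures assume the same formula applied to the old defaults.

```ruby
# Hypothetical helper (not part of w3c_api or lutaml-hal): delay in seconds
# before each retry attempt, counting attempts from 1.
def backoff_delays(base_delay:, backoff_factor:, max_delay:, max_retries:)
  (1..max_retries).map do |attempt|
    [base_delay * backoff_factor**(attempt - 1), max_delay].min
  end
end

# Values taken from rate_limiting_options in 0.3.1
new_schedule = backoff_delays(base_delay: 1.0, backoff_factor: 2.0,
                              max_delay: 30.0, max_retries: 5)

# Values taken from 0.2.0, assuming the same formula applied there
old_schedule = backoff_delays(base_delay: 0.1, backoff_factor: 1.5,
                              max_delay: 10.0, max_retries: 5)

puts "0.3.1 delays: #{new_schedule.inspect}"
puts "0.3.1 total wait: #{new_schedule.sum}s"
puts "0.2.0 total wait: #{old_schedule.sum.round(2)}s"
```

Under these assumptions the new defaults wait about 31 seconds in total across five retries, against roughly 1.3 seconds for the old ones, and the 30-second cap is not reached within five attempts.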
data/README.adoc CHANGED
@@ -123,6 +123,36 @@ The stress test example demonstrates:
  * Dynamic configuration changes
  * Bulk operation patterns
 
+ === Caching
+
+ Realized objects are cached keyed by their (canonical) URL, so a resource linked from many places — editors, working groups, versions — is fetched only once per process. Caching is enabled by default (in-memory) and works transparently with `fetch` and link `.realize`.
+
+ ==== Quick start
+
+ [source,ruby]
+ ----
+ require 'w3c_api'
+
+ # Object caching is enabled by default (in-memory)
+ client = W3cApi::Client.new
+
+ # The same linked resource is fetched once, then served from cache
+ specs = client.specifications
+ spec = specs.links.specifications.first.realize # HTTP request
+ spec.links.latest_version.realize # HTTP request
+ spec.links.latest_version.realize # cache hit, no HTTP
+
+ # Disable caching entirely
+ W3cApi::Hal.instance.disable_cache
+
+ # Or persist the cache across runs on disk
+ W3cApi::Hal.instance.configure_cache(
+   adapter: { type: :filesystem, options: { path: "tmp/w3c_cache" } }
+ )
+ ----
+
+ The cache is backed by https://github.com/lutaml/lutaml-store[lutaml-store]: the default `:memory` adapter lives for the process, while the `:filesystem` (and `:sqlite`) adapters persist across runs.
+
  === Embed support with auto-realize
 
  ==== General
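
To make the "fetched only once per process" behaviour from the new Caching section concrete, here is a minimal sketch of a URL-keyed object cache. It is not the lutaml-hal or lutaml-store implementation: `UrlKeyedCache`, its `realize` method, the trailing-slash normalization, and the example URL are all hypothetical and chosen only for illustration.

```ruby
# Hypothetical illustration of caching realized objects by canonical URL.
class UrlKeyedCache
  attr_reader :fetch_count

  def initialize
    @store = {}
    @fetch_count = 0
  end

  # Returns the cached object for url, or runs the block once to "fetch" it.
  def realize(url)
    key = canonical(url)
    return @store[key] if @store.key?(key)

    @fetch_count += 1
    @store[key] = yield(key)
  end

  private

  # Assumed normalization for this sketch: ignore a trailing slash.
  def canonical(url)
    url.delete_suffix("/")
  end
end

cache = UrlKeyedCache.new
url = "https://api.example.test/specifications/html5"

first = cache.realize(url) { |u| { href: u } }          # block runs: simulated fetch
second = cache.realize("#{url}/") { |u| { href: u } }   # same canonical key: cache hit

puts "fetches performed: #{cache.fetch_count}"
puts "same object returned: #{first.equal?(second)}"
```

The second call never runs its block, which is the in-process equivalent of the "cache hit, no HTTP" line in the README example.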
data/lib/w3c_api/hal.rb CHANGED
@@ -1,6 +1,7 @@
  # frozen_string_literal: true
 
  require "singleton"
+ require "faraday/retry"
  require "lutaml/hal"
  require_relative "models"
 
@@ -50,18 +51,55 @@ module W3cApi
      def client
        @client ||= Lutaml::Hal::Client.new(
          api_url: API_URL,
+         connection: connection,
          rate_limiting: rate_limiting_options,
        )
      end
 
+     # Faraday connection mirroring lutaml-hal's default middleware stack, with a
+     # retry layer for the failures lutaml-hal's RateLimiter does not cover: the
+     # W3C API signals rate-limiting with HTTP 403, plus transient connection and
+     # timeout errors. (lutaml-hal still retries 429 and 5xx.) Owning retries here
+     # means every consumer of the client is resilient without its own wrapper.
+     def connection
+       @connection ||= Faraday.new(url: API_URL.delete_suffix("/")) do |conn|
+         conn.request :retry, retry_options
+         conn.use Faraday::FollowRedirects::Middleware
+         conn.request :json
+         conn.response :json, content_type: /\bjson$/
+         conn.adapter Faraday.default_adapter
+       end
+     end
+
+     # Retry policy for the W3C-specific transient failures (HTTP 403 and
+     # connection/timeout). Grows 1, 2, 4, 8, 16s, matching rate_limiting_options.
+     def retry_options
+       {
+         max: 5,
+         interval: 1.0,
+         backoff_factor: 2,
+         max_interval: 30.0,
+         retry_statuses: [403],
+         exceptions: [
+           Errno::ETIMEDOUT, Timeout::Error,
+           Faraday::TimeoutError, Faraday::ConnectionFailed
+         ],
+       }
+     end
+
      # Configure rate limiting options
+     #
+     # lutaml-hal's RateLimiter retries 429 and 5xx responses with exponential
+     # backoff (base_delay * backoff_factor**(attempt - 1), capped at max_delay).
+     # These defaults grow 1, 2, 4, 8, 16s so a rate-limited or briefly
+     # overloaded W3C API is given real room to recover during a bulk crawl.
      def rate_limiting_options
        @rate_limiting_options ||= {
          enabled: true,
          max_retries: 5,
-         base_delay: 0.1,
-         max_delay: 10.0,
-         backoff_factor: 1.5,
+         base_delay: 1.0,
+         max_delay: 30.0,
+         backoff_factor: 2.0,
        }
      end
 
@@ -82,10 +120,45 @@ module W3cApi
        configure_rate_limiting(enabled: true)
      end
 
+     # Cache options for the model register
+     #
+     # lutaml-hal caches realized objects keyed by their (canonical) URL, so a
+     # document linked from many places is fetched once. In-memory by default;
+     # pass an adapter config such as
+     # `{ adapter: { type: :filesystem, options: { path: "..." } } }` for
+     # cross-run persistence. Returns nil when caching is disabled.
+     def cache_options
+       return @cache_options if defined?(@cache_options)
+
+       @cache_options = { adapter: :memory }
+     end
+
+     # Set cache options (merged into the current ones)
+     def configure_cache(options = {})
+       @cache_options = (cache_options || {}).merge(options)
+       reset_register
+     end
+
+     # Disable caching of realized objects
+     def disable_cache
+       @cache_options = nil
+       reset_register
+     end
+
+     # Enable caching of realized objects
+     def enable_cache(options = nil)
+       @cache_options = options || { adapter: :memory }
+       reset_register
+     end
+
      def register
        return @register if @register
 
-       @register = Lutaml::Hal::ModelRegister.new(name: :w3c_api, client: client)
+       @register = Lutaml::Hal::ModelRegister.new(
+         name: :w3c_api,
+         client: client,
+         cache: cache_options,
+       )
        Lutaml::Hal::GlobalRegister.instance.register(:w3c_api, @register)
 
        # Re-run setup to register all endpoints with the new register
@@ -95,6 +168,9 @@ module W3cApi
      end
 
      def reset_register
+       # Drop the global registration too, otherwise rebuilding the register
+       # raises "replacing another one" when it re-registers the same name.
+       Lutaml::Hal::GlobalRegister.instance.unregister(:w3c_api)
        @register = nil
      end
 
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module W3cApi
-   VERSION = "0.2.0"
+   VERSION = "0.3.1"
  end
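
One detail in the `hal.rb` changes above is worth illustrating: `cache_options` memoizes with `defined?(@cache_options)` rather than `||=`. A plausible reason is that `disable_cache` assigns `nil`, and `||=` would treat that `nil` as "not set yet" and restore the default. The sketch below shows the difference in isolation. `CacheConfig` and its method names are hypothetical stand-ins, not the gem's code.

```ruby
# Hypothetical stand-in for the memoization used by cache_options.
class CacheConfig
  # Memoize with defined? so an explicit nil (caching disabled) is preserved.
  def options
    return @options if defined?(@options)

    @options = { adapter: :memory }
  end

  # What ||= would do instead: nil is falsy, so the default comes back.
  def options_with_or_equals
    @options ||= { adapter: :memory }
  end

  # Mirrors disable_cache: store nil to mean "no cache".
  def disable
    @options = nil
  end
end

config = CacheConfig.new
config.options                    # first call memoizes the in-memory default
config.disable

p config.options                  # the explicit nil survives
p config.options_with_or_equals   # the default is silently re-applied
```

With the `defined?` guard, "disabled" and "never configured" stay distinguishable, which is what lets the register be built with `cache: nil` after `disable_cache`.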
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: w3c_api
  version: !ruby/object:Gem::Version
-   version: 0.2.0
+   version: 0.3.1
  platform: ruby
  authors:
  - Ribose Inc.
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2026-04-21 00:00:00.000000000 Z
+ date: 2026-06-03 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: faraday
@@ -38,20 +38,34 @@ dependencies:
      - - ">="
        - !ruby/object:Gem::Version
          version: '0'
+ - !ruby/object:Gem::Dependency
+   name: faraday-retry
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '2.0'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '2.0'
  - !ruby/object:Gem::Dependency
    name: lutaml-hal
    requirement: !ruby/object:Gem::Requirement
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: 0.1.10
+         version: 0.2.0
    type: :runtime
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: 0.1.10
+         version: 0.2.0
  - !ruby/object:Gem::Dependency
    name: lutaml-model
    requirement: !ruby/object:Gem::Requirement
@@ -118,6 +132,7 @@ extra_rdoc_files: []
  files:
  - ".rubocop.yml"
  - ".rubocop_todo.yml"
+ - CLAUDE.md
  - LICENSE.md
  - README.adoc
  - Rakefile