gem-guardian 0.3.0 → 0.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (39)
  1. checksums.yaml +4 -4
  2. data/.github/workflows/{main.yml → ci.yml} +3 -21
  3. data/.rubocop.yml +12 -0
  4. data/CHANGELOG.md +25 -1
  5. data/CODE_OF_CONDUCT.md +1 -1
  6. data/Gemfile +0 -1
  7. data/README.md +397 -49
  8. data/Rakefile +27 -27
  9. data/bin/console +2 -2
  10. data/gem-guardian.gemspec +11 -9
  11. data/lib/gem/guardian/artifact_store.rb +13 -2
  12. data/lib/gem/guardian/checksum_provider.rb +181 -0
  13. data/lib/gem/guardian/cli.rb +99 -7
  14. data/lib/gem/guardian/configuration.rb +88 -0
  15. data/lib/gem/guardian/dependency.rb +5 -1
  16. data/lib/gem/guardian/github_release_verifier.rb +2 -2
  17. data/lib/gem/guardian/lockfile_parser.rb +32 -6
  18. data/lib/gem/guardian/progress.rb +66 -0
  19. data/lib/gem/guardian/provenance_verifier.rb +1 -3
  20. data/lib/gem/guardian/registry.rb +83 -0
  21. data/lib/gem/guardian/registry_audit.rb +81 -0
  22. data/lib/gem/guardian/report_builder.rb +3 -4
  23. data/lib/gem/guardian/result_printer.rb +35 -5
  24. data/lib/gem/guardian/rubygems_client.rb +366 -21
  25. data/lib/gem/guardian/verifier.rb +119 -12
  26. data/lib/gem/guardian/version.rb +1 -1
  27. data/lib/gem/guardian.rb +4 -0
  28. data/script/registry_provenance_audit.rb +41 -0
  29. metadata +16 -19
  30. data/sig/gem/guardian/artifact_store.rbs +0 -22
  31. data/sig/gem/guardian/checksum.rbs +0 -14
  32. data/sig/gem/guardian/cli.rbs +0 -60
  33. data/sig/gem/guardian/dependency.rbs +0 -18
  34. data/sig/gem/guardian/error.rbs +0 -26
  35. data/sig/gem/guardian/lockfile_parser.rbs +0 -55
  36. data/sig/gem/guardian/rubygems_client.rbs +0 -46
  37. data/sig/gem/guardian/verifier.rbs +0 -40
  38. data/sig/gem/guardian/version.rbs +0 -10
  39. data/sig/gem/guardian.rbs +0 -4
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: a18ee9f2f2111d0eca38def30302a7d66b9b6b9b96df06909f883be2978c1bcd
- data.tar.gz: e6a421612c4c50423ef7fe29f74f2bc2503cb624b813befb114bc679940fcd5f
+ metadata.gz: d5b22c1d6bd9bad2286db3b0bf061e14df64e544a7ec050b5c3ba883dc5dbaf9
+ data.tar.gz: 98768de703f861207c355cdf9299428d9b70caeab628b8861e643ed8d38fa6ab
  SHA512:
- metadata.gz: 3577f45013b8bb9b94905e7f2c081501ef3b863f2913faee08a0604184e5181ed278bd75a57cee9f9d571d00395b26c3f666ab1c126abe251c88cc54f24cb8c3
- data.tar.gz: faa68291c29be60c7b42d7cf1452dc8e5d29222ee8174e22ed7bbb6f913c5405a264555fe16c020e934d23863232aec74ee2befb537a7bb06192ab52d154bffb
+ metadata.gz: e167265b43bfb4311f73df8390579c9707034dfab202cd694fb0a43e3a0a75c68431217e7068de518bd75b14cb71ee1a08cffa0ccbe59e8dff869d35db958fb4
+ data.tar.gz: 419cf82e3c70e1e34fd29e37e277899a96355b410a0b54dfb9830daebb8d5e3818f4016d835408802fad6ab588d37c97bab3836cff62c5455d28780476620c1b
data/.github/workflows/{main.yml → ci.yml} CHANGED
@@ -1,4 +1,4 @@
- name: Ruby
+ name: CI
 
  on:
  push:
@@ -31,23 +31,5 @@ jobs:
  with:
  ruby-version: ${{ matrix.ruby }}
  bundler-cache: true
- - name: Run the test suite
- run: COVERAGE=true bundle exec rake test
- - name: Validate RBS signatures
- run: bundle exec rake rbs:validate
-
- lint:
- runs-on: ubuntu-latest
- name: RuboCop
-
- steps:
- - uses: actions/checkout@v6
- with:
- persist-credentials: false
- - name: Set up Ruby
- uses: ruby/setup-ruby@v1
- with:
- ruby-version: "3.2"
- bundler-cache: true
- - name: Run RuboCop
- run: bundle exec rubocop
+ - name: Run the default rake task
+ run: bundle exec rake
data/.rubocop.yml CHANGED
@@ -18,3 +18,15 @@ Style/StringLiterals:
 
  Style/StringLiteralsInInterpolation:
  EnforcedStyle: double_quotes
+
+ Metrics/ClassLength:
+ Max: 110
+
+ Metrics/MethodLength:
+ Max: 16
+
+ Layout/LineLength:
+ Max: 130
+
+ Metrics/ParameterLists:
+ CountKeywordArgs: false
data/CHANGELOG.md CHANGED
@@ -1,6 +1,30 @@
  # Changelog
 
- ## [Unreleased]
+ ## Unreleased
+
+ ## [0.4.0]
+
+ - Add `.gem-guardian.yml` project configuration for publisher checksum providers.
+ - Support source-scoped checksum URL providers for private/commercial gem registries.
+ - Tighten gemspec positioning around lockfile, registry, artifact checksum verification, and supply-chain provenance.
+ - Add checksum-provider branch coverage for the default `Net::HTTP` path used by publisher checksum URLs.
+ - Keep the known JSON stdout noise issue tracked as a follow-up rather than blocking the checksum-provider release.
+
+ - Implemented checksum-source triage: lockfile, registry, and artifact.
+ - Added optional registry SHA256 cross-check in lockfile mode.
+ - Updated JSON checksum payloads with `registry_sha256`.
+ - Clarified README trust model around `PASS` vs `RECORDED`.
+
+ - Resolve explicit private-registry checksums through Bundler/RubyGems Compact Index metadata (`/info/<gem>`) when RubyGems.org-style versions APIs are unavailable.
+ - Fall back to artifact digest recording for explicit private-registry gems when no independent registry checksum is exposed.
+ - Skip Trusted Publishing provenance lookups for non-RubyGems.org sources so private registry gems report unsupported provenance instead of API 404 errors.
+
+ - Improve YARD documentation for CLI lockfile filtering and progress helpers.
+ - Expand README with real-world Rails provenance results, private registry behavior, lockfile filtering, CI/CD guidance, and registry audit usage.
+
+ - Fix registry audit source handling by preserving `Gem::SourceList` for `Gem::SpecFetcher`.
+ - Add regression coverage for registry source normalization and private/source-specific artifact paths.
+ - Add focused branch coverage for RubyGems client edge cases around source resolution, redirects, authentication, and provenance parsing.
 
  ## [0.3.0] - 2026-06-12
 
data/CODE_OF_CONDUCT.md CHANGED
@@ -1,6 +1,6 @@
  # Code of Conduct
 
- "cdc-parallel" follows [The Ruby Community Conduct Guideline](https://www.ruby-lang.org/en/conduct) in all "collaborative space", which is defined as community communications channels (such as mailing lists, submitted patches, commit comments, etc.):
+ "gem-guardian" follows [The Ruby Community Conduct Guideline](https://www.ruby-lang.org/en/conduct) in all "collaborative space", which is defined as community communications channels (such as mailing lists, submitted patches, commit comments, etc.):
 
  * Participants will be tolerant of opposing views.
  * Participants must ensure that their language and actions are free of personal attacks and disparaging personal remarks.
data/Gemfile CHANGED
@@ -11,5 +11,4 @@ gem "rubocop", "~> 1.87"
  gem "rubocop-minitest", "~> 0.39.1"
  gem "rubocop-rake", "~> 0.7.1"
  gem "simplecov", "~> 0.22.0"
- gem "steep", "~> 2.0"
  gem "yard", "~> 0.9.44"
data/README.md CHANGED
@@ -5,124 +5,472 @@
  [![Ruby Version](https://img.shields.io/badge/ruby-%3E%3D%203.2-ruby.svg)](https://www.ruby-lang.org/en/)
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 
+ Consumer-side integrity and provenance verification for Ruby gems.
 
- Consumer-side integrity verification for Ruby gems.
-
- `gem-guardian` audits Bundler checksum coverage, verifies `.gem` artifacts against RubyGems SHA256 data when needed, and can verify Trusted Publishing provenance for supported releases, including GitHub release checksum/signature discovery and signed-tag attestation checks when the release data exposes them. It stays intentionally small: no Bundler monkeypatching, no install hooks, and no custom publishing flow required.
+ `gem-guardian` audits Bundler checksum coverage, verifies `.gem` artifacts against lockfile and registry checksum sources, records artifact digests when no independent checksum exists, and reports Trusted Publishing provenance when the registry exposes it. It is intentionally small: no Bundler monkeypatching, no install hooks, and no custom publishing flow required.
 
  ## Why
 
- RubyGems.org displays SHA256 checksums for published gem artifacts, Bundler 2.6 can store and enforce checksums in `Gemfile.lock`, and RubyGems now exposes attestation data for Trusted Publishing releases. That means the most useful current release is an audit and verification tool that tells you whether your bundle and release metadata are actually protected.
+ Ruby now has several useful supply-chain signals:
+
+ - RubyGems.org exposes SHA256 checksums for published gem artifacts.
+ - Bundler can store and enforce checksums in `Gemfile.lock`.
+ - RubyGems exposes provenance metadata for gems published through Trusted Publishing.
 
- This 0.2.0 scope is:
+ The missing piece is consumer-side visibility.
+
+ `gem-guardian` helps answer:
 
  ```text
- Gemfile.lock
-
- CHECKSUMS coverage audit
-
- RubyGems.org checksum comparison when needed
-
- Trusted Publishing provenance verification when available
-
- Actionable report for CI or local review
+ Did the artifact match my lockfile checksum?
+ Did it also match registry or publisher checksum metadata?
+ Was only an artifact digest recorded?
+ Was it published through Trusted Publishing?
+ Which repository, workflow, and commit produced it?
  ```
 
- This reports whether your lockfile is using Bundler checksum protection, whether any locked gems are missing expected checksum data, whether RubyGems exposes Trusted Publishing provenance for the gem being verified, and whether GitHub release assets and tag attestations are available for the release being inspected.
 
- ## Installation
+ ## Integrity model
 
- From a local checkout:
+ `gem-guardian` separates checksum sources from the downloaded artifact. The downloaded `.gem` is always hashed locally. A result is considered verified only when that artifact digest is compared with an independent expected digest.
+
+ | Level | Required checks | Output | Meaning |
+ | --- | --- | --- | --- |
+ | Lockfile + registry + artifact | `lockfile SHA256 == registry SHA256 == artifact SHA256` | `PASS`, `source lockfile`, `registry ...` | Strongest path. The project lockfile and registry metadata agree with the downloaded artifact. |
+ | Lockfile + artifact | `lockfile SHA256 == artifact SHA256` | `PASS`, `source lockfile` | Strong project-level verification. Works even when the registry does not expose checksum metadata. |
+ | Registry + artifact | `registry SHA256 == artifact SHA256` | `PASS`, `source registry` | Registry-anchored verification for ad-hoc gem checks without a lockfile. |
+ | Artifact only | `artifact SHA256` only | `RECORDED`, `source artifact` | Informational only. The artifact was hashed, but no independent checksum source was available. |
+
+ Verification priority is:
+
+ ```text
+ lockfile > registry > artifact
+ ```
+
+ `RECORDED` is intentionally not called `PASS`: there is no independent checksum to compare against.
+
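The integrity levels in the table above can be sketched as a small decision function. This is a minimal illustration, not gem-guardian's implementation: `Result` and `classify` are hypothetical names, and treating any disagreeing anchor as `FAIL` is an assumption drawn from the "required checks" column.

```ruby
# Hypothetical helper: names and return shape are illustrative only.
Result = Struct.new(:status, :source, keyword_init: true)

def classify(artifact:, lockfile: nil, registry: nil)
  # Priority order from the README: lockfile > registry > artifact.
  anchors = { lockfile: lockfile, registry: registry }.compact
  # No independent checksum: the digest is only recorded, never "verified".
  return Result.new(status: "RECORDED", source: :artifact) if anchors.empty?

  # Assumption: every available anchor must agree with the local digest.
  matched = anchors.values.all? { |expected| expected.casecmp?(artifact) }
  Result.new(status: matched ? "PASS" : "FAIL", source: anchors.keys.first)
end
```

With only `artifact:` supplied the sketch returns `RECORDED`, which mirrors why the README refuses to label that case `PASS`.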
+ ## Real-world example
+
+ Against a freshly generated Rails 8 application with lockfile checksums enabled:
 
  ```bash
- gem build gem-guardian.gemspec
- gem install ./gem-guardian-0.2.0.gem
+ bundle lock --add-checksums
+ gem-guardian verify --provenance
  ```
 
- ## Usage
+ Observed result:
+
+ ```text
+ CHECKSUMS coverage: 142/142
+
+ PROVENANCE PASS: 35
+ PROVENANCE UNSUPPORTED: 107
+ ```
+
+ | Signal | Coverage |
+ | --- | ---: |
+ | Bundler lockfile checksums | 142 / 142 (100%) |
+ | Trusted Publishing provenance | 35 / 142 (24.6%) |
+ | Provenance unavailable | 107 / 142 (75.4%) |
+
+ This illustrates the distinction between integrity and provenance:
+
+ ```text
+ Integrity:
+ Did I receive the expected artifact?
+
+ Provenance:
+ Who built and published this artifact?
+ ```
 
- Build and install the current release from a local checkout:
+ A dependency graph can have complete checksum coverage while still having limited provenance visibility.
+
+ ## Installation
+
+ Install from RubyGems:
 
  ```bash
- gem build gem-guardian.gemspec
- gem install ./gem-guardian-0.2.0.gem
- gem-guardian version
+ gem install gem-guardian
  ```
 
- Show the built-in help:
+ Or build from a local checkout:
 
  ```bash
- gem-guardian help
- gem-guardian --help
+ gem build gem-guardian.gemspec
+ gem install ./gem-guardian-0.3.1.gem
  ```
 
- Prepare a locked project for checksum auditing:
+ ## Quick start
+
+ Prepare a Bundler project for checksum auditing:
 
  ```bash
  bundle lock --add-checksums
  ```
 
- Verify all gems in `Gemfile.lock`:
+ Verify the lockfile:
 
  ```bash
  gem-guardian verify
  ```
 
- Verify a specific gem version:
+ Verify integrity and provenance:
+
+ ```bash
+ gem-guardian verify --provenance
+ ```
+
+ Emit JSON for CI:
+
+ ```bash
+ gem-guardian verify --json --provenance
+ ```
+
+ ## Usage
+
+ Show help:
+
+ ```bash
+ gem-guardian help
+ gem-guardian --help
+ ```
+
+ Verify a specific published gem:
 
  ```bash
+ gem-guardian verify rails:8.1.3
  gem-guardian verify cdc-sidekiq:0.1.1
  gem-guardian verify ratomic:0.4.1
  ```
 
- Verify a platform gem:
+ Verify a platform-specific gem:
 
  ```bash
  gem-guardian verify nokogiri:1.18.9:x86_64-linux
  ```
 
- Use a non-default lockfile:
+ Verify all gems in a non-default lockfile:
 
  ```bash
  gem-guardian verify --lockfile path/to/Gemfile.lock
  ```
 
- Emit JSON for CI:
+ Verify only selected gems from a lockfile:
 
  ```bash
- gem-guardian verify --json
- gem-guardian verify --json --provenance
+ gem-guardian verify --lockfile path/to/Gemfile.lock --provenance mammoth:0.1.1
+ gem-guardian verify --lockfile path/to/Gemfile.lock --provenance nokogiri:1.18.9:x86_64-linux
+ ```
+
+ When a platform is omitted in lockfile mode, `gem-guardian` matches every locked platform for that gem and version.
+
+ ## How verification works
+
+ `gem-guardian` separates three integrity signals:
+
+ 1. **Lockfile checksum** — expected SHA256 comes from Bundler's `Gemfile.lock` `CHECKSUMS` section.
+ 2. **Registry or publisher checksum** — expected SHA256 comes from registry metadata or a configured checksum provider when available.
+ 3. **Artifact digest** — SHA256 is computed from the downloaded `.gem` file.
+
+ The artifact digest is always computed locally. Lockfile and registry/publisher checksums are independent trust anchors used for comparison.
+
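Computing the artifact digest locally needs nothing beyond Ruby's standard library. The sketch below shows one way to do it; the helper names `artifact_sha256` and `digest_matches?` are hypothetical and are not taken from gem-guardian's source.

```ruby
require "digest"

# Hash a downloaded .gem file on disk and return the hex SHA256 digest.
def artifact_sha256(path)
  Digest::SHA256.file(path).hexdigest
end

# Compare the local digest with an independent expected digest.
# Hex case and surrounding whitespace are ignored.
def digest_matches?(path, expected_sha256)
  artifact_sha256(path).casecmp?(expected_sha256.strip)
end
```

The expected digest would come from a lockfile or registry checksum; without one, the computed value can only be recorded.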
+ ### Lockfile mode
+
+ ```bash
+ gem-guardian verify --lockfile Gemfile.lock
  ```
 
- When you verify a lockfile that already contains Bundler `CHECKSUMS`, `gem-guardian` reports coverage and compares the locked checksum to the downloaded artifact. When a checksum is missing, it falls back to RubyGems.org metadata and marks that verification accordingly.
+ In lockfile mode, `gem-guardian` treats `Gemfile.lock` as the primary trust anchor:
+
+ ```text
+ expected SHA256 = Gemfile.lock CHECKSUMS
+ actual SHA256 = downloaded .gem artifact
+ ```
+
+ If the registry also exposes a checksum, `gem-guardian` performs a stronger three-way check:
+
+ ```text
+ lockfile SHA256 == registry SHA256 == artifact SHA256
+ ```
+
+ This mode is the preferred CI/CD path because Bundler has already recorded the expected artifact digest for the application. It is also registry-agnostic: the registry only needs to resolve and serve the artifact after the checksum has been committed to the lockfile.
+
+ A successful lockfile-only verification reports:
+
+ ```text
+ PASS cdc-orchestrator-pro 0.1.0 ruby
+ sha256 fa82bd6f...
+ source lockfile
+ ```
+
+ If a registry or publisher checksum is also available, the output includes the cross-check source:
+
+ ```text
+ PASS cdc-sidekiq 0.1.1 ruby
+ sha256 d91d298d...
+ source lockfile
+ registry d91d298d...
+ provider rubygems-api
+ verify https://rubygems.org/api/v1/versions/cdc-sidekiq.json
+ ```
+
+ ### Explicit gem mode
+
+ ```bash
+ gem-guardian verify GEM:VERSION[:PLATFORM]
+ ```
+
+ In explicit mode, there is no lockfile trust anchor. `gem-guardian` resolves the gem from the configured RubyGems source list, downloads the `.gem` artifact, computes its SHA256 digest, and then behaves as follows:
+
+ ```text
+ If registry or publisher checksum exists:
+ expected SHA256 = registry/publisher checksum
+ actual SHA256 = downloaded artifact checksum
+ result = PASS or FAIL
+
+ If no independent checksum is available:
+ expected SHA256 = none
+ actual SHA256 = downloaded artifact checksum
+ result = RECORDED
+ ```
+
+ `RECORDED` means the artifact was found and hashed, but there was no independent checksum source to compare against. It is useful inventory data, not proof of integrity.
+
+ Example with registry checksum support:
+
+ ```text
+ PASS cdc-sidekiq 0.1.1 ruby
+ sha256 d91d298d...
+ source registry
+ ```
+
+ Example without registry checksum support:
+
+ ```text
+ RECORDED cdc-orchestrator-pro 0.1.0 ruby
+ sha256 fa82bd6f...
+ source artifact
+ ```
+
+ ### Checksum providers
+
+ Registry and publisher checksums are obtained through checksum providers. Built-in providers include:
+
+ - `rubygems-api` — RubyGems.org-style versions API.
+ - `compact-index` — RubyGems/Bundler compact index metadata when available.
+ - `url` — publisher-controlled checksum files, useful for private or commercial gem distribution.
 
- Use `--provenance` to inspect Trusted Publishing metadata when RubyGems exposes it. Unsupported gems are reported, but they do not fail the run unless the provenance data is present and mismatched.
+ Provider metadata is included in JSON output as:
+
+ ```json
+ {
+ "registry_sha256": "d91d298d...",
+ "registry_checksum_provider": "rubygems-api",
+ "registry_checksum_uri": "https://rubygems.org/api/v1/versions/cdc-sidekiq.json"
+ }
+ ```
+
+ The URL provider is intentionally generic so a publisher can expose a checksum file without implementing RubyGems.org's metadata API.
+ A commercial or private registry can expose something like:
+
+ ```text
+ https://example.com/checksums/{filename}.sha256
+ ```
+
+ with contents such as:
+
+ ```text
+ <sha256> <filename>
+ ```
+
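A checksum file in the `<sha256> <filename>` shape shown above could be read roughly as follows. This is a sketch under assumptions, not the gem's parser: `checksum_for` is a hypothetical name, and accepting a bare digest line or a `sha256sum`-style `*` prefix on the filename are guesses about what a publisher might serve.

```ruby
# Hypothetical parser for publisher checksum files ("<sha256> <filename>" per line).
SHA256_HEX = /\A\h{64}\z/

def checksum_for(body, filename)
  body.each_line do |line|
    digest, name = line.strip.split(/\s+/, 2)
    # Skip blank lines and anything that is not a 64-character hex digest.
    next unless digest&.match?(SHA256_HEX)
    # Assumption: a bare digest line applies to whichever file was requested.
    return digest.downcase if name.nil?
    # Assumption: sha256sum-style files may prefix the name with "*".
    return digest.downcase if name.delete_prefix("*") == filename
  end
  nil
end
```

Returning `nil` when nothing matches lets a caller fall back to the `RECORDED` outcome rather than treating a missing entry as a mismatch.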
+ ### Project configuration
+
+ Project-level checksum providers can be declared in `.gem-guardian.yml`:
+
+ ```yaml
+ checksum_providers:
+ - name: awesome-gems-registry
+ source: https://gems.everything-is-awesome.com/
+ template: https://gems.everything-is-awesome.com/checksums/{filename}.sha256
+ ```
+
+ The `source` field scopes the provider to gems resolved from that registry, so a private checksum URL is not queried for unrelated public gems. The `template` field supports these placeholders:
+
+ - `{name}`
+ - `{version}`
+ - `{platform}`
+ - `{filename}`
+
+ For example, a locked `mammoth-pro` artifact named `mammoth-pro-1.0.0.gem` would resolve to:
+
+ ```text
+ https://gems.everything-is-awesome.com/checksums/mammoth-pro-1.0.0.gem.sha256
+ ```
+
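The placeholder substitution described above can be sketched in a few lines. `expand_template` is a hypothetical helper, and deriving `{filename}` as `<name>-<version>[-<platform>].gem` is an assumption based on the usual gem file naming, not on gem-guardian's code.

```ruby
# Hypothetical expansion of a checksum URL template.
PLACEHOLDER = /\{(name|version|platform|filename)\}/

def expand_template(template, name:, version:, platform: "ruby")
  # Assumption: only non-"ruby" platforms appear in the gem file name.
  suffix = platform == "ruby" ? "" : "-#{platform}"
  values = {
    "name" => name,
    "version" => version,
    "platform" => platform,
    "filename" => "#{name}-#{version}#{suffix}.gem"
  }
  # Unknown placeholders are left untouched because the pattern never matches them.
  template.gsub(PLACEHOLDER) { values.fetch(Regexp.last_match(1)) }
end
```

Called with the template from the configuration example, `name: "mammoth-pro"`, and `version: "1.0.0"`, the sketch yields the `mammoth-pro-1.0.0.gem.sha256` URL shown above.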
+ This lets private publishers integrate with gem-guardian without implementing the RubyGems.org versions API. When the checksum file is available, explicit mode can verify:
+
+ ```text
+ publisher checksum == artifact checksum
+ ```
+
+ and lockfile mode can perform the strongest path:
+
+ ```text
+ lockfile checksum == publisher checksum == artifact checksum
+ ```
+
+ Set `GEM_GUARDIAN_CONFIG=/path/to/config.yml` to load configuration from a non-default location.
+
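Configuration lookup and source scoping might look something like the sketch below. `config_path` and `providers_for` are hypothetical names; the trailing-slash and case normalization of source URLs is an assumption, and the real `Configuration` class may behave differently.

```ruby
require "yaml"

# Hypothetical: pick the config file, honoring GEM_GUARDIAN_CONFIG when set.
def config_path(env = ENV, default: ".gem-guardian.yml")
  env.fetch("GEM_GUARDIAN_CONFIG", default)
end

# Hypothetical: return only the providers scoped to the given registry source.
def providers_for(config_yaml, source_uri)
  config = YAML.safe_load(config_yaml) || {}
  # Assumption: sources are compared without trailing slash or case differences.
  normalize = ->(uri) { uri.to_s.chomp("/").downcase }
  Array(config["checksum_providers"]).select do |provider|
    normalize.call(provider["source"]) == normalize.call(source_uri)
  end
end
```

Filtering by source before any request is made is what keeps a private checksum URL from being queried for unrelated public gems.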
+ ## Provenance mode
+
+ ```bash
+ gem-guardian verify --provenance GEM:VERSION
+ gem-guardian verify --lockfile Gemfile.lock --provenance
+ ```
+
+ Checksum verification and provenance verification are related but separate. Checksums answer:
+
+ ```text
+ Did the artifact bytes match an expected digest?
+ ```
+
+ Provenance answers:
+
+ ```text
+ Who built and published this artifact?
+ Which repository, workflow, and commit produced it?
+ ```
+
+ When RubyGems exposes Trusted Publishing provenance, `gem-guardian` reports information such as:
+
+ - repository
+ - workflow
+ - commit/ref
+ - issuer
+ - subject
+
+ Unsupported provenance is reported as unsupported and does not fail the run. Provenance mismatches and verifier errors fail the run.
+
+ ## Private registries
+
+ `gem-guardian` uses RubyGems' configured source list for source discovery. That means explicit and lockfile verification can work with RubyGems-compatible private registries such as GitHub Packages, Gemfury, CodeArtifact, Artifactory, or self-hosted gem servers when they are present in `gem sources`.
+
+ ```bash
+ gem sources --list
+ gem-guardian verify cdc-orchestrator-pro:0.1.0
+ ```
+
+ Private registries vary in how much metadata they expose. Some expose enough metadata for registry-checksum verification. Others expose enough index data to resolve and download a gem, but do not expose an independent SHA256 checksum. Publishers can also provide checksums outside the registry through a configured checksum provider.
+
+ In explicit mode, that distinction is reflected in the result:
+
+ ```text
+ PASS source registry # registry/publisher checksum matched artifact checksum
+ RECORDED source artifact # artifact was hashed, but no independent checksum was available
+ ```
+
+ For stronger verification of private gems, prefer lockfile mode after running:
+
+ ```bash
+ bundle lock --add-checksums
+ ```
+
+ Once the checksum is recorded in `Gemfile.lock`, `gem-guardian` can verify the downloaded artifact against the lockfile checksum even if the registry does not expose checksum metadata later:
+
+ ```bash
+ gem-guardian verify --lockfile Gemfile.lock cdc-orchestrator-pro:0.1.0
+ ```
+
+ ```text
+ PASS cdc-orchestrator-pro 0.1.0 ruby
+ sha256 fa82bd6f...
+ source lockfile
+ CHECKSUMS coverage: 1/1
+ ```
+
+ If the private registry or publisher also exposes a checksum, lockfile mode performs the stronger three-way comparison:
+
+ ```text
+ lockfile SHA256 == registry SHA256 == artifact SHA256
+ ```
+
+ ## Registry audit research script
+
+ The repository includes an experimental registry audit script for ecosystem research:
+
+ ```bash
+ ./script/registry_provenance_audit.rb
+ ```
+
+ By default it inspects the RubyGems-compatible registries visible through `Gem.sources`.
+
+ Limit the scan:
+
+ ```bash
+ MAX_GEMS=100 ./script/registry_provenance_audit.rb
+ ```
+
+ Restrict the scan to one source:
+
+ ```bash
+ REGISTRY_SOURCE=https://rubygems.org/ MAX_GEMS=100 ./script/registry_provenance_audit.rb
+ ```
+
+ This script is intentionally separate from the main CLI. It is useful for answering questions such as:
+
+ ```text
+ Which gems visible from my configured registries expose Trusted Publishing provenance?
+ Which gems have checksum verification but no provenance metadata?
+ ```
+
+ ## CI/CD integration
+
+ Example GitHub Actions steps:
+
+ ```yaml
+ - name: Add Bundler checksums
+ run: bundle lock --add-checksums
+
+ - name: Verify gem integrity and provenance
+ run: gem-guardian verify --json --provenance
+ ```
+
+ This can be used as a security job to:
+
+ - verify Bundler checksum coverage
+ - compare lockfile checksums with downloaded artifacts
+ - cross-check registry or publisher checksums when available
+ - detect artifact checksum mismatches
+ - audit Trusted Publishing adoption
+ - archive JSON results for later review
 
  ## Exit codes
 
- - `0` — all verified artifacts matched
- - `1` — mismatch, missing checksum, fetch error, or lockfile error
+ - `0` — all required verification checks passed
+ - `1` — mismatch, missing checksum, fetch error, provenance mismatch, or lockfile error
  - `2` — CLI usage error
 
- ## MVP constraints
+ ## Design constraints
 
- - Audits `Gemfile.lock` for Bundler `CHECKSUMS` coverage.
- - Uses RubyGems.org as a fallback checksum source when the lockfile is incomplete or an explicit gem is supplied.
- - Downloads artifacts from RubyGems.org `/downloads/<gem-file>.gem` only when verification is needed.
- - Caches downloaded artifacts under the system temp directory.
- - Does not integrate into Bundler install hooks.
- - GitHub Release checksum/signature discovery and signed tag/release attestation checks are supported when the release metadata is available.
+ - Complements Bundler instead of replacing it.
+ - Does not hook into `bundle install`.
+ - Uses `Gemfile.lock` checksums as the preferred trust anchor.
+ - Cross-checks registry or publisher checksums when available.
+ - Records artifact digests only as informational data when no independent checksum exists.
+ - Uses configured RubyGems sources for source discovery.
+ - Keeps JSON output structured for CI consumers. (RubyGems fetcher noise in JSON mode is tracked separately.)
+ - Treats unsupported provenance as visibility data, not as a failure.
 
  ## Roadmap
 
- - Expand release provenance checks to additional publishing workflows beyond GitHub release provenance.
-
+ - Expand release provenance checks to additional publishing workflows beyond GitHub Trusted Publishing.
+ - Add richer policy controls for CI enforcement.
+ - Track provenance adoption over time using registry audit snapshots.
 
  ## License
 
  [MIT](./LICENSE.txt)
 
-
  ## Code of Conduct
 
  Everyone interacting in the Gem::Guardian project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/kanutocd/gem-guardian/blob/main/CODE_OF_CONDUCT.md).