gemkeeper 0.7.2 → 0.8.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +18 -1
- data/README.md +11 -11
- data/lib/gemkeeper/bundler_mirror_configurator.rb +1 -1
- data/lib/gemkeeper/cli/commands/list.rb +2 -2
- data/lib/gemkeeper/cli/commands/server/start.rb +4 -4
- data/lib/gemkeeper/cli/commands/server/status.rb +3 -3
- data/lib/gemkeeper/cli/commands/server/stop.rb +3 -3
- data/lib/gemkeeper/cli/commands/sync.rb +1 -1
- data/lib/gemkeeper/compact_index_server/cache_meta.rb +34 -0
- data/lib/gemkeeper/compact_index_server/cache_store.rb +64 -0
- data/lib/gemkeeper/compact_index_server/gem_cache.rb +88 -0
- data/lib/gemkeeper/compact_index_server/gem_index.rb +78 -0
- data/lib/gemkeeper/compact_index_server/index_merger.rb +81 -0
- data/lib/gemkeeper/compact_index_server/response.rb +12 -0
- data/lib/gemkeeper/compact_index_server/response_builder.rb +63 -0
- data/lib/gemkeeper/compact_index_server/rubygems_client.rb +59 -0
- data/lib/gemkeeper/compact_index_server/spec_mapper.rb +38 -0
- data/lib/gemkeeper/compact_index_server/upload_handler.rb +36 -0
- data/lib/gemkeeper/compact_index_server/upstream_cache.rb +26 -0
- data/lib/gemkeeper/compact_index_server.rb +131 -0
- data/lib/gemkeeper/configuration.rb +1 -1
- data/lib/gemkeeper/gem_syncer.rb +53 -84
- data/lib/gemkeeper/gem_uploader.rb +26 -18
- data/lib/gemkeeper/rackup_process.rb +12 -7
- data/lib/gemkeeper/repo_fetcher.rb +80 -0
- data/lib/gemkeeper/server_manager.rb +1 -1
- data/lib/gemkeeper/version.rb +1 -1
- data/lib/gemkeeper.rb +2 -0
- data/specs/20260529-091429-replace-geminabox-compact-proxy/critique-consolidated-v-1.md +168 -0
- data/specs/20260529-091429-replace-geminabox-compact-proxy/critique-v-1-claude.md +124 -0
- data/specs/20260529-091429-replace-geminabox-compact-proxy/critique-v-1-codex.md +125 -0
- data/specs/20260529-091429-replace-geminabox-compact-proxy/critique-v-1-copilot.md +261 -0
- data/specs/20260529-091429-replace-geminabox-compact-proxy/spec.md +360 -0
- data/specs/20260529-131354-sync-serve-cache-contract/critique-consolidated-v-1.md +95 -0
- data/specs/20260529-131354-sync-serve-cache-contract/critique-v-1-claude.md +47 -0
- data/specs/20260529-131354-sync-serve-cache-contract/critique-v-1-codex.md +112 -0
- data/specs/20260529-131354-sync-serve-cache-contract/critique-v-1-copilot.md +169 -0
- data/specs/20260529-131354-sync-serve-cache-contract/implementation-summary.md +59 -0
- data/specs/20260529-131354-sync-serve-cache-contract/spec.md +169 -0
- metadata +38 -28
@@ -0,0 +1,261 @@
# Critique: Replace Geminabox with Compact Index Proxy

Reviewed by: GitHub Copilot (claude-sonnet-4.6)
Date: 2026-05-29

## Summary

The spec is well-scoped and the integration points are clearly identified.
The constraints table and out-of-scope list are unusually precise, which is useful.
However, several correctness traps exist that would produce a server that passes basic smoke tests but fails under Bundler's actual caching behaviour.
The most serious issues are the `/versions` byte-stability problem (correctness), the missing thread-safety requirement (reliability), and the unspecified `/names` scope (ambiguity with large performance consequences).
The testing section is thin for the volume of new logic being introduced.

---

## 1. Critical: Correctness Blockers

### 1.1 `/versions` byte-stability is unaddressed, and merging breaks Bundler's range fetching

FR-1.3 requires the server to support `Range: bytes=N-` to serve partial `/versions` responses.
This is how Bundler efficiently updates its local copy: it records the file size it last fetched, then asks for only the bytes after that offset.

The spec's merge strategy (FR-3.1: "private gem entries take precedence when a name appears in both") sorts or interleaves private gems into the rubygems.org `/versions` body.
Any time a new private gem is added, the byte offsets of every subsequent line in the merged file shift.
Bundler's cached offset is now wrong: the range request returns garbled data, and Bundler either fails or silently resolves wrong versions.

The spec does not define a stable layout for the merged `/versions` output.
Options include appending private gems after the public block, or rebuilding the rubygems.org block verbatim and appending private entries, but the spec is silent.
Without a stable layout rule, this is a correctness defect, not just a performance issue.
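
The offset problem can be illustrated with a toy example. This is a minimal sketch, not Bundler's actual update logic: the bodies, the gem name `racc-private`, and the helper `apply_range` are all hypothetical, and the real client performs additional checks. It only shows why an interleaved layout breaks "append the bytes after my cached size" while an append-only layout does not.

```ruby
# Toy /versions body; real files carry more header data and real checksums.
public_block = "created_at: 2026-05-29\n---\nrack 3.0.0 aaa\nrake 13.0.0 bbb\n"

# The client has cached the public block and remembers its size in bytes.
cached = public_block.dup
offset = cached.bytesize

# Layout A: a private gem is sorted into the middle, shifting every later byte.
interleaved = public_block.sub("rake", "racc-private 1.0.0 ccc\nrake")

# Layout B: the public block stays verbatim and private entries are appended.
appended = public_block + "racc-private 1.0.0 ccc\n"

# Hypothetical model of a range update: keep the cached bytes and append
# whatever the server returns for `Range: bytes=<offset>-`.
def apply_range(cached, full_body, offset)
  cached + full_body.byteslice(offset..).to_s
end

interleaved_result = apply_range(cached, interleaved, offset)
appended_result    = apply_range(cached, appended, offset)

puts "append-only layout reconstructs the body: #{appended_result == appended}"
puts "interleaved layout reconstructs the body: #{interleaved_result == interleaved}"
```

Under these assumptions only the append-only layout lets the client rebuild the current body from its cached prefix.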

### 1.2 `info_checksum` has a circular dependency

FR-2.2 requires `CompactIndex::GemVersion` to carry an `info_checksum` field.
That checksum is a digest of the `/info/:gemname` response body (the compact index protocol uses MD5 for this field, not SHA256).
To populate it in the `/versions` entry, the server must generate the `/info` body first, hash it, and embed the hash in `/versions`.

The spec never describes this ordering, nor does it mention that the `info_checksum` must be recomputed whenever a new version of a gem is uploaded (because the `/info` body changes).
An implementer who builds the versions index first and the info body second will produce invalid checksums that cause Bundler to re-fetch unconditionally.

### 1.3 `/names` scope is undefined and carries large performance risk

FR-1.1 says `/names` returns "all gem names (local and proxied)."
The rubygems.org `/names` file currently contains ~175,000 gem names (roughly 2 MB uncompressed).
"Proxied" in this context almost certainly means the full public gem namespace.

The spec never says whether the server fetches, caches, and merges the rubygems.org `/names` file (like it does for `/versions`), or whether `/names` is scoped only to gems that have been locally requested or cached.
These produce completely different behaviour:
- Full public namespace: `bundle install` resolves public gems by name, which is correct, but the endpoint becomes expensive.
- Local-only: `bundle install` fails on any public gem not already in the info cache.

The performance checklist item (unchecked) notes only the `/versions` merge cost; it does not mention `/names`.
This is a missing requirement.

---

## 2. Ambiguous Requirements

### 2.1 "Valid gem" definition in FR-1.2

FR-1.2 returns 422 "if the file is not a valid gem" but does not define valid.
Three plausible interpretations:

1. File extension is `.gem`
2. File is a parseable tar archive containing `metadata.gz` and `data.tar.gz`
3. The gemspec within `metadata.gz` can be loaded without error

These have substantially different implementation and security implications.
Option 1 is trivially bypassable.
Option 3 can raise arbitrary Ruby exceptions if the gemspec calls `require`.
The spec should specify what validation is expected, likely option 2 at minimum.
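
Option 2 could look roughly like the following. This is a sketch under assumptions: the helper name `gem_archive?` is hypothetical, it relies on RubyGems' bundled `Gem::Package::TarReader`, and it only checks that the two entries are present (it does not decompress or interpret them).

```ruby
require "rubygems/package"
require "stringio"

REQUIRED_ENTRIES = %w[metadata.gz data.tar.gz].freeze

# Returns true when +io+ is a readable tar archive that contains both
# entries expected in a .gem file; any parsing error counts as invalid.
def gem_archive?(io)
  names = []
  Gem::Package::TarReader.new(io) do |tar|
    tar.each { |entry| names << entry.full_name }
  end
  (REQUIRED_ENTRIES - names).empty?
rescue StandardError
  false
end
```

Rescuing `StandardError` is deliberately broad here so that a malformed upload maps to a single "invalid" result; a real implementation might prefer to log the specific error.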

### 2.2 `/versions` cache stores raw upstream or merged output?

FR-3.2 says cache the `/versions` response.
It is ambiguous whether the cache stores:

- The raw rubygems.org response (requiring re-merge with private gems on every request), or
- The merged result (requiring cache invalidation on every gem upload)

Both are valid designs; they have different invalidation logic.
The spec does not specify which, leaving the implementer to decide and potentially choose the one that breaks ETag/Range behaviour.

### 2.3 ETag algorithm: "MD5 or SHA256"

FR-1.3 says use "MD5 or SHA256" for the ETag.
Giving two options creates inconsistency risk: different code paths might use different algorithms, making ETags non-comparable across restarts.
Pick one.
SHA256 is the better choice (used for `Repr-Digest` too; re-using the same hash avoids a second pass).

### 2.4 `handle_response` in `GemUploader` accepts 200 and 302, not just 201

FR-4.2 states that the new server must return "the same status codes (201, 409) that `GemUploader` expects."
This is inaccurate.
`GemUploader#handle_response` maps `200`, `201`, and `302` as success.
The spec's description of `GemUploader`'s contract is wrong.
While the new server returning 201 will still work (201 is handled), a future implementer auditing the spec against the code will find the discrepancy and may add unnecessary 302 handling or question the spec's accuracy.

---

## 3. Codebase Assumption Gaps

### 3.1 `GemUploader#list_gems` calls `/api/v1/gems.json`

The spec says "no changes to `GemUploader`" and it is correct that `list_gems` is never called from any production code path (the `list` CLI reads the filesystem directly via `Dir.glob`, per FR-4.3).
But `list_gems` is a public method that calls `GET /api/v1/gems.json`, a Geminabox-specific endpoint.
After this migration, calling `list_gems` will return a 404.

The spec should either note that `list_gems` becomes a dead method (and optionally raise `NotImplementedError`), or explicitly call out this known breakage so the implementer doesn't silently leave a broken public method.

### 3.2 `rubygems-generate_index` dependency is not addressed

The gemspec currently declares `rubygems-generate_index ~> 1.0`.
This gem exists to support Geminabox's legacy Marshal index generation (`specs.4.8.gz`, `Marshal.4.8.gz`).
The spec removes Geminabox and explicitly excludes legacy index formats from scope, but says only to swap `geminabox` for `compact_index` in the gemspec.
`rubygems-generate_index` is likely now unused dead weight.
Whether to remove it is a judgment call, but the spec should at least acknowledge it.

### 3.3 Integration test has more Geminabox assertions than lines 80–81

The Integration Points table says to update lines 80–81 of `test_server_lifecycle_integration.rb`.
In the current file, there are two assertions that reference Geminabox:

```ruby
assert_match(/Geminabox\.data/, content) # line 80
assert_match(/Geminabox\.rubygems_proxy\s*=\s*true/, content) # line 81
```

But the test method is named `test_server_generates_config_ru`, and the test class also has `test_server_status_while_running` which checks `status[:url]` equals `@config.geminabox_url`.
The `geminabox_url` method name on `Configuration` is referenced both here and in `RackupProcess#wait_for_server`.
The spec is silent on this naming: the constraint says "no changes to `configuration.rb`", so the stale method name remains.
The integration test assertion on `geminabox_url` will still pass (the URL format doesn't change), but it should be called out in the spec as an accepted naming inconsistency rather than left for the implementer to discover.

### 3.4 `compact_index` gem API is assumed but not verified

The spec builds on `CompactIndex::GemVersion`, `CompactIndex::Dependency`, `CompactIndex.names()`, and `CompactIndex.info()`.
The spec's own checklist flags this: "key assumption: `compact_index` 0.15.x API is stable."
The `compact_index` gem is primarily an internal RubyGems.org dependency.
Its README is sparse and its public API is not documented for external consumers.
Before implementation begins, the actual gem should be inspected to confirm the class names and method signatures match what the spec assumes.
This is flagged here not as a spec defect, but as a pre-implementation step that is conspicuously absent from the spec.

---

## 4. Error Handling and Edge Case Gaps

### 4.1 Thread safety for the in-memory gem index

FR-2.1 says the server rescans `gems_path/gems/*.gem` after each successful upload.
Puma (the configured server) uses a thread pool by default.
A concurrent GET `/info/:gemname` while an upload is updating the in-memory index will produce a data race.

The spec does not require synchronization (a `Mutex` around index reads and writes, or a copy-on-write swap).
In practice, Ruby's GVL limits the impact, but it is not zero, especially during index rebuild where multiple instance variables are updated in sequence.
The spec should specify that the gem index is updated atomically (e.g., replace the entire index object with a new one via a single assignment).
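
A single-assignment swap could be sketched as follows. The class name `GemIndexHolder` and its methods are hypothetical, and the index is modelled as a plain hash of gem name to version strings; the point is only that readers never observe a half-built index.

```ruby
# Hypothetical holder for the private gem index.
class GemIndexHolder
  def initialize
    @index = {}.freeze
  end

  # Readers take one snapshot of the reference and work only with it,
  # so a concurrent rebuild cannot change the data mid-request.
  def lookup(name)
    snapshot = @index
    snapshot[name]
  end

  # Build the replacement completely, freeze it, then publish it with
  # one instance-variable assignment.
  def rebuild!(entries)
    fresh = entries.each_with_object({}) do |(name, versions), acc|
      acc[name] = versions.dup.freeze
    end
    @index = fresh.freeze
  end
end

holder = GemIndexHolder.new
holder.rebuild!("mygem" => ["1.0.0"])
puts holder.lookup("mygem").inspect
```

Freezing the published hash is a design choice in this sketch: it turns accidental in-place mutation into an immediate error instead of a silent race.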

### 4.2 Range request with explicit end byte is unspecified

FR-1.3 says handle `Range: bytes=N-` (open-ended).
The HTTP spec also allows `Range: bytes=N-M` (explicit end) and multi-range requests (`Range: bytes=0-99, 200-299`).
Bundler currently only sends open-ended ranges, but the spec should be explicit that multi-range and bounded-range requests return 416 (`Range Not Satisfiable`) or fall back to the full response, rather than leaving this undefined.
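
One way to make the behaviour explicit is a small classifier for the `Range` header. This sketch picks the 416 option for anything other than a single open-ended range; the function name `classify_range` and the symbol results are hypothetical, and treating an offset at or past the end of the body as unsatisfiable is an assumption.

```ruby
# Classifies a Range header value against a body of +body_size+ bytes.
# Returns [:full], [:partial, offset], or [:unsatisfiable].
def classify_range(header, body_size)
  return [:full] if header.nil? || header.strip.empty?

  # Only a single open-ended byte range ("bytes=N-") is accepted.
  match = /\Abytes=(\d+)-\z/.match(header.strip)
  return [:unsatisfiable] unless match

  offset = match[1].to_i
  return [:unsatisfiable] if offset >= body_size

  [:partial, offset]
end

p classify_range("bytes=4-", 10)
p classify_range("bytes=4-8", 10)
p classify_range("bytes=0-99, 200-299", 1000)
```

A caller would map `:partial` to a 206 response with the body from `offset` onward, `:unsatisfiable` to 416, and `:full` to a normal 200.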

### 4.3 Behaviour when `gems_path/gems/` contains a corrupt `.gem` file

FR-2.1 scans all `.gem` files on startup and on upload.
If a file is corrupt (truncated download, disk error), extracting gemspec metadata will raise an exception.
The spec does not say whether the server should skip corrupt files with a warning or abort startup.
If startup is aborted, a single bad file makes the server unlaunchable.

### 4.4 Concurrent upload of the same gem

FR-1.2 returns 409 if the file already exists.
If two `gemkeeper sync` processes run simultaneously and both upload the same gem at the same time, a TOCTOU race exists between the existence check and the file write.
The spec should specify last-write-wins, or require a file lock.
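
A third possibility, not named above, is to let the filesystem arbitrate by creating the target with `File::EXCL`, which makes the existence check and the creation one atomic step. This is a sketch only: `store_once` is a hypothetical helper, and a concurrent reader could still observe the file while it is being written.

```ruby
# Creates +path+ only if it does not exist yet. Returns the HTTP status
# the upload endpoint would use: 201 when created, 409 when present.
def store_once(path, bytes)
  File.open(path, File::WRONLY | File::CREAT | File::EXCL, 0o644) do |file|
    file.binmode
    file.write(bytes)
  end
  201
rescue Errno::EEXIST
  409
end
```

With this shape the second of two simultaneous uploads receives 409 without any separate `File.exist?` call.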

### 4.5 System gem cache traversal under `Gem.path`

FR-3.3 checks `Gem.path.map { |p| File.join(p, "cache", filename) }`.
`Gem.path` includes user-defined paths from the `GEM_PATH` environment variable.
A developer with a misconfigured `GEM_PATH` pointing to a path they don't own could cause unexpected file-serving behaviour.
This is a minor concern given localhost-only binding, but the spec should note that only paths where the file is readable are considered.

---

## 5. Security Concerns

### 5.1 Path traversal in `/gems/:filename.gem`

FR-3.3 constructs a filesystem path from the URL parameter `filename`.
A request to `/gems/../../../../etc/passwd` (URL-decoded by Rack before routing) would traverse outside `gems_path`.
Even on localhost, any process on the same machine can make this request.

The spec's security checklist dismisses auth and input validation as out of scope because the server "binds to 127.0.0.1 only."
That reasoning does not cover path traversal: localhost binding doesn't prevent local processes from exploiting it.
The spec should require that `filename` be validated to contain only safe characters (`[a-zA-Z0-9._-]`) or that the resolved path is asserted to be under `gems_path` before serving.
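
The two checks complement each other, because a name such as `..` passes the character whitelist on its own. A minimal sketch combining both, where `resolve_gem_path` and `SAFE_NAME` are hypothetical names:

```ruby
SAFE_NAME = /\A[a-zA-Z0-9._-]+\z/

# Returns the absolute path for +filename+ inside +gems_dir+, or nil when
# the name is unsafe or resolves outside the directory.
def resolve_gem_path(gems_dir, filename)
  return nil unless SAFE_NAME.match?(filename)

  root = File.expand_path(gems_dir)
  candidate = File.expand_path(filename, root)

  # "." and ".." match the whitelist, so containment is checked as well.
  candidate.start_with?(root + File::SEPARATOR) ? candidate : nil
end

p resolve_gem_path("/srv/gems", "rack-3.0.0.gem")
p resolve_gem_path("/srv/gems", "..")
p resolve_gem_path("/srv/gems", "../../etc/passwd")
```

A `nil` result would map to a 400 or 404 response before any filesystem read happens.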

### 5.2 SSRF via cached rubygems.org requests

The server makes outbound requests to `https://rubygems.org/info/:gemname` where `gemname` comes from the incoming request URL.
A local process could request `/info/../../../../etc/passwd`; the gemname would then be used to construct the upstream URL `https://rubygems.org/info/../../../../etc/passwd`.
While rubygems.org would return a 404, the spec should require URL-encoding or validation of `:gemname` before constructing the upstream URL.

---

## 6. Performance Concerns

### 6.1 In-memory merge of `/versions` on each request (already flagged in checklist)

The spec acknowledges this is an open question.
A concrete recommendation: cache the fully merged `/versions` body in memory and invalidate it only when a gem is uploaded (cheap) or when the upstream ETag changes (already covered by FR-3.2).
The spec should promote this from "open question" to a requirement: "merged `/versions` response is memoized in memory; invalidated on upload or upstream ETag change."

### 6.2 Full gem metadata re-scan on every upload

FR-2.1 says "scans `gems_path/gems/*.gem`" after each upload.
For a large private gem store, this O(n) re-scan on every upload is unnecessary.
An incremental approach (add the newly uploaded gem to the in-memory index directly) would be more efficient.
This is a recommendation rather than a blocker, but if left as a full re-scan, the spec should cap the acceptable gem count or note the known performance characteristic.

---

## 7. Testing Strategy Gaps

### 7.1 No unit tests specified for `CompactIndexServer`

The spec introduces a new 200+ line Rack application implementing 8 endpoints, proxy logic, cache management, and ETag/Range support.
The testing section mentions only one integration test update (lines 80–81) and four "Verify" lines that describe manual/integration scenarios.

There are no unit tests specified for:
- Correct `ETag` and `Repr-Digest` header generation
- 304 response when ETag matches
- 206 response for range requests
- The merge logic for `/versions`
- Corrupt gem handling (the gap described in section 4.3 above)
- Offline fallback path (FR-3.4)

Given the project's existing unit test pattern (one `test_*.rb` per class), `test/gemkeeper/test_compact_index_server.rb` should be called out explicitly, even if only to anchor a few key behaviours.

### 7.2 FR-4.2 verify claim is overconfident

FR-4.2 says "the existing `test/gemkeeper/test_gem_uploader.rb` passes without modification."
The existing tests only test connection failure paths (no live server involved).
They do not test a successful upload against a real or mock server.
The claim that the tests "pass without modification" is true today, but it does not verify that the upload flow actually works against the new server.
A new integration test covering the upload round-trip should be called out here.

---

## 8. Spec Completeness Checklist Assessment

| Item | Assessment |
| ---- | ---------- |
| Scope & acceptance criteria | ✅ Clear. Out of Scope list is precise and useful. |
| Testing strategy | ⚠️ Thin. No unit tests for the new Rack app; FR-4.2 verify is misleading. |
| Existing patterns | ✅ Correctly identifies `GemUploader`, `Dir.glob` list pattern, `ServerReadinessProbe`. |
| Dependencies | ⚠️ `rubygems-generate_index` not addressed; `faraday`/`faraday-multipart` not mentioned as retained. |
| Architecture & interfaces | ✅ Rack app interface, config.ru, storage layout clearly specified. |
| Error handling & failure modes | ⚠️ Corrupt gem files, TOCTOU on upload, thread safety, and range-end handling are missing. |
| Security review | ❌ Path traversal in filename parameter and SSRF in gemname-to-upstream-URL construction are unaddressed. The localhost-only justification does not cover these. |
| Performance impact | ⚠️ Acknowledged as open question but not resolved. `/names` scope is a larger risk than the spec recognises. |
| Rollout & migration | ✅ Drop-in; no data migration; Homebrew rebuild noted. |
| Assumptions & risks | ⚠️ `compact_index` API stability flagged but no pre-implementation verification step prescribed. |
@@ -0,0 +1,360 @@
# Spec 20260529-091429: Replace Geminabox with Compact Index Proxy

## Overview

Replace the Geminabox dependency with a minimal custom Rack application (`Gemkeeper::CompactIndexServer`) that serves locally-built private gems via the Bundler compact index protocol and proxies public gem requests to RubyGems.org.
The server also falls back to the system gem cache for offline use when RubyGems.org is unreachable.

## Goals

- Remove the broken Geminabox proxy (uses the retired `bundler.rubygems.org/api/v1/dependencies` endpoint)
- Implement the compact index protocol so Bundler uses efficient, cacheable resolution
- Proxy public gems from RubyGems.org transparently through the same source URL
- Enable offline use by serving from the system gem cache and a local response cache

---

## Feature 1: Compact Index Rack Application

**Who & why:** Developers using gemkeeper configure `source "http://localhost:9292"` as their single Bundler source.
Today that source proxies through Geminabox, whose upstream API was retired in May 2023, producing four retries on every `bundle install`.
They need a server that speaks the compact index protocol Bundler has used since 2016, without that noise.

### Functional Requirements

#### FR-1.1: Core compact index endpoints
The server MUST implement the following endpoints:

- `GET /names`: sorted, newline-delimited list of all gem names (local + proxied upstream), generated per FR-3.5
- `GET /versions`: merged versions index combining private gems and the proxied RubyGems.org versions file, generated per FR-3.1
- `GET /info/:gemname`: per-gem dependency metadata; served from local data for private gems, proxied from RubyGems.org for public gems per FR-3.1
- `GET /gems/:filename.gem`: serve gem binary (local-first, then system cache, then proxy per FR-3.3)

All URL path parameters (`:gemname`, `:filename`) are validated against `/\A[a-zA-Z0-9._-]+\z/` before any filesystem or upstream URL use.
Return 400 for parameters that do not match.
For `GET /gems/:filename`, additionally assert the resolved path is under `gems_path/gems/` before serving.

**Verify:** `bundle install` against a Gemfile backed by this server completes without retries or `HTTPError` output.

#### FR-1.2: Gem upload endpoint
`POST /upload` accepts a multipart form upload with field name `file` (matching the current Geminabox API consumed by `GemUploader`).

Validation: open the uploaded data as a tar archive and confirm it contains `metadata.gz` and `data.tar.gz`.
Extract gemspec metadata from `metadata.gz` using `Gem::Package` inside a rescue block.
Return 422 if the archive is malformed or metadata extraction raises.
Do not `load` or `eval` gemspec content.

On success: create `gems_path/gems/` if absent; write to a temp file in the same directory; rename atomically to `gems_path/gems/<name>-<version>.gem`.
Delete the temp file if validation fails.
Response codes: 201 on success, 409 if the target path already exists, 422 on invalid gem.

After a successful write, rebuild the in-memory gem index per AR-1.1.

**Verify:** `gemkeeper sync` completes successfully; the gem appears in `gems_path/gems/` and in subsequent compact index responses.

#### FR-1.3: Conditional and range request support
All endpoints serving locally-generated or merged content (`/names`, `/versions`, `/info/:gemname`) MUST include:

- `ETag: "<sha256-hex>"`: SHA256 hex digest of the final response body
- `Repr-Digest: sha-256=:<base64-encoded-sha256>:`: RFC 9530 byte-sequence syntax, with the base64 value wrapped in colons; computed from the same final body
- `Accept-Ranges: bytes`

Do not forward `ETag` or `Repr-Digest` headers from RubyGems.org unchanged for merged responses; recompute from the merged body.

The server MUST handle:
- `If-None-Match`: return 304 if the ETag matches
- `Range: bytes=N-` (open-ended only): return 206 with the partial body from byte N onward
- `Range: bytes=N-M` or multi-range: return 416

**Verify:** A second `bundle install` produces `304 Not Modified` responses for unchanged index files.
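
The header requirements in this section can be sketched with Ruby's standard `digest` library. This is illustrative only: `validator_headers` and `not_modified?` are hypothetical helper names, the `Repr-Digest` value is written in the colon-delimited byte-sequence form that RFC 9530 defines, and the `If-None-Match` handling (comma-separated candidates, `*`, weak prefixes) is an assumption about how strict the comparison should be.

```ruby
require "digest"

# Builds the validator headers for a final response body.
def validator_headers(body)
  {
    "ETag" => %("#{Digest::SHA256.hexdigest(body)}"),
    "Repr-Digest" => "sha-256=:#{Digest::SHA256.base64digest(body)}:",
    "Accept-Ranges" => "bytes"
  }
end

# True when the request's If-None-Match value matches the current ETag,
# in which case the server would answer 304 with no body.
def not_modified?(if_none_match, etag)
  return false if if_none_match.nil?

  if_none_match.split(",").map(&:strip).any? do |candidate|
    candidate == "*" || candidate.delete_prefix("W/") == etag
  end
end

headers = validator_headers("---\n1.0.0 |checksum:abc\n")
puts headers["ETag"]
puts not_modified?(headers["ETag"], headers["ETag"])
```

Both digests come from the same body, so a single read of the merged file is enough to produce the two headers.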

#### FR-1.4: Health endpoint
`GET /` returns `200 OK`.
Used by `ServerReadinessProbe` (`lib/gemkeeper/server_readiness_probe.rb`).

**Verify:** `gemkeeper server start` completes without timing out.

### Architectural Requirements

#### AR-1.1: Atomic in-memory gem index
The server maintains an in-memory gem index (private gem metadata read from `gems_path/gems/`).
After each successful upload, the index is rebuilt into a new object and swapped via a single instance variable assignment.
Index reads do not acquire a lock; the swap is atomic at the Ruby object reference level (copy-on-write).
On startup, create `gems_path/gems/` if absent before scanning.

---

## Feature 2: Private Gem Serving

**Who & why:** The gems built by `gemkeeper sync` must appear in Bundler's dependency graph with correct version and dependency metadata.
Without accurate compact index data for private gems, Bundler will either fail to find them or resolve wrong versions.

### Functional Requirements

#### FR-2.1: Gem file discovery and metadata extraction
On startup and after each successful upload, the server scans `gems_path/gems/*.gem`.
For each file, gemspec metadata is extracted from the embedded `metadata.gz` using `Gem::Package` inside a rescue block.
Files that raise on extraction are skipped with a warning log entry; they do not abort startup.
Extracted metadata: gem name, version, platform, runtime dependencies (name + version constraint), SHA256 checksum of the `.gem` file.

**Verify:** A gem uploaded after server start appears in `/names`, `/versions`, and `/info/:gemname` without restarting the server.

#### FR-2.2: Compact index data generation
Uses the `compact_index` gem to produce correct response bodies.

**`info_checksum` ordering:** info bodies for all private gems must be generated before the `/versions` index is built.
For each private gem, compute `Digest::MD5.hexdigest(CompactIndex.info(gem_versions_array))` and store it as `info_checksum` in the corresponding `CompactIndex::GemVersion`.
The versions index is then built referencing those pre-computed checksums.
Checksums are recomputed after each upload.

**Verified `compact_index` 0.15.0 API:**
- `CompactIndex::GemVersion`: `Struct.new(:number, :platform, :checksum, :info_checksum, :dependencies, :ruby_version, :rubygems_version)`. Field is `number`, not `version`. `checksum` is the SHA256 of the `.gem` file.
- `CompactIndex::Gem`: `Struct.new(:name, :versions)`.
- `CompactIndex::Dependency`: `Struct.new(:gem, :version, :platform, :checksum)`. The dependency gem name is field `:gem`; the constraint string is field `:version`.
- `info_checksum` uses MD5 (not SHA256) per the compact index protocol. Bundler verifies this checksum when it downloads `/info/:gemname`.
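
The ordering rule can be sketched without the `compact_index` gem by using stand-ins. Everything below is hypothetical: `SketchVersion` mirrors only some of the struct fields listed above, `render_info` is a simplified placeholder rather than the gem's real output format, and `versions_lines` only approximates the shape of a `/versions` entry. The sketch shows the sequence: render each info body, hash it with MD5, and only then build the versions lines.

```ruby
require "digest"

# Stand-in for CompactIndex::GemVersion with a subset of its fields.
SketchVersion = Struct.new(:number, :platform, :checksum, :info_checksum, :dependencies)

# Placeholder renderer, NOT the compact_index gem's real info format.
def render_info(versions)
  lines = versions.map { |v| "#{v.number} |checksum:#{v.checksum}" }
  "---\n#{lines.join("\n")}\n"
end

# Step 1: compute the MD5 of each gem's info body and stamp it on copies
# of that gem's versions. The info body does not contain info_checksum,
# so there is no real circularity once this step runs first.
def with_info_checksums(gems)
  gems.transform_values do |versions|
    digest = Digest::MD5.hexdigest(render_info(versions))
    versions.map { |v| v.dup.tap { |copy| copy.info_checksum = digest } }
  end
end

# Step 2: build versions lines from the already-stamped structs.
def versions_lines(gems)
  gems.map do |name, versions|
    "#{name} #{versions.map(&:number).join(',')} #{versions.last.info_checksum}"
  end
end

gems = { "mygem" => [SketchVersion.new("1.0.0", "ruby", "abc", nil, [])] }
stamped = with_info_checksums(gems)
puts versions_lines(stamped)
```

Re-running both steps after every upload keeps the checksum in `/versions` in step with the info body that Bundler will download.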

**Verify:** `bundle exec gem dependency <private-gem>` resolves correctly when the Gemfile sources from `http://localhost:9292`.

### Architectural Requirements

#### AR-2.1: `compact_index` and `rubygems-generate_index` dependency swap
`gemkeeper.gemspec` drops `geminabox ~> 3.0` and `rubygems-generate_index ~> 1.0`, and adds `compact_index ~> 0.15`.
No other runtime dependencies are added for this feature.

---
|
|
125
|
+
|
|
126
|
+
## Feature 3: Public Gem Proxy with Offline Cache
|
|
127
|
+
|
|
128
|
+
**Who & why:** The Gemfile sources all gems — public and private — from `http://localhost:9292`.
|
|
129
|
+
Public gem resolution must work when online (proxying to RubyGems.org) and degrade gracefully when offline rather than returning 500 errors.
|
|
130
|
+
When offline, gems already installed on the developer's system should be servable directly, avoiding re-download on reconnect.
|
|
131
|
+
|
|
132
|
+
### Functional Requirements
|
|
133
|
+
|
|
134
|
+
#### FR-3.1: Merge algorithm for `/versions` and `/info/:gemname`
|
|
135
|
+
|
|
136
|
+
**`/versions`:**
|
|
137
|
+
Fetch `https://rubygems.org/versions` and cache the raw response body to `cache_dir/rubygems_cache/versions` (refreshed when the upstream ETag changes or after 30 minutes).
|
|
138
|
+
Construct a `CompactIndex::VersionsFile` from that cached file.
|
|
139
|
+
Pass private gem objects as `extra_gems` to `CompactIndex.versions(versions_file, extra_gems)` so they are appended after the upstream public block.
|
|
140
|
+
When a private gem name collides with a public gem name, the private gem entry takes precedence: suppress the public entry for that name from the merged output so Bundler cannot resolve the public version.
|
|
141
|
+
The public block is never reordered; private entries are appended. This keeps byte offsets of the public block stable across private gem additions, preserving Bundler's incremental range fetching.
|
|
142
|
+
Write the merged response to `cache_dir/rubygems_cache/versions.merged` and keep only the merged body's SHA256 hex digest in memory as the current ETag.
|
|
143
|
+
Regenerate `versions.merged` on each upload or when the upstream ETag changes.
|
|
144
|
+
Serve `/versions` by streaming `versions.merged` from disk; the OS file cache handles hot reads without holding the full body in memory.
|
|
145
|
+
|
|
146
|
+
**`/info/:gemname` for private gems:** generate from local metadata (FR-2.2).
|
|
147
|
+
**`/info/:gemname` for public gems:** proxy `https://rubygems.org/info/:gemname` and cache per FR-3.2.
|
|
148
|
+
|
|
149
|
+
**Verify:** `bundle install` resolves a public gem (e.g., `rake`) and a private gem through the same local source with no errors.
|
|
150
|
+
|
|
151
|
+
#### FR-3.2: Cache proxy responses for offline use
|
|
152
|
+
Cache proxied compact index responses under `cache_dir/rubygems_cache/`:
|
|
153
|
+
|
|
154
|
+
- `/versions` raw upstream body: refreshed when upstream ETag changes or after 30 minutes. Use a conditional GET (`If-None-Match`) to upstream; a 304 resets the local TTL without rewriting the file.
|
|
155
|
+
- `/info/:gemname` per-gem: cached per gem name; refreshed after 60 minutes using the same conditional GET pattern.
|
|
156
|
+
- `.gem` binaries: cached permanently, keyed by their `<name>-<version>.gem` filename (gem files are immutable once published).
|
|
157
|
+
|
|
158
|
+
Cache files are written atomically (temp file + rename).
|
|
159
|
+
|
|
160
|
+
When RubyGems.org is unreachable (connection error or timeout) and a cached copy exists, serve from cache.
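
A minimal sketch of the refresh decision described in this requirement, assuming the TTL is passed in seconds and the stored ETag is read from the sidecar metadata. The helper names `stale?`, `conditional_get`, and `refresh_action` are hypothetical; only `Net::HTTP::Get` and the `If-None-Match` header come from the standard library and HTTP.

```ruby
require "net/http"
require "uri"

# True when the cached copy is older than its TTL or was never fetched.
def stale?(fetched_at, ttl_seconds, now: Time.now)
  fetched_at.nil? || (now - fetched_at) >= ttl_seconds
end

# Build a conditional GET; `etag` is whatever upstream sent last time.
def conditional_get(url, etag)
  request = Net::HTTP::Get.new(URI(url))
  request["If-None-Match"] = etag if etag
  request
end

# Map the upstream status code to a cache action.
def refresh_action(status)
  case status
  when 304 then :touch_ttl      # keep the body, reset fetched_at
  when 200 then :rewrite_cache  # new body and new ETag
  else          :keep_cache     # assumption: leave the cache untouched
  end
end
```

With these helpers, `/versions` would be checked with `stale?(fetched_at, 30 * 60)` and `/info/:gemname` with `stale?(fetched_at, 60 * 60)`.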
|
|
161
|
+
|
|
162
|
+
**Verify:** After a successful `bundle install` online, disconnecting from the network and running `bundle install` again completes using only cached data.
|
|
163
|
+
|
|
164
|
+
#### FR-3.3: System gem cache fallback for `.gem` files
|
|
165
|
+
Before proxying `GET /gems/:filename.gem` to RubyGems.org, check each path in `Gem.path.map { |p| File.join(p, "cache", filename) }` using `File.exist?` before attempting to read.
|
|
166
|
+
If a matching readable file is found, serve it directly without a network request.
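
The lookup could be sketched as follows. `system_cached_gem` is a hypothetical name, and the `gem_paths:` keyword is an assumption added so the search can be exercised without touching the real `Gem.path`. The sketch uses `File.file?` plus `File.readable?` rather than the `File.exist?` named above, and rejects filenames containing path separators; both are choices made for this illustration.

```ruby
require "rubygems"

# Returns the first readable copy of `filename` in any RubyGems cache
# directory, or nil when none is found.
def system_cached_gem(filename, gem_paths: Gem.path)
  # Refuse anything that is not a bare "*.gem" filename.
  return nil unless filename == File.basename(filename) && filename.end_with?(".gem")

  gem_paths
    .map { |path| File.join(path, "cache", filename) }
    .find { |candidate| File.file?(candidate) && File.readable?(candidate) }
end
```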
|
|
167
|
+
|
|
168
|
+
**Verify:** A `.gem` file present in the system gem cache is served without an outbound RubyGems.org request.
|
|
169
|
+
|
|
170
|
+
#### FR-3.4: Response semantics for missing or unreachable upstream
|
|
171
|
+
Distinguish three cases for upstream gem requests:
|
|
172
|
+
|
|
173
|
+
- **Upstream reachable, gem not found** (upstream returns 4xx): return 404 with no body.
|
|
174
|
+
- **Upstream unreachable** (connection error, timeout) **+ cache exists**: serve from cache.
|
|
175
|
+
- **Upstream unreachable + no cache**: return 503 with body `"Upstream unavailable and no local cache. Connect to the internet and run bundle install to warm the cache."`.
|
|
176
|
+
|
|
177
|
+
Do not return 500 in any of these cases.
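
The three cases can be sketched as a small wrapper around one upstream call. `proxy_with_fallback` and `cache_or_503` are hypothetical names; the block is assumed to perform the request and return `[status, body]`. Treating any status other than 200 or 4xx like an unreachable upstream is an assumption of this sketch, since the requirement does not cover it.

```ruby
require "net/http"

# Serve the cached body if there is one, otherwise the 503 from FR-3.4.
def cache_or_503(cached_body)
  headers = { "content-type" => "text/plain" }
  return [200, headers, [cached_body]] if cached_body

  message = "Upstream unavailable and no local cache. " \
            "Connect to the internet and run bundle install to warm the cache."
  [503, headers, [message]]
end

# The block performs the upstream request and returns [status, body].
def proxy_with_fallback(cached_body)
  status, body = yield
  case status
  when 200      then [200, { "content-type" => "text/plain" }, [body]]
  when 400..499 then [404, {}, []]
  else cache_or_503(cached_body) # assumption: not specified above
  end
rescue SocketError, Errno::ECONNREFUSED, Errno::EHOSTUNREACH,
       Net::OpenTimeout, Net::ReadTimeout
  cache_or_503(cached_body)
end
```

Every branch returns a Rack-style triple, so no failure path falls through to an unhandled exception and a 500.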
|
|
178
|
+
|
|
179
|
+
**Verify:** With RubyGems.org blocked and no cache, a request to `/info/nonexistent-gem` returns 503, not 500.
|
|
180
|
+
|
|
181
|
+
#### FR-3.5: `/names` endpoint
|
|
182
|
+
Fetch `https://rubygems.org/names` and cache the raw response body under `cache_dir/rubygems_cache/names` with the same 60-minute TTL and conditional GET refresh as `/info/:gemname`.
|
|
183
|
+
Merge local private gem names with the cached upstream names; sort the combined list alphabetically.
|
|
184
|
+
Write the merged result to `cache_dir/rubygems_cache/names.merged`; keep only its SHA256 hex digest in memory as the current ETag.
|
|
185
|
+
Regenerate `names.merged` on each upload or when the upstream names ETag changes.
|
|
186
|
+
Serve `/names` by streaming `names.merged` from disk.
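
A minimal sketch of the name merge, assuming the upstream `/names` body is a `---` header line followed by one gem name per line. `merge_names` is a hypothetical helper and the sample names are placeholders.

```ruby
require "digest"

# Merge the cached upstream /names body with private gem names.
# Returns [merged_body, etag].
def merge_names(upstream_body, private_names)
  upstream = upstream_body.lines.map(&:chomp).reject { |l| l.empty? || l == "---" }
  names    = (upstream + private_names).uniq.sort
  body     = "---\n" + names.join("\n") + "\n"
  [body, Digest::SHA256.hexdigest(body)]
end

body, etag = merge_names("---\nrack\nrake\n", ["internal_gem"])
```

Unlike `/versions`, the combined list is re-sorted, so the public names do not keep stable byte offsets in the merged file.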
|
|
187
|
+
|
|
188
|
+
**Verify:** `/names` includes both a known private gem name and a known public gem name.
|
|
189
|
+
|
|
190
|
+
### Architectural Requirements
|
|
191
|
+
|
|
192
|
+
#### AR-3.1: HTTP client for proxying
|
|
193
|
+
Use `Net::HTTP` (stdlib) for all outbound RubyGems.org requests.
|
|
194
|
+
Do not add `faraday`, `httpclient`, or other HTTP client gems for proxy use.
|
|
195
|
+
|
|
196
|
+
#### AR-3.2: Proxy timeout
|
|
197
|
+
Outbound requests use a 5-second open timeout and a 10-second read timeout.
|
|
198
|
+
Timeout errors are treated as unreachable (see FR-3.4).
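
With `Net::HTTP`, the two timeouts map directly onto `open_timeout` and `read_timeout`. The sketch below only configures a client object and does not open a connection; `rubygems_http` is a hypothetical helper name.

```ruby
require "net/http"
require "uri"

# Build a Net::HTTP client for RubyGems.org with the AR-3.2 timeouts.
def rubygems_http(base = URI("https://rubygems.org"))
  http = Net::HTTP.new(base.host, base.port)
  http.use_ssl      = base.scheme == "https"
  http.open_timeout = 5   # seconds to establish the connection
  http.read_timeout = 10  # seconds to wait for each read
  http
end
```

When either limit is exceeded, `Net::HTTP` raises `Net::OpenTimeout` or `Net::ReadTimeout`, which the caller can rescue and handle as the unreachable case.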
|
|
199
|
+
|
|
200
|
+
#### AR-3.3: Gem binary streaming
|
|
201
|
+
Proxy and cache responses for `.gem` file downloads using bounded buffering or streaming rather than reading the full binary into memory before sending.
|
|
202
|
+
RubyGems.org gem files range from a few KB to tens of MB.
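
One way to keep memory bounded is to write each chunk to a temp file while handing the same chunk to the client, then rename the temp file into the cache. The sketch below assumes the chunks arrive from any enumerable; with `Net::HTTP` they would come from `response.read_body { |chunk| ... }`. `tee_to_cache` is a hypothetical helper name.

```ruby
require "fileutils"
require "tempfile"

# Write chunks to a temp file beside `dest` while yielding each chunk to
# the caller, then atomically rename the temp file into place.
def tee_to_cache(chunks, dest)
  FileUtils.mkdir_p(File.dirname(dest))
  tmp      = Tempfile.create(["gem", ".part"], File.dirname(dest))
  tmp_path = tmp.path
  begin
    tmp.binmode
    chunks.each do |chunk|
      tmp.write(chunk)
      yield chunk
    end
    tmp.close
    File.rename(tmp_path, dest)
  ensure
    tmp.close unless tmp.closed?
    # Remove the partial file if the rename never happened.
    File.unlink(tmp_path) if File.exist?(tmp_path)
  end
end
```

Only one chunk is held in memory at a time, and an interrupted download never leaves a partial file under the final name.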
|
|
203
|
+
|
|
204
|
+
---
|
|
205
|
+
|
|
206
|
+
## Feature 4: Gemkeeper Integration
|
|
207
|
+
|
|
208
|
+
**Who & why:** The server is an implementation detail inside gemkeeper.
|
|
209
|
+
All existing CLI commands, upload flow, list command, server lifecycle, and mirror configuration must continue to work without changes to their respective classes.
|
|
210
|
+
|
|
211
|
+
### Functional Requirements
|
|
212
|
+
|
|
213
|
+
#### FR-4.1: config.ru generation
|
|
214
|
+
`RackupProcess#config_ru_content` (`lib/gemkeeper/rackup_process.rb`) is updated to generate a config.ru that requires `Gemkeeper::CompactIndexServer` and mounts it, passing `gems_path` and `cache_dir`.
|
|
215
|
+
All Geminabox configuration is removed.
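
As an illustration only, the generated file might look like the output of the sketch below. The keyword-argument constructor `CompactIndexServer.new(gems_path:, cache_dir:)` and the require path are assumptions; the spec states only that both values are passed to the server.

```ruby
# Hypothetical rendering of the generated config.ru content.
def config_ru_content(gems_path, cache_dir)
  <<~RUBY
    require "gemkeeper/compact_index_server"

    run Gemkeeper::CompactIndexServer.new(
      gems_path: #{gems_path.inspect},
      cache_dir: #{cache_dir.inspect}
    )
  RUBY
end

puts config_ru_content("/tmp/gems", "/tmp/cache")
```

Using `inspect` on the paths keeps quoting correct if a path contains spaces or quote characters.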
|
|
216
|
+
|
|
217
|
+
**Verify:** The generated `config.ru` contains no references to `Geminabox`; the server starts and responds normally.
|
|
218
|
+
|
|
219
|
+
#### FR-4.2: Upload API compatibility — no changes to `GemUploader`
|
|
220
|
+
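The upload path in `lib/gemkeeper/gem_uploader.rb` requires no changes; the only exception is the unused `list_gems` method discussed under Integration Points.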
|
|
221
|
+
The server's `POST /upload` endpoint accepts the same multipart payload and returns status codes compatible with `GemUploader#handle_response`: 200, 201, or 302 for success; 409 for conflict.
|
|
222
|
+
Return 201 for a new upload; 409 if the gem already exists.
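
The status mapping can be sketched as below. `store_upload` is a hypothetical helper that covers only the 201/409 decision and the write into `gems_path/gems/`; multipart parsing and gem validation are left out. The existence check followed by a rename is not race-free under concurrent uploads of the same file, which this sketch does not address.

```ruby
require "fileutils"

# Store an uploaded gem and return the HTTP status for the response.
def store_upload(gems_path, filename, bytes)
  dir  = File.join(gems_path, "gems")
  dest = File.join(dir, filename)
  FileUtils.mkdir_p(dir)
  return 409 if File.exist?(dest) # already uploaded: conflict

  tmp = "#{dest}.#{Process.pid}.tmp"
  File.binwrite(tmp, bytes)
  File.rename(tmp, dest)          # atomic within one filesystem
  201
ensure
  # Clean up a leftover temp file if the rename did not happen.
  File.unlink(tmp) if tmp && File.exist?(tmp)
end
```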
|
|
223
|
+
|
|
224
|
+
**Verify:** `gemkeeper sync` uploads gems without error; a second sync of the same version produces a skip (409 → already-exists path).
|
|
225
|
+
|
|
226
|
+
#### FR-4.3: List command compatibility — no changes to list
|
|
227
|
+
`gemkeeper list` enumerates gems with `Dir.glob(File.join(gems_path, "gems", "*.gem"))`, reading directly from the filesystem rather than querying the server.
|
|
228
|
+
The custom server stores uploaded gems at `gems_path/gems/`, matching the current structure.
|
|
229
|
+
|
|
230
|
+
**Verify:** `gemkeeper list` output is unchanged after migration.
|
|
231
|
+
|
|
232
|
+
#### FR-4.4: Server lifecycle — no changes to `ServerManager`, `ServerReadinessProbe`, `BundlerMirrorConfigurator`
|
|
233
|
+
These classes are Rack-server-agnostic and require no modifications.
|
|
234
|
+
|
|
235
|
+
**Verify:** `gemkeeper server start`, `gemkeeper server stop`, and `gemkeeper server status` all behave identically before and after migration.
|
|
236
|
+
|
|
237
|
+
### Architectural Requirements
|
|
238
|
+
|
|
239
|
+
#### AR-4.1: New server class location
|
|
240
|
+
`Gemkeeper::CompactIndexServer` is implemented in `lib/gemkeeper/compact_index_server.rb` as a Rack application (responds to `call(env)`).
|
|
241
|
+
It is instantiated and passed to `run` in the generated `config.ru`.
|
|
242
|
+
It is not required anywhere else in the gemkeeper library.
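
For readers unfamiliar with the Rack interface, a skeleton of such a class might look like the following. Only the class name and the `call(env)` contract come from the requirement above; the constructor keywords, the routing, and the placeholder response bodies are assumptions made for this sketch.

```ruby
module Gemkeeper
  # Skeleton only: routes and bodies are placeholders, not the real server.
  class CompactIndexServer
    def initialize(gems_path:, cache_dir:)
      @gems_path = gems_path
      @cache_dir = cache_dir
    end

    # Rack entry point: returns [status, headers, body].
    def call(env)
      return [404, {}, []] unless env["REQUEST_METHOD"] == "GET"

      case env["PATH_INFO"]
      when "/versions", "/names"
        [200, { "content-type" => "text/plain" }, ["---\n"]]
      else
        [404, {}, []]
      end
    end
  end
end
```

Any object that responds to `call(env)` and returns this triple can be handed to `run` in a `config.ru`.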
|
|
243
|
+
|
|
244
|
+
---
|
|
245
|
+
|
|
246
|
+
## Data Requirements
|
|
247
|
+
|
|
248
|
+
The `rubygems_cache/` directory layout under `cache_dir`:
|
|
249
|
+
|
|
250
|
+
```
cache_dir/
  rubygems_cache/
    versions                  # raw upstream /versions body
    versions.merged           # merged upstream + private gems (served to Bundler)
    versions.meta             # sidecar: upstream ETag + fetched_at timestamp
    names                     # raw upstream /names body
    names.merged              # merged upstream + private gem names (served to Bundler)
    names.meta                # sidecar: upstream ETag + fetched_at timestamp
    info/
      <gemname>               # raw upstream /info/:gemname body
      <gemname>.meta          # sidecar: upstream ETag + fetched_at timestamp
    gems/
      <name>-<version>.gem    # cached gem binaries (permanent)
```
|
|
265
|
+
|
|
266
|
+
Sidecar `.meta` files are written atomically alongside the body file.
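
The on-disk format of the sidecar is not specified here. As one possibility, the sketch below stores the two fields as JSON and writes them through a temp file plus rename; `write_meta`, `read_meta`, and the JSON encoding are all assumptions.

```ruby
require "json"
require "time"

# Atomically write the sidecar next to its body file.
def write_meta(path, etag:, fetched_at: Time.now)
  tmp = "#{path}.#{Process.pid}.tmp"
  File.write(tmp, JSON.generate("etag" => etag, "fetched_at" => fetched_at.utc.iso8601))
  File.rename(tmp, path)
end

# Returns { etag:, fetched_at: } or nil when the sidecar is missing or corrupt.
def read_meta(path)
  data = JSON.parse(File.read(path))
  { etag: data["etag"], fetched_at: Time.iso8601(data["fetched_at"]) }
rescue Errno::ENOENT, JSON::ParserError
  nil
end
```

Returning `nil` for a missing or unreadable sidecar lets the caller treat the entry as never fetched and revalidate it.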
|
|
267
|
+
|
|
268
|
+
---
|
|
269
|
+
|
|
270
|
+
## Integration Points
|
|
271
|
+
|
|
272
|
+
| File | Change |
| ---- | ------ |
| `lib/gemkeeper/rackup_process.rb` | Replace `config_ru_content` |
| `lib/gemkeeper/compact_index_server.rb` | New file: the Rack app |
| `gemkeeper.gemspec` | Remove `geminabox ~> 3.0` and `rubygems-generate_index ~> 1.0`; add `compact_index ~> 0.15` |
| `test/integration/test_server_lifecycle_integration.rb` | Update config.ru content assertions (lines 80–81) |
| `CLAUDE.md` | Update Architecture section; remove Geminabox references |
| `AGENTS.md` | Same updates as CLAUDE.md |
|
|
280
|
+
|
|
281
|
+
### Known dead code after migration
|
|
282
|
+
`GemUploader#list_gems` calls `GET /api/v1/gems.json`, a Geminabox-specific endpoint the new server does not implement.
|
|
283
|
+
The method is unused in production (the list CLI reads the filesystem directly).
|
|
284
|
+
Remove the method or make it raise `NotImplementedError`; do not leave a silently broken public method.
|
|
285
|
+
|
|
286
|
+
## Related Specs
|
|
287
|
+
|
|
288
|
+
None — this is a standalone infrastructure replacement.
|
|
289
|
+
|
|
290
|
+
## Constraints
|
|
291
|
+
|
|
292
|
+
- No changes to `gem_uploader.rb`, `server_manager.rb`, `server_readiness_probe.rb`, `bundler_mirror_configurator.rb`, or `configuration.rb`
|
|
293
|
+
- No new runtime dependencies beyond `compact_index ~> 0.15`
|
|
294
|
+
- `POST /upload` multipart API must remain compatible with `GemUploader`
|
|
295
|
+
- Gem storage path (`gems_path/gems/*.gem`) must remain unchanged so `gemkeeper list` is unaffected
|
|
296
|
+
|
|
297
|
+
## Out of Scope
|
|
298
|
+
|
|
299
|
+
- Authentication for uploads or downloads
|
|
300
|
+
- HTTPS/TLS
|
|
301
|
+
- Yanking gems
|
|
302
|
+
- Proxying sources other than rubygems.org
|
|
303
|
+
- The `gemkeeper manifest`, `gemkeeper setup`, or `gemkeeper sync` internals
|
|
304
|
+
- Serving legacy index formats (`Marshal.4.8.gz`, `specs.4.8.gz`)
|
|
305
|
+
- `GET /api/v1/gems.json` (Geminabox-specific; unused in production after `list_gems` removal)
|
|
306
|
+
|
|
307
|
+
## Spec Completeness Checklist
|
|
308
|
+
|
|
309
|
+
- [x] **Scope & acceptance criteria** — each FR has a Verify line; Out of Scope list is explicit; blocking ambiguities from critique resolved
|
|
310
|
+
- [x] **Testing strategy** — FRs reference existing tests (FR-4.2, FR-4.4); integration verify conditions cover server start, upload round-trip, offline cache, and Bundler resolution; `test_compact_index_server.rb` implied by AR-4.1 convention (one test file per class)
|
|
311
|
+
- [x] **Existing patterns** — references `GemUploader`, `ServerReadinessProbe`, `ServerManager`, `Dir.glob` list pattern, `Gem::Package` extraction, and existing storage path conventions throughout
|
|
312
|
+
- [x] **Dependencies** — `compact_index ~> 0.15` justified in AR-2.1; `rubygems-generate_index` removal explicit; `Net::HTTP` (stdlib) chosen in AR-3.1; no other additions
|
|
313
|
+
- [x] **Architecture & interfaces** — Rack app interface in AR-4.1; storage layout in Data Requirements; proxy HTTP client in AR-3.1/AR-3.2; config.ru generation in FR-4.1; cache layout in Data Requirements; atomic upload in FR-1.2; atomic index swap in AR-1.1
|
|
314
|
+
- [x] **Error handling & failure modes** — FR-3.4 distinguishes upstream 404 vs 503; FR-1.2 covers malformed upload and 422; FR-2.1 covers corrupt gem at startup; AR-1.1 covers concurrent read/write; FR-1.3 covers invalid range (416)
|
|
315
|
+
- [x] **Security review** — FR-1.1 adds path parameter validation (`/\A[a-zA-Z0-9._-]+\z/`) and path-under-gems_path assertion; FR-1.2 prohibits gemspec eval; AR-3.1 scopes proxy to rubygems.org only; localhost-only binding inherited from `RackupProcess`
|
|
316
|
+
- [x] **Performance impact** — merged `/versions` (~23 MB) and `/names` (~2.7 MB) written to disk and streamed; only SHA256 ETag strings held in memory (FR-3.1, FR-3.5); gem binary proxy streamed per AR-3.3; private gem index is small and negligible
|
|
317
|
+
- [x] **Rollout & migration** — drop-in replacement; no data migration; existing `gems_path/gems/` reused; Homebrew formula rebuild required; `list_gems` dead method called out explicitly
|
|
318
|
+
- [x] **Assumptions & risks** — `compact_index` 0.15.x field names flagged for pre-implementation verification (FR-2.2); Bundler `Range`/`Repr-Digest` strictness addressed in FR-1.3; `/versions` byte-stability addressed in FR-3.1
|
|
319
|
+
|
|
320
|
+
---
|
|
321
|
+
|
|
322
|
+
## Change Log
|
|
323
|
+
|
|
324
|
+
### Update from `critique-consolidated-v-1.md`
|
|
325
|
+
|
|
326
|
+
**Applied:**
|
|
327
|
+
- B-1: Specified `/versions` merge algorithm — upstream verbatim block first via `VersionsFile`, private gems as `extra_gems`, collision = suppress public entry, byte-stable layout (FR-3.1)
|
|
328
|
+
- B-2: Specified ETag and `Repr-Digest` computed from merged body; SHA256 only; no forwarding of upstream headers for merged responses (FR-1.3)
|
|
329
|
+
- B-3: Added `info_checksum` generation ordering — info bodies first, checksums embedded before versions index is built (FR-2.2)
|
|
330
|
+
- B-4: Added `/names` as a full fetch/cache/merge endpoint matching `/versions` semantics (FR-3.5)
|
|
331
|
+
- G-1: Added URL parameter validation (`/\A[a-zA-Z0-9._-]+\z/`) and path-containment assertion to FR-1.1
|
|
332
|
+
- G-2: Defined "valid gem" as tar-parseable with extractable metadata; no eval (FR-1.2)
|
|
333
|
+
- G-3: Added `mkdir_p`, atomic temp-file write, and temp cleanup to FR-1.2
|
|
334
|
+
- G-4: Added AR-1.1 specifying atomic index swap (copy-on-write) for thread safety
|
|
335
|
+
- G-5: Replaced FR-3.4 with three-way distinction: upstream 404 → 404; unreachable + cache → serve cache; unreachable + no cache → 503
|
|
336
|
+
- G-6: Added `GemUploader#list_gems` dead-method callout to Integration Points; `/api/v1/gems.json` added to Out of Scope
|
|
337
|
+
- G-7: Added `rubygems-generate_index` to AR-2.1 as dependency to remove; added to Integration Points table
|
|
338
|
+
- Corrected FR-4.2 response codes to match actual `GemUploader#handle_response`: 200/201/302 success, 409 conflict
|
|
339
|
+
- Added AR-3.3 requiring gem binary streaming to avoid full-file memory allocation
|
|
340
|
+
- Added Data Requirements section with `rubygems_cache/` directory layout and sidecar metadata files
|
|
341
|
+
- Added AGENTS.md to Integration Points (both CLAUDE.md and AGENTS.md exist in repo)
|
|
342
|
+
- Clarified FR-3.2 cache write atomicity and conditional GET (If-None-Match) refresh behavior
|
|
343
|
+
|
|
344
|
+
### Pre-implementation compact_index API verification
|
|
345
|
+
|
|
346
|
+
**Applied:**
|
|
347
|
+
- Corrected `info_checksum` hash algorithm from SHA256 to MD5 — the protocol spec and `compact_index` gem both use `Digest::MD5` for this field; Bundler verifies it on download
|
|
348
|
+
- Confirmed `GemVersion` field is `number` (not `version`); documented full struct signature
|
|
349
|
+
- Confirmed `Dependency` fields: `:gem` for the dep name, `:version` for the constraint
|
|
350
|
+
- Confirmed collision suppression works via last-wins semantics — `VersionsFile#contents` appends `extra_gems` verbatim; no pre-filtering of upstream file needed
|
|
351
|
+
- Improved FR-3.4 503 body to include actionable guidance for cold-start offline case
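
To make the MD5 correction concrete, the following sketch derives an `info_checksum` from an info body and embeds it in a `/versions` line. The gem name and the `checksum:` value in the sample body are placeholders.

```ruby
require "digest"

# Placeholder /info body for a private gem with no dependencies.
info_body = "---\n1.0.0 |checksum:abc123\n"

# info_checksum is the MD5 hex digest of the exact info body served.
info_checksum = Digest::MD5.hexdigest(info_body)

# The checksum is then embedded in the gem's /versions line.
versions_line = "internal_gem 1.0.0 #{info_checksum}"
```

This is why the info bodies have to be generated before the versions index: any later change to an info body changes its digest.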
|
|
352
|
+
|
|
353
|
+
**Rejected:**
|
|
354
|
+
- "Set Puma thread count to 1" — over-specifies implementation; AR-1.1's atomic swap is the correct architectural constraint
|
|
355
|
+
- "Add `test/gemkeeper/test_compact_index_server.rb` as an explicit FR" — the one-test-file-per-class convention is already established in the project; calling it out in the spec over-specifies test structure
|
|
356
|
+
|
|
357
|
+
**Reorganized:**
|
|
358
|
+
- Split old FR-3.1 into FR-3.1 (merge algorithm) and FR-3.5 (/names endpoint) for clarity
|
|
359
|
+
- Moved `gems_path/gems/` creation from an implicit assumption into FR-1.2 and AR-1.1 explicitly
|
|
360
|
+
- Added Data Requirements section to centralize the cache directory layout (previously scattered across FR-3.1 and FR-3.2)
|